Template:Least Squares/Rank Regression Equations

From ReliaWiki
Revision as of 17:27, 25 July 2011 by Nicolette Young

Least Squares/Rank Regression Equations

Rank Regression on Y

Assume that a set of data pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$ was obtained and plotted. According to the least squares principle, which minimizes the sum of the squared vertical distances between the data points and the straight line fitted to the data, the best-fitting straight line to these data is the line $y = \hat{a} + \hat{b}x$ such that:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}x_i - y_i)^2 = \min_{a,b} \sum_{i=1}^{N} (a + b x_i - y_i)^2$$

where $\hat{a}$ and $\hat{b}$ are the least squares estimates of $a$ and $b$, and $N$ is the number of data points.

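To obtain $\hat{a}$ and $\hat{b}$, let:

$$F = \sum_{i=1}^{N} (a + b x_i - y_i)^2$$

Differentiating $F$ with respect to $a$ and $b$ yields:

$$\frac{\partial F}{\partial a} = 2\sum_{i=1}^{N} (a + b x_i - y_i) \qquad (1)$$

and:

$$\frac{\partial F}{\partial b} = 2\sum_{i=1}^{N} (a + b x_i - y_i)\,x_i \qquad (2)$$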

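Setting Eqns. (1) and (2) equal to zero yields:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}x_i - y_i) = 0$$

and:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}x_i - y_i)\,x_i = 0$$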


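Written out, the two conditions are $N\hat{a} + \hat{b}\sum x_i = \sum y_i$ and $\hat{a}\sum x_i + \hat{b}\sum x_i^2 = \sum x_i y_i$. Solving these equations simultaneously yields:

$$\hat{a} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - \hat{b}\bar{x} \qquad (3)$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}} \qquad (4)$$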
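Eqns. (3) and (4) translate directly into a few lines of code. The following is a minimal Python sketch rather than anything taken from the original article; the function name `rank_regression_on_y` and its argument names are illustrative assumptions.

```python
def rank_regression_on_y(xs, ys):
    """Fit y = a + b*x by least squares on the vertical deviations.

    Returns (a_hat, b_hat). The name and signature are hypothetical,
    chosen only for this illustration.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = sum_xx - sum_x ** 2 / n
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b_hat = (sum_xy - sum_x * sum_y / n) / denom   # Eqn. (4)
    a_hat = sum_y / n - b_hat * sum_x / n          # Eqn. (3)
    return a_hat, b_hat


# Points lying exactly on y = 1 + 2x should return a = 1 and b = 2.
a_hat, b_hat = rank_regression_on_y([0, 1, 2, 3], [1, 3, 5, 7])
print(a_hat, b_hat)
```

The slope is computed first because Eqn. (3) needs it; the guard on the denominator covers the degenerate case where every x value is the same.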

Rank Regression on X

Assume that a set of data pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$ was obtained and plotted. According to the least squares principle, which here minimizes the sum of the squared horizontal distances between the data points and the straight line fitted to the data, the best-fitting straight line to these data is the line $x = \hat{a} + \hat{b}y$ such that:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}y_i - x_i)^2 = \min_{a,b} \sum_{i=1}^{N} (a + b y_i - x_i)^2$$

Again, $\hat{a}$ and $\hat{b}$ are the least squares estimates of $a$ and $b$, and $N$ is the number of data points.

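To obtain $\hat{a}$ and $\hat{b}$, let:

$$F = \sum_{i=1}^{N} (a + b y_i - x_i)^2$$

Differentiating $F$ with respect to $a$ and $b$ yields:

$$\frac{\partial F}{\partial a} = 2\sum_{i=1}^{N} (a + b y_i - x_i) \qquad (5)$$

and:

$$\frac{\partial F}{\partial b} = 2\sum_{i=1}^{N} (a + b y_i - x_i)\,y_i \qquad (6)$$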

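Setting Eqns. (5) and (6) equal to zero yields:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}y_i - x_i) = 0$$

and:

$$\sum_{i=1}^{N} (\hat{a} + \hat{b}y_i - x_i)\,y_i = 0$$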


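Solving the above equations simultaneously yields:

$$\hat{a} = \frac{\sum_{i=1}^{N} x_i}{N} - \hat{b}\,\frac{\sum_{i=1}^{N} y_i}{N} = \bar{x} - \hat{b}\bar{y} \qquad (7)$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \dfrac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \qquad (8)$$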
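Regression on X swaps the roles of the two variables, so the only changes relative to regression on Y are the denominator of the slope and the means used for the intercept. The sketch below is a hedged illustration of Eqns. (7) and (8); the function name `rank_regression_on_x` is an assumption made for this example.

```python
def rank_regression_on_x(xs, ys):
    """Fit x = a + b*y by least squares on the horizontal deviations.

    Returns (a_hat, b_hat) for the line x = a_hat + b_hat*y.
    The name and signature are hypothetical.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = sum_yy - sum_y ** 2 / n
    if denom == 0:
        raise ValueError("all y values are identical; the slope is undefined")
    b_hat = (sum_xy - sum_x * sum_y / n) / denom   # Eqn. (8)
    a_hat = sum_x / n - b_hat * sum_y / n          # Eqn. (7)
    return a_hat, b_hat


# Points lying exactly on x = 1 + 2y should return a = 1 and b = 2.
a_hat, b_hat = rank_regression_on_x([1, 3, 5, 7], [0, 1, 2, 3])
print(a_hat, b_hat)
```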

Solving the equation of the line for y yields:

$$y = -\frac{\hat{a}}{\hat{b}} + \frac{1}{\hat{b}}\,x$$


Illustrating with an Example

Fit a least squares straight line using regression on X and regression on Y to the following data:

| x | y |
|---|---|
| 1 | 1.5 |
| 2.5 | 2 |
| 4 | 4 |
| 6 | 4 |
| 8 | 5 |
| 9 | 7 |
| 11 | 8 |
| 15 | 10 |

The first step is to generate the following table:

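Table A.1 - Data analysis for the least squares method

| i | x_i | y_i | x_i^2 | x_i y_i | y_i^2 |
|---|---|---|---|---|---|
| 1 | 1 | 1.5 | 1 | 1.5 | 2.25 |
| 2 | 2.5 | 2 | 6.25 | 5 | 4 |
| 3 | 4 | 4 | 16 | 16 | 16 |
| 4 | 6 | 4 | 36 | 24 | 16 |
| 5 | 8 | 5 | 64 | 40 | 25 |
| 6 | 9 | 7 | 81 | 63 | 49 |
| 7 | 11 | 8 | 121 | 88 | 64 |
| 8 | 15 | 10 | 225 | 150 | 100 |
| Σ | 56.5 | 41.5 | 550.25 | 387.5 | 276.25 |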
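The quantities needed by the regression equations can be tabulated with a short script. This is a sketch under the assumption that Table A.1 lists, for each data pair, the values x, y, x^2, x*y and y^2 together with their column sums; the variable names are illustrative.

```python
xs = [1, 2.5, 4, 6, 8, 9, 11, 15]
ys = [1.5, 2, 4, 4, 5, 7, 8, 10]

header = ("i", "x", "y", "x^2", "x*y", "y^2")
# One row per data pair: index, x, y, x squared, cross product, y squared.
rows = [(i, x, y, x * x, x * y, y * y)
        for i, (x, y) in enumerate(zip(xs, ys), start=1)]
# Column totals for x, y, x^2, x*y and y^2 (the index column is skipped).
totals = tuple(sum(row[k] for row in rows) for k in range(1, 6))

layout = "{:>3} {:>6} {:>6} {:>8} {:>8} {:>8}"
print(layout.format(*header))
for row in rows:
    print(layout.format(*row))
print(layout.format("sum", *totals))
```

The five totals are the only inputs that the slope and intercept formulas require, so the individual rows serve mainly as a check on the arithmetic.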


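Using the results in Table A.1, Eqns. (4) and (3) yield:

$$\hat{b} = \frac{387.5 - (56.5)(41.5)/8}{550.25 - (56.5)^2/8} = \frac{94.40625}{151.21875} \approx 0.6243$$

and:

$$\hat{a} = \frac{41.5}{8} - \hat{b}\,\frac{56.5}{8} \approx 5.1875 - (0.6243)(7.0625) \approx 0.7784$$

The least squares line is given by:

$$y = 0.7784 + 0.6243\,x$$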


The plotted line is shown in the next figure.
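As a numerical cross-check of the regression on Y for this data set, the sketch below evaluates Eqns. (4) and (3) and also confirms that the fitted residuals sum to zero, which is the condition obtained by setting Eqn. (1) to zero. The variable names are illustrative assumptions.

```python
xs = [1, 2.5, 4, 6, 8, 9, 11, 15]
ys = [1.5, 2, 4, 4, 5, 7, 8, 10]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

b_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)  # Eqn. (4)
a_hat = sum_y / n - b_hat * sum_x / n                             # Eqn. (3)

# Sum of (fitted - observed) y values; zero up to floating-point rounding.
residual_sum = sum(a_hat + b_hat * x - y for x, y in zip(xs, ys))

print(round(a_hat, 5), round(b_hat, 4))   # 0.77836 0.6243
print(abs(residual_sum) < 1e-9)           # True
```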


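For rank regression on X using the analyzed data in Table A.1, Eqns. (8) and (7) yield:

$$\hat{b} = \frac{387.5 - (56.5)(41.5)/8}{276.25 - (41.5)^2/8} = \frac{94.40625}{60.96875} \approx 1.5484$$

and:

$$\hat{a} = \frac{56.5}{8} - \hat{b}\,\frac{41.5}{8} \approx 7.0625 - (1.5484)(5.1875) \approx -0.9700$$

The least squares line is given by:

$$x = -0.9700 + 1.5484\,y$$

or, solved for y:

$$y = 0.6264 + 0.6458\,x$$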


The plotted line is shown in the next figure.
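The regression on X for the same data can be checked in the same way. This sketch evaluates Eqns. (8) and (7) and then rewrites the fitted line x = a + b*y as a function of x, so that it can be compared with the regression on Y; the variable names are illustrative assumptions.

```python
xs = [1, 2.5, 4, 6, 8, 9, 11, 15]
ys = [1.5, 2, 4, 4, 5, 7, 8, 10]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_yy = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

b_hat = (sum_xy - sum_x * sum_y / n) / (sum_yy - sum_y ** 2 / n)  # Eqn. (8)
a_hat = sum_x / n - b_hat * sum_y / n                             # Eqn. (7)

# Solve x = a_hat + b_hat*y for y: y = -a_hat/b_hat + (1/b_hat)*x.
intercept_y = -a_hat / b_hat
slope_y = 1.0 / b_hat

print(round(a_hat, 5), round(b_hat, 4))          # -0.97002 1.5484
print(round(intercept_y, 4), round(slope_y, 4))  # 0.6264 0.6458
```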


Note that the regression on Y is not necessarily the same as the regression on X. The two regressions coincide (i.e., yield the same equation for a line) only when the data lie perfectly on a line.
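This behavior is easy to observe numerically. The sketch below fits both regressions to a perfectly collinear data set and to the example data, expressing each result as y = intercept + slope*x. The helper names `fit_line` and `both_regressions` are hypothetical and exist only for this illustration.

```python
def fit_line(us, vs):
    """Least squares fit of v = a + b*u; returns (a, b)."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suu = sum(u * u for u in us)
    suv = sum(u * v for u, v in zip(us, vs))
    b = (suv - su * sv / n) / (suu - su ** 2 / n)
    return sv / n - b * su / n, b


def both_regressions(xs, ys):
    """Return (intercept, slope) in y-versus-x form for both regressions."""
    a_y, b_y = fit_line(xs, ys)   # regression on Y: y = a_y + b_y*x
    a_x, b_x = fit_line(ys, xs)   # regression on X: x = a_x + b_x*y
    return (a_y, b_y), (-a_x / b_x, 1.0 / b_x)


# Data lying exactly on y = 2 + 0.5x: both regressions give the same line.
collinear = both_regressions([1, 2, 3, 4], [2.5, 3, 3.5, 4])
# The example data are scattered, so the two lines differ.
scattered = both_regressions([1, 2.5, 4, 6, 8, 9, 11, 15],
                             [1.5, 2, 4, 4, 5, 7, 8, 10])

print(collinear)
print(scattered)
```

For the scattered data the regression on X gives the steeper line when both are written as y against x, which is consistent with the two fitted equations in the example.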

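The correlation coefficient is given by:

$$\hat{\rho} = \frac{\sum_{i=1}^{N} x_i y_i - \dfrac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sqrt{\left(\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right)\left(\sum_{i=1}^{N} y_i^2 - \dfrac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}\right)}}$$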