Template:Least squares (linear regression)

From ReliaWiki
Revision as of 20:44, 5 January 2012 by Nicolette Young

Least Squares (Linear Regression)


The parameters can also be estimated analytically. To do this, apply least squares analysis to Eqn. (eq73):

[math]\displaystyle{ \ln ({{\hat{m}}_{c}})=\ln (b)+\alpha \ln (T) }[/math]

To simplify the calculations, let:

[math]\displaystyle{ \begin{align} & \ln ({{m}_{ci}})= & {{Y}_{i}} \\ & \ln (b)= & a \\ & \alpha = & c \\ & \ln ({{T}_{i}})= & {{X}_{i}} \end{align} }[/math]

Therefore, Eqn. (mc) becomes:

[math]\displaystyle{ {{Y}_{i}}=\widehat{a}+\widehat{c}{{X}_{i}} }[/math]


Assume that a set of data pairs [math]\displaystyle{ ({{X}_{1}},{{Y}_{1}}) }[/math] , [math]\displaystyle{ ({{X}_{2}},{{Y}_{2}}) }[/math] ,..., [math]\displaystyle{ ({{X}_{N}},{{Y}_{N}}) }[/math] were obtained and plotted. According to the least squares principle, which minimizes the sum of the squared vertical distances between the data points and the straight line fitted to the data, the best-fitting straight line for this data set is the line [math]\displaystyle{ Y=\widehat{a}+\widehat{c}X }[/math] such that:

[math]\displaystyle{ \underset{i=1}{\overset{N}{\mathop \sum }}\,{{(\widehat{a}+\widehat{c}{{X}_{i}}-{{Y}_{i}})}^{2}}=\underset{(a,c)}{\mathop{min}}\,\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(a+c{{X}_{i}}-{{Y}_{i}})}^{2}} }[/math]

where [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{c} }[/math] are the least squares estimates of [math]\displaystyle{ a }[/math] and [math]\displaystyle{ c }[/math] . To obtain [math]\displaystyle{ \widehat{a} }[/math] and [math]\displaystyle{ \widehat{c} }[/math] , let:

[math]\displaystyle{ F=\underset{i=1}{\overset{N}{\mathop \sum }}\,{{(a+c{{X}_{i}}-{{Y}_{i}})}^{2}} }[/math]
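As a minimal sketch of the quantity being minimized, the Python function below evaluates F for a candidate intercept and slope. The function name and the four data points are hypothetical, chosen only so that the points lie exactly on the line Y = 1 + 2X.

```python
def sum_squared_residuals(a, c, xs, ys):
    """Return F(a, c) = sum over i of (a + c*X_i - Y_i)**2."""
    return sum((a + c * x - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data lying exactly on the line Y = 1 + 2X.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

f_exact = sum_squared_residuals(1.0, 2.0, xs, ys)    # the generating line
f_shifted = sum_squared_residuals(1.5, 2.0, xs, ys)  # intercept off by 0.5

print(f_exact, f_shifted)
```

Any pair (a, c) other than the least squares estimates gives a larger F, which is what the shifted intercept illustrates here.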

Differentiating [math]\displaystyle{ F }[/math] with respect to [math]\displaystyle{ a }[/math] and [math]\displaystyle{ c }[/math] yields:

[math]\displaystyle{ \frac{\partial F}{\partial a}=2\underset{i=1}{\overset{N}{\mathop \sum }}\,(a+c{{X}_{i}}-{{Y}_{i}}) }[/math]
and:
[math]\displaystyle{ \frac{\partial F}{\partial c}=2\underset{i=1}{\overset{N}{\mathop \sum }}\,(a+c{{X}_{i}}-{{Y}_{i}}){{X}_{i}} }[/math]

Set Eqns. (ls2) and (ls3) equal to zero:

[math]\displaystyle{ \underset{i=1}{\overset{N}{\mathop \sum }}\,(a+c{{X}_{i}}-{{Y}_{i}})=\underset{i=1}{\overset{N}{\mathop \sum }}\,(\widehat{{{Y}_{i}}}-{{Y}_{i}})=-\underset{i=1}{\overset{N}{\mathop \sum }}\,({{Y}_{i}}-\widehat{{{Y}_{i}}})=0 }[/math]
and:
[math]\displaystyle{ \underset{i=1}{\overset{N}{\mathop \sum }}\,(a+c{{X}_{i}}-{{Y}_{i}}){{X}_{i}}=\underset{i=1}{\overset{N}{\mathop \sum }}\,(\widehat{{{Y}_{i}}}-{{Y}_{i}}){{X}_{i}}=-\underset{i=1}{\overset{N}{\mathop \sum }}\,({{Y}_{i}}-\widehat{{{Y}_{i}}}){{X}_{i}}=0 }[/math]

Solve the equations simultaneously:

[math]\displaystyle{ \widehat{a}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{Y}_{i}}}{N}-\widehat{c}\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{X}_{i}}}{N}=\bar{Y}-\widehat{c}\bar{X} }[/math]
and:
[math]\displaystyle{ \widehat{c}=\frac{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{X}_{i}}{{Y}_{i}}-\tfrac{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{X}_{i}}\underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{Y}_{i}} \right)}{N}}{\underset{i=1}{\overset{N}{\mathop{\sum }}}\,X_{i}^{2}-\tfrac{{{\left( \underset{i=1}{\overset{N}{\mathop{\sum }}}\,{{X}_{i}} \right)}^{2}}}{N}} }[/math]
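The closed-form solution can be sketched in Python as follows. The slope uses the expression above, and the intercept is taken as the mean of Y minus the slope times the mean of X, which follows from the first of the two equations set to zero. The function name and the data are hypothetical. The sketch also checks that the fitted residuals satisfy both zero-sum conditions.

```python
def least_squares_line(xs, ys):
    """Return (a_hat, c_hat) for the line Y = a + c*X fitted by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    c_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    a_hat = sum_y / n - c_hat * sum_x / n
    return a_hat, c_hat


# Hypothetical data, roughly linear.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

a_hat, c_hat = least_squares_line(xs, ys)

# Residuals of the fitted line; both sums below should be zero
# up to floating-point rounding.
residuals = [a_hat + c_hat * x - y for x, y in zip(xs, ys)]
sum_res = sum(residuals)
sum_res_x = sum(r * x for r, x in zip(residuals, xs))

print(a_hat, c_hat, sum_res, sum_res_x)
```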

Now substituting back [math]\displaystyle{ \ln ({{m}_{ci}})={{Y}_{i}}, }[/math] [math]\displaystyle{ \ln (b)=a, }[/math] [math]\displaystyle{ \alpha =c }[/math] and [math]\displaystyle{ \ln ({{T}_{i}})={{X}_{i}}, }[/math] we have:

[math]\displaystyle{ \widehat{b}={{e}^{\tfrac{1}{n}\left[ \underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{m}_{ci}})-\widehat{\alpha }\underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{T}_{i}}) \right]}} }[/math]
where:
[math]\displaystyle{ \widehat{\alpha }=\frac{\underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{T}_{i}})\ln ({{m}_{ci}})-\tfrac{\underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{T}_{i}})\underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{m}_{ci}})}{n}}{\underset{i=1}{\overset{n}{\mathop{\sum }}}\,{{\left[ \ln ({{T}_{i}}) \right]}^{2}}-\tfrac{{{\left( \underset{i=1}{\overset{n}{\mathop{\sum }}}\,\ln ({{T}_{i}}) \right)}^{2}}}{n}} }[/math]
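The substitution back into the Duane parameters can be sketched as a small Python function that takes times and cumulative MTBF values, fits the line on the log-log scale, and transforms the intercept back to b. The function name `fit_duane` and the noise-free data, generated from m_c = 2 T^0.5, are assumptions made for illustration only.

```python
import math


def fit_duane(times, cumulative_mtbfs):
    """Return (b_hat, alpha_hat) from a least squares fit of ln(m_c) on ln(T)."""
    xs = [math.log(t) for t in times]             # X_i = ln(T_i)
    ys = [math.log(m) for m in cumulative_mtbfs]  # Y_i = ln(m_ci)
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    alpha_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
    b_hat = math.exp((sum_y - alpha_hat * sum_x) / n)
    return b_hat, alpha_hat


# Hypothetical noise-free data generated from m_c = 2 * T**0.5.
times = [10.0, 100.0, 1000.0, 10000.0]
mtbfs = [2.0 * t ** 0.5 for t in times]

b_hat, alpha_hat = fit_duane(times, mtbfs)
print(b_hat, alpha_hat)
```

Because the data follow the power law exactly, the fit should recover b = 2 and alpha = 0.5 up to floating-point rounding.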

Example 2
Using the data from Table 4.2, estimate the parameters of the MTBF model using least squares.

Solution
From Table 4.2:

[math]\displaystyle{ \begin{align} & \underset{i=1}{\overset{n}{\mathop \sum }}\,\ln ({{T}_{i}})= & 25.693 \\ & \underset{i=1}{\overset{n}{\mathop \sum }}\,\ln ({{T}_{i}})\ln ({{m}_{ci}})= & 130.66 \\ & \underset{i=1}{\overset{n}{\mathop \sum }}\,\ln ({{m}_{ci}})= & 20.116 \\ & \underset{i=1}{\overset{n}{\mathop \sum }}\,{{\left[ \ln ({{T}_{i}}) \right]}^{2}}= & 168.99 \end{align} }[/math]

From Eqn. (Dalpha):

[math]\displaystyle{ \begin{align} & \widehat{\alpha }= & \frac{130.66-\tfrac{25.693\cdot 20.116}{4}}{168.99-\tfrac{{{25.693}^{2}}}{4}} \\ & = & 0.3671 \end{align} }[/math]

Also from Eqn. (Dbi):

[math]\displaystyle{ \begin{align} & \widehat{b}= & {{e}^{\tfrac{1}{4}(20.116-0.3671\cdot 25.693)}} \\ & = & 14.456 \end{align} }[/math]
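The arithmetic above can be reproduced with a short Python sketch that starts from the four sums as printed. The variable names are illustrative. Because the sums are rounded and the numerator is a small difference of two large numbers, the computed values may differ from the printed 0.3671 and 14.456 in the third or fourth significant digit.

```python
import math

# Sums as printed in the solution (n = 4 data points).
n = 4
sum_ln_t = 25.693
sum_ln_m = 20.116
sum_ln_t_ln_m = 130.66
sum_ln_t_sq = 168.99

alpha_hat = (sum_ln_t_ln_m - sum_ln_t * sum_ln_m / n) / (
    sum_ln_t_sq - sum_ln_t ** 2 / n
)
b_hat = math.exp((sum_ln_m - alpha_hat * sum_ln_t) / n)

print(f"alpha_hat = {alpha_hat:.4f}, b_hat = {b_hat:.3f}")
```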

Therefore, Eqn. (duane6) becomes:

[math]\displaystyle{ {{\hat{m}}_{c}}=14.456\cdot {{T}^{0.3671}} }[/math]

The equation for the instantaneous MTBF growth curve using Eqn. (eq76) is:

[math]\displaystyle{ {{\hat{m}}_{i}}=\frac{1}{1-0.3671}(14.456){{T}^{0.3671}} }[/math]
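Once the parameters are estimated, both curves can be evaluated at any test time. The sketch below assumes the relation shown above, in which the instantaneous MTBF equals the cumulative MTBF divided by (1 - alpha). The function names and the evaluation time T = 1000 are hypothetical, and T is assumed to be in the same time units as the data in Table 4.2.

```python
def cumulative_mtbf(t, b, alpha):
    """Cumulative MTBF of the Duane model: m_c = b * T**alpha."""
    return b * t ** alpha


def instantaneous_mtbf(t, b, alpha):
    """Instantaneous MTBF: m_i = m_c / (1 - alpha), valid for alpha < 1."""
    return cumulative_mtbf(t, b, alpha) / (1.0 - alpha)


# Estimates from Example 2.
b_hat, alpha_hat = 14.456, 0.3671

t = 1000.0  # hypothetical evaluation time
m_c = cumulative_mtbf(t, b_hat, alpha_hat)
m_i = instantaneous_mtbf(t, b_hat, alpha_hat)

print(f"m_c({t:g}) = {m_c:.1f}, m_i({t:g}) = {m_i:.1f}")
```

For a growth rate between 0 and 1 the instantaneous MTBF is always larger than the cumulative MTBF by the constant factor 1/(1 - alpha).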


Example 3
For the data given in columns 1 and 2 of Table 4.3, estimate the Duane parameters using least squares.


Table 4.3 - Failure times data
(1) Failure Number (2) Failure Time (hr) (3) [math]\displaystyle{ \ln T_i }[/math] (4) [math]\displaystyle{ (\ln T_i)^2 }[/math] (5) [math]\displaystyle{ m_c }[/math] (6) [math]\displaystyle{ \ln m_c }[/math] (7) [math]\displaystyle{ \ln m_c \cdot \ln T_i }[/math]
1 9.2 2.219 4.925 9.200 2.219 4.925
2 25 3.219 10.361 12.500 2.526 8.130
3 61.5 4.119 16.966 20.500 3.020 12.441
4 260 5.561 30.921 65.000 4.174 23.212
5 300 5.704 32.533 60.000 4.094 23.353
6 710 6.565 43.103 118.333 4.774 31.339
7 916 6.820 46.513 130.857 4.874 33.241
8 1010 6.918 47.855 126.250 4.838 33.470
9 1220 7.107 50.504 135.556 4.909 34.889
10 2530 7.836 61.402 253.000 5.533 43.359
11 3350 8.117 65.881 304.545 5.719 46.418
12 4200 8.343 69.603 350.000 5.858 48.872
13 4410 8.392 70.419 339.231 5.827 48.895
14 4990 8.515 72.508 356.429 5.876 50.036
15 5570 8.625 74.393 371.333 5.917 51.036
16 8310 9.025 81.455 519.375 6.253 56.431
17 8530 9.051 81.927 501.765 6.218 56.282
18 9200 9.127 83.301 511.111 6.237 56.921
19 10500 9.259 85.731 552.632 6.315 58.469
20 12100 9.401 88.378 605.000 6.405 60.215
21 13400 9.503 90.307 638.095 6.458 61.375
22 14600 9.589 91.945 663.636 6.498 62.305
23 22000 9.999 99.976 956.522 6.863 68.625
[math]\displaystyle{ \color{Blue}Sum = }[/math] [math]\displaystyle{ \color{Blue}173.013 }[/math] [math]\displaystyle{ \color{Blue}1400.908 }[/math] [math]\displaystyle{ \color{Blue}7600.870 }[/math] [math]\displaystyle{ \color{Blue}121.406 }[/math] [math]\displaystyle{ \color{Blue}974.242 }[/math]


Solution
To estimate the parameters using least squares, the values in columns 3, 4, 5, 6 and 7 are calculated. The cumulative MTBF, [math]\displaystyle{ {{m}_{c}} }[/math] , is calculated by dividing the failure time by the failure number. From Eqn. (Dalpha), [math]\displaystyle{ \widehat{\alpha } }[/math] is:

[math]\displaystyle{ \begin{align} & \widehat{\alpha }= & \frac{974.242-\tfrac{173.013\cdot 121.406}{23}}{1400.908-\tfrac{{{(173.013)}^{2}}}{23}} \\ & = & 0.6133 \end{align} }[/math]

The estimate of [math]\displaystyle{ b }[/math] is obtained from Eqn. (Dbi):

[math]\displaystyle{ \begin{align} & \widehat{b}= & {{e}^{\tfrac{1}{23}(121.406-0.6133\cdot 173.013)}} \\ & = & 1.9453 \end{align} }[/math]

Therefore, Eqn. (duane6) becomes:

[math]\displaystyle{ {{\hat{m}}_{c}}=1.9453\cdot {{T}^{0.6133}} }[/math]

Using Eqn. (eq76), the equation for the instantaneous MTBF growth curve is:

[math]\displaystyle{ {{\hat{m}}_{i}}=\frac{1}{1-0.6133}(1.9453){{T}^{0.6133}} }[/math]
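The whole calculation for this example can be sketched in Python starting from the failure times in column 2 of Table 4.3. The cumulative MTBF is taken as the failure time divided by the failure number, as described in the solution. Variable names are illustrative, and small differences from the tabulated values in the last digit are possible because the table entries are rounded.

```python
import math

# Failure times (hr) from column 2 of Table 4.3.
failure_times = [
    9.2, 25, 61.5, 260, 300, 710, 916, 1010, 1220, 2530, 3350, 4200,
    4410, 4990, 5570, 8310, 8530, 9200, 10500, 12100, 13400, 14600, 22000,
]

n = len(failure_times)

# X_i = ln(T_i); Y_i = ln(m_ci) with m_ci = T_i / i (time / failure number).
xs = [math.log(t) for t in failure_times]
ys = [math.log(t / i) for i, t in enumerate(failure_times, start=1)]

sum_x = sum(xs)
sum_y = sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

alpha_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
b_hat = math.exp((sum_y - alpha_hat * sum_x) / n)

print(f"alpha_hat = {alpha_hat:.4f}, b_hat = {b_hat:.4f}")
```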



Duane plot for Example 3.


Example 4

For the data given in Table 4.4, estimate the Duane parameters using least squares.


Table 4.4 - Multiple systems (known operating times) data
Run Number Failed Unit Test Time 1 Test Time 2 Cumulative Time
1 1 0.2 2.0 2.2
2 2 1.7 2.9 4.6
3 2 4.5 5.2 9.7
4 2 5.8 9.1 14.9
5 2 17.3 9.2 26.5
6 2 29.3 24.1 53.4
7 1 36.5 61.1 97.6
8 2 46.3 69.6 115.9
9 1 63.6 78.1 141.7
10 2 64.4 85.4 149.8
11 1 74.3 93.6 167.9
12 1 106.6 103 209.6
13 2 195.2 117 312.2
14 2 235.1 134.3 369.4
15 1 248.7 150.2 398.9
16 2 256.8 164.6 421.4
17 2 261.1 174.3 435.4
18 2 299.4 193.2 492.6
19 1 305.3 234.2 539.5
20 1 326.9 257.3 584.2
21 1 339.2 290.2 629.4
22 1 366.1 293.1 659.2
23 2 466.4 316.4 782.8
24 1 504 373.2 877.2
25 1 510 375.1 885.1
26 2 543.2 386.1 929.3
27 2 635.4 453.3 1088.7
28 1 641.2 485.8 1127
29 2 755.8 573.6 1329.4

Solution

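The solution follows the same procedure as the previous example, with the cumulative time in the last column of Table 4.4 used as [math]\displaystyle{ {{T}_{i}} }[/math] and the cumulative MTBF, [math]\displaystyle{ {{m}_{c}} }[/math] , calculated by dividing the cumulative time by the run number. Therefore, from Table 4.4: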

[math]\displaystyle{ \begin{align} & \underset{i=1}{\overset{29}{\mathop \sum }}\,\ln ({{T}_{i}})= & 154.141 \\ & \underset{i=1}{\overset{29}{\mathop \sum }}\,{{\left[ \ln ({{T}_{i}}) \right]}^{2}}= & 902.592 \\ & \underset{i=1}{\overset{29}{\mathop \sum }}\,\ln ({{m}_{c}})= & 82.884 \\ & \underset{i=1}{\overset{29}{\mathop \sum }}\,\ln ({{T}_{i}})\cdot \ln ({{m}_{c}})= & 483.154 \end{align} }[/math]

For least squares, Eqn. (Dalpha) is used to estimate [math]\displaystyle{ \alpha }[/math] :

[math]\displaystyle{ \begin{align} & \widehat{\alpha }= & \frac{483.154-\tfrac{154.141\cdot 82.884}{29}}{902.592-\tfrac{{{(154.141)}^{2}}}{29}} \\ & = & 0.5115 \end{align} }[/math]

The estimate of [math]\displaystyle{ b }[/math] is obtained from Eqn. (Dbi):

[math]\displaystyle{ \begin{align} & \widehat{b}= & {{e}^{\tfrac{1}{29}(82.884-0.5115\cdot 154.141)}} \\ & = & 1.1495 \end{align} }[/math]

Therefore, from Eqn. (duane6):

[math]\displaystyle{ {{\hat{m}}_{c}}=1.1495\cdot {{T}^{0.5115}} }[/math]

Using Eqn. (eq76), the equation for the instantaneous MTBF growth curve is:

[math]\displaystyle{ {{\hat{m}}_{i}}=\frac{1}{1-0.5115}(1.1495){{T}^{0.5115}} }[/math]
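The calculation for this example can be sketched in Python directly from Table 4.4. The sketch assumes that the cumulative time at each failure is the sum of the two test-time columns (which matches the last column of the table) and that the cumulative MTBF is the cumulative time divided by the run number. Variable names are illustrative.

```python
import math

# Test times of the two systems at each failure, from Table 4.4.
test_time_1 = [
    0.2, 1.7, 4.5, 5.8, 17.3, 29.3, 36.5, 46.3, 63.6, 64.4,
    74.3, 106.6, 195.2, 235.1, 248.7, 256.8, 261.1, 299.4, 305.3, 326.9,
    339.2, 366.1, 466.4, 504, 510, 543.2, 635.4, 641.2, 755.8,
]
test_time_2 = [
    2.0, 2.9, 5.2, 9.1, 9.2, 24.1, 61.1, 69.6, 78.1, 85.4,
    93.6, 103, 117, 134.3, 150.2, 164.6, 174.3, 193.2, 234.2, 257.3,
    290.2, 293.1, 316.4, 373.2, 375.1, 386.1, 453.3, 485.8, 573.6,
]

# Cumulative time T_i = Test Time 1 + Test Time 2 at the i-th failure.
cumulative_times = [t1 + t2 for t1, t2 in zip(test_time_1, test_time_2)]
n = len(cumulative_times)

# X_i = ln(T_i); Y_i = ln(m_ci) with m_ci = T_i / i (time / run number).
xs = [math.log(t) for t in cumulative_times]
ys = [math.log(t / i) for i, t in enumerate(cumulative_times, start=1)]

sum_x = sum(xs)
sum_y = sum(ys)
sum_xx = sum(x * x for x in xs)
sum_xy = sum(x * y for x, y in zip(xs, ys))

alpha_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
b_hat = math.exp((sum_y - alpha_hat * sum_x) / n)

print(f"alpha_hat = {alpha_hat:.4f}, b_hat = {b_hat:.4f}")
```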