ReliaSoft's Alternate Ranking Method
Revision as of 21:04, 25 February 2015
This article appears in the Life Data Analysis Reference book.
In probability plotting or rank regression analysis of interval or left censored data, it is difficult to estimate the exact time within the interval at which the failure actually occurs, especially when the intervals overlap. The standard ranking method (SRM) is not applicable to such data; thus, ReliaSoft has formulated a more sophisticated methodology to allow for more accurate probability plotting and regression analysis of data sets with interval or left censored data. This method uses the traditional rank regression method and iteratively improves upon the computed ranks by parametrically recomputing new ranks and the most probable failure time for interval data.
In the traditional method for plotting or rank regression analysis of right censored data, the effect of the exact censoring time is not considered. One example of this can be seen at Parameter_Estimation#Shortfalls_of_the_Rank_Adjustment_Method. The ReliaSoft ranking method can also be used to overcome this shortfall of the standard ranking method.
The following step-by-step example illustrates the ReliaSoft ranking method (RRM), which is an iterative improvement on the standard ranking method (SRM). Although this method is illustrated by the use of the two-parameter Weibull distribution, it can be easily generalized for other models.
Consider the following test data:
Table B.1- The Test Data | |||
Number of Items | Type | Last Inspection | Time |
---|---|---|---|
1 | Exact Failure | | 10 |
1 | Right Censored | | 20 |
2 | Left Censored | 0 | 30 |
2 | Exact Failure | | 40 |
1 | Exact Failure | | 50 |
1 | Right Censored | | 60 |
1 | Left Censored | 0 | 70 |
2 | Interval Failure | 20 | 80 |
1 | Interval Failure | 10 | 85 |
1 | Left Censored | 0 | 100 |
Initial Parameter Estimation
As a preliminary step, we need to provide a crude estimate of the Weibull parameters for this data. To begin, we will extract the exact times-to-failure (10, 40, and 50) and the midpoints of the interval failures. The midpoints are 50 (for the interval of 20 to 80) and 47.5 (for the interval of 10 to 85). Next, we group together the items that have the same failure times, as shown in Table B.2.
Using traditional rank regression, we obtain the initial estimates:
- [math]\displaystyle{ \begin{align} & {{\widehat{\beta }}_{0}}= & 1.91367089 \\ & {{\widehat{\eta }}_{0}}= & 43.91657736 \end{align}\,\! }[/math]
Table B.2- The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures | |||
Number of Items | Type | Last Inspection | Time |
---|---|---|---|
1 | Exact Failure | | 10 |
2 | Exact Failure | | 40 |
1 | Exact Failure | | 47.5 |
3 | Exact Failure | | 50 |
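The grouping in Table B.2 can be sketched in a few lines of Python. This is only an illustration: the tuple layouts and the names `exact_failures`, `interval_failures` and `table_b2` are assumptions made here, not part of the original method description.

```python
from collections import Counter

# Exact failures from Table B.1 as (number of items, time).
exact_failures = [(1, 10.0), (2, 40.0), (1, 50.0)]
# Interval failures from Table B.1 as (number of items, last inspection, time).
interval_failures = [(2, 20.0, 80.0), (1, 10.0, 85.0)]

grouped = Counter()
for count, time in exact_failures:
    grouped[time] += count
for count, last_inspection, time in interval_failures:
    # Crude first guess: treat the plain midpoint as an exact failure time.
    grouped[(last_inspection + time) / 2.0] += count

# Sorted (time, number of items) pairs, matching the rows of Table B.2.
table_b2 = sorted(grouped.items())
print(table_b2)  # [(10.0, 1), (40.0, 2), (47.5, 1), (50.0, 3)]
```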
Step 1
For all intervals, we obtain a weighted midpoint using:
- [math]\displaystyle{ \begin{align} {{{\hat{t}}}_{m}}\left( \hat{\beta },\hat{\eta } \right)= & \frac{\int_{LI}^{TF}t\text{ }f(t;\hat{\beta },\hat{\eta })dt}{\int_{LI}^{TF}f(t;\hat{\beta },\hat{\eta })dt}, \\ = & \frac{\int_{LI}^{TF}t\tfrac{{\hat{\beta }}}{{\hat{\eta }}}{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{\hat{\beta }-1}}{{e}^{-{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{{\hat{\beta }}}}}}dt}{\int_{LI}^{TF}\tfrac{{\hat{\beta }}}{{\hat{\eta }}}{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{\hat{\beta }-1}}{{e}^{-{{\left( \tfrac{t}{{\hat{\eta }}} \right)}^{{\hat{\beta }}}}}}dt} \end{align}\,\! }[/math]
This transforms our data into the format in Table B.3.
Table B.3- The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures, Based upon the Parameters [math]\displaystyle{ \beta\,\! }[/math] and [math]\displaystyle{ \eta\,\! }[/math]. | ||||
Number of Items | Type | Last Inspection | Time | Weighted "Midpoint" |
---|---|---|---|---|
1 | Exact Failure | | 10 | |
2 | Exact Failure | | 40 | |
1 | Exact Failure | | 50 | |
2 | Interval Failure | 20 | 80 | 42.837 |
1 | Interval Failure | 10 | 85 | 39.169 |
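The weighted midpoint of Step 1 is the mean of the fitted Weibull distribution conditioned on the interval. A minimal sketch follows, assuming composite Simpson's rule for the two integrals; the source does not say which quadrature is used, and the function names are hypothetical.

```python
import math

def weibull_pdf(t, beta, eta):
    """Two-parameter Weibull density f(t; beta, eta)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0) * math.exp(-((t / eta) ** beta))

def simpson(func, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def weighted_midpoint(last_inspection, time, beta, eta):
    """Ratio of the integral of t*f(t) to the integral of f(t) over the interval."""
    numerator = simpson(lambda t: t * weibull_pdf(t, beta, eta), last_inspection, time)
    denominator = simpson(lambda t: weibull_pdf(t, beta, eta), last_inspection, time)
    return numerator / denominator

# Initial estimates quoted in the text.
beta0, eta0 = 1.91367089, 43.91657736
m_20_80 = weighted_midpoint(20.0, 80.0, beta0, eta0)
m_10_85 = weighted_midpoint(10.0, 85.0, beta0, eta0)
print(round(m_20_80, 3), round(m_10_85, 3))
```

With the initial estimates, the two values should come out close to the weighted midpoints listed in Table B.3.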
Step 2
Now we arrange the data as in Table B.4.
Table B.4- The Union of Exact Times-to-Failure with the "Midpoint" of the Interval Failures, in Ascending Order. | |
Number of Items | Time |
---|---|
1 | 10 |
1 | 39.169 |
2 | 40 |
2 | 42.837 |
1 | 50 |
Step 3
We now consider the left and right censored data, as in Table B.5.
Table B.5- Computation of Increments in a Matrix Format for Computing a Revised Mean Order Number. | ||||||
Number of items | Time of Failure | 2 Left Censored t = 30 | 1 Left Censored t = 70 | 1 Left Censored t = 100 | 1 Right Censored t = 20 | 1 Right Censored t = 60 |
---|---|---|---|---|---|---|
1 | 10 | [math]\displaystyle{ 2 \frac{\int_0^{10} f_0(t)dt}{F_0 (30)-F_0 (0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_0^{10} f_0 (t)dt}{F_0(70)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_0^{10} f_0(t)dt}{F_0(100)-F_0(0)}\,\! }[/math] | 0 | 0 |
1 | 39.169 | [math]\displaystyle{ 2 \frac{\int_{10}^{30} f_0(t)dt}{F_0(30)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{10}^{39.169} f_0(t)dt}{F_0(70)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{10}^{39.169} f_0(t)dt}{F_0(100)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{20}^{39.169} f_0(t)dt}{F_0(\infty)-F_0(20)}\,\! }[/math] | 0 |
2 | 40 | 0 | [math]\displaystyle{ \frac{\int_{39.169}^{40} f_0(t)dt}{F_0(70)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{39.169}^{40} f_0(t)dt}{F_0(100)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{39.169}^{40} f_0(t)dt}{F_0(\infty)-F_0(20)}\,\! }[/math] | 0 |
2 | 42.837 | 0 | [math]\displaystyle{ \frac{\int_{40}^{42.837} f_0(t)dt}{F_0(70)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{40}^{42.837} f_0(t)dt}{F_0(100)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{40}^{42.837} f_0(t)dt}{F_0(\infty)-F_0(20)}\,\! }[/math] | 0 |
1 | 50 | 0 | [math]\displaystyle{ \frac{\int_{42.837}^{50} f_0(t)dt}{F_0(70)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{42.837}^{50} f_0(t)dt}{F_0(100)-F_0(0)}\,\! }[/math] | [math]\displaystyle{ \frac{\int_{42.837}^{50} f_0(t)dt}{F_0(\infty)-F_0(20)}\,\! }[/math] | 0 |
In general, for left censored data:
- • The increment term for [math]\displaystyle{ n\,\! }[/math] left censored items at time [math]\displaystyle{ ={{t}_{0}},\,\! }[/math] with a time-to-failure of [math]\displaystyle{ {{t}_{i}}\,\! }[/math] when [math]\displaystyle{ {{t}_{0}}\le {{t}_{i-1}}\,\! }[/math] is zero.
- • When [math]\displaystyle{ {{t}_{0}}\gt {{t}_{i-1}},\,\! }[/math] the contribution is:
- [math]\displaystyle{ \frac{n}{{{F}_{0}}({{t}_{0}})-{{F}_{0}}(0)}\underset{{{t}_{i-1}}}{\overset{MIN({{t}_{i}},{{t}_{0}})}{\mathop \int }}\,{{f}_{0}}\left( t \right)dt\,\! }[/math]
- or:
- [math]\displaystyle{ n\frac{{{F}_{0}}(MIN({{t}_{i}},{{t}_{0}}))-{{F}_{0}}({{t}_{i-1}})}{{{F}_{0}}({{t}_{0}})-{{F}_{0}}(0)}\,\! }[/math]
where [math]\displaystyle{ {{t}_{i-1}}\,\! }[/math] is the time-to-failure previous to the [math]\displaystyle{ {{t}_{i}}\,\! }[/math] time-to-failure (taken as 0 for the first time-to-failure) and [math]\displaystyle{ n\,\! }[/math] is the number of left censored units in the group censored at [math]\displaystyle{ {{t}_{0}}\,\! }[/math].
In general, for right censored data:
- • The increment term for [math]\displaystyle{ n\,\! }[/math] right censored items at time [math]\displaystyle{ ={{t}_{0}},\,\! }[/math] with a time-to-failure of [math]\displaystyle{ {{t}_{i}}\,\! }[/math], when [math]\displaystyle{ {{t}_{0}}\ge {{t}_{i}}\,\! }[/math] is zero.
- • When [math]\displaystyle{ {{t}_{0}}\lt {{t}_{i}},\,\! }[/math] the contribution is:
- [math]\displaystyle{ \frac{n}{{{F}_{0}}(\infty )-{{F}_{0}}({{t}_{0}})}\underset{MAX({{t}_{0}},{{t}_{i-1}})}{\overset{{{t}_{i}}}{\mathop \int }}\,{{f}_{0}}\left( t \right)dt\,\! }[/math]
- or:
- [math]\displaystyle{ n\frac{{{F}_{0}}({{t}_{i}})-{{F}_{0}}(MAX({{t}_{0}},{{t}_{i-1}}))}{{{F}_{0}}(\infty )-{{F}_{0}}({{t}_{0}})}\,\! }[/math]
where [math]\displaystyle{ {{t}_{i-1}}\,\! }[/math] is the time-to-failure previous to the [math]\displaystyle{ {{t}_{i}}\,\! }[/math] time-to-failure (taken as 0 for the first time-to-failure) and [math]\displaystyle{ n\,\! }[/math] is the number of right censored units in the group censored at [math]\displaystyle{ {{t}_{0}}\,\! }[/math].
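The two general increment rules can be written as small functions of the Weibull CDF. The sketch below uses the closed-form versions of the formulas (differences of [math]\displaystyle{ F_0\,\! }[/math]) with the initial estimates; the function names and argument order are assumptions made for illustration.

```python
import math

def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull CDF; F(0) = 0 and F(infinity) = 1."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def left_increment(n, t0, t_prev, t_i, beta, eta):
    """Increment from n units left censored at t0 for the failure time t_i."""
    if t0 <= t_prev:
        return 0.0
    numerator = weibull_cdf(min(t_i, t0), beta, eta) - weibull_cdf(t_prev, beta, eta)
    return n * numerator / (weibull_cdf(t0, beta, eta) - weibull_cdf(0.0, beta, eta))

def right_increment(n, t0, t_prev, t_i, beta, eta):
    """Increment from n units right censored at t0 for the failure time t_i."""
    if t0 >= t_i:
        return 0.0
    numerator = weibull_cdf(t_i, beta, eta) - weibull_cdf(max(t0, t_prev), beta, eta)
    return n * numerator / (1.0 - weibull_cdf(t0, beta, eta))

beta0, eta0 = 1.91367089, 43.91657736
# Selected cells of the increment matrix (t_prev = 0 for the first failure time).
cell_left_30_row_10 = left_increment(2, 30.0, 0.0, 10.0, beta0, eta0)
cell_left_30_row_39 = left_increment(2, 30.0, 10.0, 39.169, beta0, eta0)
cell_right_20_row_39 = right_increment(1, 20.0, 10.0, 39.169, beta0, eta0)
cell_right_60_row_50 = right_increment(1, 60.0, 42.837, 50.0, beta0, eta0)
print(cell_left_30_row_10, cell_left_30_row_39, cell_right_20_row_39, cell_right_60_row_50)
```

Note that the two cells for the group left censored at 30 add up to 2, the size of that group, because both of its units are fully allocated among failure times up to 30.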
Step 4
Sum up the increments (horizontally in rows), as in Table B.6.
Table B.6- Increments Solved Numerically, Along with the Sum of Each Row. | |||||||
Number of items | Time of Failure | 2 Left Censored t=30 | 1 Left Censored t=70 | 1 Left Censored t=100 | 1 Right Censored t=20 | 1 Right Censored t=60 | Sum of row(increment) |
---|---|---|---|---|---|---|---|
1 | 10 | 0.299065 | 0.062673 | 0.057673 | 0 | 0 | 0.419411 |
1 | 39.169 | 1.700935 | 0.542213 | 0.498959 | 0.440887 | 0 | 3.182994 |
2 | 40 | 0 | 0.015892 | 0.014625 | 0.018113 | 0 | 0.048630 |
2 | 42.837 | 0 | 0.052486 | 0.048299 | 0.059821 | 0 | 0.160606 |
1 | 50 | 0 | 0.118151 | 0.108726 | 0.134663 | 0 | 0.361540 |
Step 5
Compute new mean order numbers (MON), as shown in Table B.7, using the increments obtained in Table B.6: each MON is the previous MON plus the number of items at the current time plus the current increment, with the MON before the first failure taken as zero.
Table B.7- Mean Order Numbers (MON) | |||
Number of items | Time of Failure | Sum of row(increment) | Mean Order Number |
---|---|---|---|
1 | 10 | 0.419411 | 1.419411 |
1 | 39.169 | 3.182994 | 5.602405 |
2 | 40 | 0.048630 | 7.651035 |
2 | 42.837 | 0.160606 | 9.811641 |
1 | 50 | 0.361540 | 11.173181 |
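The running sum of Step 5 can be checked directly from the row sums. This is a small sketch with hypothetical variable names; the numbers are copied from Table B.7.

```python
# Number of items and summed increment for each failure time, from Table B.7.
counts = [1, 1, 2, 2, 1]
increments = [0.419411, 3.182994, 0.048630, 0.160606, 0.361540]

mean_order_numbers = []
previous_mon = 0.0  # MON before the first failure
for n_items, increment in zip(counts, increments):
    # New MON = previous MON + number of items + current increment.
    previous_mon = previous_mon + n_items + increment
    mean_order_numbers.append(previous_mon)

print([round(mon, 6) for mon in mean_order_numbers])
```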
Step 6
Compute the median ranks based on these new MONs as shown in Table B.8.
Table B.8- Mean Order Numbers with Their Ranks for a Sample Size of 13 Units. | ||
Time | MON | Ranks |
---|---|---|
10 | 1.419411 | 0.0826889 |
39.169 | 5.602405 | 0.3952894 |
40 | 7.651035 | 0.5487781 |
42.837 | 9.811641 | 0.7106217 |
50 | 11.173181 | 0.8124983 |
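The text does not state how a median rank is obtained for a non-integer MON. One common choice, assumed here, is the median of a Beta(MON, N - MON + 1) distribution with N = 13. The sketch below evaluates the regularized incomplete beta function by its power series and inverts it by bisection; the function names are hypothetical.

```python
import math

def reg_inc_beta(x, a, b):
    """Regularized incomplete beta I_x(a, b) from its power series, for 0 < x < 1."""
    front = math.exp(a * math.log(x) + b * math.log1p(-x) + math.lgamma(a + b)
                     - math.lgamma(a) - math.lgamma(b) - math.log(a))
    term = total = 1.0
    k = 0
    while term > 1e-16 * total and k < 100000:
        term *= (a + b + k) / (a + 1.0 + k) * x
        total += term
        k += 1
    return front * total

def median_rank(mon, sample_size):
    """Median of Beta(mon, sample_size - mon + 1), found by bisection on (0, 1)."""
    low, high = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (low + high)
        if reg_inc_beta(mid, mon, sample_size - mon + 1.0) < 0.5:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

mons = [1.419411, 5.602405, 7.651035, 9.811641, 11.173181]
ranks = [median_rank(mon, 13) for mon in mons]
print([round(r, 7) for r in ranks])
```

Under this assumption the computed values should agree closely with the ranks used in the regression of Step 7.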
Step 7
Compute new [math]\displaystyle{ \beta \,\! }[/math] and [math]\displaystyle{ \eta ,\,\! }[/math] using standard rank regression and based upon the data as shown in Table B.9.
Table B.9- Times and Their Ranks for Rank Regression | |
Time | Ranks |
---|---|
10 | 0.0826889 |
39.169 | 0.3952894 |
40 | 0.5487781 |
42.837 | 0.7106217 |
50 | 0.8124983 |
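As a rough check of Step 7, the sketch below performs an unweighted least squares fit of [math]\displaystyle{ \ln t\,\! }[/math] on [math]\displaystyle{ \ln(-\ln(1-F))\,\! }[/math] (rank regression on X) using the five points above. Treating each group as a single unweighted point is an assumption made here; the variable names are hypothetical.

```python
import math

times = [10.0, 39.169, 40.0, 42.837, 50.0]
ranks = [0.0826889, 0.3952894, 0.5487781, 0.7106217, 0.8124983]

# Linearized Weibull: ln(t) = ln(eta) + (1/beta) * ln(-ln(1 - F)).
x = [math.log(t) for t in times]
y = [math.log(-math.log(1.0 - f)) for f in ranks]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_yy = sum((yi - y_bar) ** 2 for yi in y)

slope = s_xy / s_yy                       # regression of x on y; slope = 1/beta
beta_new = 1.0 / slope
eta_new = math.exp(x_bar - slope * y_bar)
print(round(beta_new, 4), round(eta_new, 3))
```

The result should be close to the iteration 1 row of Table B.10.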
Step 8
Return and repeat the process from Step 1 until an acceptable convergence is reached on the parameters (i.e., the parameter values stabilize).
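Steps 1 through 8 can be combined into one loop. The following is a minimal sketch, not ReliaSoft's implementation: it assumes Simpson's rule for the weighted midpoints, exact beta-distribution median ranks, an unweighted rank regression on X, and a simple absolute tolerance as the stopping rule. All function and variable names are hypothetical.

```python
import math

def cdf(t, beta, eta):
    return 1.0 - math.exp(-((t / eta) ** beta))

def weighted_midpoint(lo, hi, beta, eta, n=2000):
    """Step 1: conditional mean on (lo, hi) by Simpson's rule (h/3 cancels in the ratio)."""
    h = (hi - lo) / n
    num = den = 0.0
    for k in range(n + 1):
        t = lo + k * h
        w = 1 if k in (0, n) else (4 if k % 2 else 2)
        f = (beta / eta) * (t / eta) ** (beta - 1.0) * math.exp(-((t / eta) ** beta))
        num += w * t * f
        den += w * f
    return num / den

def inc_beta(x, a, b):
    """Regularized incomplete beta I_x(a, b) from its power series, 0 < x < 1."""
    front = math.exp(a * math.log(x) + b * math.log1p(-x) + math.lgamma(a + b)
                     - math.lgamma(a) - math.lgamma(b) - math.log(a))
    term = total = 1.0
    k = 0
    while term > 1e-16 * total and k < 100000:
        term *= (a + b + k) / (a + 1.0 + k) * x
        total += term
        k += 1
    return front * total

def median_rank(mon, size):
    """Step 6 (assumed form): median of Beta(mon, size - mon + 1) by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if inc_beta(mid, mon, size - mon + 1.0) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rrm_pass(beta, eta, exact, intervals, left, right):
    """One pass of Steps 1-7; returns the updated (beta, eta)."""
    groups = {}
    for n, t in exact:
        groups[t] = groups.get(t, 0) + n
    for n, lo, hi in intervals:
        t = weighted_midpoint(lo, hi, beta, eta)
        groups[t] = groups.get(t, 0) + n
    size = sum(groups.values()) + sum(n for n, _ in left) + sum(n for n, _ in right)
    mon, t_prev, xs, ys = 0.0, 0.0, [], []
    for t_i, n_i in sorted(groups.items()):
        inc = 0.0
        for n, t0 in left:       # Step 3, left censored groups
            if t0 > t_prev:
                inc += n * (cdf(min(t_i, t0), beta, eta) - cdf(t_prev, beta, eta)) / cdf(t0, beta, eta)
        for n, t0 in right:      # Step 3, right censored groups
            if t0 < t_i:
                inc += n * (cdf(t_i, beta, eta) - cdf(max(t0, t_prev), beta, eta)) / (1.0 - cdf(t0, beta, eta))
        mon += n_i + inc         # Steps 4 and 5
        xs.append(math.log(t_i))
        ys.append(math.log(-math.log(1.0 - median_rank(mon, size))))
        t_prev = t_i
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((y - y_bar) ** 2 for y in ys))   # Step 7: regression on X
    return 1.0 / slope, math.exp(x_bar - slope * y_bar)

# Data of Table B.1 and the initial estimates from the text.
exact = [(1, 10.0), (2, 40.0), (1, 50.0)]
intervals = [(2, 20.0, 80.0), (1, 10.0, 85.0)]
left = [(2, 30.0), (1, 70.0), (1, 100.0)]
right = [(1, 20.0), (1, 60.0)]
beta, eta = 1.91367089, 43.91657736
history = []
for _ in range(50):              # Step 8: repeat until the parameters stabilize
    new_beta, new_eta = rrm_pass(beta, eta, exact, intervals, left, right)
    history.append((new_beta, new_eta))
    done = abs(new_beta - beta) < 1e-6 and abs(new_eta - eta) < 1e-6
    beta, eta = new_beta, new_eta
    if done:
        break
print(history[0], history[-1])
```

The tolerance of 1e-6 and the cap of 50 passes are arbitrary choices for this sketch; the source only requires that the parameter values stabilize.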
Results
The results of the first five iterations are shown in Table B.10.
Table B.10- The Parameters After the First Five Iterations | ||
Iteration | [math]\displaystyle{ \beta\,\! }[/math] | [math]\displaystyle{ \eta\,\! }[/math] |
---|---|---|
1 | 1.845638 | 42.576422 |
2 | 1.830621 | 42.039743 |
3 | 1.828010 | 41.830615 |
4 | 1.828030 | 41.749708 |
5 | 1.828383 | 41.717990 |
Using Weibull++ with rank regression on X yields:
- [math]\displaystyle{ {{\widehat{\beta }}_{RRX}}=1.82890,\text{ }{{\widehat{\eta }}_{RRX}}=41.69774\,\! }[/math]
The direct MLE solution yields:
- [math]\displaystyle{ {{\widehat{\beta }}_{MLE}}=2.10432,\text{ }{{\widehat{\eta }}_{MLE}}=42.31535\,\! }[/math]