Two Level Factorial Experiments: Difference between revisions

From ReliaWiki









Revision as of 18:22, 20 April 2016

New format available! This reference is now available in a new format that offers faster page load, improved display for calculations and images, more targeted search and the latest content available as a PDF. As of September 2023, this Reliawiki page will not continue to be updated. Please update all links and bookmarks to the latest reference at help.reliasoft.com/reference/experiment_design_and_analysis

Chapter 8: Two Level Factorial Experiments



Available Software:
Weibull++


More Resources:
DOE examples


Two level factorial experiments are factorial experiments in which each factor is investigated at only two levels. The early stages of experimentation usually involve the investigation of a large number of potential factors to discover the "vital few" factors. Two level factorial experiments are used during these stages to quickly filter out unwanted effects so that attention can then be focused on the important ones.

[math]2^k[/math] Designs

Factorial experiments in which all combinations of the levels of the factors are run are usually referred to as full factorial experiments. Full factorial two level experiments are also referred to as [math]2^k[/math] designs, where [math]k[/math] denotes the number of factors being investigated in the experiment. In DOE++, these designs are referred to as 2 Level Factorial Designs as shown in the figure below.

Selection of full factorial experiments with two levels in DOE++.


A full factorial two level design with [math]k[/math] factors requires [math]2^k[/math] runs for a single replicate. For example, a two level experiment with three factors will require [math]2 \times 2 \times 2 = 2^3 = 8[/math] runs. The choice of the two levels of factors used in two level experiments depends on the factor; some factors naturally have two levels. For example, if gender is a factor, then male and female are the two levels. For other factors, the limits of the range of interest are usually used. For example, if temperature is a factor that varies from 45°C to 90°C, then the two levels used in the [math]2^k[/math] design for this factor would be 45°C and 90°C.

The two levels of the factor in the [math]2^k[/math] design are usually represented as −1 (for the first level) and 1 (for the second level). Note that this representation is reversed from the coding used in General Full Factorial Designs for the indicator variables that represent two level factors in ANOVA models. For ANOVA models, the first level of the factor was represented using a value of 1 for the indicator variable, while the second level was represented using a value of −1. For details on the notation used for two level experiments refer to Notation.


The [math]2^2[/math] Design

The simplest of the two level factorial experiments is the [math]2^2[/math] design, where two factors (say factor A and factor B) are investigated at two levels. A single replicate of this design will require four runs ([math]2^2 = 2 \times 2 = 4[/math]). The effects investigated by this design are the two main effects, A and B, and the interaction effect AB. The treatments for this design are shown in figure (a) below. In figure (a), letters are used to represent the treatments. The presence of a letter indicates the high level of the corresponding factor and the absence indicates the low level. For example, (1) represents the treatment combination where all factors involved are at the low level, or the level represented by −1; a represents the treatment combination where factor A is at the high level, or the level of 1, while the remaining factors (in this case, factor B) are at the low level, or the level of −1. Similarly, b represents the treatment combination where factor B is at the high level, or the level of 1, while factor A is at the low level, and ab represents the treatment combination where factors A and B are both at the high level, or the level of 1. Figure (b) below shows the design matrix for the [math]2^2[/math] design. It can be noted that the sum of the terms resulting from the product of any two columns of the design matrix is zero. As a result, the [math]2^2[/math] design is an orthogonal design. In fact, all [math]2^k[/math] designs are orthogonal designs. This property of the [math]2^k[/math] designs offers a great advantage in the analysis because of the simplifications that result from orthogonality. These simplifications are explained later on in this chapter. The [math]2^2[/math] design can also be represented geometrically using a square with the four treatment combinations lying at the four corners, as shown in figure (c) below.


The [math]\displaystyle{ 2^2\,\! }[/math] design. Figure (a) displays the experiment design, (b) displays the design matrix and (c) displays the geometric representation for the design. In Figure (b), the column names I, A, B and AB are used. Column I represents the intercept term. Columns A and B represent the respective factor settings. Column AB represents the interaction and is the product of columns A and B.


The [math]2^3[/math] Design

The [math]2^3[/math] design is a two level factorial experiment design with three factors (say factors A, B and C). This design tests three ([math]k = 3[/math]) main effects, A, B and C; three ([math]\binom{k}{2} = \binom{3}{2} = 3[/math]) two factor interaction effects, AB, BC, AC; and one ([math]\binom{k}{3} = \binom{3}{3} = 1[/math]) three factor interaction effect, ABC. The design requires eight runs per replicate. The eight treatment combinations corresponding to these runs are (1), a, b, ab, c, ac, bc and abc. Note that the treatment combinations are written in such an order that factors are introduced one by one, with each new factor being combined with the preceding terms. This order of writing the treatments is called the standard order or Yates' order. The [math]2^3[/math] design is shown in figure (a) below. The design matrix for the [math]2^3[/math] design is shown in figure (b). The design matrix can be constructed by following the standard order for the treatment combinations to obtain the columns for the main effects and then multiplying the main effects columns to obtain the interaction columns.
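The construction described above can be sketched in a few lines of Python. This is only an illustration, not how DOE++ builds its designs: the function names are hypothetical, and the columns are ordered I, A, B, AB, C, AC, BC, ABC. The sketch lists the treatment labels in standard order, multiplies main effect columns to get the interaction columns, and checks that every pair of distinct columns is orthogonal.

```python
def standard_order_levels(k):
    """Coded levels (-1/+1) of k factors in standard (Yates') order."""
    # Bit j of the run index tells whether factor j is at its high level,
    # so the first factor alternates fastest: (1), a, b, ab, c, ...
    return [[1 if (run >> j) & 1 else -1 for j in range(k)]
            for run in range(2 ** k)]


def treatment_label(levels, names="abcdefgh"):
    """A letter is present when its factor is high; '(1)' means all low."""
    label = "".join(names[j] for j, v in enumerate(levels) if v == 1)
    return label or "(1)"


def design_matrix(k):
    """Design matrix with columns I, A, B, AB, C, AC, BC, ABC, ..."""
    matrix = []
    for levels in standard_order_levels(k):
        row = []
        for mask in range(2 ** k):
            # Each column is the product of the main effect columns in `mask`.
            value = 1
            for j in range(k):
                if (mask >> j) & 1:
                    value *= levels[j]
            row.append(value)
        matrix.append(row)
    return matrix


labels = [treatment_label(lv) for lv in standard_order_levels(3)]
X = design_matrix(3)
print(labels)  # ['(1)', 'a', 'b', 'ab', 'c', 'ac', 'bc', 'abc']

# Orthogonality: the dot product of any two distinct columns is zero.
columns = list(zip(*X))
orthogonal = all(
    sum(u * v for u, v in zip(columns[i], columns[j])) == 0
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
)
print(orthogonal)  # True
```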


The [math]\displaystyle{ 2^3\,\! }[/math] design. Figure (a) shows the experiment design and (b) shows the design matrix.
Geometric representation of the [math]\displaystyle{ 2^3\,\! }[/math] design.


The [math]2^3[/math] design can also be represented geometrically using a cube with the eight treatment combinations lying at the eight corners, as shown in the figure above.

Analysis of [math]2^k[/math] Designs

The [math]2^k[/math] designs are a special category of the factorial experiments where all the factors are at two levels. The fact that these designs contain factors at only two levels and are orthogonal greatly simplifies their analysis even when the number of factors is large. The use of [math]2^k[/math] designs in investigating a large number of factors calls for a revision of the notation used previously for the ANOVA models. The case for revised notation is made stronger by the fact that the ANOVA and multiple linear regression models are identical for [math]2^k[/math] designs because all factors are only at two levels. Therefore, the notation of the regression models is applied to the ANOVA models for these designs, as explained next.

Notation

Based on the notation used in General Full Factorial Designs, the ANOVA model for a two level factorial experiment with three factors would be as follows:

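[math]Y = \mu + \tau_1 x_1 + \delta_1 x_2 + (\tau\delta)_{11} x_1 x_2 + \gamma_1 x_3 + (\tau\gamma)_{11} x_1 x_3 + (\delta\gamma)_{11} x_2 x_3 + (\tau\delta\gamma)_{111} x_1 x_2 x_3 + \epsilon[/math]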

where:

[math]\mu[/math] represents the overall mean
[math]\tau_1[/math] represents the independent effect of the first factor (factor A) out of the two effects [math]\tau_1[/math] and [math]\tau_2[/math]
[math]\delta_1[/math] represents the independent effect of the second factor (factor B) out of the two effects [math]\delta_1[/math] and [math]\delta_2[/math]
[math](\tau\delta)_{11}[/math] represents the independent effect of the interaction AB out of the other interaction effects
[math]\gamma_1[/math] represents the effect of the third factor (factor C) out of the two effects [math]\gamma_1[/math] and [math]\gamma_2[/math]
[math](\tau\gamma)_{11}[/math] represents the effect of the interaction AC out of the other interaction effects
[math](\delta\gamma)_{11}[/math] represents the effect of the interaction BC out of the other interaction effects
[math](\tau\delta\gamma)_{111}[/math] represents the effect of the interaction ABC out of the other interaction effects

and [math]\epsilon[/math] is the random error term.


The notation for a linear regression model having three predictor variables with interactions is:

[math]Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_3 x_3 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 + \epsilon[/math]


The notation for the regression model is much more convenient, especially when a large number of higher order interactions are present. In two level experiments, the ANOVA model requires only one indicator variable to represent each factor for both qualitative and quantitative factors. Therefore, the notation for the multiple linear regression model can be applied to the ANOVA model of an experiment that has all the factors at two levels. For example, for the experiment of the ANOVA model given above, [math]\beta_0[/math] can represent the overall mean instead of [math]\mu[/math], and [math]\beta_1[/math] can represent the independent effect, [math]\tau_1[/math], of factor A. Other main effects can be represented in a similar manner. The notation for the interaction effects is much simpler (e.g., [math]\beta_{123}[/math] can be used to represent the three factor interaction effect, [math](\tau\delta\gamma)_{111}[/math]).

As mentioned earlier, it is important to note that the coding for the indicator variables for the ANOVA models of two level factorial experiments is reversed from the coding followed in General Full Factorial Designs. Here −1 represents the first level of the factor while 1 represents the second level. This is because for a two level factor a single variable is needed to represent the factor for both qualitative and quantitative factors. For quantitative factors, using −1 for the first level (which is the low level) and 1 for the second level (which is the high level) keeps the coding consistent with the numerical value of the factors. The change between the two coding schemes does not affect the analysis except that the signs of the estimated effect coefficients will be reversed (i.e., the numerical values of [math]\hat{\tau}_1[/math], obtained based on the coding of General Full Factorial Designs, and [math]\hat{\beta}_1[/math], obtained based on the new coding, will be the same but their signs will be opposite).


Factor A Coding (two level factor)


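Previous Coding | Coding for [math]2^k[/math] Designs
Effect [math]\tau_1[/math] : [math]x_1 = 1[/math] | Effect [math]\tau_1[/math] (or [math]-\beta_1[/math]) : [math]x_1 = -1[/math]
Effect [math]\tau_2[/math] : [math]x_1 = -1[/math] | Effect [math]\tau_2[/math] (or [math]\beta_1[/math]) : [math]x_1 = 1[/math]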


In summary, the ANOVA model for the experiments with all factors at two levels is different from the ANOVA models for other experiments in terms of the notation in the following two ways:

• The notation of the regression models is used for the effect coefficients.
• The coding of the indicator variables is reversed.

Special Features

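Consider the design matrix, X, for the [math]2^3[/math] design discussed above. The [math](X'X)^{-1}[/math] matrix is:


[math](X'X)^{-1} = \begin{bmatrix} 0.125 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.125 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0.125 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0.125 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0.125 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0.125 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0.125 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.125 \end{bmatrix}[/math]


Notice that, due to the orthogonal design of the X matrix, [math](X'X)^{-1}[/math] has been simplified to a diagonal matrix which can be written as:


[math](X'X)^{-1} = 0.125 \cdot I = \frac{1}{8} I = \frac{1}{2^3} I[/math]


where I represents the identity matrix of the same order as the design matrix, X. Since there are eight observations per replicate of the [math]2^3[/math] design, the [math](X'X)^{-1}[/math] matrix for [math]m[/math] replicates of this design can be written as:


[math](X'X)^{-1} = \frac{1}{2^3 \cdot m} I[/math]


The [math](X'X)^{-1}[/math] matrix for any [math]2^k[/math] design can now be written as:


[math](X'X)^{-1} = \frac{1}{2^k \cdot m} I[/math]


Then the variance-covariance matrix for the [math]2^k[/math] design is:


[math]C = \hat{\sigma}^2 (X'X)^{-1} = MS_E \cdot (X'X)^{-1} = \frac{MS_E}{2^k \cdot m} I[/math]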


Note that the variance-covariance matrix for the [math]2^k[/math] design is also a diagonal matrix. Therefore, the estimated effect coefficients ([math]\beta_1[/math], [math]\beta_2[/math], [math]\beta_{12}[/math], etc.) for these designs are uncorrelated. This implies that the terms in the [math]2^k[/math] design (main effects, interactions) are independent of each other. Consequently, the extra sum of squares for each of the terms in these designs is independent of the sequence of terms in the model, and also independent of the presence of other terms in the model. As a result, the sequential and partial sums of squares for the terms are identical for these designs and will always add up to the model sum of squares. Multicollinearity is also not an issue for these designs.

It can also be noted from the equation given above that, in addition to the C matrix being diagonal, all diagonal elements of the C matrix are identical. This means that the variance (or its square root, the standard error) of every estimated effect coefficient is the same. The standard error, [math]se(\hat{\beta}_j)[/math], for all the coefficients is:


[math]se(\hat{\beta}_j) = \sqrt{C_{jj}} = \sqrt{\frac{MS_E}{2^k \cdot m}} \quad \text{for all } j[/math]
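As a quick numerical check of these properties, the following Python sketch builds the design matrix for two replicates of a three factor design and confirms that X'X has [math]2^k \cdot m[/math] on the diagonal and zeros elsewhere, so that its inverse is the identity matrix divided by [math]2^k \cdot m[/math]. The helper name is hypothetical. The error mean square used for the standard error (147.5 with 8 degrees of freedom) is borrowed from the brake drum example analyzed later in this chapter.

```python
import math


def design_matrix(k, m=1):
    """2^k design matrix in standard order, stacked for m replicates."""
    rows = []
    for run in range(2 ** k):
        levels = [1 if (run >> j) & 1 else -1 for j in range(k)]
        row = []
        for mask in range(2 ** k):
            value = 1
            for j in range(k):
                if (mask >> j) & 1:
                    value *= levels[j]
            row.append(value)
        rows.append(row)
    return rows * m  # the same 2^k runs repeated for each replicate


k, m = 3, 2
X = design_matrix(k, m)
p = len(X[0])

# X'X computed entry by entry: diagonal should be 2^k * m, off-diagonal zero.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
is_scaled_identity = all(
    xtx[i][j] == (2 ** k * m if i == j else 0)
    for i in range(p)
    for j in range(p)
)
print(is_scaled_identity)  # True

# Common standard error of every effect coefficient.
ms_e = 147.5 / 8  # error mean square from the brake drum example
se = math.sqrt(ms_e / (2 ** k * m))
print(round(se, 4))  # 1.0735
```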


This property is used to construct the normal probability plot of effects in [math]2^k[/math] designs and identify significant effects using graphical techniques. For details on the normal probability plot of effects in DOE++, refer to Normal Probability Plot of Effects.

Example

To illustrate the analysis of a full factorial [math]2^k[/math] design, consider a three factor experiment to investigate the effect of honing pressure, number of strokes and cycle time on the surface finish of automobile brake drums. Each of these factors is investigated at two levels. The honing pressure is investigated at levels of 200 psi and 400 psi, the number of strokes used is 3 and 5, and the two levels of the cycle time are 3 and 5 seconds. The design for this experiment is set up in DOE++ as shown in the first two following figures. It is decided to run two replicates for this experiment. The surface finish data collected from each run (using randomization) and the complete design are shown in the third following figure. The analysis of the experiment data is explained next.


Design properties for the experiment in the example.


Design summary for the experiment in the example.


Experiment design for the example to investigate the surface finish of automobile brake drums.


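The applicable model using the notation for [math]2^k[/math] designs is:


[math]Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \beta_3 x_3 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 + \epsilon[/math]


where the indicator variable, [math]x_1[/math], represents factor A (honing pressure); [math]x_1 = -1[/math] represents the low level of 200 psi and [math]x_1 = 1[/math] represents the high level of 400 psi. Similarly, [math]x_2[/math] and [math]x_3[/math] represent factors B (number of strokes) and C (cycle time), respectively. [math]\beta_0[/math] is the overall mean, while [math]\beta_1[/math], [math]\beta_2[/math] and [math]\beta_3[/math] are the effect coefficients for the main effects of factors A, B and C, respectively. [math]\beta_{12}[/math], [math]\beta_{13}[/math] and [math]\beta_{23}[/math] are the effect coefficients for the AB, AC and BC interactions, while [math]\beta_{123}[/math] represents the ABC interaction.


If the subscripts for the run ([math]i[/math]; [math]i = 1[/math] to [math]8[/math]) and replicates ([math]j[/math]; [math]j = 1, 2[/math]) are included, then the model can be written as:


[math]Y_{ij} = \beta_0 + \beta_1 x_{ij1} + \beta_2 x_{ij2} + \beta_{12} x_{ij1} x_{ij2} + \beta_3 x_{ij3} + \beta_{13} x_{ij1} x_{ij3} + \beta_{23} x_{ij2} x_{ij3} + \beta_{123} x_{ij1} x_{ij2} x_{ij3} + \epsilon_{ij}[/math]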


To investigate how the given factors affect the response, the following hypothesis tests need to be carried out:

[math]H_0 : \beta_1 = 0[/math]
[math]H_1 : \beta_1 \neq 0[/math]

This test investigates the main effect of factor A (honing pressure). The statistic for this test is:

[math](F_0)_A = \frac{MS_A}{MS_E}[/math]

where [math]MS_A[/math] is the mean square for factor A and [math]MS_E[/math] is the error mean square. Hypotheses for the other main effects, B and C, can be written in a similar manner.

[math]H_0 : \beta_{12} = 0[/math]
[math]H_1 : \beta_{12} \neq 0[/math]

This test investigates the two factor interaction AB. The statistic for this test is:

[math](F_0)_{AB} = \frac{MS_{AB}}{MS_E}[/math]

where [math]MS_{AB}[/math] is the mean square for the interaction AB and [math]MS_E[/math] is the error mean square. Hypotheses for the other two factor interactions, AC and BC, can be written in a similar manner.

[math]H_0 : \beta_{123} = 0[/math]
[math]H_1 : \beta_{123} \neq 0[/math]

This test investigates the three factor interaction ABC. The statistic for this test is:

[math](F_0)_{ABC} = \frac{MS_{ABC}}{MS_E}[/math]

where [math]MS_{ABC}[/math] is the mean square for the interaction ABC and [math]MS_E[/math] is the error mean square. To calculate the test statistics, it is convenient to express the ANOVA model in the form [math]y = X\beta + \epsilon[/math].

Expression of the ANOVA Model as y=Xβ+ϵ

In matrix notation, the ANOVA model can be expressed as:

y=Xβ+ϵ

where:

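[math]y = \begin{bmatrix} Y_{11} \\ Y_{21} \\ \vdots \\ Y_{81} \\ Y_{12} \\ \vdots \\ Y_{82} \end{bmatrix} = \begin{bmatrix} 90 \\ 90 \\ \vdots \\ 90 \\ 86 \\ \vdots \\ 80 \end{bmatrix} \qquad X = \begin{bmatrix} 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \\ 1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}[/math]


[math]\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_{12} \\ \beta_3 \\ \beta_{13} \\ \beta_{23} \\ \beta_{123} \end{bmatrix} \qquad \epsilon = \begin{bmatrix} \epsilon_{11} \\ \epsilon_{21} \\ \vdots \\ \epsilon_{81} \\ \epsilon_{12} \\ \vdots \\ \epsilon_{82} \end{bmatrix}[/math]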


Calculation of the Extra Sum of Squares for the Factors

Knowing the matrices y, X and β, the extra sum of squares for the factors can be calculated. These are used to calculate the mean squares that are used to obtain the test statistics. Since the experiment design is orthogonal, the partial and sequential extra sum of squares are identical. The extra sum of squares for each effect can be calculated as shown next. As an example, the extra sum of squares for the main effect of factor A is:


[math]SS_A = \text{Model Sum of Squares} - \text{Sum of Squares of model excluding the main effect of } A = y'[H - (1/16)J]y - y'[H_{\sim A} - (1/16)J]y[/math]


where H is the hat matrix and J is the matrix of ones. The matrix [math]H_{\sim A}[/math] can be calculated using [math]H_{\sim A} = X_{\sim A}(X_{\sim A}'X_{\sim A})^{-1}X_{\sim A}'[/math], where [math]X_{\sim A}[/math] is the design matrix, X, excluding the second column that represents the main effect of factor A. Thus, the sum of squares for the main effect of factor A is:


[math]SS_A = y'[H - (1/16)J]y - y'[H_{\sim A} - (1/16)J]y = 654.4375 - 549.375 = 105.0625[/math]


Similarly, the extra sum of squares for the interaction effect AB is:


[math]SS_{AB} = y'[H - (1/16)J]y - y'[H_{\sim AB} - (1/16)J]y = 654.4375 - 636.375 = 18.0625[/math]


The extra sum of squares for other effects can be obtained in a similar manner.
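Because the columns of X are orthogonal and contain only ±1, the hat matrix calculation above collapses to a shortcut: the extra sum of squares of an effect equals [math]2^k \cdot m \cdot \hat{\beta}_j^2[/math]. This shortcut is not stated in the text; it follows from the diagonal form of X'X. The Python sketch below applies it to the magnitudes of the effect coefficients reported for this example in the Calculation of Effect Coefficients section. Only magnitudes are needed because the coefficients are squared.

```python
# Magnitudes of the estimated effect coefficients for the brake drum example
# (signs are irrelevant here because each coefficient is squared).
coefficient_magnitudes = {
    "A": 2.5625,
    "B": 4.9375,
    "AB": 1.0625,
    "C": 1.0625,
    "AC": 2.4375,
    "BC": 1.3125,
    "ABC": 0.1875,
}

k, m = 3, 2
n_obs = 2 ** k * m  # 16 observations in total

# Extra sum of squares per effect under orthogonality: N * beta_hat^2.
sum_of_squares = {
    term: n_obs * b ** 2 for term, b in coefficient_magnitudes.items()
}
model_ss = sum(sum_of_squares.values())

print(sum_of_squares["A"])   # 105.0625
print(sum_of_squares["AB"])  # 18.0625
print(model_ss)              # 654.4375
```

The three printed values agree with the sum of squares for A, the sum of squares for AB and the model sum of squares obtained from the hat matrix expressions.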

Calculation of the Test Statistics

Knowing the extra sum of squares, the test statistic for the effects can be calculated. For example, the test statistic for the interaction AB is:

[math](f_0)_{AB} = \frac{MS_{AB}}{MS_E} = \frac{SS_{AB}/dof(SS_{AB})}{SS_E/dof(SS_E)} = \frac{18.0625/1}{147.5/8} = 0.9797[/math]

where [math]MS_{AB}[/math] is the mean square for the AB interaction and [math]MS_E[/math] is the error mean square. The p value corresponding to the statistic, [math](f_0)_{AB} = 0.9797[/math], based on the F distribution with one degree of freedom in the numerator and eight degrees of freedom in the denominator is:

[math]p \text{ value} = 1 - P(F \leq (f_0)_{AB}) = 1 - 0.6487 = 0.3513[/math]

Assuming that the desired significance is 0.1, since p value > 0.1, it can be concluded that the interaction between honing pressure and number of strokes does not affect the surface finish of the brake drums. Tests for other effects can be carried out in a similar manner. The results are shown in the ANOVA Table in the following figure. The values S, R-sq and R-sq(adj) in the figure indicate how well the model fits the data. The value of S represents the standard error of the model, R-sq represents the coefficient of multiple determination and R-sq(adj) represents the adjusted coefficient of multiple determination. For details on these values refer to Multiple Linear Regression Analysis.
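The test for the AB interaction can be reproduced with the Python standard library alone. The sketch below is an illustration under one restriction: it only handles effects with a single numerator degree of freedom (always the case for the terms of a two level design), where an F statistic with (1, d) degrees of freedom is the square of a t statistic with d degrees of freedom. The p value is then obtained by numerically integrating the t density with Simpson's rule. The function names are hypothetical.

```python
import math


def t_pdf(t, d):
    """Density of Student's t distribution with d degrees of freedom."""
    c = math.gamma((d + 1) / 2) / (math.sqrt(d * math.pi) * math.gamma(d / 2))
    return c * (1 + t * t / d) ** (-(d + 1) / 2)


def f_test_p_value(f0, d, steps=2000):
    """Upper-tail p value for an F(1, d) statistic, using F(1, d) = t(d)^2."""
    t0 = math.sqrt(f0)
    h = t0 / steps  # steps must be even for composite Simpson's rule
    total = t_pdf(0.0, d) + t_pdf(t0, d)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, d)
    central = 2 * total * h / 3  # P(-t0 <= T <= t0) = P(F <= f0)
    return 1 - central


ss_ab, dof_ab = 18.0625, 1
ss_e, dof_e = 147.5, 8

f0 = (ss_ab / dof_ab) / (ss_e / dof_e)
p_value = f_test_p_value(f0, dof_e)

print(round(f0, 4))  # 0.9797
print(round(p_value, 4))
```

The printed p value should agree with the 0.3513 quoted above to within rounding, so the AB interaction is not significant at the 0.1 level.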


ANOVA table for the experiment in the example.

Calculation of Effect Coefficients

The estimate of effect coefficients can also be obtained:


[math]\hat{\beta} = (X'X)^{-1}X'y = \begin{bmatrix} 86.4375 \\ 2.5625 \\ -4.9375 \\ 1.0625 \\ 1.0625 \\ 2.4375 \\ 1.3125 \\ 0.1875 \end{bmatrix}[/math]


Regression Information table for the experiment in the example.

The coefficients and related results are shown in the Regression Information table above. In the table, the Effect column displays the effects, which are simply twice the coefficients. The Standard Error column displays the standard error, [math]se(\hat{\beta}_j)[/math]. The Low CI and High CI columns display the confidence interval on the coefficients. The interval shown is the 90% interval as the significance is chosen as 0.1. The T Value column displays the t statistic, [math]t_0[/math], corresponding to the coefficients. The P Value column displays the p value corresponding to the t statistic. (For details on how these results are calculated, refer to General Full Factorial Designs.) Plots of residuals can also be obtained from DOE++ to ensure that the assumptions related to the ANOVA model are not violated.

Model Equation

From the analysis results shown in the Regression Information table in the Calculation of Effect Coefficients section, it is seen that effects A, B and AC are significant. In DOE++, the p values for the significant effects are displayed in red in the ANOVA Table for easy identification. Using the values of the estimated effect coefficients, the model for the present [math]2^3[/math] design in terms of the coded values can be written as:


[math]\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{13} x_1 x_3 = 86.4375 + 2.5625 x_1 - 4.9375 x_2 + 2.4375 x_1 x_3[/math]


To make the model hierarchical, the main effect, C, needs to be included in the model (because the interaction AC is included in the model). The resulting model is:


[math]\hat{y} = 86.4375 + 2.5625 x_1 - 4.9375 x_2 + 1.0625 x_3 + 2.4375 x_1 x_3[/math]
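To show how the hierarchical model is used, the following Python sketch evaluates it at one corner of the design. The function names are hypothetical. The linear mapping from actual factor values to coded units is an assumption made for illustration; it agrees with the coding in the text at the two design levels (low level to −1, high level to 1).

```python
def coded(value, low, high):
    """Map an actual factor value to coded units (low -> -1, high -> +1).

    Linear coding is an assumption for illustration; the text only fixes
    the coded values at the two design levels.
    """
    return (value - (low + high) / 2) / ((high - low) / 2)


def predicted_surface_finish(x1, x2, x3):
    """Hierarchical model of the example, in coded units."""
    return 86.4375 + 2.5625 * x1 - 4.9375 * x2 + 1.0625 * x3 + 2.4375 * x1 * x3


# Honing pressure 400 psi, 3 strokes, cycle time 5 seconds.
x1 = coded(400, 200, 400)  # +1
x2 = coded(3, 3, 5)        # -1
x3 = coded(5, 3, 5)        # +1

print(predicted_surface_finish(x1, x2, x3))  # 97.4375
```

At the coded origin (all factors midway between their levels) the prediction reduces to the overall mean, 86.4375.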


This equation can be viewed in DOE++, as shown in the following figure, using the Show Analysis Summary icon in the Control Panel. The equation shown in the figure will match the hierarchical model once the required terms are selected using the Select Effects icon.

The model equation for the experiment of the example.

Replicated and Repeated Runs

In the case of replicated experiments, it is important to note the difference between replicated runs and repeated runs. Both repeated and replicated runs are multiple response readings taken at the same factor levels. However, repeated runs are response observations taken at the same time or in succession. Replicated runs are response observations recorded in a random order. Therefore, replicated runs include more variation than repeated runs. For example, a baker who wants to investigate the effect of two factors on the quality of cakes will have to bake four cakes to complete one replicate of a [math]2^2[/math] design. Assume that the baker bakes eight cakes in all. If, for each of the four treatments of the [math]2^2[/math] design, the baker selects one treatment at random and then bakes two cakes for this treatment at the same time, then this is a case of two repeated runs. If, however, the baker bakes all eight cakes in random order, then the eight cakes represent two sets of replicated runs. For repeated measurements, the average values of the response for each treatment should be entered into DOE++ as shown in the following figure (a) when the two cakes for a particular treatment are baked together. For replicated measurements, when all the cakes are baked in random order, the data is entered as shown in the following figure (b).


Data entry for repeated and replicated runs. Figure (a) shows repeated runs and (b) shows replicated runs.

Unreplicated [math]2^k[/math] Designs

If a factorial experiment is run only for a single replicate then it is not possible to test hypotheses about the main effects and interactions as the error sum of squares cannot be obtained. This is because the number of observations in a single replicate equals the number of terms in the ANOVA model. Hence the model fits the data perfectly and no degrees of freedom are available to obtain the error sum of squares.

However, sometimes it is only possible to run a single replicate of the [math]2^k[/math] design because of constraints on resources and time. In the absence of the error sum of squares, hypothesis tests to identify significant factors cannot be conducted. A number of methods of analyzing information obtained from unreplicated [math]2^k[/math] designs are available. These include pooling higher order interactions, using the normal probability plot of effects or including center point replicates in the design.

Pooling Higher Order Interactions

One of the ways to deal with unreplicated [math]2^k[/math] designs is to use the sum of squares of some of the higher order interactions as the error sum of squares, provided these higher order interactions can be assumed to be insignificant. By dropping some of the higher order interactions from the model, the degrees of freedom corresponding to these interactions can be used to estimate the error mean square. Once the error mean square is known, the test statistics to conduct hypothesis tests on the factors can be calculated.

Normal Probability Plot of Effects

Another way to use unreplicated [math]2^k[/math] designs to identify significant effects is to construct the normal probability plot of the effects. As mentioned in Special Features, the standard error for all effect coefficients in the [math]2^k[/math] designs is the same. Therefore, on a normal probability plot of effect coefficients, all non-significant effect coefficients (with [math]\beta = 0[/math]) will fall along the straight line representative of the normal distribution, [math]N(0, \sigma^2/(2^k \cdot m))[/math]. Effect coefficients that show large deviations from this line will be significant since they do not come from this normal distribution. Similarly, since effects [math]= 2 \times[/math] effect coefficients, all non-significant effects will also follow a straight line on the normal probability plot of effects. For replicated designs, the Effects Probability plot of DOE++ plots the normalized effect values (or the T Values) on the standard normal probability line, [math]N(0, 1)[/math]. However, in the case of unreplicated [math]2^k[/math] designs, [math]\sigma^2[/math] remains unknown since [math]MS_E[/math] cannot be obtained. Lenth's method is used in this case to estimate the variance of the effects. For details on Lenth's method, please refer to Montgomery (2001). DOE++ then uses this variance value to plot effects along the N(0, Lenth's effect variance) line. The method is illustrated in the following example.

Example

Vinyl panels, used as instrument panels in a certain automobile, are seen to develop defects after a certain amount of time. To investigate the issue, it is decided to carry out a two level factorial experiment. Potential factors to be investigated in the experiment are vacuum rate (factor A), material temperature (factor B), element intensity (factor C) and pre-stretch (factor D). The two levels of the factors used in the experiment are shown below.

Factors to investigate defects in vinyl panels.

With a [math]2^4[/math] design requiring 16 runs per replicate, it is only feasible for the manufacturer to run a single replicate.

The experiment design and data, collected as percent defects, are shown in the following figure. Since the present experiment design contains only a single replicate, it is not possible to obtain an estimate of the error sum of squares, SSE. It is decided to use the normal probability plot of effects to identify the significant effects. The effect values for each term are obtained as shown in the following figure.


Experiment design for the example.


Lenth's method uses these values to estimate the variance. As described in [Lenth, 1989], if all effects are arranged in ascending order, using their absolute values, then [math]s_0[/math] is defined as 1.5 times the median value:


[math]s_0 = 1.5 \cdot \text{median}(|\text{effect}|) = 1.5 \cdot 2 = 3[/math]


Using [math]s_0[/math], the "pseudo standard error" (PSE) is calculated as 1.5 times the median value of all effects that are less than [math]2.5 \cdot s_0[/math]:


[math]PSE = 1.5 \cdot \text{median}(|\text{effect}| : |\text{effect}| < 2.5 \cdot s_0) = 1.5 \cdot 1.5 = 2.25[/math]


Using PSE as an estimate of the effect variance, the effect variance is 2.25. Knowing the effect variance, the normal probability plot of effects for the present unreplicated experiment can be constructed as shown in the following figure. The line on this plot is the line N(0, 2.25). The plot shows that the effects A, D and the interaction AD do not follow the distribution represented by this line. Therefore, these effects are significant.

The significant effects can also be identified by comparing individual effect values to the margin of error or the threshold value using the pareto chart (see the third following figure). If the required significance is 0.1, then:


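[math]\text{margin of error} = t_{\alpha/2,d} \cdot PSE[/math]


The t statistic, [math]t_{\alpha/2,d}[/math], is calculated at a significance of [math]\alpha/2[/math] (for the two-sided hypothesis) and degrees of freedom [math]d = (\text{number of effects})/3[/math]. Thus:


[math]\text{margin of error} = t_{0.05,5} \cdot PSE = 2.015 \cdot 2.25 = 4.534[/math]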


The value of 4.534 is shown as the critical value line in the third following figure. All effects with absolute values greater than the margin of error can be considered to be significant. These effects are A, D and the interaction AD. Therefore, the vacuum rate, the pre-stretch and their interaction have a significant effect on the defects of the vinyl panels.
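Lenth's calculation is short enough to sketch in Python. The effect values of this experiment appear only in a figure, so the list below is hypothetical: it was chosen so that its medians match the values quoted in the text (median 2, trimmed median 1.5), and it should not be read as the experiment's data. The function names are also hypothetical, and the critical value 2.015 is taken from the text rather than computed.

```python
from statistics import median


def lenth_pse(effects):
    """Return (s0, PSE) following Lenth's method as described above."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)
    # Keep only the effects that look inactive before re-estimating.
    trimmed = [e for e in abs_effects if e < 2.5 * s0]
    return s0, 1.5 * median(trimmed)


def significant_effects(named_effects, t_crit):
    """Effects whose absolute value exceeds the margin of error."""
    _, pse = lenth_pse(named_effects.values())
    margin = t_crit * pse
    return margin, [n for n, e in named_effects.items() if abs(e) > margin]


# Hypothetical effects for a 2^4 design (15 effects), NOT the example's data.
effects = {
    "A": 12.0, "B": -1.0, "C": 2.0, "D": -10.0, "AB": 0.5,
    "AC": 1.5, "AD": 9.0, "BC": -2.5, "BD": 1.0, "CD": -0.5,
    "ABC": 2.0, "ABD": -1.5, "ACD": 3.0, "BCD": -2.0, "ABCD": 0.5,
}

s0, pse = lenth_pse(effects.values())
margin, significant = significant_effects(effects, t_crit=2.015)

print(s0, pse)           # 3.0 2.25
print(round(margin, 3))  # 4.534
print(significant)       # ['A', 'D', 'AD']
```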


Effect values for the experiment in the example.


Normal probability plot of effects for the experiment in the example.


Pareto chart for the experiment in the example.

Center Point Replicates

Another method of dealing with unreplicated [math]2^k[/math] designs that only have quantitative factors is to use replicated runs at the center point. The center point is the response corresponding to the treatment exactly midway between the two levels of all factors. Running multiple replicates at this point provides an estimate of pure error. Although running multiple replicates at any treatment level can provide an estimate of pure error, the other advantage of running center point replicates in the [math]2^k[/math] design is in checking for the presence of curvature. The test for curvature investigates whether the model between the response and the factors is linear and is discussed in Center Pt. Replicates to Test Curvature.

Example: Use Center Point to Get Pure Error

Consider a [math]2^2[/math] experiment design to investigate the effect of two factors, A and B, on a certain response. The energy consumed when the treatments of the [math]2^2[/math] design are run is considerably larger than the energy consumed for the center point run (because at the center point the factors are at their middle levels). Therefore, the analyst decides to run only a single replicate of the design and augment the design by five replicated runs at the center point as shown in the following figure. The design properties for this experiment are shown in the second following figure. The complete experiment design is shown in the third following figure. The center points can be used in the identification of significant effects as shown next.


[math]\displaystyle{ 2^2\,\! }[/math] design augmented by five center point runs.
Design properties for the experiment in the example.
Experiment design for the example.

Since the present [math]2^2[/math] design is unreplicated, there are no degrees of freedom available to calculate the error sum of squares. By augmenting this design with five center points, the response values at the center points, [math]y_i^c[/math], can be used to obtain an estimate of pure error, [math]SS_{PE}[/math]. Let [math]\bar{y}^c[/math] represent the average response for the five replicates at the center. Then:


[math]SS_{PE} = \text{Sum of Squares for center points}[/math]


[math]SS_{PE} = \sum_{i=1}^{5} (y_i^c - \bar{y}^c)^2 = (25.2 - 25.26)^2 + \dots + (25.3 - 25.26)^2 = 0.052[/math]


Then the corresponding mean square is:

[math]MS_{PE} = \frac{SS_{PE}}{\text{degrees of freedom}} = \frac{0.052}{5 - 1} = 0.013[/math]


Alternatively, [math]MS_{PE}[/math] can be directly obtained by calculating the variance of the response values at the center points:

[math]MS_{PE} = s^2 = \frac{\sum_{i=1}^{5} (y_i^c - \bar{y}^c)^2}{5 - 1}[/math]
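Both routes to the pure error mean square can be checked with a few lines of Python. The five center point responses are the last five entries of the response vector given for this experiment later in the chapter; the variable names are illustrative.

```python
from statistics import mean, variance

# Responses of the five center point replicates.
center_responses = [25.2, 25.3, 25.4, 25.1, 25.3]

y_bar_c = mean(center_responses)
ss_pe = sum((y - y_bar_c) ** 2 for y in center_responses)
ms_pe = ss_pe / (len(center_responses) - 1)

print(round(y_bar_c, 2))  # 25.26
print(round(ss_pe, 3))    # 0.052
print(round(ms_pe, 3))    # 0.013

# The sample variance of the center point responses gives the same value.
print(round(variance(center_responses), 3))  # 0.013
```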


Once [math]MS_{PE}[/math] is known, it can be used as the error mean square, [math]MS_E[/math], to carry out the test of significance for each effect. For example, to test the significance of the main effect of factor A, the sum of squares corresponding to this effect is obtained in the usual manner by considering only the four runs of the original [math]2^2[/math] design.

[math]SS_A = y'[H - (1/4)J]y - y'[H_{\sim A} - (1/4)J]y = 0.5625[/math]


Then, the test statistic to test the significance of the main effect of factor A is:

[math](f_0)_A = \frac{MS_A}{MS_E} = \frac{0.5625/1}{0.052/4} = 43.2692[/math]


The p value corresponding to the statistic, [math](f_0)_A = 43.2692[/math], based on the F distribution with one degree of freedom in the numerator and four degrees of freedom in the denominator (the degrees of freedom of the pure error estimate) is:

[math]p \text{ value} = 1 - P(F \leq (f_0)_A) = 1 - 0.9972 = 0.0028[/math]


Assuming that the desired significance is 0.1, since p value < 0.1, it can be concluded that the main effect of factor A significantly affects the response. This result is displayed in the ANOVA table as shown in the following figure. Test for the significance of other factors can be carried out in a similar manner.
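For an orthogonal two level design, the sum of squares of an effect can also be computed from its contrast: the signed sum of the responses, squared and divided by the number of factorial runs. This shortcut is not the hat matrix route used above, but it yields the same number. The Python sketch below assumes that the four factorial responses are listed in standard order, (1), a, b, ab; that assumption reproduces the value of 0.5625 quoted in the text.

```python
# Factorial responses, assumed to be in standard order: (1), a, b, ab.
factorial_responses = [24.6, 25.4, 25.0, 25.7]
signs_a = [-1, 1, -1, 1]  # coded level of factor A in each run

# Contrast of A and the corresponding sum of squares.
contrast_a = sum(s * y for s, y in zip(signs_a, factorial_responses))
ss_a = contrast_a ** 2 / len(factorial_responses)

# Error mean square from the five center point replicates (pure error).
ss_pe, dof_pe = 0.052, 4
ms_e = ss_pe / dof_pe

f0_a = (ss_a / 1) / ms_e

print(round(ss_a, 4))  # 0.5625
print(round(f0_a, 4))  # 43.2692
```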

Results for the experiment in the example.

Using Center Point Replicates to Test Curvature

Center point replicates can also be used to check for curvature in replicated or unreplicated [math]2^k[/math] designs. The test for curvature investigates whether the model between the response and the factors is linear. The way DOE++ handles center point replicates is similar to its handling of blocks. The center point replicates are treated as an additional factor in the model. The factor is labeled as Curvature in the results of DOE++. If Curvature turns out to be a significant factor in the results, then this indicates the presence of curvature in the model.


Example: Use Center Point to Test Curvature

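To illustrate the use of center point replicates in testing for curvature, consider again the data of the single replicate [math]\displaystyle{ 2^2 }[/math] experiment from a preceding figure (labeled "[math]\displaystyle{ 2^2 }[/math] design augmented by five center point runs"). Let [math]\displaystyle{ x_1 }[/math] be the indicator variable to indicate if the run is a center point:


[math]\displaystyle{ \begin{matrix} x_1=0 & \text{Center point run} \\ x_1=1 & \text{Other run} \end{matrix} }[/math]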


If x2 and x3 are the indicator variables representing factors A and B, respectively, then the model for this experiment is:


[math]\displaystyle{ Y=\beta_0+\beta_1 x_1+\beta_2 x_2+\beta_3 x_3+\beta_{23} x_2 x_3 }[/math]



To investigate the presence of curvature, the following hypotheses need to be tested:

[math]\displaystyle{ \begin{matrix} H_0: & \beta_1=0 & \text{(Curvature is absent)} \\ H_1: & \beta_1\ne 0 & \end{matrix} }[/math]


The test statistic to be used for this test is:

[math]\displaystyle{ (F_0)_{curvature}=\frac{MS_{curvature}}{MS_E} }[/math]


where MScurvature is the mean square for Curvature and MSE is the error mean square.


Calculation of the Sum of Squares

The X matrix and y vector for this experiment are:


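[math]\displaystyle{ X=\begin{bmatrix} 1 & 1 & -1 & -1 & 1 \\ 1 & 1 & 1 & -1 & -1 \\ 1 & 1 & -1 & 1 & -1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} \qquad y=\begin{bmatrix} 24.6 \\ 25.4 \\ 25.0 \\ 25.7 \\ 25.2 \\ 25.3 \\ 25.4 \\ 25.1 \\ 25.3 \end{bmatrix} }[/math]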


The sum of squares can now be calculated. For example, the error sum of squares is:

[math]\displaystyle{ SS_E=y'[I-H]y=0.052 }[/math]


where I is the identity matrix and H is the hat matrix. It can be seen that this is equal to SSPE  (the sum of squares due to pure error) because of the replicates at the center point, as obtained in the example. The number of degrees of freedom associated with SSE, dof(SSE) is four. The extra sum of squares corresponding to the center point replicates (or Curvature) is:

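[math]\displaystyle{ \begin{align} SS_{Curvature} &= \text{Model Sum of Squares}-\text{Sum of Squares of model excluding the center point} \\ &= y'[H-(1/9)J]y-y'[H_{\sim Curvature}-(1/9)J]y \end{align} }[/math]


where H is the hat matrix and J is the matrix of ones. The matrix [math]\displaystyle{ H_{\sim Curvature} }[/math] can be calculated using [math]\displaystyle{ H_{\sim Curvature}=X_{\sim Curv}(X_{\sim Curv}'X_{\sim Curv})^{-1}X_{\sim Curv}' }[/math], where [math]\displaystyle{ X_{\sim Curv} }[/math] is the design matrix, X, excluding the second column that represents the center point. Thus, the extra sum of squares corresponding to Curvature is:

[math]\displaystyle{ SS_{Curvature}=y'[H-(1/9)J]y-y'[H_{\sim Curvature}-(1/9)J]y=0.7036-0.6875=0.0161 }[/math]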


This extra sum of squares can be used to test for the significance of curvature. The corresponding mean square is:

[math]\displaystyle{ MS_{Curvature}=\frac{\text{Sum of squares corresponding to Curvature}}{\text{degrees of freedom}}=\frac{0.0161}{1}=0.0161 }[/math]


Calculation of the Test Statistic

Knowing the mean squares, the statistic to check the significance of curvature can be calculated.

[math]\displaystyle{ (f_0)_{Curvature}=\frac{MS_{Curvature}}{MS_E}=\frac{0.0161/1}{0.052/4}=1.24 }[/math]


The p value corresponding to the statistic, [math]\displaystyle{ (f_0)_{Curvature}=1.24 }[/math], based on the F distribution with one degree of freedom in the numerator and four degrees of freedom in the denominator is:

[math]\displaystyle{ p\text{ value}=1-P(F\le (f_0)_{Curvature})=1-0.6713=0.3287 }[/math]


Assuming that the desired significance is 0.1, since p value > 0.1, it can be concluded that there is no evidence of significant curvature for this design. This result is shown in the ANOVA table in the figure above. The surface of the fitted model based on these results, along with the observed response values, is shown in the figure below.
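The curvature sum of squares can also be cross-checked without building the hat matrices. The sketch below uses the standard single degree of freedom formula that compares the average of the factorial runs with the average of the center points; this is an equivalent shortcut, not the matrix calculation shown in the text, and the variable names are illustrative.

```python
from statistics import mean

factorial = [24.6, 25.4, 25.0, 25.7]      # the four runs of the 2^2 design
center = [25.2, 25.3, 25.4, 25.1, 25.3]   # the five center point runs

n_f, n_c = len(factorial), len(center)

# Single degree of freedom sum of squares for curvature.
ss_curv = n_f * n_c * (mean(factorial) - mean(center)) ** 2 / (n_f + n_c)

# Error mean square from the center point replicates (four degrees of freedom).
ms_e = sum((y - mean(center)) ** 2 for y in center) / (n_c - 1)

f0_curv = ss_curv / ms_e
print(round(ss_curv, 4), round(f0_curv, 2))
```

The unrounded statistic is slightly smaller than the value obtained in the text from the rounded sum of squares 0.0161, but it agrees to two decimal places.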

Model surface and observed response values for the design in the example.


Blocking in [math]\displaystyle{ 2^k }[/math] Designs

Blocking can be used in the [math]\displaystyle{ 2^k }[/math] designs to deal with cases when replicates cannot be run under identical conditions. Randomized complete block designs that were discussed in Randomization and Blocking in DOE for factorial experiments are also applicable here. At times, even with just two levels per factor, it is not possible to run all treatment combinations for one replicate of the experiment under homogeneous conditions. For example, each replicate of the [math]\displaystyle{ 2^2 }[/math] design requires four runs. If each run requires two hours and testing facilities are available for only four hours per day, two days of testing would be required to run one complete replicate. Blocking can be used to separate the treatment runs on the two different days. Blocks that do not contain all treatments of a replicate are called incomplete blocks. In incomplete block designs, the block effect is confounded with certain effect(s) under investigation. For the [math]\displaystyle{ 2^2 }[/math] design assume that treatments (1) and ab were run on the first day and treatments a and b were run on the second day. Then, the incomplete block design for this experiment is:


[math]\displaystyle{ \begin{matrix} \text{Block 1} & \text{Block 2} \\ \begin{bmatrix} (1) \\ ab \end{bmatrix} & \begin{bmatrix} a \\ b \end{bmatrix} \end{matrix} }[/math]


For this design the block effect may be calculated as:

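[math]\displaystyle{ \begin{align} \text{Block Effect} &= \text{Average response for Block 1}-\text{Average response for Block 2} \\ &= \frac{(1)+ab}{2}-\frac{a+b}{2} \\ &= \frac{1}{2}[(1)+ab-a-b] \end{align} }[/math]


The AB interaction effect is:

[math]\displaystyle{ \begin{align} AB &= \text{Average response at } A_{high}\text{-}B_{high} \text{ and } A_{low}\text{-}B_{low}-\text{Average response at } A_{low}\text{-}B_{high} \text{ and } A_{high}\text{-}B_{low} \\ &= \frac{ab+(1)}{2}-\frac{b+a}{2} \\ &= \frac{1}{2}[(1)+ab-a-b] \end{align} }[/math]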


The two equations given above show that, in this design, the AB interaction effect cannot be distinguished from the block effect because the formulas to calculate these effects are the same. In other words, the AB interaction is said to be confounded with the block effect, and it is not possible to say if the effect calculated based on these equations is due to the AB interaction effect, the block effect or both. In incomplete block designs some effects are always confounded with the blocks. Therefore, it is important to design these experiments in such a way that the important effects are not confounded with the blocks. In most cases, the experimenter can assume that higher order interactions are unimportant. In this case, it would be better to use incomplete block designs that confound these effects with the blocks. One way to design incomplete block designs is to use defining contrasts as shown next:

[math]\displaystyle{ L=\alpha_1 q_1+\alpha_2 q_2+...+\alpha_k q_k }[/math]


where the [math]\displaystyle{ \alpha_i }[/math]s are the exponents for the factors in the effect that is to be confounded with the block effect and the [math]\displaystyle{ q_i }[/math]s are values based on the level of the i th factor (in a treatment that is to be allocated to a block). For [math]\displaystyle{ 2^k }[/math] designs the [math]\displaystyle{ \alpha_i }[/math]s are either 0 or 1, and the [math]\displaystyle{ q_i }[/math]s have a value of 0 for the low level of the i th factor and a value of 1 for the high level of the factor in the treatment under consideration. As an example, consider the [math]\displaystyle{ 2^2 }[/math] design where the interaction effect AB is confounded with the block. Since there are two factors, [math]\displaystyle{ k=2 }[/math], with [math]\displaystyle{ i=1 }[/math] representing factor A and [math]\displaystyle{ i=2 }[/math] representing factor B. Therefore:

[math]\displaystyle{ L=\alpha_1 q_1+\alpha_2 q_2 }[/math]


The value of [math]\displaystyle{ \alpha_1 }[/math] is one because the exponent of factor A in the confounded interaction AB is one. Similarly, the value of [math]\displaystyle{ \alpha_2 }[/math] is one because the exponent of factor B in the confounded interaction AB is also one. Therefore, the defining contrast for this design can be written as:

[math]\displaystyle{ L=\alpha_1 q_1+\alpha_2 q_2=1\cdot q_1+1\cdot q_2=q_1+q_2 }[/math]


Once the defining contrast is known, it can be used to allocate treatments to the blocks. For the [math]\displaystyle{ 2^2 }[/math] design, there are four treatments (1), a, b and ab. Assume that [math]\displaystyle{ L=0 }[/math] represents block 2 and [math]\displaystyle{ L=1 }[/math] represents block 1. In order to decide which block the treatment (1) belongs to, the levels of factors A and B for this run are used. Since factor A is at the low level in this treatment, [math]\displaystyle{ q_1=0 }[/math]. Similarly, since factor B is also at the low level in this treatment, [math]\displaystyle{ q_2=0 }[/math]. Therefore:

[math]\displaystyle{ L=q_1+q_2=0+0=0 \text{ (mod 2)} }[/math]


Note that the value of L used to decide the block allocation is "mod 2" of the original value. This value is obtained by taking the value of 1 for odd numbers and 0 otherwise. Based on the value of L, treatment (1) is assigned to block 2. Other treatments can be assigned using the following calculations:

[math]\displaystyle{ \begin{align} (1): & \quad L=0+0=0=0 \text{ (mod 2)} \\ a: & \quad L=1+0=1=1 \text{ (mod 2)} \\ b: & \quad L=0+1=1=1 \text{ (mod 2)} \\ ab: & \quad L=1+1=2=0 \text{ (mod 2)} \end{align} }[/math]


Therefore, to confound the interaction AB with the block effect in the [math]\displaystyle{ 2^2 }[/math] incomplete block design, treatments (1) and ab (with [math]\displaystyle{ L=0 }[/math]) should be assigned to block 2 and treatment combinations a and b (with [math]\displaystyle{ L=1 }[/math]) should be assigned to block 1.
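The allocation rule based on the defining contrast is easy to automate. The following is a minimal sketch; the function names (`treatment_label`, `defining_contrast`, `allocate`) are hypothetical helpers written for this illustration and have no connection to DOE++.

```python
from itertools import product

def treatment_label(levels, factors):
    """Standard label: lowercase letters of the factors at the high level."""
    label = "".join(f.lower() for f, q in zip(factors, levels) if q == 1)
    return label or "(1)"

def defining_contrast(levels, factors, effect):
    """L = sum(alpha_i * q_i) mod 2, with alpha_i = 1 if factor i is in the effect."""
    return sum(q for f, q in zip(factors, levels) if f in effect) % 2

def allocate(factors, effect):
    """Split all 2^k treatments into two groups keyed by the value of L."""
    blocks = {0: [], 1: []}
    for levels in product((0, 1), repeat=len(factors)):
        blocks[defining_contrast(levels, factors, effect)].append(
            treatment_label(levels, factors))
    return blocks

blocks_22 = allocate("AB", "AB")       # 2^2 design, AB confounded with blocks
blocks_24 = allocate("ABCD", "ABCD")   # 2^4 design, ABCD confounded with blocks
print(blocks_22)
print(blocks_24)
```

With the convention used in the text, the treatments listed under key 0 go to block 2 and those under key 1 go to block 1.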

Example: Two Level Factorial Design with Two Blocks

This example illustrates how treatments can be allocated to two blocks for an unreplicated [math]\displaystyle{ 2^k }[/math] design. Consider the unreplicated [math]\displaystyle{ 2^4 }[/math] design to investigate the four factors affecting the defects in automobile vinyl panels discussed in Normal Probability Plot of Effects. Assume that the 16 treatments required for this experiment were run by two different operators with each operator conducting 8 runs. This experiment is an example of an incomplete block design. The analyst in charge of this experiment assumed that the interaction ABCD was not significant and decided to allocate treatments to the two operators so that the ABCD interaction was confounded with the block effect (the two operators are the blocks). The allocation scheme to assign treatments to the two operators can be obtained as follows.

The defining contrast for the [math]\displaystyle{ 2^4 }[/math] design where the ABCD interaction is confounded with the blocks is:

[math]\displaystyle{ L=q_1+q_2+q_3+q_4 }[/math]


The treatments can be allocated to the two operators using the values of the defining contrast. Assume that L=0 represents block 2 and L=1 represents block 1. Then the value of the defining contrast for treatment a is:

[math]\displaystyle{ a: \quad L=1+0+0+0=1=1 \text{ (mod 2)} }[/math]


Therefore, treatment a should be assigned to Block 1 or the first operator. Similarly, for treatment ab we have:

[math]\displaystyle{ ab: \quad L=1+1+0+0=2=0 \text{ (mod 2)} }[/math]
Allocation of treatments to two blocks for the [math]\displaystyle{ 2^4 }[/math] design in the example by confounding interaction of [math]\displaystyle{ ABCD }[/math] with the blocks.

Therefore, ab should be assigned to Block 2 or the second operator. Other treatments can be allocated to the two operators in a similar manner to arrive at the allocation scheme shown in the figure above. In DOE++, to confound the ABCD interaction for the [math]\displaystyle{ 2^4 }[/math] design into two blocks, the number of blocks is specified as shown in the first of the following figures. Then the interaction ABCD is entered in the Block Generator window (second figure), which is opened using the Block Generator button in the first figure. The design generated by DOE++ is shown in the third figure. This design matches the allocation scheme obtained above.

Adding block properties for the experiment in the example.
Specifying the interaction ABCD as the interaction to be confounded with the blocks for the example.
Two block design for the experiment in the example.


For the analysis of this design, the sums of squares for all effects are calculated assuming no blocking. Then, to account for blocking, the sum of squares corresponding to the ABCD interaction is considered as the sum of squares due to blocks and ABCD. In DOE++ this is done by displaying this sum of squares as the sum of squares due to the blocks. This is shown in the following figure where the sum of squares in question is obtained as 72.25 and is displayed against Block. The interaction ABCD, which is confounded with the blocks, is not displayed. Since the design is unreplicated, one of the methods to analyze unreplicated designs mentioned in Unreplicated [math]\displaystyle{ 2^k }[/math] designs has to be used to identify significant effects.


ANOVA table for the experiment of the example.

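Unreplicated [math]\displaystyle{ 2^k }[/math] Designs in [math]\displaystyle{ 2^p }[/math] Blocks

A single replicate of the [math]\displaystyle{ 2^k }[/math] design can be run in up to [math]\displaystyle{ 2^p }[/math] blocks where [math]\displaystyle{ p\lt k }[/math]. The number of effects confounded with the blocks equals the degrees of freedom associated with the block effect.


If two blocks are used (the block effect has two levels), then one ([math]\displaystyle{ 2-1=1 }[/math]) effect is confounded with the blocks. If four blocks are used, then three ([math]\displaystyle{ 4-1=3 }[/math]) effects are confounded with the blocks and so on. For example, an unreplicated [math]\displaystyle{ 2^4 }[/math] design may be confounded in [math]\displaystyle{ 2^2 }[/math] (four) blocks using two contrasts, [math]\displaystyle{ L_1 }[/math] and [math]\displaystyle{ L_2 }[/math]. Let AC and BD be the effects to be confounded with the blocks. Corresponding to these two effects, the contrasts are respectively:

[math]\displaystyle{ \begin{align} L_1 &= q_1+q_3 \\ L_2 &= q_2+q_4 \end{align} }[/math]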

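Based on the values of [math]\displaystyle{ L_1 }[/math] and [math]\displaystyle{ L_2 }[/math], the treatments can be assigned to the four blocks as follows:


[math]\displaystyle{ \begin{matrix} \text{Block 4} & \text{Block 3} & \text{Block 2} & \text{Block 1} \\ L_1=0,L_2=0 & L_1=1,L_2=0 & L_1=0,L_2=1 & L_1=1,L_2=1 \\ \begin{bmatrix} (1) \\ ac \\ bd \\ abcd \end{bmatrix} & \begin{bmatrix} a \\ c \\ abd \\ bcd \end{bmatrix} & \begin{bmatrix} b \\ abc \\ d \\ acd \end{bmatrix} & \begin{bmatrix} ab \\ bc \\ ad \\ cd \end{bmatrix} \end{matrix} }[/math]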


Since the block effect has three degrees of freedom, three effects are confounded with the block effect. In addition to AC and BD, the third effect confounded with the block effect is their generalized interaction, [math]\displaystyle{ (AC)(BD)=ABCD }[/math]. In general, when an unreplicated [math]\displaystyle{ 2^k }[/math] design is confounded in [math]\displaystyle{ 2^p }[/math] blocks, p contrasts are needed ([math]\displaystyle{ L_1,L_2...L_p }[/math]). To define these contrasts, p effects are selected such that none of them is a generalized interaction of the others. The [math]\displaystyle{ 2^p }[/math] blocks can then be assigned the treatments using the p contrasts. The remaining [math]\displaystyle{ 2^p-(p+1) }[/math] effects that are also confounded with the blocks are obtained as the generalized interactions of the p effects. In the statistical analysis of these designs, the sums of squares are computed as if no blocking were used. Then the block sum of squares is obtained by adding the sums of squares for all the effects confounded with the blocks.
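The four block allocation and the generalized interaction can be reproduced with a few lines of code. This is a sketch under the same conventions as the text (levels coded 0 and 1, contrasts taken mod 2); the helper names `label`, `contrast` and `multiply` are illustrative only.

```python
from itertools import product

FACTORS = "ABCD"

def label(levels):
    """Standard label: lowercase letters of the factors at the high level."""
    name = "".join(f.lower() for f, q in zip(FACTORS, levels) if q == 1)
    return name or "(1)"

def contrast(levels, effect):
    """Defining contrast (mod 2) for the effect to be confounded."""
    return sum(q for f, q in zip(FACTORS, levels) if f in effect) % 2

def multiply(word1, word2):
    """Generalized interaction: letters appearing in both words cancel."""
    return "".join(sorted(set(word1) ^ set(word2)))

# Group the 16 treatments by the pair (L1, L2) for the effects AC and BD.
blocks = {}
for levels in product((0, 1), repeat=len(FACTORS)):
    key = (contrast(levels, "AC"), contrast(levels, "BD"))
    blocks.setdefault(key, []).append(label(levels))

third_effect = multiply("AC", "BD")   # also confounded with the blocks

for key in sorted(blocks):
    print(key, blocks[key])
print("generalized interaction:", third_effect)
```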

Example: 2 Level Factorial Design with Four Blocks

This example illustrates how DOE++ obtains the sum of squares when treatments for an unreplicated [math]\displaystyle{ 2^k }[/math] design are allocated among four blocks. Consider again the unreplicated [math]\displaystyle{ 2^4 }[/math] design used to investigate the defects in automobile vinyl panels presented in Normal Probability Plot of Effects. Assume that the 16 treatments needed to complete the experiment were run by four operators. Therefore, there are four blocks. Assume that the treatments were allocated to the blocks using the generators mentioned in the previous section, i.e., treatments were allocated among the four operators by confounding the effects, AC and BD, with the blocks. These effects can be specified as Block Generators as shown in the following figure. (The generalized interaction of these two effects, interaction ABCD, will also get confounded with the blocks.) The resulting design is shown in the second following figure and matches the allocation scheme obtained in the previous section.


Specifying the interactions AC and BD as block generators for the example.


The sum of squares in this case can be obtained by calculating the sum of squares for each of the effects assuming there is no blocking. Once the individual sums of squares have been obtained, the block sum of squares can be calculated. The block sum of squares is the total of the sums of squares of the effects AC, BD and ABCD, since these effects are confounded with the block effect. As shown in the third following figure, this sum of squares is 92.25 and is displayed against Block. The interactions AC, BD and ABCD, which are confounded with the blocks, are not displayed. Since the present design is unreplicated, one of the methods to analyze unreplicated designs mentioned in Unreplicated [math]\displaystyle{ 2^k }[/math] designs has to be used to identify significant effects.


Design for the experiment in the example.


ANOVA table for the experiment in the example.

Variability Analysis

For replicated two level factorial experiments, DOE++ provides the option of conducting variability analysis (using the Variability Analysis icon under the Data menu). The analysis is used to identify the treatment that results in the least amount of variation in the product or process being investigated. Variability analysis is conducted by treating the standard deviation of the response for each treatment of the experiment as an additional response. The standard deviation for a treatment is obtained by using the replicated response values at that treatment run. As an example, consider the [math]\displaystyle{ 2^3 }[/math] design shown in the following figure where each run is replicated four times. A variability analysis can be conducted for this design. DOE++ calculates eight standard deviation values, one for each treatment of the design (see second following figure). Then, the design is analyzed as an unreplicated [math]\displaystyle{ 2^3 }[/math] design with the standard deviations (displayed as Y Standard Deviation in the second following figure) as the response. The normal probability plot of effects identifies AC as the effect that influences variability (see third figure following). Based on the effect coefficients obtained in the fourth figure following, the model for Y Std. is:


[math]\displaystyle{ \begin{align} \text{Y Std.} &= 0.6779+0.2491\,AC \\ &= 0.6779+0.2491\,x_1 x_3 \end{align} }[/math]


Based on the model, the experimenter has two choices to minimize variability (by minimizing Y Std.). The first choice is that [math]\displaystyle{ x_1 }[/math] should be 1 (i.e., A should be set at the high level) and [math]\displaystyle{ x_3 }[/math] should be -1 (i.e., C should be set at the low level). The second choice is that [math]\displaystyle{ x_1 }[/math] should be -1 (i.e., A should be set at the low level) and [math]\displaystyle{ x_3 }[/math] should be 1 (i.e., C should be set at the high level). In both cases the product [math]\displaystyle{ x_1 x_3 }[/math] equals -1. The experimenter can select the most feasible choice.
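The two minimizing settings can be confirmed by evaluating the fitted model at the four combinations of coded levels for A and C. The sketch below uses only the two coefficients quoted above; the function name `y_std` is illustrative.

```python
def y_std(x1, x3):
    """Fitted model for the standard deviation, in coded units."""
    return 0.6779 + 0.2491 * x1 * x3

# Evaluate the model at every combination of low (-1) and high (+1) levels.
settings = {(x1, x3): y_std(x1, x3) for x1 in (-1, 1) for x3 in (-1, 1)}

best = min(settings.values())
minimizers = sorted(k for k, v in settings.items() if abs(v - best) < 1e-12)

for (x1, x3), value in sorted(settings.items()):
    print(f"x1={x1:+d}, x3={x3:+d}: Y Std. = {value:.4f}")
print("settings with the smallest predicted Y Std.:", minimizers)
```

Both minimizing settings have A and C at opposite levels, so that the product of the coded values is -1.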


A [math]\displaystyle{ 2^3\,\! }[/math] design with four replicated response values that can be used to conduct a variability analysis.


Variability analysis in DOE++.


Normal probability plot of effects for the variability analysis example.


Effect coefficients for the variability analysis example.


Two Level Fractional Factorial Designs

As the number of factors in a two level factorial design increases, the number of runs for even a single replicate of the [math]\displaystyle{ 2^k }[/math] design becomes very large. For example, a single replicate of an eight factor two level experiment would require 256 runs. Fractional factorial designs can be used in these cases to draw out valuable conclusions from fewer runs. The basis of fractional factorial designs is the sparsity of effects principle [Wu, 2000]. The principle states that, most of the time, responses are affected by a small number of main effects and lower order interactions, while higher order interactions are relatively unimportant. Fractional factorial designs are used as screening experiments during the initial stages of experimentation. At these stages, a large number of factors have to be investigated and the focus is on the main effects and two factor interactions. These designs obtain information about main effects and lower order interactions with fewer experiment runs by confounding these effects with unimportant higher order interactions. As an example, consider a [math]\displaystyle{ 2^8 }[/math] design that requires 256 runs. This design allows for the investigation of 8 main effects and 28 two factor interactions. However, 219 degrees of freedom are devoted to three factor or higher order interactions. This full factorial design can prove to be very inefficient when these higher order interactions can be assumed to be unimportant. Instead, a fractional design can be used here to identify the important factors that can then be investigated more thoroughly in subsequent experiments. In unreplicated fractional factorial designs, no degrees of freedom are available to calculate the error sum of squares, and the techniques mentioned in Unreplicated [math]\displaystyle{ 2^k }[/math] designs should be employed for the analysis of these designs.
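The degrees of freedom quoted for the eight factor design follow from simple counting, as the short sketch below shows. The variable names are illustrative.

```python
from math import comb

k = 8                               # number of two level factors
runs = 2 ** k                       # runs in one replicate of the full factorial
total_dof = runs - 1                # degrees of freedom for all effects

main_effects = comb(k, 1)           # one per factor
two_factor = comb(k, 2)             # one per pair of factors
higher_order = total_dof - main_effects - two_factor

print(runs, main_effects, two_factor, higher_order)
```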

Half-fraction Designs

A half-fraction of the [math]\displaystyle{ 2^k }[/math] design involves running only half of the treatments of the full factorial design. For example, consider a [math]\displaystyle{ 2^3 }[/math] design that requires eight runs in all. The design matrix for this design is shown in the figure (a) below. A half-fraction of this design is the design in which only four of the eight treatments are run. The fraction is denoted as [math]\displaystyle{ 2^{3-1} }[/math] with the "-1" in the index denoting a half-fraction. Assume that the treatments chosen for the half-fraction design are the ones where the interaction ABC is at the high level (i.e., only those rows are chosen from the following figure (a) where the column for ABC has entries of 1). The resulting [math]\displaystyle{ 2^{3-1} }[/math] design has a design matrix as shown in figure (b) below.

Half-fractions of the [math]\displaystyle{ 2^3\,\! }[/math] design. (a) shows the full factorial [math]\displaystyle{ 2^3\,\! }[/math] design, (b) shows the [math]\displaystyle{ 2^{3-1}\,\! }[/math] design with the defining relation [math]\displaystyle{ I=ABC\,\! }[/math] and (c) shows the [math]\displaystyle{ {2}^{3-1}\,\! }[/math] design with the defining relation [math]\displaystyle{ I=-ABC\,\! }[/math].

In the [math]\displaystyle{ 2^{3-1} }[/math] design of figure (b), since the interaction ABC is always included at the same level (the high level represented by 1), it is not possible to measure this interaction effect. The effect, ABC, is called the generator or word for this design. It can be noted that, in the design matrix of figure (b), the column corresponding to the intercept, I, and the column corresponding to the interaction ABC are identical. The identical columns are written as [math]\displaystyle{ I=ABC }[/math] and this equation is called the defining relation for the design. In DOE++, the present [math]\displaystyle{ 2^{3-1} }[/math] design can be obtained by specifying the design properties as shown in the following figure.

Design properties for the [math]\displaystyle{ 2^{3-1}\,\! }[/math] design.

The defining relation, I=ABC, is entered in the Fraction Generator window as shown next.

Specifying the defining relation for the [math]\displaystyle{ 2^{3-1}\,\! }[/math] design.

Note that in this figure the defining relation is specified as [math]\displaystyle{ C=AB }[/math]. This relation is obtained by multiplying the defining relation, [math]\displaystyle{ I=ABC }[/math], by the last factor, C, of the design.
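The way the generator C = AB produces the half-fraction can be sketched in code: start from the full factorial in A and B, and set the level of C equal to the product of the levels of A and B. The variable names below are illustrative, and the levels are coded as -1 and +1.

```python
from itertools import product

runs = []
for a, b in product((-1, 1), repeat=2):   # full 2^2 design in A and B
    c = a * b                              # generator C = AB
    runs.append((a, b, c))

def label(run):
    """Standard label: lowercase letters of the factors at the high level."""
    name = "".join(f for f, level in zip("abc", run) if level == 1)
    return name or "(1)"

labels = [label(run) for run in runs]

# Every selected run has ABC = +1, i.e. the defining relation I = ABC holds.
assert all(a * b * c == 1 for a, b, c in runs)
print(labels)
```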


Calculation of Effects

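Using the four runs of the [math]\displaystyle{ 2^{3-1} }[/math] design in figure (b) discussed above, the main effects can be calculated as follows:

[math]\displaystyle{ \begin{align} A &= \frac{(a+abc)}{2}-\frac{(b+c)}{2}=\frac{1}{2}(a-b-c+abc) \\ B &= \frac{(b+abc)}{2}-\frac{(a+c)}{2}=\frac{1}{2}(-a+b-c+abc) \\ C &= \frac{(c+abc)}{2}-\frac{(a+b)}{2}=\frac{1}{2}(-a-b+c+abc) \end{align} }[/math]


where a, b, c and abc are the treatments included in the [math]\displaystyle{ 2^{3-1} }[/math] design.


Similarly, the two factor interactions can also be obtained as:

[math]\displaystyle{ \begin{align} BC &= \frac{(a+abc)}{2}-\frac{(b+c)}{2}=\frac{1}{2}(a-b-c+abc) \\ AC &= \frac{(b+abc)}{2}-\frac{(a+c)}{2}=\frac{1}{2}(-a+b-c+abc) \\ AB &= \frac{(c+abc)}{2}-\frac{(a+b)}{2}=\frac{1}{2}(-a-b+c+abc) \end{align} }[/math]


The equations for A and BC above result in the same effect values, showing that effects A and BC are confounded in the present [math]\displaystyle{ 2^{3-1} }[/math] design. Thus, the quantity [math]\displaystyle{ \tfrac{1}{2}(a-b-c+abc) }[/math] estimates A+BC (i.e., both the main effect A and the two-factor interaction BC). The effects, A and BC, are called aliases. From the remaining equations given above, it can be seen that the other aliases for this design are B and AC, and C and AB. Therefore, the equations to calculate the effects in the present [math]\displaystyle{ 2^{3-1} }[/math] design can be written as follows:

[math]\displaystyle{ \begin{align} A+BC &= \frac{1}{2}(a-b-c+abc) \\ B+AC &= \frac{1}{2}(-a+b-c+abc) \\ C+AB &= \frac{1}{2}(-a-b+c+abc) \end{align} }[/math]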

Calculation of Aliases

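Aliases for a fractional factorial design can be obtained using the defining relation for the design. The defining relation for the present [math]\displaystyle{ 2^{3-1} }[/math] design is:

[math]\displaystyle{ I=ABC }[/math]


Multiplying both sides of the previous equation by the main effect, A, gives the alias effect of A:

[math]\displaystyle{ \begin{align} A\cdot I &= A\cdot ABC \\ A &= A^2BC \\ A &= BC \end{align} }[/math]


Note that in calculating the alias effects, any effect multiplied by I remains the same ([math]\displaystyle{ A\cdot I=A }[/math]), while an effect multiplied by itself results in I ([math]\displaystyle{ A^2=I }[/math]). Other aliases can also be obtained:

[math]\displaystyle{ \begin{align} B\cdot I &= B\cdot ABC \\ B &= AB^2C \\ B &= AC \end{align} }[/math]

and:

[math]\displaystyle{ \begin{align} C\cdot I &= C\cdot ABC \\ C &= ABC^2 \\ C &= AB \end{align} }[/math]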
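The multiplication rule used above (a letter multiplied by itself gives I) is equivalent to taking the symmetric difference of the letters of the two effects. The sketch below applies it to the defining relation I = ABC; the function name `multiply` is a hypothetical helper.

```python
def multiply(*words):
    """Multiply effects: letters that appear an even number of times cancel."""
    letters = set()
    for word in words:
        letters ^= set(word)            # symmetric difference drops squared letters
    return "".join(sorted(letters)) or "I"

DEFINING_WORD = "ABC"                   # defining relation I = ABC

aliases = {effect: multiply(effect, DEFINING_WORD) for effect in ("A", "B", "C")}
print(aliases)
```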

Fold-over Design

If it can be assumed for this design that the two-factor interactions are unimportant, then in the absence of BC, AC and AB, the equations for (A+BC), (B+AC) and (C+AB) can be used to estimate the main effects, A, B and C, respectively. However, if such an assumption is not applicable, then to uncouple the main effects from their two factor aliases, the alternate fraction that contains runs having ABC at the lower level should be run. The design matrix for this design is shown in the preceding figure (c). The defining relation for this design is [math]\displaystyle{ I=-ABC }[/math] because the four runs for this design are obtained by selecting the rows of the preceding figure (a) for which the value of the ABC column is -1. The aliases for this fraction can be obtained as explained in Half-fraction Designs as [math]\displaystyle{ A=-BC }[/math], [math]\displaystyle{ B=-AC }[/math] and [math]\displaystyle{ C=-AB }[/math]. The effects for this design can be calculated as:

[math]\displaystyle{ \begin{align} A-BC &= \frac{1}{2}(ab+ac-(1)-bc) \\ B-AC &= \frac{1}{2}(ab-ac-(1)+bc) \\ C-AB &= \frac{1}{2}(-ab+ac-(1)+bc) \end{align} }[/math]


These equations can be combined with the equations for (A+BC), (B+AC) and (C+AB) to obtain the de-aliased main effects and two factor interactions. For example, adding equations (A+BC) and (A-BC) returns the main effect A.

[math]\displaystyle{ 2A=\frac{1}{2}(a-b-c+abc)+\frac{1}{2}(ab+ac-(1)-bc) }[/math]


The process of augmenting a fractional factorial design by a second fraction of the same size by simply reversing the signs (of all effect columns except I) is called folding over. The combined design is referred to as a fold-over design.
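The de-aliasing achieved by folding over can be verified from the sign columns alone, without any response data. In the sketch below (levels coded as -1 and +1, names illustrative), the A column equals the BC column in the fraction with ABC at the high level and equals its negative in the alternate fraction; once the two fractions are combined, the A and BC columns are orthogonal, so the two effects can be estimated separately.

```python
from itertools import product

full = list(product((-1, 1), repeat=3))                      # all (A, B, C) sign rows

principal = [r for r in full if r[0] * r[1] * r[2] == 1]     # I = ABC
alternate = [r for r in full if r[0] * r[1] * r[2] == -1]    # I = -ABC

# Within each fraction, A is perfectly aliased with BC (up to sign).
assert all(a == b * c for a, b, c in principal)
assert all(a == -(b * c) for a, b, c in alternate)

# In the combined (fold-over) design the A and BC columns are orthogonal.
combined = principal + alternate
dot = sum(a * (b * c) for a, b, c in combined)
print(len(combined), dot)
```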

Quarter and Smaller Fraction Designs

At times, the number of runs even for a half-fraction design is very large. In these cases, smaller fractions are used. A quarter-fraction design, denoted as [math]\displaystyle{ 2^{k-2} }[/math], consists of a fourth of the runs of the full factorial design. Quarter-fraction designs require two defining relations. The first defining relation returns the half-fraction or the [math]\displaystyle{ 2^{k-1} }[/math] design. The second defining relation selects half of the runs of the [math]\displaystyle{ 2^{k-1} }[/math] design to give the quarter-fraction. For example, consider the [math]\displaystyle{ 2^4 }[/math] design. To obtain a [math]\displaystyle{ 2^{4-2} }[/math] design from this design, first a half-fraction of this design is obtained by using a defining relation. Assume that the defining relation used is [math]\displaystyle{ I=ABCD }[/math]. The design matrix for the resulting [math]\displaystyle{ 2^{4-1} }[/math] design is shown in figure (a) below. Now, a quarter-fraction can be obtained from the [math]\displaystyle{ 2^{4-1} }[/math] design shown in figure (a) below using a second defining relation [math]\displaystyle{ I=AD }[/math]. The resulting [math]\displaystyle{ 2^{4-2} }[/math] design obtained is shown in figure (b) below.


Fractions of the [math]\displaystyle{ 2^4\,\! }[/math] design - Figure (a) shows the [math]\displaystyle{ 2^{4-1} }[/math] design with the defining relation [math]\displaystyle{ I=ABCD\,\! }[/math] and (b) shows the [math]\displaystyle{ 2^{4-2}\,\! }[/math] design with the defining relation [math]\displaystyle{ I=ABCD=AD=BC\,\! }[/math].


The complete defining relation for this [math]\displaystyle{ 2^{4-2} }[/math] design is:

[math]\displaystyle{ I=ABCD=AD=BC }[/math]

Note that the effect, BC, in the defining relation is the generalized interaction of ABCD and AD and is obtained using [math]\displaystyle{ (ABCD)(AD)=A^2BCD^2=BC }[/math]. In general, a [math]\displaystyle{ 2^{k-p} }[/math] fractional factorial design requires p independent generators. The defining relation for the design consists of the p independent generators and their [math]\displaystyle{ 2^p-(p+1) }[/math] generalized interactions.


Calculation of Aliases

The alias structure for the present [math]\displaystyle{ 2^{4-2} }[/math] design can be obtained using the defining relation [math]\displaystyle{ I=ABCD=AD=BC }[/math], following the procedure explained in Half-fraction Designs. For example, multiplying the defining relation by A returns the effects aliased with the main effect, A, as follows:

[math]\displaystyle{ \begin{align} A\cdot I &= A\cdot ABCD=A\cdot AD=A\cdot BC \\ A &= A^2BCD=A^2D=ABC \\ A &= BCD=D=ABC \end{align} }[/math]


Therefore, in the present [math]\displaystyle{ 2^{4-2} }[/math] design, it is not possible to distinguish between effects A, D, BCD and ABC. Similarly, multiplying the defining relation by B and AB returns the effects that are aliased with these effects:

[math]\displaystyle{ \begin{align} B &= ACD=ABD=C \\ AB &= CD=BD=AC \end{align} }[/math]


Other aliases can be obtained in a similar way. It can be seen that each effect in this design has three aliases. In general, each effect in a [math]\displaystyle{ 2^{k-p} }[/math] design has [math]\displaystyle{ 2^p-1 }[/math] aliases. The aliases for the [math]\displaystyle{ 2^{4-2} }[/math] design show that in this design the main effects are aliased with each other (A is aliased with D and B is aliased with C). Therefore, this design is not a useful design and is not available in DOE++. It is important to ensure that main effects and lower order interactions of interest are not aliased in a fractional factorial design. Whether this is the case can be determined from the resolution of the fractional factorial design.

Design Resolution

The resolution of a fractional factorial design is defined as the number of factors in the lowest order effect in the defining relation. For example, in the defining relation [math]\displaystyle{ I=ABCD=AD=BC }[/math] of the previous [math]\displaystyle{ 2^{4-2} }[/math] design, the lowest-order effect is either AD or BC, containing two factors. Therefore, the resolution of this design is equal to two. The resolution of a fractional factorial design is represented using Roman numerals. For example, the previously mentioned [math]\displaystyle{ 2^{4-2} }[/math] design with a resolution of two can be represented as [math]\displaystyle{ 2_{II}^{4-2} }[/math]. The resolution provides information about the confounding in the design as explained next:

  1. Resolution III Designs
    In these designs, the lowest order effect in the defining relation has three factors (e.g., a [math]\displaystyle{ 2^{5-2} }[/math] design with the defining relation [math]\displaystyle{ I=ABDE=ABC=CDE }[/math]). In resolution III designs, no main effects are aliased with any other main effects, but main effects are aliased with two factor interactions. In addition, some two factor interactions are aliased with each other.
  2. Resolution IV Designs
    In these designs, the lowest order effect in the defining relation has four factors (e.g., a [math]\displaystyle{ 2^{5-1} }[/math] design with the defining relation [math]\displaystyle{ I=ABDE }[/math]). In resolution IV designs, no main effects are aliased with any other main effects or two factor interactions. However, some main effects are aliased with three factor interactions and the two factor interactions are aliased with each other.
  3. Resolution V Designs
    In these designs the lowest order effect in the defining relation has five factors (e.g., a [math]\displaystyle{ 2^{5-1} }[/math] design with the defining relation [math]\displaystyle{ I=ABCDE }[/math]). In resolution V designs, no main effects or two factor interactions are aliased with any other main effects or two factor interactions. However, some main effects are aliased with four factor interactions and the two factor interactions are aliased with three factor interactions.


Fractional factorial designs with the highest resolution possible should be selected because the higher the resolution of the design, the less severe the degree of confounding. In general, designs with a resolution less than III are never used because in these designs some of the main effects are aliased with each other. The table below shows fractional factorial designs with the highest available resolution for three to ten factor designs along with their defining relations.
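Since the resolution is simply the length of the shortest word in the defining relation, it can be computed directly. The sketch below applies this to the defining relations quoted in this section; the function name `resolution` and the dictionary keys are illustrative.

```python
ROMAN = {2: "II", 3: "III", 4: "IV", 5: "V"}

def resolution(words):
    """Number of factors in the lowest order effect of the defining relation."""
    return min(len(word) for word in words)

# Defining relations quoted in the text (the words to the right of "I =").
designs = {
    "2^(4-2), I=ABCD=AD=BC": ["ABCD", "AD", "BC"],
    "2^(5-2), I=ABDE=ABC=CDE": ["ABDE", "ABC", "CDE"],
    "2^(5-1), I=ABDE": ["ABDE"],
    "2^(5-1), I=ABCDE": ["ABCDE"],
}

for name, words in designs.items():
    print(name, "-> resolution", ROMAN[resolution(words)])
```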


Highest resolution designs available for fractional factorial designs with 3 to 10 factors.


All of the two level fractional factorial designs available in DOE++ are shown next.


Two level fractional factorial designs available in DOE++ and their resolutions.


Minimum Aberration Designs

At times, different designs with the same resolution but different aliasing may be available. The best design to select in such a case is the minimum aberration design. For example, all [math]\displaystyle{ 2^{7-2} }[/math] designs in the fourth table have a resolution of four (since the generator with the minimum number of factors in each design has four factors). Design 1 has three generators of length four (ABCF, BCDG, ADFG). Design 2 has two generators of length four (ABCF, ADEG). Design 3 has one generator of length four (CEFG). Therefore, design 3 has the least number of generators with the minimum length of four. Design 3 is called the minimum aberration design. It can be seen that the alias structure for design 3 is less involved compared to the other designs. For details refer to [Wu, 2000].


Three [math]\displaystyle{ 2_{IV}^{7-2}\,\! }[/math] designs with different defining relations.


Example

The design of an automobile fuel cone is thought to be affected by six factors in the manufacturing process: cavity temperature (factor A), core temperature (factor B), melt temperature (factor C), hold pressure (factor D), injection speed (factor E) and cool time (factor F). The manufacturer of the fuel cone is unable to run the [math]\displaystyle{ 2^6=64 }[/math] runs required to complete one replicate for a two level full factorial experiment with six factors. Instead, they decide to run a fractional factorial design. Considering that three factor and higher order interactions are likely to be inactive, the manufacturer selects a [math]\displaystyle{ 2^{6-2} }[/math] design that will require only 16 runs. The manufacturer chooses the resolution IV design which will ensure that all main effects are free from aliasing (assuming three factor and higher order interactions are absent). However, in this design the two factor interactions may be aliased with each other. It is decided that, if important two factor interactions are found to be present, additional experiment trials may be conducted to separate the aliased effects. The performance of the fuel cone is measured on a scale of 1 to 15. In DOE++, the design for this experiment is set up using the properties shown in the following figure. The Fraction Generators for the design, [math]\displaystyle{ E=ABC }[/math] and [math]\displaystyle{ F=BCD }[/math], are the same as the defaults used in DOE++. The resulting [math]\displaystyle{ 2^{6-2} }[/math] design and the corresponding response values are shown in the following two figures.


Design properties for the experiment in the example.


Experiment design for the example.


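The complete alias structure for the [math]\displaystyle{ 2_{IV}^{6-2} }[/math] design is shown next.

[math]\displaystyle{ I=ABCE=ADEF=BCDF }[/math]


[math]\displaystyle{ \begin{align} A &= BCE=DEF=ABCDF \\ B &= ACE=CDF=ABDEF \\ C &= ABE=BDF=ACDEF \\ D &= AEF=BCF=ABCDE \\ E &= ABC=ADF=BCDEF \\ F &= ADE=BCD=ABCEF \end{align} }[/math]

[math]\displaystyle{ \begin{align} AB &= CE=ACDF=BDEF \\ AC &= BE=ABDF=CDEF \\ AD &= EF=ABCF=BCDE \\ AE &= BC=DF=ABCDEF \\ AF &= DE=ABCD=BCEF \\ BD &= CF=ABEF=ACDE \\ BF &= CD=ABDE=ACEF \end{align} }[/math]

[math]\displaystyle{ \begin{align} ABD &= ACF=BEF=CDE \\ ABF &= ACD=BDE=CEF \end{align} }[/math]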
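The alias structure above can be regenerated from the two fraction generators. In the sketch below, the generators E = ABC and F = BCD are first rewritten as the words ABCE and BCDF, the complete defining relation is formed from all products of the generator words, and each effect is then multiplied by every word. The helper names (`multiply`, `defining_words`, `aliases_of`) are hypothetical and written only for this illustration.

```python
from itertools import combinations

def multiply(*words):
    """Multiply effects: letters that appear an even number of times cancel."""
    letters = set()
    for word in words:
        letters ^= set(word)
    return "".join(sorted(letters)) or "I"

def defining_words(generator_words):
    """All products of one or more generator words (2^p - 1 words in total)."""
    words = []
    for size in range(1, len(generator_words) + 1):
        for combo in combinations(generator_words, size):
            words.append(multiply(*combo))
    return words

# E = ABC and F = BCD written as words of the defining relation.
WORDS = defining_words(["ABCE", "BCDF"])

def aliases_of(effect):
    """Effects aliased with the given effect, shortest first."""
    return sorted((multiply(effect, w) for w in WORDS), key=lambda s: (len(s), s))

print("I =", " = ".join(WORDS))
for effect in ("A", "BF", "ABD"):
    print(effect, "=", " = ".join(aliases_of(effect)))
```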

In DOE++, the alias structure is displayed in the Design Summary and as part of the Design Evaluation result, as shown next:

Alias structure for the experiment design in the example.

The normal probability plot of effects for this unreplicated design shows the main effects of factors C and D and the interaction effect, BF, to be significant (see the following figure).


Normal probability plot of effects for the experiment in the example.


From the alias structure, it can be seen that, for the present design, the interaction effect BF is confounded with CD. Therefore, the actual source of this effect cannot be determined from the present experiment alone. However, because neither factor B nor factor F is found to be significant, the observed effect is likely due to the interaction CD. To confirm this, a follow-up [math]\displaystyle{ 2^{2}\,\! }[/math] experiment is run involving only factors B and F. The interaction BF is found to be inactive, leading to the conclusion that the interaction effect in the original experiment is the effect CD. Given these results, the fitted regression model for the fuel cone design, as per the coefficients obtained from DOE++, is shown next.

[math]\displaystyle{ \hat{y}=7.6875+C+2D+2.1875CD\,\! }[/math]


Effect coefficients for the experiment in the example.

Projection

Projection refers to the reduction of a fractional factorial design to a full factorial design by dropping some of the factors of the design. Any fractional factorial design of resolution R can be reduced to complete factorial designs in any subset of [math]\displaystyle{ R-1\,\! }[/math] factors. For example, consider the [math]\displaystyle{ 2_{IV}^{7-3}\,\! }[/math] design. The resolution of this design is four. Therefore, this design can be reduced to full factorial designs in any three ([math]\displaystyle{ 4-1=3\,\! }[/math]) of the original seven factors (by dropping the remaining four factors). Further, a fractional factorial design can also be reduced to a full factorial design in any R of the original factors, as long as these R factors do not appear together as a generator in the defining relation. Again consider the [math]\displaystyle{ 2_{IV}^{7-3}\,\! }[/math] design. This design can be reduced to a full factorial design in four factors provided these four factors do not appear together as a generator in the defining relation. The complete defining relation for this design is:


I=ABCE=BCDF=ACDG=ADEF=ABFG=BDEG=CEFG


Therefore, seven of the 35 ([math]\displaystyle{ \binom{7}{4}=35\,\! }[/math]) possible four-factor combinations are used as generators in the defining relation. The designs with the remaining 28 four-factor combinations would be full factorial 16-run designs. For example, factors A, B, C and D do not occur as a generator in the defining relation of the [math]\displaystyle{ 2_{IV}^{7-3}\,\! }[/math] design. If the remaining factors, E, F and G, are dropped, the [math]\displaystyle{ 2_{IV}^{7-3}\,\! }[/math] design will reduce to a full factorial design in A, B, C and D.
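The projection property can be checked by enumeration. The sketch below builds the 16 runs and counts the distinct level combinations for each subset of factors. It assumes the generators E=ABC, F=BCD and G=ACD, which are not stated in the text but reproduce the defining relation given above. The function names are hypothetical.

```python
from itertools import combinations, product

FACTORS = "ABCDEFG"


def build_design():
    """16 runs: full factorial in A, B, C, D plus the generated columns.

    Assumed generators (consistent with the defining relation in the text):
    E = ABC, F = BCD, G = ACD.
    """
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        runs.append({"A": a, "B": b, "C": c, "D": d,
                     "E": a * b * c, "F": b * c * d, "G": a * c * d})
    return runs


def is_full_factorial(runs, subset):
    """True if every level combination of the factors in `subset` occurs."""
    distinct = {tuple(run[f] for f in subset) for run in runs}
    return len(distinct) == 2 ** len(subset)


runs = build_design()

# Every set of three factors projects to a (replicated) full factorial.
triples_ok = all(is_full_factorial(runs, s) for s in combinations(FACTORS, 3))

# Four-factor subsets split into full factorials and collapsed projections.
full = ["".join(s) for s in combinations(FACTORS, 4)
        if is_full_factorial(runs, s)]
collapsed = ["".join(s) for s in combinations(FACTORS, 4)
             if not is_full_factorial(runs, s)]

print("All three-factor projections are full factorials:", triples_ok)
print("Four-factor full factorial projections:", len(full))
print("Four-factor subsets that are not full factorials:", collapsed)
```

The four-factor subsets that fail to give a full factorial are exactly the seven words of the defining relation.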

Resolution III Designs

At times, the number of factors to be investigated in screening experiments is so large that even running a fractional factorial design is impractical. This can be partially solved by using resolution III fractional factorial designs in the cases where three factor and higher order interactions can be assumed to be unimportant. Resolution III designs, such as the [math]\displaystyle{ 2_{III}^{3-1}\,\! }[/math] design, can be used to estimate k main effects using just k+1 runs. In these designs, the main effects are aliased with two factor interactions. Once the results from these designs are obtained, and knowing that three factor and higher order interactions are unimportant, the experimenter can decide if there is a need to run a fold-over design to de-alias the main effects from the two factor interactions. Thus, the [math]\displaystyle{ 2_{III}^{3-1}\,\! }[/math] design can be used to investigate three factors in four runs, the [math]\displaystyle{ 2_{III}^{7-4}\,\! }[/math] design can be used to investigate seven factors in eight runs, the [math]\displaystyle{ 2_{III}^{15-11}\,\! }[/math] design can be used to investigate fifteen factors in sixteen runs and so on.

Example

A baker wants to investigate the factors that most affect the taste of the cakes made in his bakery. He chooses to investigate seven factors, each at two levels: flour type (factor A), conditioner type (factor B), sugar quantity (factor C), egg quantity (factor D), preservative type (factor E), bake time (factor F) and bake temperature (factor G). The baker expects most of these factors and all higher order interactions to be inactive. On the basis of this, he decides to run a screening experiment using a [math]\displaystyle{ 2_{III}^{7-4}\,\! }[/math] design that requires just 8 runs. The cakes are rated on a scale of 1 to 10. The design properties for the [math]\displaystyle{ 2_{III}^{7-4}\,\! }[/math] design (with generators D=AB, E=AC, F=BC and G=ABC) are shown in the following figure.


Design properties for the experiment in the example.


The resulting design along with the rating of the cakes corresponding to each run is shown in the following figure.


Experiment design for the example.


The normal probability plot of effects for the unreplicated design shows main effects B, C, D and G to be significant, as shown in the next figure.


Normal probability plot of effects for the experiment in the example.


However, for this design, the following alias relations exist for the main effects:

A = A+BD+CE+FG
B = B+AD+CF+EG
C = C+AE+BF+DG
D = D+AB+CG+EF
E = E+AC+BG+DF
F = F+BC+AG+DE
G = G+CD+BE+AF


Based on the alias structure, three separate conclusions are possible. It can be concluded that effect CD is active instead of G, so that effects C, D and their interaction, CD, are the significant effects. Another conclusion can be that effect DG is active instead of C, so that effects D, G and their interaction, DG, are significant. Yet another conclusion can be that effects C, G and their interaction, CG, are significant. To accurately discover the active effects, the baker decides to run a fold-over of the present design and base his conclusions on the effect values calculated once results from both designs are available.


The effect values for the present design are shown next.


Effect values for the experiment in the example.


Using the alias relations, the effects obtained from the DOE folio for the present design can be expressed as:


Effect A = 0.025 = A+BD+CE+FG
Effect B = 0.225 = B+AD+CF+EG
Effect C = 2.075 = C+AE+BF+DG
Effect D = 2.875 = D+AB+CG+EF
Effect E = 0.025 = E+AC+BG+DF
Effect F = 0.075 = F+BC+AG+DE
Effect G = 3.825 = G+CD+BE+AF


The fold-over design for the experiment is obtained by reversing the signs of the columns D, E, and F. In a DOE folio, you can fold over a design using the following window.

Fold-over design window

The resulting design and the corresponding response values obtained are shown in the following figures.

Fold-over design for the experiment in the example.


Effect values for the fold-over design in the example.


Comparing the absolute values of the effects, the active effects are B, C, D and the interaction CD. Therefore, the most important factors affecting the taste of the cakes in the present case are sugar quantity, egg quantity and their interaction.
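The de-aliasing achieved by the fold-over can be checked numerically. The sketch below builds the 8-run design from the generators D=AB, E=AC, F=BC and G=ABC, creates the fold-over by reversing the signs of columns D, E and F, and then lists the two factor interactions whose columns are not orthogonal to each main effect column. The function names are hypothetical and no response data are used.

```python
from itertools import combinations, product

FACTORS = "ABCDEFG"


def base_design():
    """8-run design with generators D=AB, E=AC, F=BC and G=ABC."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        runs.append({"A": a, "B": b, "C": c, "D": a * b,
                     "E": a * c, "F": b * c, "G": a * b * c})
    return runs


def fold_over(runs, reversed_factors):
    """Copy the design with the signs of the listed columns reversed."""
    return [{f: (-v if f in reversed_factors else v) for f, v in run.items()}
            for run in runs]


def aliased_interactions(runs, main):
    """Two factor interactions whose column is not orthogonal to `main`."""
    found = []
    for p, q in combinations(FACTORS, 2):
        if main in (p, q):
            continue
        dot = sum(run[main] * run[p] * run[q] for run in runs)
        if dot != 0:
            found.append(p + q)
    return found


original = base_design()
combined = original + fold_over(original, "DEF")

for factor in FACTORS:
    print(factor,
          "| original:", aliased_interactions(original, factor),
          "| combined:", aliased_interactions(combined, factor))
```

In the combined 16-run design every list is empty, so each main effect can be estimated free of two factor interactions. This is why the baker can separate G from CD once both sets of results are available.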

Alias Matrix

In Half-fraction Designs and Quarter and Smaller Fraction Designs, the alias structure for fractional factorial designs was obtained using the defining relation. However, this method of obtaining the alias structure is not very efficient when the alias structure is very complex or when partial aliasing is involved. One way to obtain the alias structure for any design, regardless of its complexity, is to use the alias matrix. The alias matrix for a design is calculated using [math]\displaystyle{ (X_{1}'X_{1})^{-1}X_{1}'X_{2}\,\! }[/math] where [math]\displaystyle{ X_{1}\,\! }[/math] is the portion of the design matrix, [math]\displaystyle{ X\,\! }[/math], that contains the effects for which the aliases need to be calculated, and [math]\displaystyle{ X_{2}\,\! }[/math] contains the remaining columns of the design matrix, other than those included in [math]\displaystyle{ X_{1}\,\! }[/math].


To illustrate the use of the alias matrix, consider the design matrix for the [math]\displaystyle{ 2_{IV}^{4-1}\,\! }[/math] design (using the defining relation I=ABCD) shown next:


Chapter7 879.png


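The alias structure for this design can be obtained by defining [math]\displaystyle{ X_{1}\,\! }[/math] using eight columns since the [math]\displaystyle{ 2_{IV}^{4-1}\,\! }[/math] design estimates eight effects. If the first eight columns of [math]\displaystyle{ X\,\! }[/math] are used then [math]\displaystyle{ X_{1}\,\! }[/math] is: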


Chapter7 884.png


[math]\displaystyle{ X_{2}\,\! }[/math] is obtained using the remaining columns as:


Chapter7 886.png


Then the alias matrix [math]\displaystyle{ (X_{1}'X_{1})^{-1}X_{1}'X_{2}\,\! }[/math] is:


Chapter7 888.png


The alias relations can be read directly from the alias matrix as:


I=ABCD
A=BCD
B=ACD
AB=CD
C=ABD
AC=BD
BC=AD
D=ABC
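The alias matrix calculation can be sketched in plain Python with exact fractions. The column order used here is an assumption, because the matrices in the text are shown only as images: the first block holds I, A, B, AB, C, AC, BC and D, and the second block holds the remaining eight terms of the full four-factor model. The helper names are hypothetical, and the linear system is solved by Gauss-Jordan elimination rather than by forming the inverse explicitly.

```python
from fractions import Fraction
from itertools import product


def column(run, term):
    """Value of a model term (for example 'AB') in one run; 'I' is the intercept."""
    value = 1
    if term != "I":
        for letter in term:
            value *= run[letter]
    return value


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a * X = b for X by Gauss-Jordan elimination with exact fractions."""
    n = len(a)
    aug = [[Fraction(v) for v in ra] + [Fraction(v) for v in rb]
           for ra, rb in zip(a, b)]
    for i in range(n):
        pivot = next(r for r in range(i, n) if aug[r][i] != 0)
        aug[i], aug[pivot] = aug[pivot], aug[i]
        scale = aug[i][i]
        aug[i] = [v / scale for v in aug[i]]
        for r in range(n):
            if r != i and aug[r][i] != 0:
                factor = aug[r][i]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[i])]
    return [row[n:] for row in aug]


# Half fraction of the 2^4 design with I = ABCD, that is D = ABC.
runs = [{"A": a, "B": b, "C": c, "D": a * b * c}
        for a, b, c in product((-1, 1), repeat=3)]

x1_terms = ["I", "A", "B", "AB", "C", "AC", "BC", "D"]  # assumed column order
x2_terms = ["AD", "BD", "CD", "ABC", "ABD", "ACD", "BCD", "ABCD"]
X1 = [[column(r, t) for t in x1_terms] for r in runs]
X2 = [[column(r, t) for t in x2_terms] for r in runs]

X1t = transpose(X1)
alias_matrix = solve(matmul(X1t, X1), matmul(X1t, X2))

alias = {}
for term, row in zip(x1_terms, alias_matrix):
    alias[term] = [t for t, v in zip(x2_terms, row) if v != 0]
    print(term, "=", " + ".join(alias[term]))
```

Each row of the alias matrix corresponds to an effect in the first block, and its non-zero entries mark the terms of the second block that bias the estimate of that effect. For a regular fraction such as this one the non-zero entries are all ±1. With partial aliasing they would take fractional values.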