ANOVA for Designed Experiments
In Simple Linear Regression Analysis and Multiple Linear Regression Analysis, methods were presented to model the relationship between a response and the associated factors (referred to as predictor variables in the context of regression) based on an observed data set. Such studies, where observed values of the response are used to establish an association between the response and the factors, are called observational studies. In observational studies, however, it is difficult to establish a cause-and-effect relationship between the observed factors and the response, because a number of alternative explanations can account for the observed change in the response values. For example, a regression model fitted to data on the population of cities and road accidents might show a positive relation. This relation does not imply that an increase in a city's population causes an increase in road accidents. Other factors, such as road conditions, traffic control and the degree to which residents follow traffic rules, may also affect the number of road accidents, and the increase in accidents seen in the study may be caused by these factors. Since the observational study does not take the effect of these factors into account, it does not justify the conclusion that an increase in a city's population will lead to an increase in road accidents. The population of a city may increase, for instance, while road accidents decrease because of better traffic control. To establish a cause-and-effect relationship, the study should be conducted in such a way that the effect of all other factors is excluded from the investigation.
The studies that enable the establishment of a cause-and-effect relationship are called experiments. In experiments, the response is investigated by studying only the effect of the factor(s) of interest and excluding all other effects that may provide alternative explanations for the observed change in response. This is done in two ways. First, the levels of the factors to be investigated are carefully selected and then strictly controlled during the execution of the experiment. The selection of the factor levels to be investigated is called the design of the experiment. Second, observations in an experiment are recorded in a random order. The intent is that the effects of all other factors not being investigated will cancel out, so that the change in the response is the result of only the investigated factors. Using these two techniques, experiments tend to rule out alternative explanations for observed changes in the response, thereby enabling the establishment of a cause-and-effect relationship between the response and the investigated factors.
Randomization
Recording the observations of an experiment in a random order is referred to as randomization. Specifically, randomization is the process of assigning the various levels of the investigated factors to the experimental units in a random fashion. An experiment is said to be completely randomized if every experimental unit has the same probability of being subjected to any given level of a factor. The importance of randomization can be illustrated using an example. Consider an experiment where the effect of the speed of a lathe machine on the surface finish of a product is being investigated. In order to save time, the experimenter records surface finish values by running the lathe machine continuously and recording observations in the order of increasing speeds. The analysis of the experiment data shows that an increase in lathe speed causes a decrease in the quality of surface finish. However, the results of the experiment are disputed by the lathe operator, who claims that he has been able to obtain better surface finish quality by operating the lathe machine at higher speeds. It is later found that the faulty results were caused by overheating of the tool used in the machine. Since the lathe was run continuously in the order of increasing speeds, the observations were also recorded in the order of increasing tool temperatures, so the effect of speed could not be separated from the effect of temperature. This problem could have been avoided if the experimenter had randomized the experiment and taken readings at the various lathe speeds in a random order. This would have required the experimenter to stop and restart the machine at every observation, thereby keeping the temperature of the tool within a reasonable range. Randomization would have ensured that the effect of heating of the machine tool is not confused with the effect of lathe speed.
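A completely randomized run order of the kind described above can be generated with a few lines of Python. This is only a minimal sketch: the function name `randomized_run_order`, the lathe speeds and the seed are hypothetical choices made for illustration and are not taken from the experiment in the text.

```python
import random

def randomized_run_order(levels, replicates, seed=None):
    """Return every (level, replicate) run of the experiment in a random order."""
    runs = [(level, rep) for level in levels for rep in range(1, replicates + 1)]
    rng = random.Random(seed)  # a seed is used only so the plan can be reproduced
    rng.shuffle(runs)          # each ordering of the runs is equally likely
    return runs

# Hypothetical lathe speeds (rpm), three levels with four replicates each.
speeds = [500, 600, 700]
plan = randomized_run_order(speeds, replicates=4, seed=1)
for run_number, (speed, rep) in enumerate(plan, start=1):
    print(run_number, speed, rep)
```

Running the lathe in the printed order, instead of in order of increasing speed, is what prevents a slowly drifting condition such as tool temperature from lining up with the factor levels.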
Analysis of Single Factor Experiments
As explained in Simple Linear Regression Analysis and Multiple Linear Regression Analysis, the analysis of observational studies involves the use of regression models. The analysis of experimental studies involves the use of analysis of variance (ANOVA) models. For a comparison of the two models see Fitting ANOVA Models. In single factor experiments, ANOVA models are used to compare the mean response values at different levels of the factor. Each level of the factor is investigated to see if the response is significantly different from the response at other levels of the factor. The analysis of single factor experiments is often referred to as one-way ANOVA.
To illustrate the use of ANOVA models in the analysis of experiments, consider a single factor experiment where the analyst wants to see if the surface finish of certain parts is affected by the speed of a lathe machine. Data is collected for three speeds (or three treatments). Each treatment is replicated four times. Therefore, this experiment design is balanced. Surface finish values recorded using randomization are shown in the following table.
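The ANOVA model for this experiment can be stated as follows:

$$Y_{ij} = \mu_i + \epsilon_{ij}$$

The ANOVA model assumes that the response at each factor level, $i$, is the sum of the mean response at the $i$th level, $\mu_i$, and a random error term, $\epsilon_{ij}$, where the subscript $j$ denotes the replicate. This form of the model is referred to as the means model.
The means model can also be written using $\mu_i = \mu + \tau_i$, where $\mu$ represents the overall mean and $\tau_i$ represents the effect due to the $i$th treatment:

$$Y_{ij} = \mu + \tau_i + \epsilon_{ij}$$

Such an ANOVA model is called the effects model. In the effects model the treatment effects, $\tau_i$, represent the deviations of the treatment means from the overall mean, $\mu$, so that $\sum_i \tau_i = 0$.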
Fitting ANOVA Models
To fit ANOVA models and carry out hypothesis testing in single factor experiments, it is convenient to express the effects model in the form $y = X\beta + \epsilon$ used for multiple linear regression models.
where
For the first treatment, the ANOVA model for the single factor experiment in the above table can be written as:
Using
Models for the second and third treatments can be obtained in a similar way. The models for the three treatments are:
The coefficients of the treatment effects
Using the indicator variables
The equation can be rewritten by including subscripts
The equation given above represents the "regression version" of the ANOVA model.
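The indicator variables behind the regression version can be illustrated with a short Python sketch. It assumes effect (sum-to-zero) coding, in which the last treatment is coded as -1 in every indicator column so that the treatment effects add up to zero; the function names and this coding choice are assumptions made for illustration and may differ from the coding used in the equations of the original text.

```python
def effect_coded_row(level, n_levels):
    """Return [1, x_1, ..., x_(n_levels-1)] for a 0-based treatment level.

    Assumed coding: each of the first n_levels-1 treatments gets a single 1,
    and the last treatment gets -1 in every indicator column.
    """
    if level == n_levels - 1:
        indicators = [-1] * (n_levels - 1)
    else:
        indicators = [1 if col == level else 0 for col in range(n_levels - 1)]
    return [1] + indicators  # leading 1 multiplies the overall mean

def design_matrix(n_levels, replicates):
    """Stack one coded row per observation, treatment by treatment."""
    return [effect_coded_row(level, n_levels)
            for level in range(n_levels)
            for _ in range(replicates)]

# Three treatments, four replicates each: a 12 x 3 design matrix.
X = design_matrix(n_levels=3, replicates=4)
for row in X:
    print(row)
```

With this coding only two indicator variables are needed for three treatments, because the third treatment effect is determined by the other two.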
Comparison of ANOVA and Regression Models
It can be seen from the equation given above that in an ANOVA model each factor is treated as a qualitative factor. In the present example the factor, lathe speed, is a quantitative factor with three levels. But the ANOVA model treats this factor as a qualitative factor with three levels. Therefore, two indicator variables,
The choice of the two models for a particular data set depends on the objective of the experimenter. In the case of the data of the first table, the objective of the experimenter is to compare the levels of the factor to see if change in the levels leads to a significant change in the response. The objective is not to make predictions on the response for a given level of the factor. Therefore, the ANOVA model is used in this case. If the objective of the experimenter were prediction or optimization, the experimenter would use a regression model and focus on aspects such as the nature of relationship between the factor, lathe speed, and the response, surface finish, so that the regression model obtained is able to make accurate predictions.
Expression of the ANOVA Model in the Form $y = X\beta + \epsilon$
The regression version of the ANOVA model can be expanded for the three treatments and four replicates of the data in the first table as follows:
The corresponding matrix notation is:
- where
- Thus:
The matrices
Hypothesis Test in Single Factor Experiments
The hypothesis test in single factor experiments examines the ANOVA model to see if the response at any level of the investigated factor is significantly different from that at the other levels. If this is not the case and the response at all levels is not significantly different, then it can be concluded that the investigated factor does not affect the response. The test on the ANOVA model is carried out by checking to see if any of the treatment effects,
The test for
where
Calculation of the Statistic $F_0$
The sum of squares to obtain the statistic
In the previous equation,
The total sum of squares,
In the previous equation,
Knowing
The number of degrees of freedom associated with
The test statistic can now be calculated using the equation given in Hypothesis Test in Single Factor Experiments as:
The
Assuming that the desired significance level is 0.1, since
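The steps of this calculation (treatment sum of squares, error sum of squares, their degrees of freedom, the mean squares and their ratio) can be sketched in plain Python. The data below are hypothetical values chosen so the arithmetic is easy to follow; they are not the surface finish values of the experiment in the text, and the function and key names are illustrative only.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and the F ratio for one factor."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Treatment sum of squares: variation of the treatment means about the grand mean.
    ss_treatment = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Error sum of squares: variation of observations about their own treatment mean.
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_treatment = len(groups) - 1
    df_error = n_total - len(groups)
    f_ratio = (ss_treatment / df_treatment) / (ss_error / df_error)
    return {"ss_treatment": ss_treatment, "ss_error": ss_error,
            "df_treatment": df_treatment, "df_error": df_error, "f": f_ratio}

# Hypothetical responses: three treatments, four replicates each.
result = one_way_anova([[6, 8, 7, 7], [10, 12, 11, 11], [14, 15, 16, 15]])
print(result)
```

The resulting ratio is then compared with the F distribution having the treatment and error degrees of freedom; obtaining the corresponding p value requires a statistical library or tables and is left out of this sketch.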
Confidence Interval on the ith Treatment Mean
The response at each treatment of a single factor experiment can be assumed to be a normal population with a mean of
has a
For example, for the first treatment of the lathe speed we have:
In DOE++, this value is displayed as the Estimated Mean for the first level, as shown in the Data Summary table in the figure below. The value displayed as the standard deviation for this level is simply the sample standard deviation calculated using the observations corresponding to this level. The 90% confidence interval for this treatment is:
The 90% limits on
Confidence Interval on the Difference in Two Treatment Means
The confidence interval on the difference in two treatment means,
For balanced designs all
The standard deviation for
The
Then a 100 (1-
For example, an estimate of the difference in the first and second treatment means of the lathe speed,
The pooled standard error for this difference is:
To test
In DOE++, the value of the statistic is displayed in the Mean Comparisons table under the column T Value as shown in the figure below. The 90% confidence interval on the difference
Hence the 90% limits on
Since
Residual Analysis
Plots of residuals,
Equality of variance is checked by plotting residuals against the treatments and the treatment averages,
Box-Cox Method
Transformations on the response may be used when residual plots for an experiment show a pattern, which indicates that the equality of variance does not hold for the residuals of the given model. The Box-Cox method can be used to automatically identify a suitable power transformation for the data based on the relation:
where
Confidence intervals on the selected
The required limits for
Here
Example
To illustrate the Box-Cox method, consider the experiment given in the first table. Transformed response values for various values of
A plot of
Therefore,
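The search described in this example can be sketched in Python as a scan over candidate powers, keeping the one that gives the smallest error sum of squares. The sketch assumes the commonly used geometric-mean-scaled form of the Box-Cox transformation, which makes error sums of squares comparable across powers; the exact form used in the original equations is not recoverable here, and the data and function names are hypothetical.

```python
import math

def box_cox(values, lam):
    """Assumed scaled Box-Cox transform of positive values for the power lam."""
    geo_mean = math.exp(sum(math.log(y) for y in values) / len(values))
    if lam == 0:
        return [geo_mean * math.log(y) for y in values]
    scale = lam * geo_mean ** (lam - 1)
    return [(y ** lam - 1) / scale for y in values]

def error_ss(groups):
    """Error sum of squares of a single factor layout."""
    return sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

def transformed_error_ss(groups, lam):
    """Transform all responses together, regroup them, and return the error SS."""
    flat = box_cox([y for g in groups for y in g], lam)
    regrouped, start = [], 0
    for g in groups:
        regrouped.append(flat[start:start + len(g)])
        start += len(g)
    return error_ss(regrouped)

def best_lambda(groups, candidates):
    """Return (lam, error SS) for the candidate power with the smallest error SS."""
    return min(((lam, transformed_error_ss(groups, lam)) for lam in candidates),
               key=lambda pair: pair[1])

# Hypothetical positive responses whose spread grows with the treatment mean.
groups = [[4.0, 5.0, 6.0, 5.0], [9.0, 12.0, 15.0, 12.0], [20.0, 28.0, 36.0, 28.0]]
candidates = [x / 4 for x in range(-8, 9)]  # -2.0, -1.75, ..., 2.0
print(best_lambda(groups, candidates))
```

A power of 1 leaves the error sum of squares unchanged under this scaling, so it serves as the reference against which the other candidates are judged.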
Experiments with Several Factors - Factorial Experiments
Experiments with two or more factors are encountered frequently. The best way to carry out such experiments is by using factorial experiments. Factorial experiments are experiments in which all combinations of factors are investigated in each replicate of the experiment. Factorial experiments are the only means to completely and systematically study interactions between factors in addition to identifying significant factors. One-factor-at-a-time experiments (where each factor is investigated separately by keeping all the remaining factors constant) do not reveal the interaction effects between the factors. Further, in one-factor-at-a-time experiments full randomization is not possible.
To illustrate factorial experiments consider an experiment where the response is investigated for two factors,
Investigating Factor Effects
The effect of factor
Therefore, when
Investigating Interactions
Now assume that the response values for each of the four treatment combinations were obtained as shown in the fourth table. The main effect of
It appears that
Note that in this case, if a one-factor-at-a-time experiment were used to investigate the effect of factor
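The main effect and interaction calculations for two factors at two levels each can be written out as a small Python function. The response values used below are hypothetical and are not the entries of the third and fourth tables; the argument names (for example `y_hl` for factor A high and factor B low) are illustrative.

```python
def two_level_effects(y_ll, y_hl, y_lh, y_hh):
    """Effects in a two-factor, two-level experiment.

    The first letter of each argument is the level of factor A and the second
    is the level of factor B (l = low, h = high).
    """
    # Main effect: average response at the high level minus average at the low level.
    effect_a = (y_hl + y_hh) / 2 - (y_ll + y_lh) / 2
    effect_b = (y_lh + y_hh) / 2 - (y_ll + y_hl) / 2
    # Interaction: half the difference between the effect of A at high B and at low B.
    interaction = ((y_hh - y_lh) - (y_hl - y_ll)) / 2
    return effect_a, effect_b, interaction

# Hypothetical responses without interaction: A raises the response by 10 at both levels of B.
no_interaction = two_level_effects(20, 30, 40, 50)
# Hypothetical responses with interaction: the effect of A reverses when B is high.
with_interaction = two_level_effects(20, 30, 40, 10)
print(no_interaction, with_interaction)
```

In the second case the main effect of A is an average of two opposite effects, which is why a main effect has to be read together with the interaction.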
Analysis of General Factorial Experiments
In DOE++, factorial experiments are referred to as factorial designs. The experiments explained in this section are referred to as general factorial designs. This is done to distinguish these experiments from the other factorial designs supported by DOE++ (see the figure below). The other designs (such as the 2 level full factorial designs that are explained in Two Level Factorial Experiments) are special cases of these experiments in which factors are limited to a specified number of levels. The ANOVA model for the analysis of factorial experiments is formulated as shown next. Assume a factorial experiment in which the effect of two factors, $A$ and $B$, on the response is being investigated, with $n_a$ levels of factor $A$, $n_b$ levels of factor $B$ and $m$ replicates. The ANOVA model for this experiment can be stated as:

$$Y_{ijk} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + \epsilon_{ijk}$$

where:
- $\mu$ represents the overall mean effect
- $\tau_i$ is the effect of the $i$th level of factor $A$ ($i = 1, 2, \ldots, n_a$)
- $\delta_j$ is the effect of the $j$th level of factor $B$ ($j = 1, 2, \ldots, n_b$)
- $(\tau\delta)_{ij}$ represents the interaction effect between $A$ and $B$
- $\epsilon_{ijk}$ represents the random error terms (which are assumed to be normally distributed with a mean of zero and variance of $\sigma^2$)
- and the subscript $k$ denotes the replicates ($k = 1, 2, \ldots, m$)
Since the effects
Hypothesis Tests in General Factorial Experiments
These tests are used to check whether each of the factors investigated in the experiment is significant or not. For the previous example, with two factors,
The test statistics for the three tests are as follows:
- 1) $F_0 = MS_A / MS_E$, where $MS_A$ is the mean square due to factor $A$ and $MS_E$ is the error mean square.
- 2) $F_0 = MS_B / MS_E$, where $MS_B$ is the mean square due to factor $B$ and $MS_E$ is the error mean square.
- 3) $F_0 = MS_{AB} / MS_E$, where $MS_{AB}$ is the mean square due to interaction $AB$ and $MS_E$ is the error mean square.
The tests are identical to the partial
where
Similarly the test statistic to test significance of factor
It is recommended to conduct the test for interactions before conducting the test for the main effects. This is because, if an interaction is present, then the main effect of the factor depends on the level of the other factors and looking at the main effect is of little value. However, if the interaction is absent then the main effects become important.
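The three test statistics can be computed for a balanced two-factor experiment with a short Python sketch that reports the interaction first, following the recommendation above. The sketch uses direct summation, which is valid only for balanced designs, and the data are hypothetical values for three levels of one factor, two levels of the other and three replicates; they are not the mileage values of the fifth table.

```python
def two_way_anova(cells):
    """F ratios for a balanced two-factor design.

    cells[i][j] is the list of replicate responses at level i of factor A
    and level j of factor B; every cell must have the same number of replicates.
    """
    a, b, m = len(cells), len(cells[0]), len(cells[0][0])
    grand = sum(y for row in cells for cell in row for y in cell) / (a * b * m)
    mean_a = [sum(y for cell in row for y in cell) / (b * m) for row in cells]
    mean_b = [sum(y for row in cells for y in row[j]) / (a * m) for j in range(b)]
    mean_ab = [[sum(cell) / m for cell in row] for row in cells]

    ss_a = b * m * sum((x - grand) ** 2 for x in mean_a)
    ss_b = a * m * sum((x - grand) ** 2 for x in mean_b)
    ss_ab = m * sum((mean_ab[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_e = sum((y - mean_ab[i][j]) ** 2
               for i in range(a) for j in range(b) for y in cells[i][j])

    ms_e = ss_e / (a * b * (m - 1))  # pure error from the replicates
    return {
        "F_AB": ss_ab / ((a - 1) * (b - 1)) / ms_e,  # interaction, examined first
        "F_A": ss_a / (a - 1) / ms_e,
        "F_B": ss_b / (b - 1) / ms_e,
        "ss": {"A": ss_a, "B": ss_b, "AB": ss_ab, "E": ss_e},
    }

# Hypothetical balanced data: 3 levels of A, 2 levels of B, 3 replicates per cell.
data = [
    [[16, 17, 18], [18, 19, 20]],
    [[18, 19, 20], [20, 21, 22]],
    [[20, 21, 22], [25, 26, 27]],
]
result = two_way_anova(data)
print(result)
```

Each ratio would then be compared with the F distribution that has the corresponding numerator degrees of freedom and the error degrees of freedom.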
Example
Consider an experiment to investigate the effect of speed and type of fuel additive used on the mileage of a sports utility vehicle. Three speeds and two types of fuel additives are investigated. Each of the treatment combinations is replicated three times. The mileage values observed are displayed in the fifth table.
The experimental design for the data in the fifth table is shown in the figure below. In the figure, the factor Speed is represented as factor
The test statistics for the three tests are:
- 1. $F_0 = MS_A / MS_E$, where $MS_A$ is the mean square for factor $A$ and $MS_E$ is the error mean square
- 2. $F_0 = MS_B / MS_E$, where $MS_B$ is the mean square for factor $B$ and $MS_E$ is the error mean square
- 3. $F_0 = MS_{AB} / MS_E$, where $MS_{AB}$ is the mean square for interaction $AB$ and $MS_E$ is the error mean square
The ANOVA model for this experiment can be written as:
where
Expression of the ANOVA Model as $y = X\beta + \epsilon$
Since the effects
Therefore, only two of the
Therefore, only one of the
The last five equations given above represent four constraints, as only four of these five equations are independent. Therefore, only two out of the six
The regression version of the ANOVA model can be obtained using indicator variables, similar to the case of the single factor experiment in Fitting ANOVA Models. Since factor
Factor
The
In matrix notation this model can be expressed as:
- where:
The vector
Knowing
Calculation of Sum of Squares for the Model
The model sum of squares,
where
The total sum of squares,
Since there are 18 observed response values, the number of degrees of freedom associated with the total sum of squares is 17 (
Since there are three replicates of the full factorial experiment, all of the error sum of squares is pure error. (This can also be seen from the preceding figure, where each treatment combination of the full factorial design is repeated three times.) The number of degrees of freedom associated with the error sum of squares is:
Calculation of Extra Sum of Squares for the Factors
The sequential sum of squares for factor
where
Since there are two independent effects (
Similarly, the sum of squares for factor
Since there is one independent effect,
The sum of squares for the interaction
Since there are two independent interaction effects,
Calculation of the Test Statistics
Knowing the sum of squares, the test statistic for each of the factors can be calculated. Analyzing the interaction first, the test statistic for interaction
The
Assuming that the desired significance level is 0.1, since
The test statistic for factor
The
Since
The test statistic for factor
The
Since
Calculation of Effect Coefficients
Results for the effect coefficients of the model of the regression version of the ANOVA model are displayed in the Regression Information table in the following figure. Calculations of the results in this table are discussed next. The effect coefficients can be calculated as follows:
Therefore,
For example, the standard error for
Then the
The
Confidence intervals on
Thus, the 90% limits on
Least Squares Means
The estimated mean response corresponding to the
Residual Analysis
As in the case of single factor experiments, plots of residuals can also be used to check for model adequacy in factorial experiments. Box-Cox transformations are also available in DOE++ for factorial experiments.
Factorial Experiments with a Single Replicate
If a factorial experiment is run only for a single replicate then it is not possible to test hypotheses about the main effects and interactions as the error sum of squares cannot be obtained. This is because the number of observations in a single replicate equals the number of terms in the ANOVA model. Hence the model fits the data perfectly and no degrees of freedom are available to obtain the error sum of squares. For example, if the two factor experiment to study the effect of speed and fuel additive type on mileage was run only as a single replicate there would be only six response values. The regression version of the ANOVA model has six terms and therefore will fit the six response values perfectly. The error sum of squares,
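The degrees of freedom argument above can be checked with a small Python sketch that counts the observations and the terms of the two-factor model with interaction. The function name is hypothetical.

```python
def error_degrees_of_freedom(levels_a, levels_b, replicates):
    """Error degrees of freedom of a two-factor model with interaction."""
    n_observations = levels_a * levels_b * replicates
    # Intercept, main effects of A and B, and the interaction terms.
    n_terms = 1 + (levels_a - 1) + (levels_b - 1) + (levels_a - 1) * (levels_b - 1)
    return n_observations - n_terms

# Speed (3 levels) and fuel additive (2 levels): single replicate versus three replicates.
print(error_degrees_of_freedom(3, 2, 1))
print(error_degrees_of_freedom(3, 2, 3))
```

With a single replicate the count of terms equals the count of observations, so nothing is left to estimate the error.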
Blocking
Many times a factorial experiment requires so many runs that they cannot all be completed under homogeneous conditions. This may lead to the inclusion of the effects of nuisance factors in the investigation. Nuisance factors are factors that have an effect on the response but are not of primary interest to the investigator. For example, two replicates of a two-factor factorial experiment with two levels per factor require eight runs. If four runs take one day to complete, then the whole experiment will require two days. The difference in the conditions on the two days may introduce effects on the response that are not the result of the two factors being investigated. Therefore, the day is a nuisance factor for this experiment. Nuisance factors can be accounted for using blocking. In blocking, experimental runs are separated based on the levels of the nuisance factor. For the two-factor factorial experiment where the day is a nuisance factor, the runs can be separated into two groups or blocks: runs that are carried out on the first day belong to block 1, and runs that are carried out on the second day belong to block 2. Thus, within each block, conditions are the same with respect to the nuisance factor. As a result, each block investigates the effects of the factors of interest, while the difference between the blocks measures the effect of the nuisance factor. For this example, a possible assignment of runs to the blocks is to assign one replicate of the experiment to block 1 and the second replicate to block 2, so that each block contains all possible treatment combinations. Within each block, runs are subjected to randomization (i.e., randomization is now restricted to the runs within a block). Such a design, where each block contains one complete replicate and the treatments within a block are subjected to randomization, is called a randomized complete block design.
In summary, blocking should always be used to account for the effects of nuisance factors if it is not possible to hold the nuisance factor at a constant level through all of the experimental runs. Randomization should be used within each block to counter the effects of any unknown variability that may still be present.
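The restricted randomization of a randomized complete block design can be sketched in Python as follows. Each block receives one complete replicate of all treatment combinations, and the run order is shuffled separately inside each block. The function name, factor levels and seed are hypothetical and serve only as an illustration.

```python
import random
from itertools import product

def rcbd_run_order(factor_levels, n_blocks, seed=None):
    """Return {block number: randomized list of treatment combinations}."""
    rng = random.Random(seed)  # seeded only so the plan can be reproduced
    treatments = list(product(*factor_levels))  # one complete replicate
    plan = {}
    for block in range(1, n_blocks + 1):
        runs = treatments[:]   # every block gets all treatment combinations
        rng.shuffle(runs)      # randomization is restricted to this block
        plan[block] = runs
    return plan

# Hypothetical two-factor experiment with two levels per factor, run over two days.
levels = [["low", "high"], ["low", "high"]]
plan = rcbd_run_order(levels, n_blocks=2, seed=7)
for block, runs in plan.items():
    print(block, runs)
```

Because runs are never shuffled across blocks, the day-to-day difference stays confined to the block effect instead of mixing with the factor effects.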
Example
Consider the experiment of the fifth table where the mileage of a sports utility vehicle was investigated for the effects of speed and fuel additive type. Now assume that the three replicates for this experiment were carried out on three different vehicles. To ensure that the variation from one vehicle to another does not have an effect on the analysis, each vehicle is considered as one block. See the experiment design in the following figure.
For the purpose of the analysis, the block is treated as a main effect, with the assumption that interactions between the block and the other main effects do not exist. Therefore, this experiment has one block main effect (with three levels: block 1, block 2 and block 3), two factor main effects (speed, with three levels, and fuel additive type, with two levels) and one interaction effect (the speed-fuel additive interaction). Let
The test statistic for this test is:
where
where:
- •
represents the overall mean effect is the effect of the th level of the block ( )
- •
is the effect of the th level of factor ( ) - •
is the effect of the th level of factor ( ) - •
represents the interaction effect between and - • and
represents the random error terms (which are assumed to be normally distributed with a mean of zero and variance of )
In order to calculate the test statistics, it is convenient to express the ANOVA model of the equation given above in the form $y = X\beta + \epsilon$.
Expression of the ANOVA Model as $y = X\beta + \epsilon$
Since the effects
Constraints on
Therefore, only two of the
Constraints on
Therefore, only two of the
Therefore, only one of the
The last five equations given above represent four constraints as only four of the five equations are independent. Therefore, only two out of the six
The regression version of the ANOVA model can be obtained using indicator variables. Since the block has three levels, two indicator variables,
Factor
Factor
The
In matrix notation this model can be expressed as:
- or:
Knowing
Calculation of the Sum of Squares for the Model
The model sum of squares,
Since seven effect terms (
The total sum of squares can be calculated as:
Since there are 18 observed response values, the number of degrees of freedom associated with the total sum of squares is 17 (
The number of degrees of freedom associated with the error sum of squares is:
Since there are no true replicates of the treatments (as can be seen from the design of the previous figure, where all of the treatments are seen to be run just once), all of the error sum of squares is the sum of squares due to lack of fit. The lack of fit arises because the model used is not a full model since it is assumed that there are no interactions between blocks and other effects.
Calculation of the Extra Sum of Squares for the Factors
The sequential sum of squares for the blocks can be calculated as:
where
Since there are two independent block effects, and
Similarly, the sequential sum of squares for factor
Sequential sum of squares for the other effects are obtained as
Calculation of the Test Statistics
Knowing the sum of squares, the test statistics for each of the factors can be calculated. For example, the test statistic for the main effect of the blocks is:
The
Assuming that the desired significance level is 0.1, since

Use of Regression to Calculate Sum of Squares
This section explains the reason behind the use of regression in DOE++ in all calculations related to the sum of squares. A number of textbooks present the method of direct summation to calculate the sum of squares. But this method is only applicable for balanced designs and may give incorrect results for unbalanced designs. For example, the sum of squares for factor
where
The analogous term to calculate
where
Applying these relations to the unbalanced data of the last table, the sum of squares for the interaction
which is obviously incorrect since the sum of squares cannot be negative. For a detailed discussion on this refer to [Searle].
The correct sum of squares can be calculated as shown next. The
Then the sum of squares for the interaction
where
This is the value that is calculated by DOE++ (see the first figure below, for the experiment design and the second figure below for the analysis).
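The regression approach to the interaction sum of squares can be sketched in plain Python as the drop in error sum of squares when the interaction columns are added to a model that already contains the main effects. This is a minimal illustration and not the DOE++ implementation: it assumes effect (sum-to-zero) coding, solves the normal equations with a small Gaussian elimination routine, requires every treatment combination to have at least one observation, and uses hypothetical unbalanced data.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def error_ss(X, y):
    """Error sum of squares of the least squares fit of y on the columns of X."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def effect_code(level, n_levels):
    """Assumed effect coding for a 0-based level: the last level is -1 in every column."""
    if level == n_levels - 1:
        return [-1] * (n_levels - 1)
    return [1 if c == level else 0 for c in range(n_levels - 1)]

def interaction_ss(observations, n_a, n_b):
    """observations: list of (level of A, level of B, response); may be unbalanced."""
    reduced, full, y = [], [], []
    for i, j, response in observations:
        xa, xb = effect_code(i, n_a), effect_code(j, n_b)
        main = [1] + xa + xb
        reduced.append(main)                                # main effects only
        full.append(main + [u * v for u in xa for v in xb])  # plus interaction columns
        y.append(response)
    return error_ss(reduced, y) - error_ss(full, y)

# Hypothetical unbalanced data: 3 levels of A, 2 levels of B, unequal cell counts.
unbalanced = [(0, 0, 17.3), (0, 0, 17.9), (0, 1, 18.9), (1, 0, 18.7),
              (1, 1, 19.1), (1, 1, 18.8), (2, 0, 17.4), (2, 1, 17.2)]
print(interaction_ss(unbalanced, n_a=3, n_b=2))
```

Because the full model contains the reduced model, the difference between the two error sums of squares cannot be negative apart from rounding error, which avoids the problem seen with direct summation on unbalanced data.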