## Analysis of Variance

Analysis of variance (ANOVA) is one of the most commonly used statistical techniques in psychological research. The basic approach (and the reason for the name of the procedure) is to use estimates of variability to test hypotheses about group means.

To be more specific, consider an experimental design with a single factor (independent variable) that has, say, four levels. Suppose that the scores at each level are the numbers of items correctly recalled by participants in a memory experiment and the factor is learning strategy; that is, the levels of the factor correspond to different learning strategies. Each learning strategy can be thought of as being associated with a hypothetical population of scores: all the scores that have been, or could be, obtained using the strategy if the experiment were conducted over and over again. If the participants in the current experiment are appropriately chosen and assigned to the learning groups, the scores actually obtained in the four groups can be thought of as random samples from the populations associated with the different strategies. ANOVA can be used to test the null hypothesis that the means of the populations corresponding to the different strategies are all the same. That is, ANOVA provides a procedure for deciding whether the data collected in the experiment provide sufficient evidence to reject the null hypothesis, so that the effect of the strategy factor can be considered statistically significant.
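In symbols, and writing $\mu_1, \dots, \mu_4$ for the means of the four strategy populations (notation introduced here only for illustration), the hypotheses in this example can be stated as:

$$
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4
\qquad\text{versus}\qquad
H_1:\ \text{not all of the } \mu_j \text{ are equal}
$$

Note that rejecting $H_0$ indicates only that at least one population mean differs from the others; it does not say which one.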

Even if the null hypothesis were true, we would not expect all the sample means in our experiment to be equal. Any true differences among the different strategies will be obscured by random error variability in the obtained scores. That is, scores may differ from one another not only because they are associated with different learning strategies, but also because of a possible host of additional variables. For example, some participants might be better learners than others or be more motivated to perform well in the experiment. Perhaps, for some participants, background noise or other factors present during the experiment interfered with learning. Because of this uncontrolled "error" variability, even if participants were assigned randomly to groups so that the groups would not differ systematically, the more talented or motivated participants would not be distributed exactly evenly across the groups, so the group means would be expected to differ from one another. The ANOVA procedure attempts to determine whether the group means associated with the different levels of an independent variable or factor differ from one another by more than would be expected on the basis of the error variability.
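The point that sample means differ even when the null hypothesis holds can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the population mean and standard deviation, the group size, and the normal shape of the population are all hypothetical choices, not figures from any real experiment.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical settings: four groups, all drawn from ONE population,
# so the null hypothesis is true by construction.
N_GROUPS = 4
N_PER_GROUP = 10
POP_MEAN = 20.0   # assumed population mean (items recalled)
POP_SD = 4.0      # assumed population standard deviation


def draw_group_means():
    """Sample every group from the same population; return the sample means."""
    means = []
    for _ in range(N_GROUPS):
        scores = [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_GROUP)]
        means.append(statistics.mean(scores))
    return means


group_means = draw_group_means()
spread = max(group_means) - min(group_means)

print("sample means:", [round(m, 2) for m in group_means])
print("largest minus smallest mean:", round(spread, 2))
```

Although every group is sampled from the same population, the four sample means are not identical. This is the chance variation that ANOVA must take into account before attributing differences among means to the factor.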

The mean of the variances of the scores within each group provides one estimate of the error variability. If the null hypothesis is true, the variability among the group means can be used to generate another estimate of the error variability. Under certain assumptions, the ratio of these two estimates follows an F distribution if the null hypothesis is true. If the null hypothesis is not true, the estimate based on the group means should be larger than that based on the within-group variability, because it includes not only random variability but also systematic variability due to the differences among the population means, and the ratio of the estimates should be larger than would be expected from the F distribution. In standard usage, if the value obtained for the ratio of the two estimates would place it in the extreme upper tail (the usual criterion is the upper 5%) of the F distribution, the null hypothesis is rejected.
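The two estimates and their ratio can be computed by hand for a small data set. The following is a minimal sketch assuming equal group sizes, in which case the estimate based on the group means is the group size multiplied by the variance of the group means. The scores and the strategy labels are invented for illustration.

```python
import statistics

# Invented recall scores for four hypothetical learning strategies,
# five participants per group (equal group sizes are assumed throughout).
groups = {
    "strategy_A": [4, 5, 6, 5, 5],
    "strategy_B": [6, 7, 8, 7, 7],
    "strategy_C": [5, 6, 7, 6, 6],
    "strategy_D": [9, 10, 11, 10, 10],
}

k = len(groups)                          # number of groups
n = len(next(iter(groups.values())))     # scores per group

# Estimate 1: mean of the within-group sample variances.
within_variances = [statistics.variance(s) for s in groups.values()]
ms_within = statistics.mean(within_variances)

# Estimate 2: based on the variability among the group means.
# With equal n, this is n times the sample variance of the group means.
group_means = [statistics.mean(s) for s in groups.values()]
ms_between = n * statistics.variance(group_means)

f_ratio = ms_between / ms_within
df_between = k - 1
df_within = k * (n - 1)

print(f"MS within  = {ms_within:.3f}")
print(f"MS between = {ms_between:.3f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.3f}")
```

For these made-up data the ratio is far above the value of roughly 3.24 that standard tables give as the upper 5% point of the F distribution with 3 and 16 degrees of freedom, so the null hypothesis would be rejected. With unequal group sizes the two estimates must be weighted by group size, which this sketch does not attempt.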

ANOVA can deal with the effects of several factors in the same analysis. If we apply ANOVA to a design with two factors, we can test whether each is significant. Moreover, we can test whether there is a significant interaction between the factors, that is, whether there is a joint effect of the two factors that cannot be determined by considering each factor separately (see the entry dealing with factorial designs).

The null hypotheses tested by an ANOVA are very general. For tests of a main effect, the null hypothesis is that the population means at the different levels of a factor are all equal. For tests of the interactions of two or more factors, the null hypothesis is that the joint effects (that is, the effects that cannot be obtained by adding up the main effects of the factors in question) are all zero.
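To make the idea of a joint effect concrete, the sketch below computes, for a two-factor table of cell means, the part of each cell mean that is left over after the row and column main effects are added up. The function name `interaction_effects` and the cell means are hypothetical, and equal cell sizes are assumed so that marginal means are simple averages.

```python
def interaction_effects(cell_means):
    """Return cell mean - row mean - column mean + grand mean for each cell.

    `cell_means` is a list of rows (levels of factor A), each a list of
    values across columns (levels of factor B). Equal cell sizes are assumed.
    """
    n_rows = len(cell_means)
    n_cols = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in cell_means]
    col_means = [
        sum(cell_means[j][k] for j in range(n_rows)) / n_rows
        for k in range(n_cols)
    ]
    return [
        [cell_means[j][k] - row_means[j] - col_means[k] + grand
         for k in range(n_cols)]
        for j in range(n_rows)
    ]


# Purely additive pattern: the column difference is 2 in both rows.
additive = [[10, 12], [14, 16]]
# Non-additive pattern: the column difference is 2 in one row and 6 in the other.
non_additive = [[10, 12], [14, 20]]

print(interaction_effects(additive))      # every entry is 0.0
print(interaction_effects(non_additive))  # [[1.0, -1.0], [-1.0, 1.0]]
```

The interaction null hypothesis states that, in the populations, every entry of this leftover table is zero. The ANOVA test asks whether the leftovers observed in the sample are larger than error variability alone would produce.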

There are many different kinds of ANOVA designs. When each subject provides a single score at only one combination of levels of the factors in the design, we have what is called a pure between-subjects design. When each subject provides a score at every combination of levels of the factors in the design, we have a pure within-subjects or repeated-measures design. It is common to encounter mixed designs, in which a given subject provides scores at all levels of one or more within-subjects factors, but at only one level of one or more between-subjects factors.
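The distinction between between-subjects and within-subjects factors comes down to how many levels of a factor each subject contributes scores to. As a rough sketch, the hypothetical helper `classify_factor` below inspects a list of records and reports how a factor is arranged; the field names and scores are invented for illustration.

```python
from collections import defaultdict


def classify_factor(records, factor):
    """Classify `factor` as 'between', 'within', or 'neither'.

    'between': every subject appears at exactly one level of the factor.
    'within':  every subject appears at every level of the factor.
    """
    levels_by_subject = defaultdict(set)
    all_levels = set()
    for rec in records:
        levels_by_subject[rec["subject"]].add(rec[factor])
        all_levels.add(rec[factor])
    if all(len(levels) == 1 for levels in levels_by_subject.values()):
        return "between"
    if all(levels == all_levels for levels in levels_by_subject.values()):
        return "within"
    return "neither"


# Invented mixed design: each subject uses one strategy but is tested on two trials.
records = [
    {"subject": "s1", "strategy": "imagery", "trial": 1, "score": 11},
    {"subject": "s1", "strategy": "imagery", "trial": 2, "score": 14},
    {"subject": "s2", "strategy": "rehearsal", "trial": 1, "score": 8},
    {"subject": "s2", "strategy": "rehearsal", "trial": 2, "score": 10},
]

print("strategy:", classify_factor(records, "strategy"))
print("trial:", classify_factor(records, "trial"))
```

In this example, strategy is a between-subjects factor and trial is a within-subjects factor, so the design as a whole is mixed.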

ANOVA is commonly employed to analyze the data from experiments. It is less appropriate for data obtained from observational research, because ANOVA treats all factors as categorical and uncorrelated.

Keppel, G. (1991). Design and analysis: A researcher's handbook. Englewood Cliffs, NJ: Prentice Hall.

Moore, D. S. (2000). The basic practice of statistics (2nd ed.). New York: Freeman.

Myers, J. L., & Well, A. D. (2002). Research design and statistical analysis (2nd ed.). Mahwah, NJ: Erlbaum.

Arnold D. Well, University of Massachusetts