The Analysis of Covariance (ANCOVA)

ANOVA can be extended to include one or more continuous variables that predict the outcome (or dependent variable). Continuous variables such as these, which are not part of the main experimental manipulation but have an influence on the dependent variable, are known as covariates, and they can be included in an ANOVA. For example, in the Viagra example, we might expect things other than Viagra to influence a person's libido. Some possible influences on libido might be the libido of the participant's spouse (after all 'it takes two to tango'!), other medication that suppresses libido (such as antidepressants), and fatigue. If these variables are measured, then it is possible to control for the influence they have on the dependent variable by including them in the model. What, in effect, happens is that we carry out a hierarchical regression in which our dependent variable is the outcome, and the covariate is entered in the first block. In a second block, our experimental manipulations are entered (in the form of what are called dummy variables). So, we end up seeing what effect an independent variable has after the effect of the covariate has been accounted for. As such, we control for (or partial out) the effect of the covariate. The purpose of including covariates in ANOVA is two-fold:

 

1. To reduce within-group error variance: in ANOVA we assess the effect of an experiment by comparing the amount of variability in the data that the experiment can explain, against the variability that it cannot explain. If we can explain some of this 'unexplained' variance in terms of other variables (covariates), then we reduce the error variance, allowing us to more accurately assess the effect of the experimental manipulation.
2. To eliminate confounds: in any experiment, there may be unmeasured variables that confound the results (i.e. variables that vary systematically with the experimental manipulation). If any variables are known to influence the dependent variable being measured, then ANCOVA is ideally suited to remove the bias of these variables. Once a possible confounding variable has been identified, it can be measured and entered into the analysis as a covariate.
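The two-block regression idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the scores are made-up values (not the data from Table 1), and the helper names solve and residual_ss are hypothetical. The sketch fits the covariate alone (block 1), then the covariate plus two dummy variables for dose (block 2), and forms an F-ratio for dose from the drop in residual variation. It also fits the model without the covariate, to show that the covariate can only shrink the error sum of squares.

```python
# Illustrative sketch: ANCOVA as a two-block hierarchical regression.
# All scores are made-up values, NOT the data from Table 1.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def residual_ss(predictors, y):
    """Least-squares fit of y on an intercept plus predictors; return residual SS."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Hypothetical data: 5 participants per dose (1 = placebo, 2 = low, 3 = high).
dose = [1] * 5 + [2] * 5 + [3] * 5
spouse = [2, 4, 5, 3, 6, 3, 5, 4, 6, 7, 4, 5, 6, 7, 8]   # covariate
libido = [2, 3, 4, 2, 4, 3, 5, 4, 5, 6, 5, 5, 7, 6, 8]   # outcome

# Dummy variables for dose, with placebo as the baseline category.
low = [1.0 if d == 2 else 0.0 for d in dose]
high = [1.0 if d == 3 else 0.0 for d in dose]

ss_covariate_only = residual_ss([spouse], libido)        # block 1
ss_full = residual_ss([spouse, low, high], libido)       # block 2
ss_anova = residual_ss([low, high], libido)              # no covariate

df_effect = 2                      # two dummy variables
df_error = len(libido) - 4         # intercept + covariate + two dummies
f_dose = ((ss_covariate_only - ss_full) / df_effect) / (ss_full / df_error)

print("Error SS without covariate:", round(ss_anova, 3))
print("Error SS with covariate:   ", round(ss_full, 3))
print("F for dose after the covariate:", round(f_dose, 3))
```

Comparing the two error sums of squares shows the first purpose listed above (reducing within-group error variance), while the F-ratio is the effect of dose after the covariate has been partialled out.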

Assumptions in ANCOVA

ANCOVA has the same assumptions as ANOVA except that there are two important additional considerations: (1) independence of the covariate and treatment effect, and (2) homogeneity of regression slopes. The first one basically means that the covariate should not be different across the groups in the analysis (in other words, if you did an ANOVA or t-test using the groups as the independent variable and the covariate as the outcome, this analysis should be non-significant).
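The check described above (an ANOVA with the groups as the independent variable and the covariate as the outcome) can be sketched as follows. The covariate scores are made-up values rather than the data from Table 1, and one_way_f is a hypothetical helper name. Only the F-ratio and its degrees of freedom are computed; the p-value would normally come from statistical software or an F table.

```python
# Illustrative sketch: one-way ANOVA on the covariate to check that it
# does not differ across groups. Scores are made-up values.

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way independent ANOVA."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical spouse's libido scores for placebo, low dose and high dose.
spouse_by_group = [
    [2, 4, 5, 3, 6],
    [3, 5, 4, 6, 7],
    [4, 5, 6, 7, 8],
]

f_ratio, df_b, df_w = one_way_f(spouse_by_group)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

For these invented scores the F-ratio falls below the tabled critical value for 2 and 12 degrees of freedom at the .05 level (about 3.89), so the covariate would be treated as roughly independent of the treatment.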

When an ANCOVA is conducted we look at the overall relationship between the outcome (dependent variable) and the covariate: we fit a regression line to the entire data set, ignoring to which group a person belongs. In fitting this overall model we, therefore, assume that this overall relationship is true for all groups of participants. For example, if there's a positive relationship between the covariate and the outcome in one group, we assume that there is a positive relationship in all of the other groups too. If, however, the relationship between the outcome (dependent variable) and covariate differs across the groups, then the overall regression model is inaccurate (it does not represent all of the groups). This assumption is very important and is called the assumption of homogeneity of regression slopes. The best way to think of this assumption is to imagine plotting a scatterplot for each experimental condition with the covariate on one axis and the outcome on the other. If you then calculated, and drew, the regression line for each of these scatterplots you should find that the regression lines look more or less the same (i.e. the values of b in each group should be equal). We will have a peek at an example of this assumption and how to test it later.
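As a rough, descriptive version of the scatterplot idea above, the slope b can be calculated separately within each group and the values compared. The sketch below assumes made-up scores (not the data from Table 1) and a hypothetical helper called slope. It is an eyeball check only, not a formal test of the assumption.

```python
# Illustrative sketch: the regression slope b of outcome on covariate,
# computed separately within each group. Scores are made-up values.

def slope(x, y):
    """Least-squares slope of y on x: covariation divided by variation in x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical (covariate, outcome) scores for each experimental condition.
groups = {
    "placebo":   ([2, 4, 5, 3, 6], [2, 3, 4, 2, 4]),
    "low dose":  ([3, 5, 4, 6, 7], [3, 5, 4, 5, 6]),
    "high dose": ([4, 5, 6, 7, 8], [5, 5, 7, 6, 8]),
}

slopes = {name: slope(spouse, libido) for name, (spouse, libido) in groups.items()}
for name, b in slopes.items():
    print(f"{name}: b = {b:.2f}")
```

If the slopes are of similar size and the same sign across groups, as they are for these invented scores, a single overall regression line is a reasonable summary; slopes that differ markedly would suggest the assumption is in doubt.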


 

Imagine that a researcher who conducted a Viagra study suddenly realized that the libido of a participant's spouse would affect that participant's own libido (especially because the measure of libido was behavioral). Therefore, the researcher repeated the study on a different set of participants, but this time took a measure of the spouse's libido. The spouse's libido was measured in terms of how often they tried to initiate sexual contact.

Entering Data

The data for this example are in Table 1. Without the covariate, the design is simply a one-way independent design, so we would enter these data using a coding variable for the independent variable, and scores on the dependent variable will go in a different column. All that changes is that we have an extra column for the covariate scores. So, create a coding variable called dose and use the Labels option to define value labels (e.g. 1 = placebo, 2 = low dose, 3 = high dose). There were five participants in each condition, so you need to enter 5 values of 1 into this column (so that the first 5 rows contain the value 1), followed by five values of 2, and followed by five values of 3. At this point, you should have one column with 15 rows of data entered. Next, create a second variable called libido and enter the 15 scores that correspond to the participant's libido. Finally, create a third variable called spouse, use the Labels option to give this variable a more descriptive title of "spouse's libido". Then, enter the 15 scores that correspond to the spouse's libido.
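The same layout (one row per participant, one column per variable) can be sketched outside SPSS. The snippet below builds only the coding variable and its value labels as described above; the libido and spouse columns are left as empty placeholders because the scores come from Table 1. The names dose_labels and n_per_group are illustrative choices, not part of SPSS.

```python
# Illustrative sketch of the data layout: one row per participant,
# with a coding variable, the outcome and the covariate as columns.

dose_labels = {1: "placebo", 2: "low dose", 3: "high dose"}   # value labels
n_per_group = 5

# Five 1s, then five 2s, then five 3s.
dose = [code for code in sorted(dose_labels) for _ in range(n_per_group)]

# Placeholders: the actual scores would be typed in from Table 1.
libido = [None] * len(dose)
spouse = [None] * len(dose)

rows = list(zip(dose, libido, spouse))
for code, lib, sp in rows:
    print(code, dose_labels[code], lib, sp)
```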

 

Main Analysis

Most of the General Linear Model (GLM) procedures in SPSS contain the facility to include one or more covariates. For designs that don't involve repeated measures it is easiest to conduct ANCOVA via the GLM Univariate procedure. The main dialog box for this procedure is similar to that for one-way ANOVA, except that there is a space to specify covariates. Select Libido and drag this variable to the box labeled Dependent Variable, or use the transfer button next to that box. Select Dose and drag it to the box labeled Fixed Factor(s), and then select Spouse's Libido and drag it to the box labeled Covariate(s).
 

Contrasts and Other Options

There are various dialog boxes that can be accessed from the main dialog box. The first thing to notice is that if a covariate is selected, the post hoc tests are disabled (you cannot access this dialog box). Post hoc tests are not designed for situations in which a covariate is specified; however, some comparisons can still be done using contrasts. Click on the Contrasts button to access the contrasts dialog box. This dialog box is different to the one we met for ANOVA in that you cannot enter codes to specify particular contrasts. Instead, you can specify one of several standard contrasts. These standard contrasts were listed in my book. In this example, there was a placebo control condition (coded as the first group), so a sensible set of contrasts would be simple contrasts comparing each experimental group with the control. To select a type of contrast, open the drop-down list of possible contrasts. Select a type of contrast (in this case Simple) from this list and the list will automatically disappear. For simple contrasts you have the option of specifying a reference category (which is the category against which all other groups are compared). By default the reference category is the last category: because in this case the control group was the first category (assuming that you coded placebo as 1), we need to change this option so that the first category is the reference. When you have selected a new contrast option, you must confirm it in the dialog box for the change to take effect. The final dialog box should look like Figure 2. Then return to the main dialog box.
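To make the role of the reference category concrete, the comparisons implied by a simple contrast can be written as sets of weights, one set per non-reference group. This is a sketch of the comparisons being made, not of how SPSS codes them internally, and simple_contrasts is a hypothetical helper name.

```python
# Illustrative sketch: the comparisons implied by simple contrasts.
# Each contrast compares one group with the reference category.

def simple_contrasts(n_groups, reference=0):
    """Return one weight vector per non-reference group (+1 group, -1 reference)."""
    contrasts = []
    for group in range(n_groups):
        if group == reference:
            continue
        weights = [0] * n_groups
        weights[group] = 1
        weights[reference] = -1
        contrasts.append(weights)
    return contrasts

# Groups in coding order: placebo, low dose, high dose.
first_as_reference = simple_contrasts(3, reference=0)
last_as_reference = simple_contrasts(3, reference=2)

print("Reference = first (placebo):", first_as_reference)
print("Reference = last (high dose):", last_as_reference)
```

With the first category as the reference, both contrasts compare a dose group with placebo, which is the set of comparisons wanted here. With the default (last category), the comparisons would instead be against the high-dose group.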

                                                            

Another way to get post hoc tests is through the options dialog box, which is accessed from the main dialog box. To specify post hoc tests, select the independent variable (in this case Dose) from the box labeled Estimated Marginal Means: Factor(s) and Factor Interactions and drag it to the box labeled Display Means for, or use the transfer button. Once a variable has been transferred, the box labeled Compare main effects becomes active and you should check this option. If this option is selected, the box labeled Confidence Interval Adjustment becomes active and you can open its drop-down list to see a choice of three adjustment levels. The default is to have no adjustment and simply perform a Tukey LSD post hoc test (this option is not recommended); the second is to ask for a Bonferroni correction (recommended); the final option is to have a Sidak correction. The Sidak correction is similar to the Bonferroni correction but is less conservative and so should be selected if you are concerned about the loss of power associated with Bonferroni corrected values. For this example use the Sidak correction. As well as producing post hoc tests for the Dose variable, placing Dose in the Display Means for box will create a table of estimated marginal means for this variable. These means provide an estimate of the adjusted group means (i.e. the means adjusted for the effect of the covariate). When you have selected the options required, return to the main dialog box and run the analysis.
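The difference between the Bonferroni and Sidak adjustments can be seen by applying the standard formulas to a few unadjusted p-values. The p-values below are arbitrary illustrative numbers, not output from this analysis, and the function names are hypothetical. With three groups there are three pairwise comparisons.

```python
# Illustrative sketch: Bonferroni and Sidak adjustments for m comparisons.
# The unadjusted p-values are arbitrary examples, not real output.

def bonferroni(p, m):
    """Bonferroni-adjusted p-value: multiply by the number of comparisons."""
    return min(1.0, p * m)

def sidak(p, m):
    """Sidak-adjusted p-value: 1 - (1 - p)^m."""
    return 1.0 - (1.0 - p) ** m

m = 3  # pairwise comparisons among three groups
unadjusted = [0.01, 0.02, 0.04]

for p in unadjusted:
    print(f"p = {p:.3f}  Bonferroni = {bonferroni(p, m):.4f}  Sidak = {sidak(p, m):.4f}")
```

The Sidak value is never larger than the Bonferroni value for the same unadjusted p, which is the sense in which it is the less conservative of the two corrections.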

On to the Output...