Regression Analysis

An Introduction to Regression
In regression analysis we fit a predictive model to our data and use it to predict values of the dependent variable (DV) from one or more independent variables (IVs). Simple regression seeks to predict an outcome from a single predictor, whereas multiple regression seeks to predict an outcome from several predictors. This is an incredibly useful tool because it allows us to go a step beyond the data that we actually possess. The model that we fit to our data is a linear one: think of it as summarizing the data set with a straight line. With any data set there are many lines that could be used to summarize the general trend, so we need a way to decide which of them to choose. To draw accurate conclusions we want to fit the model that best describes the data. The simplest way to fit a line is to use your eye to gauge a line that looks as though it summarizes the data well. However, this "eyeball" method is very subjective and so offers no assurance that the model is the best one that could have been chosen. Instead, we use a mathematical technique called the method of least squares to find the line that best describes the data collected.
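
To make the idea of choosing among candidate lines concrete, here is a minimal Python sketch. It assumes the usual least-squares criterion: the best line is the one with the smallest sum of squared vertical differences between the observed scores and the scores the line predicts. The data, the two candidate lines, and the function name are invented for illustration and are not taken from this handout.

```python
def sum_of_squared_differences(b0, b1, xs, ys):
    """Sum of squared vertical differences between each observed y
    and the value predicted by the line b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Two candidate lines, each written as (intercept, gradient).
eyeballed_line = (1.0, 1.0)  # a line someone might draw by eye
other_line = (2.2, 0.6)      # a second candidate

sse_eyeballed = sum_of_squared_differences(*eyeballed_line, xs, ys)
sse_other = sum_of_squared_differences(*other_line, xs, ys)

# The candidate with the smaller total is the better summary of the data
# under the least-squares criterion.
print(sse_eyeballed, sse_other)
```

For these made-up numbers the second line has the smaller total, so it would be preferred over the eyeballed one. The method of least squares finds the line with the smallest total out of all possible lines, not just two candidates.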

Some Important Information about Straight Lines
Any straight line can be drawn if you know: (1) the slope (or gradient) of the line, and (2) the point at which the line crosses the vertical axis of the graph (the intercept of the line). The equation of a straight line is given in equation (1), in which Yi is the ith subject's score on the outcome variable that we want to predict and Xi is the ith subject's score on the predictor variable. b1 is the gradient and b0 is the intercept of the straight line fitted to the data. There is also a residual term, εi, which represents the difference between the score that subject i actually obtained and the score predicted by the line for subject i. The equation is often conceptualized without this residual term (so, ignore it if it's upsetting you); however, it is worth knowing that this term represents the fact that our model will not fit the collected data perfectly.

                Yi = b0 + b1 Xi + εi                (1)
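
As a small worked illustration of equation (1), the sketch below splits one subject's observed score into the part predicted by the line and the residual. The intercept, gradient, and scores are hypothetical values chosen so the arithmetic is easy to follow, and the function names are assumptions made for this example.

```python
def predicted_score(b0, b1, x):
    """Score predicted by the line for a predictor value x."""
    return b0 + b1 * x


def residual(b0, b1, x, y):
    """Observed score minus predicted score (the residual term in equation (1))."""
    return y - predicted_score(b0, b1, x)


# Hypothetical intercept and gradient.
b0, b1 = 2.0, 0.5

# Hypothetical subject: predictor score Xi and observed outcome Yi.
x_i, y_i = 4.0, 5.0

y_hat = predicted_score(b0, b1, x_i)  # 2.0 + 0.5 * 4.0 = 4.0
e_i = residual(b0, b1, x_i, y_i)      # 5.0 - 4.0 = 1.0

# Putting the pieces back together recovers the observed score.
print(y_hat, e_i, b0 + b1 * x_i + e_i)
```

A positive residual means the subject scored higher than the line predicts; a negative residual means the subject scored lower.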

A particular line has a specific intercept and gradient. Figure 1 shows a set of lines that have the same intercept but different gradients, and a set of lines that have the same gradient but different intercepts. Figure 1 also illustrates another useful point: the gradient of the line tells us something about the nature of the relationship being described. A line with a positive gradient describes a positive relationship, whereas a line with a negative gradient describes a negative relationship. So, if you look at the graph in Figure 1 in which the gradients differ but the intercepts are the same, the thicker line describes a positive relationship whereas the thinner line describes a negative relationship.

Because a line can be described knowing only its gradient and intercept, the model that we fit to our data in linear regression (a straight line) can also be described mathematically by equation (1). With regression we strive to find the line that best describes the data collected, and then estimate the gradient and intercept of that line. Having estimated these values, we can insert different values of our predictor variable into the model to estimate the value of the outcome variable.
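
The two steps described above (estimate the gradient and intercept, then insert a predictor value) can be sketched in Python as follows. The sketch assumes the standard closed-form least-squares estimates for simple regression, which this handout does not state: the gradient is the sum of cross-products of deviations from the means divided by the sum of squared deviations of the predictor, and the intercept is the mean of the outcome minus the gradient times the mean of the predictor. The data and the function name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of the predictor.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data, invented purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)

# Insert a new predictor value into the fitted model to estimate the outcome.
new_x = 3.5
prediction = b0 + b1 * new_x
print(b0, b1, prediction)
```

For these made-up data the estimates come out at roughly 2.2 for the intercept and 0.6 for the gradient, so a predictor score of 3.5 gives an estimated outcome of about 4.3.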

Simple Regression on SPSS