Residuals (College Board AP® Statistics): Study Guide
Syllabus Edition
First teaching 2026
First exams 2027
Residuals
What are residuals?
A residual of a data point on a scatterplot is its signed vertical distance from the regression line
A positive residual means the point lies above the regression line
A negative residual means the point lies below the regression line
When a residual is positive, the regression line underestimates the $y$-value of that data point, whereas when a residual is negative, the regression line overestimates it
An outlier gives a residual that is larger in magnitude than those of the other points

What is the formula for calculating a residual?
The formula for calculating a residual is residual = $y - \hat{y}$
The residual is the actual $y$-value, $y$, minus the predicted $y$-value, $\hat{y}$
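As a minimal sketch of this formula, the Python snippet below computes a residual and reports whether the regression line underestimates or overestimates the point. The function names `residual` and `describe` and the sample numbers are illustrative assumptions, not part of the course material.

```python
def residual(actual_y, predicted_y):
    """Return the actual value minus the predicted value (the order matters)."""
    return actual_y - predicted_y


def describe(res):
    """Say where the point lies relative to the regression line."""
    if res > 0:
        return "above the line (line underestimates)"
    if res < 0:
        return "below the line (line overestimates)"
    return "on the line"


# Hypothetical point: actual y = 10, predicted y = 7
r = residual(10, 7)
print(r, describe(r))
```

Swapping the two arguments flips the sign of the result, which is why the order of subtraction matters.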
Examiner Tips and Tricks
Residuals can be negative. Make sure you input the values into the formula in the correct order. You subtract the predicted value from the actual value.
Worked Example
A scatterplot and regression line are shown below. Calculate the residual for each of the five data points.

Answer:
The residuals are the numbers shown in brackets on the diagram below

Worked Example
A city planner is investigating the relationship between the distance a commuter lives from the downtown business district (in miles) and their average morning commute time (in minutes). The planner selects a random sample of 20 commuters and records their distance and commute time. A least-squares regression line is fit to the data, yielding the following equation:
One commuter in the sample lives 15 miles from the downtown business district and has an average morning commute time of 48 minutes. Calculate and interpret the residual for this commuter. Based on the residual, does the linear model overpredict or underpredict this commuter's commute time?
Answer:
Calculate the predicted commute time for a commuter living 15 miles away
Subtract the predicted value from the actual value: residual = 48 - 51.4 = -3.4 minutes
Interpret the residual
This commuter's actual average morning commute time is 3.4 minutes less than the commute time predicted by the least-squares regression model
Because the residual is negative, the linear model overpredicts (overestimates) the commuter's average morning commute time
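The arithmetic in this answer can be checked with a short Python sketch. The regression equation is not reproduced here, so the predicted time of 51.4 minutes is inferred from the stated result (the actual time of 48 minutes is 3.4 minutes less than the prediction). The variable names are illustrative assumptions.

```python
actual_time = 48.0     # minutes, given in the question
predicted_time = 51.4  # minutes, inferred: 48 is 3.4 less than the prediction

# residual = actual - predicted; round to 1 decimal place to tidy float error
res = round(actual_time - predicted_time, 1)

if res < 0:
    verdict = "overpredicts"
elif res > 0:
    verdict = "underpredicts"
else:
    verdict = "matches"

print(f"residual = {res} minutes; the model {verdict} the commute time")
```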