Interpolation & Extrapolation using Linear Models (College Board AP® Statistics): Revision Note

Syllabus Edition

First teaching 2026

First exams 2027

Written by: Mark Curtis

Reviewed by: Dan Finlay

Linear models for scatterplots

What is a linear model?

  • A linear model is a line that best fits the data on a scatterplot

  • The linear model shows the linear relationship between the explanatory variable, x, and the response variable, y

What is a linear regression line?

  • A linear regression model is of the form $\hat{y} = a + bx$

    • a is the y-intercept

    • b is the slope

  • A linear model allows you to make predictions (estimates) of a y-value, when given a specific x-value

    • The straight line is called a linear regression line ('regression line' for short)

  • For example, data for the price of a computer and the time it takes the computer to start up are shown below

    • The regression line predicts that a computer worth $620 will start up in 3.4 seconds

A scatter plot with a downward-sloping regression line shows time (seconds) on the y-axis and price ($) on the x-axis. Points are scattered around the line.
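As a rough illustration of where the intercept a and slope b can come from, the sketch below fits a least-squares line to a tiny made-up data set. The data values and the function name `fit_line` are assumptions for illustration only; they are not the computer data from the scatterplot.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    # The least-squares line passes through the point (x_bar, y_bar)
    a = y_bar - b * x_bar
    return a, b


# Made-up illustrative data (NOT the data from the scatterplot)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
y_hat = a + b * 3  # predicted y-value when x = 3
print(f"y_hat = {a:.1f} + {b:.1f}x, prediction at x = 3: {y_hat:.1f}")
```

In practice, a calculator or statistical software would usually report a and b directly; the point here is only that a prediction is found by substituting an x-value into the fitted line.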

Interpolation & extrapolation

What is interpolation?

  • Interpolation means using a regression line to predict a y-value from a given x-value

    • where the value of x lies within the interval of x-values seen in the data

  • A prediction made by interpolation is generally considered reliable

  • For example, the same data for the price of a computer and the time it takes the computer to start up are shown below

    • Predicting the start-up time of a computer worth $620 is interpolation

      • because $620 lies within the interval of x-values seen, from $220 to $900

A scatter plot with a downward-sloping regression line shows time (seconds) on the y-axis and price ($) on the x-axis. Points are scattered around the line.

What is extrapolation?

  • Extrapolation means using a regression line to predict a y-value from a given x-value

    • where the value of x lies outside the interval of x-values seen in the data

  • It can be thought of as extending the regression line beyond either end of the data

    • then using those extended line segments to predict values

      • e.g. from above, predicting the start-up time for a computer costing less than $220 or more than $900 would be extrapolation

  • This is far less reliable as you do not know how the variables relate outside of the range of data given

    • The linear relationship might break down or change direction

  • The further you extrapolate, the less reliable the estimates become
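The distinction between the two ideas comes down to one check: does the x-value lie inside the interval of x-values seen in the data? A minimal sketch of that check is shown below, using the $220 to $900 price interval from the computer example. The function name `prediction_type` and the prices $1200 and $150 are hypothetical, chosen only to illustrate values outside the interval.

```python
def prediction_type(x, x_min, x_max):
    """Label a prediction by whether x lies inside the observed x-interval."""
    if x_min <= x <= x_max:
        return "interpolation"
    return "extrapolation"


# Observed prices in the computer example run from $220 to $900
PRICE_MIN, PRICE_MAX = 220, 900

print(prediction_type(620, PRICE_MIN, PRICE_MAX))   # price from the example
print(prediction_type(1200, PRICE_MIN, PRICE_MAX))  # hypothetical price above the interval
print(prediction_type(150, PRICE_MIN, PRICE_MAX))   # hypothetical price below the interval
```

Note that the check only looks at the explanatory variable, x. It says nothing about how far outside the interval a value is, which is what determines how unreliable an extrapolated prediction becomes.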

Worked Example

A marine biologist collected data on the length (in centimeters) and weight (in kilograms) of 15 adult female harbor seals in a specific region. The lengths of the adult seals in the sample ranged from 135 cm to 175 cm. The data were used to create the following least-squares regression line:

$\text{predicted weight} = -45.2 + 0.75 \times (\text{length})$

(a) Calculate the predicted weight for a harbor seal from this population that is 160 cm long. Does this calculation represent interpolation or extrapolation? Justify your answer.

(b) The biologist wants to use this regression model to predict the weight of a newborn harbor seal pup that is 85 cm long. Calculate the predicted weight for this seal pup. Does this calculation represent interpolation or extrapolation? Explain why this prediction might be unreliable.

Answer:

(a)

Calculate the predicted weight

$-45.2 + 0.75 \times (160) = 74.8$

74.8 kilograms

Interpolation is predicting a response value using a value for the explanatory variable that is within the interval of x-values used to determine the regression line

This calculation represents interpolation

Because 160 cm falls between the sample lengths of 135 cm and 175 cm, it is an interpolated prediction

(b)

Calculate the predicted weight

$-45.2 + 0.75 \times (85) = 18.55$

18.55 kilograms

Extrapolation is predicting a response value using a value for the explanatory variable that is beyond the interval of x-values used to determine the regression line

This calculation represents extrapolation

Because 85 cm is well outside the observed interval of 135 cm to 175 cm, the prediction is less reliable

There is no guarantee that the linear relationship observed for adult female seals will remain the same for newborn pups (the true relationship might be curved or have a different rate of change outside the studied interval)
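The two calculations in this worked example can be checked with a short sketch. The intercept, slope and length interval are taken from the question; the names `predicted_weight` and `is_interpolation` are hypothetical helpers introduced here for illustration.

```python
# Regression line from the worked example: predicted weight = -45.2 + 0.75 * length
INTERCEPT = -45.2
SLOPE = 0.75

# Lengths of the adult seals in the sample ranged from 135 cm to 175 cm
LENGTH_MIN, LENGTH_MAX = 135, 175


def predicted_weight(length_cm):
    """Predicted weight in kilograms for a seal of the given length in cm."""
    return INTERCEPT + SLOPE * length_cm


def is_interpolation(length_cm):
    """True if the length lies within the interval of lengths in the sample."""
    return LENGTH_MIN <= length_cm <= LENGTH_MAX


for length in (160, 85):
    kind = "interpolation" if is_interpolation(length) else "extrapolation"
    print(f"{length} cm: {predicted_weight(length):.2f} kg ({kind})")
```

The code can compute a value for any length, which is exactly why extrapolation is tempting: the formula still produces a number even when the model has no data to support it.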

Author: Mark Curtis

Expertise: Maths Content Creator

Mark graduated twice from the University of Oxford: once in 2009 with a First in Mathematics, then again in 2013 with a PhD (DPhil) in Mathematics. He has had nine successful years as a secondary school teacher, specialising in A-Level Further Maths and running extension classes for Oxbridge Maths applicants. Alongside his teaching, he has written five internal textbooks, introduced new spiralling school curriculums and trained other Maths teachers through outreach programmes.

Reviewer: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.