Linear Regression (DP IB Applications & Interpretation (AI)): Revision Note

Linear regression

What is linear regression?

  • If a scatter diagram shows strong linear correlation then the data can be modelled by a linear model

    • Drawing lines of best fit by eye is not the best method as it can be difficult to judge the best position for the line

  • The least squares regression line is the line of best fit that minimises the sum of the squares of the gaps between the line and the data values

    • This is usually called the regression line of y on x

    • It can be calculated by looking at the vertical distances between the line and the data values

  • The regression line of y on x is written in the form y = ax + b

  • a is the gradient of the line

    • It represents the change in y for each individual unit change in x

      • If a is positive this means y increases by a when x increases by one

      • If a is negative this means y decreases by |a| when x increases by one

  • b is the y-intercept

    • It shows the value of y when x is zero

  • You are expected to use your GDC to find the equation of the regression line

    • Enter the bivariate data and choose the model “ax + b”

    • Remember the mean point (x̄, ȳ) will lie on the regression line
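
As a rough illustration of what the GDC does behind the scenes, the sketch below computes a and b from the standard least squares formulas a = Sxy / Sxx and b = ȳ − a x̄, then checks that the mean point lies on the line. The data values and the function name `regression_line` are made up for this illustration and are not part of the original note.

```python
def regression_line(xs, ys):
    """Return (a, b) for the least squares regression line y = ax + b of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy and Sxx: sums of products / squares of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx          # gradient
    b = y_bar - a * x_bar    # y-intercept, so the line passes through the mean point
    return a, b


# Hypothetical data, chosen only to keep the arithmetic simple
xs = [1, 2, 3, 4]
ys = [3, 4, 6, 9]

a, b = regression_line(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

print(f"y = {a}x + {b}")                      # y = 2.0x + 0.5
# The mean point (2.5, 5.5) satisfies the equation of the line
print(abs((a * x_bar + b) - y_bar) < 1e-9)    # True
```

In an exam you would still read a and b off the GDC; the code only shows where those values come from.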

How do I use a regression line?

  • The equation of the regression line can be used to decide what type of correlation there is if there is no scatter diagram

    • If a is positive then the data set has positive correlation

    • If a is negative then the data set has negative correlation

  • The equation of the regression line can also be used to predict the value of a dependent variable (y) from an independent variable (x)

    • The equation should only be used to make predictions for y

      • Using a y on x line to predict x is not always reliable

    • Making a prediction within the range of the given data is called interpolation

      • This is usually reliable

      • The stronger the correlation the more reliable the prediction

    • Making a prediction outside of the range of the given data is called extrapolation

      • This is much less reliable

    • The prediction will be more reliable if the number of data values in the original sample is larger
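
The rules above can be sketched in code. This is a minimal illustration, assuming the values of a and b have already been found; the function names `correlation_type` and `predict_y`, and the sample numbers, are hypothetical.

```python
def correlation_type(a):
    """Read the type of correlation off the sign of the gradient a."""
    if a > 0:
        return "positive"
    if a < 0:
        return "negative"
    return "none"  # assumption: a gradient of exactly zero is treated as no linear trend


def predict_y(a, b, x, x_values):
    """Predict y from x with the y on x line, and label how the prediction was made."""
    y = a * x + b
    if min(x_values) <= x <= max(x_values):
        kind = "interpolation"   # inside the data range: usually reliable
    else:
        kind = "extrapolation"   # outside the data range: much less reliable
    return y, kind


# Hypothetical line y = 2x + 0.5 fitted to data with x between 1 and 4
x_values = [1, 2, 3, 4]
a, b = 2.0, 0.5

print(correlation_type(a))              # positive
print(predict_y(a, b, 2.5, x_values))   # (5.5, 'interpolation')
print(predict_y(a, b, 10, x_values))    # (20.5, 'extrapolation')
```

The function only predicts y from x, in line with the point that a y on x line should not be relied on to predict x.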

Examiner Tips and Tricks

Once you calculate the values of a and b, store them in your GDC.

This helps to avoid rounding errors, as you can use the full display values rather than the rounded values when using the linear regression equation to predict other values.
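
To see why storing the full values matters, the sketch below compares a prediction made with unrounded values of a and b against one made with the values rounded to 3 significant figures. The numbers are hypothetical stand-ins for what a GDC might display, and `round_sig` is a helper written for this illustration.

```python
from math import floor, log10


def round_sig(value, sig=3):
    """Round a non-zero value to the given number of significant figures."""
    digits = sig - 1 - floor(log10(abs(value)))
    return round(value, digits)


# Hypothetical full-display values of a and b
a_full, b_full = 3.14159, 12.3456
a_round, b_round = round_sig(a_full), round_sig(b_full)   # 3.14 and 12.3

x = 40
y_full = a_full * x + b_full        # uses the stored values
y_round = a_round * x + b_round     # uses the values rounded too early

print(y_full, y_round)
print(abs(y_full - y_round))        # gap of roughly 0.11 caused by early rounding
```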

Worked Example

Barry is a music teacher. For 7 students, he records the time they spend practising per week (x hours) and their score in a test (y %).

Time (x)     2    5    6    7    10   11   12
Score (y)    11   49   55   75   63   68   82

a) Write down the equation of the regression line of y on x, giving your answer in the form y = ax + b where a and b are constants to be found.

[Worked solution to part (a): image not reproduced]

b) Give an interpretation of the value of a.

[Worked solution to part (b): image not reproduced]

c) Another of Barry’s students practises for 15 hours a week. Estimate their score and comment on the validity of this prediction.

[Worked solution to part (c): image not reproduced]
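
Parts (a) and (c) can be checked numerically. The sketch below is not the original worked solution; it applies the standard least squares formulas to Barry's data, assuming each time in the table is paired with the score listed in the same position. The variable names are chosen for this illustration.

```python
# Barry's data: practice time (hours per week) and test score (%)
hours = [2, 5, 6, 7, 10, 11, 12]
scores = [11, 49, 55, 75, 63, 68, 82]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
s_xx = sum((x - x_bar) ** 2 for x in hours)

a = s_xy / s_xx          # gradient
b = y_bar - a * x_bar    # y-intercept

# (a) regression line of y on x, to 3 significant figures
print(f"y = {a:.3g}x + {b:.3g}")

# (c) prediction for 15 hours, using the unrounded a and b
x_new = 15
predicted = a * x_new + b
is_extrapolation = not (min(hours) <= x_new <= max(hours))
print(f"predicted score: {predicted:.3g}%")
print("extrapolation" if is_extrapolation else "interpolation")
```

With these data the line comes out at roughly y = 5.57x + 15.4, so for part (b) each extra hour of weekly practice corresponds to an increase of about 5.57 percentage points in the test score. The 15-hour estimate is about 98.9%, but 15 hours lies outside the recorded range of 2 to 12 hours, so this is extrapolation and the estimate is much less reliable.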

Author: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.

Reviewer: Roger B

Expertise: Maths Content Creator

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.