Correlation Coefficients (DP IB Applications & Interpretation (AI)): Revision Note

PMCC

What is Pearson’s product-moment correlation coefficient?

  • Pearson’s product-moment correlation coefficient (PMCC) gives a numerical measure of the strength and direction of the linear relationship in bivariate data

  • The PMCC of a sample is denoted by the letter r

    • r can take any value such that $-1 \le r \le 1$

    • A positive value of r describes positive correlation

    • A negative value of r describes negative correlation

    • r = 0 means there is no linear correlation

    • r = 1 means perfect positive linear correlation

    • r = -1 means perfect negative linear correlation

    • The closer r is to 1 or -1, the stronger the linear correlation

[Figure: PMCC diagram]

How do I calculate Pearson’s product-moment correlation coefficient (PMCC)?

  • You will be expected to use the statistics mode on your GDC to calculate the PMCC

  • The formula can be useful to deepen your understanding

$$r = \frac{S_{xy}}{S_x S_y}$$

  • $S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)$ is linked to the covariance

  • $S_x = \sqrt{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}$ and $S_y = \sqrt{\sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2}$ are linked to the variances

  • You do not need to learn this formula, as you will be expected to use your GDC
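For readers who want to see the formula in action, the following is a minimal Python sketch of the sums described above. The function name `pmcc` and the data are illustrative assumptions, not part of the course material; in the exam the GDC does this work.

```python
from math import sqrt

def pmcc(xs, ys):
    """Pearson's r from S_xy, S_x and S_y as defined in the formula above."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # S_xy: sum of products minus (sum of x)(sum of y) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # S_x and S_y: square roots of the corresponding sums of squares
    s_x = sqrt(sum(x * x for x in xs) - sum_x ** 2 / n)
    s_y = sqrt(sum(y * y for y in ys) - sum_y ** 2 / n)
    return s_xy / (s_x * s_y)

# Made-up data lying exactly on the line y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
r = pmcc(xs, ys)
print(round(r, 3))  # perfect positive linear correlation, so r rounds to 1.0
```

Reversing the y-values (so the points lie on a decreasing line) should give a value that rounds to -1.0.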

When does the PMCC suggest there is a linear relationship?

  • Critical values of r indicate when the PMCC would suggest there is a linear relationship

    • In your exam you will be given critical values where appropriate

    • Critical values will depend on the size of the sample

  • If the absolute value of the PMCC is bigger than the critical value then this suggests a linear model is appropriate
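The comparison with a critical value can be sketched in a few lines of Python. The function name and both numbers below are made up for illustration; real critical values are given in the exam and depend on the sample size.

```python
def suggests_linear_model(r, critical_value):
    """True when |r| is bigger than the critical value."""
    return abs(r) > critical_value

# Hypothetical values, chosen only to illustrate the comparison
print(suggests_linear_model(-0.82, 0.632))  # True: |-0.82| = 0.82 > 0.632
print(suggests_linear_model(0.40, 0.632))   # False: 0.40 < 0.632
```

Taking the absolute value means strong negative correlation is treated in the same way as strong positive correlation.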

Spearman’s Rank

What is Spearman’s rank correlation coefficient?

  • Spearman's rank correlation coefficient is a measure of how well the relationship between two variables can be described using a monotonic function

    • Monotonic means the points are either always increasing or always decreasing

    • This can be used as a way to measure correlation in linear models

    • However, Spearman's rank correlation coefficient can also be used to assess a non-linear relationship

  • Each data value is ranked, from biggest to smallest or from smallest to biggest

    • For n data values, they are ranked from 1 to n

    • It doesn't matter whether variables are ranked from biggest to smallest or smallest to biggest, but they must be ranked in the same order for both variables

  • Spearman’s rank of a sample is denoted by $r_s$

    • $r_s$ can take any value such that $-1 \le r_s \le 1$

    • A positive value of $r_s$ describes a degree of agreement between the rankings

    • A negative value of $r_s$ describes a degree of disagreement between the rankings

    • $r_s = 0$ means the data shows no monotonic behaviour

    • $r_s = 1$ means the rankings are in complete agreement: the data is strictly increasing

      • An increase in one variable means an increase in the other

    • $r_s = -1$ means the rankings are in complete disagreement: the data is strictly decreasing

      • An increase in one variable means a decrease in the other

    • The closer $r_s$ is to 1 or -1, the stronger the correlation of the rankings

[Figure: Spearman's rank diagram]

How do I calculate Spearman’s rank correlation coefficient?

  • Rank each set of data independently

    • 1 to n for the x-values

    • 1 to n for the y-values

  • If some values are equal then give each the average of the ranks they would occupy

    • For example: if the 3rd, 4th and 5th highest values are equal then give each the ranking of 4

      • $\frac{3 + 4 + 5}{3} = 4$

  • Calculate the PMCC of the rankings using your GDC

    • This value is Spearman's rank correlation coefficient
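The steps above can be sketched in Python as follows. The helper names `average_ranks`, `pmcc` and `spearman` and the data are assumptions made for illustration; ranks here run from smallest (1) to biggest (n) for both variables.

```python
from math import sqrt

def average_ranks(values):
    """Rank from smallest (1) to biggest (n); equal values share the mean rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first rank position occupied by v
        count = ordered.count(v)       # number of values equal to v
        ranks.append(first + (count - 1) / 2)  # mean of the occupied ranks
    return ranks

def pmcc(xs, ys):
    """Pearson's r computed from sums of products and squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_x = sqrt(sum(x * x for x in xs) - sum_x ** 2 / n)
    s_y = sqrt(sum(y * y for y in ys) - sum_y ** 2 / n)
    return s_xy / (s_x * s_y)

def spearman(xs, ys):
    """Spearman's rank: the PMCC of the two sets of rankings."""
    return pmcc(average_ranks(xs), average_ranks(ys))

# Tied values: the 3rd, 4th and 5th values are equal, so each gets rank 4
print(average_ranks([1, 2, 5, 5, 5, 9]))  # [1.0, 2.0, 4.0, 4.0, 4.0, 6.0]

# Made-up data that is always increasing but not linear
print(round(spearman([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]), 3))  # 1.0
```

Ranking from biggest to smallest instead would give the same coefficient, provided the same direction is used for both variables.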

Appropriateness & Limitations

Which correlation coefficient should I use?

  • Pearson’s PMCC tests for a linear relationship between two variables

    • It will not tell you if the variables have a non-linear relationship

      • Such as exponential growth

    • Use this if you are interested in a linear relationship

  • Spearman’s rank tests for a monotonic relationship (always increasing or always decreasing) between two variables

    • It will not tell you what function can be used to model the relationship

      • Both linear relationships and exponential relationships can be monotonic

    • Use this if you think there is a non-linear monotonic relationship

How are Pearson’s and Spearman’s correlation coefficients connected?

  • If there is perfect linear correlation then the relationship is also monotonic

    • $r = 1 \Rightarrow r_s = 1$

    • $r = -1 \Rightarrow r_s = -1$

    • However the converse is not true

  • It is possible for Spearman’s rank to be 1 (or -1) but for the PMCC to be different

    • For example: data that follows an exponential growth model

      • $r_s = 1$ as the points are always increasing

      • $r < 1$ as the points do not lie on a straight line
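The exponential example can be checked numerically with a short Python sketch. The data (powers of 2) and the helper names are assumptions for illustration, and the simple ranking used here assumes there are no tied values.

```python
from math import sqrt

def pmcc(xs, ys):
    """Pearson's r computed from sums of products and squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_x = sqrt(sum(x * x for x in xs) - sum_x ** 2 / n)
    s_y = sqrt(sum(y * y for y in ys) - sum_y ** 2 / n)
    return s_xy / (s_x * s_y)

def ranks(values):
    """Rank from smallest (1) to biggest (n); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

# Made-up exponential growth data: y = 2^x for x = 0, 1, ..., 6
xs = list(range(7))
ys = [2 ** x for x in xs]

r = pmcc(xs, ys)                   # PMCC of the raw values
r_s = pmcc(ranks(xs), ranks(ys))   # PMCC of the rankings

print(round(r_s, 3))  # 1.0, since y always increases as x increases
print(r < 1)          # True, since the points do not lie on a straight line
```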

Are Pearson’s and Spearman’s correlation coefficients affected by outliers?

  • Pearson’s PMCC is affected by outliers

    • as it uses the numerical value of each data point

  • Spearman’s rank is not usually affected by outliers

    • as it only uses the ranks of each data point
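A small Python sketch can illustrate the effect of a single outlier. The data are made up, the helper names are assumptions, and the simple ranking assumes no tied values. Replacing the last y-value with a much larger one leaves every rank unchanged, so only the PMCC moves.

```python
from math import sqrt

def pmcc(xs, ys):
    """Pearson's r computed from sums of products and squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_x = sqrt(sum(x * x for x in xs) - sum_x ** 2 / n)
    s_y = sqrt(sum(y * y for y in ys) - sum_y ** 2 / n)
    return s_xy / (s_x * s_y)

def ranks(values):
    """Rank from smallest (1) to biggest (n); assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

xs = list(range(1, 11))             # 1, 2, ..., 10
ys_clean = list(range(1, 11))       # points lie exactly on y = x
ys_outlier = ys_clean[:-1] + [100]  # last value replaced by an outlier

r_clean = pmcc(xs, ys_clean)
r_outlier = pmcc(xs, ys_outlier)
rs_outlier = pmcc(ranks(xs), ranks(ys_outlier))

print(round(r_clean, 3))     # 1.0
print(round(r_outlier, 3))   # noticeably lower than 1
print(round(rs_outlier, 3))  # 1.0, because the rankings are unchanged
```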

Examiner Tips and Tricks

  • You can use your GDC to plot the scatter diagram to help you visualise the data

Worked Example

The table below shows the scores of eight students for a maths test and an English test.

Maths (x)   | 7 | 18 | 37 | 52 | 61 | 68 | 75 | 82
English (y) | 5 | 3  | 9  | 12 | 17 | 41 | 49 | 97

a) Write down the value of Pearson’s product-moment correlation coefficient, r.

[Image: worked solution to part a)]

b) Find the value of Spearman’s rank correlation coefficient, $r_s$.

[Image: worked solution to part b)]

c) Comment on the values of the two correlation coefficients.

[Image: worked solution to part c)]
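As a check on parts a) and b), the two coefficients can be computed from the table with a short Python sketch. This is not the original worked solution, which expects a GDC; the helper names are assumptions, and ranks run from smallest (1) to biggest (8).

```python
from math import sqrt

def pmcc(xs, ys):
    """Pearson's r computed from sums of products and squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_x = sqrt(sum(x * x for x in xs) - sum_x ** 2 / n)
    s_y = sqrt(sum(y * y for y in ys) - sum_y ** 2 / n)
    return s_xy / (s_x * s_y)

def average_ranks(values):
    """Rank from smallest (1) to biggest (n); equal values share the mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

maths = [7, 18, 37, 52, 61, 68, 75, 82]
english = [5, 3, 9, 12, 17, 41, 49, 97]

r = pmcc(maths, english)                                   # part a)
r_s = pmcc(average_ranks(maths), average_ranks(english))   # part b)

print(round(r, 3))    # 0.794
print(round(r_s, 3))  # 0.976
```

One possible reading for part c): $r_s$ is very close to 1, which suggests a strong positive monotonic relationship between the two scores, while the lower value of r suggests the relationship is less well described by a straight line.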
Author: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.
