Goodness of Fit Test (DP IB Applications & Interpretation (AI)): Revision Note

Chi-Squared GOF: Uniform

What is a chi-squared goodness of fit test for a given distribution?

  • A chi-squared ($\chi^2$) goodness of fit test is used to test whether sample data are consistent with the population having a given distribution

  • This could be that: 

    • the proportions of the population in different categories follow a given ratio

    • the population follows a uniform distribution

      • This means all outcomes are equally likely

What are the steps for a chi-squared goodness of fit test for a given distribution?

  • STEP 1
    Write the hypotheses

    • H0 : Variable X can be modelled by the given distribution

    • H1 : Variable X cannot be modelled by the given distribution

      • Make sure you clearly write what the variable is and don’t just call it X

  • STEP 2
    Calculate the expected frequencies

    • Split the total frequency using the given ratio

    • For a uniform distribution: divide the total frequency N by the number of possible outcomes k

  • STEP 3
    Calculate the degrees of freedom for the test

    • For k possible outcomes

    • Degrees of freedom is $\nu = k - 1$

  • STEP 4
    Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as $\chi^2_{\text{calc}}$

  • STEP 5
    Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  • STEP 6
    Write your conclusion

    • If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the given distribution

      • Therefore this suggests that the data is not distributed as claimed

    •  If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the given distribution

      • Therefore this suggests that the data is distributed as claimed
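
Steps 2 and 3 can be sketched in Python to check what the GDC is doing. The GDC statistic is the sum of $(O - E)^2 / E$ over all categories, where $O$ is an observed frequency and $E$ is the matching expected frequency. The function names, the ratio 1 : 2 : 2 and the observed data below are hypothetical and chosen only for illustration.

```python
def expected_from_ratio(total, ratio):
    """Split a total frequency into expected frequencies using a ratio."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 200 items claimed to follow the ratio 1 : 2 : 2
observed = [48, 70, 82]
expected = expected_from_ratio(sum(observed), [1, 2, 2])
statistic = chi_squared_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1 for k categories

print(expected, statistic, degrees_of_freedom)
```

A uniform distribution is the special case where every part of the ratio is equal, for example `expected_from_ratio(90, [1] * 6)`.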

Worked Example

A car salesman is interested in how his sales are distributed and records his sales results over a period of six weeks. The data is shown in the table.

Week            | 1  | 2  | 3  | 4  | 5  | 6
Number of sales | 15 | 17 | 11 | 21 | 14 | 12

A $\chi^2$ goodness of fit test is to be performed on the data at the 5% significance level to find out whether the data fits a uniform distribution.

a) Find the expected frequency of sales for each week if the data were uniformly distributed.

b) Write down the null and alternative hypotheses.

c) Write down the number of degrees of freedom for this test.

d) Calculate the p-value.

e) State the conclusion of the test. Give a reason for your answer.

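
The car sales example can be checked with a short Python sketch in place of the GDC. The helper `chi2_sf` is a hypothetical name; it builds the upper-tail probability of a χ² distribution from the exact results for 1 and 2 degrees of freedom, adding one term for every extra 2 degrees of freedom. This is one way to get a p-value without extra libraries, not the method a GDC necessarily uses.

```python
import math


def chi2_sf(x, df):
    """P(chi-squared with df degrees of freedom > x), for a positive integer df."""
    half = x / 2
    if df % 2 == 1:
        prob, j = math.erfc(math.sqrt(half)), 1  # exact result for df = 1
    else:
        prob, j = math.exp(-half), 2  # exact result for df = 2
    # Each pass through the loop raises the degrees of freedom by 2
    while j < df:
        prob += half ** (j / 2) * math.exp(-half) / math.gamma(j / 2 + 1)
        j += 2
    return prob


observed = [15, 17, 11, 21, 14, 12]  # sales in weeks 1 to 6
total = sum(observed)
expected = [total / len(observed)] * len(observed)  # uniform: N / k each
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(statistic, df)
reject_h0 = p_value < 0.05  # 5% significance level

print(expected, statistic, df, p_value, reject_h0)
```

Each expected frequency is 90 ÷ 6 = 15 and there are 5 degrees of freedom. The p-value is greater than 0.05, so H0 is not rejected: there is insufficient evidence to suggest that the sales are not uniformly distributed.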

Chi-Squared GOF: Binomial

What is a chi-squared goodness of fit test for a binomial distribution?

  • A chi-squared ($\chi^2$) goodness of fit test is used to test whether sample data are consistent with the population having a binomial distribution

    • You will be given the value of p for the binomial distribution

What are the steps for a chi-squared goodness of fit test for a binomial distribution?

  • STEP 1
    Write the hypotheses

    • H0 : Variable X can be modelled by the binomial distribution $\mathrm{B}(n, p)$

    • H1 : Variable X cannot be modelled by the binomial distribution $\mathrm{B}(n, p)$

      • Make sure you clearly write what the variable is and don’t just call it X

      • State the values of n and p clearly

  • STEP 2
    Calculate the expected frequencies

    • Find the probability of each outcome using the binomial distribution, $\mathrm{P}(X = x)$

    • Multiply the probability by the total frequency: $\mathrm{P}(X = x) \times N$

  • STEP 3
    Calculate the degrees of freedom for the test

    • For k outcomes,

      • Degrees of freedom is $\nu = k - 1$

  • STEP 4
    Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as $\chi^2_{\text{calc}}$

  • STEP 5
    Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  • STEP 6
    Write your conclusion

    • If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the binomial distribution $\mathrm{B}(n, p)$

      • Therefore this suggests that the data does not follow $\mathrm{B}(n, p)$

    • If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the binomial distribution $\mathrm{B}(n, p)$

      • Therefore this suggests that the data follows $\mathrm{B}(n, p)$
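
Step 2 for a binomial distribution can be sketched in Python as follows. The function name `binomial_expected` and the example B(2, 0.5) with 200 trials are hypothetical and used only to show the calculation.

```python
from math import comb


def binomial_expected(n, p, total):
    """Expected frequency of each outcome 0, 1, ..., n under B(n, p)."""
    return [
        comb(n, x) * p**x * (1 - p) ** (n - x) * total  # P(X = x) * N
        for x in range(n + 1)
    ]


# Hypothetical example: 200 observations modelled by B(2, 0.5)
expected = binomial_expected(2, 0.5, 200)
degrees_of_freedom = len(expected) - 1  # k - 1 for k outcomes

print(expected, degrees_of_freedom)
```

The expected frequencies should add up to the total frequency, which is a quick way to check them before entering the lists into a GDC.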

Worked Example

A stage in a video game has three boss battles. 1000 people try this stage of the video game and the number of bosses defeated by each player is recorded.

Number of bosses defeated | 0   | 1   | 2   | 3
Frequency                 | 490 | 384 | 111 | 15

A $\chi^2$ goodness of fit test at the 5% significance level is used to decide whether the number of bosses defeated can be modelled by a binomial distribution with a 20% probability of success.

a) State the null and alternative hypotheses.

b) Assuming the binomial distribution holds, find the expected number of people that would defeat exactly one boss.

c) Calculate the p-value for the test.

d) State the conclusion of the test. Give a reason for your answer.

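
The boss battle example can be checked with the Python sketch below, taking n = 3 (three boss battles) and p = 0.2. The p-value line uses the closed-form upper-tail probability of a χ² distribution with exactly 3 degrees of freedom, so it would need changing for any other number of categories. Variable names are illustrative.

```python
import math
from math import comb

observed = [490, 384, 111, 15]  # players defeating 0, 1, 2, 3 bosses
n, p = 3, 0.2
total = sum(observed)

# Expected frequencies: P(X = x) * N for x = 0, 1, 2, 3
expected = [comb(n, x) * p**x * (1 - p) ** (n - x) * total for x in range(n + 1)]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Upper-tail probability for 3 degrees of freedom only
p_value = math.erfc(math.sqrt(statistic / 2)) + math.sqrt(
    2 * statistic / math.pi
) * math.exp(-statistic / 2)
reject_h0 = p_value < 0.05  # 5% significance level

print(expected, statistic, p_value, reject_h0)
```

The expected frequencies are 512, 384, 96 and 8, so 384 people would be expected to defeat exactly one boss. The p-value is less than 0.05, so H0 is rejected: there is sufficient evidence to suggest that the number of bosses defeated does not follow B(3, 0.2).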

Chi-Squared GOF: Normal

What is a chi-squared goodness of fit test for a normal distribution?

  • A chi-squared ($\chi^2$) goodness of fit test is used to test whether sample data are consistent with the population having a normal distribution

    • You will be given the values of μ and σ for the normal distribution

What are the steps for a chi-squared goodness of fit test for a normal distribution?

  • STEP 1
    Write the hypotheses

    • H0 : Variable X can be modelled by the normal distribution $\mathrm{N}(\mu, \sigma^2)$

    • H1 : Variable X cannot be modelled by the normal distribution $\mathrm{N}(\mu, \sigma^2)$

      •  Make sure you clearly write what the variable is and don’t just call it X

      • State the values of μ and σ clearly

  • STEP 2
    Calculate the expected frequencies

    • Find the probability of each class interval using the normal distribution, $\mathrm{P}(a < X < b)$

      • Beware of unbounded inequalities, $\mathrm{P}(X < b)$ or $\mathrm{P}(X > a)$, for the class intervals on the 'ends'

    • Multiply the probability by the total frequency: $\mathrm{P}(a < X < b) \times N$

  • STEP 3
    Calculate the degrees of freedom for the test

    •  For k class intervals,

      • Degrees of freedom is $\nu = k - 1$

  •  STEP 4
    Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as $\chi^2_{\text{calc}}$

  • STEP 5
    Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  •  STEP 6
    Write your conclusion

    •  If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the normal distribution $\mathrm{N}(\mu, \sigma^2)$

      • Therefore this suggests that the data does not follow $\mathrm{N}(\mu, \sigma^2)$

    • If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the normal distribution $\mathrm{N}(\mu, \sigma^2)$

      • Therefore this suggests that the data follows $\mathrm{N}(\mu, \sigma^2)$
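
Step 2 for a normal distribution, including the unbounded class intervals on the ends, can be sketched in Python with the standard library's `NormalDist`. The function name `normal_expected` and the example N(50, 10²) with 100 values are hypothetical.

```python
from statistics import NormalDist


def normal_expected(boundaries, mu, sigma, total):
    """Expected frequencies for class intervals split at the given boundaries.

    The first class is X < boundaries[0] and the last is X >= boundaries[-1].
    """
    dist = NormalDist(mu, sigma)
    # Cumulative probabilities, padded with 0 and 1 for the unbounded ends
    cdf = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
    # P(a < X < b) * N for each pair of neighbouring boundaries
    return [(upper - lower) * total for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical example: 100 values, classes split at 40, 50 and 60
expected = normal_expected([40, 50, 60], 50, 10, 100)
degrees_of_freedom = len(expected) - 1  # k - 1 for k class intervals

print([round(e, 2) for e in expected], degrees_of_freedom)
```

Padding the list of cumulative probabilities with 0 and 1 means the two end classes are handled by the same subtraction as the bounded classes.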

Worked Example

300 marbled ducks in Quacktown are weighed and the results are shown in the table below.

Mass (g)          | Frequency
$m < 470$         | 10
$470 \le m < 520$ | 158
$520 \le m < 570$ | 123
$m \ge 570$       | 9

A $\chi^2$ goodness of fit test at the 10% significance level is used to decide whether the mass of a marbled duck can be modelled by a normal distribution with mean 520 g and standard deviation 30 g.

a) Calculate the expected frequencies, giving your answers correct to 2 decimal places.

b) Write down the null and alternative hypotheses.

c) Calculate the $\chi^2$ statistic.

d) Given that the critical value is 6.251, state the conclusion of the test. Give a reason for your answer.

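
The marbled duck example can be checked with the Python sketch below, which compares the statistic with the given critical value of 6.251 instead of finding a p-value. Variable names are illustrative.

```python
from statistics import NormalDist

observed = [10, 158, 123, 9]  # m < 470, 470 <= m < 520, 520 <= m < 570, m >= 570
boundaries = [470, 520, 570]
dist = NormalDist(mu=520, sigma=30)
total = sum(observed)

# Cumulative probabilities, padded with 0 and 1 for the unbounded end classes
cdf = [0.0] + [dist.cdf(b) for b in boundaries] + [1.0]
expected = [(upper - lower) * total for lower, upper in zip(cdf, cdf[1:])]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_value = 6.251  # given in part (d)
reject_h0 = statistic > critical_value

print([round(e, 2) for e in expected], round(statistic, 3), reject_h0)
```

The expected frequencies are 14.34, 135.66, 135.66 and 14.34 to 2 decimal places. The statistic is greater than 6.251, so H0 is rejected: there is sufficient evidence to suggest that the mass of a marbled duck does not follow N(520, 30²).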

Author: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.

Reviewer: Roger B

Expertise: Maths Content Creator

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.