Planning & Types of Data (Edexcel GCSE Statistics: Foundation): Exam Questions

Exam code: 1ST0

3 hours, 20 questions
1a
2 marks

Leyla wants to find out how often people in her town eat in a restaurant.

She asked a sample of 30 people how many times they had eaten in a restaurant during the last week.

Here are Leyla’s results.

3   4   2   1   1   5   1   1   1   2
2   1   2   1   1   2   5   1   3   1
1   4   3   3   1   4   2   1   1   2

Fill in the tally chart for this information and complete the frequency column.

Number of times    Tally    Frequency
1
2
3
4
5
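A completed tally chart can be checked by counting the results programmatically. The following is a minimal sketch, assuming the 30 values are typed in exactly as listed in the question; the variable names are illustrative only.

```python
from collections import Counter

# Leyla's 30 results, copied from the question.
results = [
    3, 4, 2, 1, 1, 5, 1, 1, 1, 2,
    2, 1, 2, 1, 1, 2, 5, 1, 3, 1,
    1, 4, 3, 3, 1, 4, 2, 1, 1, 2,
]

# Count how often each value occurs (this is the frequency column).
frequency = Counter(results)

for value in sorted(frequency):
    # Print a simple text tally: one stroke per observation.
    print(value, "|" * frequency[value], frequency[value])

# The frequencies should add up to the sample size of 30.
assert sum(frequency.values()) == len(results) == 30
```

The final check mirrors a useful exam habit: the frequency column should always total the number of people sampled.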

1b
1 mark

Write down the mode.

1c
2 marks

Work out the number of people in Leyla’s sample who had eaten in a restaurant fewer than 4 times during the last week.

1d
1 mark

Suggest a suitable diagram that Leyla could use to represent her data.

2a
1 mark

A researcher is investigating how much the employees at a large company are paid.

One hypothesis she investigates is

“Men are paid more than women”.

The researcher could find it difficult to collect information to test her hypothesis.

Give one difficulty the researcher could have when trying to find out how much each employee is paid.

2b
1 mark

State the population for this investigation.

2c
3 marks

(i) Explain the difference between primary data and secondary data.

(2)

(ii) The researcher plans to collect primary data. Give a reason why.

(1)

2d
2 marks

The researcher plans to give a questionnaire to 60 employees of the company.

She asks the first 30 males and the first 30 females who come into work one morning to complete her questionnaire.

Give one advantage and one disadvantage of this sampling method.

Advantage.......................................

Disadvantage ..................................

3a
2 marks

Claire collected data on the weights of the England football team and the weights of the England rugby team from the internet.

She calculated the mean and range of the weights of each team. Her results are shown in this table.

                 Mean        Range
Football team    77.0 kg     30 kg
Rugby team       104.6 kg    42 kg

(Sources: thefa.com and englandrugby.com)

State two possible problems with obtaining data from the internet.

3b
3 marks

Use the information in the table to compare the distribution of weights of the England football team with the weights of the England rugby team. Interpret your comparison.

3c
1 mark

Suggest a possible problem with collecting primary data in this situation.

4a
2 marks

Tomoyo found the weight, in grams, of each of 100 cherries.

Circle the two words from the list that best describe the data Tomoyo found.

quantitative    qualitative    discrete    continuous    bivariate    ordinal    categorical

4b
5 marks

Tomoyo grouped the weights and she then drew this diagram for her results.

Bar chart showing frequency distribution of weights in grams; x-axis ranges 0-9 grams, y-axis shows frequency; bars at 2, 4, 6 units high.

The incomplete frequency table shows some information about her results.

Weight (w grams)    Frequency
1 ≤ w < 3           10
3 ≤ w < 5
5 ≤ w < 7
7 ≤ w < 9


(i) Complete the frequency column in the table.

(2)

(ii) Calculate an estimate of the mean weight of the 100 cherries.

........................ g

(3)

5
6 marks

Gary is going to investigate the amounts of time students spend watching TV.

He is going to write a plan for this investigation.

His hypothesis is

"The amount of time that boys spend watching TV is greater than the amount of time that girls spend watching TV".

Write down three other things he should include in his plan. Explain why each of these things is appropriate. You must refer to more than one stage of the statistical enquiry cycle.

6a
1 mark

A basketball team played 9 matches at the start of a season.

The total number of points they scored in each match is listed below.

80   64   87   64   42   81   89   138   68

Here are some words used to describe data.

grouped    discrete    categorical    continuous

Select a word from the list to complete the sentence.

The total number of points scored in a match is an example of ___________ data.

6b
2 marks

Work out the median score for these 9 matches.

6c
1 mark

Give one advantage of using the median to summarise this data.

6d
2 marks

Work out the range of points for these 9 matches.
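The median and range for parts (b) and (d) both depend on ordering the nine scores first. A minimal sketch for checking a hand calculation, assuming the scores are entered as listed in the question:

```python
from statistics import median

# Points scored in the first 9 matches, copied from the question.
points = [80, 64, 87, 64, 42, 81, 89, 138, 68]

# Ordering the data makes the middle value easy to see by hand.
ordered = sorted(points)

# With 9 values the median is the 5th value in the ordered list.
middle_value = median(points)

# Range = highest value minus lowest value.
points_range = max(points) - min(points)

print("Ordered scores:", ordered)
print("Median:", middle_value)
print("Range:", points_range)
```

Note how strongly the single score of 138 affects the range while leaving the median unchanged, which is relevant to part (c).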

6e
4 marks

The median and range for the final 9 matches of the season are shown in the table below.

Median    Range
90        25

Use your answers to part (b) and part (d) to compare the performance of the basketball team in the first 9 matches with the performance in the final 9 matches.

Give two comparisons and interpret both in context.

7a
1 mark

The manager of a gym is reviewing the current opening times of the gym. The manager thinks that if the gym is open for more hours it will affect the number of people using the gym.

Suggest a hypothesis that the manager could use.

7b
1 mark

The manager wants to get the opinions of the people who have a membership at the gym by giving them a questionnaire.

The manager obtains a numbered list of the 1500 people with a membership and decides to take a sample of 10% of the gym members.

The manager chooses the person who is numbered 0004 as the random starting point on the list and then picks every 20th person.

Name the sampling method that the manager plans to use.

7c
3 marks

(i) Give one reason why this is a good plan.

(1)

(ii) Will the manager’s plan give a 10% sample of the gym members? Give a reason for your answer.

(2)
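The manager's selection rule can be simulated to see how many members it would pick. This is a rough sketch, assuming the list is numbered 0001 to 1500 and that selection stops at the end of the list; the variable names are illustrative.

```python
# Assumed numbering: members are numbered 1 to 1500 inclusive.
members = 1500
start = 4    # the random starting point, member 0004
step = 20    # every 20th person after the starting point

# Member numbers picked by the rule: 4, 24, 44, ...
selected = list(range(start, members + 1, step))
sample_size = len(selected)

# A 10% sample of 1500 members would need this many people.
target_size = members // 10

print("First few selected:", selected[:5])
print("Number selected:", sample_size)
print("Number needed for 10%:", target_size)
```

Comparing the two printed sizes shows whether the sampling interval matches the intended sampling fraction.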

7d
2 marks

Here is one of the questions that the manager is considering for the questionnaire.

"Do you agree that the gym should stay open for 24 hours a day?"

Suggest two improvements to this question.

7e
3 marks

The manager decides to do a pre-test of the questionnaire by giving it to a small group of people.

(i) What is it called when a questionnaire is tested in this way?

(1)

(ii) Give two reasons why the manager might do this.

(2)

7f
1 mark

Following the full survey the manager concludes that if the gym is open for 24 hours a day it will not affect the number of people using the gym.

Give a reason why it would also be appropriate for the manager to find the opinions of people who do not have a gym membership.

8a
2 marks

A fjord is a deep and narrow part of a sea with steep land on three sides.

Emily is investigating the length of fjords in Norway. She collects some data from the internet and puts the data into a grouped frequency table.

The grouped frequency table below shows information about the results she collected.

Length of fjord (l km)    Frequency
0 ≤ l < 50                199
50 ≤ l < 100              17
100 ≤ l < 150             12
150 ≤ l < 200             3
200 ≤ l < 250             1

(Source: https://en.wikipedia.org/wiki/List_of_Norwegian_fjords)

Work out the number of fjords that have a length of at least 100 km.

8b
5 marks

(i) Calculate an estimate of the mean length of these fjords.
Give your answer to 1 decimal place.

......................... km (3)

(ii) Explain why your answer to part (b)(i) is only an estimate.

(1)

(iii) How could Emily have improved the accuracy of her answer to part (b)(i)?

(1)
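An estimated mean from a grouped frequency table uses the midpoint of each class in place of the unknown individual values. A minimal sketch of that method, assuming the class boundaries and frequencies from the fjord table; the tuple layout is just one convenient way to hold them.

```python
# Each class is (lower bound, upper bound, frequency), from the fjord table.
classes = [
    (0, 50, 199),
    (50, 100, 17),
    (100, 150, 12),
    (150, 200, 3),
    (200, 250, 1),
]

# Total number of fjords in the table.
total_frequency = sum(freq for _, _, freq in classes)

# Sum of (midpoint x frequency) over all classes.
weighted_sum = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)

# Estimated mean = sum of (midpoint x frequency) / total frequency.
estimated_mean = weighted_sum / total_frequency

print("Total frequency:", total_frequency)
print("Estimated mean (km):", round(estimated_mean, 1))
```

Because every fjord in a class is treated as if its length were the class midpoint, the result is an estimate rather than the exact mean.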

8c
2 marks

Emily plans to use a frequency polygon to represent the lengths of the fjords.

Discuss whether or not a frequency polygon would be an appropriate diagram to use.

9a
2 marks

Ben is researching information about the number of British swimming medals won at the Olympics.

Here are his results, giving the number of British swimming medals won at the Olympics from 1900 to 2016

3   0   7   6   2   4   4   2   0
1   1   2   3   1   1   1   3   5
5   3   1   2   0   2   3   3   6

(Source: www.teamgb.com)

Fill in the tally chart for Ben’s results and complete the frequency column.

Number of Olympic medals won    Tally    Frequency
0
1
2
3
4
5
6
7

9b
1 mark

Suggest a suitable diagram that could be used for Ben’s results.

9c
1 mark

Write down the mode or modes.

9d
2 marks

Work out the median.

9e
2 marks

Ben wants to use an average to summarise the data.

Which of the mode or the median would be more appropriate?
Give a reason for your answer.

10
6 marks

Claire is planning an investigation into the length of time that a learner has to wait for a driving test.
She wants to find out about how waiting time varies in different regions of the UK.

Here is her plan for data collection, for calculations and for diagrams.

Data collection
Visit a random sample of driving test centres in each region to ask for their waiting time in June.

Calculations
Calculate the average waiting time for each region for June.
Calculate the range of the waiting times for each region for June.

Diagrams
Draw a bar chart showing the average waiting time for each region in June.
Draw a pie chart showing the range of waiting times for each region in June.

Discuss whether Claire's plans for data collection, for calculations and for diagrams are appropriate.

11a
1 mark

Matthew is investigating average household income for different states in the USA.

Give a reason why it is appropriate to use secondary data for this.

11b
1 mark

Matthew creates a choropleth map giving information about the mean household income by state for the USA in 2023

Mean annual household income in $ thousands.

US map with states shaded based on a key: white for <70, dotted for 70-80, striped for 80-90, grey for 90-100, and black for >100.

Which three states have the lowest mean household income?

11c
2 marks

Matthew concludes that the mean household incomes are highest on the West coast and the East coast.

Does the choropleth map support this conclusion?
Give a reason for your answer.

12a
1 mark

Some researchers investigated the hand span, in centimetres, of adult pianists by their level – international, national and amateur.

The box plots below give information about the hand spans for national level and amateur level pianists.

Box plot comparing hand spans at amateur, national, and international levels, ranging from 16 to 28 cm, with sources cited below the graph.

Circle the word in the list below that describes hand span, in centimetres, as a type of data.

qualitative     ordinal     continuous     bivariate

12b
3 marks

The table gives information about the hand spans of the international level pianists.

Greatest hand span     27.4 cm
Median hand span       23.9 cm
Lower quartile         23.2 cm
Range                  5.1 cm
Interquartile range    1.1 cm

Using the information in the table, draw on the grid above a box plot for the hand spans of the international level pianists.
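A box plot needs five values, but the table gives only three of them directly; the other two follow from the range and the interquartile range. A short sketch of that reasoning, assuming the figures from the table (results are rounded to 1 decimal place to avoid floating-point noise):

```python
# Values given in the table for the international level pianists (cm).
greatest = 27.4
median_span = 23.9
lower_quartile = 23.2
data_range = 5.1
interquartile_range = 1.1

# Range = greatest - smallest, so smallest = greatest - range.
smallest = round(greatest - data_range, 1)

# IQR = upper quartile - lower quartile, so UQ = LQ + IQR.
upper_quartile = round(lower_quartile + interquartile_range, 1)

# The five values needed to draw the box plot, in order.
five_values = [smallest, lower_quartile, median_span, upper_quartile, greatest]
print(five_values)
```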

12c
5 marks

Compare the three distributions of hand spans.
Give three comparisons and interpret two of your comparisons.

12d
3 marks

Pavel owns a music shop.
He wants to investigate the keyboard sizes used by pianists with different hand spans.
He collects data about the hand spans of the pianists who use his shop.

The table gives information about the number of these pianists with hand spans in each of four size categories.

Hand span (cm)        A (less than 19)    B (19 ≤ span < 22)    C (22 ≤ span < 24)    D (24 or more)
Number of pianists    24                  65                    57                    14

Pavel plans to sample 20 of these pianists stratified by hand span size.

Explain how Pavel can obtain his stratified sample.
You should include details of any calculations he should use.
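In a stratified sample, each category contributes in proportion to its share of the whole group. The sketch below shows one way to set out the calculation, assuming the four category counts from the table and rounding each result to the nearest whole person; the dictionary keys are just the category letters.

```python
# Number of pianists in each hand span category, from the table.
strata = {"A": 24, "B": 65, "C": 57, "D": 14}
sample_size = 20

# Total number of pianists across all categories.
population = sum(strata.values())

# Proportional share: (category size / total) x sample size.
exact_share = {name: count * sample_size / population
               for name, count in strata.items()}

# People cannot be sampled in fractions, so round to whole numbers.
rounded_share = {name: round(share) for name, share in exact_share.items()}

print("Total pianists:", population)
print("Exact shares:", exact_share)
print("Rounded shares:", rounded_share)
print("Check total:", sum(rounded_share.values()))
```

After the calculation, the pianists within each category would still need to be chosen at random, for example by numbering them and using a random number generator.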

13a
2 marks

A town council is proposing to build a new leisure centre. Michelle is going to carry out a survey to find out what all the people in the town think of the proposal.

Michelle thinks that she should take a sample rather than a census.

Give two reasons why Michelle might think this.

13b
2 marks

Michelle plans to use the electoral register as the sampling frame.

(i) Explain what you understand by the term sampling frame.

[1]

(ii) Give one problem Michelle may have using the electoral register as the sampling frame.

[1]

13c
2 marks

Michelle intends to conduct a pilot study.

Give two reasons why it is a good idea to conduct a pilot study.

13d
6 marks

Michelle is writing a plan for her investigation into people’s views on the leisure centre proposal.

Write down what Michelle should include in her plan. You should include each of the following

  • a sampling method

  • a question she could ask in her questionnaire

  • a statistical diagram she could use to show the results of the survey.

Explain why each of the things you have written down is appropriate.

14a
1 mark

David asked 15 of his friends about the number of pets they each have. Here is the data he collected.

0   0   0   0   0   1   1   2   2   2   4   4   4   4   8

Choose the word in the list below that describes this type of data.

  • continuous

  • qualitative

  • discrete

  • grouped

14b
1 mark

Write down the modal number of pets.

14c
1 mark

Find the median number of pets.

14d
1 mark

State which average, the mode or the median, best represents these data. Give a reason for your answer.

14e
1 mark

Find the interquartile range of the number of pets.
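For a small ordered list, the quartiles can be located by position. The sketch below assumes the common GCSE convention of taking the lower quartile at position (n + 1)/4 and the upper quartile at position 3(n + 1)/4, which gives whole-number positions here because n = 15; other conventions exist and the indexing would need adapting for other sample sizes.

```python
# David's data, already in order in the question.
pets = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 4, 4, 4, 4, 8]
ordered = sorted(pets)
n = len(ordered)

# Positions are counted from 1, so subtract 1 for Python's 0-based indexing.
lower_quartile = ordered[(n + 1) // 4 - 1]        # 4th value
median_value = ordered[(n + 1) // 2 - 1]          # 8th value
upper_quartile = ordered[3 * (n + 1) // 4 - 1]    # 12th value

# Interquartile range = upper quartile - lower quartile.
interquartile_range = upper_quartile - lower_quartile

print("LQ:", lower_quartile, "Median:", median_value, "UQ:", upper_quartile)
print("IQR:", interquartile_range)
```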

14f
4 marks

Wanda asked some of her friends about the number of pets they each have.

The table below is a summary of the data she collected.

Lower quartile    Median    Upper quartile
1                 3         6

Compare the distribution of the numbers of pets for David with the distribution of the numbers of pets for Wanda.

Give two comparisons and interpret each of your comparisons.

14g
1 mark

Wanda recorded the highest number of pets as 15

She says that this must be an outlier and concludes that it should be removed from her data.

(i) Give one reason why Wanda’s conclusion may be appropriate.

[1]

(ii) Give one reason why Wanda’s conclusion may not be appropriate.

[1]

15a
1 mark

Kyle is investigating the heights and the weights of professional basketball players.

He found the weight, in kilograms, of some professional basketball players from 1950 to 1959

Choose the word in the list below that describes weight, in kilograms, as a type of data.

  • discrete

  • continuous

  • ordinal

  • categorical

15b
2 marks

The incomplete histogram and incomplete grouped frequency table give information about the weights, in kilograms, of the professional basketball players from 1950 to 1959

Histogram displaying weight distribution in kilograms, ranging from 60 to 120. Peaks between 80-90 kg. Frequency is plotted on the y-axis. Source: kaggle.com.

Weight (w kilograms)    Frequency
65 < w ≤ 70             5
70 < w ≤ 75             15
75 < w ≤ 80             61
80 < w ≤ 85             81
85 < w ≤ 90             ___
90 < w ≤ 95             ___
95 < w ≤ 100            35
100 < w ≤ 105           14
105 < w ≤ 110           9
110 < w ≤ 115           1

Use the information in the histogram to complete the table.

15c
2 marks

Use the information in the table to complete the histogram.

15d
1 mark

Kyle also drew a histogram for the weights of professional basketball players from 2000 to 2009
This histogram was negatively skewed.

Interpret the negative skew of the weights of professional basketball players from 2000 to 2009

15e
4 marks

Kyle also collected data about the heights of professional basketball players from 1950 to 1959 and the heights of professional basketball players from 2000 to 2009

The grouped frequency table below gives information about the heights of professional basketball players from 2000 to 2009

Height (h centimetres)    Frequency
170 < h ≤ 180             12
180 < h ≤ 190             146
190 < h ≤ 200             175
200 < h ≤ 210             323
210 < h ≤ 220             146
220 < h ≤ 230             8
Total                     810

The estimate of the mean height for professional basketball players from 1950 to 1959 is calculated to be 190.9 cm to one decimal place.

(i) Calculate an estimate of the mean height of basketball players from 2000 to 2009

....................................................... cm

[3]

(ii) Comment on how the mean height of professional basketball players has changed between the two sets of data.

[1]

16a
1 mark

Claire is investigating sales of different types of vehicle over time.
She plans to collect data on the numbers of motorcycles first registered in the UK over time.

Write down a suitable hypothesis for this investigation.

16b
2 marks

The time series graph shows some information about the numbers of motorcycles first registered in the UK from 2017 to 2019

Line graph showing UK motorcycle registrations from 2017 to 2019, peaking in Q2 each year and dropping to a low in Q4. Data source: gov.uk.

Identify and interpret one example of seasonal trend shown by the time series graph.

16c
1 mark

Claire calculated 4-point moving averages for the information shown in the time series graph.

Explain why this is appropriate.

16d
1 mark

Claire also collected data on the numbers of cars first registered in the UK from 2017 to 2019

The time series graph shows some information about the numbers of cars first registered in the UK from 2017 to 2019 together with the first seven 4-point moving averages.

Line graph showing UK car registrations from 2017 to 2019 in thousands, displaying fluctuating trends across quarters, data from gov.uk.

Compare the seasonal trend shown for the numbers of motorcycles first registered in the UK with the seasonal trend for the numbers of cars first registered in the UK.

16e
3 marks

The last three 4-point moving averages (thousands) for the number of cars registered in the UK from 2017 to 2019 are

576.0 575.3 573.9

Plot these three moving averages on the time series graph and draw a trend line.

16f
2 marks

Describe and interpret the trend in the numbers of cars first registered in the UK from 2017 to 2019

17a
1 mark

The incomplete comparative bar chart shows the total number of medals won by three of the countries that took part in the 2014 and 2018 Winter Olympics.

Bar chart showing medals won by Sweden, Great Britain, and Switzerland in 2014 and 2018 Winter Olympics; Sweden leads in 2018 with 14 medals.

The total number of medals won by Sweden in the 2018 Winter Olympics was 14

Complete the comparative bar chart for Sweden.

17b
2 marks

Work out how many more medals were won by Sweden than Great Britain in the 2014 Winter Olympics.

17c
2 marks

Compare the total number of medals won by Sweden, Great Britain and Switzerland in the 2014 Winter Olympics.

17d
1 mark

Thomas says that the data displayed in the comparative bar chart is quantitative data.

Explain what is meant by quantitative data.

18a
1 mark

Connie is going to write a report on the difference in total rainfall between London and Aberdeen in 2019

She collects secondary data to investigate this.

What should Connie include in her report?

  • source of the data

  • her telephone number

  • her age

  • name of her school

18b
1 mark

Describe one way that she could obtain this secondary data.

18c
3 marks

The table shows the total rainfall, in cm, for each month in 2019 in London.

Month                  Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
Total rainfall (cm)    5.9    4.6    3.7    3.6    4.0    3.9    4.7    5.9    5.4    7.1    7.2    6.5

(Source: en.climate-data.org)

The mean monthly rainfall in Aberdeen in 2019 is 6.2 cm.

Connie considers the data in the table and concludes that the mean monthly rainfall for Aberdeen in 2019 is greater than the mean monthly rainfall in London in 2019

Is Connie correct? You must show how you get your answer.
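Checking Connie's conclusion means working out the London mean from the twelve monthly totals and comparing it with the Aberdeen figure. A minimal sketch, assuming the monthly values are entered in the order January to December as in the table:

```python
# Monthly rainfall totals for London in 2019 (cm), January to December.
london = [5.9, 4.6, 3.7, 3.6, 4.0, 3.9, 4.7, 5.9, 5.4, 7.1, 7.2, 6.5]

# Mean monthly rainfall for Aberdeen, as given in the question (cm).
aberdeen_mean = 6.2

# Mean = total rainfall for the year / number of months.
london_mean = sum(london) / len(london)

print("London total (cm):", round(sum(london), 1))
print("London mean (cm):", round(london_mean, 1))
print("Aberdeen mean greater than London mean:", aberdeen_mean > london_mean)
```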

19a
2 marks

A Science teacher wants to know the effects of revision on a student’s performance in an exam. She decides to carry out an experimental test on a group of 15 students to find out the effects of any revision.

Describe one way the teacher could carry out an experimental test.

19b
1 mark

Give one reason why the results of this experimental test could be unreliable.

20a
2 marks

Sam used the internet to collect the times, in minutes, it took for 50 cyclists to compete in a hill climb competition. He used a grouped frequency table to record the results he collected.

(i) Give one advantage of using grouped data rather than raw data.

[1]

(ii) Give one disadvantage of using grouped data rather than raw data.

[1]

20b
1 mark

Sam used this grouped frequency table to show the results for the hill climb.

Time (t minutes)    Frequency
11 ≤ t < 12         2
12 ≤ t < 13         25
13 ≤ t < 14         15
14 ≤ t < 15         4
15 ≤ t < 16         1
16 ≤ t < 17         1
17 ≤ t < 18         1

(Source: cyclinguphill.com)

Before Sam collected the data he did not know what the longest time would be. The longest time in the hill climb was 28.3 minutes.

Explain why this table cannot be used to show the data for all 50 riders.

20c
1 mark

Sam drew this frequency polygon for the hill climb results.

Line graph showing frequency versus time taken in minutes, peaking at 24 frequency for 13 minutes, then declining to 2 frequency by 17 minutes.

Sam decided not to include the value of 28.3 minutes on his frequency polygon.

Suggest a reason why Sam’s decision might be appropriate.

20d
2 marks

(i) Describe the skew of the distribution.

[1]

(ii) Interpret the skew of the distribution in context.

[1]