Representing Data Diagrammatically (AQA Level 3 Mathematical Studies (Core Maths)): Revision Note

Exam code: 1350

Written by: Naomi C

Reviewed by: Dan Finlay

Histograms

What is a histogram?

  • A histogram is a graph used to show frequency distributions. It is similar to a bar chart, but there are some significant differences

| Histogram | Bar Chart |
| --- | --- |
| Used for quantitative, continuous data | Used for qualitative or discrete quantitative data |
| No gaps between bars | Gaps between bars |
| Class intervals may be equal or unequal | Class intervals must be equal |
| Frequency density on the y-axis | Frequency on the y-axis |

  • For a histogram:

    • The bar for a class interval will begin at the lower boundary and end at the upper boundary

    • The area of each bar in a histogram is proportional to the frequency for that class interval

What is frequency density?

  • Frequency density is given by the formula

$\text{frequency density} = \dfrac{\text{frequency}}{\text{class width}}$

  • Frequency density is used with grouped data (class intervals)

    • it is particularly useful when the class intervals are of unequal width

    • it measures how densely the data are packed within a class interval, relative to the width of that interval

How do I draw a histogram?

  • Identify the size of each class interval

  • If the class intervals are unequal, calculate the frequency density for each class

  • As frequency is proportional to frequency density

    $\text{frequency} \propto \text{frequency density}$

    $\text{frequency density} = k \times \dfrac{\text{frequency}}{\text{class width}}$

  • In the majority of questions, $k = 1$, so the frequency density can be found by dividing the frequency by the class width

  • Once the frequency densities are known:

    • Draw bars (rectangles) with widths being measured on the x-axis

    • Make sure the height of each bar is the frequency density for that class and is measured on the y-axis

Examiner Tips and Tricks

  • Always work out and write down the frequency densities

    • It is easy to make errors and lose marks by going straight to the graph

    • You can gain method marks by showing that you are using frequency density rather than frequency

How do I interpret a histogram?

  • It is important to remember that the frequency density (y-axis) does not tell us frequency

    • The area of the bar is proportional to the frequency

  • Most of the time, the frequency will be the area of the bar directly and is found by using

$\text{frequency} = \text{area}$

  • Occasionally the frequency will be proportional to the area of the bar so use

$\text{frequency} = k \times \text{area}$

  • You will need to work out the value of k from other information given in the question

  • You may be asked to estimate the frequency of part of a bar/class interval within a histogram

    • Find the area of the bar for the part of the interval required

    • Once area is known, frequency can be found as above
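The area rule above can be sketched in Python. This is only an illustration: the function name `estimate_frequency` and the bar used (height 1.5, spanning 40 to 50) are made up for the example, and the scale factor `k` is assumed to be 1 unless the question says otherwise.

```python
def estimate_frequency(freq_density, start, end, k=1):
    """Estimate the frequency between start and end within one bar.

    Uses frequency = k * area, where area = height * width.
    """
    width = end - start
    area = freq_density * width
    return k * area


# Whole bar (illustrative figures): height 1.5 from 40 to 50
whole_bar = estimate_frequency(1.5, 40, 50)

# Part of the same bar: only the section from 44 to 50
part_bar = estimate_frequency(1.5, 44, 50)

print(whole_bar, part_bar)
```

An estimate for part of a bar relies on the data being spread evenly across the class interval, which is why it is only an estimate.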

Worked Example

The table below shows information regarding the average speeds travelled by trains in a region of the UK.

| Average speed, s (m/s) | Frequency |
| --- | --- |
| $20 \le s < 40$ | 5 |
| $40 \le s < 50$ | 15 |
| $50 \le s < 55$ | 28 |
| $55 \le s < 60$ | 38 |
| $60 \le s < 70$ | 14 |

Draw a histogram to represent the data.

Add two columns to the table - one for class width, one for frequency density
Writing the calculation in each box helps to keep accuracy

| Average speed, s (m/s) | Frequency | Class width | Frequency density |
| --- | --- | --- | --- |
| $20 \le s < 40$ | 5 | 40 - 20 = 20 | 5 ÷ 20 = 0.25 |
| $40 \le s < 50$ | 15 | 50 - 40 = 10 | 15 ÷ 10 = 1.5 |
| $50 \le s < 55$ | 28 | 55 - 50 = 5 | 28 ÷ 5 = 5.6 |
| $55 \le s < 60$ | 38 | 60 - 55 = 5 | 38 ÷ 5 = 7.6 |
| $60 \le s < 70$ | 14 | 70 - 60 = 10 | 14 ÷ 10 = 1.4 |

Mark on an appropriate scale for the x and y-axes

Draw the bars of the histogram
Making sure that:
The height of each bar is the frequency density
Each bar goes from the lower boundary of its class to the upper boundary
There are no gaps between the bars

Histogram showing the average speeds travelled by trains in the UK
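As a check on the table in this worked example, the class widths and frequency densities can be recomputed with a short Python sketch. The variable names are chosen for illustration, and the final line assumes $k = 1$, so that the total area of the bars should match the total frequency.

```python
# (lower boundary, upper boundary, frequency) from the worked example table
classes = [(20, 40, 5), (40, 50, 15), (50, 55, 28), (55, 60, 38), (60, 70, 14)]

rows = []
for lower, upper, frequency in classes:
    width = upper - lower
    density = frequency / width  # frequency density = frequency / class width
    rows.append((lower, upper, frequency, width, density))

for lower, upper, frequency, width, density in rows:
    print(f"{lower} <= s < {upper}: width {width}, frequency density {density}")

# With k = 1, each bar's area (width * height) is that class's frequency
total_area = sum(width * density for _, _, _, width, density in rows)
total_frequency = sum(frequency for _, _, frequency in classes)
```

Comparing `total_area` with `total_frequency` is a quick way to catch an arithmetic slip before drawing the bars.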

Cumulative Frequency Graphs

What is cumulative frequency?

  • The cumulative frequency of x is the running total of the frequencies for the values that are less than or equal to x

  • For grouped data you use the upper boundary of a class interval to find the cumulative frequency of that class

What is a cumulative frequency graph?

  • A cumulative frequency graph is used with data that has been organised into a grouped frequency table

  • Some coordinates are plotted

    • The x-coordinates are the upper boundaries of the class intervals

    • The y-coordinates are the cumulative frequencies of that class interval

  • The coordinates are then joined together by hand using a smooth increasing curve
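The coordinates to plot can be worked out with a running total. The sketch below reuses the train speed table from the histogram worked example purely as sample data; the variable names are illustrative.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency) for each class interval
classes = [(20, 40, 5), (40, 50, 15), (50, 55, 28), (55, 60, 38), (60, 70, 14)]

# x-coordinates: the upper boundary of each class
upper_boundaries = [upper for _, upper, _ in classes]

# y-coordinates: the running total of the frequencies
cumulative = list(accumulate(frequency for _, _, frequency in classes))

points = list(zip(upper_boundaries, cumulative))
print(points)  # [(40, 5), (50, 20), (55, 48), (60, 86), (70, 100)]
```

The final cumulative frequency should always equal the total frequency, which is a useful check before plotting.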

What are cumulative frequency graphs useful for?

  • Cumulative frequency graphs can be used to estimate statistical measures

    • Draw a horizontal line from the y-axis to the curve

      • For the median: draw the line at 50% of the total frequency, $\frac{n}{2}$

      • For the lower quartile: draw the line at 25% of the total frequency, $\frac{n}{4}$

      • For the upper quartile: draw the line at 75% of the total frequency, $\frac{3n}{4}$

      • For the pth percentile: draw the line at p% of the total frequency

    • Draw a vertical line down from the curve to the x-axis

    • This x-value is the relevant statistical measure

  • Cumulative frequency graphs can also be used to estimate the number of values that are bigger or smaller than a given value

    • Draw a vertical line from the given value on the x-axis to the curve

    • Draw a horizontal line from the curve to the y-axis

    • This value is an estimate for how many values are less than or equal to the given value

      • To estimate the number that is greater than the value subtract this number from the total frequency

    • They can be used to estimate the interquartile range, $\text{IQR} = Q_3 - Q_1$

    • They can be used to construct a box plot for grouped data
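Reading values from the graph can be imitated in code. The sketch below is not the hand-drawn method: it assumes straight lines between the plotted points instead of a smooth curve, so its answers will differ slightly from a graph reading. The points come from the train speed table earlier in this note, with an extra starting point at the lower boundary of the first class (a common convention, assumed here). The function name `estimate_value` is hypothetical.

```python
def estimate_value(points, target_cf):
    """Estimate the x-value at a given cumulative frequency.

    points: (x, cumulative frequency) pairs in increasing order.
    Assumes straight lines between neighbouring points.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1 and cf1 > cf0:
            fraction = (target_cf - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("target is outside the range of the graph")


# Assumed start point (20, 0), then (upper boundary, cumulative frequency)
points = [(20, 0), (40, 5), (50, 20), (55, 48), (60, 86), (70, 100)]
n = points[-1][1]  # total frequency

lower_quartile = estimate_value(points, n / 4)
median = estimate_value(points, n / 2)
upper_quartile = estimate_value(points, 3 * n / 4)
iqr = upper_quartile - lower_quartile

print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
```

In an exam the values are read from the drawn curve, so a sketch like this is best treated as a rough check only.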

Worked Example

The cumulative frequency graph below shows the lengths in cm, l, of 30 puppies in a training group.

Cumulative frequency graph showing the lengths of a class of 30 puppies.

(a) Given that the interval $40 \le l < 45$ was used when collecting data, find the frequency of this class.

Draw vertical lines from 40 and 45 on the x-axis until they meet the curve
Draw horizontal lines from the curve to the y-axis and read off the values

Cumulative frequency graph showing the lengths of a class of 30 puppies. Vertical lines are drawn from the x-axis at x=40 and x=45 to the curve and horizontal lines from these points to the y-axis. The values on the y-axis are y=8 and y=16.

Subtract the cumulative frequency at the lower boundary of the class from the cumulative frequency at the upper boundary of the class

16 - 8 = 8

Frequency = 8

(b) Use the graph to find an estimate for the interquartile range of the lengths.

Find the $\frac{n}{4}$th position, which gives the lower quartile, $Q_1$

$\frac{30}{4} = 7.5$th position

Find the $\frac{3n}{4}$th position, which gives the upper quartile, $Q_3$

$\frac{3 \times 30}{4} = 22.5$th position

Draw horizontal lines from these two values on the y-axis until they meet the curve
From these points on the curve draw vertical lines to the x-axis and read off the values

Cumulative frequency graph showing the lengths of a class of 30 puppies. Horizontal lines are drawn from the y-axis at y=7.5 and y=22.5 to the curve and vertical lines from these points to the x-axis. The values on the x-axis are x=39.5 and x=51.2.

The interquartile range is the difference between the upper and lower quartiles

51.2 - 39.5 = 11.7

11.7 cm

(c) Estimate the percentage of puppies with length more than 49 cm.

Draw a vertical line on the graph from 49 on the x-axis until it meets the curve
Draw a horizontal line from this point on the curve across to the y-axis and read off the value

Cumulative frequency graph showing the lengths of a class of 30 puppies. A vertical line is drawn from the x-axis at x=49 to the curve and a horizontal line from this point to the y-axis. The value on the y-axis is y=20.

Subtract this value from the total number of puppies in the class to find the number that are longer than 49 cm

30 - 20 = 10

Divide this value by the total number of puppies and multiply by 100 to turn this value into a percentage

$\frac{10}{30} \times 100 = 33.33333...$

33.3% (to 3 s.f.) of puppies are longer than 49 cm

Box and Whisker Plots

What is a box and whisker plot?

  • A box plot is a graph that clearly shows key statistics from a data set

    • It shows the median, quartiles, minimum and maximum values and outliers

    • It does not show any other individual data items

  • The middle 50% of the data is represented by the box section of the graph

  • The lower and upper 25% of the data will be represented by each of the whiskers

  • Any outliers are represented with a cross on the outside of the whiskers

    • If there is an outlier then the whisker will end at the most extreme value that is not an outlier

  • Only one axis is used when graphing a box plot

  • It is still important to make sure the axis has a clear, even scale and is labelled with units

Box plot diagram

What are box plots useful for?

  • Box plots can clearly show the shape of the distribution

    • If a box plot is symmetrical about the median then the data could be normally distributed

  • Box plots are often used for comparing two sets of data

    • Two box plots will be drawn next to each other using the same axis

    • You can easily compare the medians and interquartile ranges

Examiner Tips and Tricks

  • You may be able to draw a box plot on your calculator if you have the raw data

    • Your calculator's box plot can also include outliers so this is a good way to check

Worked Example

The distances, in metres, travelled by 15 snails in a one-minute period are recorded and shown below: 

0.5,   0.7,   1.0,   1.1,   1.2,   1.2,   1.2,   1.3,   1.4,   1.4,   1.4,   1.4,   1.5,   1.5,   1.5

(a)

(i) Find the values of $Q_1$, $Q_2$ and $Q_3$.

The values in the data set are already in size order

The lower quartile, $Q_1$, will be at position $\frac{n+1}{4}$

$\frac{15+1}{4} = 4$th position

0.5,   0.7,   1.0,   [1.1],   1.2,   1.2,   1.2,   1.3,   1.4,   1.4,   1.4,   1.4,   1.5,   1.5,   1.5

The median, $Q_2$, will be at position $\frac{n+1}{2}$

$\frac{15+1}{2} = 8$th position

0.5,   0.7,   1.0,   1.1,   1.2,   1.2,   1.2,   [1.3],   1.4,   1.4,   1.4,   1.4,   1.5,   1.5,   1.5

The upper quartile, $Q_3$, will be at position $\frac{3(n+1)}{4}$

$\frac{3(15+1)}{4} = 12$th position

0.5,   0.7,   1.0,   1.1,   1.2,   1.2,   1.2,   1.3,   1.4,   1.4,   1.4,   [1.4],   1.5,   1.5,   1.5

$Q_1 = 1.1$ m
$Q_2 = 1.3$ m
$Q_3 = 1.4$ m

(ii) Find the interquartile range.

The interquartile range is the difference between the upper quartile and the lower quartile

$\text{IQR} = Q_3 - Q_1 = 1.4 - 1.1$

$\text{IQR} = 0.3$ m

(iii) Identify any outliers.

Identify the bounds for outliers
An outlier is more than $1.5 \times \text{IQR}$ above the upper quartile or below the lower quartile

$Q_1 - 1.5 \times \text{IQR} = 1.1 - 1.5 \times 0.3 = 0.65$
$Q_3 + 1.5 \times \text{IQR} = 1.4 + 1.5 \times 0.3 = 1.85$

Identify any items in the data set that are outliers

$0.5 < 0.65$

0.5 m is an outlier
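The quartile, interquartile range and outlier steps in part (a) can be checked with a short Python sketch. It follows the position rules used in this note, $\frac{n+1}{4}$, $\frac{n+1}{2}$ and $\frac{3(n+1)}{4}$, taking the midpoint of the neighbouring values when a position is not a whole number. The helper name `value_at_position` is made up for the example, and statistical software may use different quartile conventions.

```python
import math


def value_at_position(sorted_values, position):
    """Return the value at a 1-based position in an ordered list.

    If the position is not a whole number, use the midpoint of the
    values either side of it.
    """
    below = sorted_values[math.floor(position) - 1]
    above = sorted_values[math.ceil(position) - 1]
    return (below + above) / 2


# Snail distances from the worked example, already in size order
distances = [0.5, 0.7, 1.0, 1.1, 1.2, 1.2, 1.2, 1.3,
             1.4, 1.4, 1.4, 1.4, 1.5, 1.5, 1.5]
n = len(distances)

q1 = value_at_position(distances, (n + 1) / 4)
q2 = value_at_position(distances, (n + 1) / 2)
q3 = value_at_position(distances, 3 * (n + 1) / 4)
iqr = q3 - q1

# Values beyond these bounds are treated as outliers
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr
outliers = [d for d in distances if d < lower_bound or d > upper_bound]

print(q1, q2, q3, outliers)
```

The bounds are compared with a strict inequality, so a value exactly on a bound would not be flagged in this sketch.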

(b) Draw a box plot for the data.

Label the axis with a scale that covers the range of the data
Mark the outlier value with a cross
Draw vertical lines for the maximum value as well as for $Q_1$, $Q_2$ and $Q_3$
Draw a vertical line for the minimum value that is not an outlier
Connect $Q_1$, $Q_2$ and $Q_3$ with two horizontal lines to form the box
Connect the maximum value and the minimum value that is not an outlier to the box with a single horizontal line to form the whiskers

A box plot for the distance data showing the outlier at 0.5 m, the minimum value that is not an outlier at 0.7 m, the lower quartile at 1.1 m, the median at 1.3 m, the upper quartile at 1.4 m and the maximum value at 1.5 m

Stem and Leaf Diagrams

What is a stem and leaf diagram?

  • A stem-and-leaf diagram is a simple way to display an ordered list of data using digits

    • It can show the shape of the distribution of the data

  • Two-digit numbers are split into a tens digit (the stem) and a units digit (the leaf)

    • 25 becomes 2 | 5

    • The stem is written vertically and the leaves are written horizontally (in size order)

  • The following diagram shows the ages below

    • 11, 18, 20, 21, 25, 28, 29, 35, 36, 40

Age
1 | 1 8
2 | 0 1 5 8 9
3 | 5 6
4 | 0

Key: 1|8 means 18 years old
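Splitting values into stems and leaves can be sketched in Python using the ages listed above. This is an illustration only: it assumes two-digit whole numbers, and the variable names are made up for the example.

```python
from collections import defaultdict

ages = [11, 18, 20, 21, 25, 28, 29, 35, 36, 40]

stems = defaultdict(list)
for age in sorted(ages):
    stem, leaf = divmod(age, 10)  # tens digit (stem), units digit (leaf)
    stems[stem].append(leaf)

for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
print("Key: 1 | 8 means 18 years old")
```

Sorting the values first puts the leaves on each row in size order. A stem with no data values would be left out by this sketch, whereas a hand-drawn diagram would normally still show that stem with no leaves.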

What is the key on a stem and leaf diagram?

  • The key shows how values are formed from digits

    • It should include units

  • Other keys are possible

    • 2 | 5 represents 2.5 degrees

    • 2 | 5 represents 2005 people

What is a back to back stem and leaf diagram?

  • Occasionally you may encounter a back to back stem and leaf diagram

  • These display two series of data, both using the same variable, on one diagram

    • For example, the number of ice-creams sold each day for 10 particular days in August compared to September

      • Pay close attention to the key

      • Note that the leaves on the left increase from the centre outwards

Ice creams sold each day in August |   | Ice creams sold each day in September
                                   | 0 | 1 8
                               5 3 | 1 | 0 1 5 8 9
                             9 7 2 | 2 | 5 6
                         9 7 6 5 2 | 3 | 0

Key: 2|1|8 represents 12 ice creams sold on a day in August, and 18 sold on a day in September

How do I find the mean, median and mode from a stem and leaf diagram?

  • The mean is the sum of the values divided by the number of values

    • Add up all of the values in the stem-and-leaf diagram and divide by the total number of items

  • The median is the middle number

    • Data values are already in order

    • The median will be at position $\frac{n+1}{2}$

      • Count up from the first data value until you reach the correct position

      • Remember that the numbers increase in size from the value closest to the stem to the value furthest from the stem on each line

      • If the position given is not an integer value find the midpoint of the values that lie either side of it

  • The mode is the value that appears the most often

    • Identify the data value that appears in the diagram the greatest number of times
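These three averages can be checked in Python once the diagram has been turned back into full data values. The diagram below is hypothetical (key: 2 | 6 means 26) and is stored as a dictionary of stems and ordered leaves, a representation chosen only for this sketch.

```python
import statistics

# Hypothetical diagram: stem -> ordered leaves (key: 2 | 6 means 26)
diagram = {1: [2, 5], 2: [1, 6, 6, 8], 3: [0]}

# Rebuild each full value from its stem and leaf
values = [10 * stem + leaf for stem, leaves in diagram.items() for leaf in leaves]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

print(values)  # [12, 15, 21, 26, 26, 28, 30]
print(median, mode)  # 26 26
```

Rebuilding the full values first avoids the common mistake of reporting only the leaf.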

How do I find the quartiles and range from a stem and leaf diagram?

  • You can find the quartiles using a method similar to that for finding the median

    • The lower quartile, $Q_1$, will be at position $\frac{n+1}{4}$

    • The upper quartile, $Q_3$, will be at position $\frac{3(n+1)}{4}$

      • Count up from the first data value until you reach the correct position

      • Remember that the numbers increase in size from the value closest to the stem to the value furthest from the stem on each line

      • If the position given is not an integer value find the midpoint of the values that lie either side of it

  • The range is the largest value of the data minus the smallest value of the data

    • The largest value will be the value furthest from the stem on the final line

    • The smallest value will be the value closest to the stem on the first line

How do I find the standard deviation from a stem and leaf diagram?

  • You can find the standard deviation using the individual values from the stem and leaf diagram

    • You can use the formula, $\sigma_{n-1} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n-1}}$

    • Or you can input the values into your calculator and use the stats calculation options to calculate the standard deviation of a data set
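The formula can be followed step by step in Python and compared with a library routine. The sketch uses the ages from the stem and leaf diagram earlier in this note as sample data. Python's `statistics.stdev` also divides by $n - 1$, so the two results should agree.

```python
import math
import statistics

# Ages read from the stem and leaf diagram earlier in this note
ages = [11, 18, 20, 21, 25, 28, 29, 35, 36, 40]

n = len(ages)
mean = sum(ages) / n

# Sum of squared differences from the mean
sum_of_squares = sum((x - mean) ** 2 for x in ages)

# Divide by n - 1, then take the square root
sample_sd = math.sqrt(sum_of_squares / (n - 1))

# Library version of the same calculation (also divides by n - 1)
library_sd = statistics.stdev(ages)

print(round(sample_sd, 3), round(library_sd, 3))
```

A calculator may offer both the $n$ and $n - 1$ versions of standard deviation, so it is worth checking which one is being displayed.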

Examiner Tips and Tricks

  • Make sure that you write down the entire data value including the stem, e.g. median = 26

  • It is a common mistake to write down the leaf only, e.g. median = 6

Worked Example

A hospital is investigating a new drug that claims to reduce blood pressure.
The reductions in blood pressure, measured in mmHg (millimetres of mercury), for 11 patients are shown below.

12        31        24        18        21        34        40        19        23        17        16 

(a) Draw a stem and leaf diagram to show these results.

Split each value into its tens digit (stem) and units digit (leaf)
The values are not yet in order

Blood pressure reduction
1 | 2 8 9 7 6
2 | 4 1 3
3 | 1 4
4 | 0

Put the values in order
Write down the key

 

Blood pressure reduction
1 | 2 6 7 8 9
2 | 1 3 4
3 | 1 4
4 | 0

Key: 1|2 means a blood pressure reduction of 12 mmHg

(b) Use your stem and leaf diagram to find the median blood pressure reduction.

Find the middle value
This is the 6th patient

 

Blood pressure reduction
1 | 2 6 7 8 9
2 | [1] 3 4
3 | 1 4
4 | 0

The median has a leaf of 1 and a stem of 2

The median is 21 mmHg

Author: Naomi C

Expertise: Maths Content Creator

Naomi graduated from Durham University in 2007 with a Masters degree in Civil Engineering. She has taught Mathematics in the UK, Malaysia and Switzerland covering GCSE, IGCSE, A-Level and IB. She particularly enjoys applying Mathematics to real life and endeavours to bring creativity to the content she creates.

Reviewer: Dan Finlay

Expertise: Maths Subject Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.