Data Analysis & Patterns (College Board AP® Computer Science Principles): Study Guide
Data analysis fundamentals
What is data analysis?
Data analysis is the process of examining data to extract useful information, identify patterns, and draw conclusions
Raw data on its own is not meaningful until it has been organized, processed, and interpreted
The goal of data analysis is to discover trends, connections, and solutions that can inform decisions or answer questions
From data to information
Data is a set of raw facts or values that have not yet been interpreted (for example, a list of temperatures recorded every hour)
Information is data that has been processed and given meaning (for example, a graph showing that temperatures rose steadily throughout the day)
Identifying patterns in data (repeated behaviors, upward or downward trends, clusters) is a key part of turning data into useful information
Combining data from multiple sources can reveal connections that are not visible in a single dataset
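As a minimal sketch of combining sources, the Python snippet below joins two small datasets on a shared student ID. The dictionaries, IDs, and values are hypothetical and invented purely for illustration.

```python
# Hypothetical data from two separate sources, keyed by student ID.
# All values are invented for illustration.
hours_studied = {"s1": 2, "s2": 5, "s3": 8}
test_scores = {"s1": 61, "s2": 74, "s3": 90, "s4": 70}

# Keep only the students that appear in both sources, pairing
# each student's hours studied with their test score.
combined = {
    student_id: (hours_studied[student_id], test_scores[student_id])
    for student_id in hours_studied
    if student_id in test_scores
}

for student_id, (hours, score) in combined.items():
    print(f"{student_id}: studied {hours} h, scored {score}")
```

Neither source alone links study time to scores. Only the combined records make that possible connection visible, and even then it would be a correlation, not proof of cause.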
The table below shows how raw data becomes meaningful information through processing and interpretation:
| Data | Information |
|---|---|
| A list of temperatures recorded every hour | A graph showing that temperatures rose steadily throughout the day |
| The number of clicks on each link on a website | A report identifying which pages users visit most frequently |
| A list of student test scores | An average score calculated to show overall class performance |
| GPS coordinates recorded every minute | A map showing the route a delivery driver took |
| Daily sales figures for a store | A trend line showing that sales increase every weekend |
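Two of the rows above can be sketched in a few lines of Python. This is only an illustration: the scores and temperatures are made-up values, and `is_steadily_rising` is a hypothetical helper name, not part of any AP-required language or library.

```python
from statistics import mean

# Hypothetical raw data: values invented for illustration only.
test_scores = [72, 85, 90, 66, 87]
hourly_temps = [12.0, 13.5, 15.0, 16.5, 18.0]


def is_steadily_rising(values):
    """Return True if every value is higher than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


# Processing turns the raw lists into information someone can act on.
class_average = mean(test_scores)
rising = is_steadily_rising(hourly_temps)

print(f"Class average: {class_average}")
print(f"Temperatures rose steadily: {rising}")
```

The lists are the data. The average and the rising/not-rising answer are the information.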
Correlation vs causation
What is the difference between correlation and causation?
Correlation means that two variables appear to be related: when one changes, the other tends to change as well
Causation means that a change in one variable directly causes a change in the other
Correlation does not prove causation: two variables may change together without one being the reason for the other
A common error in data analysis is assuming that because two things are correlated, one must cause the other
| Concept | Meaning | Example |
|---|---|---|
| Correlation | Two variables change together in a pattern | Ice cream sales and sunburn rates both increase in summer |
| Causation | One variable directly causes a change in the other | Increased sun exposure causes sunburn |
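One common way to measure how strongly two variables change together is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand in Python. The monthly figures are hypothetical, and this calculation is not required by the AP exam; it is included only to make the idea of correlation concrete.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical monthly figures, invented for illustration.
ice_cream_sales = [20, 35, 50, 80, 95]
sunburn_cases = [2, 4, 6, 9, 11]

r = pearson_r(ice_cream_sales, sunburn_cases)
print(f"Correlation coefficient: {r:.2f}")
```

A value close to 1 says the two variables rise together. It says nothing about whether one causes the other.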
Drawing valid conclusions
Valid conclusions require evidence that goes beyond correlation: controlled experiments, logical reasoning, or ruling out other explanations
Combining data from multiple data sources can strengthen conclusions by providing additional evidence
When analyzing data, always consider whether a hidden third factor could explain the observed pattern (for example, hot weather explains both increased ice cream sales and increased sunburn rates)
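The hidden-factor idea can be illustrated with a small Python sketch that groups days by weather. The records and the `average_by_weather` helper are hypothetical, and the pattern shown is only consistent with weather being the common factor; it does not prove it.

```python
from statistics import mean

# Hypothetical daily records: (weather, ice_cream_sales, sunburn_cases).
# All values are invented for illustration.
days = [
    ("cool", 20, 2),
    ("cool", 30, 1),
    ("cool", 25, 2),
    ("cool", 35, 1),
    ("hot", 80, 9),
    ("hot", 90, 8),
    ("hot", 85, 9),
    ("hot", 95, 8),
]


def average_by_weather(records, weather):
    """Average ice cream sales and sunburn cases for one weather type."""
    sales = [s for w, s, _ in records if w == weather]
    burns = [b for w, _, b in records if w == weather]
    return mean(sales), mean(burns)


cool_sales, cool_burns = average_by_weather(days, "cool")
hot_sales, hot_burns = average_by_weather(days, "hot")

print(f"Cool days: {cool_sales} sales, {cool_burns} sunburn cases on average")
print(f"Hot days:  {hot_sales} sales, {hot_burns} sunburn cases on average")
```

Both averages jump on hot days, so the weather column accounts for why sales and sunburn move together. In this invented data, once the weather is held fixed, days with higher sales do not have more sunburn cases.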
Examiner Tips and Tricks
The AP exam frequently tests whether you can distinguish correlation from causation. If a question describes two variables that change together and asks what can be concluded, the safe answer is that they are correlated. Only choose causation if the scenario describes a direct, controlled cause-and-effect relationship. Look out for distractors that claim one variable "causes" the other based on data alone.
For the AP Create Performance Task, if your program processes or displays data, be prepared to explain on exam day what patterns or information it helps users identify. Remember that any conclusions drawn from data should be supported by evidence, not assumed.
Worked Example
A researcher finds that students who eat breakfast every day tend to have higher test scores than students who skip breakfast. Which of the following is the most accurate conclusion based on this data?
(A) Eating breakfast causes students to score higher on tests
(B) There is a correlation between eating breakfast and higher test scores
(C) Students who skip breakfast are less motivated to study
(D) Test scores have no relationship to breakfast habits
[1 mark]
Answer:
(B) There is a correlation between eating breakfast and higher test scores [1 mark]
The data shows that the two variables change together, but it does not prove that one causes the other. Other factors, such as sleep or study habits, could explain both, so correlation is the most accurate conclusion.