Data Insights (College Board AP® Computer Science Principles): Study Guide
Data processing techniques
How do programs extract information from data?
Programs can be used to process data and extract information from it
Tables, diagrams, text, and other visual tools communicate insight and knowledge from data
Search tools efficiently find information
Data filtering systems help find information and recognize patterns
Programs such as spreadsheets efficiently organize data and reveal trends
Processes for extracting and modifying information
Transforming every element of a dataset (e.g. doubling every value in a list)
Filtering a dataset (e.g. keeping only positive numbers)
Combining or comparing data (e.g. summing a list)
Visualizing a dataset through a chart, graph, or other visual representation
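The four processes above can be sketched in a few lines of Python. The dataset and values here are hypothetical, chosen purely to illustrate each process; the "visualization" is a minimal text bar chart rather than a real charting library:

```python
# Hypothetical dataset: daily sensor readings (illustrative values only).
readings = [18, -2, 21, 19, -1, 24]

# Transforming: apply an operation to every element (doubling each value).
doubled = [value * 2 for value in readings]

# Filtering: keep only the elements that satisfy a condition (positives).
positives = [value for value in readings if value > 0]

# Combining: reduce a dataset to a single summary value (the sum).
total = sum(positives)

# Visualizing: a minimal text bar chart, one '#' per unit.
for value in positives:
    print("#" * value)
```

Each list comprehension produces a new list rather than modifying the original, which mirrors how these processes are usually described: the source data stays intact while information is extracted from it.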
How programs gain insight and knowledge from data
Programs are used iteratively and interactively to process information
Filtering and cleaning digital data with programs produces insight and knowledge
Combining data sources, clustering, and classifying are parts of this process
Translating and transforming digital information produces insight; patterns emerge from transformation
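A short sketch of cleaning and combining, using made-up data (the survey values, user IDs, names, and scores are all hypothetical, for illustration only):

```python
# Hypothetical raw survey responses: some entries are missing or malformed.
raw_ages = ["25", "", "31", "n/a", "42"]

# Filtering and cleaning: discard records that cannot be read as numbers.
cleaned = [int(age) for age in raw_ages if age.isdigit()]

# Combining data sources: merge two datasets that share a key (user ID).
names = {1: "Ada", 2: "Grace"}
scores = {1: 91, 2: 87}
combined = {uid: (names[uid], scores[uid]) for uid in names}
```

In practice this kind of processing is iterative and interactive: you clean, inspect the result, refine the cleaning rule, and repeat until the data is usable.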
Examiner Tips and Tricks
The AP exam may describe a dataset and ask which data processing technique is being used. Clustering groups similar data points based on shared characteristics; classifying assigns data to categories. Both are processes for gaining insight and knowledge from data.
If a question asks how a pattern or trend was discovered, the answer usually involves one of the four processes — most often visualizing the data (converting it into a chart or graph) or filtering to isolate relevant records. Iterative and interactive processing is also a common answer when the scenario describes repeated cycles of examination and refinement.
For the AP Create Performance Task, if your program processes data, be prepared to explain on exam day which techniques it uses. Naming the four processes (transforming, filtering, combining, or visualizing) and connecting them to the broader activities — filtering and cleaning, combining data sources, clustering, classifying, and translating and transforming — demonstrates understanding of programmatic data processing.
Worked Example
A streaming service analyzes user listening data. The program groups users into categories based on the genres they listen to most frequently — the categories emerge from the data itself, rather than being set in advance.
Which data processing technique does this describe?
(A) Classifying, because users are placed into categories that already exist
(B) Clustering, because similar users are grouped together based on shared characteristics
(C) Filtering, because irrelevant data is removed before analysis
(D) Cleaning, because the data is corrected before processing
[1]
Answer:
(B) Clustering, because similar users are grouped together based on shared characteristics [1 mark]
Clustering groups similar data points based on shared characteristics. Classifying would require categories that already exist — the scenario describes categories emerging from the data itself, which rules classifying out.
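The distinction can be made concrete with a sketch. The user IDs and listening counts below are hypothetical; the point is that the cluster labels emerge from the data, whereas classifying would start from a fixed category list:

```python
# Hypothetical listening counts per genre for each user (illustrative data).
users = {
    "u1": {"rock": 40, "jazz": 2},
    "u2": {"rock": 35, "jazz": 5},
    "u3": {"rock": 1, "jazz": 50},
}

# Clustering: group users by their most-listened genre. The group labels
# ("rock", "jazz") come out of the data itself, not a predefined list.
clusters = {}
for user, counts in users.items():
    top_genre = max(counts, key=counts.get)
    clusters.setdefault(top_genre, []).append(user)

# Classifying, by contrast, would assign each user to one of a set of
# categories fixed in advance (e.g. "casual" vs "heavy" listener).
```

Because the groups here are discovered rather than predefined, this is clustering — matching answer (B) in the worked example.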