Computing Bias (College Board AP® Computer Science Principles): Study Guide
Bias in computing
How does bias appear in computing systems?
Computing innovations can reflect existing human biases in two main ways:
Biases written into the algorithms that drive the innovation
Biases in the data used to train or operate the innovation
These biases are often embedded at all levels of software development, from initial design through deployment
Bias in computing systems can produce unequal or unfair outcomes for different groups of people, even when the system was not designed with intent to discriminate
How bias enters computing innovations
Biased algorithms: the logic of an algorithm may favor certain outcomes or groups over others (for example, a loan-approval algorithm that weighs zip code in a way that disadvantages certain neighborhoods)
Biased data: if the data used to train or operate a system reflects historical or societal inequalities, the system is likely to reproduce those inequalities (for example, a facial recognition system trained mostly on one demographic group may perform less accurately on others)
Bias can enter at any stage of development, from initial design through testing and deployment
| Source of bias | What it means | Example |
|---|---|---|
| Biased algorithm | The rules of the algorithm favor certain outcomes | A loan-approval algorithm that weighs zip code heavily in a way that disadvantages certain neighborhoods |
| Biased data | The data used reflects historical or societal inequalities | A facial recognition system trained mostly on one demographic performs less accurately on others |
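To make the "biased algorithm" row concrete, here is a minimal Python sketch of a rule-based loan screen. Everything in it is invented for illustration: the function names, the zip codes, the weights, and the threshold are assumptions, not a real lending model. It shows two applicants with identical finances getting different decisions only because of a rule written into the algorithm.

```python
# Hypothetical loan screen: zip codes, weights, and threshold are made up.
PENALIZED_ZIP_CODES = {"00001"}

def loan_score(income, on_time_payments, zip_code):
    """Score an applicant from income, payment history, and zip code."""
    score = income // 1000 + 2 * on_time_payments
    if zip_code in PENALIZED_ZIP_CODES:
        # This rule is the source of bias: it lowers the score for a
        # whole neighborhood regardless of the individual's record.
        score -= 30
    return score

def approve(income, on_time_payments, zip_code, threshold=80):
    """Approve the loan when the score meets the threshold."""
    return loan_score(income, on_time_payments, zip_code) >= threshold

# Same income and payment history, different zip code.
applicant_a = approve(60000, 15, "00002")
applicant_b = approve(60000, 15, "00001")
print(applicant_a, applicant_b)
```

Here the data about each applicant is accurate; the unfair outcome comes from the rule itself, which is what distinguishes a biased algorithm from biased data.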
Reducing bias in computing
Programmers should take action to reduce bias in algorithms used for computing innovations as a way of combating existing human biases
Practical actions include:
Using diverse and representative data
Reviewing algorithms for unintended outcomes before release
Monitoring systems after release as data and use cases change
Reducing bias is an ongoing responsibility throughout development, not a single step
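One way to review or monitor a system for unintended outcomes is to compare how often each group receives a favorable decision. The sketch below is a simple illustration under stated assumptions: the group labels, the sample decisions, and the helper names `approval_rate_by_group` and `largest_gap` are hypothetical, and a real review would use more than one measure.

```python
from collections import defaultdict

def approval_rate_by_group(decisions):
    """Return {group: fraction approved} for (group, approved) pairs."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, was_approved in decisions:
        total[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / total[group] for group in total}

def largest_gap(rates):
    """Difference between the highest and lowest group approval rates."""
    return max(rates.values()) - min(rates.values())

# Made-up decisions logged from a hypothetical system.
decisions = (
    [("group_a", True)] * 8 + [("group_a", False)] * 2
    + [("group_b", True)] * 4 + [("group_b", False)] * 6
)

rates = approval_rate_by_group(decisions)
gap = largest_gap(rates)
print(rates, gap)
```

A large gap does not prove the system is biased on its own, but it signals that the rules and the data deserve a closer look. Running the same check again after release fits the idea that reducing bias is ongoing.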
Examiner Tips and Tricks
Exam questions about bias often distinguish between bias in algorithms and bias in data. If a question describes biased outcomes, identify the source by asking whether the rules of the system or the data feeding it are at fault.
For the CPT, if your program processes data that represents people (e.g., users, customers, students), consider in your written response whether the data could carry bias and what you might do to reduce its impact.
Worked Example
A company develops a hiring algorithm trained on the resumes of employees hired over the past 10 years. The company notices that the algorithm tends to favor male candidates over female candidates, even when their qualifications are equivalent. Which of the following best explains this outcome?
(A) The algorithm has a bug that must be fixed by rewriting it from scratch
(B) The training data reflects historical hiring patterns that favored male candidates, embedding that bias into the algorithm
(C) Hiring algorithms cannot be made fair regardless of the data used
(D) The algorithm is functioning correctly because it matches past hiring decisions
[1]
Answer:
(B) The training data reflects historical hiring patterns that favored male candidates, embedding that bias into the algorithm [1 mark]
The algorithm has learned from data that reflects existing human bias in past hiring decisions, demonstrating how bias in data can embed bias in computing innovations.
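The reasoning behind answer (B) can be sketched with a toy "model" that simply learns past hire rates. This is a deliberately simplified illustration, not how a real hiring system works: the historical records are invented, and `train` and `predict` are hypothetical helpers. All candidates in the made-up history are equally qualified, yet the learned model still favors one group because the past decisions did.

```python
from collections import defaultdict

def train(history):
    """Learn the past hire rate for each (gender, qualified) combination."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for gender, qualified, was_hired in history:
        key = (gender, qualified)
        seen[key] += 1
        hired[key] += int(was_hired)
    return {key: hired[key] / seen[key] for key in seen}

def predict(model, gender, qualified):
    """Recommend hiring when similar past candidates were usually hired."""
    return model.get((gender, qualified), 0.0) >= 0.5

# Invented history: every candidate is qualified, but past decisions
# favored male candidates.
history = (
    [("male", True, True)] * 9 + [("male", True, False)] * 1
    + [("female", True, True)] * 3 + [("female", True, False)] * 7
)

model = train(history)
male_result = predict(model, "male", True)
female_result = predict(model, "female", True)
print(male_result, female_result)
```

The code contains no rule that says "prefer male candidates". The preference comes entirely from the training data, which is why the bias in this scenario is attributed to the data rather than to a bug in the algorithm.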