Compression Algorithms (College Board AP® Computer Science Principles): Study Guide

Written by: Robert Hampton

Reviewed by: James Woodhouse

Compression fundamentals

What is data compression?

  • Compression is the process of reducing the number of bits needed to represent data

  • Compression decreases data size, which reduces the amount of storage space required and speeds up transmission over a network

  • Compression works by identifying patterns and redundancy in the data, or by removing detail that is unlikely to be noticed

  • The goal is to reduce size while preserving as much useful information as possible
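One simple way to see how "identifying patterns and redundancy" can save space is run-length encoding, which stores each run of repeated characters as a character plus a count. This guide does not name run-length encoding; it is used here only as an illustration, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (character, count) pair."""
    return [(char, len(list(run))) for char, run in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original string by repeating each character `count` times."""
    return "".join(char * count for char, count in pairs)

original = "AAAAAABBBCCCCCCCCD"       # 18 characters with long repeated runs
encoded = rle_encode(original)
print(encoded)                        # [('A', 6), ('B', 3), ('C', 8), ('D', 1)]
print(rle_decode(encoded) == original)  # True: nothing was lost
```

Eighteen characters are described by four pairs, and decoding returns exactly the original string. Data without repeated runs would gain nothing from this particular technique.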

Why compression matters

  • Smaller files use less storage on devices and servers

  • Compressed data transfers faster over networks, reducing download and upload times

  • Without compression, many everyday tasks (streaming music, sending images, loading web pages) would require significantly more time and bandwidth
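The effect on transfer time can be estimated with simple arithmetic: time is roughly the file size in bits divided by the connection speed in bits per second. The sketch below uses made-up numbers (a 50 MB file, a 10 Mbit/s link, and an assumed compressed size of 5 MB) and ignores real-world factors such as latency.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Idealized transfer time: ignores latency and protocol overhead."""
    return size_bytes * 8 / bits_per_second

link_speed = 10_000_000      # hypothetical 10 Mbit/s connection
uncompressed = 50_000_000    # hypothetical 50 MB file
compressed = 5_000_000       # same file, assumed to compress to 5 MB

print(transfer_seconds(uncompressed, link_speed))  # 40.0
print(transfer_seconds(compressed, link_speed))    # 4.0
```

Under these assumptions, a file one tenth the size takes one tenth the time to send.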

How much can compression reduce file size?

  • The amount of size reduction from compression depends on both the amount of redundancy in the original data representation and the compression algorithm applied

  • Fewer bits does not necessarily mean less information: a good compression algorithm can reduce the number of bits needed without losing any of the original meaning

  • Data that contains a lot of repetition (for example, a text file where one word appears hundreds of times) can often be compressed more effectively than data that is already dense and non-repetitive
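The link between redundancy and compressibility can be checked with Python's built-in `zlib` module, one widely available lossless compressor. This is a minimal sketch: the sample data is invented for the demonstration, and exact compressed sizes will vary, so none are quoted here.

```python
import os
import zlib

repetitive = b"compression " * 1000   # 12,000 bytes: one phrase repeated
random_like = os.urandom(12_000)      # 12,000 bytes with no pattern to exploit

for label, data in [("repetitive", repetitive), ("random", random_like)]:
    packed = zlib.compress(data)
    print(label, len(data), "->", len(packed), "bytes")
```

Running this should show the repetitive data shrinking to a small fraction of its original size, while the random data barely changes (it may even grow by a few bytes).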

Lossless vs lossy compression

What is the difference between lossless and lossy compression?

  • Lossless compression can usually reduce file size without losing any data: the original data can be perfectly reconstructed from the compressed version

  • Lossy compression reduces file size by permanently removing some data: the result is an approximation of the original that cannot be fully restored

  • Lossy compression can usually reduce file size more than lossless compression
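The difference can be seen in a small experiment. The sketch below compresses a list of made-up sample values losslessly with `zlib`, then applies a very simple lossy step: rounding each value to the nearest multiple of 10. Rounding is only an illustration of discarding detail; real lossy formats use far more sophisticated methods.

```python
import zlib

samples = [101, 103, 98, 97, 102, 150, 151, 149]   # hypothetical sample values

# Lossless: compress the bytes, then decompress to get exactly the same data back.
raw = bytes(samples)
restored = list(zlib.decompress(zlib.compress(raw)))
print(restored == samples)        # True

# Lossy (illustrative): keep only the nearest multiple of 10, discarding fine detail.
step = 10
quantized = [round(s / step) for s in samples]   # smaller, more repetitive values
approximation = [q * step for q in quantized]    # the best reconstruction available
print(approximation)              # [100, 100, 100, 100, 100, 150, 150, 150]
print(approximation == samples)   # False: the original values cannot be recovered
```

The lossy version is more repetitive and therefore easier to compress further, but the small differences between the original values are gone for good.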

| Feature | Lossless | Lossy |
|---|---|---|
| Data after decompression | Identical to original; fully preserved | An approximation; some detail is permanently lost |
| Typical file size reduction | Moderate | Significant |
| Best used for | Text, code, spreadsheets, medical images; where every detail matters | Audio, images, video; where small losses are not noticeable to humans |

Compression trade-offs & selection

How do you choose between lossless and lossy compression?

  • Choosing a compression method involves weighing trade-offs between file size and quality

  • Lossless is the right choice when preserving the original data exactly is the priority (for example, compressing a program's source code or a legal document)

  • Lossy is the right choice when reducing file size is more important than keeping every detail (for example, compressing a photograph for a website where a small loss in quality is acceptable)

  • The decision depends on the purpose of the data and how it will be used

Factors that influence the choice

  • Purpose: will the data need to be reconstructed exactly, or is an approximation acceptable?

  • Storage constraints: is storage space limited, making aggressive compression necessary?

  • Transmission speed: does the file need to transfer quickly over a slow network?

  • Audience: will the end user notice the difference in quality?
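The decision process above can be written as a short rule of thumb. This is a simplified sketch rather than an official rule: the function name `suggest_compression` and its two yes/no parameters are hypothetical, and real decisions may weigh more factors.

```python
def suggest_compression(needs_exact_copy, loss_is_acceptable):
    """Return 'lossless' or 'lossy' using the rule of thumb described above."""
    if needs_exact_copy or not loss_is_acceptable:
        return "lossless"
    return "lossy"

# Source code must be reconstructed exactly.
print(suggest_compression(needs_exact_copy=True, loss_is_acceptable=False))   # lossless

# A website photo can tolerate a small drop in quality.
print(suggest_compression(needs_exact_copy=False, loss_is_acceptable=True))   # lossy
```

Exact reconstruction always takes priority: lossy is suggested only when an approximation is acceptable for the purpose.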

Examiner Tips and Tricks

  • When the AP exam describes a scenario and asks which compression type is appropriate, identify what matters most. If accuracy and full reconstruction are essential, the answer is lossless; if reducing file size is the priority and minor quality loss is acceptable, the answer is lossy. A common exam distractor describes lossy compression as "losing the file". Lossy means some detail is permanently removed, not that the whole file is lost or corrupted.

  • For the AP Create Performance Task, if your program uses images, audio, or video, the file formats you choose will affect storage size and quality — understanding the difference between lossless and lossy formats helps you make informed decisions about the media you include

Worked Example

Which of the following best describes what happens when a file is compressed using a lossless algorithm?

(A) Some information is removed to reduce the file size, so the original cannot be reconstructed

(B) The file size is reduced, and less information is stored than in the original

(C) The file size is reduced, but the original data can be perfectly reconstructed from the compressed version

(D) The file size stays the same, but the data is reorganized

[1]

Answer:

(C) The file size is reduced, but the original data can be perfectly reconstructed from the compressed version [1 mark]

  • Lossless compression reduces the number of bits stored or transmitted while guaranteeing complete reconstruction of the original data. Fewer bits does not necessarily mean less information — the original data is preserved even though the compressed file is smaller.

Author: Robert Hampton

Expertise: Computer Science Content Creator

Rob has over 16 years' experience teaching Computer Science and ICT at KS3 & GCSE levels. Rob has demonstrated strong leadership as Head of Department since 2012 and previously supported teacher development as a Specialist Leader of Education, empowering departments to excel in Computer Science. Beyond his tech expertise, Robert embraces the virtual world as an avid gamer, conquering digital battlefields when he's not coding.

Reviewer: James Woodhouse

Expertise: Computer Science & English Subject Lead

James graduated from the University of Sunderland with a degree in ICT and Computing education. He has over 14 years of experience both teaching and leading in Computer Science, specialising in teaching GCSE and A-level. James has held various leadership roles, including Head of Computer Science and coordinator positions for Key Stage 3 and Key Stage 4. James has a keen interest in networking security and technologies aimed at preventing security breaches.