
Concept: Data-Intensive

1. The Snapshot

An application where the primary bottleneck is the volume, complexity, or speed of change of data, rather than raw CPU processing power.

2. The Description

Data-intensive applications are built from standard functional building blocks (like databases and caches) that are "stitched together" to meet specific requirements. The engineering challenge is not calculation, but managing the flow, storage, and retrieval of state across these various systems.

3. Author Quotes

"Many applications today are data-intensive, as opposed to compute-intensive. Raw CPU power is rarely a limiting factor for these applications—bigger problems are usually the amount of data, the complexity of data, and the speed at which it is changing." (p. 3)

4. Defining Features

  • Primary Bottlenecks: Volume (size), Complexity (structure), or Velocity (speed of change).
  • Core Abstractions: Standard building blocks such as databases, caches, search indexes, stream processing, and batch processing.
  • System Integration: Often requires combining multiple tools because a single tool cannot meet all requirements.
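
As a minimal sketch of this kind of composition, the Python below wires together in-memory stand-ins for a database, a cache, and a search index. The class name `DataSystem` and its methods are hypothetical and chosen only for illustration; real systems would replace each dict with a separate tool. The point it tries to show is that the application code, not any single tool, keeps the cache and the index consistent with the primary store.

```python
class DataSystem:
    """Hypothetical application-level composition of three building blocks."""

    def __init__(self):
        self.database = {}      # stand-in for the primary store
        self.cache = {}         # stand-in for a read cache
        self.search_index = {}  # stand-in for a text index: word -> set of keys

    def write(self, key, text):
        # Drop index entries that belonged to the previous version, if any.
        old_text = self.database.get(key)
        if old_text is not None:
            for word in old_text.lower().split():
                self.search_index.get(word, set()).discard(key)
        # Write to the primary store, then invalidate and re-index.
        self.database[key] = text
        self.cache.pop(key, None)
        for word in text.lower().split():
            self.search_index.setdefault(word, set()).add(key)

    def read(self, key):
        # Cache-aside read: try the cache first, fall back to the database.
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def search(self, word):
        # Return matching keys in a stable order.
        return sorted(self.search_index.get(word.lower(), set()))


system = DataSystem()
system.write("doc1", "streams and batch processing")
system.write("doc2", "batch jobs")
print(system.search("batch"))  # ['doc1', 'doc2']
print(system.read("doc1"))     # streams and batch processing
```

Invalidating the cache on write (rather than updating it) is one design choice among several; it is used here only because it keeps the sketch short.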

5. The Boundary

  • It is NOT just "Big Data": A small dataset with extreme complexity or high-velocity updates is still Data-Intensive.
  • It is NOT Compute-Intensive: If the bottleneck is purely CPU cycles (e.g., video transcoding), the application is compute-intensive, not data-intensive.

6. The Prototype

A real-time financial trading dashboard. It doesn't perform heavy math, but it must process high-velocity streams, search complex historical indexes, and provide low-latency cached views.

7. Helpful Info

Kleppmann notes that these building blocks are such successful abstractions that we often use them without thinking. The "Expert" level is knowing which tool to use when, and how to stitch tools together when the boundaries between categories blur (e.g., Redis acting as both cache and queue).
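
To make the "blurring" concrete, here is a small sketch of one store playing both roles. `TinyStore` is a hypothetical in-memory class, not Redis or its client API; it is only loosely modeled on the shape of Redis commands such as SET/GET with an expiry and LPUSH/RPOP. The `now` parameter is an assumption added so that expiry can be demonstrated without waiting.

```python
import time
from collections import deque


class TinyStore:
    """Hypothetical in-memory store that acts as both a cache and a queue."""

    def __init__(self):
        self._values = {}  # key -> (value, expires_at or None)
        self._lists = {}   # queue name -> deque of items

    # Cache role: key-value entries with an optional time-to-live.
    def set(self, key, value, ttl_seconds=None, now=None):
        now = time.monotonic() if now is None else now
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._values[key] = (value, expires_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._values.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and now >= expires_at:
            del self._values[key]  # lazily evict the expired entry
            return None
        return value

    # Queue role: push on the left, pop from the right (first in, first out).
    def push(self, queue, item):
        self._lists.setdefault(queue, deque()).appendleft(item)

    def pop(self, queue):
        items = self._lists.get(queue)
        return items.pop() if items else None


store = TinyStore()
store.set("price:ACME", 101.5, ttl_seconds=30)  # cache role
store.push("orders", {"id": 1})                 # queue role
print(store.get("price:ACME"))
print(store.pop("orders"))
```

The same object serves a cached read and a queued message, which is why the usual category labels stop being a reliable guide to what a tool can do.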

8. The Swap Test

"Because our architecture is Data-Intensive, we should stop looking for a 'Perfect Database' and instead start composing a Data System from specialized building blocks."

9. Source Reference

ddia/pages/page_025.txt


🧠 Pedagogical Tracking

Milestone | Status | Date | Lesson Ref | Notes
Introduced in Lesson | | 2026-01-28 | Lesson 01 | Initial Study
Active Recall #1 | | | |
1-Day Review | | | |
1-Week Review | | | |
1-Month Review | | | |