The Knowledge Tree

Recursive Elaboration: Moving from the "Whole" to the "Parts"

🛠️ Data Systems

Systems where data (quantity, complexity, rate of change) is the primary bottleneck, not CPU power.
🌱
Evaluate a system's architecture based on the trade-offs between Reliability, Scalability, and Maintainability.
Continuing to work correctly even when things go wrong (Faults).
🌱
The system's ability to cope with increased load while keeping performance acceptable.
🌱
Breaking a large dataset into smaller subsets (partitions) assigned to different nodes to enable unlimited horizontal scaling.
Scaling a system by adding more machines to the resource pool, rather than increasing the capacity of existing machines.
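Partitioning plus horizontal scaling can be sketched in a few lines. This is a minimal hash-mod assignment (an assumption for illustration; real systems usually use key ranges or consistent hashing so that partitions can be rebalanced without moving every key):

```python
import hashlib

def partition_for(key: str, num_nodes: int) -> int:
    """Assign a key to one of num_nodes partitions by hashing.

    A stable hash (not Python's randomized hash()) keeps the
    assignment identical across processes and restarts.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

# Every node applies the same function, so any node can route a request.
keys = ["user:1", "user:2", "user:3", "user:4"]
placement = {k: partition_for(k, num_nodes=3) for k in keys}
```

Adding a machine here changes `num_nodes` and therefore reassigns most keys, which is exactly why production schemes avoid plain mod-N hashing.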
The metrics used to describe the pressure on a system, such as requests per second or the ratio of reads to writes.
The ease with which different people can work on the system over time (Operability, Simplicity, Evolvability).
🌱
The ease with which operations teams can keep a system running smoothly.
The reduction of accidental complexity to make a system easy for new engineers to understand.
The ease with which a system can be modified to adapt to changing requirements.
Translating in-memory data structures into bytes for storage or transmission, handling backward/forward compatibility.
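The compatibility problem shows up even with plain JSON (a hypothetical `User` record for illustration; schema-driven formats like Avro or Protocol Buffers handle this more rigorously). Forward compatibility means old readers ignore fields they don't know; backward compatibility means new readers supply defaults for fields old writers never produced:

```python
import json

def encode_user(user: dict) -> bytes:
    return json.dumps(user).encode("utf-8")

def decode_user(data: bytes) -> dict:
    record = json.loads(data.decode("utf-8"))
    # Backward compatibility: a new reader defaults a field
    # that old writers never emitted.
    record.setdefault("email", None)
    return record

# Bytes written by an old version of the code (no "email" field),
# read by the new version without breaking:
old_bytes = encode_user({"id": 1, "name": "Ada"})
user = decode_user(old_bytes)
```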
The interface layer. How data is structured (Relational, Document, Graph) vs. how it is stored.
How we ask for data. Imperative tells the computer *how* to do it (loops); Declarative tells it *what* we want (SQL).
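The contrast is easiest to see side by side. Here the same question is answered imperatively (spelling out the loop) and declaratively (stating the result and letting SQLite's query planner choose the access path); the `animals` data is invented for the example:

```python
import sqlite3

animals = [
    {"name": "shark",  "family": "Sharks"},
    {"name": "salmon", "family": "Salmonidae"},
]

# Imperative: tell the computer HOW — iterate, test, accumulate.
sharks = []
for a in animals:
    if a["family"] == "Sharks":
        sharks.append(a["name"])

# Declarative: state WHAT we want; the engine decides how to get it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE animals (name TEXT, family TEXT)")
db.executemany("INSERT INTO animals VALUES (?, ?)",
               [(a["name"], a["family"]) for a in animals])
rows = [r[0] for r in
        db.execute("SELECT name FROM animals WHERE family = 'Sharks'")]
```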
Choose the right underlying storage engine (LSM vs B-Tree) for a specific workload.
An append-only sequence of records. The simplest and fastest way to write data.
Sorted String Tables. Immutable on-disk files where keys are sorted, merged in the background.
The industry standard for storage. Fixed-size pages on disk updated in place using a wide tree structure.
Storing data by column rather than by row to optimize for analytical (OLAP) queries.
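The row-versus-column trade-off can be made concrete with plain lists (toy sales records invented for the example): an analytical aggregate over the columnar layout touches only the one column it needs, and same-typed values sit together where they compress well:

```python
# Row-oriented: each record stored together — good for OLTP lookups.
rows = [
    {"date": "2024-01-01", "product": "a", "quantity": 3},
    {"date": "2024-01-01", "product": "b", "quantity": 5},
    {"date": "2024-01-02", "product": "a", "quantity": 2},
]

# Column-oriented: one array per column — good for OLAP scans.
columns = {
    "date":     [r["date"] for r in rows],
    "product":  [r["product"] for r in rows],
    "quantity": [r["quantity"] for r in rows],
}

# The aggregate reads one column, not every field of every row.
total = sum(columns["quantity"])
```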
Scale out across multiple machines using Replication (copying) and Partitioning (splitting).
Keeping a copy of the same data on multiple nodes.
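A leader-based (single-leader) arrangement is the most common form, sketched here as an in-process toy with synchronous fan-out — real systems ship a replication log to followers, usually asynchronously, which is where replication lag comes from:

```python
class Replica:
    """A node holding a full copy of the data; serves reads."""
    def __init__(self) -> None:
        self.data: dict[str, str] = {}

class Leader(Replica):
    """The one node that accepts writes and forwards them."""
    def __init__(self, followers: list[Replica]) -> None:
        super().__init__()
        self.followers = followers

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        for f in self.followers:     # synchronous here for simplicity;
            f.data[key] = value      # async in practice (lag, but no stall)

followers = [Replica(), Replica()]
leader = Leader(followers)
leader.write("x", "1")               # any replica can now serve reads of x
```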
Navigate the dangerous waters of distributed state, transactions, and consensus.
A mechanism to group several reads and writes into a logical unit.
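The classic illustration is a money transfer (accounts and amounts invented for the example), using SQLite: both updates commit together, or an error inside the block rolls both back, so no partial state is ever visible:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INT)")
db.executemany("INSERT INTO accounts VALUES (?, ?)",
               [("alice", 100), ("bob", 0)])
db.commit()

try:
    with db:  # the connection as context manager = one transaction
        db.execute("UPDATE accounts SET balance = balance - 150 "
                   "WHERE name = 'alice'")
        db.execute("UPDATE accounts SET balance = balance + 150 "
                   "WHERE name = 'bob'")
        (bal,) = db.execute("SELECT balance FROM accounts "
                            "WHERE name = 'alice'").fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")  # aborts everything
except ValueError:
    pass  # the rollback already happened; both writes are undone

balances = dict(db.execute("SELECT name, balance FROM accounts"))
```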
The problem of getting several nodes to agree on something (e.g., who is the leader) in an unreliable network.
The assumption that in a distributed system, packets will be lost, clocks will drift, and nodes will pause.
The difficulty of ordering events in a distributed system due to clock drift.
Stop-the-world events (like Garbage Collection) that cause a node to stop responding for seconds.
A consistency model that ensures operations that are causally related are seen by every node in the same order.
The strongest consistency model: ensuring the system behaves as if there is only one copy of the data and all operations are atomic.
Integrate disparate systems by deriving new datasets from a source of truth via Batch and Stream processing.
A programming model for processing large datasets with a parallel, distributed algorithm.
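The model reduces to three pieces — map, shuffle, reduce — shown here with the canonical word count, run in one process (in a real cluster the framework runs many mappers and reducers in parallel and the shuffle moves data between machines):

```python
from collections import defaultdict
from itertools import chain

def map_fn(line: str):
    """Map: emit a (word, 1) pair for every word in one record."""
    for word in line.split():
        yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework's job)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_fn(key, values):
    """Reduce: combine the values collected for one key."""
    return (key, sum(values))

lines = ["the quick fox", "the lazy dog"]
pairs = chain.from_iterable(map_fn(line) for line in lines)
counts = dict(reduce_fn(k, vs) for k, vs in shuffle(pairs).items())
```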
Batch processing engines that optimize workflows by keeping intermediate state in memory rather than writing to disk.
Processing unbounded data as it arrives (event-driven), rather than in fixed-size batches.
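One common stream operation is windowing: bucketing an unbounded event flow into fixed-size time windows as events arrive. A tumbling-window page-view counter, with invented events and event-time (not arrival-time) bucketing — real engines also have to decide when a window is complete despite late events:

```python
from collections import Counter

events = [
    {"ts": 0,  "page": "/home"},
    {"ts": 12, "page": "/home"},
    {"ts": 31, "page": "/buy"},
]

def tumbling_window_counts(stream, size: int) -> dict[int, Counter]:
    """Count events per page in fixed-size windows, one event at a time."""
    windows: dict[int, Counter] = {}
    for event in stream:                 # unbounded in real life
        window = event["ts"] // size     # e.g. 30-second buckets
        windows.setdefault(window, Counter())[event["page"]] += 1
    return windows

counts = tumbling_window_counts(iter(events), size=30)
```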
Composing disparate systems (OLTP, Cache, Search) into a coherent whole using Change Data Capture.
Understanding the responsibility of managing data that affects people's lives (Privacy, Fairness, and Democracy).

🧠 Pedagogy

The foundational principle that learning is deeper and more durable when it requires cognitive effort.
🌿
Design a study session that builds muscle (memory) rather than just watching others lift weights (reading).
Defining the two metrics of success for all learning: how long it stays (Durability) and how well you can use it (Transfer).
The ability to recall knowledge over long periods of time, resisting the natural decay of the forgetting curve.
🌿
The natural biological decay of memory over time if not reinforced.
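Ebbinghaus's curve is commonly idealized as exponential decay, R = e^(-t/S), where S is the memory's strength — the exact functional form and the numbers below are illustrative assumptions, not measured values:

```python
import math

def retention(t_days: float, strength: float) -> float:
    """Idealized forgetting curve: fraction of material still
    recallable after t_days, for a memory of given strength."""
    return math.exp(-t_days / strength)

# Retrieval practice raises strength, which flattens the curve.
weak, strong = 2.0, 10.0
after_week_weak = retention(7, weak)      # steep decay
after_week_strong = retention(7, strong)  # much slower decay
```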
The ability to apply knowledge in new, varied, and unfamiliar contexts.
Understanding how the act of recalling information physically modifies the brain's ability to access it later.
Cognitive biases that lead learners to mistake familiarity with text for mastery of content.
Kahneman's model: System 1 is fast/intuitive (recognition), System 2 is slow/analytical (retrieval).
The debunked theory that instruction must match a student's preferred mode (VARK).
Mistaking the ease of reading or recognizing a text for actual mastery of the underlying concepts.
The phenomenon where the act of taking a test produces better long-term retention than an equivalent amount of time spent studying.
The biological process where recalling a memory makes it plastic and subject to strengthening before being stored again.
🌱
The physical connections in the brain that are strengthened during retrieval.
Designing learning conditions that slow down the process to improve long-term retention and transfer.
Distributing study sessions over time with intervals long enough to cause some forgetting.
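A spacing schedule can be generated by a simple rule. This is a Leitner-style sketch (a simplification; tools like Anki use the more elaborate SM-2 algorithm): each successful recall widens the gap, deliberately allowing some forgetting, and each failure resets it:

```python
def next_interval(current_days: int, recalled: bool) -> int:
    """One review step: double the gap on success, reset on failure."""
    if not recalled:
        return 1                      # back to frequent review
    return max(1, current_days * 2)   # space it out further

schedule, interval = [], 1
for recalled in [True, True, False, True]:
    interval = next_interval(interval, recalled)
    schedule.append(interval)
# schedule == [2, 4, 1, 2]
```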
Mixing different types of problems or skills within a single session to improve discrimination.
Moving beyond rote memorization to build complex, interconnected knowledge structures.
Elaboration: Explaining new material in your own words. Generation: Solving a problem before being taught the method.
🌱
The act of taking a few minutes to review what has been learned and asking oneself questions (Retrieval + Elaboration).
Existing structures of knowledge that new information 'hooks' into.
The ability to extract key rules and underlying principles from new information to build a mental framework.
Internal representations of external reality that allow for inference and prediction.
A hierarchical model classifying learning objectives from recall (Knowledge) to creation (Synthesis).
Using objective tools to measure actual competence versus perceived competence.
Thinking about one's own thinking; monitoring and accurately judging one's own learning.
External tools (quizzes, peer review, simulation) used to verify mastery.
Explicitly explaining the 'why' and 'how' of learning difficulties to students to prevent discouragement.
Understanding that intelligence is not fixed and that learning physically changes the brain.
The biological process (mostly during sleep) where temporary memory traces are stabilized into long-term storage and connected to prior knowledge.
The belief that intelligence can be developed (Dweck).
Using artificial structures to organize and retrieve unruly information.
Triggers (images, locations, rhymes) that hook onto a memory trace to pull it out.