Metadata Card
- Prerequisites: Vol 5 Database (B+ trees, LSM-Tree), Chapter 3 (Distributed Consensus)
- Estimated Time: 45 minutes
- Core Difficulty: Intermediate
- Reading Mode: High focus
- Completion Milestone: Explain how consistent hashing handles node changes, understand HDFS block+replica design, grasp Cassandra's Ring and read/write paths, know Spanner's global-scale technologies
Your Progress
The 3-node etcd cluster can manage configs and locks steadily via consensus protocol. But the battle report system produces increasing log data — reconnaissance reports 5GB/day, HQ meeting records 3GB/day, plus troop movement records from each fortress. This data can't go into etcd (designed for small, strongly consistent metadata).
General Lin says: "Data is provisions. Provisions must not be stored in one warehouse — one fire and everything is lost. Distribute. How far? How to find it?" Your Task
Master three distributed storage paradigms: Consistent Hashing, HDFS (block storage for large files), Cassandra (column store + partition + eventual consistency).
Consistent Hashing: Hash ring. Each node and key hashed onto ring. Key goes to first clockwise node. Adding/removing a node only migrates keys in adjacent segment. Virtual nodes improve distribution.
HDFS: NameNode (metadata) + DataNodes (data). Files split into blocks (128MB), each block replicated to 3 DataNodes. Read: nearest replica (network topology). Write: pipeline replication.
Cassandra: Ring + Partition (consistent hashing with virtual nodes). Write path: Coordinator → MemTable + CommitLog (append). Read path: read repair, return latest version. AP design — eventual consistency, no single point of failure.
Spanner: Google's global database. Combines: sharding (directory-based), Paxos (within-shard consensus), TrueTime (global timestamps), SQL distributed query optimization. Provides external consistency globally.
Common Pitfalls: HDFS NameNode becomes bottleneck beyond 100 million files. Cassandra queries must be pre-designed (no flexible WHERE). Consistent hashing "balance" is not automatic — requires time for data migration. "Read-after-write" in distributed storage is not guaranteed even at QUORUM.
Traveler's Notes
No silver bullet in distributed storage. Consistent hashing solves "minimize data migration on node changes" but requires upfront query design. HDFS is batch processing's best partner, Cassandra suits write-heavy online services, Spanner achieves global external consistency with extreme engineering complexity.
Next: Distributed Computing (Chapter 5).