Chapter 18: System Analysis & Comparison
Metadata Card
| Attribute | Content |
|---|---|
| Difficulty | (Intermediate) |
| Prerequisites | All previous chapters in Vol 5 |
| Keywords | Comparison, trade-offs, OLTP vs OLAP, SQL vs NoSQL, scaling, consistency vs availability |
| Code | None (analytical) |
Your Progress
"Standing on the Data Fortress walls, overlooking the entire data landscape — a chain from SQL to distributed systems."
OLTP vs OLAP
| Dimension | OLTP | OLAP |
|---|---|---|
| Workload | Short transactions, high concurrency | Complex queries, aggregations |
| Read pattern | Point queries, small result sets | Scans over large data volumes |
| Write pattern | Frequent small writes | Bulk inserts, ETL batch loads |
| Data model | Normalized, 3NF | Denormalized, star/snowflake schema |
| Storage | Row-oriented | Column-oriented |
| Latency | Milliseconds | Seconds to minutes |
| Examples | PostgreSQL, MySQL, Oracle | ClickHouse, Redshift, BigQuery |
SQL vs NoSQL Trade-offs
| Decision | Choose SQL when | Choose NoSQL when |
|---|---|---|
| Schema | Fixed, well-defined | Flexible, evolving |
| Relationships | Complex joins needed | Mostly independent entities |
| Consistency | Strong consistency required | Eventual consistency acceptable |
| Scale | Vertical (more CPU/RAM) | Horizontal (more nodes) |
| Transactions | Multi-row ACID required | Single-key operations |
| Query complexity | Complex aggregations, subqueries | Simple KV lookups |
Vertical vs Horizontal Scaling
Vertical Scaling (Scale Up)
┌──────────────────────┐
│ Bigger Machine │
│ ┌────────────────┐ │
│ │ CPU ↑ RAM ↑ │ │
│ │ Disk ↑ │ │
│ └────────────────┘ │
└──────────────────────┘
Pros: Simple, no application changes
Cons: Upper limit, expensive, single point of failure
Horizontal Scaling (Scale Out)
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Node 1│ │Node 2│ │Node 3│ │Node 4│
└──────┘ └──────┘ └──────┘ └──────┘
Pros: Linear scaling, commodity hardware, fault tolerance
Cons: Complexity (networking, consistency, rebalancing)Consistency vs Availability
The CAP theorem dictates trade-offs. In practice:
- CP systems: Strong consistency, graceful degradation during partitions
- AP systems: High availability, eventual consistency
- Tunable: Systems like Cassandra allow per-operation tuning
Technology Radar
OLTP Row Stores: PostgreSQL, MySQL, SQL Server, Oracle
OLAP Column Stores: ClickHouse, Redshift, BigQuery, Snowflake
Distributed SQL: CockroachDB, TiDB, YugabyteDB, Spanner
Key-Value: Redis, DynamoDB, Etcd
Document: MongoDB, Couchbase, Firestore
Wide-Column: Cassandra, HBase, ScyllaDB
Graph: Neo4j, Neptune, JanusGraph
Search: Elasticsearch, Meilisearch, Solr
Streaming: Kafka, Pulsar, Kinesis
Cache: Redis, MemcachedTraveler's Notes
No single database is best for everything. The modern data landscape is a rich ecosystem of specialized systems. The skill is not in mastering one tool but in understanding the trade-offs across the entire space. Choose based on your workload's specific requirements — and don't fear using multiple systems for different use cases (polyglot persistence).