Chapter 18: Performance Engineering
Vol 3: Computer Core Expedition · Chapter 18
Metadata Card
| Attribute | Value |
|---|---|
| Keywords | Profiling, perf, Amdahl's Law, Cache-friendly, Performance Analysis Tools |
Your Progress
"You've learned how computers work at every level — from electrons to system calls. Now it's time to put that knowledge to work: making programs fast."
Encounter 1: Amdahl's Law
The speedup of a system is limited by the portion that cannot be parallelized:
Speedup = 1 / ((1 - P) + P/N)

where P is the parallelizable fraction of the work and N is the number of processors.
Key insight: Even with infinite processors, the speedup is bounded by 1/(1-P).
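To make the bound concrete, here is a minimal sketch of the formula in C. The function names `amdahl_speedup`, `amdahl_limit`, and `print_amdahl_table` are illustrative, not from any library. With P = 0.9 and N = 10 the formula gives 1 / (0.1 + 0.09), about 5.26, and the limit for P = 0.9 is 10 no matter how many processors are added.

```c
#include <stdio.h>

/* Amdahl's Law: speedup for parallel fraction p (0 <= p <= 1)
 * when the parallel part runs on n processors (n >= 1). */
double amdahl_speedup(double p, double n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

/* Upper bound on the speedup as n grows without limit.
 * Only meaningful for p < 1 (otherwise the bound is infinite). */
double amdahl_limit(double p)
{
    return 1.0 / (1.0 - p);
}

/* Print the speedup for a few processor counts (hypothetical helper). */
void print_amdahl_table(double p)
{
    static const double counts[] = {1, 2, 4, 8, 16, 64, 1024};
    size_t len = sizeof counts / sizeof counts[0];

    for (size_t i = 0; i < len; i++)
        printf("N = %6.0f  speedup = %.2f\n",
               counts[i], amdahl_speedup(p, counts[i]));
    printf("limit as N grows = %.2f\n", amdahl_limit(p));
}
```

Calling `print_amdahl_table(0.9)` from a `main` of your own shows the speedup flattening out well before the processor count gets large, which is the practical meaning of the bound.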
Encounter 2: Profiling Tools
- Linux perf: perf stat, perf record / perf report
- gprof: GNU profiler (instrumentation-based)
- Valgrind (Cachegrind / Callgrind): Simulated cache miss analysis (Cachegrind) and call-graph profiling (Callgrind)
- Flame graphs: Visualize CPU time distribution
Encounter 3: Cache-Friendly Code
// Cache-friendly: sequential access
for (int i = 0; i < N; i++)
    sum += arr[i];

// Cache-unfriendly: strided access
for (int i = 0; i < N; i += 64)
    sum += arr[i];

Data-Oriented Design: Organize data layout first, then design algorithms around it.
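The two loops above do different amounts of work, so timing them directly is not a fair comparison. The sketch below is one way to compare access patterns fairly: both functions add up every element, and only the order of access differs. The names `sum_sequential`, `sum_strided`, and `elapsed_seconds` are hypothetical helpers, and any timing difference depends on the array size, the stride, and the machine's cache hierarchy.

```c
#include <stddef.h>
#include <time.h>

/* Visit elements in memory order: consecutive accesses usually
 * fall in the same cache line. */
long sum_sequential(const int *arr, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += arr[i];
    return sum;
}

/* Visit the same elements, but jump `stride` elements between
 * accesses. Each pass starts one element later, so every element
 * is still added exactly once. stride must be at least 1. */
long sum_strided(const int *arr, size_t n, size_t stride)
{
    long sum = 0;
    for (size_t start = 0; start < stride; start++)
        for (size_t i = start; i < n; i += stride)
            sum += arr[i];
    return sum;
}

/* Convert two clock() readings into CPU seconds. */
double elapsed_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To try it, fill an array much larger than the last-level cache, wrap each call between two `clock()` readings, and compare the results of `elapsed_seconds`. Both functions should return the same sum. Running the same program under `perf stat -e cache-references,cache-misses` is one way to check whether the slower variant really misses more often, assuming those events are available on your hardware.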
Encounter 4: Common Optimization Techniques
- Reduce memory allocations (arena allocators, object pools)
- Minimize cache misses (SoA layout, loop tiling)
- Avoid branch mispredictions (branchless programming, lookup tables)
- Use SIMD instructions (vectorization)
- Profile first, optimize second: as a rule of thumb, roughly 90% of run time is spent in about 10% of the code
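Two of the techniques above can be sketched in a few lines of C. The first pair of functions contrasts an array-of-structures layout with a structure-of-arrays (SoA) layout when only one field is needed. The last function counts matching elements without an `if` in the loop body. All type and function names here are hypothetical, and whether either version is actually faster depends on the compiler and the data, so treat this as a starting point for measurement rather than a guaranteed win.

```c
#include <stddef.h>

/* Array of Structures: the fields of one particle sit together,
 * so reading only `mass` still pulls x, y, z into the cache. */
struct particle_aos {
    float x, y, z;
    float mass;
};

/* Structure of Arrays: each field has its own contiguous array,
 * so a pass over `mass` touches only mass data. */
struct particles_soa {
    float *x, *y, *z;
    float *mass;
    size_t count;
};

float total_mass_aos(const struct particle_aos *p, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += p[i].mass;
    return total;
}

float total_mass_soa(const struct particles_soa *p)
{
    float total = 0.0f;
    for (size_t i = 0; i < p->count; i++)
        total += p->mass[i];
    return total;
}

/* Count elements >= threshold without an explicit branch:
 * in C a comparison evaluates to 0 or 1, which is added directly.
 * Compilers may generate similar code for the `if` version too. */
size_t count_at_least(const int *v, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(v[i] >= threshold);
    return count;
}
```

The SoA loop over a contiguous `float` array is also the kind of loop that compilers find easier to vectorize with SIMD instructions, though that depends on the optimization flags in use.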
Verification Checklist
- [ ] Can explain Amdahl's Law
- [ ] Can use perf stat to measure cache misses
- [ ] Can identify cache-friendly vs cache-unfriendly access patterns
- [ ] Can explain the "profile first" principle
Traveler's Notes
- "Make it work, make it right, make it fast" — in that order
- Premature optimization is the root of all evil (Knuth)
- Use a profiler before you optimize — your intuition about bottlenecks is often wrong
- Measure twice, optimize once
→ The Journey Continues
You've completed all three volumes of the Software Systems Atlas. From bits and logic gates to CPUs, operating systems, and performance engineering — you now understand how computers work at every level.
The code you write is never "just code." Every a = b + c is a journey through registers, caches, ALUs, and the operating system. Every function call is a stack frame being built and destroyed. Every malloc is a negotiation with the virtual memory system.
You are no longer just a programmer. You are a systems engineer.
— Software Systems Atlas · Completed —