Chapter 4: Hash Tables
Vol 2: Algorithm Forest Adventure · Station 4
Metadata Card
| Attribute | Value |
|---|---|
| Chapter | Vol 2 · Chapter 4 |
| Difficulty | (Core Challenge) |
| Prerequisites | Arrays, Linked Lists (Chapter 2) |
| Core Skills | Hash function design, collision handling, resizing, consistent hashing |
| Keywords | Hash Table Hash Function Collision Chaining Open Addressing Load Factor Rehash Consistent Hashing |
Your Progress
You encounter a strange thicket in the forest — every bush has a lock, and with the right key, the bush instantly pops out what you stored. This is the Hash Forest. Trees here live by hash values, not order.
Breakthrough · Origins
Encounter #1: Hash Function
A Hash Function takes any object and returns a fixed-range integer — the array index.
def simple_hash(key, table_size):
return hash(key) % table_sizeCollisions are inevitable (pigeonhole principle). When "apple" and "pen" hash to the same index, we have a Hash Collision.
Encounter #2: Separate Chaining
Each array slot holds a linked list. On collision, append to the list.
class ChainingHashTable:
def __init__(self, capacity=8):
self._table = [[] for _ in range(capacity)]
self._capacity = capacity
self._size = 0
def _hash(self, key):
return hash(key) % self._capacity
def put(self, key, value):
idx = self._hash(key)
bucket = self._table[idx]
for i, (k, v) in enumerate(bucket):
if k == key:
bucket[i] = (key, value)
return
bucket.append((key, value))
self._size += 1
def get(self, key):
idx = self._hash(key)
for k, v in self._table[idx]:
if k == key:
return v
raise KeyError(key)
def remove(self, key):
idx = self._hash(key)
bucket = self._table[idx]
for i, (k, v) in enumerate(bucket):
if k == key:
bucket.pop(i)
self._size -= 1
return
raise KeyError(key)Encounter #3: Open Addressing (Linear Probing)
On collision, walk forward to find the next empty slot.
class OpenAddressingHashTable:
DELETED = object()
def __init__(self, capacity=8):
self._keys = [None] * capacity
self._vals = [None] * capacity
self._capacity = capacity
self._size = 0
def _hash(self, key):
return hash(key) % self._capacity
def put(self, key, value):
idx = self._hash(key)
while self._keys[idx] is not None and self._keys[idx] is not key:
idx = (idx + 1) % self._capacity
if self._keys[idx] is None:
self._size += 1
self._keys[idx] = key
self._vals[idx] = value
def get(self, key):
idx = self._hash(key)
while self._keys[idx] is not None:
if self._keys[idx] is key:
return self._vals[idx]
idx = (idx + 1) % self._capacity
raise KeyError(key)Encounter #4: Load Factor and Rehashing
class AutoResizingHashTable(ChainingHashTable):
LOAD_FACTOR_THRESHOLD = 0.75
def put(self, key, value):
super().put(key, value)
if self._size / self._capacity > self.LOAD_FACTOR_THRESHOLD:
self._resize(self._capacity * 2)
def _resize(self, new_cap):
old = self._table
self._capacity = new_cap
self._table = [[] for _ in range(new_cap)]
self._size = 0
for bucket in old:
for k, v in bucket:
super().put(k, v)Encounter #5: Consistent Hashing (Preview)
Hash servers and keys onto a ring. Adding/removing a server only affects neighboring keys — no full rehashing.
Common Pitfalls
Two Sum: O(n²) → O(n)
def two_sum(nums, target):
seen = {}
for i, num in enumerate(nums):
if target - num in seen:
return [seen[target - num], i]
seen[num] = i
return NoneVerification Checklist
- [ ] Can explain why hash tables achieve O(1) lookup
- [ ] Can hand-write separate chaining (insert/find/delete)
- [ ] Can explain linear probing vs quadratic probing
- [ ] Understands load factor and why 0.75 is standard
- [ ] Can explain the core problem consistent hashing solves
Traveler's Notes
- Hash function → array index. Collisions are inevitable.
- Separate Chaining — chain on collision. Java 8+ upgrades to red-black tree after 8 nodes.
- Open Addressing — find next slot on collision. Memory-compact, tricky deletion.
- Load factor 0.75 — industry consensus for space-time balance.
- Consistent hashing — minimizes data migration in distributed caches.
→ Next Stop Preview
Chapter 5: Binary Trees and BST — "From linear to hierarchical — the first nonlinear structure."