Software Systems Atlas

Metadata Card

Prerequisites: Chapter 6 — Combinatorics
Estimated time: 45 minutes
Core difficulty: Intermediate
Completion marker: Understand modular arithmetic, Euclid's algorithm, and the basic idea of the Chinese remainder theorem

Your Progress

The highest floor of the Combinatorics Tower. The numbers on the wall are peculiar — they move in circles. "In a clock, 11 plus 2 equals 1. That's not a mistake; it's modular arithmetic," says the Librarian.

Your Task

Integers are the most basic objects in a computer, but the integer arithmetic you perform in code every day has a mathematical foundation far richer than "elementary arithmetic." Why do hash tables use modular arithmetic for indexing? What is the mathematical basis of RSA encryption? Why are some numbers called "prime"? This chapter lets you see the underlying structure of the integer world.

Chapter Layers
Required reading: Divisibility and primes, modular arithmetic, greatest common divisor and Euclid's algorithm
Optional reading: Chinese remainder theorem, Euler's theorem

Breakthrough · Origin Story

You're using hashCode() % tableSize to locate hash buckets. A colleague tells you that using a power of 2 as tableSize causes more hash collisions. You think it's superstition, but test data supports his claim — fewer collisions when using a prime modulus. Why? This is a number theory problem.

Divisibility and modulo: a ≡ b (mod m) means a and b have the same remainder when divided by m.

Properties of modular arithmetic are remarkably consistent with ordinary arithmetic:

(a + b) mod m = ((a mod m) + (b mod m)) mod m
(a × b) mod m = ((a mod m) × (b mod m)) mod m
a^b mod m can be computed efficiently (fast exponentiation)

The first property explains why hash tables use modular arithmetic: you can map arbitrarily large hashCode values into a fixed range of buckets.

Why prime-sized hash tables: if m has a factor d, key distribution across remainder classes modulo d becomes uneven. A prime m has only 1 and itself as factors, maximizing distribution uniformity.

Euclid's algorithm: Finding gcd(a, b) — the greatest common divisor.

This algorithm is over 2,300 years old, one of the oldest in mathematics, yet remains active at the forefront of cryptography today. Whether in RSA encryption, fractional reduction, or even collision detection in game development, you're indirectly invoking it.

gcd(a, b):
  while b ≠ 0:
    a, b = b, a % b
  return a

This algorithm is both efficient and elegant — it's one of the oldest algorithms in mathematics (circa 300 BCE) and remains active in cryptography today.

Extended Euclid's algorithm: Not only finds gcd(a, b), but also gives integers x, y such that ax + by = gcd(a, b). This may look like pure math, but it's the core step in computing the private key in RSA encryption.

Prime numbers: Integers greater than 1 that are only divisible by 1 and themselves.

The Fundamental Theorem of Arithmetic: Every integer greater than 1 can be uniquely decomposed into a product of primes.
The naive method to check if a number is prime is trial division up to √n.
For large prime detection, the Miller-Rabin test uses a probabilistic method and is the cornerstone of modern cryptography.

Chinese remainder theorem: If the moduli are pairwise coprime, a system of congruences has a unique solution modulo the product of the moduli. It has practical applications in RSA acceleration and large integer arithmetic. A simplified scenario: you know a number leaves remainder 2 when divided by 3, remainder 3 when divided by 5, and remainder 2 when divided by 7 — you can uniquely determine the number modulo 105.

Common Pitfalls

Confusing modulo arithmetic with negative numbers. In mathematics, -1 mod 5 = 4 (because -1 + 5 = 4). But in C/Java, -1 % 5 gives -1. This can lead to bugs in hash function implementations involving negative numbers.
Assuming large integer factorization is "obviously" hard. It's not — it's an unsolved major conjecture in number theory. RSA security depends on the assumption that "large integer factorization is hard." If disproven, the whole system needs to be rebuilt.
Using the modulo operator directly in scenarios requiring non-negative results. It's recommended to use ((x % m) + m) % m to ensure a non-negative result.

Challenge Questions

Use Euclid's algorithm to compute gcd(123456, 789012).
Write a fast exponentiation algorithm to compute a^b mod m (Python). Verify whether Fermat's Little Theorem a^(m-1) ≡ 1 (mod m) holds when m is prime.
Explain why a hash table of size 2^n causes more collisions among keys with the same lower bits (hint: a mod 2^n depends only on the lower n bits of the binary representation).

Traveler's Notes

Number theory makes the integer world far more transparent. Modular arithmetic maps large values into a fixed range, Euclid's algorithm finds common divisors, and primes support the entire foundation of cryptography.

→ Next stop: You leave the Combinatorics Tower. Towards the third tower — the Applied Tower. There, numbers are no longer just integers and discrete counts; they become vectors, functions, and continuous flows.