
Metadata Card

  • Prerequisites: Vol 4 Networking (basic concepts of HTTP/TLS), basic math (modular arithmetic, prime numbers)
  • Estimated time: 60 minutes
  • Core difficulty: Advanced
  • Completion mark: Can explain the applicable scenarios of symmetric vs asymmetric encryption, state the three core properties of hash functions, configure argon2 for password hashing

Your Progress

You've been traveling along the magic courier routes for a while now, and you can already freely send messages between two wizard towers using magical conduits.

But while organizing your courier pack, you've noticed a problem: all the magical letters you've sent so far have been in plaintext. Anyone who sets up a magic listening device along the courier route can read the contents of your messages—including the enchantment recipes and tower defense deployment coordinates.

You've seen what the TLS encryption formation looks like in Vol 4, and you know it uses some kind of "magic" to turn communications into ciphertext. But that was a protective formation built by someone else. Now you need to understand its material yourself—encryption spells are the material science of security.

Your Task

Understand the four pillars of cryptography: symmetric encryption (fast, suitable for large amounts of data), asymmetric encryption (solution to the key distribution problem), hash functions (irreversible fingerprints), and key derivation (generating keys from passwords). This chapter's goal is not to make you a cryptographer, but to enable you to read the design of a security protocol and know which tool to use when.

Chapter Layers

  • Required: Symmetric encryption (AES), asymmetric encryption (RSA), hashing (SHA-256), key derivation (Argon2)
  • Optional: How ECC works, AEAD modes
  • Advanced: Side-channel attacks, introductory quantum-resistant concepts

Breaking Ground · Tracing the Origin

Problem: You have an important letter to deliver to the Border Fortress, but it could be intercepted along the way.

The most basic idea: lock the letter in a box. But how do you get the key to the other side? If the key travels the same route as the letter, the bandits who grab the key can open the box.

This is the core contradiction of encryption: the content you want to protect (the letter) travels over an insecure channel, and the key needed for decryption must also go through that same channel.

First Lock: Symmetric Encryption

Symmetric encryption is the most intuitive solution—encryption and decryption use the same key.

plaintext + key -> encryption algorithm -> ciphertext
ciphertext + key -> decryption algorithm -> plaintext
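
To make the two arrows above concrete, here is a minimal sketch using the simplest symmetric cipher there is: a one-time pad, which XORs the message with a random key of the same length. This is not AES, and the helper name `xor_bytes` is made up for illustration; the only point is that the same key both locks and unlocks.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (hypothetical helper)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"Meet at the north gate"
key = secrets.token_bytes(len(plaintext))   # fresh random key, never reused

ciphertext = xor_bytes(plaintext, key)      # plaintext + key -> ciphertext
recovered = xor_bytes(ciphertext, key)      # ciphertext + key -> plaintext

print(f"Ciphertext (hex): {ciphertext.hex()}")
print(f"Recovered: {recovered.decode()}")
```

A one-time pad needs a key as long as the message and can never reuse it, which is why practical systems use a block cipher such as AES with a short, fixed-size key instead.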

AES (Advanced Encryption Standard) is the most widely used symmetric encryption algorithm today. It was standardized by NIST in 2001 as a block cipher.

AES core parameters:

| Parameter | Options | Description |
| --- | --- | --- |
| Key length | 128 / 192 / 256 bit | Longer is more secure; AES-128 is still safe |
| Block size | 128 bit (fixed) | Encrypts 16 bytes at a time |
| Mode | ECB / CBC / GCM / CTR | Affects security; never use ECB |

A minimal AES-GCM encryption in Python (GCM is the recommended mode because it includes authentication). Imagine you want to lock a reconnaissance report into a box that only you and the fortress commander can open—AES is the lock mechanism:

python
# Python 3.11+, using the cryptography library (pip install cryptography)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os

# Generate key: securely random, don't hardcode
key = AESGCM.generate_key(bit_length=256) # 32 bytes
aesgcm = AESGCM(key)

# nonce (number used once): must be different for each encryption
nonce = os.urandom(12)

plaintext = b"Border report: Found three suspicious campfires"
ciphertext = aesgcm.encrypt(nonce, plaintext, None)
print(f"Ciphertext (hex): {ciphertext.hex()}")

# Decrypt
decrypted = aesgcm.decrypt(nonce, ciphertext, None)
print(f"Decrypted: {decrypted.decode()}")

Save as aes_demo.py, run:

pip install cryptography
python aes_demo.py

Output looks like:

Ciphertext (hex): a1b2c3d4e5f6...
Decrypted: Border report: Found three suspicious campfires

Why is GCM recommended?

AES by itself only encrypts; it does not authenticate. An attacker who intercepts the ciphertext cannot read it, but can still flip bits in it. If the receiver does not verify integrity, it will decrypt the altered ciphertext into corrupted plaintext without noticing. GCM (Galois/Counter Mode) computes an authentication tag (a MAC) during encryption, so any tampering is detected at decryption time.

This is the concept of AEAD (Authenticated Encryption with Associated Data). In real systems, always use AEAD mode. ECB mode is forbidden, CBC mode requires an additional HMAC—using GCM or ChaCha20-Poly1305 directly is simpler.
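
The tampering risk described above can be sketched with the standard library alone. The code below assumes a toy stream cipher (a SHA-256-based keystream XORed with the message) as a stand-in for an unauthenticated counter-style mode; it is not AES, and `keystream` and `xor_stream` are hypothetical helpers. It shows an attacker changing the decrypted message without knowing the key, and then a MAC over the ciphertext catching the change.

```python
import hashlib
import hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || nonce || counter) blocks, concatenated."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, nonce, len(data))))

# Fixed demo values so the run is reproducible; real keys and nonces must be random
enc_key = b"k" * 32
mac_key = b"m" * 32
nonce = b"n" * 12

ciphertext = xor_stream(enc_key, nonce, b"PAY 100 GOLD")

# The attacker flips bits at index 4 without knowing the key: '1' becomes '9'
tampered = bytearray(ciphertext)
tampered[4] ^= ord("1") ^ ord("9")
tampered = bytes(tampered)

print(xor_stream(enc_key, nonce, tampered))   # b'PAY 900 GOLD'

# Encrypt-then-MAC: the sender tags nonce + ciphertext, the receiver checks first
tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
recomputed = hmac.new(mac_key, nonce + tampered, hashlib.sha256).digest()
received_tag_ok = hmac.compare_digest(tag, recomputed)
print(f"Tag valid for tampered ciphertext? {received_tag_ok}")   # False
```

An AEAD mode such as GCM bundles both steps into a single call, so the integrity check cannot be forgotten.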

The Achilles' Heel of Symmetric Encryption

You have AES now, but there's an unsolved problem: how do both sides agree on the key?

If you and I have never met, how can we securely negotiate a key known only to the two of us? This is the Key Distribution Problem.

Second Lock: Asymmetric Encryption

Asymmetric encryption uses a pair of keys: the public key is publicly shared, the private key is kept secret. Content encrypted with the public key can only be decrypted by the corresponding private key.

plaintext + public key -> encryption algorithm -> ciphertext
ciphertext + private key -> decryption algorithm -> plaintext

RSA is the classic asymmetric encryption algorithm; its security rests on the difficulty of factoring large integers. It answers the question the symmetric lock left open: how do you get a key to the other party? With RSA, everyone generates a key pair, shares the public key freely, and keeps the private key to themselves:

python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Generate key pair (this is a slow operation)
private_key = rsa.generate_private_key(
    public_exponent=65537,
    key_size=2048, # 2048 bit is the current minimum standard
)
public_key = private_key.public_key()

# Encrypt with public key (RSA cannot encrypt large amounts of data! Max length ≈ key_size/8 - padding)
message = b"Key: AES-GCM: a1b2c3d4"
ciphertext = public_key.encrypt(
    message,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    ),
)

# Decrypt with private key
plaintext = private_key.decrypt(
    ciphertext,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    ),
)
print(f"Decrypted: {plaintext.decode()}")
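
To see the modular arithmetic underneath, here is a toy "textbook RSA" sketch with tiny primes. The values of `p`, `q`, `e`, and `message` are illustrative assumptions, there is no padding, and a modulus this small can be factored by hand, so treat it purely as a picture of the math, not as usable encryption.

```python
# Toy RSA with tiny primes: illustration only, trivially breakable
p, q = 61, 53
n = p * q                   # public modulus
phi = (p - 1) * (q - 1)     # only computable if you know p and q
e = 17                      # public exponent, coprime with phi
d = pow(e, -1, phi)         # private exponent: modular inverse of e (Python 3.8+)

message = 65                # must be an integer smaller than n
ciphertext = pow(message, e, n)     # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)   # decrypt with the private key (d, n)

print(f"n={n}, d={d}, ciphertext={ciphertext}, recovered={recovered}")
```

Anyone who can factor `n` back into `p` and `q` can recompute `phi` and therefore `d`. With real 2048-bit moduli that factoring step is what is believed to be infeasible.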

The limitation of RSA is clear: it can only encrypt small amounts of data at a time (2048-bit RSA can encrypt about 190 bytes max), and it's thousands of times slower than AES.
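
The 190-byte figure follows from the padding overhead. A small sketch of the arithmetic, assuming the standard size limits for OAEP (modulus bytes minus twice the hash length minus 2) and for the older PKCS#1 v1.5 encryption padding (modulus bytes minus 11); the function names are hypothetical:

```python
def oaep_max_message_bytes(key_bits: int, hash_bytes: int) -> int:
    """Largest plaintext RSA-OAEP accepts: k - 2*hLen - 2, with k the modulus size in bytes."""
    return key_bits // 8 - 2 * hash_bytes - 2

def pkcs1v15_max_message_bytes(key_bits: int) -> int:
    """Largest plaintext for legacy PKCS#1 v1.5 encryption padding: k - 11."""
    return key_bits // 8 - 11

for bits in (2048, 3072, 4096):
    oaep = oaep_max_message_bytes(bits, hash_bytes=32)   # 32 bytes = SHA-256
    legacy = pkcs1v15_max_message_bytes(bits)
    print(f"RSA-{bits}: OAEP/SHA-256 max {oaep} bytes, PKCS#1 v1.5 max {legacy} bytes")
```

For a 2048-bit key this gives 190 bytes with OAEP and SHA-256, and 245 bytes with PKCS#1 v1.5, which is why both numbers show up in RSA discussions.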

Hybrid Encryption: Best Practice

In practice, you never use just one type of encryption. In the classic RSA-based TLS handshake (TLS 1.2 and earlier; TLS 1.3 replaced it with ephemeral Diffie-Hellman), this is what you see:

1. Client encrypts a "pre-master secret" with the server's public key (from the certificate)
2. Both sides derive session keys (AES keys) from the pre-master secret
3. All subsequent communication is encrypted with AES-GCM

Advantages: solves key distribution (RSA) while getting high performance (AES)

This is Hybrid Encryption—using asymmetric encryption to exchange symmetric keys, and symmetric encryption to protect actual data. Almost all modern security protocols (TLS, SSH, Signal) use this pattern.
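
Step 2 of the handshake above, turning one shared secret into session keys, can be sketched with HKDF built from the standard library's `hmac`. This is a minimal sketch of the HKDF extract-and-expand construction, not the actual TLS key schedule; the labels `client write key` and `server write key` and the stand-in `shared_secret` are assumptions made for the example.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_extract(salt: bytes, input_key_material: bytes) -> bytes:
    """Condense the input secret into a fixed-size pseudorandom key."""
    if not salt:
        salt = b"\x00" * HASH_LEN
    return hmac.new(salt, input_key_material, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to the `info` label."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]

shared_secret = bytes(range(32))   # stand-in for the secret both sides agreed on
prk = hkdf_extract(b"", shared_secret)

# Different labels give independent keys from the same secret
client_key = hkdf_expand(prk, b"client write key", 32)
server_key = hkdf_expand(prk, b"server write key", 32)
print(f"client key: {client_key.hex()}")
print(f"server key: {server_key.hex()}")
```

Both sides run the same derivation on the same secret, so they arrive at identical keys without ever sending those keys over the wire.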

Modern Alternative: ECC

ECC (Elliptic Curve Cryptography) is the modern replacement for RSA. At the same security strength, ECC keys are shorter and operations are faster:

| Security Strength | RSA Key Length | ECC Key Length |
| --- | --- | --- |
| 80 bit | 1024 | 160-223 |
| 112 bit | 2048 | 224-255 |
| 128 bit | 3072 | 256-383 |
| 256 bit | 15360 | 512+ |

At the 128-bit security level: RSA needs 3072-bit keys, while ECC only needs 256 bits.

In real systems, you're more likely to see ECDH (Elliptic Curve Diffie-Hellman) key exchange rather than direct ECC encryption. In TLS 1.3, ECDHE (ephemeral key exchange based on ECDH) is mainstream.

Both sides generate a key pair, exchange only the public half—then each independently computes the same shared secret. Alice and Bob don't need to agree on any secret beforehand:

python
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives import hashes

# Alice generates key pair
alice_private = ec.generate_private_key(ec.SECP256R1())
alice_public = alice_private.public_key()

# Bob generates key pair
bob_private = ec.generate_private_key(ec.SECP256R1())
bob_public = bob_private.public_key()

# After exchanging public keys, each computes the shared secret
alice_shared = alice_private.exchange(ec.ECDH(), bob_public)
bob_shared = bob_private.exchange(ec.ECDH(), alice_public)

assert alice_shared == bob_shared # Both sides get the same shared secret

# Derive a symmetric key using HKDF
derived_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=None,
    info=b"handshake data",
).derive(alice_shared)

print(f"Shared symmetric key (hex): {derived_key.hex()}")

This example shows the core of ECDH: two parties can negotiate a shared key known only to them over an insecure channel. Even if an attacker intercepts Alice's and Bob's public keys, they cannot compute the shared key—this is guaranteed by the difficulty of the elliptic curve discrete logarithm problem.
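
The same idea can be seen with plain integers. The sketch below is classic Diffie-Hellman over integers modulo a prime, not ECDH: elliptic curves replace modular exponentiation with point multiplication, but the exchange has the same shape. The prime `P` and base `G` are toy parameters chosen for readability and are far too small for real use.

```python
import secrets

# Public parameters, known to everyone including eavesdroppers (toy sizes)
P = 2**127 - 1   # a Mersenne prime
G = 3

# Each side picks a private exponent in [2, P - 2] and never reveals it
alice_private = secrets.randbelow(P - 3) + 2
bob_private = secrets.randbelow(P - 3) + 2

# Only these values travel over the insecure channel
alice_public = pow(G, alice_private, P)
bob_public = pow(G, bob_private, P)

# Each side combines its own private value with the other's public value
alice_shared = pow(bob_public, alice_private, P)
bob_shared = pow(alice_public, bob_private, P)

assert alice_shared == bob_shared
print(f"Shared secret: {hex(alice_shared)}")
```

Both results equal G raised to the product of the two private exponents, modulo P. An eavesdropper sees only the two public values and would have to solve a discrete logarithm to recover either exponent.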

Third Tool: Hash Functions

Encryption protects confidentiality. But how do you know the data hasn't been tampered with? That's the domain of hash functions.

Three core properties of hash functions:

  1. Preimage resistance (one-way): Given h = H(x), it is computationally infeasible to recover x
  2. Second preimage resistance: Given x, it is infeasible to find x' != x such that H(x) = H(x')
  3. Collision resistance: It is infeasible to find any pair x != y such that H(x) = H(y)
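
"Infeasible" depends on the output size. As a rough illustration, the sketch below deliberately weakens SHA-256 by keeping only its first 3 bytes (24 bits) and then finds a collision in a few thousand attempts. The truncation and the helper names are assumptions for the demo; full SHA-256 has no known collisions.

```python
import hashlib

def truncated_sha256(data: bytes, n_bytes: int) -> bytes:
    """SHA-256 cut down to its first n_bytes, to simulate a tiny hash."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int) -> tuple[bytes, bytes]:
    """Try inputs letter-0, letter-1, ... until two share a truncated digest."""
    seen: dict[bytes, bytes] = {}
    counter = 0
    while True:
        candidate = f"letter-{counter}".encode()
        digest = truncated_sha256(candidate, n_bytes)
        if digest in seen:
            return seen[digest], candidate
        seen[digest] = candidate
        counter += 1

first, second = find_collision(3)
print(f"{first!r} and {second!r} share the prefix {truncated_sha256(first, 3).hex()}")
print(f"Full digests equal? {hashlib.sha256(first).digest() == hashlib.sha256(second).digest()}")
```

Because of the birthday effect, a collision on an n-bit hash takes roughly 2^(n/2) attempts: about 4,000 here, but about 2^128 for the full 256-bit output.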

SHA-256 is one of the most widely used hash functions (it outputs 256 bits / 32 bytes). Hashing is not encryption: it produces an irreversible fingerprint. The recipient of your letter can recompute the hash of what arrived and compare it with the hash value you supplied; if the two match, the content has not changed along the way:

python
import hashlib

data = b"Border report: Found three suspicious campfires"
hash_value = hashlib.sha256(data).digest()
print(f"SHA-256: {hash_value.hex()}")

# Any tiny change produces a completely different hash
data2 = b"Border report: Found three suspicious campfires."
hash2 = hashlib.sha256(data2).digest()
print(f"Changed slightly: {hash2.hex()}")
print(f"Same? {hash_value == hash2}") # False

Typical uses of hashing:

| Usage | Description | Example |
| --- | --- | --- |
| Data integrity | Compare hash after file download to confirm no tampering | sha256sum |
| Message authentication (HMAC) | Combined with a key, authenticates sender and verifies integrity | HMAC-SHA256 |
| Password storage | Store hash instead of plaintext password | See below |
| Digital signatures | Hash first, then sign (solves asymmetric encryption's small data limit) | RSA-SHA256 |
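
The data-integrity row can be sketched as a small Python counterpart to `sha256sum`. The function name `sha256_file` and the temporary file are assumptions for the demo; the file is read in chunks so that large downloads do not have to fit in memory.

```python
import hashlib
import os
import tempfile

def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

content = b"Border report: Found three suspicious campfires"

# Stand-in for a downloaded file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    path = tmp.name

published = hashlib.sha256(content).hexdigest()   # the checksum the sender published
matches = sha256_file(path) == published
print(f"Checksum matches: {matches}")

os.remove(path)
```

This only detects tampering if the published checksum itself arrives over a channel the attacker cannot modify.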

HMAC (Hash-based Message Authentication Code) solves a problem: a plain hash needs no secret, so anyone who alters the message can simply recompute a matching hash. Once a key is mixed in, only those who know the key can compute a valid tag. It's like the courier and the fortress commander agreeing on a secret code: only the fingerprint computed from the message plus the code counts:

python
import hmac
import hashlib

key = b"shared-secret-key"
message = b"Meet at the fortress north gate at 3 PM today"

hmac_value = hmac.new(key, message, hashlib.sha256).digest()
print(f"HMAC: {hmac_value.hex()}")

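# The verifying side recomputes the HMAC with the same key and message,
# then compares in constant time so the check leaks nothing through timing
expected = hmac.new(key, message, hashlib.sha256).digest()
print(f"Valid? {hmac.compare_digest(hmac_value, expected)}")  # True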

Password Storage: From Plaintext to Argon2

Storing user passwords is a different scenario. The problem here isn't "transmission security" but "storage security."

  • Plaintext storage: all passwords exposed if the database is leaked
  • Single hash (SHA-256): easily reversed with rainbow tables
  • Salted SHA-256: defeats rainbow tables, but GPUs can compute hundreds of millions per second
  • Slow hash functions: Argon2, bcrypt, scrypt
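
The middle two steps of this progression can be sketched with the standard library. The word list below is a made-up stand-in for a real password dictionary, and the plain dictionary lookup stands in for a rainbow table (a more compact form of the same precomputation).

```python
import hashlib
import os

# Hypothetical mini dictionary; real attacks use lists with millions of entries
COMMON_PASSWORDS = ["123456", "password", "dragon", "letmein", "qwerty"]

# Precomputed once, then reusable against every leaked unsalted database
lookup = {hashlib.sha256(p.encode()).hexdigest(): p for p in COMMON_PASSWORDS}

leaked_unsalted = hashlib.sha256(b"dragon").hexdigest()
print(f"Unsalted lookup: {lookup.get(leaked_unsalted)}")   # dragon

# A random per-user salt makes the precomputed table useless
salt = os.urandom(16)
leaked_salted = hashlib.sha256(salt + b"dragon").hexdigest()
print(f"Salted lookup:   {lookup.get(leaked_salted)}")     # None

# But the salt is stored next to the hash, so guessing per user is still cheap
cracked = next(
    (p for p in COMMON_PASSWORDS
     if hashlib.sha256(salt + p.encode()).hexdigest() == leaked_salted),
    None,
)
print(f"Salted brute force: {cracked}")                    # dragon
```

Salting forces the attacker to redo the work for every account, but each guess is still a single fast SHA-256 call. Making the guess itself expensive is the job of a slow hash.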

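The design philosophy of slow hash functions (more precisely, password hashing functions, which are closely related to password-based key derivation functions, or KDFs): make each computation slow enough, and often memory-hungry enough, that brute force becomes impractical.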

Argon2 is the winner of the 2015 Password Hashing Competition, with three variants:

  • Argon2d: Resistant to GPU attacks (data-dependent memory access)
  • Argon2i: Resistant to side-channel attacks
  • Argon2id: Hybrid mode (recommended)

The fortress sentry passwords can't be stored in plaintext—but you also can't just do one SHA-256, because a GPU can exhaust all common passwords in seconds. Argon2 deliberately makes each calculation slow, slow enough that brute-force isn't worth it:

python
# Using the argon2-cffi library (pip install argon2-cffi)
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher(
    time_cost=3,       # Number of iterations
    memory_cost=65536, # Memory in KiB (64 MiB)
    parallelism=4,     # Parallel threads
    hash_len=32,       # Output hash length in bytes
    salt_len=16,       # Salt length in bytes
)

password = "correct-horse-battery-staple"
hash_str = ph.hash(password)
print(f"Argon2 hash:\n{hash_str}")

# Verify: a wrong password raises VerifyMismatchError
try:
    ph.verify(hash_str, "correct-horse-battery-staple")
    print("Verification passed")
except VerifyMismatchError:
    print("Verification failed")

The output format of hash_str includes all necessary parameters for storage and migration:

$argon2id$v=19$m=65536,t=3,p=4$<salt_base64>$<hash_base64>
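
Because the parameters travel inside the string, a migration script can inspect stored hashes without any side table. A minimal parsing sketch, assuming the field layout shown above with the salt and hash encoded as unpadded base64; the function name is hypothetical and the `example` value uses placeholder bytes, not a real Argon2 output.

```python
import base64

def parse_argon2_hash(encoded: str) -> dict:
    """Split an encoded Argon2 hash string into its parameters (hypothetical helper)."""
    _, variant, version, params, salt_b64, hash_b64 = encoded.split("$")
    fields = dict(item.split("=") for item in params.split(","))

    def decode_unpadded(text: str) -> bytes:
        # The encoded string omits base64 padding, so restore it before decoding
        return base64.b64decode(text + "=" * (-len(text) % 4))

    return {
        "variant": variant,
        "version": int(version.split("=")[1]),
        "memory_kib": int(fields["m"]),
        "time_cost": int(fields["t"]),
        "parallelism": int(fields["p"]),
        "salt": decode_unpadded(salt_b64),
        "hash": decode_unpadded(hash_b64),
    }

# Placeholder salt and hash bytes, for illustration only
example = "$argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHQ$aGFzaGhhc2g"
parsed = parse_argon2_hash(example)
print(parsed["variant"], parsed["memory_kib"], parsed["time_cost"], parsed["parallelism"])
```

A check like `parsed["memory_kib"] < 65536` is one way to flag hashes created with weaker settings so they can be rehashed at the user's next login.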

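Even if the database is leaked, each password guess against an Argon2 hash costs the attacker tens of milliseconds. Ten thousand guesses take hundreds of seconds, and hundreds of thousands take hours, and because every hash has its own salt that cost is paid again for each account. Against plain SHA-256, the same number of guesses finishes in well under a second.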
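
The arithmetic behind these estimates is short enough to write down. The per-guess costs below are assumptions for illustration (50 ms for Argon2, and a GPU doing 100 million SHA-256 guesses per second), not measurements.

```python
def attack_seconds(guesses: int, ms_per_guess: float) -> float:
    """Time to try `guesses` candidate passwords at a fixed cost per guess."""
    return guesses * ms_per_guess / 1000

ARGON2_MS_PER_GUESS = 50        # assumed: "tens of milliseconds"
SHA256_MS_PER_GUESS = 1e-5      # assumed: 100 million guesses per second

for guesses in (10_000, 500_000):
    slow = attack_seconds(guesses, ARGON2_MS_PER_GUESS)
    fast = attack_seconds(guesses, SHA256_MS_PER_GUESS)
    print(f"{guesses:>7} guesses: Argon2 {slow:,.0f} s ({slow / 3600:.1f} h), "
          f"SHA-256 {fast:.4f} s")
```

Raising `time_cost` or `memory_cost` scales the attacker's bill linearly, while a legitimate login still pays the price only once.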


Common Pitfalls

  • Using ECB mode. In ECB mode, identical plaintext blocks produce identical ciphertext blocks. In an image encryption example, ECB-encrypted images still show visible outlines. Don't use ECB.
  • Inventing your own encryption algorithm. Even experienced cryptographers often fail. Use standard libraries and audited implementations.
  • Hardcoding keys. Keys should come from environment variables, key management systems, or HSMs, not from code.
  • Using outdated algorithms. MD5 and SHA-1 have broken collision resistance; don't use them for security. RC4 and DES are fully broken.
  • Ignoring random number quality. Python's random module is not a cryptographically secure random number generator. Use os.urandom or the secrets module.
  • Using RSA to encrypt large data directly. 2048-bit RSA with OAEP (SHA-256) can encrypt at most 190 bytes per operation (245 bytes with the older PKCS#1 v1.5 padding). Use hybrid encryption.
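
The ECB pitfall at the top of this list can be reproduced without any third-party library. The sketch below assumes a toy 16-byte block cipher (a 4-round Feistel network with HMAC-SHA256 as the round function) standing in for AES; it is not AES and not meant for real use, but any block cipher run in ECB mode leaks patterns in the same way.

```python
import hashlib
import hmac

BLOCK = 16           # bytes, matching the AES block size
HALF = BLOCK // 2

def _round_fn(key: bytes, round_no: int, half: bytes) -> bytes:
    """Keyed round function: HMAC-SHA256 truncated to half a block."""
    return hmac.new(key, bytes([round_no]) + half, hashlib.sha256).digest()[:HALF]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round_fn(key, i, right))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round_fn(key, i, left)), left
    return left + right

def ecb_encrypt(key: bytes, data: bytes) -> list[bytes]:
    """ECB: every block is encrypted independently with the same key."""
    return [toy_encrypt_block(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

key = b"demo key, not for real use"
plaintext = b"ATTACK AT DAWN!!" * 3        # three identical 16-byte blocks

blocks = ecb_encrypt(key, plaintext)
for block in blocks:
    print(block.hex())                      # the same ciphertext block three times
print(f"Distinct ciphertext blocks: {len(set(blocks))}")   # 1

decrypted = b"".join(toy_decrypt_block(key, block) for block in blocks)
print(decrypted == plaintext)               # True
```

An eavesdropper learns that the message repeats itself without decrypting anything. Modes such as GCM mix a nonce and a counter into every block, so identical plaintext blocks no longer produce identical ciphertext.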

Pass Challenges

  • Warm-up: Encrypt a message with AES-GCM, then deliberately modify the last byte of the ciphertext and observe the decryption failure.
  • Challenge: Implement a demonstration with ECDH where "two parties exchange public keys on paper and can share a key." Derive the shared secret into an AES key, then use it to encrypt and decrypt a message.
  • Troubleshoot: You find an online service storing passwords as SHA256(password). Design an attack plan, then propose a technical plan to the team to migrate to Argon2.
  • Observe: Run openssl speed aes-128-gcm and openssl speed rsa2048 to compare the performance difference (order of magnitude) between symmetric and asymmetric encryption.

Traveler's Notes

  • Symmetric encryption (AES-GCM) protects data in transit but leaves the key distribution problem open
  • Asymmetric encryption (RSA/ECC) solves key distribution but is slow and capacity-limited
  • Real systems use hybrid encryption: asymmetric handshake, symmetric transmission
  • Hash functions provide integrity verification but cannot replace encryption
  • Store passwords with slow hashes (Argon2), not fast hashes or plaintext

Next Stop Preview

Now you have the lock of encryption. But how do you know that the public key in your hand really belongs to the other party? Next chapter, we enter PKI—the trust network of the digital world.

Built with VitePress | Software Systems Atlas