Chapter 1: Data Encoding — The Rune System of Computers
Vol 3: Computer Core Expedition
Metadata Card
| Attribute | Value |
|---|---|
| Difficulty | (Foundation) |
| Keywords | Binary, Hexadecimal, ASCII, Unicode, UTF-8, Endianness |
Your Progress
"Welcome to the Computer Core. The journey begins with the most fundamental question: how does a computer represent information? Everything — numbers, text, images, instructions — is just bits."
Encounter: From Bits to Bytes to Words
Binary and Hexadecimal
Computers use binary (base 2) internally. Humans use hexadecimal (base 16) as a shorthand, because each hex digit stands for exactly 4 bits, so one byte is always two hex digits.
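The 4-bits-per-hex-digit correspondence can be checked directly. The following is a minimal sketch using Python's built-in `format` and `int`; the sample value `0b1010_1111` is an arbitrary choice for illustration.

```python
# Convert one value between decimal, binary, and hexadecimal.
value = 0b1010_1111  # binary literal; underscores are only visual separators

as_binary = format(value, "08b")  # zero-padded to 8 bits
as_hex = format(value, "02X")     # zero-padded to 2 hex digits

# Split the binary string into 4-bit groups (nibbles); each maps to one hex digit.
nibbles = [as_binary[i:i + 4] for i in range(0, len(as_binary), 4)]
hex_digits = [format(int(n, 2), "X") for n in nibbles]

print(value, as_binary, as_hex)  # 175 10101111 AF
print(nibbles, hex_digits)       # ['1010', '1111'] ['A', 'F']

# Parsing goes the other way with int(text, base).
assert int("AF", 16) == int("10101111", 2) == 175
```

Because the grouping is purely positional, converting between binary and hex never requires arithmetic on the whole number, only a lookup per nibble.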
Binary:  0000 0001 0010 0011 ... 1111
Decimal: 0    1    2    3    ... 15
Hex:     0    1    2    3    ... F
ASCII
ASCII is a 7-bit encoding with 128 codes covering English letters, digits, punctuation, and control characters. For example, 'A' = 65, 'a' = 97, '0' = 48.
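These code values can be inspected with Python's built-in `ord` and `chr`. The sketch below also shows two consequences of how the codes are laid out; `toggle_case` and `digit_value` are hypothetical helper names introduced only for this illustration.

```python
# ord() gives a character's code; chr() converts a code back to a character.
assert ord("A") == 65 and ord("a") == 97 and ord("0") == 48
assert chr(65) == "A"


def toggle_case(ch: str) -> str:
    """Flip the case of one ASCII letter (hypothetical helper).

    'A' (65) and 'a' (97) differ by 32, i.e. only in the bit with value 0x20,
    so XOR-ing that bit switches between upper and lower case.
    """
    if not (ch.isascii() and ch.isalpha()):
        return ch  # leave digits, punctuation, and non-ASCII untouched
    return chr(ord(ch) ^ 0x20)


def digit_value(ch: str) -> int:
    """Numeric value of an ASCII digit: its code minus the code of '0'."""
    return ord(ch) - ord("0")


print(toggle_case("A"), toggle_case("z"), digit_value("7"))  # a Z 7

# Every ASCII code fits in 7 bits, so it is always below 128.
print(all(ord(c) < 128 for c in "Hello, world!"))  # True
```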
Unicode and UTF-8
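Unicode assigns every character a code point in the range U+0000 to U+10FFFF. UTF-8 encodes each code point into 1 to 4 bytes using a prefix scheme: the leading bits of the first byte give the length of the sequence, and every continuation byte starts with 10. The x positions hold the bits of the code point: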
- 1 byte: 0xxxxxxx (ASCII, U+0000-U+007F)
- 2 bytes: 110xxxxx 10xxxxxx (U+0080-U+07FF)
- 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (U+0800-U+FFFF)
- 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10000-U+10FFFF)
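The four bit patterns above can be turned into a small encoder. This is a teaching sketch, not how one would encode text in practice (Python's built-in `str.encode("utf-8")` does that); `utf8_encode` is a hypothetical function name, and the surrogate check is an extra validity rule from the Unicode standard that the list does not mention.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point by hand, following the prefix patterns."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogates are reserved for UTF-16 and are not valid in UTF-8.
        raise ValueError("surrogate code points cannot be encoded")

    cont = 0b10000000  # prefix of every continuation byte
    low6 = 0b00111111  # mask selecting 6 payload bits

    if code_point <= 0x7F:    # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:   # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      cont | (code_point & low6)])
    if code_point <= 0xFFFF:  # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      cont | ((code_point >> 6) & low6),
                      cont | (code_point & low6)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0b11110000 | (code_point >> 18),
                  cont | ((code_point >> 12) & low6),
                  cont | ((code_point >> 6) & low6),
                  cont | (code_point & low6)])


# One sample per length: U+0041, U+00E9, U+20AC, U+1F600.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")  # agrees with the built-in encoder
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")
# U+0041 -> 41
# U+00E9 -> c3 a9
# U+20AC -> e2 82 ac
# U+1F600 -> f0 9f 98 80
```

Note that the first byte of a multi-byte sequence never starts with 10, so a decoder that lands in the middle of a stream can skip continuation bytes until it finds the start of the next character.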
Endianness
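Endianness is the order in which the bytes of a multi-byte value are stored in memory or sent over a wire. For the 32-bit value 0x12345678:
- Little-endian: least significant byte first, stored as 78 56 34 12 (x86)
- Big-endian: most significant byte first, stored as 12 34 56 78 (network protocols)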
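The two byte orders, and what happens when they are mixed up, can be sketched with Python's `int.to_bytes`, `int.from_bytes`, and the standard `struct` module. The value `0x12345678` is an arbitrary sample chosen so that each byte is easy to recognize.

```python
import struct
import sys

value = 0x12345678

little = value.to_bytes(4, byteorder="little")
big = value.to_bytes(4, byteorder="big")

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# struct format prefixes: "<" little-endian, ">" big-endian, "!" network order.
assert struct.pack("<I", value) == little
assert struct.pack(">I", value) == big
assert struct.pack("!I", value) == big  # network order is big-endian

# Practical impact: reading big-endian bytes as little-endian gives a wrong number.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))  # 0x78563412

# Byte order of the machine running this script ("little" or "big").
print(sys.byteorder)
```

This is why binary file formats and network protocols state their byte order explicitly, and why code that parses them should name the order rather than rely on the host's default.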
Verification Checklist
- [ ] Can convert between binary, hex, and decimal
- [ ] Can explain UTF-8 encoding scheme
- [ ] Can explain endianness and its practical impact
Traveler's Notes
- All data is just bits. The meaning depends on interpretation.
- UTF-8 is the dominant encoding on the web, and it is backward compatible with ASCII: every valid ASCII file is already valid UTF-8.
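Both notes can be seen in one small experiment: the same four bytes read as text and as integers. This is an illustrative sketch; the byte values are an arbitrary sample.

```python
raw = bytes([0x41, 0x42, 0x43, 0x44])  # the same four bytes throughout

# Interpreted as text: ASCII and UTF-8 agree, since all bytes are below 0x80.
as_ascii = raw.decode("ascii")
as_utf8 = raw.decode("utf-8")
print(as_ascii, as_utf8)  # ABCD ABCD

# Interpreted as a 32-bit unsigned integer, the result depends on byte order.
as_big = int.from_bytes(raw, byteorder="big")
as_little = int.from_bytes(raw, byteorder="little")
print(hex(as_big), hex(as_little))  # 0x41424344 0x44434241
```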
→ Next Stop Preview
Chapter 2: Integers and Floats — The Secrets of Numbers