Chapter 1: Data Encoding — The Rune System of Computers

Vol 3: Computer Core Expedition


Metadata Card

Attribute:   Value
Difficulty:  Foundation
Keywords:    Binary, Hexadecimal, ASCII, Unicode, UTF-8, Endianness

"Welcome to the Computer Core. The journey begins with the most fundamental question: how does a computer represent information? Everything — numbers, text, images, instructions — is just bits."


Encounter: From Bits to Bytes to Words

Binary and Hexadecimal

Computers use binary (base 2) internally. Humans use hexadecimal (base 16) as shorthand because each hex digit stands for exactly 4 bits, so one 8-bit byte is always two hex digits.

Binary:    0000 0001 0010 0011 ... 1111
Decimal:   0    1    2    3    ... 15
Hex:       0    1    2    3    ... F
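
To make the 4-bits-per-hex-digit mapping concrete, here is a minimal Python sketch. The sample value and all variable names are illustrative choices, not part of the original chapter.

```python
# Convert one value between decimal, binary, and hexadecimal text forms.
value = 0b1010_1111  # 175 in decimal (sample value chosen for illustration)

binary_text = format(value, "08b")   # zero-padded to 8 binary digits
hex_text = format(value, "02X")      # zero-padded to 2 uppercase hex digits

# Each hex digit corresponds to one 4-bit group (a nibble).
nibbles = [binary_text[i:i + 4] for i in range(0, len(binary_text), 4)]
hex_from_nibbles = "".join(format(int(n, 2), "X") for n in nibbles)

# Parsing goes the other way with int(text, base).
assert int(binary_text, 2) == int(hex_text, 16) == value

print(value, binary_text, hex_text, nibbles, hex_from_nibbles)
```

Converting nibble by nibble gives the same result as converting the whole number, which is why hex is a convenient way to read raw bytes.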

ASCII

ASCII is a 7-bit encoding (128 codes, 0 to 127) for English letters, digits, punctuation, and control characters. 'A' = 65, 'a' = 97, '0' = 48.
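
The codes above can be checked directly in Python. This short sketch uses only the built-ins `ord` and `chr`; the variable names are illustrative.

```python
# ord() maps a character to its code; chr() maps a code back to a character.
codes = {ch: ord(ch) for ch in "Aa0"}

# Upper and lower case letters differ only in one bit (value 32, or 0x20).
lower_from_upper = chr(ord("A") | 0x20)

# A digit character's numeric value is its code minus the code of '0'.
digit_value = ord("7") - ord("0")

# Every ASCII code fits in 7 bits, so it is always below 128.
all_seven_bit = all(ord(ch) < 128 for ch in "Hello, World!")

print(codes, lower_from_upper, digit_value, all_seven_bit)
```

The regular layout (letters and digits in contiguous runs) is what makes tricks like case conversion by a single bit possible.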

Unicode and UTF-8

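Unicode assigns every character a code point (U+0000 to U+10FFFF). UTF-8 encodes each code point into 1-4 bytes using a prefix scheme: the high bits of the first byte tell the decoder how many bytes the sequence has, and every continuation byte starts with 10, so a decoder can always find character boundaries. The x positions carry the code point's bits. (The surrogate range U+D800-U+DFFF is reserved for UTF-16 and is never encoded in UTF-8.)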

  • 1 byte: 0xxxxxxx (ASCII, U+0000-U+007F)
  • 2 bytes: 110xxxxx 10xxxxxx (U+0080-U+07FF)
  • 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (U+0800-U+FFFF)
  • 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10000-U+10FFFF)
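
The four byte patterns listed above can be turned into a small encoder. This is a teaching sketch, not how production code should encode text (Python's built-in `str.encode("utf-8")` does that). The function name `utf8_encode` is hypothetical and introduced here only for illustration.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point with the 1-4 byte prefix scheme."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode scalar value")
    if code_point <= 0x7F:            # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:           # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:          # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# One sample character per length class: A, e-acute, euro sign, grinning face.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = utf8_encode(ord(ch))
    assert encoded == ch.encode("utf-8")   # cross-check against the built-in
    print(f"U+{ord(ch):04X} -> {encoded.hex(' ')}")
```

The loop cross-checks the hand-written encoder against Python's own UTF-8 codec for one character in each length class.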

Endianness

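  • Little-endian: Least significant byte first, at the lowest address (x86). The 32-bit value 0x12345678 is stored as 78 56 34 12.
  • Big-endian: Most significant byte first (network protocols, hence "network byte order"). The same value is stored as 12 34 56 78.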
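
A short Python sketch of both byte orders and of the practical impact of mixing them up. The value 0x12345678 is an arbitrary sample, and the variable names are illustrative.

```python
import struct
import sys

value = 0x12345678  # arbitrary 32-bit sample value

little = value.to_bytes(4, byteorder="little")   # least significant byte first
big = value.to_bytes(4, byteorder="big")         # most significant byte first

# struct uses "<" for little-endian and "!" (or ">") for network/big-endian.
assert struct.pack("<I", value) == little
assert struct.pack("!I", value) == big

# Reading bytes with the wrong byte order silently yields a different number.
misread = int.from_bytes(big, byteorder="little")

print("host byte order:", sys.byteorder)
print("little:", little.hex(" "))
print("big:   ", big.hex(" "))
print("big-endian bytes misread as little-endian:", hex(misread))
```

No error is raised when the byte order is wrong, which is why binary file formats and network protocols must state their byte order explicitly.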

Verification Checklist

  • [ ] Can convert between binary, hex, and decimal
  • [ ] Can explain UTF-8 encoding scheme
  • [ ] Can explain endianness and its practical impact

Traveler's Notes

  • All data is just bits. The meaning depends on interpretation.
  • UTF-8 is the dominant encoding on the web — backward compatible with ASCII.

Next Stop Preview

Chapter 2: Integers and Floats — The Secrets of Numbers
