
Metadata Card

  • Prerequisites: Vol 9 Chapter 1 (generics/type systems), Vol 6 Software Engineering Basics (annotations, package management)
  • Estimated time: 45 minutes
  • Core difficulty: Advanced
  • Reading mode: High focus
  • Optional skip: Rust macro hygiene section can be skimmed
  • Completion mark: Can explain what "code that generates code" means; understand the core differences between Java annotation processing, Python metaclasses, and Rust macros

Your Progress

This is the last door of the ruins. You push it open and find the room empty except for a single mirror—reflecting not your face, but the code you wrote. A line is carved into the wall: "The most powerful tool is the tool that can modify itself."

Master Chen left a note in the corner: "I wrote a factory function that generates API interfaces in Python, a macro that generates parsers in Rust, and an annotation processor that handles database mapping in Java. They all do the same thing—code that writes code."

Your Task

Metaprogramming is "code that writes code." Your normal code operates on data; meta-code operates on code itself. From C macros (text replacement) to Rust macros (AST manipulation), from Python metaclasses to Java annotation processors—every metaprogramming approach answers the same question: can you make the compiler or runtime do the repetitive work for you?

Chapter Layers

  • Required: Core categories of metaprogramming, Java annotation processing, Rust declarative macros
  • Optional: Python metaclasses
  • Advanced: Rust procedural macros, internal DSL patterns

Breaking Ground · Tracing the Origin

You realize you've been writing the same kind of code repeatedly—every controller does parameter validation, calls a Service, logs. These structures are largely similar, just with different entity names. In the ruins' blueprints, this repetition is marked as "the call of metaprogramming":

java
public class UserController {
    public Response createUser(CreateUserRequest req) {
        // Parameter validation
        if (req.getName() == null || req.getName().isEmpty()) {
            throw new ValidationException("name is required");
        }
        // Conversion + Service call
        User user = new User();
        user.setName(req.getName());
        userService.create(user);
        // Logging
        logger.info("User created: {}", user.getId());
        return Response.ok(user);
    }
}

public class OrderController {
    public Response createOrder(CreateOrderRequest req) {
        // Almost the same as above, different object names
        if (req.getProductId() == null) {
            throw new ValidationException("productId is required");
        }
        Order order = new Order();
        order.setProductId(req.getProductId());
        orderService.create(order);
        logger.info("Order created: {}", order.getId());
        return Response.ok(order);
    }
}

You can copy-paste manually and pray you don't miss updating one spot when changing another. Or, you can let code generate this code for you.
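As a first taste of the idea, here is a minimal Python sketch in the spirit of the factory function Master Chen mentions. The names `make_create_handler` and `ValidationError`, and the dict-based request, are hypothetical and chosen only for illustration; this is not a translation of the Java controllers above.

```python
class ValidationError(Exception):
    """Raised when a required request field is missing (hypothetical)."""


def make_create_handler(entity_name, required_field, store):
    """Build a 'create' handler for one entity type.

    The validate -> convert -> save -> log skeleton is written once here;
    only the entity name and the required field vary per handler.
    """
    def handler(request):
        value = request.get(required_field)
        if value is None or value == "":
            raise ValidationError(f"{required_field} is required")
        entity = {"id": len(store) + 1, required_field: value}
        store.append(entity)
        print(f"{entity_name} created: {entity['id']}")
        return entity

    handler.__name__ = f"create_{entity_name.lower()}"
    return handler


users, orders = [], []
create_user = make_create_handler("User", "name", users)
create_order = make_create_handler("Order", "productId", orders)

create_user({"name": "Alice"})      # prints "User created: 1"
create_order({"productId": 42})     # prints "Order created: 1"
```

The skeleton now lives in one place, so changing the logging or validation rule changes every generated handler at once.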


Three Levels of Metaprogramming

| Level | Timing | Representative |
| --- | --- | --- |
| Compile-time code generation | At compile time | Java annotation processors, Rust macros |
| Runtime code modification | At runtime | Python metaclasses, Ruby method_missing |
| DSL (Domain-Specific Language) | Abstraction layer | Internal DSL (Java Builder), External DSL (DSL parser) |

Rust Macros: Operating on the AST at Compile Time

Rust has two types of macros: declarative macros (macro_rules!) and procedural macros. Both run at compile time and work on structured tokens and syntax fragments rather than raw text. Here is a simplified version of what the standard library's vec! macro does:

rust
// Define a macro: quickly create a vector
macro_rules! vec {
    ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}

// Usage
fn main() {
    let v = vec![1, 2, 3];  // Expands to temp_vec.push(1); temp_vec.push(2); temp_vec.push(3);
    assert_eq!(v, [1, 2, 3]);
}

This macro matches the pattern $( $x:expr ),*, meaning "zero or more comma-separated expressions." For each matched expression, the expansion emits one temp_vec.push($x); statement.

C macros vs Rust macros:

c
#define SQUARE(x) x * x
// SQUARE(1+2) expands to 1+2*1+2 = 5 — not what you wanted!

C macros operate at the text level. Rust macros operate at the Abstract Syntax Tree (AST) level: an argument captured as $x:expr is parsed into a single expression node before it is substituted. 1+2 therefore stays one unit in the expansion and won't be split, so a hypothetical square!(1+2) that expands to $x * $x evaluates to 9, not 5.
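The gap between the two substitution levels can be reproduced in Python, used here only as a neutral playground. The sketch below substitutes the argument `1+2` into the template `x * x` twice: once as text, the way the C preprocessor does, and once as a syntax-tree node via the standard `ast` module. The `SubstituteX` class is a hypothetical helper; this is an analogy for the behaviour described above, not how rustc implements macros.

```python
import ast
import copy

TEMPLATE = "x * x"
ARGUMENT = "1+2"

# Text-level substitution: paste the characters in, then parse.
text_expanded = TEMPLATE.replace("x", ARGUMENT)   # "1+2 * 1+2"
text_result = eval(text_expanded)                 # 1 + (2 * 1) + 2


class SubstituteX(ast.NodeTransformer):
    """Replace every reference to the name `x` with a whole expression node."""

    def __init__(self, replacement):
        self.replacement = replacement

    def visit_Name(self, node):
        if node.id == "x":
            # Copy so each use of `x` gets its own subtree.
            return copy.deepcopy(self.replacement)
        return node


# Tree-level substitution: parse first, then swap nodes.
template_tree = ast.parse(TEMPLATE, mode="eval")
argument_node = ast.parse(ARGUMENT, mode="eval").body   # one BinOp node for 1+2
expanded_tree = SubstituteX(argument_node).visit(template_tree)
ast.fix_missing_locations(expanded_tree)
tree_result = eval(compile(expanded_tree, "<expanded>", "eval"))

print(text_result)   # 5
print(tree_result)   # 9
```

Because the argument is already a node when it is inserted, operator precedence in the template can no longer tear it apart.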

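A related but separate guarantee is "hygiene": identifiers introduced inside a macro_rules! expansion, such as temp_vec above, live in their own syntax context, so they neither clash with nor accidentally capture same-named local variables at the call site.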

Procedural macros go further—the macro itself is a Rust program that receives a TokenStream and outputs a TokenStream:

rust
// One of the most commonly used procedural macros: derive macro
#[derive(Debug, Clone)]
struct User {
    name: String,
    age: u32,
}

#[derive(Debug)] tells the compiler to generate the implementation of the Debug trait automatically. You can also write your own derive macros:

rust
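// Simplified illustration; must live in a crate with `proc-macro = true` in Cargo.toml.
// In practice you'd usually parse the input with `syn` and generate code with `quote`.
use proc_macro::TokenStream;

#[proc_macro_derive(MyBuilder)]
pub fn my_builder_derive(input: TokenStream) -> TokenStream {
    // Parse input struct -> generate Builder pattern code -> output as TokenStream
    let _ = input;      // placeholder: a real macro would inspect the struct here
    TokenStream::new()  // placeholder: emits no additional code
}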

Rust's procedural macros give you full control over code generation at compile time: no runtime reflection and no runtime overhead, at the cost of longer compile times.

Why did Rust choose macros over annotation processors? Because the macro system is built into the language, code generation is tied directly to the syntax tree, and no separately registered processor is needed. The cost is that writing macros has a steep learning curve.


Java Annotation Processors: Compile-Time Code Generation

Java's metaprogramming solution is annotation processors: not a language feature, but a plugin point in the javac compilation process. You put a @Builder annotation on a class, and the compiler calls your processor during compilation to generate the complete Builder pattern code:

java
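// Builder.java: define the annotation
package com.example;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.SOURCE)  // Discarded after compilation; never written to .class files
@Target(ElementType.TYPE)
public @interface Builder {
    // Marker for classes needing Builder pattern generation
}

// BuilderProcessor.java: define the annotation processor
package com.example;

import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;

@SupportedAnnotationTypes("com.example.Builder")
@SupportedSourceVersion(SourceVersion.RELEASE_21)
public class BuilderProcessor extends AbstractProcessor {
    @Override
    public boolean process(Set<? extends TypeElement> annotations,
                          RoundEnvironment roundEnv) {
        for (Element element : roundEnv.getElementsAnnotatedWith(Builder.class)) {
            // Read class info, generate Builder source code
            // (the cast is safe because @Builder targets types only)
            generateBuilder((TypeElement) element);
        }
        return true;  // Claim @Builder so later processors skip it
    }

    private void generateBuilder(TypeElement element) {
        // Use Filer (processingEnv.getFiler()) to create a new .java file
        // Write generated Builder code
    }
}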

How does it work?

  1. javac discovers the processor (via the -processor option or a META-INF/services entry) and finds the classes annotated with @Builder
  2. Calls BuilderProcessor.process()
  3. Processor reads the annotated class's metadata (fields, types)
  4. Writes a new .java file via the Filer API
  5. javac compiles the newly generated file in a follow-up processing round
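The five steps above can be mimicked in a few lines of Python to make the pipeline concrete. This is only an analogy, not the javac mechanism: the `builder` decorator stands in for the @Builder annotation, `generate_builder_source` plays the processor, and `exec(compile(...))` stands in for the follow-up compilation round. All of these names are hypothetical.

```python
from dataclasses import dataclass, fields

BUILDER_TARGETS = []   # step 1: classes "annotated" with @builder


def builder(cls):
    """Marker decorator, playing the role of the @Builder annotation."""
    BUILDER_TARGETS.append(cls)
    return cls


def generate_builder_source(cls):
    """Steps 3-4: read field metadata and emit Builder source as text."""
    lines = [f"class {cls.__name__}Builder:",
             "    def __init__(self):",
             "        self._values = {}"]
    for field in fields(cls):
        lines += [f"    def {field.name}(self, value):",
                  f"        self._values[{field.name!r}] = value",
                  "        return self"]
    lines += ["    def build(self):",
              f"        return {cls.__name__}(**self._values)"]
    return "\n".join(lines) + "\n"


@builder
@dataclass
class User:
    name: str
    age: int


# Steps 2 and 5: run the generator for every marked class,
# then compile the generated source text into a real class.
generated = {}
for target in BUILDER_TARGETS:
    source = generate_builder_source(target)
    namespace = {target.__name__: target}
    exec(compile(source, f"<generated {target.__name__}Builder>", "exec"), namespace)
    generated[target.__name__] = namespace[f"{target.__name__}Builder"]

user = generated["User"]().name("Alice").age(30).build()
print(user)   # User(name='Alice', age=30)
```

As with a Java processor, the intermediate artifact is plain source text, so you can print `generate_builder_source(User)` and read exactly what was generated.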

How is this different from Rust macros?

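| | Rust Macros | Java Annotation Processors |
| --- | --- | --- |
| Input | TokenStream/AST | TypeElement model of the source being compiled |
| Output | TokenStream | Complete .java file |
| Timing | Macro expansion during compilation | Annotation processing rounds during compilation |
| Reflection | Not needed | Needs the compile-time mirror API (javax.lang.model) to read structure |
| Code style | Generation logic inside macro | Generator class + output .java |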

Java's approach is more "heavyweight"—it needs an external processor and registration with javac. But it's also easier to debug: what's generated is readable Java source code.

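Classic example: Project Lombok. @Data, @Getter, and @Setter are handled by an annotation processor, although Lombok goes beyond the standard API: instead of writing new source files, it modifies the annotated class's syntax tree inside the compiler. You can write your own simplified @Builder to understand the standard process.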

The basic mechanism of annotation processors (the JSR 269 API, part of the platform since Java 6) is unchanged in Java 21. Understanding it pays off because MapStruct and Lombok are built on it, and Spring Boot uses it for tasks such as generating configuration metadata.


Python Metaclasses: Runtime Class Factories

Python's metaprogramming is completely different from Java/Rust—it happens at runtime, not at compile time.

In Python, classes themselves are objects. When you write class User:, Python calls type() to create this class object. A metaclass is a "class that creates classes"—you can customize the class creation process.

python
# A simple metaclass: automatically adds a created_at field to classes
class TimestampMeta(type):
    def __new__(cls, name, bases, attrs):
        # Automatically add a field when creating the class
        attrs['created_at'] = None
        return super().__new__(cls, name, bases, attrs)

# Using the metaclass
class User(metaclass=TimestampMeta):
    def __init__(self, name):
        self.name = name

u = User("Alice")
print(u.created_at)  # None — automatically added by metaclass

How does this work? type.__new__ is a class factory—it receives the class name, base classes, and attribute dictionary, and returns a new class object. A custom metaclass overrides this factory process, modifying the attribute dictionary before the class is created.
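To see the factory directly, the sketch below builds the same kind of class twice: once by calling `type(name, bases, attrs)` by hand and once with an ordinary class statement. The helper functions `init` and `describe` and the names `UserA` and `UserB` are illustrative only.

```python
def init(self, name):
    self.name = name


def describe(self):
    return f"{self.name} ({type(self).__name__})"


# type(name, bases, attrs) builds the class object directly.
UserA = type("User", (object,), {"__init__": init, "describe": describe})


# A class statement hands the same three pieces (name, bases, attribute
# dictionary) to type, or to the declared metaclass if there is one.
class UserB:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} ({type(self).__name__})"


print(UserA("Alice").describe())                  # Alice (User)
print(type(UserA) is type, type(UserB) is type)   # True True
```

A metaclass is simply a subclass of `type` that gets to see, and change, those three arguments before the class object is built.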

Comparison with Java annotation processing:

Python's metaclasses are runtime class factories: before the class object exists, the metaclass intercepts its creation, modifies it, and lets it through. No external processor is needed and nothing has to be recompiled, but the metaclass runs as soon as the class statement executes, which usually means when the module is first imported:

python
# Python metaclass: runtime, intercepts class creation
# You don't need an external processor, no recompilation needed
# But: runtime performance overhead, IDE can't easily infer metaclass-generated members

class ApiEndpointMeta(type):
    def __new__(cls, name, bases, attrs):
        # Scan all methods, automatically register routes for methods starting with 'api_'
        for method_name, method in attrs.items():
            if method_name.startswith('api_'):
                route = "/" + method_name[4:]  # api_users -> /users
                # Can register the route here
                print(f"Registering route: {route}")
        return super().__new__(cls, name, bases, attrs)

class MyAPI(metaclass=ApiEndpointMeta):
    def api_users(self): pass
    def api_orders(self): pass
    def helper(self): pass  # Not registered

Metaclass pitfalls: The metaclass executes when the class is created, not when instances are created. If you do IO operations in the metaclass, importing the module that defines the class will trigger them. Additionally, metaclasses are inherited: a class's metaclass and its parents' metaclasses may conflict. Python resolves this by picking the most derived metaclass among the candidates, and raises a TypeError ("metaclass conflict") if no candidate is a subclass of all the others.
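The conflict case is easy to reproduce. In this minimal sketch, `MetaA`, `MetaB`, and the classes built from them are hypothetical names. It shows the TypeError and one common way out, which is to define a metaclass that derives from both.

```python
class MetaA(type):
    pass


class MetaB(type):
    pass


class A(metaclass=MetaA):
    pass


class B(metaclass=MetaB):
    pass


# Neither MetaA nor MetaB is a subclass of the other, so Python cannot
# pick a single most derived metaclass for a class inheriting from both.
try:
    class Broken(A, B):
        pass
except TypeError as exc:
    conflict_message = str(exc)
    print("TypeError:", conflict_message)


# One common fix: a metaclass that derives from both, named explicitly.
class MetaAB(MetaA, MetaB):
    pass


class Works(A, B, metaclass=MetaAB):
    pass


print(type(Works).__name__)   # MetaAB
```

The fix only works if the two metaclasses cooperate, for example by both calling `super().__new__`; combining metaclasses from unrelated libraries can still go wrong.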


Internal DSLs: Writing Domain-Specific Languages in the Host Language

A DSL (Domain-Specific Language) is a "small language" you design for a specific problem domain. SQL is a DSL (querying), Regex is a DSL (matching), Makefiles are a DSL (building).

An internal DSL means using the host language's syntax to construct a natural, fluent domain expression. Java's Builder pattern is actually an internal DSL:

java
// Internal DSL style
Pizza pizza = Pizza.builder()
    .size(Size.LARGE)
    .crust(Crust.THIN)
    .addTopping(Topping.CHEESE)
    .addTopping(Topping.PEPPERONI)
    .build();

These size(), crust(), addTopping() methods are just regular Java methods returning this—but the chain of calls gives the feeling of "describing a pizza in Java."

A more sophisticated internal DSL (e.g., jOOQ, which lets you write SQL in Java):

java
create.selectFrom(BOOK)
      .where(BOOK.PUBLISHED_IN.eq(2009))
      .and(BOOK.TITLE.like("%Java%"))
      .fetch();

This isn't string SQL; it's a type-safe DSL. BOOK and its column fields such as BOOK.PUBLISHED_IN come from classes that jOOQ's code generator produces from the database schema. You get auto-completion in your IDE, and misspelled field names won't compile.
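To show how little machinery such a DSL needs, here is a toy query builder in Python. It is a sketch of the general pattern (column objects that produce condition objects, and methods that return self), not jOOQ's API. Every name in it (`Column`, `Condition`, `Query`, `select_from`, `BookTable`) is hypothetical, and `BookTable` is written by hand where a real tool would generate it from the schema.

```python
class Condition:
    def __init__(self, sql, params):
        self.sql, self.params = sql, params

    def __and__(self, other):
        return Condition(f"({self.sql} AND {other.sql})", self.params + other.params)


class Column:
    def __init__(self, table, name):
        self.qualified = f"{table}.{name}"

    def eq(self, value):
        return Condition(f"{self.qualified} = ?", [value])

    def like(self, pattern):
        return Condition(f"{self.qualified} LIKE ?", [pattern])


class BookTable:
    """Stand-in for a class that a generator would derive from the schema."""
    name = "book"
    PUBLISHED_IN = Column("book", "published_in")
    TITLE = Column("book", "title")


BOOK = BookTable()


class Query:
    def __init__(self, table):
        self.table, self.condition = table, None

    def where(self, condition):
        self.condition = condition
        return self   # returning self is what makes the chain fluent

    def and_(self, condition):
        self.condition = self.condition & condition
        return self

    def render(self):
        sql = f"SELECT * FROM {self.table.name} WHERE {self.condition.sql}"
        return sql, self.condition.params


def select_from(table):
    return Query(table)


sql, params = (select_from(BOOK)
               .where(BOOK.PUBLISHED_IN.eq(2009))
               .and_(BOOK.TITLE.like("%Java%"))
               .render())
print(sql)      # SELECT * FROM book WHERE (book.published_in = ? AND book.title LIKE ?)
print(params)   # [2009, '%Java%']
```

One difference from the Java version is worth noting: a misspelled column such as `BOOK.TITEL` fails here with an AttributeError at runtime, whereas a statically typed host language rejects it at compile time.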

Internal DSL vs External DSL:

| | Internal DSL | External DSL |
| --- | --- | --- |
| Implementation | Leverage host language syntax | Custom parser |
| Tool support | Host IDE support | Needs independent toolchain |
| Flexibility | Limited by host syntax | Fully customizable |
| Learning cost | Know the host language | Need to learn another language |
| Examples | jOOQ, AssertJ, Spock | SQL, YAML, GraphQL |

The philosophy of internal DSLs: don't create an entirely new language; push your host language to its limits until it naturally expresses your domain concepts.


Three Approaches Compared

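| | Rust Macros | Java Annotation Processors | Python Metaclasses |
| --- | --- | --- | --- |
| Timing | Compile time | Compile time | Runtime (when the class statement runs, typically at first import) |
| Language built-in | Yes (built-in macro system) | No (stdlib + compiler hooks) | Yes (type system) |
| Operates on | Tokens and AST nodes | TypeElement model of the source | Runtime class object structure |
| Performance overhead | None at runtime (expanded at compile time) | None at runtime (hardcoded in generated source) | Yes (runs once each time a class is created) |
| Learning curve | Steep | Medium | Medium |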

Common Pitfalls

  1. "Metaprogramming = magic = use everywhere indiscriminately" — Metaprogramming increases comprehension cost. Every layer of abstraction makes debugging more complex. If a problem can be solved with regular code, don't use macros/metaclasses.
  2. "Annotation processors can only generate glue code" — Correct, but that's also their sweet spot. Lombok generates getters/setters, MapStruct generates mappers—this is the annotation processor's happy place.
  3. "Metaclasses can replace inheritance" — They can, but they shouldn't. Metaclasses modify the class creation process, while inheritance modifies class behavior. They have different goals.

Pass Challenges

  • Warm-up: In Java, write a @LogExecutionTime annotation (just declare it). Think about what information an annotation processor would need to implement it.
  • Hands-on: In Rust, use macro_rules! to write a hashmap! macro that lets you write hashmap!("key" => "value") instead of explicit HashMap insertion.
  • Observe: In Python, define a metaclass that prints name and attrs in __new__, then define a class using that metaclass—observe the metaclass's execution timing.

Traveler's Notes

Metaprogramming is manipulating the language from within the language: not just using the compiler, but becoming part of it. But it's a double-edged sword: every piece of generated code carries a promise about readability, a promise that "this black box does what it should."

Next Stop Preview

You've walked through all six chambers of the Language Ruins. Types, memory, concurrency, functions, bytecode, metaprogramming—behind every door is a directional choice a language could make. This knowledge isn't tied to any single language; it lets you, when learning a new language, see at a glance: "Ah, you chose a different path at this fork from Java."

Ahead lies the Mathematics Tower. The curator is waiting there, holding a glowing book.
