Garbage Collection¶
This document describes Lona's garbage collection system: a per-process generational copying collector inspired by BEAM, adapted for seL4's capability-based memory model.
Note: All code examples in this document are pseudocode for illustration purposes.
Overview¶
Lona uses a per-process generational semi-space copying collector following BEAM's proven design. Each Lonala process has its own isolated heap, enabling independent garbage collection with no global pauses.
Key Properties¶
| Property | Description |
|---|---|
| Per-process | Each process GCs independently; no stop-the-world pauses |
| Generational | Two generations: young (frequently collected) and old (rarely collected) |
| Copying | Live objects are copied to new space; dead objects are implicitly reclaimed |
| Semi-space | Young heap uses two spaces (from/to) for efficient copying |
| Soft real-time | GC pauses are per-process, typically 10-100 µs |
| Immutable heap | All heap objects are immutable after allocation (no write barriers needed) |
Immutability Guarantee¶
Critical invariant: All heap objects in Lona are immutable after allocation. This is fundamental to the GC design.
WHY IMMUTABILITY MATTERS FOR GENERATIONAL GC
════════════════════════════════════════════════════════════════════════════════
Problem with mutable objects:
- Old object A promoted to old heap
- Later, A is mutated to point to young object B
- Minor GC runs, only scanning young heap
- A→B reference is NOT scanned (A is in old heap)
- B appears dead, is not copied
- A now has dangling pointer → CRASH
Solutions:
1. Write barrier + remembered set (complex, runtime overhead)
2. Immutable objects (no old→young references can be created)
Lona chooses #2: ALL HEAP OBJECTS ARE IMMUTABLE.
This is enforced by:
- Lonala's functional semantics (persistent data structures)
- No mutation primitives in the language
- "Updates" create new objects via structural sharing
This immutability guarantee eliminates the need for write barriers or remembered sets, simplifying the GC and improving performance.
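As a concrete illustration (a runnable sketch, not Lona's actual representation — Rust's `Rc` stands in for tagged heap pointers), an "update" allocates one new cell that shares the old tail:

```rust
use std::rc::Rc;

// A minimal persistent list: cells are never mutated after construction.
#[derive(Debug, PartialEq)]
enum List {
    Nil,
    Cons(i64, Rc<List>),
}
use List::{Cons, Nil};

// "Update" the head of a list: allocate one new cell, share the old tail.
fn replace_head(list: &List, new_head: i64) -> List {
    match list {
        Cons(_, tail) => Cons(new_head, Rc::clone(tail)),
        Nil => Cons(new_head, Rc::new(Nil)),
    }
}

fn shares_tail(a: &List, b: &List) -> bool {
    match (a, b) {
        (Cons(_, ta), Cons(_, tb)) => Rc::ptr_eq(ta, tb),
        _ => false,
    }
}

fn main() {
    let original = Cons(1, Rc::new(Cons(2, Rc::new(Cons(3, Rc::new(Nil))))));
    let updated = replace_head(&original, 9); // (9 2 3), no mutation anywhere
    // Both lists share one tail allocation - structural sharing. Because no
    // cell is writable after construction, an old object can never be
    // rewritten to point at a younger one.
    assert!(shares_tail(&original, &updated));
}
```

Since the old list is never written, promoting its cells to the old heap can never create an old→young reference.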
Why Per-Process GC?¶
SYSTEM-WIDE VIEW
════════════════════════════════════════════════════════════════════════════════
Process A Process B Process C
│ │ │
│ GC in progress │ executing │ executing
│ ████████████████ │ │
│ (A paused ~50 µs) │ (unaffected) │ (unaffected)
│ │ │
│ executing │ GC in progress │ executing
│ │ ████████████████ │
│ (unaffected) │ (B paused ~30 µs) │ (unaffected)
│ │ │
▼ ▼ ▼
- NO coordination between processes
- GC of Process A has ZERO impact on Process B or C
- System remains responsive during all GC operations
- This is fundamental to soft real-time properties
BEAM's Garbage Collector (Reference)¶
Understanding BEAM's GC is essential since Lona's design closely follows it. This section documents BEAM's approach.
BEAM Memory Layout¶
Each BEAM process has a private heap containing both stack and young objects:
BEAM PROCESS HEAP
════════════════════════════════════════════════════════════════════════════════
┌────────────────────────────────────────────────────────────────────────┐
│ │
│ STACK FREE YOUNG HEAP │
│ (grows down) SPACE (grows up) │
│ │
│ ┌─────────┐ ┌─────────┐ │
│ │ frame 2 │ │ tuple │ │
│ ├─────────┤ ├─────────┤ │
│ │ frame 1 │ │ pair │ │
│ ├─────────┤ ├─────────┤ │
│ │ frame 0 │◄─ SP HTOP─►│ binary │ │
│ └─────────┘ └─────────┘ │
│ │ │ │
│ ▼ ▲ │
│ grows toward ──────────────────────────────────── grows toward │
│ lower addresses higher addresses │
│ │
└────────────────────────────────────────────────────────────────────────┘
GC triggered when: HTOP >= SP (heap meets stack)
BEAM Generational Model¶
BEAM divides objects into generations using a high-water mark:
BEAM GENERATIONAL DIVISION
════════════════════════════════════════════════════════════════════════════════
BEFORE Minor GC:
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ STACK │ │ OLD OBJECTS │ YOUNG OBJECTS │ (full) │ │
│ │ │ │ (below h_water) │ (above h_water) │ │ │
│ │ ───► │ │ [old1][old2] │ [new1][new2][new3]│ ◄─── │ │
│ └───────┴──────────┴─────────────────┴──────────────────┴─────────────┘ │
│ ▲ ▲ ▲ │
│ SP high_water HTOP │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
The high_water mark separates:
- Objects below: survived previous GC (old generation)
- Objects above: recently allocated (young generation)
Minor GC only processes young objects (above high_water).
BEAM's Two-Heap Architecture¶
BEAM actually uses two separate memory blocks for better performance:
BEAM TWO-HEAP ARCHITECTURE
════════════════════════════════════════════════════════════════════════════════
YOUNG HEAP (active allocation):
┌─────────────────────────────────────────────────────────────────────────────┐
│ STACK ◄─────────────────────────────────────────────────────► YOUNG HEAP │
│ [frames] FREE SPACE [recently allocated] │
└─────────────────────────────────────────────────────────────────────────────┘
OLD HEAP (promoted objects):
┌─────────────────────────────────────────────────────────────────────────────┐
│ PROMOTED OBJECTS │ FREE SPACE │
│ [survived GC N-3][survived GC N-2][survived GC N-1] │
└─────────────────────────────────────────────────────────────────────────────┘
Minor GC: Copy live young objects → Old heap. Reset young heap.
Major GC: Collect both heaps → New young heap. Fresh old heap.
BEAM Cheney's Copying Algorithm¶
BEAM uses Cheney's algorithm for efficient copying:
CHENEY'S COPYING ALGORITHM
════════════════════════════════════════════════════════════════════════════════
Phase 1: Root Scanning
────────────────────────────────────────────────────────────────────────────────
Scan roots (stack, registers, process dictionary)
For each pointer to young generation:
→ Copy object to old heap (or new young heap for major GC)
→ Leave "forwarding pointer" in original location
→ Update root to point to new location
Phase 2: Scan Copied Objects
────────────────────────────────────────────────────────────────────────────────
Maintain a "scan" pointer in to-space
While scan < allocation_pointer:
For each pointer in current object:
If points to from-space:
If forwarding pointer exists → update to new location
Else → copy object, leave forwarding pointer, update
FROM-SPACE (before): TO-SPACE (after):
┌───────────────────────┐ ┌───────────────────────┐
│ [A]──────►[B] │ ──► │ [A']─────►[B'] │
│ [C]◄──┘ │ copy │ [C']◄─┘ │
│ [D] (dead, no refs) │ │ │
└───────────────────────┘ └───────────────────────┘
Dead object D not copied
Key properties:
- Single pass through live data (no separate mark phase)
- Compacts memory (no fragmentation)
- Work proportional to LIVE data, not total heap
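The two phases can be exercised with a toy semi-space collector. This sketch uses none of BEAM's or Lona's real layout: the heap is a flat array of cells, a non-negative header holds the field count, fields ≥ 0 are cell indices (pointers), and a negative header encodes the forwarding pointer.

```rust
// Toy semi-space heap: each object is [header, field...]; header = field
// count. A field >= 0 is a cell index (pointer); a negative header means
// the object was already copied and encodes its to-space address.

fn copy(obj: usize, from: &mut [i64], to: &mut Vec<i64>) -> usize {
    if from[obj] < 0 {
        // Already copied: decode the forwarding pointer
        return (-from[obj] - 1) as usize;
    }
    let nfields = from[obj] as usize;
    let new_addr = to.len();
    to.extend_from_slice(&from[obj..=obj + nfields]); // header + fields
    from[obj] = -(new_addr as i64) - 1; // leave forwarding pointer
    new_addr
}

fn collect(roots: &mut [usize], from: &mut [i64]) -> Vec<i64> {
    let mut to = Vec::new();
    // Phase 1: copy objects directly reachable from the roots
    for r in roots.iter_mut() {
        *r = copy(*r, from, &mut to);
    }
    // Phase 2: Cheney scan - to-space itself is the work queue
    let mut scan = 0;
    while scan < to.len() {
        let nfields = to[scan] as usize;
        for i in 1..=nfields {
            let field = to[scan + i];
            if field >= 0 {
                let new_addr = copy(field as usize, from, &mut to);
                to[scan + i] = new_addr as i64;
            }
        }
        scan += 1 + nfields;
    }
    to
}

fn main() {
    // A at cell 0 points to B at cell 2; C at cell 3 is unreachable (dead)
    let mut from = vec![1, 2, 0, 0];
    let mut roots = vec![0usize];
    let to = collect(&mut roots, &mut from);
    assert_eq!(to, vec![1, 2, 0]); // A and B copied, compacted; C not copied
    assert_eq!(roots[0], 0);
}
```

Note that the forwarding pointers make the scan terminate even on cyclic graphs, and the work done is proportional to the three live cells, not the total heap.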
BEAM Heap Growth¶
BEAM uses a Fibonacci-like sequence for small heaps, then switches to percentage growth:
BEAM HEAP GROWTH SEQUENCE
════════════════════════════════════════════════════════════════════════════════
Phase 1: Fibonacci-like (small heaps)
────────────────────────────────────────────────────────────────────────────────
233 → 377 → 610 → 987 → 1597 → 2584 → 4181 → 6765 → 10946 → ... (words)
In bytes (64-bit, 8 bytes/word):
~2KB → ~3KB → ~5KB → ~8KB → ~13KB → ~21KB → ~33KB → ~54KB → ~87KB → ...
Phase 2: 20% growth (large heaps, after ~1 megaword / ~8 MB)
────────────────────────────────────────────────────────────────────────────────
8 MB → 9.6 MB → 11.5 MB → 13.8 MB → 16.6 MB → 19.9 MB → ...
Heap shrinking: When live data < 25% of heap capacity after GC
BEAM Large Binary Handling¶
Large binaries (≥64 bytes) are stored outside the process heap:
BEAM LARGE BINARY HANDLING
════════════════════════════════════════════════════════════════════════════════
Process A heap: Shared Binary Heap:
┌─────────────────┐ ┌─────────────────────────────────────────┐
│ ProcBin ────────┼───────────────►│ refcount: 2 │
│ (small header) │ │ size: 10 MB │
└─────────────────┘ │ data: [................] │
└─────────────────────────────────────────┘
Process B heap: ▲
┌─────────────────┐ │
│ ProcBin ────────┼────────────────────────────────┘
│ (small header) │
└─────────────────┘
- Only ProcBin (small pointer structure) in process heap
- Actual binary data in shared heap with reference counting
- Zero-copy message passing for large binaries
- MSO (Mark-Sweep Object) list tracks ProcBins for GC
Virtual Binary Heap:
- Tracks cumulative off-heap binary size per process
- Triggers early GC when binary memory becomes substantial
- Prevents binary memory from growing unbounded
Lona's Garbage Collector¶
Lona adapts BEAM's GC model for seL4's capability-based memory system.
Architectural Context¶
LONA MEMORY HIERARCHY
════════════════════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────────────────┐
│ LONA MEMORY MANAGER │
│ │
│ - Manages physical memory (Untyped capabilities) │
│ - Allocates memory to realms via IPC │
│ - Handles page faults for inherited regions │
│ - Enforces per-realm memory quotas │
└─────────────────────────────────────────────────────────────────────────────┘
│
│ allocates pages via IPC
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ REALM │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Per-Worker Allocator Instances (lock-free within worker) │ │
│ │ │ │
│ │ Worker 0 Allocator Worker 1 Allocator Worker N Allocator │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│ │ │ [carriers] │ │ [carriers] │ │ [carriers] │ │ │
│ │ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │ │
│ └───────────┼─────────────────────┼─────────────────────┼─────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ │
│ │ Process A │ │ Process D │ │ Process G │ │
│ │ [young][old] │ │ [young][old] │ │ [young][old] │ │
│ ├───────────────────┤ ├───────────────────┤ ├───────────────────┤ │
│ │ Process B │ │ Process E │ │ Process H │ │
│ │ [young][old] │ │ [young][old] │ │ [young][old] │ │
│ └───────────────────┘ └───────────────────┘ └───────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Realm Binary Heap (reference-counted large binaries) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Process Memory Layout¶
Each process owns two memory blocks following BEAM's model. See Term Representation for value encoding details.
LONA PROCESS MEMORY LAYOUT
════════════════════════════════════════════════════════════════════════════════
YOUNG HEAP (stack + young objects, single contiguous block):
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ STACK FREE YOUNG OBJECTS │
│ (grows down) SPACE (grows up) │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ Frame Header (32B) │ │ Term (8B) │ │
│ │ ├─ return_ip │ ├─────────────────────┤ │
│ │ ├─ chunk_addr │ │ Pair (16B) │ │
│ │ ├─ caller_frame │ ├─────────────────────┤ │
│ │ └─ y_count │ │ HeapTuple (8B + N×8)│ │
│ ├─────────────────────┤ ├─────────────────────┤ │
│ │ Y(0) (8B) │ │ HeapString (8B + N) │ │
│ │ Y(1) (8B) │ ├─────────────────────┤ │
│ │ ... │◄─stop htop─►│ [next allocation] │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │ │ │
│ ▼ ▲ │
│ grows toward ──────────────────────────────────── grows toward │
│ lower addresses higher addresses │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
▲ ▲
│ │
heap (low address) hend (high)
Key pointers:
heap = base address of young heap
hend = end address of young heap
htop = heap top (allocation pointer, grows UP)
stop = stack pointer (grows DOWN)
GC trigger: htop >= stop (young heap exhausted)
OLD HEAP (promoted objects, separate block):
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ PROMOTED OBJECTS (survived Minor GC) FREE │
│ SPACE │
│ ┌─────────────────────┬─────────────────────┬─────────────────────┐ │
│ │ [promoted GC N-2] │ [promoted GC N-1] │ [promoted GC N] │◄──────│
│ └─────────────────────┴─────────────────────┴─────────────────────┘ │
│ ▲ │
│ old_htop │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
▲ ▲
│ │
old_heap old_hend
Never touched during Minor GC. Collected only during Major GC.
Term Representation¶
Lona uses BEAM-style tagged words (8 bytes) for memory efficiency. See Term Representation for complete details.
TERM FORMAT (8 bytes)
════════════════════════════════════════════════════════════════════════════════
┌─────────────────────────────────────────────────────────────────┬──────┐
│ Payload (62 bits) │ Tag │
│ │ (2b) │
└─────────────────────────────────────────────────────────────────┴──────┘
Primary Tags:
00 = HEADER (heap object marker, only on heap)
01 = LIST (pointer to pair)
10 = BOXED (pointer to heap object with header)
11 = IMMEDIATE (nil, bool, small int, symbol, keyword)
GC Classification:
- Immediate (tag 11): No GC action needed
- List (tag 01): Points to 16-byte pair (no header)
- Boxed (tag 10): Points to heap object with 8-byte header
- Symbols/keywords: Interned in realm, not on process heap (not copied)
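A minimal sketch of this classification, assuming the 2-bit tag sits in the low bits of the word and heap pointers are 8-byte aligned so masking the tag off recovers the address (the immediate payload encoding in the example is hypothetical):

```rust
// Primary tag in the two low bits of an 8-byte term.
const TAG_MASK: u64 = 0b11;
const TAG_LIST: u64 = 0b01;      // pointer to a 16-byte pair (no header)
const TAG_BOXED: u64 = 0b10;     // pointer to a headered heap object
const TAG_IMMEDIATE: u64 = 0b11; // nil, bool, small int, symbol, keyword

// Only LIST and BOXED terms carry heap pointers the GC must trace.
fn needs_tracing(term: u64) -> bool {
    matches!(term & TAG_MASK, TAG_LIST | TAG_BOXED)
}

// Pointers are 8-byte aligned, so the low tag bits can simply be masked off.
fn to_ptr(term: u64) -> u64 {
    term & !TAG_MASK
}

fn main() {
    let small_int = (42u64 << 2) | TAG_IMMEDIATE; // hypothetical payload shift
    assert!(!needs_tracing(small_int));
    let pair = 0x5000u64 | TAG_LIST; // pair object at address 0x5000
    assert!(needs_tracing(pair));
    assert_eq!(to_ptr(pair), 0x5000);
}
```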
Heap Object Layouts¶
All heap objects (except pairs) have an 8-byte header word. See Term Representation for full details.
HEAP OBJECT LAYOUTS
════════════════════════════════════════════════════════════════════════════════
Header word format (8 bytes):
┌───────────────────────────────────────────┬────────────────────┬──────┐
│ Arity / Size (54 bits) │ Object Tag (8b) │ 00 │
└───────────────────────────────────────────┴────────────────────┴──────┘
Pair (NO header - identified by PAIR tag on pointer):
┌────────────────────────────────────┬────────────────────────────────────┐
│ Head: Term (8B) │ Rest: Term (8B) │
└────────────────────────────────────┴────────────────────────────────────┘
Total: 16 bytes
Tuple (header + elements):
┌────────────────────────────────────┬────────────────────────────────────┐
│ Header (arity=N, tag=TUPLE) (8B) │ elements: [Term; N] │
└────────────────────────────────────┴────────────────────────────────────┘
Total: 8 + N × 8 bytes
String (header + UTF-8 data):
┌────────────────────────────────────┬────────────────────────────────────┐
│ Header (arity=len, tag=STRING) (8B)│ UTF-8 bytes (aligned to 8) │
└────────────────────────────────────┴────────────────────────────────────┘
Total: 8 + align8(len) bytes
Map (header + entries term):
┌────────────────────────────────────┬────────────────────────────────────┐
│ Header (tag=MAP) (8B) │ entries: Term (8B) │
└────────────────────────────────────┴────────────────────────────────────┘
Total: 16 bytes
Closure (header + function + captures):
┌────────────────────────────────────┬────────────────────────────────────┐
│ Header (arity=N, tag=CLOSURE) (8B) │ function: Term (8B) │
├────────────────────────────────────┴────────────────────────────────────┤
│ captures: [Term; N] │
└─────────────────────────────────────────────────────────────────────────┘
Total: 16 + N × 8 bytes
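The GC's scan phase must step over copied objects by size, so the header alone has to determine each object's extent. A sketch of that computation, following the header layout above (bits 0-1 = 00, bits 2-9 = object tag, bits 10+ = arity); the concrete tag values are placeholders:

```rust
// Placeholder 8-bit object tag values (not Lona's actual assignments).
const OBJ_TUPLE: u64 = 0x01;
const OBJ_STRING: u64 = 0x02;
const OBJ_MAP: u64 = 0x03;
const OBJ_CLOSURE: u64 = 0x04;

fn make_header(arity: u64, tag: u64) -> u64 {
    (arity << 10) | (tag << 2) // low two bits 00 = HEADER
}

// Total object size in bytes, derived from the header word alone.
fn object_size(header: u64) -> usize {
    let tag = (header >> 2) & 0xFF;
    let arity = (header >> 10) as usize;
    match tag {
        OBJ_TUPLE => 8 + arity * 8,           // header + N terms
        OBJ_STRING => 8 + ((arity + 7) & !7), // header + align8(len)
        OBJ_MAP => 16,                        // header + entries term
        OBJ_CLOSURE => 16 + arity * 8,        // header + function + captures
        _ => panic!("unknown object tag {tag}"),
    }
}

fn main() {
    assert_eq!(object_size(make_header(3, OBJ_TUPLE)), 32);  // 8 + 3*8
    assert_eq!(object_size(make_header(5, OBJ_STRING)), 16); // 8 + align8(5)
}
```

Pairs never appear here: they have no header and are identified by the LIST tag on the pointer, so their size is always 16 bytes.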
Minor GC (Young Generation Collection)¶
Minor GC runs when the young heap is exhausted. It collects only young objects, promoting live ones to the old heap.
Trigger Condition¶
MINOR GC TRIGGER
════════════════════════════════════════════════════════════════════════════════
fn alloc(&mut self, size: usize) -> Option<Vaddr> {
    let aligned_size = align_up(size, 8);
    if self.htop + aligned_size >= self.stop {
        // Young heap exhausted - trigger Minor GC
        self.minor_gc()?;
        // Retry after GC; htop has been reset by the collection
        if self.htop + aligned_size >= self.stop {
            // Still not enough space - need heap growth
            self.grow_heap(aligned_size)?;
        }
    }
    let addr = self.htop;
    self.htop += aligned_size;
    Some(addr)
}
Minor GC Algorithm¶
IMPORTANT: The stack does NOT move during minor GC. Only young heap objects (between heap and htop) are copied to old heap. The stack (between stop and hend) stays in place.
MINOR GC ALGORITHM
════════════════════════════════════════════════════════════════════════════════
Phase 1: Prepare
────────────────────────────────────────────────────────────────────────────────
old_scan = old_htop // Track where we start copying in old heap
Phase 2: Scan Roots
────────────────────────────────────────────────────────────────────────────────
For each root:
- X registers (from worker)
- Y registers (walk stack frames)
- Process-bound var bindings
- Process execution state (chunk_addr)
- Heap fragments (mbuf_list)
For each root term:
If needs_tracing(term) && is_in_young_heap(term.to_ptr()):
new_addr = copy_to_old_heap(term)
update root to point to new_addr
Phase 3: Scan Copied Objects (Cheney's scan)
────────────────────────────────────────────────────────────────────────────────
scan = old_scan
While scan < old_htop:
object = read_object_at(scan)
For each pointer field in object:
If needs_tracing(field) && is_in_young_heap(field.to_ptr()):
new_addr = copy_or_forward(field)
update field to new_addr
scan += object.size()
Phase 4: Sweep MSO List
────────────────────────────────────────────────────────────────────────────────
For each MSO entry (ProcBin reference):
If ProcBin in young heap:
If has forwarding pointer:
Update MSO entry to new ProcBin address // CRITICAL
Else (dead):
Decrement binary refcount
Remove from MSO list
Else (in old heap):
Keep entry (implicitly live during minor GC)
Phase 5: Reset Young Heap
────────────────────────────────────────────────────────────────────────────────
htop = heap // Young heap space now free
// Note: stop is NOT changed - stack stays at top of young heap block
Phase 6: Free Heap Fragments
────────────────────────────────────────────────────────────────────────────────
For each heap fragment in mbuf_list:
Free fragment memory
mbuf_list = nil
The young heap block is reused, not deallocated. Only htop resets to heap.
Copy and Forwarding¶
COPY AND FORWARDING
════════════════════════════════════════════════════════════════════════════════
fn copy_to_old_heap(&mut self, addr: Vaddr) -> Vaddr {
    // Check for existing forwarding pointer
    let header = read_u64(addr);
    if is_forwarding_pointer(header) {
        return extract_forward_addr(header);
    }
    // Calculate object size based on type
    let size = object_size_at(addr);
    // Ensure old heap has space
    if self.old_htop + size > self.old_hend {
        self.grow_old_heap(size);
    }
    // Copy object to old heap
    let new_addr = self.old_htop;
    copy_bytes(addr, new_addr, size);
    self.old_htop += size;
    // Leave forwarding pointer in original location
    write_forwarding_pointer(addr, new_addr);
    new_addr
}
FORWARDING POINTER FORMAT:
┌──────────────────────────────────────────────────────────────────────────────┐
│ Original object header replaced with forwarding header (8 bytes): │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ HEADER: new_address >> 3, tag=FORWARD (0xFF) (8B) │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
│ Detection: header.object_tag() == 0xFF │
│ Address extraction: (header >> 10) << 3 // Re-align from arity field │
│ │
│ This uses the same 8-byte header word format as all boxed objects. │
│ See term-representation.md for encoding details. │
└──────────────────────────────────────────────────────────────────────────────┘
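Encoding and decoding helpers for this format, transcribing the box above directly (the helper names are illustrative):

```rust
const FORWARD_TAG: u64 = 0xFF;

// Build a forwarding header: the 8-byte-aligned new address (>> 3) is
// stored in the arity field (bits 10+), with object tag 0xFF.
fn make_forwarding_header(new_addr: u64) -> u64 {
    debug_assert_eq!(new_addr & 0b111, 0, "heap addresses are 8-byte aligned");
    ((new_addr >> 3) << 10) | (FORWARD_TAG << 2)
}

// Detection: header.object_tag() == 0xFF
fn is_forwarding(header: u64) -> bool {
    (header >> 2) & 0xFF == FORWARD_TAG
}

// Address extraction: (header >> 10) << 3 - re-align from the arity field
fn forward_addr(header: u64) -> u64 {
    (header >> 10) << 3
}

fn main() {
    let h = make_forwarding_header(0x0002_1000);
    assert!(is_forwarding(h));
    assert_eq!(forward_addr(h), 0x0002_1000);
    // An ordinary header (e.g. a tuple tag) is not a forwarding pointer
    assert!(!is_forwarding((3u64 << 10) | (0x01 << 2)));
}
```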
Root Finding¶
ROOT FINDING
════════════════════════════════════════════════════════════════════════════════
fn scan_roots(&mut self, worker: &mut Worker) {
    // 1. X registers (on worker, shared by all processes)
    //    X registers hold Terms (8 bytes each)
    for i in 0..X_REG_COUNT {
        self.trace_term(&mut worker.x_regs[i]);
    }
    // 2. Y registers (walk stack frames)
    //    Y registers are Terms stored on the stack
    let mut frame_ptr = self.frame_base;
    while let Some(fp) = frame_ptr {
        let frame = read_frame_header(fp);
        for y in 0..frame.y_count {
            let y_addr = fp - FRAME_HEADER_SIZE - (y + 1) * TERM_SIZE; // TERM_SIZE = 8
            self.trace_term_at(y_addr);
        }
        frame_ptr = frame.caller_frame;
    }
    // 3. Process-bound var bindings
    //    binding_values is an array of Terms
    for i in 0..self.binding_len {
        self.trace_term(&mut self.binding_values[i]);
    }
    // 4. Mailbox and heap fragments
    //    Mailbox messages contain Terms that must be traced
    self.trace_mailbox();
    self.trace_heap_fragments();
}

fn trace_term(&mut self, term: &mut Term) {
    // Immediates (tag 11) need no tracing - value is inline
    if term.is_immediate() {
        return;
    }
    // Get pointer (LIST tag 01 or BOXED tag 10)
    let addr = term.to_ptr();
    // Skip realm-resident values (symbols, keywords, vars, namespaces)
    // These live in the code region, not the process heap
    if self.is_realm_address(addr) {
        return;
    }
    // Skip if already in old heap
    if self.is_in_old_heap(addr) {
        return;
    }
    // Must be in young heap - copy to old
    let new_addr = self.copy_to_old_heap(addr);
    // Update term to point to new location, preserving tag
    if term.is_list() {
        *term = Term::list(new_addr as *const Pair);
    } else {
        *term = Term::boxed(new_addr as *const Header);
    }
}
Major GC (Full Collection)¶
Major GC collects both young and old heaps, reclaiming dead objects in the old generation.
Trigger Conditions¶
MAJOR GC TRIGGERS
════════════════════════════════════════════════════════════════════════════════
1. Old heap space exhausted during Minor GC
- Minor GC tries to promote, but old_htop + size > old_hend
- Major GC compacts old heap to make room
2. After N minor GCs without fullsweep (configurable)
- Prevents dead old objects from accumulating indefinitely
- Default: fullsweep_after = 65535 (or process option)
3. Explicit GC request
- (garbage-collect :full) from Lonala code
4. Old heap utilization too low
- After Minor GC, if old heap is < 25% utilized
- Indicates significant old generation garbage
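Collected into one predicate, the numeric triggers (1, 2, and 4) might look like this sketch. The `GcState` struct and its field names are illustrative, and a real implementation would evaluate the utilization check only after a promoting Minor GC:

```rust
// Illustrative GC bookkeeping; not Lona's actual process structure.
struct GcState {
    minor_gcs_since_fullsweep: u32,
    fullsweep_after: u32, // default 65535
    old_heap: usize,      // old heap base
    old_htop: usize,      // old heap allocation pointer
    old_hend: usize,      // old heap end
}

// promote_bytes: how much the current Minor GC wants to promote.
fn needs_major_gc(gc: &GcState, promote_bytes: usize) -> bool {
    let free = gc.old_hend - gc.old_htop;
    let used = gc.old_htop - gc.old_heap;
    let capacity = gc.old_hend - gc.old_heap;
    promote_bytes > free                                      // 1: old heap exhausted
        || gc.minor_gcs_since_fullsweep >= gc.fullsweep_after // 2: fullsweep counter
        || (used > 0 && used * 4 < capacity)                  // 4: < 25% utilized
}

fn main() {
    let healthy = GcState {
        minor_gcs_since_fullsweep: 10,
        fullsweep_after: 65535,
        old_heap: 0,
        old_htop: 600,
        old_hend: 1000,
    };
    assert!(!needs_major_gc(&healthy, 50));
    assert!(needs_major_gc(&healthy, 500)); // promotion would overflow old heap
}
```

Trigger 3 (an explicit `(garbage-collect :full)` request) bypasses this predicate entirely.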
Major GC Algorithm¶
MAJOR GC ALGORITHM
════════════════════════════════════════════════════════════════════════════════
Phase 1: Allocate New Heaps
────────────────────────────────────────────────────────────────────────────────
Calculate required size based on live data estimate
new_young_heap = allocate_heap(calculated_size)
new_old_heap = allocate_heap(initial_old_size)
new_htop = new_young_heap.base // Start copying to new young
new_stop = new_young_heap.end // Stack will grow down from here
Phase 2: Copy Stack to New Young Heap
────────────────────────────────────────────────────────────────────────────────
Copy stack frames to top of new young heap
Update new_stop accordingly
Update frame pointers within copied stack
Phase 3: Scan Roots (into new young heap)
────────────────────────────────────────────────────────────────────────────────
For each root value:
If value.is_heap_pointer():
If in old_heap OR in young_heap:
new_addr = copy_to_new_young(value.addr)
update root
Phase 4: Cheney's Scan (single pass)
────────────────────────────────────────────────────────────────────────────────
scan = new_young_heap.base
While scan < new_htop:
For each pointer in object at scan:
If in old_heap OR in young_heap:
new_addr = copy_or_forward(addr)
update pointer
scan += object.size()
Phase 5: Swap Heaps
────────────────────────────────────────────────────────────────────────────────
Free old young heap
Free old old heap
heap = new_young_heap.base
hend = new_young_heap.end
htop = new_htop
stop = new_stop
old_heap = new_old_heap.base
old_hend = new_old_heap.end
old_htop = new_old_heap.base // Empty, ready for future promotions
Phase 6: Cleanup
────────────────────────────────────────────────────────────────────────────────
Free all heap fragments
Decrement binary reference counts (see Binary Handling)
Visualization¶
MAJOR GC VISUALIZATION
════════════════════════════════════════════════════════════════════════════════
BEFORE Major GC:
┌─────────────────────────────────────────────────────────────────────────────┐
│ Young Heap: [STACK][young live][young dead] │
│ Old Heap: [old live][old dead][old live][old dead][old live] │
│ ▲ Dead objects accumulate in old heap │
└─────────────────────────────────────────────────────────────────────────────┘
AFTER Major GC:
┌─────────────────────────────────────────────────────────────────────────────┐
│ New Young Heap: [STACK][all live data compacted][ FREE ] │
│ New Old Heap: [ EMPTY ] │
│ │
│ All live data now in young heap, fresh start │
│ Dead objects from both generations reclaimed │
└─────────────────────────────────────────────────────────────────────────────┘
Heap Growth and Shrinking¶
Growth Strategy¶
LONA HEAP GROWTH
════════════════════════════════════════════════════════════════════════════════
GROWTH SEQUENCE (matching BEAM):
────────────────────────────────────────────────────────────────────────────────
const HEAP_SIZES: [usize; 24] = [
    // Phase 1: Fibonacci-like (words, multiply by 8 for bytes)
    233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711,
    28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040,
    // Phase 2: ~20% growth kicks in after this
    1346269, 2178309, 3524578, 5702887, 9227465, 14930352,
];

// After 14930352 words (~119 MB), grow by 20%
const LARGE_HEAP_GROWTH_RATE: f64 = 1.20;

fn next_heap_size(current: usize, needed: usize) -> usize {
    // Find the smallest table size that both fits the request
    // and actually grows the heap
    for &size in HEAP_SIZES.iter() {
        if size >= needed && size > current {
            return size;
        }
    }
    // Beyond the table: 20% growth steps
    let mut size = current.max(HEAP_SIZES[HEAP_SIZES.len() - 1]);
    while size < needed {
        size = (size as f64 * LARGE_HEAP_GROWTH_RATE) as usize;
    }
    size
}
INITIAL SIZES:
────────────────────────────────────────────────────────────────────────────────
Young heap: 233 words = 1864 bytes ≈ 2 KB
Old heap: 64 words = 512 bytes
These small defaults enable millions of lightweight processes.
Heaps grow only when needed.
Shrinking Strategy¶
HEAP SHRINKING
════════════════════════════════════════════════════════════════════════════════
After Major GC, consider shrinking if heap is underutilized:
fn maybe_shrink_heap(&mut self, live_data_size: usize) {
    let current_size = self.hend - self.heap;
    let utilization = live_data_size as f64 / current_size as f64;
    if utilization < 0.25 {
        // Less than 25% utilized - try to shrink
        let target_size = next_heap_size(0, live_data_size * 2);
        if target_size < current_size {
            // Allocate smaller heap, copy data
            self.resize_heap(target_size);
        }
    }
}
Prevents memory waste from processes that had temporary spikes.
Large Binary Handling¶
Lona uses reference-counted binaries in a realm-wide pool, following BEAM's model.
Binary Classification¶
BINARY CLASSIFICATION
════════════════════════════════════════════════════════════════════════════════
SMALL BINARY (< 64 bytes):
- Stored directly in process heap as HeapString
- Copied when sent in messages
- GC'd with other heap objects
- Example: (def name "hello") → "hello" on process heap
LARGE BINARY (>= 64 bytes):
- Binary data stored in realm-wide binary heap
- ProcBin (reference) stored in process heap
- Reference counted (shared between processes WITHIN SAME REALM)
- NOT copied when sent in messages (intra-realm)
- Example: (slurp "large-file.txt") → ProcBin pointing to binary heap
CROSS-REALM BINARY SHARING
────────────────────────────────────────────────────────────────────────────────
Large binaries CAN be shared across realm boundaries via capability-based
frame sharing. This enables zero-copy binary transfer between realms.
When sending a message containing a large binary to another realm:
1. The sender grants the receiver a read capability to the binary's frame(s)
2. The receiving realm maps those frames into its address space
3. A new ProcBin is created pointing to the shared memory
Security is maintained through seL4's capability system:
- The receiver only gets read access (no modification)
- The capability can be revoked by the sender's realm
- Memory is reclaimed only when all capabilities are revoked
This follows BEAM's zero-copy philosophy while maintaining capability security.
ProcBin Structure¶
PROCBIN AND REFC BINARY
════════════════════════════════════════════════════════════════════════════════
Process Heap: Realm Binary Heap:
┌─────────────────────────────┐ ┌─────────────────────────────────────────┐
│ ProcBin │ │ RefcBinary │
│ ┌─────────────────────────┐ │ │ ┌─────────────────────────────────────┐ │
│ │ tag: PROCBIN │ │ │ │ refcount: AtomicU32 │ │
│ │ binary_addr: Vaddr ─────┼─┼──►│ │ size: u32 │ │
│ │ offset: u32 │ │ │ │ data: [u8; size] │ │
│ │ size: u32 │ │ │ └─────────────────────────────────────┘ │
│ └─────────────────────────┘ │ │ │
│ Size: 24 bytes │ │ Size: 8 + size bytes │
└─────────────────────────────┘ └─────────────────────────────────────────┘
Sub-binary (view into existing binary):
┌─────────────────────────────┐
│ SubBin │
│ ┌─────────────────────────┐ │
│ │ tag: SUBBIN │ │
│ │ original: Vaddr ────────┼─┼──► RefcBinary (increments refcount)
│ │ offset: u32 │ │
│ │ size: u32 │ │
│ └─────────────────────────┘ │
│ Size: 24 bytes │
└─────────────────────────────┘
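A sketch of sub-binary creation, showing why the refcount must be bumped when a view is taken: the backing `RefcBinary` must outlive every view, and the GC only frees it when the count reaches zero. Types and method names here are illustrative.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Reference-counted binary in the realm binary heap (simplified).
struct RefcBinary {
    refcount: AtomicU32,
    data: Vec<u8>,
}

// A view into an existing binary: shares the data, owns a reference.
struct SubBin<'a> {
    original: &'a RefcBinary,
    offset: usize,
    len: usize,
}

fn sub_binary(bin: &RefcBinary, offset: usize, len: usize) -> SubBin<'_> {
    assert!(offset + len <= bin.data.len());
    bin.refcount.fetch_add(1, Ordering::Relaxed); // new reference to the data
    SubBin { original: bin, offset, len }
}

impl SubBin<'_> {
    fn bytes(&self) -> &[u8] {
        &self.original.data[self.offset..self.offset + self.len]
    }
}

fn main() {
    let bin = RefcBinary {
        refcount: AtomicU32::new(1),
        data: b"hello world".to_vec(),
    };
    let view = sub_binary(&bin, 6, 5); // zero-copy slice of the original
    assert_eq!(view.bytes(), b"world");
    assert_eq!(bin.refcount.load(Ordering::Relaxed), 2);
}
```

When the sub-binary dies, the MSO sweep decrements the count again, exactly as for a ProcBin.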
MSO (Mark-Sweep Object) List¶
MSO LIST FOR BINARY GC
════════════════════════════════════════════════════════════════════════════════
Each process tracks off-heap references in an MSO list:
struct Process {
    // ... other fields ...
    mso_list: Option<Vaddr>, // Head of MSO linked list
}

struct MsoEntry {
    next: Option<Vaddr>,
    object: Vaddr, // ProcBin or other off-heap reference
}
DURING MINOR/MAJOR GC:
────────────────────────────────────────────────────────────────────────────────
fn sweep_mso_list(&mut self) {
    let mut prev: Option<Vaddr> = None;
    let mut current = self.mso_list;
    while let Some(entry_addr) = current {
        let mut entry = read_mso_entry(entry_addr);
        let procbin_addr = entry.object;
        if self.is_in_old_heap(procbin_addr) {
            // Already promoted - implicitly live during Minor GC
            prev = Some(entry_addr);
        } else if has_forwarding_pointer(procbin_addr) {
            // ProcBin was copied - point the entry at its new address (CRITICAL)
            entry.object = extract_forward_addr(read_u64(procbin_addr));
            write_mso_entry(entry_addr, &entry);
            prev = Some(entry_addr);
        } else {
            // ProcBin is dead - decrement binary refcount
            let binary_addr = read_procbin(procbin_addr).binary_addr;
            if decrement_refcount(binary_addr) == 0 {
                // Binary is garbage - free from realm binary heap
                free_refc_binary(binary_addr);
            }
            // Unlink the dead entry from the MSO list
            match prev {
                Some(p) => write_mso_next(p, entry.next),
                None => self.mso_list = entry.next,
            }
        }
        current = entry.next;
    }
}
Virtual Binary Heap¶
VIRTUAL BINARY HEAP
════════════════════════════════════════════════════════════════════════════════
Each process tracks cumulative off-heap binary size to trigger early GC:
struct Process {
    // ... other fields ...
    vbin_size: usize,  // Virtual binary heap size
    vbin_limit: usize, // Threshold for early GC
}

fn allocate_binary(&mut self, size: usize) -> Result<Vaddr, Error> {
    if size < LARGE_BINARY_THRESHOLD {
        // Small binary - allocate on process heap
        return self.alloc_string(size);
    }
    // Large binary - allocate in realm binary heap
    let binary_addr = self.realm.alloc_refc_binary(size)?;
    let procbin_addr = self.alloc_procbin(binary_addr, size)?;
    // Link the new ProcBin into the MSO list so GC sweeps can find it
    self.mso_push(procbin_addr);
    // Update virtual binary heap
    self.vbin_size += size;
    // Check if we should trigger early GC
    if self.vbin_size > self.vbin_limit {
        // GC will sweep the MSO list, potentially freeing binaries
        // and reducing vbin_size
        self.minor_gc();
    }
    Ok(procbin_addr)
}
Virtual binary heap ensures that processes creating many large binaries
don't accumulate unbounded binary memory before GC runs.
Heap Fragments (Message Passing)¶
When a message is sent, its data must be deep-copied to the receiver's heap. Heap fragments let the sender complete that copy even when the receiver's heap is temporarily unavailable.
Fragment Creation¶
HEAP FRAGMENT CREATION
════════════════════════════════════════════════════════════════════════════════
SCENARIO: Process A sends message to Process B
OPTION 1: Direct allocation (common case)
────────────────────────────────────────────────────────────────────────────────
A acquires B's heap lock (brief trylock)
A deep copies message directly into B's young heap
A releases lock
Message immediately part of B's heap
OPTION 2: Heap fragment (contention case)
────────────────────────────────────────────────────────────────────────────────
A cannot acquire B's heap lock (B is busy)
A allocates heap fragment from worker's allocator
A deep copies message into fragment
A atomically links fragment to B's mbuf_list
A continues without waiting
HEAP FRAGMENT STRUCTURE:
┌─────────────────────────────────────────────────────────────────────────────┐
│ HeapFragment │
│ ┌─────────────────┬─────────────────┬──────────────────────────────────┐ │
│ │ next: *Fragment │ size: usize │ MESSAGE DATA (Values, objects) │ │
│ └─────────────────┴─────────────────┴──────────────────────────────────┘ │
│ │
│ Linked to receiver's mbuf_list │
└─────────────────────────────────────────────────────────────────────────────┘
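The atomic linking in Option 2 is essentially a lock-free stack push. A runnable sketch using a Treiber-style CAS loop; the `HeapFragment` fields are simplified (`data` stands in for the deep-copied message terms), and the real allocator would come from the worker rather than `Box`:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Illustrative heap fragment: copied message data plus the link used
// to chain it onto the receiver's mbuf_list.
struct HeapFragment {
    next: *mut HeapFragment,
    data: Vec<u64>, // stand-in for the copied message terms
}

/// Lock-free push of a fragment onto the receiver's mbuf_list
/// (classic Treiber-stack CAS loop). The sender never blocks.
fn link_fragment(mbuf_list: &AtomicPtr<HeapFragment>, data: Vec<u64>) {
    let frag = Box::into_raw(Box::new(HeapFragment { next: ptr::null_mut(), data }));
    loop {
        let head = mbuf_list.load(Ordering::Acquire);
        unsafe { (*frag).next = head };
        if mbuf_list
            .compare_exchange(head, frag, Ordering::Release, Ordering::Acquire)
            .is_ok()
        {
            break;
        }
    }
}

fn main() {
    let mbuf_list = AtomicPtr::new(ptr::null_mut());
    link_fragment(&mbuf_list, vec![1, 2]);
    link_fragment(&mbuf_list, vec![3]);

    // Drain the list (what the receiver's GC does at consolidation).
    let mut sizes = Vec::new();
    let mut head = mbuf_list.swap(ptr::null_mut(), Ordering::AcqRel);
    while !head.is_null() {
        let frag = unsafe { Box::from_raw(head) };
        sizes.push(frag.data.len());
        head = frag.next;
    }
    assert_eq!(sizes, vec![1, 2]); // LIFO order: last fragment first
}
```

The push-only CAS loop is why senders never wait: contention costs at most a retry, never a block.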
Fragment Consolidation¶
FRAGMENT CONSOLIDATION DURING GC
════════════════════════════════════════════════════════════════════════════════
Heap fragments are treated as part of young generation during GC:
fn minor_gc(&mut self) {
    // Fragments are additional roots
    for fragment in self.mbuf_list.iter() {
        for value in fragment.values() {
            self.trace_value(value);
        }
    }

    // After copying live data to old heap...
    // Free all fragments (data is now in old heap)
    for fragment in self.mbuf_list.drain() {
        self.worker.allocator.free(fragment);
    }
}
BEFORE GC:
Main heap: [stack][objects]
Fragment 1: [msg from A]
Fragment 2: [msg from C]
AFTER GC:
Main heap: [stack][ FREE ]
Old heap: [promoted objects + live message data]
Fragments: freed
Memory Allocation Architecture¶
Per-Worker Allocators¶
PER-WORKER ALLOCATOR ARCHITECTURE
════════════════════════════════════════════════════════════════════════════════
Each worker has an independent allocator instance for lock-free allocation:
PROCESS POOL MEMORY REGION
(from realm's VSpace)
│
┌─────────────────────────┼─────────────────────────┐
│ │ │
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ WORKER 0 │ │ WORKER 1 │ │ WORKER 2 │
│ ALLOCATOR │ │ ALLOCATOR │ │ ALLOCATOR │
│ │ │ │ │ │
│ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │
│ │ Carrier 0 │ │ │ │ Carrier 0 │ │ │ │ Carrier 0 │ │
│ │ [blocks] │ │ │ │ [blocks] │ │ │ │ [blocks] │ │
│ ├───────────┤ │ │ ├───────────┤ │ │ ├───────────┤ │
│ │ Carrier 1 │ │ │ │ Carrier 1 │ │ │ │ Carrier 1 │ │
│ │ [blocks] │ │ │ │ [blocks] │ │ │ │ [blocks] │ │
│ └───────────┘ │ │ └───────────┘ │ │ └───────────┘ │
│ │ │ │ │ │
│ LOCK-FREE │ │ LOCK-FREE │ │ LOCK-FREE │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
│ │ │
▼ ▼ ▼
Processes on Processes on Processes on
Worker 0 Worker 1 Worker 2
Allocations within a worker require NO locks (common case).
Coordination only needed for:
- Getting new carriers from LMM (IPC)
- Carrier migration between workers (rare)
Carrier Structure¶
CARRIER STRUCTURE
════════════════════════════════════════════════════════════════════════════════
MULTIBLOCK CARRIER (typical, ~1 MB):
┌─────────────────────────────────────────────────────────────────────────────┐
│ HEADER │ Block 0 │ Block 1 │ FREE │ Block 2 │ FREE │
│ (meta) │ (P1 heap)│ (P7 heap)│ │ (P3 heap)│ │
└──────────┴──────────┴──────────┴──────────┴──────────┴─────────────────────┘
Multiple process heaps share a carrier.
Free list tracks available blocks.
SINGLEBLOCK CARRIER (for large heaps):
┌─────────────────────────────────────────────────────────────────────────────┐
│ HEADER │ SINGLE LARGE BLOCK │
│ (meta) │ (one process's heap) │
└──────────┴──────────────────────────────────────────────────────────────────┘
Dedicated carrier for processes with very large heaps (> threshold).
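The multiblock carrier's free list can be sketched as a first-fit allocator. This is an illustrative model only; it omits block headers, coalescing on free, and promotion to singleblock carriers:

```rust
// Illustrative multiblock carrier: a fixed region carved into blocks,
// tracked by a first-fit free list. Sizes are abstract units, not
// Lona's real block layout.

#[derive(Clone, Copy)]
struct FreeBlock {
    offset: usize,
    size: usize,
}

struct Carrier {
    free_list: Vec<FreeBlock>, // available runs inside the carrier
}

impl Carrier {
    fn new(size: usize) -> Self {
        Carrier { free_list: vec![FreeBlock { offset: 0, size }] }
    }

    /// First-fit block allocation; splits the free run if oversized.
    /// Returns the block's offset within the carrier, or None if no
    /// run is large enough (the worker would then request a new
    /// carrier from the LMM).
    fn alloc_block(&mut self, size: usize) -> Option<usize> {
        let i = self.free_list.iter().position(|b| b.size >= size)?;
        let block = self.free_list[i];
        if block.size == size {
            self.free_list.remove(i);
        } else {
            self.free_list[i] = FreeBlock {
                offset: block.offset + size,
                size: block.size - size,
            };
        }
        Some(block.offset)
    }
}

fn main() {
    let mut carrier = Carrier::new(1024);
    let a = carrier.alloc_block(256).unwrap(); // e.g. P1's heap
    let b = carrier.alloc_block(512).unwrap(); // e.g. P7's heap
    assert_eq!((a, b), (0, 256));
    assert!(carrier.alloc_block(512).is_none()); // only 256 units left
}
```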
Requesting Memory from LMM¶
MEMORY REQUEST FLOW
════════════════════════════════════════════════════════════════════════════════
When worker's allocator needs more carriers:
1. GC completes but still needs more space
2. Worker calls lmm_request_pages(ProcessPool, count)
3. IPC to Lona Memory Manager:
- Message: AllocPages { region: ProcessPool, count: N }
- LMM checks realm quota
- LMM retypes Untyped → Frames
- LMM maps frames into realm's VSpace
- Response: { status: Ok, vaddr: 0x... } or { status: OOM }
4. Worker receives response:
- Success: Create new carrier from mapped pages
- OOM: Return error to process (RuntimeError::OutOfMemory)
This explicit IPC model (vs. page faults) provides:
- Predictable latency
- Graceful OOM handling
- Clear quota enforcement
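The request flow can be modeled end to end. The quota check and vaddr bump below are illustrative stand-ins; the real LMM retypes seL4 Untyped into Frames and maps them into the realm's VSpace before replying:

```rust
// Illustrative model of the worker ↔ LMM AllocPages exchange.
// Message and field names mirror the flow above; addresses, quota
// units, and the 4 KiB page size are assumptions for the model.

enum AllocResponse {
    Ok { vaddr: u64 },
    Oom,
}

struct Lmm {
    quota_pages: usize, // realm's remaining page quota
    next_vaddr: u64,    // next free vaddr in the ProcessPool region
}

impl Lmm {
    /// Handle AllocPages { region: ProcessPool, count }: check the
    /// realm quota, "retype" and "map" the pages (modeled as a vaddr
    /// bump), and reply with the mapped address or an explicit OOM.
    fn alloc_pages(&mut self, count: usize) -> AllocResponse {
        if count > self.quota_pages {
            return AllocResponse::Oom; // graceful OOM, not a fault
        }
        self.quota_pages -= count;
        let vaddr = self.next_vaddr;
        self.next_vaddr += (count as u64) * 4096; // 4 KiB pages
        AllocResponse::Ok { vaddr }
    }
}

fn main() {
    let mut lmm = Lmm { quota_pages: 4, next_vaddr: 0x4000_0000 };
    match lmm.alloc_pages(4) {
        AllocResponse::Ok { vaddr } => assert_eq!(vaddr, 0x4000_0000),
        AllocResponse::Oom => unreachable!(),
    }
    // Quota exhausted: the worker gets an explicit error to surface as
    // RuntimeError::OutOfMemory, instead of an unbounded page fault.
    assert!(matches!(lmm.alloc_pages(1), AllocResponse::Oom));
}
```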
GC Interface¶
Process API¶
PROCESS GC API
════════════════════════════════════════════════════════════════════════════════
struct Process {
    /// Trigger minor GC if needed before allocation
    fn alloc(&mut self, size: usize) -> Option<Vaddr>;

    /// Force minor GC
    fn minor_gc(&mut self) -> GcResult;

    /// Force major GC (fullsweep)
    fn major_gc(&mut self) -> GcResult;

    /// Check if GC should run
    fn needs_gc(&self) -> bool {
        self.htop >= self.stop || self.vbin_size > self.vbin_limit
    }

    /// Get GC statistics
    fn gc_stats(&self) -> GcStats;
}
struct GcStats {
    minor_gcs: u64,
    major_gcs: u64,
    total_reclaimed: u64,
    last_gc_time_us: u64,
    heap_size: usize,
    heap_used: usize,
    old_heap_size: usize,
    old_heap_used: usize,
}

enum GcResult {
    Success { reclaimed: usize },
    HeapGrown { new_size: usize },
    OutOfMemory,
}
Lonala Intrinsics¶
;; Force garbage collection
(garbage-collect) ; Minor GC
(garbage-collect :full) ; Major GC (fullsweep)
;; Get GC statistics for current process
(gc-stats)
;; Returns:
;; %{:minor-gcs 42
;; :major-gcs 3
;; :heap-size 32768
;; :heap-used 12456
;; :old-heap-size 16384
;; :old-heap-used 8192}
;; Configure GC parameters (per-process)
(spawn-opt worker-fn {:min-heap-size 8192
:fullsweep-after 1000})
Differences from BEAM¶
Summary Table¶
| Aspect | BEAM | Lona | Rationale |
|---|---|---|---|
| Memory source | OS (malloc/mmap) | seL4 via LMM IPC | Capability-based allocation |
| Heap allocation | Per-scheduler allocator | Per-worker allocator | Same design, different terminology |
| Page faults | Transparent (OS handles) | Explicit IPC for process pool | seL4 MCS timing; predictable latency |
| Binary heap | Global, reference counted | Per-realm, cross-realm sharing via capabilities | Zero-copy with capability security |
| Term representation | Tagged words (8 bytes) | Tagged words (8 bytes) | Same approach as BEAM |
| Header/forwarding | Header word, inline forward | Header word, inline forward | Same approach as BEAM |
| Heap fragments | Off-heap allocation | Worker allocator | Same concept, different source |
| Inter-process messages | Deep copy | Deep copy | Identical semantics |
| Large binaries | Refc in global heap | Refc in realm heap + cross-realm caps | Zero-copy philosophy preserved |
| Scheduler integration | Reduction counting | Reduction counting | Identical |
Key Adaptations¶
LONA-SPECIFIC ADAPTATIONS
════════════════════════════════════════════════════════════════════════════════
1. EXPLICIT MEMORY REQUESTS
────────────────────────────────────────────────────────────────────────────────
BEAM: Heap growth via OS transparent page faults
Lona: Heap growth via IPC to LMM
Reason: seL4 MCS scheduling + page faults = complex timing.
Explicit IPC provides predictable latency and error handling.
2. REALM-SCOPED BINARY HEAP WITH CROSS-REALM SHARING
────────────────────────────────────────────────────────────────────────────────
BEAM: Single global binary heap for entire VM
Lona: Separate binary heap per realm, with capability-based cross-realm sharing
Mechanism: Binaries CAN be shared across realms via seL4 frame capabilities.
The sender grants read capabilities to the binary's frames; the
receiver maps them into its address space.
Implication: Both intra-realm and inter-realm binary sharing are zero-copy.
Security is maintained through capability-based access control.
3. WORKER ALLOCATOR SOURCE
────────────────────────────────────────────────────────────────────────────────
BEAM: Carriers allocated via OS malloc
Lona: Carriers allocated via LMM IPC (Untyped → Frames)
Same architecture (per-scheduler/worker allocators, carriers, blocks),
different underlying memory source.
4. ISOLATED VSPACES WITH CONTROLLED SHARING
────────────────────────────────────────────────────────────────────────────────
BEAM: Single Erlang VM, all processes share same memory space
Lona: Each realm has isolated VSpace
Consequence: Cross-realm messages require serialization/deserialization
for non-binary data. Large binaries can be shared zero-copy
via capability-based frame sharing.
This provides security isolation while preserving BEAM's zero-copy binary
sharing philosophy through seL4's capability model.
Implementation Considerations¶
GC-Safe Points¶
GC-SAFE POINTS
════════════════════════════════════════════════════════════════════════════════
GC can only occur at safe points where all roots are identifiable:
SAFE POINTS:
- Before/after function calls
- At allocation sites (obvious - allocation triggers GC)
- At message send/receive
- At explicit yield points
NOT SAFE (GC cannot occur):
- Mid-instruction with partial state
- During native function execution
- While constructing compound values
The bytecode interpreter naturally hits safe points between instructions.
Allocation is the primary GC trigger.
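How allocation serves as the GC trigger at a safe point can be sketched with a toy interpreter loop. The opcode set and the "GC reclaims everything" model are illustrative, not Lona's bytecode:

```rust
// Toy interpreter loop: between instructions all roots are
// identifiable, so the allocation site is a legal place to collect.

enum Op {
    PushInt(i64),
    Alloc(usize), // allocate `n` heap words
    Halt,
}

struct Proc {
    heap_used: usize,
    heap_size: usize,
    gcs: u32,
}

impl Proc {
    fn needs_gc(&self, want: usize) -> bool {
        self.heap_used + want > self.heap_size
    }

    fn run(&mut self, code: &[Op]) {
        for op in code {
            match op {
                Op::PushInt(_) => {} // no allocation: no safe-point work
                Op::Alloc(words) => {
                    // Safe point: roots are consistent here, so we may
                    // collect before satisfying the allocation.
                    if self.needs_gc(*words) {
                        self.gcs += 1;
                        self.heap_used = 0; // model: GC reclaims all
                    }
                    self.heap_used += words;
                }
                Op::Halt => return,
            }
        }
    }
}

fn main() {
    let mut p = Proc { heap_used: 0, heap_size: 8, gcs: 0 };
    p.run(&[Op::Alloc(6), Op::PushInt(1), Op::Alloc(6), Op::Halt]);
    assert_eq!(p.gcs, 1); // only the second Alloc crossed the limit
    assert_eq!(p.heap_used, 6);
}
```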
Stack Scanning¶
STACK SCANNING DETAILS
════════════════════════════════════════════════════════════════════════════════
FRAME LAYOUT (reminder):
stop (after ALLOCATE)
┌──────────────────────────────────┐
│ Y(0) (8 bytes) │ ← stop + 0
│ Y(1) (8 bytes) │ ← stop + 8
│ ... │
│ Y(N-1) (8 bytes) │ ← stop + (N-1) × 8
├──────────────────────────────────┤ ← frame_base
│ return_ip (u64) │ + 0
│ chunk_addr (u64) │ + 8
│ caller_frame_base(u64) │ + 16
│ y_count (u64) │ + 24
└──────────────────────────────────┘
Y_REGISTER_SIZE = 8 bytes (size of Term)
FRAME_HEADER_SIZE = 32 bytes
SCANNING ALGORITHM:
fn scan_stack(&mut self) {
    let mut fp = self.frame_base;
    while let Some(frame_ptr) = fp {
        // Read frame header (at frame_ptr + 0 .. + 31)
        let header = read_frame_header(frame_ptr);

        // Scan Y registers. Per the layout above they sit just before
        // the header at lower addresses: Y(y_count-1) at frame_ptr - 8,
        // down to Y(0) at frame_ptr - y_count * Y_REGISTER_SIZE.
        for i in 0..header.y_count {
            let y_addr = frame_ptr - (i + 1) * Y_REGISTER_SIZE;
            let term = read_term(y_addr);
            if let Some(new_term) = self.trace_and_copy(term) {
                write_term(y_addr, new_term);
            }
        }

        // Move to caller's frame
        fp = if header.caller_frame_base != 0 {
            Some(header.caller_frame_base)
        } else {
            None
        };
    }
}
Initialization Safety¶
Y REGISTER INITIALIZATION
════════════════════════════════════════════════════════════════════════════════
Y registers must be GC-safe at all times. Two approaches:
1. ALLOCATE_ZERO (opcode 14):
- Allocates Y registers AND initializes to nil
- Safe for GC at any point
- Slight overhead for initialization
2. ALLOCATE (opcode 13):
- Allocates Y registers, UNINITIALIZED
- Compiler must ensure all Y regs written before any GC-possible operation
- More efficient but requires compiler discipline
Current implementation uses ALLOCATE_ZERO for safety.
Optimization: switch to ALLOCATE once liveness analysis ensures safety.
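A sketch of why ALLOCATE_ZERO is GC-safe: every Y register holds a valid term (nil) from the instant the frame exists, so a stack scan can never read garbage. The tagged-nil encoding below is made up for illustration:

```rust
// Model of ALLOCATE_ZERO: Y registers are nil-filled at frame entry,
// so a GC at any later safe point only ever sees valid terms.

const NIL_TERM: u64 = 0x3B; // illustrative nil immediate, not Lona's real tag

struct Frame {
    y_regs: Vec<u64>, // each slot holds one tagged term
}

/// ALLOCATE_ZERO (opcode 14): reserve `n` Y registers, initialized to
/// nil. Plain ALLOCATE (opcode 13) would leave the slots uninitialized
/// and rely on the compiler writing them before any GC-possible op.
fn allocate_zero(n: usize) -> Frame {
    Frame { y_regs: vec![NIL_TERM; n] }
}

fn main() {
    let frame = allocate_zero(4);
    // Every slot is a valid (nil) term: the stack scanner can walk
    // this frame immediately, before any Y register is written.
    assert!(frame.y_regs.iter().all(|&t| t == NIL_TERM));
    assert_eq!(frame.y_regs.len(), 4);
}
```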
Summary¶
| Aspect | Description |
|---|---|
| Model | Per-process generational semi-space copying collector |
| Generations | Young (frequent) and Old (rare) |
| Algorithm | Cheney's copying algorithm |
| Trigger | Young heap exhaustion (htop >= stop) |
| Minor GC pause | 10-100 µs typical |
| Major GC pause | 100 µs - few ms |
| Heap growth | Fibonacci sequence, then 20% |
| Large binaries | Reference counted in realm binary heap |
| Memory source | Explicit IPC to Lona Memory Manager |
| Allocator | Per-worker, lock-free for common case |
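The heap-growth row can be made concrete. A sketch of the policy, assuming a BEAM-style Fibonacci sequence of sizes below a cutoff; the seed sizes and the cutoff value are assumptions for illustration, not documented constants:

```rust
// Illustrative heap-growth policy: grow along a Fibonacci-like
// sequence of sizes, then by 20% once the heap passes a cutoff.

const FIB_CUTOFF: usize = 131_072; // ~1 MiB of 8-byte words (assumed)

/// Smallest allowed heap size strictly larger than `current`.
fn next_heap_size(current: usize) -> usize {
    if current < FIB_CUTOFF {
        // Walk the Fibonacci sequence until we pass `current`.
        let (mut a, mut b) = (233usize, 377usize); // assumed seed sizes
        while b <= current {
            let next = a + b;
            a = b;
            b = next;
        }
        b
    } else {
        current + current / 5 // grow by 20%
    }
}

fn main() {
    assert_eq!(next_heap_size(233), 377);       // next Fibonacci step
    assert_eq!(next_heap_size(400), 610);       // smallest step above 400
    assert_eq!(next_heap_size(200_000), 240_000); // past cutoff: +20%
}
```

Fibonacci growth keeps small heaps from doubling wastefully, while the 20% rule bounds the over-allocation of very large heaps.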
This design provides soft real-time properties through per-process GC with no global pauses, while adapting BEAM's proven approach to seL4's capability-based memory model.
References¶
- BEAM Garbage Collection (Erlang/OTP Documentation)
- Erlang GC Details (Hamidreza Soleimani)
- The BEAM Book
- Process Model - Lona's process and heap structure
- Realm Memory Layout - Memory regions and allocation
- System Architecture - LMM and IPC