Lesson 8: Integration & Ghostty Contribution
Every piece you've built — the shell, the PTY inspector, the VT parser, the GPU text renderer — now wires together into a working terminal emulator. After that, you read production code and contribute back.
toyterm: Integration Architecture
flowchart TD
subgraph toyterm["toyterm"]
subgraph Workers["Worker Threads"]
IO["PTY I/O Thread<br/>read() master fd"]
Parser["vt-parse Thread<br/>byte-by-byte DFA"]
Renderer["Renderer Thread<br/>GPU draw calls"]
end
Grid["Grid State<br/>2D cell array<br/>(contiguous allocation)"]
Main["Main Thread<br/>window event loop<br/>keyboard → PTY write<br/>resize → ioctl + realloc<br/>signal handling"]
end
IO -->|"SPSC queue<br/>(raw bytes)"| Parser
Parser --> Grid
Grid -->|"dirty cell list<br/>(lock-free)"| Renderer
Main -->|"writes keystrokes"| IO
IO -->|"receives output"| Main
Thread assignment
- I/O thread: single read() loop on the master fd. Writes raw bytes to the parser's SPSC queue. Never blocks on parsing or rendering.
- Parser thread: dequeues bytes from the I/O thread, runs the VT state machine, updates the grid. Marks dirty cells. Sends render updates to the render thread.
- Render thread: reads dirty cells, prepares vertex/instance buffers, issues GPU draw calls. Syncs to vsync. Must complete within the frame budget (about 16.7 ms at 60 Hz).
Why three threads, not two: if parsing and rendering shared a thread, a burst of output (e.g., cat bigfile) would block rendering, causing frame drops. If I/O and parsing shared a thread, a complex parse (long OSC string, sixel image) would delay reading from the PTY, so the child's output, including echoed keystrokes, would reach the screen late. Three threads ensure each latency-critical path runs independently.
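The hand-off between the I/O thread and the parser thread can be sketched as a single-producer, single-consumer ring buffer built on C11 atomics. This is a minimal sketch, not toyterm's or Ghostty's actual queue: the names spsc_ring_t, ring_push, ring_pop, and the RING_CAP capacity are all assumptions made for illustration. It shows the acquire/release pairing that lets one thread publish bytes and the other consume them without a mutex.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAP 4096u /* hypothetical capacity; must be a power of two */

typedef struct {
    uint8_t buf[RING_CAP];
    _Atomic size_t head; /* next slot to write; stored only by the producer */
    _Atomic size_t tail; /* next slot to read; stored only by the consumer */
} spsc_ring_t;

/* Producer side (I/O thread). Returns false when the ring is full. */
static bool ring_push(spsc_ring_t *r, uint8_t byte) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* acquire: observe the consumer's latest tail before reusing a slot */
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_CAP)
        return false; /* full */
    r->buf[head & (RING_CAP - 1)] = byte;
    /* release: the byte is written before the new head becomes visible */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side (parser thread). Returns false when the ring is empty. */
static bool ring_pop(spsc_ring_t *r, uint8_t *out) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* acquire: pairs with the producer's release store to head */
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false; /* empty */
    *out = r->buf[tail & (RING_CAP - 1)];
    /* release: the slot is fully read before the producer may overwrite it */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

The indices only ever grow and are masked on access, so full and empty are distinguished without wasting a slot. A real queue would likely move whole read() buffers per operation rather than single bytes; the byte-at-a-time interface here just keeps the ordering logic easy to see.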
The grid data structure
#include <stdint.h> // uint16_t, uint32_t
#include <uchar.h> // char32_t
typedef struct {
char32_t codepoint; // Unicode codepoint
uint32_t fg; // foreground color (RGBA)
uint32_t bg; // background color (RGBA)
uint16_t flags; // bold, italic, underline, inverse, blink, hidden
uint16_t glyph_idx; // index into glyph atlas (populated by renderer)
uint16_t dirty; // set by parser, cleared by renderer
} cell_t;
typedef struct {
cell_t *cells; // rows × cols contiguous array
int rows, cols;
int cursor_row, cursor_col;
int scroll_top, scroll_bottom; // scroll region margins
int saved_cursor_row, saved_cursor_col; // DECSC/DECRC
int *tab_stops; // one per column
sgr_t current_sgr; // current text attributes
cell_t default_cell; // for blank/damaged cells
} grid_t;
Contiguous allocation (cells = calloc(rows * cols, sizeof(cell_t))) is cache-friendly. Row-major order means scrolling (moving rows up/down) can use memmove() on spans of cells.
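The memmove() point above can be sketched as follows. This is an illustrative sketch under stated assumptions: cell_t and grid_t are trimmed-down copies of the structs above so the example compiles on its own, and grid_init, grid_row, and grid_scroll_up are hypothetical helper names. The scroll region bounds are treated as inclusive, 0-based row indices.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Trimmed-down copies of the lesson's structs (assumption, for brevity). */
typedef struct {
    uint32_t codepoint;
    uint16_t dirty;
} cell_t;

typedef struct {
    cell_t *cells;                 /* rows x cols, row-major, contiguous */
    int rows, cols;
    int scroll_top, scroll_bottom; /* inclusive, 0-based */
    cell_t default_cell;
} grid_t;

/* Allocate the grid in one block; the scroll region starts as the full screen. */
static int grid_init(grid_t *g, int rows, int cols) {
    g->cells = calloc((size_t)rows * (size_t)cols, sizeof(cell_t));
    if (!g->cells)
        return -1;
    g->rows = rows;
    g->cols = cols;
    g->scroll_top = 0;
    g->scroll_bottom = rows - 1;
    g->default_cell = (cell_t){ .codepoint = ' ', .dirty = 1 };
    return 0;
}

/* Pointer to the first cell of a row. */
static cell_t *grid_row(grid_t *g, int row) {
    return g->cells + (size_t)row * (size_t)g->cols;
}

/* Scroll the region up by n lines: surviving rows move with one memmove,
 * vacated rows at the bottom are filled with the default cell. */
static void grid_scroll_up(grid_t *g, int n) {
    int height = g->scroll_bottom - g->scroll_top + 1;
    if (n <= 0)
        return;
    if (n > height)
        n = height;
    int kept = height - n;
    if (kept > 0)
        memmove(grid_row(g, g->scroll_top), grid_row(g, g->scroll_top + n),
                (size_t)kept * (size_t)g->cols * sizeof(cell_t));
    for (int r = g->scroll_top; r <= g->scroll_bottom; r++) {
        cell_t *row = grid_row(g, r);
        for (int c = 0; c < g->cols; c++) {
            if (r >= g->scroll_top + kept)
                row[c] = g->default_cell; /* blank the vacated row */
            row[c].dirty = 1;             /* everything in the region moved */
        }
    }
}
```

Because the rows of a scroll region are adjacent in memory, the whole move is a single memmove() call regardless of how many rows survive. memmove() rather than memcpy() is needed because source and destination overlap.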
The main loop (simplified)
int main(void) {
init_pty(); // openpty, fork, setsid, exec shell
init_grid(); // allocate rows×cols cells
init_renderer(); // GPU context, shaders, atlas
init_threads(); // spawn I/O, parser, render threads
while (running) {
handle_events(); // window events, keyboard, resize, signals
}
cleanup();
}
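The resize path inside handle_events() deserves a closer look, because realloc() alone does not preserve a row-major grid when the column count changes: each old row has to be copied to its new offset. The sketch below is one way to do it, under stated assumptions: the structs are trimmed-down copies of the lesson's types, and grid_resize and pty_set_size are hypothetical helper names. The ioctl request TIOCSWINSZ and struct winsize are the standard terminal interfaces.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

/* Trimmed-down copies of the lesson's structs (assumption, for brevity). */
typedef struct {
    uint32_t codepoint;
    uint16_t dirty;
} cell_t;

typedef struct {
    cell_t *cells; /* rows x cols, row-major */
    int rows, cols;
    int cursor_row, cursor_col;
} grid_t;

/* Build a new grid, copy the overlapping top-left rectangle row by row,
 * blank the rest, and clamp the cursor. Returns 0 on success, -1 on failure
 * (in which case the old grid is left untouched). */
static int grid_resize(grid_t *g, int new_rows, int new_cols) {
    if (new_rows <= 0 || new_cols <= 0)
        return -1;
    cell_t *fresh = malloc((size_t)new_rows * (size_t)new_cols * sizeof(cell_t));
    if (!fresh)
        return -1;
    const cell_t blank = { .codepoint = ' ', .dirty = 1 };
    int copy_rows = new_rows < g->rows ? new_rows : g->rows;
    int copy_cols = new_cols < g->cols ? new_cols : g->cols;
    for (int r = 0; r < new_rows; r++) {
        cell_t *dst = fresh + (size_t)r * (size_t)new_cols;
        int c = 0;
        if (r < copy_rows) {
            memcpy(dst, g->cells + (size_t)r * (size_t)g->cols,
                   (size_t)copy_cols * sizeof(cell_t));
            for (; c < copy_cols; c++)
                dst[c].dirty = 1; /* geometry changed: redraw everything */
        }
        for (; c < new_cols; c++)
            dst[c] = blank;
    }
    free(g->cells);
    g->cells = fresh;
    g->rows = new_rows;
    g->cols = new_cols;
    if (g->cursor_row >= new_rows) g->cursor_row = new_rows - 1;
    if (g->cursor_col >= new_cols) g->cursor_col = new_cols - 1;
    return 0;
}

/* Tell the kernel the new size via the PTY master; the kernel then delivers
 * SIGWINCH to the foreground process group. Returns ioctl's result. */
static int pty_set_size(int master_fd, int rows, int cols) {
    struct winsize ws = { .ws_row = (unsigned short)rows,
                          .ws_col = (unsigned short)cols };
    return ioctl(master_fd, TIOCSWINSZ, &ws);
}
```

This sketch keeps the top-left corner of the screen and discards the rest. Real terminals often reflow wrapped lines and pull rows from scrollback instead, which is considerably more involved. It also ignores synchronization with the parser and render threads, which a threaded toyterm would need around the pointer swap.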
What toyterm Should Support (MVP)
| Feature | Requirement |
|---|---|
| PTY I/O | spawn shell, read/write bidirectional |
| VT parser | all 10+ states, CSI/OSC/DCS dispatch |
| Cursor movement | CUU, CUD, CUF, CUB, CUP, CHA, CNL, CPL |
| Erase | ED (display), EL (line), ECH (characters) |
| SGR | bold, italic, underline, inverse, 256 colors, true color (38;2;r;g;b) |
| Scrolling | scroll region (DECSTBM), scroll on LF at bottom |
| Alternate screen | DECSET ?1049h/l |
| Window | resize reallocation, SIGWINCH propagation |
| GPU rendering | instanced quad draw, glyph atlas, FreeType + HarfBuzz |
| Keyboard | send bytes to PTY master, kitty protocol for modifiers |
| Cursor | visible/invisible (DECTCEM), blink |
| Mouse | basic mouse reporting (?1000h) |
| Clipboard | OSC 52 read (with confirmation) and write |
Stretch goals: sixel graphics, ligature support, IME (input method editor) for CJK, synchronized output (DECSET ?2026h), terminal notifications (OSC 9, OSC 777).
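To make the SGR row of the table concrete, here is a minimal sketch of applying already-parsed CSI parameters to the current text attributes. Everything named here is an assumption for illustration: this sgr_t layout, the flag constants, the default colors, and the sgr_apply and palette256 helpers are not taken from toyterm or Ghostty. It handles only the semicolon-separated forms (38;2;r;g;b and 38;5;n), not the colon-separated variants, and the first 16 palette entries are a rough approximation since real terminals take them from the theme.

```c
#include <stdint.h>

enum { SGR_BOLD = 1 << 0, SGR_ITALIC = 1 << 1,
       SGR_UNDERLINE = 1 << 2, SGR_INVERSE = 1 << 3 };

/* Hypothetical attribute state; colors are packed as RGBA. */
typedef struct {
    uint32_t fg, bg;
    uint16_t flags;
} sgr_t;

#define DEFAULT_FG 0xD0D0D0FFu /* assumed defaults */
#define DEFAULT_BG 0x000000FFu

static uint32_t rgba(int r, int g, int b) {
    return ((uint32_t)(r & 0xFF) << 24) | ((uint32_t)(g & 0xFF) << 16) |
           ((uint32_t)(b & 0xFF) << 8) | 0xFFu;
}

/* 256-color lookup: 16..231 is a 6x6x6 cube, 232..255 a gray ramp. */
static uint32_t palette256(int i) {
    if (i < 16) { /* approximation; real values are theme-defined */
        int v = i < 8 ? 0xCD : 0xFF;
        return rgba(i & 1 ? v : 0, i & 2 ? v : 0, i & 4 ? v : 0);
    }
    if (i < 232) {
        int c = i - 16;
        int r = c / 36, g = (c / 6) % 6, b = c % 6;
        return rgba(r ? 55 + 40 * r : 0, g ? 55 + 40 * g : 0, b ? 55 + 40 * b : 0);
    }
    int gray = 8 + 10 * (i - 232);
    return rgba(gray, gray, gray);
}

/* Apply the n numeric parameters of one "CSI ... m" sequence. */
static void sgr_apply(sgr_t *s, const int *p, int n) {
    if (n == 0) { /* "CSI m" means "CSI 0 m" */
        *s = (sgr_t){ DEFAULT_FG, DEFAULT_BG, 0 };
        return;
    }
    for (int i = 0; i < n; i++) {
        switch (p[i]) {
        case 0: *s = (sgr_t){ DEFAULT_FG, DEFAULT_BG, 0 }; break;
        case 1: s->flags |= SGR_BOLD; break;
        case 3: s->flags |= SGR_ITALIC; break;
        case 4: s->flags |= SGR_UNDERLINE; break;
        case 7: s->flags |= SGR_INVERSE; break;
        case 39: s->fg = DEFAULT_FG; break;
        case 49: s->bg = DEFAULT_BG; break;
        case 38:
        case 48: {
            uint32_t *dst = (p[i] == 38) ? &s->fg : &s->bg;
            if (i + 4 < n && p[i + 1] == 2) {        /* true color */
                *dst = rgba(p[i + 2], p[i + 3], p[i + 4]);
                i += 4;
            } else if (i + 2 < n && p[i + 1] == 5) { /* 256-color index */
                *dst = palette256(p[i + 2] & 0xFF);
                i += 2;
            } else {
                i = n; /* malformed: ignore the rest of the sequence */
            }
            break;
        }
        default: break; /* unsupported attribute: skip */
        }
    }
}
```

For example, the sequence ESC [ 1 ; 38 ; 2 ; 255 ; 128 ; 0 m arrives here as the parameter list {1, 38, 2, 255, 128, 0} and sets bold plus an orange foreground. The extended color forms consume several parameters at once, which is why the loop index is advanced inside the switch.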
Reading Ghostty Source
After building toyterm, read the real implementation. The structure will be familiar because you've implemented each piece yourself.
Key source files to read
| File | What it does | What to compare to your implementation |
|---|---|---|
| src/main.zig | Entry point, initialization, config loading | Your main() |
| src/Surface.zig | Per-tab surface: owns the grid, PTY, and renderer | Your main loop + grid |
| src/termio/Exec.zig | Subprocess creation (fork, setsid, dup2, exec) | Your init_pty() |
| src/termio/Termio.zig | I/O thread: master fd read/write, signal handling | Your I/O thread |
| src/termio/StreamHandler.zig | VT parser DFA | Your vt-parse |
| src/terminal/Terminal.zig | Grid state, cursor, margins, SGR, scrolling | Your grid |
| src/termio/mailbox.zig | Lock-free SPSC queue | Your SPSC queue |
| src/renderer/Metal.zig | Metal backend: shader compilation, buffer mgmt, draw calls | Your renderer |
| src/font/ | Font discovery, atlas management, shaping | Your FreeType/HarfBuzz integration |
| src/input/ | Keyboard/mouse input handling, kitty protocol encoding | Your keyboard handler |
How to read Ghostty effectively
- Read the data structures first. In Zig, structs are at the top of each file (or in a types file). Understand the shape of the data before tracing control flow.
- Follow the byte. Start at a keypress in the input handler, trace through the I/O write to PTY, the child process output, the I/O read, the parser, the grid update, and the render dispatch.
- Read tests. Ghostty has extensive unit and integration tests. Tests are executable documentation — they show expected behavior for edge cases.
- Use the debugger. Build Ghostty in debug mode, set breakpoints in the parser, and observe the state machine transitions in real time while you type.
Building Ghostty
# macOS
git clone https://github.com/ghostty-org/ghostty.git
cd ghostty
zig build -Doptimize=Debug
# Linux (requires GTK4, libadwaita, and GPU drivers)
zig build -Doptimize=Debug
Run the test suite:
zig build test
The test suite exercises the VT parser, terminal state machine, font shaping, and rendering. Passing tests is the prerequisite for contributing.
Contributing to Ghostty
Finding a first issue
- Check GitHub issues labeled good first issue or contributions-welcome
- Start with libghostty-vt, the VT parser library extracted from Ghostty. Parser fixes are well-scoped and testable.
- Documentation improvements are always welcome and teach you the codebase
- Config options (new configuration knobs) are straightforward: add the config key, parse it, use it in the relevant module
Contribution workflow
- Fork the repository
- Create a branch for your change
- Write tests that fail before your change and pass after
- Implement the fix or feature
- Run the full test suite (zig build test)
- Run zig fmt to format your code
- Commit with a descriptive message
- Open a PR with a clear description, linked to the issue
What makes a good Ghostty PR
- Small scope. A PR should do one thing. Fix one bug. Add one config option. Improve one parser transition.
- Tests included. Every behavioral change needs a test. The parser tests in test/ are straightforward to extend.
- Follows existing patterns. Read the surrounding code. Match the style, naming, and structure. Ghostty uses explicit error handling and Zig's comptime features heavily.
- Performance-conscious. Terminal emulators are latency-sensitive. Avoid allocations in hot paths. Use arena allocators for per-frame data. Batch GPU updates.
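The "arena allocators for per-frame data" advice can be illustrated with a small bump allocator. Ghostty itself is written in Zig and uses Zig's allocator interfaces, so the C below is only a sketch of the idea, with hypothetical names (arena_t, arena_init, arena_alloc, arena_reset): allocate by advancing an offset, and release everything at once at the end of the frame.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A fixed-capacity bump arena: one malloc up front, none per frame. */
typedef struct {
    uint8_t *base;
    size_t cap;
    size_t used;
} arena_t;

static int arena_init(arena_t *a, size_t cap) {
    a->base = malloc(cap);
    a->cap = cap;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out `size` bytes aligned to `align` (a power of two no larger than
 * malloc's own alignment). Returns NULL when the arena is exhausted. */
static void *arena_alloc(arena_t *a, size_t size, size_t align) {
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* End of frame: drop every allocation in O(1) without touching the heap. */
static void arena_reset(arena_t *a) {
    a->used = 0;
}

static void arena_destroy(arena_t *a) {
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

A render loop would call arena_alloc() for scratch data such as the frame's instance list and arena_reset() after submitting the draw calls, so the hot path never reaches the general-purpose allocator.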
Common first-PR areas
| Area | Example PRs | Difficulty |
|---|---|---|
| VT parser fix | Handle a malformed escape sequence correctly, fix an edge case in SGR parameter parsing | Medium |
| Terminfo entry | Add a missing terminal capability | Easy |
| Config option | Expose a tunable (cursor style, scrollback size, color palette) | Easy-Medium |
| Documentation | Improve architecture docs, add examples for config options | Easy |
| Test coverage | Add tests for untested parser states or terminal operations | Medium |
| Sixel improvement | Fix rendering artifacts in sixel images | Hard |
The Path Forward
You've rebuilt the computer from electrons to a GPU-accelerated terminal. You can explain:
- How a transistor becomes a NAND gate becomes an ALU
- Why fork() is fast (copy-on-write) and how it works at the page table level
- What happens between pressing a key and the character appearing on screen
- How a lock-free SPSC queue uses acquire/release ordering to send data between threads without a mutex
- How FreeType rasterizes glyphs and HarfBuzz shapes text for GPU rendering
- Why Ghostty's architecture (three threads, SPSC queues, instanced rendering) suits a terminal emulator's latency requirements
From here, the knowledge transfers in two directions:
- Deeper into terminals: contribute to Ghostty, implement missing protocols (kitty graphics protocol, synchronized output), or build terminal-adjacent tools (terminal multiplexers, TUIs)
- Wider into systems: the concepts transfer directly — vLLM's KV-cache is a slab allocator, CUDA kernels follow GPU architecture rules, kernel hacking requires understanding page tables and interrupts, database storage engines are memory managers with persistence
The meta-lesson: a terminal emulator is not a special snowflake. It's a systems problem. Understanding it requires understanding the hardware (CPU, cache, GPU), the kernel interface (processes, PTYs, signals), memory management, concurrency, parsing, and rendering. Master the terminal and you've mastered a large chunk of systems engineering.
Self-Check
Can you:
- Draw the full toyterm architecture (threads, data structures, data flow)
- Explain why three threads are necessary and what each one does
- Write the main loop pseudocode: init PTY, grid, renderer, threads → event loop → cleanup
- Read src/termio/Exec.zig and explain each system call it makes and why
- Build Ghostty from source and run the test suite
- Identify a good-first-issue and describe your approach before writing code
- Trace a byte from PTY master to pixel on screen through the full Ghostty stack