
Porting Wesnoth with Amnesiac Agents

March 19, 2026
TLDR: We didn't have a WebAssembly specialist, but we ported Wesnoth—a 20-year-old, million-line C++ strategy game—to the browser anyway using AI agents. It turned out the hardest engineering problem wasn't Emscripten or C++, but building a state-preserving harness so the agents wouldn't lose their minds and repeat the same mistakes across hundreds of build experiments.

Context

Going into this, I braced myself for the worst: wrestling with C++ linker errors and deciphering browser APIs that refuse to behave as documented. What I didn't expect was that the primary bottleneck would be structuring work that runs for days.

Here is the thing about LLM agents: they can be incredibly capable, but they have the memory of a goldfish. How do you run hundreds of compile-and-test loops without an agent completely losing track of what it has already tried?

Before the C++: Cross-Compiling the Dependency Graph

Before we touched a single line of Wesnoth's own code, we had to compile its ten C and C++ dependencies to WebAssembly.

Wesnoth uses vcpkg as its package manager. vcpkg has build recipes (“portfiles”) for all of Wesnoth's dependencies: glib, cairo, pango, freetype, brotli, zlib, and others. The problem is that those recipes assume a native target. They have no idea what to do with emcc, Emscripten's compiler wrapper.

The fix was using vcpkg overlay ports: a parallel set of recipes overriding the upstream ones specifically for Emscripten. Each dependency required its own surgery:

Across ten libraries, that came to about 60 files of portfiles and patches. None of it shows up in the four source files I'll mention later. The C++ code is thin; the build infrastructure is not.

~100 Flag Combinations. About 30 Were Duplicates.

Emscripten exposes a lot of knobs: threading models, async patterns, filesystem backends, audio routing. For a codebase as tangled as Wesnoth, there's no Stack Overflow answer for “how to port a 20-year-old SDL2 game with Boost and OpenSSL.” You just have to search.

The problem is that agents are terrible at searching across multiple sessions. An agent would eagerly try a set of build flags, hit a failure, exhaust its context window, and stop. The next invocation would start fresh—often re-trying a combination we'd already eliminated two runs ago.

Across roughly a hundred flag combinations, about a third were duplicates. The agents weren't doing anything wrong; they were reasoning correctly from the information they had. You can have the smartest agents in the world, but if state doesn't persist between runs, they keep paying for failures you've already seen.
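Part of what makes a repeat hard to spot is that the same combination can be written in many orders. Here is a minimal sketch of reducing a flag set to a stable fingerprint so a repeat becomes detectable. The function names are hypothetical, the flags are illustrative rather than the ones from the port, and it assumes flag order doesn't change the build, which isn't true for every compiler option.

```python
import hashlib

def fingerprint(flags):
    """Return a short, stable ID for a flag set, ignoring order and repeats."""
    canonical = sorted(set(f.strip() for f in flags if f.strip()))
    digest = hashlib.sha256("\n".join(canonical).encode("utf-8")).hexdigest()
    return digest[:12]

def is_duplicate(flags, seen):
    """True if an equivalent flag set was already tried; otherwise record it."""
    fp = fingerprint(flags)
    if fp in seen:
        return True
    seen.add(fp)
    return False

seen = set()
first = ["-sASYNCIFY=2", "-pthread", "-O2"]
retry = ["-O2", "-sASYNCIFY=2", "-pthread"]  # same combination, different order
print(is_duplicate(first, seen))  # prints False
print(is_duplicate(retry, seen))  # prints True
```

In this toy version the set of seen fingerprints lives in memory. The point of the next section is that it has to live on disk, where the next agent can read it.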

Building the Experiment Harness

The fix was conceptually simple: every experiment's results needed to be readable by the next agent, completely bypassing the context window limit.

We built a lightweight harness around the build process. Every run gets a named experiment ID (E71, E72) and a short hypothesis. The active configuration lives in a current_experiment.env file that any agent reads at startup. When a build completes, the harness dumps a bundle to disk: resolved build flags, git commit, artifact hashes, and the diff from the previous experiment.
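As a rough sketch of that bundle step (not our actual harness code), something like the following would do. The env keys (EXPERIMENT_ID, HYPOTHESIS, BUILD_FLAGS), the JSON field names, and the one-file-per-experiment layout are all assumptions for illustration. The git commit and the diff are passed in as strings rather than computed here.

```python
import hashlib
import json
from pathlib import Path

def read_env(path):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip().strip("\"'")
    return config

def sha256_file(path):
    """Hash an artifact so later runs can tell whether the output changed."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def write_bundle(env_path, artifacts, git_commit, diff_from_previous, out_dir):
    """Write one JSON bundle per experiment and return its path."""
    config = read_env(env_path)
    bundle = {
        "experiment_id": config["EXPERIMENT_ID"],  # hypothetical key
        "hypothesis": config.get("HYPOTHESIS", ""),  # hypothetical key
        "build_flags": config.get("BUILD_FLAGS", "").split(),  # hypothetical key
        "git_commit": git_commit,
        "artifact_hashes": {Path(a).name: sha256_file(a) for a in artifacts},
        "diff_from_previous": diff_from_previous,
    }
    out = Path(out_dir) / f"{bundle['experiment_id']}.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(bundle, indent=2, sort_keys=True))
    return out
```

Writing plain JSON files keyed by experiment ID means a fresh agent can list the directory and read every earlier result without any of it having to fit in its context window.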

Then, a Playwright test run records its findings into the same bundle. Each run reduces to something brutally simple:

What the Harness Made Possible

With the harness in place, the port converged through five named experiments:

E71 — bypass SDL2_mixer for music, routing through Web Audio AudioBufferSourceNodes

E72 — same approach for sound effects

E73 — IDBFS save persistence (mounting /userdata on IndexedDB)

E74 — WebSocket add-on networking

E75 — full multiplayer over a WebSocket-to-TCP proxy

The hardest problem to track down was the JSPI freeze. The browser tab would lock up intermittently when the C++ main loop ran too long without yielding to the browser. The fix turned out to be a single emscripten_sleep(0) call at the top of every frame: not an actual pause, just a yield.

Out of a million lines of C++, exactly four source files handle all browser-specific I/O. The rest of the game is untouched.

Takeaways & What Transferred

The Wesnoth port took roughly a week of active iteration. The following week, we started on other C++ games.

The harness transferred almost entirely. The experiment ID structure, the Playwright classifications, the bundle format—all of it applied with minimal changes. The first port is expensive because you're building the workflow while doing the work. Each one after that starts with the workflow already in place.

Current state: Desktop Chromium only (JSPI is behind a feature flag), ~18-20 second load time, touch is best-effort.