Eight AI coding models were given the same Three.js game brief, the same generated assets, the same agentic workflow, and a 60-minute ceiling. Six shipped a playable build. Two timed out. None crashed.
The brief was Combat Arcade Racer: a street racer with combat power-ups, urban circuits, and arcade physics. Nothing is playable unless the render loop, input handler, collision detection, and power-up logic all work.
Results
Six of eight models passed. Claude Opus 4 finished in 22 minutes and 19 seconds with 1,820 lines and zero debug iterations. Claude Sonnet 4 passed in 36 minutes with 9,225 lines. GLM-5.1 passed in 48 minutes with 2,150 lines across 5 debug iterations. The slowest passing run took 79 minutes.
Two models timed out at the 60-minute ceiling. Both were still iterating in the debug loop when the clock stopped.
Methodology
We gave eight AI coding models the same Three.js game brief. Concept art, 3D models, and audio were generated once and reused across all runs. The agentic workflow was held constant. The only variable was the coding model under test.
Each run had a 60-minute wall-clock ceiling. A run counted as a pass if the output ran in a browser, accepted input, and did not crash within 30 seconds.
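The pass criterion above can be written as a small predicate. This is a minimal sketch, not the harness used for the runs: the `RunObservation` shape, its field names, and the `isPass` function are all hypothetical, and treating a crash at exactly 30 seconds as a failure is an assumption about how "within 30 seconds" was read.

```typescript
// Hypothetical record of what was observed when a build was opened.
// Field names are illustrative and not taken from the actual test harness.
interface RunObservation {
  ranInBrowser: boolean;            // the output loaded and ran in a browser
  acceptedInput: boolean;           // the game responded to player input
  secondsUntilCrash: number | null; // null means no crash was observed
}

// The 30-second window comes from the stated pass criterion.
const CRASH_WINDOW_SECONDS = 30;

// A run passes only if all three conditions hold.
// Assumption: a crash at exactly 30 seconds still counts as "within" the window.
function isPass(obs: RunObservation): boolean {
  const survivedWindow =
    obs.secondsUntilCrash === null ||
    obs.secondsUntilCrash > CRASH_WINDOW_SECONDS;
  return obs.ranInBrowser && obs.acceptedInput && survivedWindow;
}
```

Under this reading, a build that renders and takes input but crashes 12 seconds in is a fail, while one that crashes after 45 seconds still passes.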
Because the brief was a combat racer, a passing build needed a working render loop, input handler, collision system, and power-up logic all in the same build.
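To make those four requirements concrete, here is a rough sketch of how they fit into one update step. It is not code from any of the tested builds: every type, constant, and function name is hypothetical, the physics is deliberately crude, and Three.js is left out so the block runs on its own.

```typescript
// Illustrative only: all names and constants below are assumptions.
interface Box { x: number; z: number; halfW: number; halfD: number; }
interface Car extends Box { speed: number; boostSeconds: number; }
interface PowerUp extends Box { boostSeconds: number; collected: boolean; }
interface InputState { throttle: boolean; left: boolean; right: boolean; }

const ACCEL = 20;       // units/s^2 while the throttle is held
const DRAG = 5;         // units/s^2 of slowdown when coasting
const STEER = 8;        // lateral units/s while steering
const BOOST_MULT = 1.5; // speed multiplier while a boost is active

// Collision: axis-aligned box overlap on the ground plane (x/z).
function overlaps(a: Box, b: Box): boolean {
  return Math.abs(a.x - b.x) <= a.halfW + b.halfW &&
         Math.abs(a.z - b.z) <= a.halfD + b.halfD;
}

// One simulation step: read input, move the car, resolve power-up pickups.
function step(car: Car, input: InputState, powerUps: PowerUp[], dt: number): void {
  const accel = input.throttle ? ACCEL : -DRAG;
  car.speed = Math.max(0, car.speed + accel * dt);

  const steer = (input.right ? 1 : 0) - (input.left ? 1 : 0);
  const mult = car.boostSeconds > 0 ? BOOST_MULT : 1;
  car.x += steer * STEER * dt;
  car.z += car.speed * mult * dt;
  car.boostSeconds = Math.max(0, car.boostSeconds - dt);

  // Power-up logic: the first overlap collects the item and starts a boost.
  for (const p of powerUps) {
    if (!p.collected && overlaps(car, p)) {
      p.collected = true;
      car.boostSeconds = p.boostSeconds;
    }
  }
}

// Stand-in for the render loop: update, then hand the state to a renderer.
function runFrames(
  car: Car, input: InputState, powerUps: PowerUp[],
  frames: number, dt: number, render: (car: Car) => void
): void {
  for (let i = 0; i < frames; i++) {
    step(car, input, powerUps, dt);
    render(car);
  }
}
```

In a browser build the loop would more likely be driven by `requestAnimationFrame`, with the `render` callback copying the car state onto a mesh and calling `renderer.render(scene, camera)`. If any of the four pieces is missing or broken, the result is a scene that draws but cannot be played.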
Observations
Wall-clock time among passing models ranged from 22 to 79 minutes. Debug iteration counts among passing models ranged from 0 to 7. High debug counts did not prevent a pass.
Line counts varied by roughly 9x across models given the same brief. The largest passing codebase came from a model that also ran four full debug cycles. The smallest passing build was under 2,000 lines.
Both failing models timed out during active debug iteration rather than crashing. A longer ceiling might have converted one or both to passes.
If you're specifically interested in how these models handle Three.js scene graphs, render loops, and asset loading, see our detailed breakdown: Best AI for Three.js development.