September 2, 2025

Real-time World Models Gaming: The Future is Already Running

By

Lisha Li

Genie 2 laid the foundation. The next generation (possibly Genie 3) is expected to bring real-time generation at 24fps. Odyssey streams a new frame roughly every 40 milliseconds. World Labs creates persistent 3D worlds from single images. Real-time world generation is arriving.

What Real-time World Models Actually Mean

Traditional Game Rendering

Load pre-built assets
Execute pre-written physics
Render pre-defined scenes

Real-time World Models

Generate everything as you play
No loading screens because nothing pre-exists
Infinite variation because nothing is pre-defined

The shift is from playing in worlds to worlds creating themselves around you.

The Technical Achievement

What next-gen models (like an anticipated Genie 3) could deliver:

720p or higher at 24fps+
Consistency for extended gameplay sessions
Real-time response to player input
Memory of previous frames for coherence

How It Works:

Player input → World model inference (41ms) →
Frame generation → Display →
Next input processed

At 24fps, you have about 41.7 milliseconds per frame (1000 / 24). Next-generation models will need to finish inference and frame generation inside that budget to be truly playable.

The Latency Challenge Solved

The breakthrough isn't just generation; it's speed:

5-10s: NeRF rendering per frame
2-3s: Stable Diffusion per image
40ms: Genie 2 (playable)
16ms: Gaussian Splatting (smooth)

// This is the target for real-time
function frameTime(fps) {
  return 1000 / fps; // ms
}

frameTime(24);  // ≈ 41.7ms - Cinematic
frameTime(30);  // ≈ 33.3ms - Console standard
frameTime(60);  // ≈ 16.7ms - Smooth gameplay
frameTime(120); // ≈ 8.3ms - Competitive gaming
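The same arithmetic runs in reverse: given how long one generation step takes, what frame rate can it sustain? A minimal sketch follows; `achievableFps` and `fitsBudget` are illustrative helper names, not part of any model's API.

```javascript
// Inverse of the frame-time formula: the highest frame rate
// that a given per-frame latency can sustain.
function achievableFps(latencyMs) {
  if (!(latencyMs > 0)) {
    throw new RangeError('latencyMs must be a positive number');
  }
  return 1000 / latencyMs;
}

// True when one generation step fits inside the frame budget
// for the target frame rate.
function fitsBudget(latencyMs, targetFps) {
  return latencyMs <= 1000 / targetFps;
}

console.log(achievableFps(40));  // 25
console.log(achievableFps(16));  // 62.5
console.log(fitsBudget(40, 24)); // true  (40 <= 41.67)
console.log(fitsBudget(40, 30)); // false (40 > 33.33)
```

A 40ms step clears the 24fps bar with little room to spare, and a multi-second step lands well below one frame per second.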

Current Implementations

Odyssey (May 2025):

Interactive video streaming
Users control exploration
40ms response time
Runs on cloud GPUs

World Labs:

Single image to 3D world
Interactive and modifiable
Physics-accurate
Persistent across sessions

Rosebud + Veo 3:

Video generation with game logic overlay
Playable, not just viewable
Browser-based deployment

The Memory Architecture

Real-time world models need memory to maintain consistency:

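class WorldMemory {
  constructor(model) {
    this.model = model;   // Hypothetical world model exposing generate()
    this.shortTerm = [];  // Last 60 frames (2.5 seconds at 24fps)
    this.longTerm = {};   // Key world facts
    this.spatial = { playerPosition: null }; // 3D understanding
  }

  generateNextFrame(input) {
    // Consider recent history
    const context = this.shortTerm.slice(-30);
    // Maintain spatial consistency
    const position = this.spatial.playerPosition;
    // Generate coherent next frame
    const frame = this.model.generate({ context, position, input });
    // Remember the new frame, keeping only the last 60
    this.shortTerm.push(frame);
    if (this.shortTerm.length > 60) this.shortTerm.shift();
    return frame;
  }
}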

Future models like Genie 3 will need to remember visual details over extended periods, which would be revolutionary for generated worlds. This builds upon concepts explored in our world models overview.

The Multiplayer Problem

Current limitation: World models are single-player.

Why? Each player's actions create divergent worlds:

Player A turns left → Generates left path
Player B turns right → Generates different right path
Worlds diverge → No shared reality

Solutions being explored:

Authoritative world generation (one model, many viewers)
Consensus algorithms (models vote on next frame)
Predetermined seeds (same input = same world)
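The predetermined-seed idea can be sketched with a toy generator. It assumes every client shares the same seed and applies the same ordered input log. `seededRandom` and `generateWorld` are hypothetical stand-ins, and a real neural world model would also need deterministic inference for this to hold.

```javascript
// Small seeded 32-bit PRNG: the same seed always yields the same sequence.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Toy "world generator": every player action reveals one tile.
// All randomness comes from the seeded PRNG, never from Math.random().
function generateWorld(seed, inputs) {
  const rand = seededRandom(seed);
  const tiles = ['grass', 'water', 'rock', 'sand'];
  return inputs.map((action) => ({
    action,
    tile: tiles[Math.floor(rand() * tiles.length)],
  }));
}

// Two clients with the same seed and the same ordered input log.
const sharedInputs = ['left', 'left', 'forward', 'right'];
const worldA = generateWorld(2025, sharedInputs);
const worldB = generateWorld(2025, sharedInputs);

console.log(JSON.stringify(worldA) === JSON.stringify(worldB)); // true
```

The catch is that both clients must see every player's inputs in the same order, so this approach still needs a shared input log.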

Browser-Based Advantage

When real-time world models ship to browsers:

Native Integration:

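// Future browser API (hypothetical: no WorldModel class exists in browsers today)
const worldModel = new WorldModel('genie-4');
const canvas = document.getElementById('game');
const ctx = canvas.getContext('2d');

worldModel.onFrame = (frameData) => {
  // Assumes frameData is an ImageData object
  ctx.putImageData(frameData, 0, 0);
};

worldModel.start({
  prompt: "Cyberpunk racing game",
  fps: 60
});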

WebGPU Acceleration:

Direct GPU access from JavaScript
Tensor operations at near-native speed
No driver installations
Works on any device with a WebGPU-capable browser
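WebGPU support still varies by browser and device, so it is worth checking before loading a model. The sketch below uses the standard `navigator.gpu` entry point. The helper names `hasWebGPU` and `getGpuDevice` are illustrative, and the `nav` parameter exists only so the check can run outside a browser.

```javascript
// True when the given navigator-like object exposes the WebGPU entry point.
function hasWebGPU(nav = globalThis.navigator) {
  return Boolean(nav && nav.gpu);
}

// Resolves to a GPUDevice, or to null when WebGPU is unavailable.
async function getGpuDevice(nav = globalThis.navigator) {
  if (!hasWebGPU(nav)) return null;
  const adapter = await nav.gpu.requestAdapter(); // may resolve to null
  if (!adapter) return null;
  return adapter.requestDevice();
}

getGpuDevice().then((device) => {
  console.log(device ? 'WebGPU device ready' : 'WebGPU not available, use a fallback');
});
```

When the device comes back as null, a game can fall back to cloud streaming or a non-generative renderer.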

The Gaming Revolution

Infinite Content: Every playthrough is unique

// Never the same game twice
startGame("Fantasy RPG with dragons");
// Completely different world each time

Adaptive Difficulty: Worlds reshape based on player skill

if (player.struggling) {
  world.adjust("make slightly easier");
}

Emergent Narratives: Stories that write themselves

world.addContext("Player chose peaceful path");
// World generates accordingly

Performance Optimizations

Making world models game-ready:

Frame Interpolation:

// Generate at 12fps, interpolate to 60fps
const keyframe1 = await model.generate();
const keyframe2 = await model.generate();
// 4 in-between frames per keyframe: 1 keyframe + 4 in-betweens = 5 display frames
const interpolated = interpolate(keyframe1, keyframe2, 4);
// 2 model calls = 10 display frames
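The `interpolate` helper is not defined above. Here is a minimal sketch of one possible version, assuming frames are equal-length pixel arrays. It uses a plain linear cross-fade, the simplest stand-in; production interpolators usually estimate motion rather than blending pixels.

```javascript
// Linear cross-fade between two frames stored as equal-length pixel arrays.
// Returns `count` in-between frames, excluding both keyframes.
function interpolate(frameA, frameB, count) {
  if (frameA.length !== frameB.length) {
    throw new Error('frames must have the same length');
  }
  const frames = [];
  for (let i = 1; i <= count; i++) {
    const t = i / (count + 1); // blend weight, strictly between 0 and 1
    const blended = new Float32Array(frameA.length);
    for (let p = 0; p < frameA.length; p++) {
      blended[p] = frameA[p] * (1 - t) + frameB[p] * t;
    }
    frames.push(blended);
  }
  return frames;
}

const keyA = Float32Array.of(0, 0, 0);
const keyB = Float32Array.of(100, 50, 200);
const inBetween = interpolate(keyA, keyB, 4);

console.log(inBetween.length);         // 4
console.log(Array.from(inBetween[0])); // [ 20, 10, 40 ]
```

One keyframe plus four in-between frames makes five display frames, which is the 12fps to 60fps ratio. The trade-off is that the second keyframe must exist before blending can start, so interpolation adds at least one keyframe interval of latency.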

Predictive Generation:

// Generate probable futures in parallel
const futures = await Promise.all([
  model.predict({action: 'moveLeft'}),
  model.predict({action: 'moveRight'}),
  model.predict({action: 'jump'})
]);
// Instant response when player acts
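For the response to feel instant, the speculative results have to be keyed by action so the right one can be picked when the player acts. A minimal sketch, assuming a hypothetical `predict(action)` function that returns a promise of a frame; `SpeculativeCache` and `fakePredict` are illustrative names.

```javascript
// Holds speculative results keyed by action name.
class SpeculativeCache {
  constructor(predict) {
    this.predict = predict;   // function(action) -> Promise<frame>
    this.pending = new Map(); // action -> Promise<frame>
  }

  // Start generating a frame for each likely action.
  prefetch(actions) {
    this.pending.clear(); // drop predictions made for the previous frame
    for (const action of actions) {
      this.pending.set(action, this.predict(action));
    }
  }

  // Return the prefetched result if there is one, otherwise generate on demand.
  resolve(action) {
    const hit = this.pending.get(action);
    return hit !== undefined ? hit : this.predict(action);
  }
}

// Stand-in for a real model call; it only labels the frame.
let calls = 0;
const fakePredict = async (action) => {
  calls += 1;
  return `frame-after-${action}`;
};

const cache = new SpeculativeCache(fakePredict);
cache.prefetch(['moveLeft', 'moveRight', 'jump']);
cache.resolve('moveLeft').then((frame) => console.log(frame)); // frame-after-moveLeft
```

Speculation multiplies inference cost by the number of branches, so it only pays off when the set of likely actions is small.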

Level-of-Detail:

// Generate detail where player looks
const [highDetail, lowDetail] = await Promise.all([
  model.generate({ area: player.viewFrustum, quality: 'high' }),
  model.generate({ area: 'peripheral', quality: 'fast' })
]);

The Rosebud Integration

We're building the platform for when these models are publicly available:

Today: Composite approach

Veo 3 for cinematics
Procedural generation for gameplay
AI for game logic

Tomorrow: Full world model integration

Genie-style world generation
Real-time at 60fps
Multiplayer support
Browser-native

The Architecture:

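import * as THREE from 'three';

class RosebudWorld {
  constructor() {
    this.renderer = new THREE.WebGLRenderer();
    this.scene = new THREE.Scene();
    this.camera = new THREE.OrthographicCamera(-1, 1, 1, -1, 0, 1);
    this.worldModel = null; // Ready for future models
  }

  async loadWorldModel(modelName) {
    // Hypothetical package layout: each model module default-exports the model object
    const module = await import(`@models/${modelName}`);
    this.worldModel = module.default;
    this.worldModel.onFrame = (frameTexture) => {
      // Assumes each frame arrives as a THREE.Texture.
      // WebGLRenderer.render() takes (scene, camera), so show the frame as the scene background.
      this.scene.background = frameTexture;
      this.renderer.render(this.scene, this.camera);
    };
  }

  start(prompt) {
    if (this.worldModel) {
      // Use world model when available
      this.worldModel.generate(prompt);
    } else {
      // Fall back to current generation
      this.generateTraditional(prompt);
    }
  }

  generateTraditional(prompt) {
    // Placeholder for today's composite pipeline (Veo 3, procedural generation, AI game logic)
  }
}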

What You Can Build

Today with current tech:

Infinite runners with generated levels
RPGs with AI-driven narratives
Puzzle games with unique solutions
Social spaces that evolve

Tomorrow with full world models:

Open worlds that generate as you explore
Stories that adapt to every choice
Multiplayer worlds that surprise everyone
Games that dream themselves into existence

Projected Development Path

Current: Limited preview, research access

Near term: Potential API access expansion

Future potential: Consumer hardware may run local models

Long term vision: Real-time world models could become standard in browsers

Each step makes creation more accessible.

Start Building

The infrastructure is ready. The models are improving rapidly. The future of gaming is real-time generation.

Don't wait for tomorrow's models. Build with today's tools, ready for tomorrow's capabilities.

Join the Revolution
Join the real-time world model revolution at rosebud.ai

About author

Lisha Li

Lisha Li is the CEO and founder of Rosebud. She holds a PhD from UC Berkeley, where her research focused on deep learning for clustering, community detection, and their information-theoretic bounds. She is also an angel investor focused on early stage AI startups. Lisha writes about how to productize frontier AI research, especially focused on making products for builders and creators. She also creates content on the evolution of creative tools, and the broader impact of AI on technology and society.
