license: public domain CC0.
AI-Driven Game Design Document
1. High-Level Overview
The goal of this project is to create a weird, AI-driven game using a small LLM as the core game engine, paired with AI-generated visual rendering and frame interpolation for smoother gameplay. The game operates at a simulation frame rate of ~10 FPS, while the visual output runs at 20–30 FPS using frame interpolation techniques.
2. Core Game Loop Architecture
2.1 Components
Game State Manager: Manages the simulation state of the game world.
LLM Game Engine: A small language model that updates the game state based on input and current state.
Renderer: Translates the game state into visual output (state→sprite for speed, text→image for creativity).
Frame Interpolator: Smooths the visual output by generating intermediate frames.
Input Handler: Handles player input and aggregates actions for each simulation tick.
2.2 End-to-End Game Loop
Input Polling (e.g., 30–60 Hz): Get player input (e.g., MOVE_LEFT, FIRE).
Engine Update:
Once per simulation tick (~10 Hz), call the LLM engine with the current state and the aggregated player input.
Get the next state and optional description from the model.
State Update:
Update the simulation state with the next state provided by the LLM engine.
Rendering:
Render the current state to a frame (using sprite-based rendering for speed); it reaches the screen in the Display step.
Interpolation:
Use a frame interpolation model to generate intermediate frames between simulation ticks.
Display:
Display the interpolated frames at 20–30 FPS.
3. Core Components: Detailed Breakdown
3.1 Game State Representation
The game state is kept minimal to ensure fast updates and ease of processing by the small LLM. It is serialized in a compact, JSON-like format.
{
  "tick": 123,
  "player": { "x": 4, "y": 1, "hp": 1 },
  "enemies": [
    { "x": 1, "y": 10, "type": "basic" },
    { "x": 2, "y": 10, "type": "basic" }
  ],
  "bullets": [
    { "x": 4, "y": 2, "owner": "player" }
  ],
  "effects": [],
  "rng_seed": 987654
}
tick: Current simulation tick (not the display frame number).
player: Grid coordinates and health of the player.
enemies: List of enemy positions and types.
bullets: Bullets in the game world, each with a position and an owner.
effects: Temporary effects (e.g., explosions, power-ups).
rng_seed: Random seed for deterministic behavior.
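If some randomness is resolved in ordinary code rather than by the model, one way the rng_seed field could give reproducible behavior is to derive a fresh generator from the seed and the tick. This is only a sketch: the helper name tick_rng and the mixing constant are assumptions, not part of the design above.

```python
import random

def tick_rng(state: dict) -> random.Random:
    """Return a generator that depends only on rng_seed and tick (hypothetical helper)."""
    # Mixing the tick into the seed gives every tick its own reproducible stream.
    # The multiplier is an arbitrary choice made for this sketch.
    return random.Random(state["rng_seed"] * 1_000_003 + state["tick"])

state = {"tick": 123, "rng_seed": 987654}
first_draw = tick_rng(state).randint(0, 9)
second_draw = tick_rng(state).randint(0, 9)
print(first_draw == second_draw)  # the same state always produces the same draw
```

Because the generator is rebuilt from the state each time, replaying a saved state reproduces the same random choices without storing any extra generator state.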
3.2 LLM Game Engine
3.2.1 Model Choice
Size: ~1B–3B parameters.
Type: Small instruction-tuned LLM (fine-tuned on state→next-state transitions in grid-based games).
Deployment: Quantized (4–8 bit) to optimize for GPU inference.
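A rough way to sanity-check the model choice is to work out the decoding throughput a ~10 ticks-per-second simulation would need. The sketch below assumes about 120 output tokens per serialized state, which is a guess rather than a measurement, and it ignores prompt processing time.

```python
def required_tokens_per_second(tokens_per_state: int, ticks_per_second: float) -> float:
    """Output tokens the model must generate each second to keep up with the simulation."""
    return tokens_per_state * ticks_per_second

ASSUMED_TOKENS_PER_STATE = 120  # assumption: depends on the tokenizer and on state size
SIM_TICKS_PER_SECOND = 10       # simulation rate from the overview

budget_ms = 1000 / SIM_TICKS_PER_SECOND
needed = required_tokens_per_second(ASSUMED_TOKENS_PER_STATE, SIM_TICKS_PER_SECOND)
print(f"{budget_ms:.0f} ms per tick, {needed:.0f} output tokens/s")
# prints: 100 ms per tick, 1200 output tokens/s
```

Under these assumptions the state has to stay very small, or the simulation rate has to drop, for a 1B–3B model to keep up.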
3.2.2 Prompt Format
System Prompt (Fixed):
You are a game engine for a 2D arcade shooter.
Input: JSON state + player action.
Output: JSON next state with the same schema.
Do not explain. Only output valid JSON.
Per-Frame Prompt:
STATE:
{
  "tick": 123,
  "player": { "x": 4, "y": 1, "hp": 1 },
  "enemies": [{ "x": 1, "y": 10, "type": "basic" }],
  "bullets": [{ "x": 4, "y": 2, "owner": "player" }],
  "effects": [],
  "rng_seed": 987654
}
ACTION:
"FIRE"
NEXT_STATE:
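The prompt layout above could be assembled with a small helper like the one below. The names SYSTEM_PROMPT and build_frame_prompt are hypothetical, and the compact JSON separators are an assumption made here to keep the token count low.

```python
import json

# Fixed system prompt, copied from the text above.
SYSTEM_PROMPT = (
    "You are a game engine for a 2D arcade shooter.\n"
    "Input: JSON state + player action.\n"
    "Output: JSON next state with the same schema.\n"
    "Do not explain. Only output valid JSON."
)

def build_frame_prompt(state: dict, action: str) -> str:
    """Build the per-frame prompt that ends where the model should continue."""
    state_json = json.dumps(state, separators=(",", ":"))  # no spaces, fewer tokens
    action_json = json.dumps(action)                       # adds the surrounding quotes
    return f"STATE:\n{state_json}\nACTION:\n{action_json}\nNEXT_STATE:\n"

example_state = {
    "tick": 123,
    "player": {"x": 4, "y": 1, "hp": 1},
    "enemies": [{"x": 1, "y": 10, "type": "basic"}],
    "bullets": [{"x": 4, "y": 2, "owner": "player"}],
    "effects": [],
    "rng_seed": 987654,
}
prompt = build_frame_prompt(example_state, "FIRE")
print(prompt)
```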
Example Output:
{
  "tick": 124,
  "player": { "x": 4, "y": 1, "hp": 1 },
  "enemies": [{ "x": 1, "y": 9, "type": "basic" }],
  "bullets": [
    { "x": 4, "y": 3, "owner": "player" },
    { "x": 4, "y": 2, "owner": "player" }
  ],
  "effects": [],
  "rng_seed": 987654
}
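A small model will sometimes return malformed or incomplete JSON, so the output probably needs checking before it replaces the current state. The following is a minimal sketch, not a full schema validator: the helper name parse_next_state and the policy of falling back to the previous state are assumptions.

```python
import json

REQUIRED_KEYS = {"tick", "player", "enemies", "bullets", "effects", "rng_seed"}

def parse_next_state(raw: str, previous: dict) -> dict:
    """Return the parsed next state, or the previous state if the output is unusable."""
    text = raw.lstrip()
    try:
        # raw_decode parses the first JSON value and ignores anything after it,
        # such as an optional trailing description.
        candidate, _end = json.JSONDecoder().raw_decode(text)
    except json.JSONDecodeError:
        return previous
    if not isinstance(candidate, dict) or set(candidate) != REQUIRED_KEYS:
        return previous
    if candidate["tick"] != previous["tick"] + 1:
        return previous  # the model skipped or repeated a tick
    return candidate

previous = {"tick": 123, "player": {"x": 4, "y": 1, "hp": 1}, "enemies": [],
            "bullets": [], "effects": [], "rng_seed": 987654}
good_output = json.dumps({**previous, "tick": 124}) + '\nDESCRIPTION:\n"The player waits."'
print(parse_next_state(good_output, previous)["tick"])
print(parse_next_state("Sorry, I cannot do that.", previous)["tick"])
```

Reusing the previous state freezes the game for one tick instead of crashing; retrying the model call is another option if the latency budget allows it.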
Optional Description:
DESCRIPTION:
"The player fires upward. The enemy drifts closer."
3.3 Rendering
For proof of concept, we will use state-to-sprite rendering to ensure speed and stability.
3.3.1 State-to-Sprite Pipeline
Directly translate the structured game state into 2D sprites (player, enemies, bullets).
Apply a simple style filter (e.g., neural style transfer) every N frames for visual flair.
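As a stand-in for the sprite pipeline, the state can be drawn onto a character grid, with one glyph per entity type taking the place of a sprite. The grid size and the glyphs below are assumptions, since the document does not fix either; the y axis is taken to grow upward, matching the example state, where enemies start at high y values.

```python
GRID_W, GRID_H = 8, 12  # assumed grid size
GLYPHS = {"player": "A", "enemy": "V", "bullet": "|"}  # hypothetical stand-ins for sprites

def render_ascii(state: dict) -> str:
    """Draw the state onto a character grid, one glyph per entity."""
    grid = [["." for _ in range(GRID_W)] for _ in range(GRID_H)]

    def put(x: int, y: int, glyph: str) -> None:
        # Silently skip entities that fall outside the grid.
        if 0 <= x < GRID_W and 0 <= y < GRID_H:
            grid[y][x] = glyph

    for enemy in state["enemies"]:
        put(enemy["x"], enemy["y"], GLYPHS["enemy"])
    for bullet in state["bullets"]:
        put(bullet["x"], bullet["y"], GLYPHS["bullet"])
    put(state["player"]["x"], state["player"]["y"], GLYPHS["player"])

    # Row 0 is the bottom of the playfield, so print the highest row first.
    return "\n".join("".join(row) for row in reversed(grid))

demo_state = {
    "player": {"x": 4, "y": 1, "hp": 1},
    "enemies": [{"x": 1, "y": 10, "type": "basic"}, {"x": 2, "y": 10, "type": "basic"}],
    "bullets": [{"x": 4, "y": 2, "owner": "player"}],
}
frame = render_ascii(demo_state)
print(frame)
```

A real renderer would blit images at the same coordinates; the mapping from state fields to draw calls stays the same.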
3.3.2 Text-to-Image (Optional for Later Phases)
Input: A short description generated by the LLM engine.
Output: Low-res, AI-generated image based on the description.
3.4 Frame Interpolation
Simulation FPS: ~10 FPS (from LLM engine).
Display FPS: 20–30 FPS.
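Use a frame interpolation model (e.g., RIFE or FILM) to generate intermediate frames for smooth display. NVIDIA DLSS Frame Generation is a possible alternative, but it is designed to be integrated into a game engine that supplies motion vectors, so it may not fit a pipeline that only has rendered frames.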
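Before wiring in a learned model, the timing can be prototyped by interpolating entity positions directly. This is plain linear interpolation on state coordinates, not what RIFE or FILM do (they work on pixels), and it only handles the player to keep the sketch short. The function names are hypothetical.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def interpolate_positions(prev: dict, curr: dict, steps: int) -> list:
    """Return `steps` in-between player positions, excluding both endpoints."""
    frames = []
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        frames.append({
            "x": lerp(prev["player"]["x"], curr["player"]["x"], t),
            "y": lerp(prev["player"]["y"], curr["player"]["y"], t),
        })
    return frames

SIM_FPS, DISPLAY_FPS = 10, 30
# At 30 FPS display and 10 FPS simulation, two generated frames sit between ticks.
steps = DISPLAY_FPS // SIM_FPS - 1

prev_state = {"player": {"x": 4, "y": 1}}
curr_state = {"player": {"x": 7, "y": 1}}
in_between = interpolate_positions(prev_state, curr_state, steps)
print(in_between)
```

Note that interpolating between two ticks means the display lags the simulation by one tick, since the newer state must exist before the in-between frames can be produced.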
3.5 Input Handling
Input is polled at or above the display rate (e.g., 30–60 Hz), which is several times faster than the simulation tick.
Aggregate input into a single action for each simulation tick, which is then fed into the LLM engine.
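With 60 Hz polling and 10 simulation ticks per second, about six input samples arrive per tick and have to be collapsed into one action. The policy sketched below (any FIRE wins, otherwise the most frequent movement) is an assumption chosen for illustration; the document does not specify one.

```python
from collections import Counter

# Assumed tie-break order for movement actions with equal counts.
PRIORITY = ["FIRE", "MOVE_LEFT", "MOVE_RIGHT", "NONE"]

def aggregate_inputs(polled: list) -> str:
    """Collapse the inputs polled during one simulation tick into a single action."""
    counts = Counter(action for action in polled if action != "NONE")
    if not counts:
        return "NONE"
    if "FIRE" in counts:
        return "FIRE"  # a brief tap on fire should not be lost
    # Otherwise pick the most frequent action; ties go to the earlier PRIORITY entry.
    return max(counts, key=lambda action: (counts[action], -PRIORITY.index(action)))

samples = ["NONE", "MOVE_LEFT", "MOVE_LEFT", "MOVE_RIGHT", "MOVE_LEFT", "NONE"]
print(aggregate_inputs(samples))  # prints: MOVE_LEFT
```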
4. Proof of Concept Implementation
4.1 Initial Setup
Game Engine: Implement the LLM-based engine as a small, quantized model that produces ~10 state updates per second.
Renderer: Create a simple sprite-based renderer using a 2D canvas.
Frame Interpolation: Integrate an existing frame interpolation model (e.g., RIFE or FILM) for smoothing the visual output.
Input Handler: Handle input through key presses (MOVE_LEFT, FIRE, etc.) and translate it into actions.
4.2 Example Code Skeleton
import copy
import random

# Initial game state
state = {
    "tick": 0,
    "player": {"x": 5, "y": 5, "hp": 3},
    "enemies": [{"x": 1, "y": 1, "type": "basic"}],
    "bullets": [],
    "effects": [],
    "rng_seed": 123456,
}

# Simulate an LLM response (mocking for the proof of concept)
def llm_engine(state, action):
    # Deep copy so the nested lists and dicts of the previous state are not mutated
    next_state = copy.deepcopy(state)
    if action == "FIRE":
        next_state["bullets"].append(
            {"x": state["player"]["x"], "y": state["player"]["y"] + 1, "owner": "player"}
        )
    elif action == "MOVE_LEFT":
        next_state["player"]["x"] -= 1
    elif action == "MOVE_RIGHT":
        next_state["player"]["x"] += 1
    next_state["tick"] += 1
    return next_state

# Simple render function (mocking)
def render(state):
    print(f"Player at {state['player']['x']}, {state['player']['y']} | Bullets: {len(state['bullets'])}")

# Frame interpolation (mocked, would use an actual model here)
def interpolate_frames(previous_frame, current_frame):
    return [previous_frame, current_frame]  # Return both frames for now

# Main game loop (mocked)
def game_loop(max_ticks=20):
    global state
    # Seed the action picker from the state so the mocked run is reproducible
    rng = random.Random(state["rng_seed"])
    # The mock never reduces hp, so cap the tick count to guarantee termination
    while state["player"]["hp"] > 0 and state["tick"] < max_ticks:
        action = rng.choice(["FIRE", "MOVE_LEFT", "MOVE_RIGHT", "NONE"])
        next_state = llm_engine(state, action)
        render(next_state)
        # Interpolate between the previous and the new state (mocked)
        interpolate_frames(state, next_state)
        state = next_state

if __name__ == "__main__":
    game_loop()
4.3 Next Steps / Ideas
4.3.1 Game Expansion
Rules Evolution: Add the ability for the game to evolve over time. For example, the LLM engine could dynamically alter enemy behavior or power-ups, which would create interesting "weirdness" as the game progresses.
Procedural Generation: Use the LLM engine to procedurally generate new levels, enemies, and obstacles.
4.3.2 Text-to-Image Rendering
Integrate Text-to-Image: Replace or supplement sprite-based rendering with text-to-image models for more surreal, dynamic visuals.
Style Variability: Implement an option to change the art style using neural filters or style transfer, adding creativity and weirdness to the visuals.
4.3.3 Hybrid Architecture for Game Design
Large Model for Creative Design: Use a larger LLM (not in the real-time loop) to create unique game mechanics and transition data.
Small Model for Real-time Processing: Fine-tune a smaller model to operate in real time, based on the data generated by the larger model.
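The transition data for fine-tuning could be stored as one JSON record per line, pairing the per-frame prompt with the expected next state. The record fields "prompt" and "completion" and the helper name below are assumptions; the actual format depends on the fine-tuning tool that ends up being used.

```python
import json

def to_training_record(state: dict, action: str, next_state: dict) -> str:
    """Serialize one state transition as a single JSON line (hypothetical format)."""
    state_json = json.dumps(state, separators=(",", ":"))
    action_json = json.dumps(action)
    prompt = f"STATE:\n{state_json}\nACTION:\n{action_json}\nNEXT_STATE:\n"
    completion = json.dumps(next_state, separators=(",", ":"))
    return json.dumps({"prompt": prompt, "completion": completion})

before = {"tick": 0, "player": {"x": 5, "y": 5, "hp": 3}, "enemies": [],
          "bullets": [], "effects": [], "rng_seed": 123456}
after = {**before, "tick": 1, "player": {"x": 4, "y": 5, "hp": 3}}
line = to_training_record(before, "MOVE_LEFT", after)
print(line)
```

Keeping the prompt identical to the one used at inference time means the small model is trained on the same text layout it will see in the real-time loop.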
4.3.4 Multiplayer or Co-op Features
Expand the game loop to support multiple players, each with their own actions, states, and interactions.
Explore having the LLM generate interactions between players in real time.
5. Conclusion
This design outlines the integration of a small LLM as the core game engine, providing a proof of concept with a basic game loop using state-to-sprite rendering and frame interpolation. The modular nature of the system allows for easy experimentation, such as switching to text-to-image rendering, adding complex rules, and incorporating hybrid architectures. The game’s unique, AI-driven approach offers potential for endless "weirdness" and emergent gameplay.