Learns by itself. Navigates on its own.
An MCP Server where Claude Code autonomously plays Pokemon Red.
graph LR
CC["Claude Code<br/>Screen Recognition + Decision"] -->|"stdio"| MCP["MCP Server<br/>gameboy_mcp_server.py"]
MCP -->|"TCP :9876"| EMU["emulator.py<br/>TCP Client"]
EMU -->|"TCP"| PB["pyboy_server.py<br/>PyBoy 60fps + SDL2"]
PB -->|"JSON"| EMU
PB -->|"mmap<br/>/tmp/pyboy_screen.shm"| API["game_state_api.py<br/>SSE Streaming"]
EMU -->|"Response"| MCP
MCP -->|"stdio"| CC
style CC fill:#1a0a0a,stroke:#f8d030,color:#f8d030
style MCP fill:#3a1010,stroke:#e03030,color:#f0e8e0
style EMU fill:#3a1010,stroke:#e03030,color:#f0e8e0
style PB fill:#3a1010,stroke:#e03030,color:#f0e8e0
style API fill:#3a1010,stroke:#e03030,color:#f0e8e0
Claude loops autonomously. The streaming overlay reads the screen directly from shared memory (mmap) instead of going through TCP.
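A minimal sketch of the shared-memory handoff, assuming one process writes an encoded frame into a memory-mapped file and another reads it back. The 4-byte length header, the fixed capacity, and the helper names are assumptions for illustration. The source only states that pyboy_server.py publishes PNG data at /tmp/pyboy_screen.shm and that game_state_api.py reads it via mmap. The demo uses a temporary file rather than the real path.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<I")  # hypothetical 4-byte little-endian payload length
CAPACITY = 64 * 1024          # hypothetical fixed size of the shared file


def create_shared_file(path: str) -> None:
    """Pre-size the backing file so writer and reader map the same region."""
    with open(path, "wb") as f:
        f.truncate(CAPACITY)


def write_frame(path: str, payload: bytes) -> None:
    """Writer side: copy the frame bytes first, then publish their length."""
    if HEADER.size + len(payload) > CAPACITY:
        raise ValueError("frame larger than shared region")
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), CAPACITY) as shm:
        shm[HEADER.size:HEADER.size + len(payload)] = payload
        shm[:HEADER.size] = HEADER.pack(len(payload))


def read_frame(path: str) -> bytes:
    """Reader side: read the length header, then copy out that many bytes."""
    with open(path, "rb") as f, mmap.mmap(
        f.fileno(), CAPACITY, access=mmap.ACCESS_READ
    ) as shm:
        (length,) = HEADER.unpack(shm[:HEADER.size])
        return bytes(shm[HEADER.size:HEADER.size + length])


# Demo with a stand-in payload (not a real PNG).
demo_path = os.path.join(tempfile.mkdtemp(), "screen.shm")
create_shared_file(demo_path)
write_frame(demo_path, b"\x89PNG fake frame")
frame = read_frame(demo_path)
```

This sketch has no locking, so a reader could observe a half-written frame. A real implementation would need some form of synchronization or double buffering.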
Boot sequence from Claude Code startup to gameplay. The PyBoy server runs as a separate process and connects via TCP.
Launch pyboy_server.py in a separate terminal. An SDL2 window opens and starts listening on TCP :9876.
Opening Claude Code in this directory auto-detects .mcp.json. The MCP server starts as a child process.
When Claude Code calls load_rom, emulator.py connects to pyboy_server.py via TCP. JP/EN is auto-detected from the ROM header.
From here, operations flow through Claude Code → MCP (stdio) → TCP → PyBoy server. Real-time display via the SDL2 window.
game_state_api.py reads the screen directly from shared memory (mmap), then streams to the browser via SSE. Pokemon Red Theme in 16:9.
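The TCP hop in the boot sequence above can be sketched as a one-shot JSON request/response. The newline-delimited framing, the command field names (`cmd`, `button`), and the stub server are all assumptions. The source only says that emulator.py is a TCP client and that pyboy_server.py answers with JSON. The stub binds to an ephemeral port so it does not collide with :9876.

```python
import json
import socket
import socketserver
import threading


def send_command(host: str, port: int, command: dict, timeout: float = 5.0) -> dict:
    """Send one JSON command and wait for one JSON reply (newline-delimited, assumed)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps(command).encode("utf-8") + b"\n")
        with sock.makefile("rb") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("server closed the connection without replying")
    return json.loads(line)


class StubEmulatorHandler(socketserver.StreamRequestHandler):
    """Stand-in for pyboy_server.py: echoes the command name with a fake scene."""

    def handle(self) -> None:
        request = json.loads(self.rfile.readline())
        reply = {"ok": True, "cmd": request.get("cmd"), "scene": "overworld"}
        self.wfile.write(json.dumps(reply).encode("utf-8") + b"\n")


# Start the stub on a free port, send one command, then shut down.
server = socketserver.TCPServer(("127.0.0.1", 0), StubEmulatorHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
response = send_command(
    host, port, {"cmd": "press_button", "button": "a", "wait_frames": 10}
)
server.shutdown()
server.server_close()
```

Keeping the emulator behind a socket is what lets the SDL2 window live in its own process while the MCP server stays a plain stdio child of Claude Code.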
graph TD
subgraph parent["Claude Code (Parent Process)"]
CC[Claude Code]
end
CC -->|"stdio (JSON-RPC)"| MCP
MCP -->|"stdio Response"| CC
subgraph child["Child Process: gameboy_mcp_server.py"]
MCP["FastMCP<br/>gameboy_mcp_server.py"]
MCP -->|"Function Call"| EMU["emulator.py<br/>(TCP Client)"]
EMU -->|"Response (JSON)"| MCP
end
EMU -->|"TCP :9876 Send Command"| SRV
SRV -->|"TCP Response (JSON)"| EMU
subgraph server["Separate Process: pyboy_server.py"]
SRV["PyBoy TCP Server<br/>PNG Generation → mmap"]
SRV --- PB["PyBoy Emulator<br/>60fps + SDL2 Display"]
end
SRV -->|"mmap<br/>/tmp/pyboy_screen.shm"| API
subgraph optional["Optional: game_state_api.py"]
API["FastAPI + SSE<br/>Stream Overlay"]
end
API -->|"Imports emulator.py<br/>Gets state via TCP :9876"| EMU2["emulator.py<br/>(Shared TCP Connection)"]
EMU2 -->|"TCP :9876"| SRV
CC -->|"Auto-accumulates movement experience"| NAV["nav_memory.py<br/>Self-Learning Nav"]
CC -->|"Records play experience"| MEM["CLAUDE.md / Memory<br/>Self-Improvement Rules"]
style parent fill:#1a0a0a,stroke:#e03030,stroke-width:2px,color:#f0e8e0
style child fill:#2a1010,stroke:#e03030,stroke-width:2px,color:#f0e8e0
style server fill:#2a1010,stroke:#f8d030,stroke-width:2px,color:#f0e8e0
style optional fill:#1a0a0a,stroke:#8a7a70,stroke-width:1px,stroke-dasharray:5,color:#f0e8e0
style CC fill:#3a1010,stroke:#f8d030,color:#f8d030
style MCP fill:#3a1010,stroke:#e03030,color:#f0e8e0
style EMU fill:#3a1010,stroke:#e03030,color:#f0e8e0
style SRV fill:#4a1515,stroke:#f8d030,color:#f8d030
style PB fill:#4a1515,stroke:#f8d030,color:#f8d030
style API fill:#2a1515,stroke:#8a7a70,color:#8a7a70
style EMU2 fill:#2a1515,stroke:#8a7a70,color:#8a7a70
style NAV fill:#1a2010,stroke:#48d848,color:#48d848
style MEM fill:#1a2010,stroke:#48d848,color:#48d848
* Bidirectional arrows represent request/response flows / Green nodes represent self-improvement (map learning + rule accumulation)
PyBoy runs as a separate process with an SDL2 window. The MCP server controls it remotely via TCP.
Auto-detects Japanese or English ROM from the header. Automatically switches character tables.
After each button press, the server waits for the specified wait_frames and then returns the screen automatically, so no separate screen request is needed. This cuts the tool call count in half.
PyBoy runs as a separate process at constant 60fps + SDL2 display. Loosely coupled with the MCP server via TCP.
Movement experience is auto-learned by nav_memory.py. Gameplay know-how is recorded in CLAUDE.md / Memory for use in future plays.
No LLM advisors needed. Director uses battle_calc + nav_memory + objective computation modules directly for all decisions.
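One plausible way to do the JP/EN header detection listed above is to read the destination code byte of the standard Game Boy cartridge header (offset 0x014A, where 0x00 means Japan and 0x01 means overseas). Whether the project uses this byte, the title field, or something else is not stated in the source, so the function below is a sketch under that assumption and its names are hypothetical.

```python
TITLE_START, TITLE_END = 0x0134, 0x0144  # cartridge header title field
DESTINATION_OFFSET = 0x014A              # 0x00 = Japan, 0x01 = overseas


def read_title(rom: bytes) -> str:
    """Return the NUL-terminated title stored in the cartridge header."""
    raw = rom[TITLE_START:TITLE_END].split(b"\x00", 1)[0]
    return raw.decode("ascii", errors="replace")


def detect_rom_language(rom: bytes) -> str:
    """Map the destination code to "JP" or "EN" (assumed detection rule)."""
    if len(rom) <= DESTINATION_OFFSET:
        raise ValueError("ROM too small to contain a cartridge header")
    return "JP" if rom[DESTINATION_OFFSET] == 0x00 else "EN"


# Demo on a synthetic header, not a real ROM image.
fake_rom = bytearray(0x0150)
fake_rom[TITLE_START:TITLE_START + 11] = b"POKEMON RED"
fake_rom[DESTINATION_OFFSET] = 0x01
english = detect_rom_language(bytes(fake_rom))
fake_rom[DESTINATION_OFFSET] = 0x00
japanese = detect_rom_language(bytes(fake_rom))
title = read_title(bytes(fake_rom))
```

The detected language would then select which character table is used to decode in-game text.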
Start the emulator with load_rom("/path/to/pokemon_red.gb")
Get structured JSON with get_game_state. Understand the scene, coordinates, and party
Use press_button for step-by-step control, do_action for walking/text. Battles are handled one press at a time
Analyze the returned screen/JSON to decide the next action. Autonomously loops steps 2-4
Movement successes and failures are auto-recorded by nav_memory.py. Learns walls, dead ends, and map transitions for shortest routes next time
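The recording step at the end of the loop above can be sketched as a small store that classifies each attempted move. The class name, the key layout, and the map IDs in the demo are hypothetical. The actual data model in nav_memory.py is not shown in the source.

```python
import json
from collections import defaultdict


class NavMemorySketch:
    """Remembers blocked moves and map transitions observed during play."""

    def __init__(self) -> None:
        self.walls = defaultdict(set)  # map_id -> {(x, y, direction)}
        self.transitions = {}          # (map_id, x, y, direction) -> new map_id

    def record_move(self, map_id, pos, direction, new_map_id, new_pos) -> None:
        """Classify one attempted step from its before/after state."""
        if new_map_id != map_id:
            # The step changed maps: remember where the exit leads.
            self.transitions[(map_id, *pos, direction)] = new_map_id
        elif new_pos == pos:
            # The player did not move: treat the step as blocked.
            self.walls[map_id].add((*pos, direction))

    def is_blocked(self, map_id, pos, direction) -> bool:
        return (*pos, direction) in self.walls.get(map_id, set())

    def to_json(self) -> str:
        """Serialize to JSON so the knowledge can survive a restart."""
        data = {
            "walls": {str(m): sorted(list(w) for w in ws) for m, ws in self.walls.items()},
            "transitions": [[*key, dest] for key, dest in self.transitions.items()],
        }
        return json.dumps(data)


memory = NavMemorySketch()
memory.record_move(0, (5, 6), "up", 0, (5, 6))    # bumped into something
memory.record_move(0, (5, 6), "left", 0, (4, 6))  # ordinary successful step
memory.record_move(0, (4, 6), "up", 37, (2, 7))   # walked through an exit
snapshot = json.loads(memory.to_json())
```

A path planner can consult `is_blocked` before proposing a step, which is how remembered failures shorten later routes.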
| Tool | Args | Description |
|---|---|---|
| press_button | button, wait_frames, include_image | Press a button. Returns JSON by default, include_image=true for screen |
| press_buttons | buttons, interval_ms, wait_frames | Sequential button input, returns screen after last press |
| hold_button | button, frames, wait_frames | Hold button then return screen |
| wait | seconds | Wait specified seconds then return screen |
| get_game_state | - | Structured JSON (scene / player / party / battle) |
| get_collision_map | - | Collision map (9x10) + player direction + NPC + door positions |
| get_wide_map | - | Full map walkability (tri-state: 2=confirmed walkable, 0=wall, -1=unknown) + grass grid |
| press_button_fast | button, wait_frames | Button press + JSON state (no image) |
| press_buttons_fast | buttons, interval_ms | Sequential input + JSON state (no image) |
| do_action | action, count, direction | Batch walk or text advance. Interrupts on encounter/map transition |
| navigate_to | target_x, target_y, target_map_id | Self-learning nav auto-moves to destination |
| navigate_smart | target_type, direction, target_map_id | Intent-based movement (exit/explore/transition/grass) |
| load_rom | rom_path, headless | Load ROM and start emulator |
| quit_emulator | - | Stop emulator (with save) |
| say | text | Display Claude's live commentary on the overlay |
| get_emulator_info | - | Check emulator status |
No LLM advisors. The Director uses computation modules directly for all decisions.
graph TD
DIR["Director<br/>Claude Code Opus<br/>MCP Control + All Decisions"] -->|"MCP stdio"| MCP["gameboy_mcp_server.py"]
BC["battle_calc.py<br/>Type Matchup + Damage Estimation"] -->|"Pre-computed Data"| DIR
NM["nav_memory.py<br/>Self-learning Map Knowledge"] -->|"Walk History + Grass Tiles"| DIR
NA["nav_analyst.py<br/>Real-time Analysis"] -->|"Auto-runs Every Step"| DIR
OBJ["objective.py<br/>Goal Management"] -->|"Progress Tracking"| DIR
style DIR fill:#3a1010,stroke:#f8d030,color:#f8d030
style MCP fill:#3a1010,stroke:#e03030,color:#f0e8e0
style BC fill:#1a2010,stroke:#48d848,color:#48d848
style NM fill:#1a2010,stroke:#48d848,color:#48d848
style NA fill:#1a2010,stroke:#48d848,color:#48d848
style OBJ fill:#1a2010,stroke:#48d848,color:#48d848
No LLM advisors needed. Director reads battle_calc + nav_memory + objective results directly.
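As an illustration of the kind of pre-computed data battle_calc could hand to the Director, here is a sketch of a type multiplier lookup plus a damage percentage estimate. It uses a small subset of the Gen 1 type chart and a simplified form of the commonly cited Gen 1 damage formula, with no critical hits and no random roll. The function names, the integer rounding, and the stats in the demo are assumptions, not the project's actual code.

```python
# Subset of the Gen 1 chart: (move type, defender type) -> multiplier.
# Pairs not listed here are treated as neutral (1.0) in this sketch.
TYPE_CHART = {
    ("fire", "grass"): 2.0, ("fire", "bug"): 2.0, ("fire", "water"): 0.5,
    ("fire", "rock"): 0.5, ("water", "fire"): 2.0, ("water", "rock"): 2.0,
    ("water", "ground"): 2.0, ("grass", "water"): 2.0,
    ("electric", "ground"): 0.0, ("normal", "ghost"): 0.0,
    ("normal", "rock"): 0.5,
}


def type_multiplier(move_type: str, defender_types: list) -> float:
    """Multiply the effectiveness against each of the defender's types."""
    result = 1.0
    for defender_type in defender_types:
        result *= TYPE_CHART.get((move_type, defender_type), 1.0)
    return result


def estimate_damage_percent(level, power, attack, defense, move_type,
                            attacker_types, defender_types, defender_hp) -> float:
    """Estimate damage as a percentage of the defender's HP, capped at 100."""
    multiplier = type_multiplier(move_type, defender_types)
    if power == 0 or multiplier == 0:
        return 0.0
    base = ((2 * level // 5 + 2) * power * attack // defense) // 50 + 2
    if move_type in attacker_types:  # STAB: same-type attack bonus of 1.5x
        base = base * 3 // 2
    damage = int(base * multiplier)
    return round(min(100.0, 100.0 * damage / defender_hp), 1)


# Hypothetical stats: a Lv.9 Fire-type using a 40-power Fire move on a Bug-type.
ember_percent = estimate_damage_percent(
    level=9, power=40, attack=14, defense=10, move_type="fire",
    attacker_types=["fire"], defender_types=["bug"], defender_hp=25,
)
```

Computing these numbers ahead of time means move selection needs only a comparison of percentages, with no extra model call.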
The Claude Code session controls the game via MCP. Uses battle_calc, nav_memory, and objective directly for all decisions.
Gen1 type matchup table (15x15), all 165 moves, type data for 151 species. Pre-computes STAB and damage % for Director's move selection.
Auto-accumulates walls, transitions, and grass tiles during play. Provides collision_cache for wide_map and encounter-based grass detection.
Runs real-time analysis automatically on every do_action. Detects frontiers, exits, and movement loops.
Goal setting, progress tracking, and review triggers. Manages grind/gym/heal/explore/story goal types.
Collision map generation, A* pathfinder, wide_map construction. Tri-state system (confirmed/wall/unknown) lets A* traverse unexplored areas.
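The tri-state idea can be sketched as an A* search where confirmed cells cost 1, unknown cells cost 3, and walls are impassable. The cell values follow the get_wide_map description (2, 0, -1) and the unknown cost of 3 follows the changelog. The grid layout, the function name, and the 4-neighbor movement model are assumptions for illustration.

```python
import heapq

WALKABLE, WALL, UNKNOWN = 2, 0, -1
STEP_COST = {WALKABLE: 1, UNKNOWN: 3}  # walls have no entry, so they are skipped


def find_path(grid, start, goal):
    """A* over (x, y) cells. Returns the cell list from start to goal, or None."""
    def heuristic(cell):
        # Manhattan distance is admissible because the cheapest step costs 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    came_from = {}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= ny < len(grid) and 0 <= nx < len(grid[0])):
                continue
            step = STEP_COST.get(grid[ny][nx])
            if step is None:
                continue
            new_cost = cost + step
            if new_cost < best_cost.get((nx, ny), float("inf")):
                best_cost[(nx, ny)] = new_cost
                came_from[(nx, ny)] = cell
                heapq.heappush(open_heap, (new_cost + heuristic((nx, ny)), new_cost, (nx, ny)))
    return None


# Only route is through an unknown cell, so A* takes it.
corridor = [[2, 0, 2], [2, -1, 2], [2, 0, 2]]
through_unknown = find_path(corridor, (0, 1), (2, 1))

# A known detour is cheaper than three unknown cells, so A* goes around.
field = [[2, 2, 2, 2, 2], [2, -1, -1, -1, 2]]
around_unknown = find_path(field, (0, 1), (4, 1))
```

Treating unknown as expensive rather than blocked lets the planner explore when it must while still preferring verified ground.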
| Scene | wait_frames | Time |
|---|---|---|
| Menu select | 10 (default) | ~0.17s |
| Text advance | 20 | ~0.33s |
| Character move | 15 | ~0.25s |
| Screen transition | 30-60 | ~0.5-1s |
| Battle animation | 60-180 | ~1-3s |
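The Time column is simply wait_frames divided by the 60fps tick rate. A small sketch of that conversion, with a hypothetical lookup table mirroring the rows above (the key names are made up for illustration):

```python
FPS = 60  # emulator tick rate stated in the document

# Hypothetical names for the table rows; ranges are (min, max) frames.
RECOMMENDED_WAIT_FRAMES = {
    "menu_select": (10, 10),
    "text_advance": (20, 20),
    "character_move": (15, 15),
    "screen_transition": (30, 60),
    "battle_animation": (60, 180),
}


def frames_to_seconds(frames: int, fps: int = FPS) -> float:
    """Convert a frame count to wall-clock seconds at the given frame rate."""
    return frames / fps


def wait_range_seconds(scene: str) -> tuple:
    """Return the (min, max) wait in seconds for a scene from the table."""
    low, high = RECOMMENDED_WAIT_FRAMES[scene]
    return frames_to_seconds(low), frames_to_seconds(high)


battle_wait = wait_range_seconds("battle_animation")
move_wait = frames_to_seconds(15)
```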
When Claude Code starts in this directory, the MCP server auto-connects with the following configuration.
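The project's actual .mcp.json is not reproduced here. As a rough sketch of what a project-level file of this kind typically looks like for a stdio server, the snippet below builds one in Python and serializes it. The server name `gameboy` and the `python` launch command are assumptions. Only the script name gameboy_mcp_server.py comes from the source.

```python
import json

# Hypothetical configuration: server name and launch command are assumed.
mcp_config = {
    "mcpServers": {
        "gameboy": {
            "command": "python",
            "args": ["gameboy_mcp_server.py"],
        }
    }
}

# Text that would be saved as .mcp.json in the project root.
config_text = json.dumps(mcp_config, indent=2)
```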
The MCP server core built with FastMCP. Exposes all 16 tools and communicates via stdio transport.
press_button: Button press (JSON by default, image optional)
press_buttons: Sequential input, returns screen after last press
hold_button: Hold + return screen after release
wait: Wait specified seconds + return screen
get_game_state: Structured JSON (scene/party/battle)
get_collision_map: Collision map + NPC positions
press_button_fast: Button press + JSON state
press_buttons_fast: Sequential input + JSON state
do_action: Batch walk and text advance
navigate_to: Self-learning nav auto-move
load_rom: Load ROM + start emulator
quit_emulator: Stop with save
get_emulator_info: Runtime status and cartridge info
say: Display Claude's live commentary on overlay
Screen collision map (9x10) + full map wide_map construction + A* pathfinder.
Gen1 battle calculation engine. Provides type matchups, move data, and damage estimation, passing pre-computed data to the Director.
enrich_battle_context(): batch computes matchup labels, damage %, STAB checks, and defensive matchups for switch candidates
Director state tracking. Displays real-time status on the overlay.
agents event pushes only on state changes
PyBoy TCP client. Connects to pyboy_server.py to control the emulator.
PyBoy instance with threading.Lock for mutual exclusion
_tick_loop at 60fps for frame advancement
get_screen_bytes: PIL Image → JPEG bytes conversion (for MCP). Streaming reads PNG directly via mmap
press_button: button press → sleep for wait_frames/60 seconds → return screen
press_button_hold: press → hold → release → return screen
.gb / .gbc allowed, resolved via os.path.realpath
Both Red and Blue versions supported. Turn-based battle + map navigation pairs well with MCP control.
Initial implementation with screencapture + Anthropic API loop approach
Full rewrite to PyBoy + MCP server approach. Auto screen return via wait_frames
Removed dashboard, removed get_screen, added project intro HTML
Scene detection (8 scenes), composite tools (do_action), 165 move name table, stream overlay (Pokemon Red Theme, anime.js), action log, PokeAPI sprite integration
TCP client/server separation (SDL2 window display support), Japanese ROM auto-detection, hiragana character table added, fixed scene detection false positives at game start
Button input fix (button_press/release method), full _INTERNAL_TO_DEX fix, map navigation data (map_data.py), stream overlay JSON display, scene detection battle priority, do_action encounter interrupt detection, viewer UI redesign (Pokemon Red Theme), first test play completed (Pallet Town → Viridian City → Pokedex obtained)
Collision map & A* pathfinder implementation (auto wall detour), A* detour integrated into navigate_to (_walk_toward auto-finds detour on wall detection), Route 1 cleared → arrived at Viridian City, Charmander Lv.9 (learned Ember)
Stream overlay speedup — replaced screen transfer from TCP to shared memory (mmap), ~60x latency improvement. RGBA→RGB conversion fix, PNG format adoption eliminated artifacts. Scanline CSS disabled (H.264 moire countermeasure), player name display added. Audio stability confirmed
Token efficiency & TCP optimization — bulk read reduced TCP calls from 40-60 to 1-2, collision_map separation (saving 300-400 tokens per call), do_action(walk) per-step check removed (scene byte only), press_button image made optional, state cache layer added. ~95% token reduction for 20-step walks
Self-learning navigation (nav_memory.py), map transition detection added (auto-wait on building entry/exit in do_action/navigate_to), Pokemon Center exit coordinate fix
Project cleanup — CLAUDE.md reduced by 65% (removed duplicate MCP tool listings), rule files consolidated from 10 to 4 (69% reduction), memory files organized, map_data.py removed (fully migrated to nav_memory.py), dead code detection, README.md updated
Multi-agent gameplay — Implemented Director+Advisor architecture. battle_calc.py (Gen1 type matchup table 15x15, all 165 moves, 151 species types, damage estimation), 4 advisors (Battle/Nav/Strategist/Map Analyst), agent state visualization (agent_log.py → overlay AGENTS panel), Claude live commentary MCP tool (say), collision_map AI LOG display, session event log, 137 tests all passing
Wide map & overlay improvements — Added get_wide_map MCP tool (reads full-map collision data), compacted overlay right panel (reduced font, sprite, padding sizes), fully removed Bookmark feature, fixed Director state to immutable composition, verified multi-agent operation (Navigation/Strategist/Map Analyst parallel launch confirmed)
Map Analyst removal & DESIGN.md — Replaced Map Analyst agent with nav_memory.get_exploration_stats() (saves one Haiku call per cycle), exploration_stats fed directly to Navigation Advisor and Strategist, introduced DESIGN.md for unified design tokens, added English index_en.html
Grass detection fix & autonomous loop — Fixed cave tiles being misidentified as grass, separated JP/EN detection logic, door display (collision map D marker), autonomous game loop (objective.py for goal management & Strategist auto-trigger), semantic navigation (smart_nav.py for intent-based movement), CLAUDE.md compressed by 52%
First official release — Director-direct architecture (removed LLM advisors, battle_calc + nav_memory + objective handle all decisions within the Director). nav_analyst.py (real-time map analysis on every do_action), capture parameter fully removed from press_button. Demonstrated autonomous play: Viridian Forest → Pewter City, Pokemon Center healing, wild battle auto-handling.
Map data bug fix & code cleanup — Fixed JP/EN map data parsing. Introduced two-layer wide_map fallback (collision_cache + border block comparison). Grass grid now based on nav_memory encounter history. Dead code removal.
Navigation overhaul (P0-P6) & project renamed to "LAPRAS" — Tri-state wide_map (unknown≠wall, A* traverses unexplored at cost 3). Collision detection accuracy improved to 100%. collision_cache persisted to nav_memory.json (survives restarts). no_progress threshold 4→12 (persevere in mazes). Frontier limit removed + diversity scoring. HP safety check (faint prevention). Warp pre-recording on map transitions. PyBoy headless mode for parallel testing. 8 concurrent Claude Code Agent() investigations discovered and implemented all improvements. Refactored: unified JP/EN detection, removed redundant TCP calls, extracted tri-state constants.
Navigation coordinate fix & door avoidance — Unified warp/player coordinate systems, added //2 grid conversion to A* targets (navigate_to now reaches short-distance targets accurately). Added door avoidance to find_path (prevents navigate_to from entering buildings mid-route). Removed aggressive collision_cache mass-invalidation on stuck (preserves high-accuracy cache data). Adaptive waypoint selection on wall hits (references further wide_map waypoints as no_progress increases, enabling building detours). Dynamic tileset-aware collision detection for greater accuracy. Added mark_wall/set_speed commands to pyboy_server.
Ledge direction check & door avoidance unification — Added movement direction check to ledge (one-way tile) detection: south jumps allowed, north climbing blocked. Unified door avoidance goal exclusion across all A* calls in navigate_to (waypoint and screen-edge goals no longer blocked when coinciding with door positions). Extracted _DIR_BLK to module-level constant. Beat Pewter City Gym (Brock) with 2 HP remaining, earned Boulder Badge.