Core Concepts

Architecture

Three layers — the product model

The plugin is organized into three layers. Each one has a different job and gets used at a different point in the loop.

Three layers, one contract

Layer	Mechanism	Exclusivity
L1 · Introspect	CDP reads (tree, store, nav)	shared, always safe
L2 · Interact	native runner taps and types	shared (re-attach, don't evict)
L3 · Flow-replay	Maestro flow runs end-to-end	exclusive (owns the device)

Reads never conflict, taps coexist, and only one flow runs at a time. The in-process arbiter enforces this and refuses conflicting work fast (BUSY_FLOW_ACTIVE).

Layer	Job	Examples
Workflow	Decides what to do; gates quality	`/rn-feature-dev` 8-phase pipeline; the rn-tester / rn-debugger / rn-code-architect agent protocols
Discovery	Looks at the running app to know what’s there	CDP tools (`cdp_component_tree`, `cdp_store_state`, `cdp_navigation_state`, `cdp_evaluate`); device tools (`device_press`, `device_fill`, `device_snapshot`, `device_screenshot`)
Reproducible actions	Replays known flows in seconds; self-repairs on UI drift	`.rn-agent/actions/<name>.yaml`; `cdp_run_action` orchestrator; `cdp_repair_action` selector patcher; actions guide

        ┌──────────────────────────────────────┐
        │   Workflow (rn-feature-dev, agents)   │
        │   Decides WHAT, gates quality         │
        └────────┬───────────────────┬─────────┘
                 │ "verify this"     │ "produce/replay flow"
                 ▼                   ▼
   ┌─────────────────────┐  ┌──────────────────────┐
   │      Discovery      │  │       Actions        │
   │  Empirical reality  │  │  Replay in seconds   │
   │  CDP + device tools │  │  Self-repair on drift│
   │  Macro-Asserts      │◄─┤                      │
   └─────────────────────┘  └──────────────────────┘
                 ▲                   │
                 │  Discovery emits  │
                 └─── new actions ───┘
                      Actions repair via Discovery

Why this matters: replay vs. interactive walks

On a known 3-step task-creation wizard, an interactive agent walk took 13 minutes 55 seconds. The same wizard, replayed as a saved action, finished in ~4 seconds — a ~210× speedup. That’s the load-bearing data point behind the architecture: discovery tools are how the agent finds new ground; actions are how it replays known ground without paying that cost again.

The agent doesn’t choose all-or-nothing. If you’re on the login screen and need to do something on home, the agent runs the saved login action as a prologue (4 seconds), then discovers the new work interactively. See actions for the full hybrid composition pattern.

Implementation layers — how it’s built

The product layers above describe how the plugin is used; underneath, three implementation layers do the work.

┌─────────────────────────────────────────────────────┐
│  Claude Code                                         │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │ Skills       │  │ Agents       │  │ Commands   │ │
│  │ (knowledge)  │  │ (protocols)  │  │ (entry pts)│ │
│  └──────┬───┬──┘  └──────┬───────┘  └─────┬──────┘ │
│         │   │            │                 │        │
│  ┌──────▼───▼────────────▼─────────────────▼──────┐ │
│  │              MCP Server (CDP Bridge)            │ │
│  │  WebSocket → Metro → Hermes CDP                 │ │
│  │  74 tools across 5 families                     │ │
│  └─────────┬───────────────────────────┬───────────┘ │
│            │                            │             │
│  ┌─────────▼──────────┐      ┌─────────▼──────────┐ │
│  │   rn-fast-runner   │      │  rn-android-runner │ │
│  │   (iOS, in-tree)   │      │  (Android, in-tree)│ │
│  │   XCTest /command  │      │  UiAutomator instr.│ │
│  └────────────────────┘      └────────────────────┘ │
└─────────────────────────────────────────────────────┘
         │                              │
    ┌────▼────┐                   ┌─────▼─────┐
    │ iOS Sim │                   │ Android   │
    │         │                   │ Emulator  │
    └─────────┘                   └───────────┘

Implementation tier	Tool	Role
Device interaction (iOS)	In-tree `rn-fast-runner` XCTest rig (`packages/rn-fast-runner/`) — `POST /command` HTTP	Native iOS device control. Always calls `XCUIApplication.activate()` per request (D1219, PR #164). The in-tree runner is the sole iOS device backend.
Device interaction (Android)	In-tree `rn-android-runner` (`packages/rn-android-runner/`) — UiAutomator instrumentation via `adb shell am instrument`	Native Android device control. The in-tree runner is the sole Android device backend (default-on; `RN_ANDROID_RUNNER=0` errors, it does not fall back).
App introspection	Custom MCP server → Hermes CDP via WebSocket	Persistent WebSocket — reads React fiber tree, store state, network, console, errors
E2E testing	maestro-runner (preferred) / Maestro (fallback)	YAML-based persistent test files; underlying format for actions in `.rn-agent/actions/`

Fallback: xcrun simctl (iOS) + adb (Android) for device lifecycle (boot / install / launch / terminate) — the runner doesn’t manage device state, only interaction.

Three-layer device-control contract

One mechanism per capability tier — L1 + L2 coexist (drive with XCTest, assert with CDP in the same per-step loop); L3 is exclusive (owns the device for the flow).

Layer	Mechanism	Role	Exclusivity
L1 INTROSPECTION	CDP / Hermes	read store / network / component-tree / mmkv / native	shared
L2 INTERACTION	iOS `RnFastRunner` / Android `rn-android-runner`; `cdp_interact`	primitive taps / types / scrolls	shared
L3 FLOW-REPLAY	`maestro-runner` (Go + WDA)	whole-`.yaml` E2E flows	exclusive
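
The exclusivity column can be read as a small admission rule. The sketch below is illustrative only: DeviceArbiter, acquire, and the Decision type are hypothetical names, not the plugin's actual arbiter. It assumes that work arriving while a flow is active is refused with BUSY_FLOW_ACTIVE rather than queued, and it does not model what happens to taps already in flight when a flow starts.

```typescript
// Hypothetical sketch of the L1/L2/L3 admission rule; not the plugin's real arbiter.
type Layer = "L1" | "L2" | "L3";

type Decision =
  | { ok: true; release: () => void }
  | { ok: false; code: "BUSY_FLOW_ACTIVE" };

class DeviceArbiter {
  private flowActive = false;

  acquire(layer: Layer): Decision {
    // L1 (CDP reads) is shared and never blocked.
    if (layer === "L1") return { ok: true, release: () => {} };

    // While a flow owns the device, refuse taps and further flows immediately.
    if (this.flowActive) return { ok: false, code: "BUSY_FLOW_ACTIVE" };

    if (layer === "L3") {
      // L3 is exclusive: mark the device as owned until the flow releases it.
      this.flowActive = true;
      return { ok: true, release: () => { this.flowActive = false; } };
    }

    // L2 (taps and types) is shared: concurrent taps are admitted.
    return { ok: true, release: () => {} };
  }
}
```

The point of refusing synchronously is latency: a caller learns within one call that the device is owned, instead of waiting on a runner that cannot make progress.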

Foreign runners (e.g. the standalone maestro-mcp) are handled per layer. L1 reads are always safe. On an L2 leak, the device session re-attaches rather than evicts (#188), and device_snapshot action=open surfaces an informational FOREIGN_RUNNER_ACTIVE warning. Since #186, a live foreign session (UDID-scoped, 5 s-TTL ps scan, fail-open) also makes local device_* and flow tools refuse fast with BUSY_FOREIGN_FLOW (~50 ms, versus the ~44 s runner-leak cascade). CDP reads stay free, and device_screenshot falls back to simctl. The plugin’s maestro_run is the canonical Maestro surface. Opt out with RN_IOS_FOREIGN_GUARD=0.
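
The "5 s-TTL, fail-open" behavior can be sketched as a cached liveness check. This is a minimal sketch under stated assumptions: ForeignSessionGuard and its probe callback are hypothetical names, the real ps scan is replaced by an injected function, and "fail-open" is taken to mean that a failed scan is treated as "no foreign session".

```typescript
// Hypothetical sketch: a TTL-cached, fail-open liveness check.
// The probe stands in for the UDID-scoped ps scan; it is not the plugin's code.
class ForeignSessionGuard {
  private cachedAt = Number.NEGATIVE_INFINITY;
  private cached = false;
  private probe: () => boolean;
  private ttlMs: number;

  constructor(probe: () => boolean, ttlMs = 5_000) {
    this.probe = probe;
    this.ttlMs = ttlMs;
  }

  isForeignActive(now: number = Date.now()): boolean {
    // Reuse the last answer while it is younger than the TTL.
    if (now - this.cachedAt < this.ttlMs) return this.cached;
    try {
      this.cached = this.probe();
    } catch {
      // Fail-open: if the scan itself fails, do not block local tools.
      this.cached = false;
    }
    this.cachedAt = now;
    return this.cached;
  }
}
```

A caller in this sketch would return BUSY_FOREIGN_FLOW from device and flow tools when isForeignActive() is true, and skip the check entirely for CDP reads.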

MCP server (CDP bridge)

The MCP server is a Node.js process that maintains a persistent WebSocket connection to the React Native app’s Hermes engine through Metro’s CDP endpoint.

Since #264 the entry point is split into a supervisor and a worker. A thin stdio shim (dist/supervisor.js), which holds zero network sockets, owns the MCP connection with Claude Code and spawns the real bridge as a respawnable worker. Killing everything on Metro’s port (lsof -ti tcp:8081 | xargs kill -9, a common recovery step) used to SIGKILL the whole server and cost the session every tool. Now it only takes down the worker, which the supervisor respawns (bounded: 3 per rolling 60 s), replaying the cached MCP initialize handshake so the session continues. Calls in flight at the moment of death fail fast with a “worker restarted — retry the call” error. Supervision state is visible in cdp_status → bridge: { supervised, workerRestarts, lastWorkerExit }; opt out with RN_BRIDGE_SUPERVISOR=0.
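
The "3 per rolling 60 s" bound is a sliding-window budget. The sketch below shows one way to express it; RestartBudget and tryConsume are hypothetical names, and how the real supervisor tracks restarts is not described in this document.

```typescript
// Hypothetical sketch of a rolling-window respawn budget (3 per 60 s by default).
class RestartBudget {
  private stamps: number[] = [];
  private max: number;
  private windowMs: number;

  constructor(max = 3, windowMs = 60_000) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true and records the restart if the budget allows it.
  tryConsume(now: number = Date.now()): boolean {
    // Drop restarts that have aged out of the rolling window.
    this.stamps = this.stamps.filter((t) => now - t < this.windowMs);
    if (this.stamps.length >= this.max) return false;
    this.stamps.push(now);
    return true;
  }
}
```

Bounding respawns this way keeps a crash loop from spinning forever while still absorbing the occasional port-wide kill.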

74 tools across five families:

CDP — React internals via Chrome DevTools Protocol (component tree, store state, navigation, profiling, network)
Device — Native interaction (iOS: rn-fast-runner, Android: rn-android-runner)
Actions — Record / replay / self-repair (cdp_run_action, cdp_repair_action, cdp_record_test_save_as_action, cdp_record_test_*)
Testing — Proof capture, auto-login, cross-platform verify, Maestro orchestration
Macro-Asserts — State-assertive replays (expect_redux, expect_route, expect_visible_by_testid, expect_text)

All tools are registered through a single trackedTool() wrapper that records per-tool telemetry — the same mechanism that feeds the auto-action capture loop. Failure signals also feed a repo-local troubleshooting memory: hooks capture failures during a session and the agent synthesizes them into a gitignored .rn-agent/local/troubleshooting.md, read back at the next SessionStart.
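
A wrapper of this kind can be sketched as a higher-order function. The real trackedTool() signature is not shown in this document, so everything here is an assumption: trackTool, ToolEvent, and the in-memory sink are hypothetical, and the handler is synchronous to keep the example short (an async handler would use the same try/finally shape around an await).

```typescript
// Hypothetical sketch of a telemetry wrapper; not the plugin's trackedTool().
interface ToolEvent {
  tool: string;
  ok: boolean;
  durationMs: number;
}

function trackTool<A extends unknown[], R>(
  name: string,
  handler: (...args: A) => R,
  sink: ToolEvent[],
): (...args: A) => R {
  return (...args: A): R => {
    const startedAt = Date.now();
    let ok = false;
    try {
      const result = handler(...args);
      ok = true;
      return result;
    } finally {
      // Record one event per call, whether the handler returned or threw.
      sink.push({ tool: name, ok, durationMs: Date.now() - startedAt });
    }
  };
}
```

Because failures are recorded in the same place as successes, a single stream of events can feed both usage telemetry and failure capture.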

Helper injection

On first CDP connect, ~2KB of JavaScript is injected into the Hermes runtime via Runtime.evaluate. This creates globalThis.__RN_AGENT with methods:

getTree(filter, depth) — Walk the React fiber tree
getNavState() — Read React Navigation / Expo Router state
getStoreState(path, type) — Read Redux / Zustand / React Query
getComponentState(testID) — Inspect hooks by testID
navigateTo(screen, params) — Navigate via fiber tree traversal
getErrors() / clearErrors() — Error tracking
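
Once the helpers are injected, each tool call only needs to send a short expression. The sketch below builds such a request as a CDP Runtime.evaluate message. The message shape (id, method, params.expression, params.returnByValue) follows the Chrome DevTools Protocol; buildHelperCall is a hypothetical name, and how the bridge actually frames its calls is an assumption.

```typescript
// Hypothetical sketch: build a Runtime.evaluate request that calls an injected helper.
interface CdpEvaluateRequest {
  id: number;
  method: "Runtime.evaluate";
  params: { expression: string; returnByValue: boolean };
}

function buildHelperCall(id: number, helper: string, args: unknown[]): CdpEvaluateRequest {
  // JSON.stringify keeps strings quoted and escaped inside the expression.
  const argList = args.map((a) => JSON.stringify(a)).join(", ");
  return {
    id,
    method: "Runtime.evaluate",
    params: {
      expression: `globalThis.__RN_AGENT.${helper}(${argList})`,
      // Ask for a plain JSON value rather than a remote object handle.
      returnByValue: true,
    },
  };
}

const request = buildHelperCall(1, "getTree", ["LoginScreen", 3]);
console.log(JSON.stringify(request));
```

The per-call payload is a few dozen bytes; the ~2KB helper body is paid only once per connection.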

Ring buffers

Since MCP is pull-based (tools are called on demand), events that fire between calls are buffered:

Buffer	Size	Content
Console	200 entries	`console.log/warn/error` output
Network	100 entries	HTTP request/response metadata
Errors	50 entries	Unhandled exceptions and promise rejections
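
A fixed-capacity ring buffer keeps the newest N events and overwrites the oldest. The following is a generic sketch of the data structure, not the plugin's implementation; RingBuffer and its method names are hypothetical.

```typescript
// Generic fixed-capacity ring buffer: the newest entries overwrite the oldest.
class RingBuffer<T> {
  private slots: (T | undefined)[];
  private next = 0;  // index the next push will write to
  private count = 0; // number of live entries, capped at capacity

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError("capacity must be at least 1");
    this.slots = new Array<T | undefined>(capacity);
  }

  push(item: T): void {
    this.slots[this.next] = item;
    this.next = (this.next + 1) % this.slots.length;
    this.count = Math.min(this.count + 1, this.slots.length);
  }

  // Returns the buffered entries, oldest first.
  toArray(): T[] {
    const out: T[] = [];
    const start = (this.next - this.count + this.slots.length) % this.slots.length;
    for (let i = 0; i < this.count; i++) {
      out.push(this.slots[(start + i) % this.slots.length] as T);
    }
    return out;
  }
}

// With the sizes from the table, the console buffer would be new RingBuffer<string>(200).
```

Memory stays bounded regardless of how long the agent waits between tool calls; the trade-off is that events older than the capacity are lost.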

Key technical decisions

Decision	Rationale
Inject helpers ONCE on connect	~2KB JS, then call `__RN_AGENT.*` — small payloads per call
5-second timeout on ALL CDP calls	Prevents hanging promises
RedBox detection before tree return	Check fiber root for LogBox/ErrorWindow, warn instead of returning error overlay
`Debugger.paused` auto-resume	Prevents silent JS thread freeze from `debugger;` statements
Network fallback for RN < 0.83	Try `Network.enable`; if it fails, inject fetch/XHR monkey-patches
Filter mandatory on component tree	Full dumps waste 10K+ tokens — always scope to testID or component
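
The blanket timeout can be sketched with Promise.race. This is a generic pattern rather than the bridge's actual code; withTimeout and its label parameter are hypothetical names.

```typescript
// Generic sketch: reject if the wrapped promise does not settle within ms.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage with the 5-second budget from the table (sendCdp is a stand-in, not a real API):
// const result = await withTimeout(sendCdp("Runtime.evaluate", params), 5_000, "Runtime.evaluate");
```

Note that the race only stops the caller from waiting; it does not cancel the underlying request, so a late response still has to be ignored or matched by id.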

Device dispatch by platform

Both platforms dispatch through a single native entry point — runNative() in agent-device-wrapper.ts — which routes to the in-tree runner for the target platform. There is no external CLI tier on either platform:

iOS — single-endpoint rn-fast-runner. Every iOS device_* call short-circuits through runIOS() (TS client at packages/rn-dev-agent-core/src/runners/rn-fast-runner-client.ts) to a POST /command HTTP endpoint exposed by an in-tree XCTest rig. Coordinate-based gestures map to .drag; direction-based swipes/scrolls are pre-computed to coords by device-interact.ts before dispatch. device_find (non-exact) and device_scrollintoview are TS-side orchestrators over runIOS('snapshot') — no Swift .findText round-trip for fuzzy matching. The in-tree runner is the sole iOS device backend.

Android — single-endpoint rn-android-runner. Every Android device_* call routes through runAndroid() (TS client at packages/rn-dev-agent-core/src/runners/rn-android-runner-client.ts), which drives an in-tree UiAutomator instrumentation started via adb shell am instrument. The runner is default-on and the sole Android device backend — the legacy agent-device daemon + CLI tiers were removed (eradicate-agent-device). Setting RN_ANDROID_RUNNER=0 does NOT fall back to anything; it makes device_* error with RUNNER_DISABLED.
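
The routing rule described above reduces to a small decision function. The sketch below is a simplified model: routeNative, Route, and the env parameter are hypothetical, and the real runNative() in agent-device-wrapper.ts may have a different signature.

```typescript
// Hypothetical model of per-platform dispatch; not the real runNative().
type Platform = "ios" | "android";

type Route =
  | { ok: true; backend: "rn-fast-runner" | "rn-android-runner" }
  | { ok: false; code: "RUNNER_DISABLED" };

function routeNative(platform: Platform, env: Record<string, string | undefined>): Route {
  // iOS always goes to the in-tree XCTest runner.
  if (platform === "ios") return { ok: true, backend: "rn-fast-runner" };

  // Android: disabling the runner is an error, not a fallback to another backend.
  if (env.RN_ANDROID_RUNNER === "0") return { ok: false, code: "RUNNER_DISABLED" };

  return { ok: true, backend: "rn-android-runner" };
}

console.log(routeNative("android", { RN_ANDROID_RUNNER: "0" }));
```

Making the disabled case an explicit error keeps a misconfigured environment from silently running against a backend that no longer exists.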

Measured: iOS rn-fast-runner delivers ~216ms tap, ~5ms snapshot, ~74ms screenshot.

A stale ~/.agent-device/daemon.json can respawn the upstream AgentDeviceRunner and fight the in-tree rn-fast-runner for focus on iOS. Since #202 the plugin terminates stale AgentDeviceRunner processes at session-open by default (scoped to the target simulator UDID), clears orphaned ~/.agent-device/daemon.{json,lock}, and uninstalls the legacy runner apps (com.callstack.agentdevice.runner and its xctrunner companion) from the target simulator — an installed XCUITest runner is otherwise relaunched by iOS mid-flow even after its process is killed. Opt out with RN_DEVICE_KILL_LEGACY=0.

What we’re NOT using (and why)

Tool	Why not
Appium	Too heavy, latency overhead, black-box (no RN sync)
Flipper	Deprecated for debugging in RN 0.76+
Detox	Great for JS tests but not AI-agent-friendly (JS files, not YAML)
Facebook idb	Python + pip + companion daemon = too much setup friction