PhoneAgent is an experimental mobile automation project with two operating modes:
- In-app iPhone agent (SwiftUI app + XCTest runner + OpenAI Responses API)
- External bridge to let Codex/OpenClaw control iOS/Android devices
The bridge supports both:
- iOS (XCTest-hosted actions against simulator or physical iPhone)
- Android (adb + UiAutomator + input/screencap actions against emulator or device)
Demo
- Self contained iOS app
- OpenClaw controlling an iPhone
- OpenClaw controlling an Android phone
- Codex controlling an iPhone
What This Repo Includes
- iOS app UI: API key entry, prompt input, microphone support, settings, always-on wake-word mode
- iOS test-hosted RPC server: newline-delimited JSON-RPC on port
45678 - Android RPC server: localhost JSON-RPC bridge backed by
adbcommands - Helper scripts:
- iOS bridge launcher + physical-device localhost forwarding
- Android bridge launcher (with adb auto-discovery)
- generic RPC CLI (
rpc.py) for both platforms
Capabilities
Shared RPC action surface (iOS + Android)
get_treeget_screen_imageget_contextset_api_keyopen_apptaptap_elemententer_textscrollswipestop
iOS-only RPC method
submit_prompt
submit_prompt powers the in-app iPhone agent loop.
In-app iPhone agent features
- OpenAI API key stored in Keychain
- Prompt submission from keyboard or microphone
- Optional always-on mode with custom wake word
- Notification completion + quick-reply follow-up loop
Requirements
macOS host
- Xcode (for iOS app/UITest bridge)
- Python 3
- Android SDK tools (
adb) for Android bridge
Devices
- iOS simulator or physical iPhone (Developer setup)
- Android emulator or physical Android device (USB debugging or wireless debugging)
Quick Start (In-App iPhone Agent)
- Open
/Users/rounak/Developer/PhoneAgent-cli/PhoneAgent.xcodeprojin Xcode. - Run the
PhoneAgentscheme on an iPhone/simulator. - Enter OpenAI API key when prompted.
- Submit tasks via keyboard or microphone.
Quick Start (AI Agents)
For Codex/OpenClaw usage, use the skill docs:
Send RPC calls
# iOS bundle identifier example ./.agents/skills/phoneagent/scripts/rpc.py open-app com.apple.Preferences # Android package example ./.agents/skills/phoneagent/scripts/rpc.py open-app com.android.settings # Fetch tree ./.agents/skills/phoneagent/scripts/rpc.py get-tree # Capture screenshot (writes PNG under /tmp/phoneagent-artifacts) ./.agents/skills/phoneagent/scripts/rpc.py get-screen-image --print-metadata
The CLI supports --host and --port if you need non-default endpoint settings.
RPC Notes
- Transport: newline-delimited JSON-RPC objects
- Endpoint:
127.0.0.1:45678by default open_apprequest parameter isbundle_identifier:- iOS: pass bundle identifier (e.g.
com.apple.Preferences) - Android: pass package name (e.g.
com.android.settings)
- iOS: pass bundle identifier (e.g.
tap_element/enter_textuse coordinate rectangles in format{{x, y}, {w, h}}
Common App Identifiers
iOS
- Settings:
com.apple.Preferences - Camera:
com.apple.camera - Photos:
com.apple.mobileslideshow - Messages:
com.apple.MobileSMS - Home Screen:
com.apple.springboard
Wireless Android (No USB)
# Pair (from Wireless debugging screen) adb pair <PHONE_IP:PAIRING_PORT> # Connect (from Wireless debugging screen) adb connect <PHONE_IP:ADB_PORT> # Verify adb devices -l
Then start Android bridge with that network serial:
./.agents/skills/phoneagent/scripts/start_android_rpc_bridge_local.sh --serial <PHONE_IP:ADB_PORT>
Security Model
- RPC bridge is localhost-oriented (
127.0.0.1) - iOS physical-device workflow uses localhost forwarding
- Android bridge executes only through selected
adbserial - In-app API key is stored in iOS Keychain
Repository Pointers
- iOS app entry:
PhoneAgent/PhoneAgentApp.swift - iOS app UI/state:
PhoneAgent/ContentView.swift,PhoneAgent/PromptView.swift,PhoneAgent/SettingsView.swift - iOS bridge server:
PhoneAgentUITests/SimulatorRPCServer.swift,PhoneAgentUITests/PhoneAgent.swift - RPC CLI:
.agents/skills/phoneagent/scripts/rpc.py - iOS bridge launcher:
.agents/skills/phoneagent/scripts/start_rpc_bridge_local.sh - Android bridge launcher:
.agents/skills/phoneagent/scripts/start_android_rpc_bridge_local.sh - Android bridge server:
.agents/skills/phoneagent/scripts/android_rpc_bridge.py
Known Limitations
- Android bridge does not yet implement
submit_promptagent loop - UI tree snapshots can be noisy/stale during animations
- Keyboard/text reliability can vary by app and platform
- Long-running tasks may require explicit polling/retries
Disclaimer
- Experimental software
- Personal project
- App contents may be sent to OpenAI API when using agent flow
- Model/tool actions can be incorrect; verify important operations