GitHub - RTBRuhan/ApexAgent: An extension that lets you connect your browser with AI

Apex Agent

AI-Powered Browser Control & Debugging Extension

License: MIT • Browsers: Chrome, Edge

Features • Installation • Usage • MCP Tools • Privacy • Author


🚀 Features

🤖 MCP Server Integration

Connect AI assistants like Cursor, Windsurf, or any MCP-compatible tool to control your browser in real time.

📹 Interaction Recording

Record user interactions including:

  • Mouse clicks (single, double, right-click)
  • Keyboard input
  • Scroll events
  • DOM changes with before/after states
  • Smart filtering to skip dynamic/automatic changes

🕹️ AI Agent Control

Allow AI to fully control your browser:

  • Navigate to URLs
  • Click elements
  • Type text
  • Scroll pages
  • Execute JavaScript
  • Take screenshots

🔍 DevTools Inspection

AI-powered debugging capabilities:

  • Inspect element properties, styles, and box model
  • View DOM tree structure
  • Get computed CSS styles
  • Query elements by selector or text
  • Monitor console logs
  • Analyze network requests
  • Access localStorage/sessionStorage
  • View cookies

🎨 Visual Feedback

  • AI cursor visualization
  • Element highlighting
  • Action tooltips
  • Connection status badge on extension icon

🤖 Built-in AI Sidebar (Optional)

  • Use your own API key (OpenAI, Anthropic, Google, OpenRouter)
  • Chat with AI directly in the browser sidebar
  • AI can execute browser actions automatically

🛡️ Chrome DevTools Protocol (CDP) Access

  • Event listener inspection
  • CPU profiling and heap snapshots
  • CSS/JS coverage analysis
  • DOM breakpoints
  • Accessibility tree inspection

📦 Installation

Extension Setup

  1. Clone the repository:

    git clone https://github.com/RTBRuhan/ApexAgent.git
    cd ApexAgent
  2. Load in Chrome/Edge:

    • Open chrome://extensions/ or edge://extensions/
    • Enable "Developer mode"
    • Click "Load unpacked"
    • Select the extension folder
  3. Install MCP Server dependencies:

    cd mcp-server
    npm install

AI Tool Configuration

Add this to your AI tool's MCP settings (e.g., Cursor's mcp.json):

{
  "apex-agent": {
    "command": "node",
    "args": ["/path/to/ApexAgent/mcp-server/index.js"]
  }
}

⚠️ Replace /path/to/ApexAgent with your actual installation path
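
As a rough sketch, the entry above can be generated for a given install directory. The `buildMcpConfig` helper is hypothetical and not part of the repository, and the top-level `mcpServers` wrapper is an assumption based on how Cursor-style `mcp.json` files are usually laid out; check your own tool's documentation for the exact shape it expects.

```javascript
// Hypothetical helper: builds the "apex-agent" entry for an MCP settings file.
// Assumption: the client expects server entries under a top-level "mcpServers" key.
function buildMcpConfig(installDir) {
  // Drop trailing slashes so the joined path has no "//" in it.
  const root = installDir.replace(/[\\/]+$/, "");
  return {
    mcpServers: {
      "apex-agent": {
        command: "node",
        args: [`${root}/mcp-server/index.js`],
      },
    },
  };
}

// Example install path; replace it with your actual clone location.
console.log(JSON.stringify(buildMcpConfig("/home/me/ApexAgent"), null, 2));
```

Paste the printed JSON into the settings file, or merge the `apex-agent` entry into an existing `mcpServers` object if you already have other servers configured.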


🔧 Usage

Quick Start

  1. Start the MCP Server:

    cd mcp-server
    npm start
  2. Connect Extension:

    • Click the Apex Agent extension icon
    • Go to the MCP tab
    • Click Connect
  3. Enable Agent Control:

    • Go to the Agent tab
    • Toggle Agent Control ON
    • Configure permissions as needed
  4. Use with AI:

    • Your AI assistant can now control the browser!

Connection Status Badge

Badge     Status
● Green   Connected
◐ Orange  Reconnecting
○ Gray    Disconnected
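
The badge states above could be driven by a small lookup, sketched here. This is not taken from the extension's source: the state names, hex colors, and the `badgeFor`/`applyBadge` helpers are assumptions for illustration. Only `chrome.action.setBadgeText` and `chrome.action.setBadgeBackgroundColor` are real MV3 APIs.

```javascript
// Hypothetical mapping from connection state to badge appearance.
// The colors are illustrative guesses, not the extension's actual values.
const BADGE_STYLES = {
  connected: { text: "●", color: "#22c55e" },    // green
  reconnecting: { text: "◐", color: "#f59e0b" }, // orange
  disconnected: { text: "○", color: "#9ca3af" }, // gray
};

function badgeFor(state) {
  // Unknown states fall back to the "disconnected" look.
  return Object.hasOwn(BADGE_STYLES, state)
    ? BADGE_STYLES[state]
    : BADGE_STYLES.disconnected;
}

function applyBadge(state) {
  const { text, color } = badgeFor(state);
  // chrome.action only exists inside an MV3 extension context,
  // so outside the browser this function just returns the style.
  if (typeof chrome !== "undefined" && chrome.action) {
    chrome.action.setBadgeText({ text });
    chrome.action.setBadgeBackgroundColor({ color });
  }
  return { text, color };
}

console.log(applyBadge("reconnecting"));
```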

🛠️ MCP Tools

Browser Control

Tool                        Description
browser_navigate            Navigate to a URL
browser_click               Click an element by CSS selector
browser_type                Type text into an element
browser_scroll              Scroll the page
browser_press_key           Press keyboard keys (Enter, Arrow keys, Tab, etc.)
browser_click_by_text       Click element by its text content
browser_wait_for_element    Wait for element to appear
browser_snapshot            Get page snapshot with interactive elements
browser_screenshot          Take a screenshot of the page
browser_evaluate            Execute JavaScript code
browser_execute_safe        Execute JS in content script context (bypasses CSP)
browser_execute_on_element  Execute JS on specific element (CSP-safe)
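
For orientation, here is roughly what an MCP client sends when it invokes one of these tools. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The argument names (`url`, `text`) are assumptions, since this README does not list each tool's input schema, and the `toolCall` helper is hypothetical.

```javascript
// Sketch of the JSON-RPC 2.0 messages an MCP client would send to call tools.
// Assumption: argument names such as "url" and "text" are illustrative only;
// the real input schemas come from the server's tools/list response.
let nextId = 1;

function toolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++, // each request gets a fresh id
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const steps = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_click_by_text", { text: "More information" }),
  toolCall("browser_screenshot"),
];

for (const step of steps) {
  console.log(JSON.stringify(step));
}
```

In practice your AI tool builds these messages for you; the sketch is only meant to show how the tool names in the table map onto requests.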

DevTools Inspection

Tool                 Description
inspect_element      Deep inspect - box model, styles, attributes
get_dom_tree         Get DOM tree structure
get_computed_styles  Get computed CSS properties
get_element_html     Get innerHTML/outerHTML
query_all            Find all elements matching selector
find_by_text         Find elements containing text
get_attributes       Get all attributes and data-* properties

Page Analysis

Tool              Description
get_page_metrics  Performance, element counts, memory
get_console_logs  Captured console messages
get_network_info  Network requests and timing
get_storage       localStorage/sessionStorage contents
get_cookies       Document cookies

Extension Management (for Extension Developers)

Tool                Description
list_extensions     List all installed extensions
reload_extension    Reload extension by ID (use "self" for Apex Agent)
get_extension_info  Get detailed extension info
enable_extension    Enable an extension
disable_extension   Disable an extension

🔧 For Extension Developers: These tools enable an AI-assisted extension development workflow. Your AI can automatically reload your extension after you make changes!


📁 Project Structure

ApexAgent/
├── extension/
│   ├── manifest.json           # Extension manifest (MV3)
│   ├── background.js           # Service worker
│   ├── getting-started.html    # Setup guide for new users
│   ├── popup/
│   │   ├── popup.html          # Extension popup UI
│   │   ├── popup.css           # Styles
│   │   └── popup.js            # Popup logic
│   ├── sidebar/
│   │   ├── sidebar.html        # AI sidebar panel
│   │   ├── sidebar.css         # Sidebar styles
│   │   └── sidebar.js          # AI chat logic
│   ├── content/
│   │   ├── content.js          # Content script (DOM interaction)
│   │   └── content.css         # Visual feedback styles
│   └── icons/
│       ├── icon16.png          # 16x16 icon
│       ├── icon48.png          # 48x48 icon
│       └── icon128.png         # 128x128 icon
├── mcp-server/
│   ├── index.js                # MCP server implementation
│   ├── package.json            # Node.js dependencies
│   └── README.md               # Server documentation
├── PRIVACY.md                  # Privacy policy
└── README.md                   # This file

⚙️ Configuration

Extension Permissions

The extension requests the following permissions:

  • activeTab - Access to the current tab
  • tabs - Tab management
  • scripting - Execute scripts in pages
  • storage - Save settings
  • webNavigation - Track navigation events
  • alarms - Keep service worker alive
  • <all_urls> - Access to all websites

Agent Permissions

Configure in the Agent tab:

  • Mouse Control - Allow AI to click and hover
  • Keyboard Input - Allow AI to type
  • Navigation - Allow AI to navigate
  • Script Execution - Allow AI to run JavaScript
  • Screenshots - Allow AI to capture screenshots
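
One way to picture how these toggles might gate tool calls is sketched below. This is an assumption about the design, not the extension's actual code: the permission keys, the tool-to-permission mapping, and the `isAllowed` helper are all hypothetical.

```javascript
// Hypothetical mapping of MCP tools to the Agent-tab permission they would need.
const REQUIRED_PERMISSION = new Map([
  ["browser_click", "mouse"],
  ["browser_click_by_text", "mouse"],
  ["browser_type", "keyboard"],
  ["browser_press_key", "keyboard"],
  ["browser_navigate", "navigation"],
  ["browser_evaluate", "scripts"],
  ["browser_execute_safe", "scripts"],
  ["browser_execute_on_element", "scripts"],
  ["browser_screenshot", "screenshots"],
]);

function isAllowed(tool, settings) {
  // The master "Agent Control" toggle gates everything.
  if (!settings.agentControl) return false;
  const needed = REQUIRED_PERMISSION.get(tool);
  // Assumption: tools without a mapped permission (read-only inspection)
  // are allowed whenever Agent Control is on.
  if (needed === undefined) return true;
  return settings.permissions[needed] === true;
}

const settings = {
  agentControl: true,
  permissions: { mouse: true, keyboard: false, navigation: true, scripts: false, screenshots: true },
};

console.log(isAllowed("browser_click", settings));
console.log(isAllowed("browser_type", settings));
```

With these example settings the first call is allowed and the second is refused, because Keyboard Input is switched off.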

🔒 Security & Privacy

  • Agent control is enabled by default for convenience, but can be toggled off
  • All AI actions require explicit permission
  • The extension only connects to the MCP server running on localhost
  • No data is sent to external servers
  • All settings and logs are stored locally on your device
  • Open source - you can verify the code yourself

For complete privacy details, see our Privacy Policy (PRIVACY.md).


🐛 Troubleshooting

Extension not connecting?

  1. Make sure the MCP server is running (npm start)
  2. Check the port (default: 3052)
  3. Reload the extension

AI can't control browser?

  1. Enable Agent Control in the Agent tab
  2. Make sure the permission checkboxes you need are ticked
  3. Ensure you're on a regular webpage (not chrome:// or edge://)
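
The third check can be expressed as a quick URL test. This is a minimal sketch: the `isControllableUrl` helper is hypothetical, and treating only `http:` and `https:` pages as "regular" is an assumption, since the README only names `chrome://` and `edge://` as excluded.

```javascript
// Assumption: "regular webpage" means an http(s) URL. Browser-internal pages
// such as chrome:// and edge:// do not allow extension scripts to run.
const CONTROLLABLE_SCHEMES = new Set(["http:", "https:"]);

function isControllableUrl(url) {
  try {
    return CONTROLLABLE_SCHEMES.has(new URL(url).protocol);
  } catch {
    // Not a parseable absolute URL.
    return false;
  }
}

for (const url of ["https://example.com", "chrome://extensions/", "edge://settings"]) {
  console.log(url, isControllableUrl(url));
}
```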

Console showing errors?

  • Check the DevTools console for detailed error messages
  • Reload the extension after making changes

📝 License

MIT License - see LICENSE for details.


👤 Author

RTBRuhan


🙏 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Made with ❤️ for the AI development community