Open Agent Studio - Build Unlimited Computer Use Agents Without Running Out Of Credits (Mac/Windows/Linux)

The first cross-platform desktop application for Agentic Process Automation
An open-source alternative to UiPath and traditional RPA tools that uses unlimited vision-language models (VLMs) for semantic targets.

🚀 Get Started • 📖 Documentation • 🤝 Contributing • 💬 Support

🌟 What is Open Agent Studio?

Open Agent Studio is an open-source alternative to UiPath and traditional RPA tools that enables Agentic Process Automation through natural language. Instead of writing brittle selectors and complex code, you describe what you want in plain English and let AI handle the rest. It combines free, unlimited vision-language models (VLMs) with object detection trained on our synthetic dataset to deliver state-of-the-art computer use, without credits, across any number of servers and agents.

✨ Key Features

🎯 Semantic Targets: Future-proof automation that survives UI changes
🎬 Video-to-Agent: World's first video-based agent creation
🌐 Cross-Platform: Works on Windows and Linux
🔗 Built-in API: Every instance includes a REST API
🧠 AI-Powered: GPT-4 integration for intelligent decision making
🔄 Self-Healing: Robust verification and testing loops

📥 Installation

Windows

Download: Windows Executable
Extract the ZIP file
Install Python 3.10 from Windows Store
Run the application (click "More info" → "Run anyway" if Windows shows security warnings)

Linux

Download: Linux Executable
Extract and run

🚀 Quick Start

1. Create Your First Agent

{
  "key": "your_api_key",
  "json_output": [
    {
      "type": "open tab",
      "target": "https://example.com"
    },
    {
      "type": "click",
      "target": "Login button",
      "browser_mode": true
    }
  ],
  "goal": "Navigate to website and click login"
}
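
The same definition can also be generated from a script instead of being written by hand. The sketch below is only an illustration: `build_agent` is a hypothetical helper that does not ship with Open Agent Studio, and the only things taken from the example above are the field names (`key`, `json_output`, `goal`, `type`, `target`, `browser_mode`).

```python
import json


def build_agent(key, goal, steps):
    """Assemble an agent definition using the field names from the example above.

    build_agent is a hypothetical helper, not part of Open Agent Studio.
    """
    return {"key": key, "json_output": list(steps), "goal": goal}


agent = build_agent(
    key="your_api_key",
    goal="Navigate to website and click login",
    steps=[
        {"type": "open tab", "target": "https://example.com"},
        {"type": "click", "target": "Login button", "browser_mode": True},
    ],
)

# Serialize to JSON text; saving this text as your_agent.json gives a file
# with the same content as the hand-written example.
agent_json = json.dumps(agent, indent=2)
print(agent_json)
```

Building the step list in code makes it easier to reuse steps across agents or to generate them in a loop.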

2. Test Locally

curl -X POST http://localhost:8080/agents \
  -H "Content-Type: application/json" \
  -d @your_agent.json
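
For scripted testing, the curl call above can be approximated with Python's standard library. This is a sketch under assumptions: `build_agent_request` and `run_agent` are hypothetical names, the host and port are copied from the curl command, and the response is returned as plain text because its schema is not described here.

```python
import json
import urllib.request


def build_agent_request(agent, base_url="http://localhost:8080"):
    """Build (but do not send) the POST request that the curl command issues."""
    body = json.dumps(agent).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/agents",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_agent(agent, base_url="http://localhost:8080", timeout=60):
    """Send the request to a running instance and return the response text.

    The response schema is not documented here, so the text is not parsed.
    """
    request = build_agent_request(agent, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


example_agent = {
    "key": "your_api_key",
    "json_output": [{"type": "open tab", "target": "https://example.com"}],
    "goal": "Open the example site",
}

# Inspect the request without contacting the server.
request = build_agent_request(example_agent)
print(request.get_method(), request.full_url)
```

Calling `run_agent(example_agent)` only makes sense while a local instance is listening on that port; the timeout value is an arbitrary choice.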

🎯 Semantic Targets

Instead of fragile CSS selectors, use natural language:

{
  "type": "click",
  "target": "blue submit button",
  "browser_mode": true
}

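Because the target is described by meaning rather than by page structure, the step is designed to keep working even if the website completely changes its design.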

🛠 Agent Node Types

🌐 Browser Automation

Click

{
  "type": "click",
  "target": "Submit button",
  "browser_mode": true
}

Type Text

{
  "type": "keypress",
  "prompt": "Hello, World!"
}

Open Tab

{
  "type": "open tab",
  "target": "https://example.com"
}

Wait/Delay

{
  "type": "delay",
  "time": 5
}
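
When agents are assembled in code, small builder functions keep the four browser nodes above consistent. The function names below are hypothetical and exist only for illustration; the dictionaries they return mirror the JSON shown above field for field.

```python
def click(target, browser_mode=True):
    """Click node: `target` is a natural-language description of the element."""
    return {"type": "click", "target": target, "browser_mode": browser_mode}


def type_text(text):
    """Keypress node: the text to type goes in the `prompt` field."""
    return {"type": "keypress", "prompt": text}


def open_tab(url):
    """Open-tab node: `target` holds the URL to open."""
    return {"type": "open tab", "target": url}


def delay(time):
    """Delay node: the unit of `time` is not stated in the examples above."""
    return {"type": "delay", "time": time}


steps = [
    open_tab("https://example.com"),
    delay(5),
    type_text("Hello, World!"),
    click("Submit button"),
]
print(steps)
```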

🧠 AI & Data Processing

GPT-4 Processing

{
  "type": "gpt4",
  "prompt": "Summarize the following text:",
  "input": ["article_text"],
  "data": "summary"
}

Python Execution

{
  "type": "python",
  "code": "import pandas as pd\nprint('Hello from Python!')"
}

Semantic Scraping

{
  "type": "semanticScrape",
  "target": "product prices",
  "data": "price_data"
}
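
The examples above suggest that a string-valued `data` field names a variable that a step saves, and that `input` lists variables a later step reads. That reading is an assumption, not documented behavior. Under it, a small check like the following (with the hypothetical name `find_undefined_inputs`) can catch a step that reads a variable nothing has produced yet.

```python
def find_undefined_inputs(steps):
    """Return (step_index, name) pairs for inputs that no earlier step saved.

    Assumes a string `data` field names an output variable and `input`
    lists variables to read; the source examples imply this but do not state it.
    """
    defined = set()
    problems = []
    for index, step in enumerate(steps):
        for name in step.get("input", []):
            if name not in defined:
                problems.append((index, name))
        # Only string values are treated as variable names; some nodes
        # (for example google_sheets_add_row) use `data` for literal rows.
        if isinstance(step.get("data"), str):
            defined.add(step["data"])
    return problems


scrape = {"type": "semanticScrape", "target": "article body", "data": "article_text"}
summarize = {
    "type": "gpt4",
    "prompt": "Summarize the following text:",
    "input": ["article_text"],
    "data": "summary",
}

print(find_undefined_inputs([scrape, summarize]))
print(find_undefined_inputs([summarize]))
```

The first call reports no problems because the scrape step saves `article_text` before the GPT-4 step reads it; the second reports the missing variable.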

📊 Integrations

Google Sheets

{
  "type": "google_sheets_add_row",
  "URL": "sheet_url",
  "Sheet_Name": "Sheet1",
  "data": ["John", "Doe", "30"]
}

Email

{
  "type": "email",
  "to": "user@example.com",
  "subject": "Automation Report",
  "body": "Task completed successfully!"
}

API Calls

{
  "type": "api",
  "URL": "https://api.example.com/data",
  "headers": {"Content-Type": "application/json"},
  "body": {"key": "value"}
}
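
A quick way to catch typos before running an agent is to compare each step against the fields shown in the node examples above. The sketch below assumes that every field appearing in an example is expected for that node type, which the source does not confirm; `EXAMPLE_FIELDS` and `lint_step` are hypothetical names.

```python
# Fields copied from the node examples above (excluding "type").
# Treating them as required is an assumption made for illustration.
EXAMPLE_FIELDS = {
    "click": {"target", "browser_mode"},
    "keypress": {"prompt"},
    "open tab": {"target"},
    "delay": {"time"},
    "gpt4": {"prompt", "input", "data"},
    "python": {"code"},
    "semanticScrape": {"target", "data"},
    "google_sheets_add_row": {"URL", "Sheet_Name", "data"},
    "email": {"to", "subject", "body"},
    "api": {"URL", "headers", "body"},
}


def lint_step(step):
    """Return a list of human-readable problems for one step (empty if none)."""
    step_type = step.get("type")
    if step_type not in EXAMPLE_FIELDS:
        return [f"unknown type: {step_type!r}"]
    missing = sorted(EXAMPLE_FIELDS[step_type] - set(step))
    return [f"{step_type}: missing {name!r}" for name in missing]


print(lint_step({"type": "email", "to": "user@example.com"}))
print(lint_step({"type": "delay", "time": 5}))
```

Node types that appear elsewhere in this document but are not listed in the table, such as `google_sheets_create`, would be reported as unknown until they are added.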

📖 Complete Example

Here's a complete agent that scrapes data, analyzes it, and sends results:

{
  "key": "your_api_key",
  "json_output": [
    {
      "type": "open tab",
      "target": "https://news.ycombinator.com"
    },
    {
      "type": "semanticScrape",
      "target": "top story headlines",
      "data": "headlines"
    },
    {
      "type": "gpt4",
      "prompt": "Summarize these headlines and identify key trends:",
      "input": ["headlines"],
      "data": "analysis"
    },
    {
      "type": "google_sheets_create",
      "URL": "sheet_url",
      "Sheet_Name": "news_analysis"
    },
    {
      "type": "google_sheets_add_row",
      "URL": "sheet_url",
      "Sheet_Name": "news_analysis",
      "data": ["{{analysis}}"]
    },
    {
      "type": "email",
      "to": "manager@company.com",
      "subject": "Daily News Analysis",
      "body": "Please find today's analysis attached.",
      "data": "analysis"
    }
  ],
  "goal": "Scrape news, analyze trends, save to sheets, and email results"
}
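
The example above uses `{{analysis}}` inside a step field, which appears to reference the variable saved by the earlier `gpt4` step. Assuming that interpretation, the following sketch lists every placeholder a step uses so it can be compared against the variables defined earlier. The regular expression and the `placeholders` function are illustrative assumptions, not the application's own template engine.

```python
import re

# Matches {{name}} with optional spaces inside the braces (assumed syntax).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def placeholders(value):
    """Recursively collect {{name}} placeholders from strings, lists, and dicts."""
    if isinstance(value, str):
        return PLACEHOLDER.findall(value)
    if isinstance(value, list):
        return [name for item in value for name in placeholders(item)]
    if isinstance(value, dict):
        return [name for item in value.values() for name in placeholders(item)]
    return []


add_row = {
    "type": "google_sheets_add_row",
    "URL": "sheet_url",
    "Sheet_Name": "news_analysis",
    "data": ["{{analysis}}"],
}
print(placeholders(add_row))  # prints ['analysis']
```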

📡 Agent API Reference

POST /agents

Creates and runs a new agent.

Request:

{
  "key": "your_api_key",
  "json_output": [...],
  "goal": "description"
}

Response: Returns execution results and verification data.
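
Before sending a request, the three top-level fields can be checked locally. The sketch below assumes that `key`, `json_output`, and `goal` are all required and that every step carries a `type`, because each example in this document includes them; the source does not spell out validation rules, and `check_request_body` is a hypothetical helper.

```python
def check_request_body(body):
    """Return a list of problems with a POST /agents body (empty if none found)."""
    problems = []
    key = body.get("key")
    if not isinstance(key, str) or not key:
        problems.append("key must be a non-empty string")
    steps = body.get("json_output")
    if not isinstance(steps, list):
        problems.append("json_output must be a list of step objects")
    elif not all(isinstance(step, dict) and "type" in step for step in steps):
        problems.append("every step in json_output needs a type")
    if not isinstance(body.get("goal"), str):
        problems.append("goal must be a string")
    return problems


good = {
    "key": "your_api_key",
    "json_output": [{"type": "delay", "time": 5}],
    "goal": "description",
}
print(check_request_body(good))
print(check_request_body({"key": "", "json_output": "not a list"}))
```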

🗺 Roadmap

Open Agent Cloud - Cloud-based execution (Done!)
Enhanced Video-to-Agent - Improved conversion accuracy (Done!)
Advanced Evaluations - Better testing for generalized agents
Improved Testing Loop - Self-healing automation
Full Open Source Backend - Complete local deployment (Done!)

🤝 Contributing

We welcome contributions! Here's how you can help:

🚀 Get Started

Email rohan@cheatlayer.com for contributor access
Join our community discussions
Check out open issues and feature requests

🎯 Areas We Need Help

Evaluations for generalized agents
Testing loop improvements
Video-to-agent enhancement
Documentation and tutorials
Bug reports and fixes

📋 Development Setup

# Clone the repository
git clone https://github.com/rohanarun/Open-Agent-Studio.git

# For contributor access, email rohan@cheatlayer.com

💬 Support

📧 Email: rohan@cheatlayer.com
📚 Documentation: docs.cheatlayer.com
🐛 Issues: GitHub Issues
💡 Feature Requests: GitHub Discussions

🏆 Our Story

Founded during the pandemic to help people rebuild their businesses with AI, we were the first startup approved by OpenAI to sell GPT-3 for automation in August 2021. We invented "Semantic Targets" and achieved 97% accuracy with our Atlas-2 multimodal model.

Our Vision: In a future where AI can generate custom, secure, and free versions of expensive business software, we're building tools that level the playing field for everyone.