Agent Smith — AI-powered pipeline execution.

11 min read Original article ↗

MIT · Open Source · Self-hosted

AI-powered
pipeline execution.

Define pipelines as chains of steps. Each step can run an AI agent, spawn a multi-role discussion, call tools, or move files. Ticket to PR — fully automated.

coding

api-scan

legal

security

mad

$ agent-smith fix "#54 in my-api"

──────────────────────────────────────────

[1/13] FetchTicket — Null ref in UserService

[2/13] CheckoutSource — Branch fix/54 created

[3/13] BootstrapProject — C# .NET 8 detected

[4/13] LoadCodeMap — 483 files mapped

[5/13] Triage — backend-dev, tester

[6/13] GeneratePlan — Consensus after 1 round

[7/13] Approval Approved ✓

[8/13] AgenticExecute — UserService.cs modified

[9/13] Test 47 tests passed ✓

[10/13] CommitAndPR github.com/…/pull/42

[11/13] ReviewChecks CI passed ✓

[12/13] UpdateTicket — #54 → Done

[13/13] Notify — Slack #dev-ops

✓ PR created · Ticket closed · $0.018

$ agent-smith api-scan \

--swagger https://api.example.com/swagger.json \

--target https://api-staging.example.com

──────────────────────────────────────────

[1/8] LoadSwagger — 33 endpoints, 1 scheme

[2/8] SpawnNuclei — 102 findings in 75s

[3/8] SpawnSpectral — 402 findings in 2s

[4/8] Triage — 4 specialists selected

[5/8] SkillRounds — 2 rounds, consensus

[6/8] CompileFindings — OWASP mapped

[7/8] DeliverFindings — console + markdown

[8/8] Notify — Slack #security

✓ 9 findings retained from 507 · $0.11

$ agent-smith security-scan --repo . --output sarif,markdown

──────────────────────────────────────────

[1/18] CheckoutSource — branch main, 483 files

[2/18] BootstrapProject — C# .NET 8 detected

[3/18] LoadDomainRules — security-principles.md

[4/18] StaticPatternScan— 91 patterns, 6 categories

[5/18] GitHistoryScan — 500 commits, 3 secrets found

[6/18] DependencyAudit — dotnet audit, 2 CVEs

[7/18] CompressFindings — 74% token reduction

[8/18] LoadSkills — 9 security specialists

[9/18] AnalyzeCode — dependency graph mapped

[10/18] SecurityTriage — 7 of 9 skills selected

🔍 Vuln Analyst — contributed

🔐 Auth Reviewer — OBJECT — token expiry

💉 Injection Checker— contributed

🔑 Secrets Detector — OBJECT — 3 in git history

⚙️ Config Auditor — contributed

📦 Supply Chain — contributed

🧹 FP Filter — filtered 84 of 102

[11/18] SkillRounds — round 1 complete

[12/18] ConvergenceCheck — 2 objections, round 2

🔐 Auth Reviewer — AGREE

🔑 Secrets Detector — AGREE

[13/18] ConvergenceCheck consensus ✓

[16/18] CompileDiscussion— findings consolidated

[17/18] ExtractFindings — 18 structured findings

[18/18] DeliverFindings — findings.sarif + report.md

✓ 18 findings · 4 CRITICAL · 9 skills · $0.12

$ agent-smith mad \

"Should we migrate to microservices?"

──────────────────────────────────────────

[1/9] LoadSkills — 5 personas

🧩 Philosopher — big picture perspective

💡 Dreamer — opportunities, vision

🧹 Realist — OBJECT — complexity cost

😈 Devil's Adv — OBJECT — team readiness

🤐 Silencer — filtering noise

[2/9] ConvergenceCheck — no consensus, round 2

🧹 Realist — AGREE

😈 Devil's Adv — OBJECT — migration risk

[3/9] ConvergenceCheck — round 3 consensus

😈 Devil's Adv — AGREE with conditions

[4/9] CompileDiscussion— arguments mapped

[5/9] GenerateVerdict — conditional yes

[6/9] DeliverOutput — decision.md written

[7/9] ExtractActions — 3 action items

[8/9] ReviewChecks — arguments validated

[9/9] Notify — Slack #architecture

✓ Consensus · 3 rounds · $0.22


Pipelines

Eight workflows. Ready to run.

Pre-built pipeline presets for the most common AI orchestration tasks. Or define your own in agentsmith.yml.

agent-smith fix

Fix Bug

Ticket → branch → code → tests → PR. Unattended, end-to-end.

GitHubGitLabAzDO13 steps

agent-smith api-scan

API Security Scan

Nuclei + Spectral on a live API. AI panel reviews against OWASP API Top 10.

NucleiSpectral8–11 steps

agent-smith legal

Legal Analysis

Five legal specialist roles review contracts. Clause analysis, compliance, risk, liability.

5 specialists7 steps

agent-smith security-scan

Security Scan

18-step pipeline with 9 specialist roles, pattern scanner, git history analysis, and SARIF output.

SARIFMarkdown9 skills18 steps

agent-smith mad

MAD Discussion

Multi-agent design debate. Five personas argue in rounds until consensus.

Convergence9 steps

agent-smith feature

Add Feature

Like fix-bug — plus generates tests and documentation automatically.

Multi-role16 steps

agent-smith init

Init Project

Bootstraps .agentsmith/ — detects language, framework, conventions.

.NETPythonTypeScript

agentsmith.yml

Custom Pipelines

Mix any command handlers. Add project-specific steps. Fully extensible.

Your rules

How it works

Intelligence at every step.

01

Fetch the ticket

Agent Smith reads the issue from GitHub, GitLab, Azure DevOps, or Jira. Title, description, labels — everything becomes context.

02

Triage selects specialists

Based on what the bug touches — database, auth, API — Triage picks roles like backend-dev and tester. No hardcoding.

03

Roles plan until agreement

Each role can AGREE, OBJECT with an alternative, or SUGGEST an improvement. Rounds continue until consensus or human approval.

04

Agentic execution

The agent has file tools, bash, and your coding principles. It reads, writes, and iterates. Tests run automatically after every change.

05

PR opened. Ticket closed.

Code committed, PR created, ticket status updated. Token usage and dollar cost tracked in the result file.

01

Load the OpenAPI spec

Reads your Swagger/OpenAPI JSON or YAML. Maps every endpoint, parameter, auth scheme, and response type.

02

Spawn scanners

Runs Nuclei and Spectral against the live target. Nuclei probes for vulnerabilities, Spectral checks the schema for design issues.

03

AI panel reviews findings

Four specialists — vuln-analyst, design-auditor, auth-tester, and FP filter — review all findings against the OWASP API Top 10.

04

Convergence in rounds

Specialists AGREE or OBJECT. The design-auditor might challenge a vuln-analyst finding. Rounds continue until the panel aligns.

05

Findings delivered.

Filtered results in Markdown and console output. OWASP-mapped, severity-rated, with remediation guidance. Cost tracked.

01

Acquire the document

Upload a contract, NDA, or terms of service as PDF. Agent Smith extracts the full text and clause structure.

02

Five legal specialists activate

Contract analyst, compliance checker, risk assessor, liability reviewer, and clause negotiator — each with their own lens.

03

Clause-by-clause analysis

Each specialist reviews every clause. Compliance checks GDPR and liability caps. Risk assessor flags unfavorable terms.

04

Negotiator suggests alternatives

For every high-risk finding, the clause negotiator drafts alternative language you can propose to the counterparty.

05

Analysis delivered.

Findings as Markdown with risk ratings, clause references, and suggested rewrites. Token usage and cost tracked.

01

Static analysis first

Three automated scanners run before any AI touches the code: 91 regex patterns across 6 categories, git history scan for leaked secrets in 500 commits, and dependency audit for known CVEs.

02

Compress and slice

Raw findings are grouped by category and compressed — 74% token reduction. Each specialist gets only the slice relevant to their expertise, not the full dump.

03

Triage selects from 9 specialists

Vuln analyst, auth reviewer, injection checker, secrets detector, config auditor, supply chain auditor, compliance checker, AI security reviewer, and the mandatory false-positive filter.

04

Specialists review in rounds

Each specialist analyzes the code and their finding slice. They can AGREE or OBJECT. Rounds continue until consensus or max rounds (default 3). The FP filter removes noise: confidence < 8, infrastructure issues, duplicates.

05

SARIF + Markdown delivered.

Findings exported as SARIF 2.1 for GitHub Security tab, Azure DevOps, or any SARIF viewer. Markdown report for humans. Console output color-coded by severity. All formats combinable.

01

Frame the question

State any architectural or strategic decision. "Should we migrate to microservices?" "Rewrite in Rust or stay with Go?"

02

Five personas loaded

Philosopher (big picture), Dreamer (opportunities), Realist (constraints), Devil's Advocate (risks), and Silencer (noise filter).

03

Personas debate in rounds

Each persona argues from their perspective. They AGREE, OBJECT, or SUGGEST. Expect 2–4 rounds of heated but structured debate.

04

Convergence check

After each round, the system checks for consensus. The Silencer removes repetitive or low-value arguments. Debate tightens.

05

Decision delivered.

A structured verdict with pro/con arguments, conditions, and a clear recommendation. Token usage and cost tracked.

—— Triage ————————————————————————————————

Reviewing ticket + codebase...

Selected: backend-dev, tester

—— Plan —————————————————————————————————

backend-dev: null check in UserService.GetById

tester: AGREE — add regression test

—— Execute ——————————————————————————————

Modified: UserService.cs

Created: UserServiceTests.cs

✓ 47 tests passed

✓ PR #42 created · #54 closed · $0.018

api-scan · api-staging.example.com

—— Scanners ——————————————————————————————

Nuclei: 102 findings in 75s

Spectral: 405 findings in 2s

—— Round 1 ———————————————————————————————

vuln-analyst: contributed — BOLA on /users/{id}

design-auditor: OBJECT — missing rate limit

auth-tester: contributed — JWT none alg

fp-filter: filtered 498 of 507

—— Round 2 ———————————————————————————————

design-auditor: AGREE

✓ 9 findings retained from 507 · $0.11

—— Specialists ————————————————————————————

All 5 activated: contract-analyst,

compliance, risk, liability, negotiator

—— Analysis ——————————————————————————————

contract-analyst: 14 clauses reviewed

compliance: OBJECT — GDPR gap in §7.2

risk: HIGH — unlimited liability in §12

liability: confirmed — cap missing

negotiator: 3 alternative clauses drafted

✓ 12 findings · 3 HIGH · $0.34

security-scan · agent-smith repo

—— Static Analysis —————————————————————————

Patterns: 91 patterns, 6 categories

Git scan: 500 commits, 3 secrets

Deps: 2 CVEs in 1 package

Compress: 74% token reduction

—— Triage ————————————————————————————————

Selected: 7 of 9 specialists

—— Round 1 ———————————————————————————————

vuln-analyst: contributed

auth-reviewer: OBJECT — token expiry too long

secrets-detector: OBJECT — 3 secrets in git history

fp-filter: filtered 84 of 102

—— Round 2 ———————————————————————————————

auth-reviewer: AGREE

secrets-detector: AGREE

✓ 18 findings · 4 CRITICAL · 9 skills · $0.12

mad · microservices migration

—— Round 1 ———————————————————————————————

philosopher: modularity enables team autonomy

dreamer: independent scaling, polyglot

realist: OBJECT — team too small for micro

devil's adv: OBJECT — distributed tracing cost

silencer: removed 2 circular arguments

—— Round 3 ———————————————————————————————

realist: AGREE — start with 2 services

devil's adv: AGREE with conditions

✓ Conditional yes · 3 rounds · $0.22

Multi-Skill Architecture

Specialists, not generalists.

Roles defined in YAML files. Each has its own perspective, rules, and convergence criteria. Triage selects who's relevant. Add your own in minutes.

Coding

Architect · Backend · Frontend · Tester · DBA · DevOps · Security · PO

Security

Vuln Analyst · Auth Reviewer · Injection Checker · Secrets Detector · Config Auditor · Supply Chain · Compliance · AI Security · FP Filter

API Security

Design Auditor · Auth Tester · Vuln Analyst · False Positive Filter

Legal

Contract Analyst · Compliance · Risk Assessor · Liability · Negotiator

MAD Discussion

Philosopher · Dreamer · Realist · Devil's Advocate · Silencer

Your custom roles

Add any specialist — performance, accessibility, domain-specific

Live — api-security-scan · RHS.DigitalAccessPortal

—— Triage ——————————————————————————————————

Lead: api-vuln-analyst

Participants: 4 selected from 4 available

—— Round 1 —————————————————————————————————

🔎 API Vuln Analyst (r1): contributed

🔍 API Design Auditor (r1): OBJECT

🔒 Auth Tester (r1): contributed

🧹 FP Filter (r1): contributed

—— Round 2 —————————————————————————————————

🔍 API Design Auditor (r2): AGREE

—— Result ——————————————————————————————————

✓ Consensus after 2 rounds

✓ 9 findings retained from 507 analyzed

✓ 6 LLM calls · $0.11 · claude-sonnet

AI Providers

Your model. Your choice.

Works with any provider. Switch models per pipeline. Run fully local with Ollama — no API key required.

Groq, Azure OpenAI, LM Studio, and any OpenAI-compatible endpoint work out of the box.

Deployment

Runs where you run.

Single binary for quick starts. Docker for teams. Kubernetes for production. Chat gateway for Slack and Teams.

Single Binary

Self-contained, no runtime dependencies. Drop onto any CI runner and go.

Docker

Official image on Docker Hub. Mount config, pass API keys. Any container runtime.

Kubernetes

Helm chart included. Dispatcher as service, agents as Jobs. Scales cleanly.

Chat Gateway

Connect Slack or Teams. Type "fix #42 in my-api" — Agent Smith does the rest.

Azure DevOps

Pipeline task in YAML. Security scans, API scans, bug fixes — all from AzDO.

GitHub Actions

Trigger on PR events or schedules. SARIF output for GitHub Security tab.

Philosophy

Your AI.
Your infrastructure.
Your rules.

No SaaS in betweenYour code and tickets never leave your infrastructure. Agent Smith calls your AI provider directly.

Cost transparentEvery run records token usage and dollar cost. You always know what you spent and why.

Skills over promptsSpecialist roles in YAML. Add your own domain experts. No prompt engineering required.

Human in the loopApproval step before execution. Run headless when you trust the pipeline. Your call.

agentsmith.yml

projects:
  my-api:
    ticket_provider: github
    source_provider: github
    repo: "org/my-api"
    api_security_scan: on_pr

agent:
  provider: claude
  model: "claude-sonnet-4-20250514"

# Or run fully local:
agent:
  provider: ollama
  model: "llama3.3:70b"
  base_url: "http://localhost:11434"

From ticket to PR.
In minutes.

Open source. Self-hosted. Free forever.