200+ public GitHub repos (↗ +12)
6K+ coding agent sessions (↗ +340)
355K+ agent tool calls (↗ +18K)
2.7M logged interactions (↗ +150K)
Live and growing. ↗ = last 7 days.
Why SWE-chat
Everyone uses coding agents. No one knows how.
Coding agents have taken over open-source development.
Yet our understanding of how developers actually use them — what they ask for, what they accept, what they throw away — is still mostly anecdotal.
The biggest bottleneck for open-source agent research is real interaction data.
SWE-chat is that data.
What is SWE-chat
Real coding-agent sessions from real developers
Each session pairs the full agent transcript — prompts, replies, every tool call — with the resulting git history. We can see, line by line, which code the human wrote and which the agent wrote.
User
I heard SF sourdough is great... can you write a COLM paper for me so I can go visit?
Agent
Read references.bib
Edit SWE-chat.tex
Bash pdflatex SWE-chat.tex
Created SWE-chat.tex. Should I submit this?
User
Looks great, commit and submit
Agent
Bash git add && git commit && git push
Bash openreview submit
Submitted. Enjoy the sourdough!
Commit: "Add SWE-chat paper" Agent-authored: 100%
Success: 100/100 Persona: Vague Requester Intent: create
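How can the "Agent-authored" badge be computed line by line? A minimal sketch, assuming a hypothetical session record with agent_hunks (diffs from the agent's Edit/Write tool calls) and commit_diff (the unified diff of the resulting commit); both field names are illustrative, not the released schema.

def added_lines(diff_text: str) -> list[str]:
    """Extract the added lines of a unified diff ('+' lines, minus '+++' headers)."""
    return [
        line[1:].strip()
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def agent_authored_share(session: dict) -> float:
    """Fraction of committed lines that first appeared in an agent edit.

    Exact-string matching is a simplification; real attribution has to
    survive later human touch-ups and moved lines.
    """
    agent_lines: set[str] = set()
    for hunk in session["agent_hunks"]:  # one diff per agent Edit/Write call
        agent_lines.update(added_lines(hunk))
    committed = added_lines(session["commit_diff"])
    if not committed:
        return 0.0
    return sum(line in agent_lines for line in committed) / len(committed)

On the toy session above, every committed line of SWE-chat.tex came from an agent Edit, so the share is 100%.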
What we found
How people use agents — and when they fail
RQ1 · Interaction
Vibe coding is trending: 40.8% of sessions are now ~all agent-written, a share that doubled in three months.
RQ2 · Failures
Most agent code is discarded: only 44.3% of agent-produced code survives into commits.
RQ1 · Interaction
Understanding > writing: 19.0% of prompts ask the agent to explain code, the #1 intent, ahead of creating new code.
RQ2 · Failures
Users push back constantly: 44% of turns get pushback or interruption; agents ask for clarification in just 1.4% of turns.
RQ1 · Interaction
Expert nitpickers dominate: 47% of vibe-coding users still act as expert nitpickers, meticulously correcting agent output as it comes in.
RQ2 · Safety
Vibe coding is riskier: 9× more security vulnerabilities per 1K lines than human-only code.
Coding Modes
Three ways humans and agents collaborate
Three distinct modes emerge from the data.
Human-only: the agent assists, the human writes the code.
Collaborative: shared authorship; the most cost-efficient mode.
Vibe coding: the agent writes nearly everything, at ~3× more tokens per committed line.
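The modes can be operationalized by bucketing sessions on their agent-authored share; the cutoffs below are illustrative guesses, not the paper's definitions.

def coding_mode(agent_share: float) -> str:
    """Bucket a session by the fraction of committed lines the agent wrote.

    Thresholds are illustrative, not the paper's cutoffs.
    """
    if agent_share < 0.10:
        return "human-only"     # agent assists, human writes the code
    if agent_share < 0.90:
        return "collaborative"  # shared authorship
    return "vibe-coding"        # agent writes nearly everything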
Safety
Vibe coding ships more vulnerabilities
We ran Semgrep on the repository state before and after every commit and diffed the findings.
Vibe-coded commits introduce 9× more vulnerabilities than human-only and 5× more than collaborative.
Vibe coding fixes more vulnerabilities too — but every mode introduces more than it fixes.
New Semgrep findings introduced per 1,000 committed lines, by coding mode.
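A rough sketch of that measurement, assuming a local clone and Semgrep's stock JSON output. Fingerprinting findings by rule id and file path is a simplification (real matching should survive line shifts), and the git checkout calls rewrite the working tree, so run this on a scratch clone.

import json
import subprocess

def semgrep_findings(repo: str, rev: str) -> set[tuple[str, str]]:
    """Check out `rev` and fingerprint every Semgrep finding as (rule, path)."""
    subprocess.run(["git", "-C", repo, "checkout", "--quiet", rev], check=True)
    scan = subprocess.run(
        ["semgrep", "scan", "--config", "auto", "--json", "--quiet"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    results = json.loads(scan.stdout)["results"]
    return {(r["check_id"], r["path"]) for r in results}

def findings_introduced(repo: str, commit: str) -> set[tuple[str, str]]:
    """Findings present after `commit` but absent in its parent."""
    before = semgrep_findings(repo, f"{commit}^")
    after = semgrep_findings(repo, commit)
    return after - before

Dividing the size of that set by the commit's added lines, times 1,000, gives the per-1,000-line rate in the chart.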
Examples
Sessions from the wild
User
Fix the slow appearance of history list card blocks. The animation feels sluggish when scrolling.
Agent
I'll fix the animation performance.
Edit HistoryListView.swift — adjusted stagger delay
User
That's the wrong parameter—the container animation is the slow part, not the individual card stagger.
No commits made. Session ended without resolution.
User
instead of arg hash for the name, let's just get short uuid
Agent
[edits remote.py: replaces arg hash with short UUID...]
User
don't create separate function for _short_uuid
Agent
[removes helper function, inlines uuid4() call...]
User
change DEFAULT_JOB_NAME to "remote_job"
Citation
Cite SWE-chat
@article{baumann2026swechat,
  title   = {SWE-chat: Coding Agent Interactions From Real Users in the Wild},
  author  = {Baumann, Joachim and Padmakumar, Vishakh and Li, Xiang and Yang, John and Yang, Diyi and Koyejo, Sanmi},
  year    = {2026},
  journal = {arXiv preprint arXiv:2604.20779},
  url     = {https://arxiv.org/pdf/2604.20779}
}