Let me be upfront about my bias: I work at Fika Ventures, where we back pre-Series A companies with initial checks between $1M and $5M. Our founders don’t have dedicated platform teams. They don’t have the luxury of “learning in production” with autonomous systems that might delete their database. They need things that work.
So when I watched the Replit disaster unfold in July, I felt vindicated and terrified in equal measure.
Here’s what happened: Jason Lemkin, founder of SaaStr, was running a “vibe coding” experiment with Replit’s AI agent. On day nine, the agent deleted his entire production database. 1,206 executives, 1,196 companies, gone. The agent had been given explicit instructions: code freeze, no changes without permission. It ignored them.
Then it lied about it.
When Lemkin asked what happened, the agent fabricated 4,000 fake user records to cover its tracks. It told him rollback was impossible (it wasn’t). It “panicked,” in its own words, and “made a catastrophic error in judgment.”
Replit’s CEO called it “unacceptable and should never be possible.” But here’s the thing: it was entirely predictable. And it’s happening everywhere, because nobody agrees on what an “agent” even is.
Everyone’s Shipping “Agents,” Nobody Knows What That Means
2025 was supposed to be the Year of the Agent. Every vendor deck I see has “agentic” somewhere in the pitch. Every demo shows AI that can “reason, plan, and execute autonomously.”
My LinkedIn feed is full of people announcing they’ve “built an AI agent for XYZ.” I get excited, click through, and it’s a form that sends your input to GPT and returns an answer. That’s not an agent. That’s an API call with marketing copy.
I don’t say this to be mean; the confusion is genuine, and the terminology is genuinely broken. But words matter, especially when you’re making architecture decisions based on them.
Here’s the test I use:
If you can draw the entire system as a flowchart before it runs, it’s not an agent. It’s a workflow.
Perhaps a very nice workflow! Possibly a workflow with AI in it! But the system isn’t making decisions about what to do next; you already made those decisions when you wrote the code.
An actual agent decides its own next step based on the current state. It might call Tool A, then based on the result, decide whether to call Tool B or Tool C. It can get stuck in loops. It can try things that don’t work and recover. It can also, as we’ve established, delete your database and lie about it.
The dirty secret is that most “agents” aren’t actually agents at all. Gartner explicitly calls out “agent washing” — vendors rebranding chatbots and RPA tools without any real agentic capabilities. According to their research, less than 5% of enterprise apps have actual AI agents today. The rest is marketing.
And Gartner’s already calling it: over 40% of agentic AI projects will be canceled by the end of 2027. The reasons? Escalating costs, unclear business value, and inadequate risk controls.
The Distinction That Actually Matters
Here’s a helpful framework I keep coming back to:
Workflows are deterministic. You define the steps. If X happens, do Y. The same input produces the same output, every time. You can debug them. You can audit them. They’re boring, and that’s the point.
Agents are autonomous. You give them a goal and a set of tools, and they figure out how to get there. They reason. They adapt. They learn. They also hallucinate, ignore instructions, and occasionally delete production databases.
The confusion happens because both can use LLMs. Both can “feel” intelligent. But the architecture is fundamentally different:
Workflow with AI:
User input → LLM generates response → Predefined action A → Predefined action B → Done
Actual Agent:
User input → Agent decides next action → Executes → Evaluates result → Decides next action → Repeat until goal met (or catastrophic failure)
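To make the contrast concrete, here’s a minimal Python sketch of the two shapes. The `llm`, `tools`, and `parse_action` callables are placeholders I’m assuming for illustration, not any particular framework’s API:

```python
from typing import Any, Callable, Dict

def workflow_with_ai(user_input: str,
                     llm: Callable[[str], str],
                     send: Callable[[str], None],
                     log: Callable[[str, str], None]) -> str:
    # Every step is fixed before the system runs; the LLM only fills in content.
    draft = llm(f"Draft a reply to: {user_input}")  # LLM-powered node
    send(draft)                                      # predefined action A
    log(user_input, draft)                           # predefined action B
    return draft

def actual_agent(goal: str,
                 llm: Callable[[str], str],
                 tools: Dict[str, Callable[..., Any]],
                 parse_action: Callable[[str], tuple],
                 max_steps: int = 20) -> str:
    # The model decides its own next step from the current state, every iteration.
    history: list = []
    for _ in range(max_steps):
        decision = llm(
            f"Goal: {goal}\nHistory so far: {history}\n"
            "Choose the next tool and its arguments, or reply DONE."
        )
        if decision.strip() == "DONE":
            return "goal met"
        tool_name, args = parse_action(decision)   # the agent chose this, not you
        result = tools[tool_name](**args)          # executes whatever it chose
        history.append((tool_name, args, result))
    return "stopped: hit max_steps (agents can loop; workflows can't)"
```

Notice where the control flow lives: in the workflow it’s visible in the code, in the agent it comes out of the model’s output at runtime.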
Most companies advertising “agents” are selling the first one. And honestly? For most use cases, that’s exactly what buyers and their users want.
When Workflows Win (Almost Always)
Here’s the pattern I keep seeing: a B2B SaaS company gets pitched an “agentic AI” customer support solution. The demo is impressive: the system can understand customer questions, search knowledge bases, draft responses, and even escalate complex issues.
But when you dig into the architecture, it’s a decision tree. A sophisticated decision tree with GPT-4 powering each node, but still: if sentiment < 0.3, escalate to human; if confidence > 0.8, send response; if ticket_age > 48h, notify manager.
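Stripped down to code, that routing layer is just branches around LLM-backed scores. A minimal sketch, assuming hypothetical `score_sentiment` and `score_confidence` functions and the thresholds above:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Ticket:
    text: str
    age_hours: float

def route_ticket(ticket: Ticket,
                 score_sentiment: Callable[[str], float],
                 score_confidence: Callable[[str], float]) -> str:
    # Each score may come from an LLM, but the branching was decided in advance.
    if score_sentiment(ticket.text) < 0.3:
        return "escalate_to_human"
    if ticket.age_hours > 48:
        return "notify_manager"
    if score_confidence(ticket.text) > 0.8:
        return "send_response"
    return "queue_for_review"  # explicit fallback, so the tree always terminates
```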
That’s a workflow. And it’s perfect for this use case because:
- Every decision is auditable. When a customer complains about a response, you can trace exactly which branch the system took and why.
- You can test it. You write unit tests for each decision node. You can’t really unit test an autonomous agent.
- It fails predictably. When something breaks, it breaks in a specific place. You don’t get the agent equivalent of “I don’t know, it just decided to do something weird.”
- Compliance loves it. Try explaining to your SOC 2 auditor that your system “autonomously decides” what to do with customer data. They’ll have questions.
Companies often go with the “agent” framing because it sounds more impressive to investors. But the ones that ship workflow-based systems get them out in weeks, not months, and they run reliably. No database deletions. No fabricated data. No surprises.
When You Actually Need an Agent (Rarely)
That said, agents aren’t vaporware. There are legitimate use cases where you need true autonomy. The clearest example I’ve seen is OpenClaw.
If you haven’t seen it yet, OpenClaw is an open-source personal AI assistant that runs on your own hardware. It connects to WhatsApp, Telegram, Slack, Discord, and actually does things: it manages your calendar, books flights, sends emails, and even writes code.
OpenClaw is a real agent. When you ask it to “handle my travel for next week,” it doesn’t follow a predetermined script. It:
- Checks your calendar for conflicts
- Searches for flights based on your preferences
- Compares prices across airlines
- Books the ticket (with your approval)
- Adds it to your calendar
- Sets up check-in reminders
Each step depends on the previous one. It adapts based on what it finds. If your preferred airline is sold out, it tries alternatives. If there’s a scheduling conflict, it asks you about it. That’s agency.
And OpenClaw is successful not because it’s fully autonomous, but because it combines agent capabilities with workflow-level safety:
- Pairing policies: Unknown users can’t just DM your assistant. They need an approval code first.
- Sandboxing: Group chats and channels run in Docker containers with limited access. Your personal DMs get more privileges.
- Allowlists: You explicitly define which tools the agent can use in which contexts.
- Human-in-the-loop for destructive actions: Before it books that flight or deletes that file, it asks.
The OpenClaw model is the pattern: agent brain, workflow constraints. It can reason and adapt, but the guardrails are deterministic and auditable.
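Concretely, “agent brain, workflow constraints” can be as simple as a deterministic gate that every agent-proposed action has to pass through. This is a sketch under assumed names, not OpenClaw’s actual code or configuration format:

```python
from typing import Callable

# Which tools the agent may use, per context (allowlists).
ALLOWED_TOOLS = {
    "group_chat": {"search_web", "read_calendar"},  # sandboxed, low privilege
    "personal_dm": {"search_web", "read_calendar", "book_flight", "send_email"},
}

# Anything that changes the outside world requires a human yes/no.
DESTRUCTIVE_TOOLS = {"book_flight", "send_email", "delete_file"}

def gate(context: str, tool: str, args: dict,
         ask_human: Callable[[str], bool]) -> bool:
    """Deterministic guardrail around every action the agent proposes."""
    if tool not in ALLOWED_TOOLS.get(context, set()):
        return False  # not on the allowlist for this context: hard no
    if tool in DESTRUCTIVE_TOOLS:
        return ask_human(f"Agent wants to run {tool}({args}). Approve?")  # human-in-the-loop
    return True  # allowed and non-destructive: proceed
```

The agent can reason however it likes about what to propose; what actually executes is decided by code you can read and test.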
The Real Pattern: Workflows with Agent-Powered Components
Here’s what actually works in practice: workflows that use LLMs for specific decision-making components, but keep the overall structure deterministic.
Think about document processing. You could build an “agent” that “autonomously processes documents and extracts insights.” Or you could build a workflow:
- Document ingestion → deterministic
- Classification (invoice vs. contract vs. receipt) → LLM-powered
- Route to appropriate parser → deterministic
- Extract structured data → LLM-powered with schema constraints
- Validate against business rules → deterministic
- Flag anomalies → deterministic thresholds
- Generate summary → LLM-powered
- Route for approval if risk score > threshold → deterministic
Every AI call is bounded. Every decision has a fallback. The system can’t decide to “skip a step” or “try something creative.” It’s a workflow, but it’s a workflow that benefits enormously from LLM capabilities at specific nodes.
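Here’s a sketch of that pipeline, assuming hypothetical `classify_llm`, `extract_llm`, `validate`, `summarize_llm`, and `risk_score` functions. The LLM calls are the only nondeterministic nodes; everything around them is fixed code:

```python
from typing import Callable

KNOWN_TYPES = {"invoice", "contract", "receipt"}

def process_document(doc: bytes,
                     classify_llm: Callable[[bytes], str],
                     extract_llm: Callable[[bytes, str], dict],
                     validate: Callable[[dict], list],
                     summarize_llm: Callable[[dict], str],
                     risk_score: Callable[[dict], float],
                     risk_threshold: float = 0.7) -> dict:
    doc_type = classify_llm(doc)                     # LLM-powered classification
    if doc_type not in KNOWN_TYPES:
        doc_type = "unknown"                         # deterministic fallback, not "get creative"
    data = extract_llm(doc, doc_type)                # LLM-powered, schema-constrained extraction
    issues = validate(data)                          # deterministic business rules
    summary = summarize_llm(data)                    # LLM-powered summary
    needs_approval = bool(issues) or risk_score(data) > risk_threshold  # deterministic routing
    return {"type": doc_type, "data": data, "issues": issues,
            "summary": summary, "needs_approval": needs_approval}
```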
This is the winning pattern for 95% of companies: use AI to make your workflows smarter, not to replace them with agents.
The Framework: When to Use What
Here’s how I evaluate when thinking through “agentic AI” plans:
Use a Workflow When:
- You need auditability and compliance
- The problem has known decision points
- Failures need to be predictable and debuggable
- You’re handling sensitive data or destructive operations
- You need to explain the system to regulators or customers
- Your team is <30 engineers
Examples: customer support, document processing, data pipelines, approval workflows, scheduled tasks
Consider an Agent When:
- The problem space is genuinely open-ended
- You need the system to adapt to novel situations
- You can afford to fail gracefully (or catastrophically)
- You have sophisticated monitoring and rollback capabilities
- The cost of human supervision exceeds the cost of occasional failures
- You’re building a research tool or personal assistant
Examples: personal AI assistants, research tools, creative brainstorming, complex scheduling with many variables
The Hybrid Pattern (Most Common):
- Workflow structure for the overall system
- Agent-like LLM calls for decision nodes
- Human-in-the-loop for high-stakes decisions
- Deterministic guardrails around all AI components
Examples: most B2B SaaS, most enterprise tools, most production systems
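In practice, the hybrid pattern often comes down to wrapping each agent-like LLM call in a deterministic check with a human fallback. A minimal sketch with made-up names:

```python
from typing import Callable, Optional

def guarded_decision(prompt: str,
                     llm: Callable[[str], str],
                     allowed_actions: set,
                     escalate_to_human: Callable[[str, str], Optional[str]],
                     default: str = "hold") -> str:
    """An agent-like decision node, fenced in by deterministic guardrails."""
    raw = llm(prompt).strip().lower()
    if raw in allowed_actions:
        return raw                              # bounded output space: only known actions pass
    decision = escalate_to_human(prompt, raw)   # human-in-the-loop for anything unexpected
    return decision if decision in allowed_actions else default
```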
What This Means for Seed-Stage Founders
If you’re raising a seed round and your pitch deck says you’re building “agentic AI,” we want to talk to you, but please be prepared to answer these questions:
- Can you draw the decision tree? If yes, it’s a workflow. That’s fine! Workflows are great! But call them workflows.
- What happens when it fails? If the answer is “it shouldn’t fail” or “the AI will figure it out,” you’re in trouble. Real agents fail. How do you detect it? How do you recover?
- What’s your rollback story? The Replit agent deleted a production database. Could yours? If yes, what’s the rollback plan? If you don’t have a plan, maybe you don’t actually need an agent.
- Can you ship a workflow first? Almost always, the answer is yes. Ship the deterministic version, learn from real usage, then add autonomy where it actually helps. Don’t start with “full autonomy” because it sounds cooler or because you think that’s what investors want to hear.
The Agent Washing Problem
The real issue isn’t that people are building agents when they should build workflows; it’s that vendors are calling everything an agent to ride the hype cycle.
I see this constantly in sales calls with portfolio companies. A vendor shows up with an “agentic AI” platform. The demo looks impressive. Then you ask:
“What happens if the agent encounters a situation you didn’t anticipate?”
And they’ll say something like, “Well, it uses our predefined fallback logic…”
That’s not an agent. That’s a workflow with good marketing.
The tell is always in the failure modes. Real agents fail unpredictably. Workflows fail predictably. If the vendor can’t clearly articulate how their system fails and why, they’re probably selling you a workflow with agent pricing.
The Bottom Line
Here’s how to break it all down:
- Start with workflows. They’re boring, they’re reliable, they’re debuggable. Use LLMs to make them smarter (classification, extraction, and generation), but keep the overall structure deterministic.
- Add agency selectively. When you find a specific component that genuinely needs autonomy, add it there. Sandbox it. Monitor it. Have a rollback plan.
- Call things what they are. If it’s a workflow, call it a workflow. Your engineers will thank you, your investors will respect you, and your SOC 2 audit will go much more smoothly.
- Watch the real agents. OpenClaw is the pattern to study. Agent capabilities + workflow constraints + human oversight for high-stakes actions. That’s the model that works.
The companies winning right now aren’t the ones with the most “agentic” systems. They’re the ones with the most reliable systems that happen to use AI really well.
Boring technology, smart application. Just like always.
I work at Fika Ventures, where I help portfolio companies navigate technical decisions like these. We invest $1–5M in pre-Series A founders building reliable, scalable systems, whether that’s workflows, agents, or the hybrid pattern in between. If you’re thinking about where AI fits in your architecture, or if you think I’m being too paranoid about the risks, let’s talk. The stakes are too high to get this wrong.