TL;DR
We set out to make API orchestration cheaper for customers sitting on mountains of REST and gRPC; what we got is a practical backend framework for LLMs. Cosmo Plugins give you schema-first modules, generated protobuf contracts, and agent-assisted adapter code so models (in Cursor-class IDEs) implement the glue between your federated graph and non-GraphQL backends—without a bespoke subgraph for every integration. Plugins run as isolated subprocesses under the Cosmo Router, so you can ship a modular monolith, split into services later, and still lean on Federation for cross-plugin dependencies. One typed graph at the edge for apps and agents; the LLM does the repetitive wiring. Cosmo Plugins are part of Cosmo Connect, our framework for connecting REST, gRPC, and other backends into the Supergraph without GraphQL rewrites.
While solving the problem of simplifying API orchestration with LLMs for our customers, we accidentally built a backend framework for LLMs, and it's a joy to use. We unpacked the thinking behind it in this blog on generative API orchestration.
It's like v0 from Vercel, Lovable, or Bolt...but for the backend.
You can use it to build modular monoliths, microservices, or a mix of both. And if you want to, you can move from one to the other. If you like the idea of vibe coding complete backend applications, you will love it. If that sounds like nails on a chalkboard, you might lose your job.
Why digital enterprises build graphs of APIs
APIs naturally form graphs
We're working with companies like eBay, SoundCloud, Procore, and others to help them build great GraphQL APIs.
By talking to them about their workflows and how they build software to solve customer problems, we've started to notice consistent patterns.
One such pattern is that all digital enterprises build graphs of APIs. Intentionally or not, organizations tend to split their projects into teams, each owning part of the overall domain. These teams build APIs to provide capabilities to other teams or to enable products to be built.
Typically, the APIs built by individual teams are somewhat related to each other. Nothing in a business is fully isolated from the rest. A customer buys a product, the product needs to be shipped, the payment needs to be processed, and different teams present the buying experience through different channels, such as a website, a mobile app, or a chatbot.
From accidental graphs to GraphQL Federation
Our customers have not only acknowledged this, they are actively embracing it. Instead of creating a graph of APIs by accident, they are doing it intentionally. They are building on top of the federated graph concept, also known as GraphQL Federation.
A federated graph is an API that combines multiple services (subgraphs) into a single API. The public API of the federated graph looks like a GraphQL monolith, but on the inside it's composed of multiple services.
Why BFFs and service registries fall short
An alternative would be to build fully isolated REST or gRPC services, publish them in a registry like Backstage, and have product teams manually integrate them into their applications, e.g., through a backend-for-frontend (BFF) pattern.
The problem with the BFF approach is that it doesn't remove the NxM connections between frontends and backend services. It only moves them into the BFF layer, where each BFF still has to be wired by hand to every backend service it needs.
Federated graphs as a single AI entry point
The federated graph approach, on the other hand, unifies all APIs into a graph. This serves as a single entry point for all applications, including web apps, mobile, chat applications, and now LLM agents using the Model Context Protocol (MCP).
Compared to the Backstage approach, the federated graph makes API discovery, integration, and extension much easier. With a traditional Microservices architecture, you have to find the right service to solve your problem, then manually integrate each one into your product, similar to managing dependencies. With a federated graph, all you have to do is search through the Schema and write a query. With the help of LLMs, that means that you write a single prompt and get a query back. In terms of extending the graph, you add new fields to the schema, implement them, and publish the changes to the Schema Registry (WunderGraph Cosmo). Once published, other users can use a prompt to build a query that includes the new field.
But, and this is a big but, until now the cost of creating a Supergraph was simply too high.
The problem: unifying all your APIs was impossible
A Supergraph is an abstraction layer on top of existing internal and external APIs. It’s essentially a federated GraphQL graph that unifies many backend services into a single schema and execution layer. While it's proven to be very beneficial for serving many different applications with many backend services, most of these services don't speak GraphQL, nor do they understand the Federation protocol.
If you have to implement and deploy a new Subgraph for every handful of fields, you end up with a lot of Subgraphs, even though all they do is proxy between the Supergraph and the existing underlying services.
There must be a better way! Could we build a framework that leverages LLMs to generate the code needed to proxy between the Supergraph and non-GraphQL, non-Federation services? This is exactly what we've been working on for the last few months.
Legacy REST makes GraphQL adoption expensive
One of our customers has a huge legacy of thousands of REST APIs, or let's just say JSON-over-HTTP APIs, because who really builds truly RESTful APIs? They want to adopt the Supergraph approach, but it's simply too expensive to rewrite all of their existing APIs to GraphQL. If they could reduce the number of API calls for a single page from 20 to one, it would be a huge win.
The solution: LLM-generated API adapters
As simple as it sounds, solving this problem was anything but trivial. You can't just tell an LLM to generate proxy code for thousands of REST API endpoints in one go. You have to split the problem into small chunks, give the LLM very specific constraints and instructions, and only then can you leverage its strengths without overloading it.
As a side effect, as I mentioned earlier, this led us to build a backend framework for LLMs.
The framework: Cosmo Plugins
Here's how it works:
- You define a Schema for each module or service you'd like to add to the Supergraph.
- Through a process of composition, we ensure that all modules/services fit together and that all fields in the Supergraph schema can be resolved.
- For each module, we generate a gRPC protobuf file that describes the features the module provides.
- With predefined prompts, we instruct an agentic coding tool such as Cursor to generate the mapping/adapter code between each gRPC call and the underlying service.
- The module gets bundled into a plugin file, which the Router manages as a sub-process.
How Cosmo Plugins connect LLMs and APIs
This architecture means LLMs interact with GraphQL APIs through the federated graph’s schema. The schema is a typed contract that tooling can expose to the model, so it can author operations against one interface instead of stitching together dozens of service-specific clients. Once an operation is in flight, the Cosmo Router plans and executes it across subgraphs and plugins, so the model reasons about capabilities, not deployment topology.
For plugin lifecycle management, we're using go-plugin from HashiCorp. It is a battle-tested library used by projects like Terraform, Consul, and Vault. Because each plugin runs as a separate subprocess, panics are isolated and don't affect the main process.
Another huge benefit of this approach is that it allows you to build truly modular monoliths. Since each plugin is isolated in its own subprocess, multiple plugins can form a modular monolith, but each plugin is completely independent. One of the big issues with monoliths is that it's hard to split them into smaller parts when the business requires it. This comes down to the fact that oftentimes, code between different parts of the monolith is tightly coupled.
With Cosmo Plugins, you can build a graph of APIs, but the deployment strategy is completely up to you. You can deploy everything as a single monolith, keeping all plugins in a single codebase, but you could also deploy each plugin as a separate service, distributing the code base across multiple repositories.
The beauty of this approach is that you can start with a monolith, and if you really, really need to, there's an easy path to move some functionality to a separate service.
You might be wondering how this approach could handle dependencies between plugins. What if one service needs some data from another one? Such a problem is already solved by the underlying Federation protocol. You can define dependencies between services in a declarative way, composition will ensure that the dependencies are valid and resolvable, and the Router will take care of resolving them at runtime.
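Here is a simplified model, in Go, of what "resolving dependencies at runtime" means. It is a sketch of the idea, not the Router's implementation: `productsPlugin` and `shippingPlugin` are hypothetical plugins, the pricing rule is made up, and `resolveProducts` plays the role of the Router. In a Federation schema, a dependency like this would typically be declared with directives such as `@key` and `@requires`, so neither plugin has to call the other directly.

```go
package main

import "fmt"

// Product is owned by the hypothetical "products" plugin.
type Product struct {
	ID     string
	Name   string
	Weight float64 // kilograms
}

var catalog = map[string]Product{
	"p1": {ID: "p1", Name: "Laptop", Weight: 2.0},
	"p2": {ID: "p2", Name: "Keyboard", Weight: 0.5},
}

// productsPlugin resolves the fields it owns and skips unknown IDs.
func productsPlugin(ids []string) []Product {
	products := make([]Product, 0, len(ids))
	for _, id := range ids {
		if p, ok := catalog[id]; ok {
			products = append(products, p)
		}
	}
	return products
}

// ShippingInput is the slice of a Product the shipping plugin depends on:
// the entity key plus the one field it requires.
type ShippingInput struct {
	ID     string
	Weight float64
}

// shippingPlugin owns the shipping cost but needs the weight from elsewhere.
func shippingPlugin(inputs []ShippingInput) map[string]float64 {
	costs := make(map[string]float64, len(inputs))
	for _, in := range inputs {
		costs[in.ID] = 5 + 2*in.Weight // made-up pricing rule
	}
	return costs
}

// ProductView is the merged shape the client asked for.
type ProductView struct {
	Name         string
	ShippingCost float64
}

// resolveProducts plays the Router: fetch from the owner first, forward the
// required fields to the dependent plugin, then merge both results by key.
func resolveProducts(ids []string) []ProductView {
	products := productsPlugin(ids)
	inputs := make([]ShippingInput, 0, len(products))
	for _, p := range products {
		inputs = append(inputs, ShippingInput{ID: p.ID, Weight: p.Weight})
	}
	costs := shippingPlugin(inputs)

	views := make([]ProductView, 0, len(products))
	for _, p := range products {
		views = append(views, ProductView{Name: p.Name, ShippingCost: costs[p.ID]})
	}
	return views
}

func main() {
	for _, v := range resolveProducts([]string{"p1", "p2"}) {
		fmt.Printf("%s ships for %.2f\n", v.Name, v.ShippingCost)
	}
}
```

Because the two plugins only meet in the orchestration step, either one can later move into its own service without the other changing.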
How LLMs interact with a federated graph
At runtime the loop is deliberately boring—in a good way.
- The agent (or IDE assistant) grounds itself in the published schema: what types exist, which entry points are safe to call, and how fields connect across services.
- It authors a GraphQL document for the user’s intent—often starting from natural language and ending in a concrete operation the router can validate.
- The Cosmo Router normalizes and plans the request, then fans execution out to the right subgraphs and Cosmo Plugins, collecting partial results until the response shape matches what the schema promised.
That mechanical split is why federation pairs well with LLMs: the model’s job stays bounded to “one operation at a time,” while the platform handles cross-service orchestration the same way it would for a mobile client.
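The "concrete operation the router can validate" step is what keeps the model honest, so it is worth a small sketch. The Go code below is a toy validator under heavy assumptions: the `Schema` and `Selection` types are hypothetical simplifications (no arguments, lists, or fragments), and the schema contents are invented. A real router parses and validates full GraphQL documents, but the principle is the same: a field the schema does not define is rejected before anything executes.

```go
package main

import "fmt"

// Schema maps a type name to its fields, and each field to its return type.
type Schema map[string]map[string]string

// Selection is one field in an operation plus its nested selections.
type Selection struct {
	Field    string
	Children []Selection
}

// validate walks the selections depth-first and returns an error for the
// first field that the schema does not define.
func validate(schema Schema, typeName string, selections []Selection) error {
	fields, ok := schema[typeName]
	if !ok {
		return fmt.Errorf("unknown type %q", typeName)
	}
	for _, sel := range selections {
		fieldType, ok := fields[sel.Field]
		if !ok {
			return fmt.Errorf("type %q has no field %q", typeName, sel.Field)
		}
		if len(sel.Children) > 0 {
			if err := validate(schema, fieldType, sel.Children); err != nil {
				return err
			}
		}
	}
	return nil
}

// exampleSchema is an invented schema used only for this sketch.
func exampleSchema() Schema {
	return Schema{
		"Query":    {"customer": "Customer"},
		"Customer": {"name": "String", "orders": "Order"},
		"Order":    {"total": "Float"},
	}
}

func main() {
	schema := exampleSchema()

	valid := []Selection{{Field: "customer", Children: []Selection{
		{Field: "name"},
		{Field: "orders", Children: []Selection{{Field: "total"}}},
	}}}
	hallucinated := []Selection{{Field: "customer", Children: []Selection{
		{Field: "email"}, // not defined in the schema
	}}}

	fmt.Println("valid operation:", validate(schema, "Query", valid))
	fmt.Println("hallucinated field:", validate(schema, "Query", hallucinated))
}
```

When the model invents a field, it gets a precise error naming the type and field, which is a much tighter feedback loop than a failed HTTP call against an undocumented endpoint.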
Example: using Cosmo Plugins with LLMs to build an API
Step 1: initialize a new router and plugin
First, you need to initialize a new project:
This will create a complete setup including Router and a plugin that implements a hello-world module.
Step 2: open the plugin in your AI IDE
Next, you have to open Cursor (or another AI coding IDE like Windsurf or Claude Code) in the plugin directory (./cosmo/plugins/hello-world). If you're in the right directory, it will automatically pick up the generated Cursor Rules we've prepared to make AI-coding as easy as possible.
Step 3: let the LLM implement the plugin
Finally, tell Cursor to implement your plugin. You can also modify the schema.graphql file if you'd like to add more fields or types.
For more in-depth instructions on how to use Cosmo Plugins, check out the Cosmo Plugins documentation.
If you'd like to add another module, just run the init command again, add a second module with a different name, and update the Router configuration to compose it with the first one. That's it, you've just built your first modular monolith with LLMs.
Conclusion: why LLM-native backends need federated graphs
We believe that frameworks focused on leveraging LLMs will continue to grow in usage. LLMs are very powerful for code generation, but they also have limitations. Not every problem is a good fit for LLMs, especially when the scope of each individual problem is too large.
With Cosmo Plugins, we're just scratching the surface of what's possible. So we're really excited to see what you'll build with it. Please share your thoughts and feedback with us on Discord.
If you'd like to learn more about Cosmo Plugins, check out the Cosmo Plugins documentation.
If you'd like to learn more about the Router, check out the Router documentation.