agents is a Rust toolkit for building typed agent systems and evaluating them.
It includes:
- agents for talking to LLMs, managing sessions, tools, context, and storage
- evals for defining suites with evaluation trajectories and grading with predicates and judges
- cargo-evals for listing and running eval suites
- codemode for embeddable JavaScript execution and code search
This README does two things:
- show the shortest path to your first agent
- show the shortest path to your first eval
Getting Started
Building Agents
To start building agents, add agents to your deps:
Start with a simple string-in, string-out agent (the default):
    use std::sync::Arc;

    use agents::{Agent, LlmRunner, SessionAgent};

    #[derive(Agent)]
    struct BasicAgent(SessionAgent<String, (), (), String>);

    impl BasicAgent {
        pub fn new(llm: Arc<LlmRunner>) -> anyhow::Result<Self> {
            let agent = SessionAgent::builder()
                .with_llm_runner(llm)
                .build()?;
            Ok(Self(agent))
        }
    }
And run one turn:
    let mut agent = BasicAgent::new(llm)?;
    let reply = agent.call("hello world".to_string()).await?;
When you want stricter contracts, switch to typed input and output:
    use std::sync::Arc;

    use agents::{Agent, InputItem, LlmRunner, SessionAgent};
    use schemars::JsonSchema;
    use serde::{Deserialize, Serialize};

    #[derive(Clone, Debug, Serialize, Deserialize)]
    struct EchoRequest {
        text: String,
    }

    impl From<EchoRequest> for InputItem {
        fn from(value: EchoRequest) -> Self {
            InputItem::user_text(value.text)
        }
    }

    #[derive(Clone, Debug, Serialize, Deserialize, JsonSchema)]
    struct EchoResponse {
        text: String,
    }

    #[derive(Agent)]
    struct TypedAgent(SessionAgent<EchoRequest, (), (), EchoResponse>);

    impl TypedAgent {
        fn new(llm: Arc<LlmRunner>) -> anyhow::Result<Self> {
            let agent = SessionAgent::builder()
                .with_llm_runner(llm)
                .with_message_type::<EchoRequest>()
                .with_response_type::<EchoResponse>()
                .build()?;
            Ok(Self(agent))
        }
    }
Have your agents take an LlmRunner as a constructor parameter. The evals
framework supplies one when it builds the agent, which lets it run the same
agent against many providers and models.
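The benefit of passing the runner in can be sketched without the library at all. The example below is a minimal, self-contained illustration of constructor injection; `Runner`, `EchoRunner`, and `GreeterAgent` are hypothetical stand-ins invented for this sketch and are not part of the agents crate.

```rust
use std::sync::Arc;

// Hypothetical stand-in for `LlmRunner`; the real type lives in the agents crate.
trait Runner: Send + Sync {
    fn complete(&self, prompt: &str) -> String;
}

// A deterministic runner for tests: it returns the prompt unchanged.
struct EchoRunner;

impl Runner for EchoRunner {
    fn complete(&self, prompt: &str) -> String {
        prompt.to_string()
    }
}

// The agent receives its runner at construction time instead of creating one,
// so a test harness can decide which backend it talks to.
struct GreeterAgent {
    llm: Arc<dyn Runner>,
}

impl GreeterAgent {
    fn new(llm: Arc<dyn Runner>) -> Self {
        Self { llm }
    }

    fn call(&self, input: &str) -> String {
        self.llm.complete(input)
    }
}
```

Because `GreeterAgent::new` accepts any `Arc<dyn Runner>`, the same agent code can be exercised against a fake runner in unit tests and against a real backend elsewhere, which is the property the evals framework relies on.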
Evaluating Your Agents
Once you have an agent, you'll want to evaluate how it performs. To do this we'll use the evals crate,
which lets us author evals in plain Rust code like:
    use agents::SessionAgent;
    use anyhow::Result;
    use evals::{EvalContext, Trajectory, assistant, eval, predicate, suite, trajectory, user};

    type StringyAgent = SessionAgent<String, (), (), String>;

    // We set up suites, with a factory for creating the Agent Under Test.
    #[suite(
        kind = "regression",
        agent = new_agent
    )]
    async fn new_agent(ctx: EvalContext<()>) -> Result<StringyAgent> {
        Ok(SessionAgent::builder()
            .with_llm_runner(ctx.llm_runner())
            .build()?)
    }

    // We define trajectories for evaluating interactions with the agent
    #[eval(
        agent = StringyAgent,
        desc = "echoes input",
        tags = ["smoke"],
        timeout = "30s"
    )]
    async fn smoke(_ctx: EvalContext<()>) -> Result<Trajectory<StringyAgent, ()>> {
        Ok(trajectory![
            user!("hello world"),
            assistant!(predicate("echoes-input", |trial, _ctx| async move {
                let reply = trial.final_reply.expect("reply");
                Ok(evals::GradeResult {
                    score: if reply == "hello world" { 1.0 } else { 0.0 },
                    summary: "agent should echo the input".to_string(),
                    evidence: serde_json::json!(reply),
                })
            })),
        ])
    }
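The scoring rule inside the predicate is an exact-match check that maps to 1.0 or 0.0. As a rough sketch, the same logic can be pulled into a plain function and unit tested without an LLM. The `Grade` struct and `grade_echo` function here are hypothetical and only mirror the three fields used in the eval above; they are not types from the evals crate.

```rust
// Hypothetical mirror of the fields set on `GradeResult` in the eval above.
// `evidence` is a plain String here to keep the sketch free of dependencies.
#[derive(Debug, PartialEq)]
struct Grade {
    score: f64,
    summary: String,
    evidence: String,
}

// Scores 1.0 when the reply matches the expected text exactly, 0.0 otherwise,
// and records the reply as evidence either way.
fn grade_echo(expected: &str, reply: &str) -> Grade {
    Grade {
        score: if reply == expected { 1.0 } else { 0.0 },
        summary: "agent should echo the input".to_string(),
        evidence: reply.to_string(),
    }
}
```

Keeping the comparison in a standalone function like this makes it cheap to check edge cases (casing, trailing whitespace) before wiring the rule into a predicate.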
To get started we'll need to install evals and do some setup:
    # Install the evals library
    cargo add evals
    cargo add evals anyhow --build

    # Install the `cargo evals` command
    cargo install cargo-evals

    # Initialize evals in your project
    cargo evals init
Next we add a build step so that the evals are built by cargo build:
    // file: build.rs
    fn main() -> anyhow::Result<()> {
        evals::build()?;
        Ok(())
    }
Then we make the evals discoverable by adding the setup macro to src/lib.rs:
    // file: src/lib.rs
    evals::setup!();
Finally, configure at least one target in your new evals.toml file:
    [[evals.targets]]
    provider = "ollama"
    model = "llama3.2:3b"
Then run:
    cargo evals list
    cargo evals run
At that point you have:
- a working agent
- a discovered eval suite
- artifacts under .evals/