# toi
An extensible personal assistant HTTP server with a simple REPL chat client.
- For details on the server and client, see `toi_server/README.md` and `toi_client/README.md`, respectively.
- To extend the assistant, see `CONTRIBUTING.md`.
## Requirements
I developed and tested this project on WSL with a single NVIDIA RTX 2080. As such, this project and the default models provided in the Docker Compose files are intended to run on a commercially available GPU with at least 8GB of VRAM. That isn't to say this project won't work natively on Windows, on CPUs, or even on GPUs with less VRAM; I simply haven't tested those configurations.
## Quickstart

- Run the server using the provided Docker Compose file:
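  For example, from the repository root (the exact invocation is an assumption; adjust it to match the provided Compose file):

  ```sh
  docker compose up --build
  ```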
  You can configure runtime environment variables using a local `.env` file. As an example, you can change the build target and log level with an `.env` file with the following contents:

  ```
  RELEASE=true
  RUST_LOG=info,tower_http=trace
  ```
- Install the client binary:
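  For example, using Cargo from the repository root (the crate path `toi_client` follows the directory referenced above; the exact command may differ):

  ```sh
  cargo install --path toi_client
  ```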
- Start an interactive REPL session using the client binary:
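  A minimal invocation might look like this (the binary name `toi` is an assumption; use whatever name the `toi_client` crate installs):

  ```sh
  toi
  ```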
## Motivation
In addition to wanting to learn some of the dependencies I used in this project, I've been thinking for a while now about building a self-hosted personal assistant that I could use and easily extend myself. Recently, there's been a flurry of AI tool-usage articles, followed by the announcement of the Model Context Protocol (MCP), and now MCP servers are popping up everywhere. Eventually, I couldn't resist the intrusive thought of "well, you could just build type-safe tools using plain ol' HTTP endpoints, OpenAPI schemas, and JSON Schemas". And so that's what this is.
## Non-goals
This project is largely a learning exercise and a proof of concept. As such, the following (and probably other things) are out of scope:
- Support for multiple users or tenants
- Additional tool-calling endpoints similar to the `/assistant` endpoint
- UIs beyond the provided REPL
