Show HN: Monitoring your OpenAI usage without a third party (github.com)
LLM observability is an absolute must-have for anyone running something in prod (or prod-like). While all the observability startups are great, you're essentially sending your entire OpenAI usage history - prompts, generations, chats - to a random third party.
So this script deploys a basic proxy in your Azure account, catches all incoming OpenAI requests, stores the logs in your own resource group, and comes with premade visualizations (charts, timelines, chat history, cost estimation, etc.).

Thanks for any thoughts and feedback!
This looks great, although the Azure requirement makes it a no-go for me. I just keep wishing for a simple solution that is easy to self-host locally. After a lot of work I managed to get an instance of Helicone running, but it's made of 6 or 7 different services, each with a pretty large footprint. I feel like there's definitely room for a single-binary tool written in Go with a small SQLite database and in-memory caching.
Why is Azure a no-go?

- it's the only place besides OpenAI itself to get the OpenAI models as a service
- you can have dedicated capacity for it if you're big enough and it's worth it
- you can literally turn off its public endpoint, making it inaccessible except through a proxy like this
- you can load balance across multiple OpenAI deployments for regional efficiency and to work around rate limits (see the sketch at the end of the thread)

(Not trying to push Azure in general, just trying to understand why, even in the OpenAI use case, Azure is a no-go.)

I'm sorry, I should have been clearer -- I meant that I wish this were a locally self-hostable observability solution for OpenAI-compatible APIs in general, not something that can only be deployed to Azure and is tied specifically to Azure's OpenAI service. For example, Helicone[0] and llm.report[1] work as generic proxies for the OpenAI API, and you can deploy them anywhere.

Thanks, that's great work. I will use it.
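On the load-balancing bullet above: a minimal sketch of round-robin across multiple deployments, with hypothetical Azure OpenAI endpoint URLs substituted for real resources:

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical deployment URLs; substitute your own Azure resources.
	backends := []*url.URL{
		mustParse("https://myres-eastus.openai.azure.com"),
		mustParse("https://myres-westeu.openai.azure.com"),
	}

	var counter uint64
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Pick the next backend round-robin; a production version would
		// also track per-backend 429s and retry on a different region.
		target := backends[atomic.AddUint64(&counter, 1)%uint64(len(backends))]
		proxy := httputil.NewSingleHostReverseProxy(target)
		r.Host = target.Host
		proxy.ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}

func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}
```

Round-robin is the simplest policy; since Azure rate limits are per deployment, even this naive rotation effectively multiplies your usable quota.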