Ask HN: Have you reduced costs by caching LLM responses?
Providing a chatbot to a large user base means spending a lot of money on similar or near-duplicate requests. I'm looking for best practices or lessons learned from implementing LLM-enabled apps at scale. Thanks in advance.