Show HN: RAG App Example with Self-hosted Embedding and LLM Services

github.com

5 points by mkandler a year ago · 1 comment

mkandler (OP) a year ago

This is an example of how to build a RAG app on FastAPI, with vector embedding and LLM inference broken out as separate services. Using Runhouse, those services can be hosted on your own infrastructure (for example, an A10 GPU in your own AWS account).

I hope this is helpful for anyone considering ways to scale out the components of a more complex RAG application.
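
To make the separation concrete, here is a rough, stdlib-only sketch of the shape described above. It is not the code from the linked repo and it does not use the Runhouse or FastAPI APIs. Every name in it (`EmbeddingService`, `LLMService`, `ToyEmbedder`, `EchoLLM`, `RagApp`) is hypothetical, and the two toy classes are in-process stand-ins for what would be remote services on GPU infrastructure. The point is only that the app talks to embedding and inference through narrow interfaces, so either one can be moved or scaled independently.

```python
import math
from typing import Protocol


class EmbeddingService(Protocol):
    """Interface the app expects from an embedding service (hypothetical)."""
    def embed(self, texts: list[str]) -> list[list[float]]: ...


class LLMService(Protocol):
    """Interface the app expects from an LLM inference service (hypothetical)."""
    def generate(self, prompt: str) -> str: ...


class ToyEmbedder:
    """Stand-in for a remote embedding service: 26-dim letter counts."""
    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * 26
            for ch in text.lower():
                if "a" <= ch <= "z":
                    vec[ord(ch) - ord("a")] += 1.0
            vectors.append(vec)
        return vectors


class EchoLLM:
    """Stand-in for a remote LLM service: echoes the prompt it was given."""
    def generate(self, prompt: str) -> str:
        return "ANSWER BASED ON: " + prompt


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class RagApp:
    """App logic that depends only on the two service interfaces."""
    def __init__(self, embedder: EmbeddingService, llm: LLMService,
                 documents: list[str]) -> None:
        self.embedder = embedder
        self.llm = llm
        self.documents = documents
        # Embed the corpus once up front via the embedding service.
        self.doc_vectors = embedder.embed(documents)

    def retrieve(self, question: str, k: int = 1) -> list[str]:
        """Return the k documents most similar to the question."""
        query_vec = self.embedder.embed([question])[0]
        ranked = sorted(
            zip(self.documents, self.doc_vectors),
            key=lambda pair: cosine(query_vec, pair[1]),
            reverse=True,
        )
        return [doc for doc, _ in ranked[:k]]

    def answer(self, question: str, k: int = 1) -> str:
        """Build a prompt from retrieved context and call the LLM service."""
        context = "\n".join(self.retrieve(question, k))
        prompt = f"Context:\n{context}\n\nQuestion: {question}"
        return self.llm.generate(prompt)


app = RagApp(ToyEmbedder(), EchoLLM(), ["cats purr", "gpus are fast"])
print(app.answer("gpus are fast"))
```

In a real deployment, a web route handler would call something like `RagApp.answer`, and the two stand-in classes would be swapped for clients that call the separately hosted services.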
