High performance client for Baseten.co

github.com

7 points by mich5632 6 months ago · 1 comment

mich5632OP 6 months ago

We wrote a Rust PyO3 client for OpenAI-embeddings-compatible servers (openai.com, Infinity, TEI, vLLM, SGLang). Most server-side ML infrastructure auto-scales with the workload, so for embedding workloads the server is no longer the bottleneck; the bottleneck has shifted to the client. In Python, the client is blocked by the global interpreter lock (GIL). With the performance package, we release the GIL during requests, so you have resources available to query your VectorDB again.
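To illustrate the mechanism (not the actual client code): in PyO3, a blocking operation can be wrapped in `Python::allow_threads`, which drops the GIL for the duration of the closure so other Python threads (e.g. a VectorDB query) keep running while the HTTP request is in flight. The function name, module name, and request shape below are illustrative assumptions, not the package's real API.

```rust
// Minimal sketch: release the GIL while a blocking embeddings request runs.
// `embed`, `performance_client`, and the request body are hypothetical.
use pyo3::prelude::*;

#[pyfunction]
fn embed(py: Python<'_>, base_url: String, api_key: String, text: String) -> PyResult<String> {
    // allow_threads releases the GIL for the duration of the closure,
    // so other Python threads are not blocked by this network call.
    py.allow_threads(move || {
        let client = reqwest::blocking::Client::new();
        let body = serde_json::json!({
            "model": "text-embedding-3-small",
            "input": text,
        });
        client
            .post(format!("{base_url}/v1/embeddings"))
            .bearer_auth(api_key)
            .json(&body)
            .send()
            .and_then(|resp| resp.text())
            .map_err(|e| pyo3::exceptions::PyRuntimeError::new_err(e.to_string()))
    })
}

#[pymodule]
fn performance_client(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(embed, m)?)?;
    Ok(())
}
```

From Python, multiple threads can then call such a function concurrently and overlap embedding requests with other work, since the GIL is held only briefly on entry and exit rather than for the whole request.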
