Saving $750K on AI inference with one line of code and no quality loss

notdiamond.ai

8 points by t5-notdiamond a year ago · 3 comments

pinkbeanz a year ago

This is neat -- how would you think about evaluating the quality loss as you change to more efficient models? I saw you did an analysis on the number of messages, but I'm wondering whether there are more robust methods?

  • t5-notdiamondOP a year ago

    In offline training of our router, we run extensive cross-domain evaluations to determine when a smaller model can handle a request without any quality loss relative to more powerful models. In an online setting like our chat app, there's probably some more rigorous post-hoc analysis we could do on response quality—could make for a good follow-up post.
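The routing idea described above can be sketched roughly: given per-model costs and a quality predictor trained offline, pick the cheapest model whose predicted quality for a request clears a floor, falling back to the most capable model otherwise. The model names, costs, and scoring heuristic below are illustrative assumptions, not Not Diamond's actual router.

```python
# Hypothetical sketch of cost-aware model routing. All model names,
# prices, and the quality heuristic are made up for illustration.

MODELS = [
    {"name": "small", "cost_per_1k_tokens": 0.0005},
    {"name": "medium", "cost_per_1k_tokens": 0.003},
    {"name": "large", "cost_per_1k_tokens": 0.03},
]

def predict_quality(model_name: str, request: str) -> float:
    """Stand-in for an offline-trained quality predictor.

    Heuristic: harder-looking (longer) requests penalize less capable
    models more. A real predictor would be learned from evaluations.
    """
    difficulty = min(len(request) / 500, 1.0)
    capability = {"small": 0.6, "medium": 0.8, "large": 0.95}[model_name]
    return capability - 0.4 * difficulty * (1 - capability)

def route(request: str, quality_floor: float = 0.55) -> str:
    """Return the cheapest model whose predicted quality meets the floor."""
    for model in sorted(MODELS, key=lambda m: m["cost_per_1k_tokens"]):
        if predict_quality(model["name"], request) >= quality_floor:
            return model["name"]
    # No cheap model clears the floor: fall back to the most capable.
    return max(MODELS, key=lambda m: m["cost_per_1k_tokens"])["name"]
```

With this sketch, a short request routes to the cheapest model, while a long one escalates; raising the quality floor pushes traffic toward the largest model. The "one line of code" claim in the title corresponds to swapping a direct model call for a call through such a router.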
