Ask HN: How do you integration-test AI / LLMs?

1 points by tom1337 a month ago · 0 comments · 1 min read


We’re currently debating how to properly test integrations with external LLMs in our software.

At the moment, all LLM calls are mocked in unit and end-to-end tests. Recently, we updated a model version and it started returning responses that no longer matched our expected schema (we validate outputs strictly), which resulted in errors. Because all tests used mocked responses, this only surfaced in production.

In hindsight, a simple integration test that sends a real (non-mocked) request to the LLM provider would probably have caught this.

One idea is to have a test suite which sends each prompt to the LLM provider and checks whether the response matches the expected schema. This has its own issues since LLMs are inherently nondeterministic and these tests might be flaky, but I’m currently lacking better ideas.
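
A minimal sketch of that idea, to make it concrete. Everything named here is an assumption for illustration: `call_llm` stands in for whatever function actually sends the prompt to the provider, `EXPECTED_SCHEMA` is a made-up schema, and the pass-rate threshold is one possible way to tolerate nondeterminism rather than a recommendation.

```python
import json
from typing import Callable

# Hypothetical schema: required keys mapped to the Python type we expect.
EXPECTED_SCHEMA = {"summary": str, "tags": list}


def matches_schema(raw: str, schema: dict) -> bool:
    """Return True if raw is a JSON object whose required keys have the expected types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # A missing key yields None, which fails the isinstance check.
    return all(isinstance(data.get(key), typ) for key, typ in schema.items())


def schema_pass_rate(
    call_llm: Callable[[str], str], prompt: str, schema: dict, attempts: int = 5
) -> float:
    """Send the same prompt several times; return the fraction of schema-valid responses."""
    passed = sum(matches_schema(call_llm(prompt), schema) for _ in range(attempts))
    return passed / attempts


def check_prompt(
    call_llm: Callable[[str], str],
    prompt: str,
    schema: dict,
    attempts: int = 5,
    min_pass_rate: float = 0.8,
) -> bool:
    """Pass if enough attempts are schema-valid, so one odd response does not fail the run."""
    return schema_pass_rate(call_llm, prompt, schema, attempts) >= min_pass_rate
```

In a real suite, `call_llm` would wrap the non-mocked provider client and the check would run once per production prompt, probably on a schedule or before a model-version bump rather than on every commit, since each attempt is a paid request.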

Curious to hear how others approach this.

