The Car Wash Problem: A variable isolation study on prompt architecture

2 points by midmost44 3 months ago · 1 comment · 1 min read


Most AI products inject facts and hope reasoning follows. But intelligence is not measured by how much a model holds in its context window. It is measured by knowing to pick up the keys before leaving the house.

Last week, the "Car Wash problem" (the car wash is 50m away — walk or drive?) went viral here on HN. Every major LLM failed because it missed the implicit physical constraint: the car has to end up at the car wash. While testing InterviewMate's prompt architecture, I posed the same question, and it answered "drive" immediately. But I didn't actually know why it worked, so I ran a variable isolation study to find out: 100 API calls against Claude Sonnet 4.5, across 5 conditions:

- Baseline (no prompt): 0%
- Role only: 0%
- Context injection (user profile, car location): 30%
- Structured reasoning (STAR framework): 85%
- Full stack (both combined): 100%
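For anyone who wants the shape of the harness without digging through the repo: the five conditions boil down to prompt templates plus a grader. The templates and the keyword grader below are illustrative sketches, not the actual prompts from the repo (in particular, the "structured reasoning" wording is a stand-in for the real STAR-framework prompt).

```python
# Hypothetical sketch of the 5-condition eval setup. Templates are
# illustrative stand-ins; the real prompts live in the linked repo.

QUESTION = "The car wash is 50m away. Should I walk or drive?"

CONDITIONS = {
    "baseline": "{q}",
    "role_only": "You are a practical assistant.\n{q}",
    "context_injection": (
        "User profile: owns a car, parked in their driveway.\n{q}"
    ),
    "structured_reasoning": (
        "Before answering, state the task goal, list the physical "
        "constraints it implies, then decide.\n{q}"
    ),
    "full_stack": (
        "You are a practical assistant.\n"
        "User profile: owns a car, parked in their driveway.\n"
        "Before answering, state the task goal, list the physical "
        "constraints it implies, then decide.\n{q}"
    ),
}

def build_prompt(condition: str) -> str:
    """Assemble the prompt for one experimental condition."""
    return CONDITIONS[condition].format(q=QUESTION)

def grade(response: str) -> bool:
    """Naive keyword grader: pass iff the answer recommends driving
    (the implicit constraint is that the car must be at the wash)."""
    return "drive" in response.lower()
```

Each condition would then be run ~20 times (100 calls / 5 conditions) against the model, with the pass rate being the fraction of graded responses that say "drive".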

Throwing facts at the model doesn't work unless the architecture forces it to explicitly evaluate the task goal first. Without that structure, the model jumps straight to the distance heuristic: "50m is short, walk." I'm writing a paper on this and wanted to share the raw data with HN first. Code and raw eval data: https://github.com/JO-HEEJIN/interview_mate/tree/main/car_wash

gus_massa 3 months ago

Clicky https://github.com/JO-HEEJIN/interview_mate/tree/main/car_wa...
