Function Calling Harness: Success Rate From 6.75% to 100%, by compiler skills

3 points by samchon a month ago · 1 comment


samchonOP a month ago

I was personally invited by the Qwen team to speak at Qwen Meetup Korea, and gave the talk here in Korea yesterday. Pretty honored that they reached out directly.

The talk was about how I got function calling to work reliably on deeply recursive union types — the stuff the industry generally says doesn't work. With `qwen3-coder-next`, first-try success rate was 6.75%. And the entire Qwen 3.5 model family was hitting 0% on union types due to a consistent double-stringify bug. Both ended up at 100%.
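To make the double-stringify failure concrete, here is a minimal sketch. It assumes the bug means a union-typed argument arrives as a JSON string instead of an object. `unwrapStringified` and the sample payload are hypothetical illustrations, not Typia's API or actual Qwen output.

```typescript
// Hypothetical helper (not Typia's API): recover values that were stringified twice.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function unwrapStringified(value: Json, maxDepth: number = 3): Json {
  let current: Json = value;
  for (let i = 0; i < maxDepth && typeof current === "string"; i++) {
    const text: string = current.trim();
    const first = text.charAt(0);
    // Only try to parse strings that look like a JSON object or array.
    if (first !== "{" && first !== "[") break;
    try {
      current = JSON.parse(text) as Json;
    } catch (_err) {
      break; // Not valid JSON after all: keep the string unchanged.
    }
  }
  if (Array.isArray(current)) {
    return current.map((item) => unwrapStringified(item, maxDepth));
  }
  if (current !== null && typeof current === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(current)) {
      out[key] = unwrapStringified(current[key], maxDepth);
    }
    return out;
  }
  return current;
}

// Simulated model output: the union-typed `shape` field is a JSON string, not an object.
const raw = '{"name":"box","shape":"{\\"type\\":\\"circle\\",\\"radius\\":2}"}';
const repaired = unwrapStringified(JSON.parse(raw) as Json);
console.log(JSON.stringify(repaired));
```

A real harness would be driven by the schema, so that a field declared as `string` is never unwrapped by accident; this sketch guesses from the first character only.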

Slides (PPT) are also available in the link — speaker notes are written inside as slide notes if you'd like the full narrative behind each slide.

## TL;DR

1. *AutoBe*: AI backend auto-generation agent. Not text code, but AST data via function calling. 4 AST types + 4-tier compiler validation + self-healing loops.
2. *Typia*: The infrastructure that turns 0% into 100%. A single type automates schema, parser, validator, and feedback generator. Lenient JSON parsing + type coercion + precise validation feedback.
3. *In Praise of Function Calling*: Types eliminate ambiguity. Schemas constrain through absence, not prohibition. Model-neutral, mechanically verifiable, deterministically convergent. Applicable to all engineering domains with validators.
4. *Qwen*: Small models are the best QA engineers. They expose system vulnerabilities large models silently paper over.
5. *6.75% is not failure; it's the first input to the loop.* If you can verify, you converge.
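The "verify, then converge" loop from the list above can be sketched roughly as follows. This is an illustration under assumptions, not AutoBe's or Typia's implementation: the `Expr` type, the hand-written `validateExpr`, the `Model` signature, and the scripted model are all hypothetical stand-ins.

```typescript
// A small recursive union type, the kind of schema the post is about.
type Expr =
  | { type: "num"; value: number }
  | { type: "add"; left: Expr; right: Expr };

interface ValidationError {
  path: string;
  expected: string;
  actual: unknown;
}

// Hand-written validator standing in for a generated one.
// It collects every error together with the path where it occurred.
function validateExpr(input: unknown, path: string = "$"): ValidationError[] {
  if (typeof input !== "object" || input === null) {
    return [{ path, expected: "Expr", actual: input }];
  }
  const node = input as Record<string, unknown>;
  if (node.type === "num") {
    return typeof node.value === "number"
      ? []
      : [{ path: path + ".value", expected: "number", actual: node.value }];
  }
  if (node.type === "add") {
    return validateExpr(node.left, path + ".left").concat(
      validateExpr(node.right, path + ".right"),
    );
  }
  return [{ path: path + ".type", expected: '"num" | "add"', actual: node.type }];
}

// Hypothetical model interface: takes the previous errors as feedback, returns new arguments.
type Model = (feedback: ValidationError[]) => unknown;

function callWithFeedback(
  model: Model,
  maxTurns: number,
): { value: Expr; turns: number } | null {
  let feedback: ValidationError[] = [];
  for (let turn = 1; turn <= maxTurns; turn++) {
    const candidate = model(feedback);
    feedback = validateExpr(candidate);
    if (feedback.length === 0) return { value: candidate as Expr, turns: turn };
  }
  return null; // Did not converge within the turn budget.
}

// Scripted stand-in for an LLM: the first attempt has two errors, the second is valid.
const attempts: unknown[] = [
  { type: "add", left: { type: "num", value: "1" }, right: { type: "sum" } },
  { type: "add", left: { type: "num", value: 1 }, right: { type: "num", value: 2 } },
];
let attemptIndex = 0;
const scriptedModel: Model = () =>
  attempts[Math.min(attemptIndex++, attempts.length - 1)];

const result = callWithFeedback(scriptedModel, 5);
console.log(result ? "converged in " + result.turns + " turns" : "no convergence");
```

The point of reporting a path with each error is that the model gets told exactly which nested field to fix, rather than a generic "invalid arguments" message.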
