LDB: Large Language Model Debugger via Verifying Runtime Execution Step by Step

github.com

2 points by panqueca 2 years ago · 2 comments

panqueca (OP) · 2 years ago

HumanEval benchmark score: 95.1 with GPT-3.5

I wonder if it can be combined with projects like SWE-agent to build powerful yet open-source coding agents.

- https://paperswithcode.com/sota/code-generation-on-humaneval

- https://github.com/princeton-nlp/SWE-agent
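As a rough illustration of the "verifying runtime execution step by step" idea in the title: the sketch below is my own assumption about the general technique, not LDB's actual implementation. It uses Python's `sys.settrace` to capture per-line snapshots of a function's local variables; a debugger in this style would feed such runtime states to an LLM so it can check each intermediate step against the task description. The function names (`trace_states`, `sum_evens`) are hypothetical.

```python
import sys

def trace_states(func, *args):
    """Run func(*args) and collect (line_no, locals) snapshots per line.

    Illustrative only: an LDB-style debugger would hand these runtime
    states to an LLM to verify each execution step.
    """
    states = []

    def tracer(frame, event, arg):
        # Record only lines executed inside the target function.
        if event == "line" and frame.f_code is func.__code__:
            states.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, states

def sum_evens(nums):
    # Toy target function to trace.
    total = 0
    for n in nums:
        if n % 2 == 0:
            total += n
    return total

result, states = trace_states(sum_evens, [1, 2, 3, 4])
print(result)  # 6
for line_no, local_vars in states:
    print(line_no, local_vars)
```

Each printed `(line_no, locals)` pair is the kind of intermediate state an LLM verifier could inspect to localize where a buggy program first diverges from its intent.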
