Show HN: Open-source PaperBanana – academic diagrams from text via agents

github.com

1 points by dippatel1994 a month ago · 0 comments · 1 min read

The PaperBanana paper (arXiv:2601.23265) from Google Cloud AI Research and PKU describes a multi-agent framework for generating publication-ready academic illustrations from text. The official code hasn't been released yet, so I implemented it from the paper.

The pipeline chains 5 agents: a Retriever that selects reference diagrams, a Planner that generates a textual description, a Stylist that refines for visual aesthetics, a Visualizer that renders the image (Gemini for diagrams, Matplotlib for plots), and a Critic that evaluates and triggers iterative refinement.
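To make the control flow concrete, here is a minimal sketch of how five stages like these could be chained with a critic-driven refinement loop. It is not the repo's actual code: every function name, the `Draft` type, the word-overlap retrieval, and the choice to re-render after each critique are assumptions made for illustration, and the model calls are replaced with string stubs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Draft:
    """Hypothetical container for the pipeline's final state."""
    description: str
    image: str = ""
    feedback: List[str] = field(default_factory=list)


def retrieve(method_text: str, corpus: List[str], k: int = 2) -> List[str]:
    # Stub Retriever: rank references by the number of shared words.
    words = set(method_text.lower().split())
    ranked = sorted(
        corpus,
        key=lambda ref: len(words & set(ref.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def plan(method_text: str, references: List[str]) -> str:
    # Stub Planner: a real one would prompt an LLM with the references.
    return f"Diagram of: {method_text} (modeled on {len(references)} references)"


def stylize(description: str) -> str:
    # Stub Stylist: a real one would rewrite for visual aesthetics.
    return description + " [styled]"


def visualize(description: str) -> str:
    # Stub Visualizer: stands in for an image model or Matplotlib call.
    return f"<image:{description}>"


def critique(description: str) -> Optional[str]:
    # Stub Critic: requests one revision, then accepts (returns None).
    return None if "revised" in description else "add labels"


def run_pipeline(method_text: str, corpus: List[str], max_rounds: int = 3) -> Draft:
    references = retrieve(method_text, corpus)
    description = stylize(plan(method_text, references))
    image = ""
    feedback_log: List[str] = []
    for _ in range(max_rounds):
        image = visualize(description)
        feedback = critique(description)
        if feedback is None:
            break  # Critic accepted the current render.
        feedback_log.append(feedback)
        description = f"{description} (revised: {feedback})"
    return Draft(description, image, feedback_log)


if __name__ == "__main__":
    corpus = ["encoder decoder diagram", "pie chart", "attention heatmap"]
    print(run_pipeline("encoder decoder attention", corpus))
```

Capping the loop with `max_rounds` is one simple way to keep a disagreeable critic from burning API quota; whether the real system bounds refinement this way is not something the post states.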

It uses Google Gemini as the default backend (the free tier works) and ships with an MCP server, so you can use it directly from Claude Code or Cursor. Quick start: run pip install -e ".[dev,google]", then paperbanana generate --input method.txt --output diagram.png

Repo: https://github.com/llmsresearch/paperbanana

This is an unofficial reimplementation and will differ from the original system. I plan to link to the official release once it drops. Happy to answer questions about the architecture or prompt engineering decisions.
