Show HN: Visualizing How Books Reference Each Other Across 3k Years

thiagolira.github.io

5 points by farcaster 4 months ago · 3 comments · 1 min read

There are two parts to this project:

1) An LLM-powered pipeline that extracts citations (books + authors) from books and resolves them against offline copies of Wikipedia and Goodreads that I have. The result is data linking Books/Authors to other Books/Authors, with accurate bibliographical information spanning centuries.

2) A WebGPU + D3.js powered visualization tool, written by Claude Code, so that I can handle all this data in the browser with a more or less comfortable experience for the viewer.

I spent some months on and off on this project, and the most challenging part was definitely getting accurate bibliographical information across centuries, including original publication dates. For that I wrote what is now a very complex pipeline of LLMs (I used DeepSeek V3.2) wired to offline Goodreads and Wikipedia databases, plus a fallback that actually uses the internet.
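The offline-first lookup with an online fallback can be pictured roughly as below. This is only a minimal sketch, not the code from the repository: every name (`Resolved`, `table_resolver`, `resolve_citation`, `CHAIN`), the order in which the sources are tried, the signed-year convention for BC dates, and the placeholder entries are assumptions made for illustration. Plain dictionaries stand in for the real Goodreads and Wikipedia databases and for the internet fallback.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Resolved:
    title: str
    author: str
    year: int    # signed year; negative means BC (an assumption of this sketch)
    source: str  # name of the resolver that produced the match


Resolver = Callable[[str, str], Optional[Resolved]]
Table = Dict[Tuple[str, str], int]


def table_resolver(source: str, table: Table) -> Resolver:
    """Build a resolver backed by an in-memory table (a stand-in for a real database)."""
    def resolve(title: str, author: str) -> Optional[Resolved]:
        year = table.get((title.casefold(), author.casefold()))
        if year is None:
            return None
        return Resolved(title, author, year, source)
    return resolve


def resolve_citation(title: str, author: str,
                     resolvers: Sequence[Resolver]) -> Optional[Resolved]:
    """Try each resolver in order and return the first match, or None if all miss."""
    for resolver in resolvers:
        match = resolver(title, author)
        if match is not None:
            return match
    return None


# Made-up placeholder entries, not real bibliographic data.
goodreads = table_resolver("goodreads-offline", {("book a", "author a"): 1850})
wikipedia = table_resolver("wikipedia-offline", {("book b", "author b"): -350})
online = table_resolver("online-fallback", {("book c", "author c"): 1999})

# Offline sources first, the network-backed fallback last.
CHAIN = [goodreads, wikipedia, online]
```

With this layout, `resolve_citation("Book C", "Author C", CHAIN)` only reaches the fallback after both offline tables miss, and a citation that nothing recognizes comes back as `None` rather than as a guessed record.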

Hope you enjoy it! Open to suggestions on how to improve the system :)

Code is here: https://github.com/ThiagoLira/bookgraph-revisited

apresmoi 4 months ago

I love this. How many tokens/$ did you spend to extract and link all of them?

  • farcaster (OP) 4 months ago

    I used DeepSeek V3.2 for everything at $0.25/M input tokens and $0.38/M output tokens.

    I lost count of how many runs it took until I was satisfied with the results. I'd say I spent around $10 on the final books, so some millions of tokens!
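As a rough check on those numbers, the quoted prices put bounds on how many tokens about $10 buys. The sketch below assumes the prices are exactly as stated, with no caching discounts or other adjustments; the input/output split is unknown, so it is left as a parameter, and the function names are made up for illustration.

```python
INPUT_USD_PER_M = 0.25   # price per million input tokens, as quoted above
OUTPUT_USD_PER_M = 0.38  # price per million output tokens, as quoted above


def cost_usd(input_tokens: float, output_tokens: float) -> float:
    """Cost of a run given its input and output token counts."""
    return (input_tokens / 1e6 * INPUT_USD_PER_M
            + output_tokens / 1e6 * OUTPUT_USD_PER_M)


def tokens_for_budget(budget_usd: float, output_share: float) -> float:
    """Total tokens a budget buys when `output_share` (0..1) of them are output tokens."""
    blended_usd_per_m = ((1 - output_share) * INPUT_USD_PER_M
                         + output_share * OUTPUT_USD_PER_M)
    return budget_usd / blended_usd_per_m * 1e6


# Bounds for a $10 budget: all-input is the cheapest mix, all-output the priciest.
upper = tokens_for_budget(10, output_share=0.0)  # 40 million tokens
lower = tokens_for_budget(10, output_share=1.0)  # about 26.3 million tokens
```

So under these assumptions $10 corresponds to somewhere between roughly 26 and 40 million tokens, depending on how much of the traffic was output.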

gorkermann 3 months ago

this is so cool!

FYI "Tractatus de sphaera, John of Holywood" appears to have been wrongly put in BC.