Show HN: DankGPT – Chat with Your Documents

dankgpt.com

17 points by rawsh 2 years ago · 9 comments · 1 min read

DankGPT uses hybrid semantic search (a combination of dense embeddings and sparse vectors) to retrieve high-quality answers across your documents.
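
For readers unfamiliar with hybrid search, here is a minimal sketch of one common way to combine a dense and a sparse score at query time. It is an illustration, not DankGPT's actual ranking code: the convex weighting with `alpha`, the function names, and the record layout (`id`, `dense`, `sparse`) are all assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {term: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha=0.5):
    """Assumed fusion rule: alpha weights the dense score, 1 - alpha the sparse one."""
    dense = cosine(q_dense, d_dense)
    sparse = sparse_dot(q_sparse, d_sparse)
    return alpha * dense + (1 - alpha) * sparse


def rank(query, docs, alpha=0.5):
    """Return document ids ordered from best to worst hybrid score."""
    q_dense, q_sparse = query
    scored = [
        (hybrid_score(q_dense, q_sparse, d["dense"], d["sparse"], alpha), d["id"])
        for d in docs
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy data: "a" matches the query semantically, "b" matches it by keyword.
docs = [
    {"id": "a", "dense": [1.0, 0.0], "sparse": {"llama": 1.0}},
    {"id": "b", "dense": [0.0, 1.0], "sparse": {"quantum": 1.0}},
]
query = ([1.0, 0.0], {"quantum": 1.0})
```

Setting `alpha=1.0` ranks purely by embedding similarity and `alpha=0.0` purely by keyword overlap; values in between trade one off against the other.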

Features

- Significantly faster than the competition (processes a 200-page PDF in <5s)

- Much better answer quality

- Fast summarization tool

- Beta API for end-to-end extractive document QA (hello@dankgpt.com)

Try it out (no login)

- Llama 2 paper https://www.dankgpt.com/chat/346f444d-e286-4671-b157-540f4cb...

- Scott Aaronson Quantum Information Science lectures https://www.dankgpt.com/chat/cc491d72-dc7b-4ace-8e26-60026ae...

- Berkshire 2022 Annual Report https://www.dankgpt.com/chat/068bf85f-b372-46a4-a164-6096f8c...

Why not host it yourself?

- You definitely can! DankGPT is intended as a quick way to ask questions about a research paper or to help students answer questions from their lecture slides, with an easy way to share your chatbot.

iamjackg 2 years ago

I'm so confused. It seems like a joke (DankGPT and mentions of GPT5) but then it actually works. Is it just a tiny wrapper on top of langchain meant to poke fun at all the thin API-wrapper startups?

  • rawsh (OP) 2 years ago

    Nope, it’s a serious project; I mostly made it for personal use during my last semester of college. I rewrote it a few times and packaged it up because I think it’s genuinely useful. Langchain gets you 80% of the way there but you run into issues with it very quickly.

KomoD 2 years ago

What is different about this compared to the other "chat with your documents" things? I've seen so many, even open source ones.

mutant 2 years ago

What happens to documents uploaded? Can you access them? Are they used in later training?

  • rawsh (OP) 2 years ago

    Documents actually never get uploaded! PDF text extraction happens on the client using a web worker and MuPDF compiled to WASM.

    1. PDF parsed and chunked on the client

    2. Sparse vectors are regenerated for the entire document corpus and the existing vectors are updated

    3. Dense vectors are generated for the new text and upserted along with the new sparse values

    The original documents stay on your device.
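
The three steps in the reply above can be sketched roughly as follows. The thread does not say which sparse weighting is used, so this sketch assumes TF-IDF: its weights depend on corpus-wide document frequencies, which is one plausible reason step 2 regenerates sparse vectors for the whole corpus while step 3 embeds only the new text. `chunk`, `tfidf_vectors`, `add_document`, the in-memory `index` list, and the `embed` callable are all hypothetical names for illustration.

```python
import math
from collections import Counter


def chunk(text, size=50):
    """Step 1: split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def tfidf_vectors(chunks):
    """Step 2: recompute TF-IDF sparse vectors for every chunk in the corpus."""
    tokenized = [c.lower().split() for c in chunks]
    # Document frequency: number of chunks containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(chunks)
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def add_document(index, text, embed, size=50):
    """Add a document to `index`, a list of {"text", "dense", "sparse"} records."""
    # Step 3 (dense part): embed only the new chunks.
    for piece in chunk(text, size):
        index.append({"text": piece, "dense": embed(piece), "sparse": {}})
    # Step 2: IDF changed for the whole corpus, so refresh every sparse vector.
    all_sparse = tfidf_vectors([rec["text"] for rec in index])
    for rec, sparse in zip(index, all_sparse):
        rec["sparse"] = sparse
    return index
```

In this sketch, adding a second document changes the sparse weights of the first one (a term that is now rarer relative to the corpus gets a higher weight), while its dense vector is left untouched.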

jaequery 2 years ago

Do you have any plans to open source this?

bbstats 2 years ago

GPT5?
