Show HN: DankGPT – Chat with Your Documents
dankgpt.com
Uses hybrid semantic search (a combination of dense embeddings and sparse vectors) to retrieve high-quality answers across your documents.
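For readers unfamiliar with hybrid search, here is a minimal sketch of how a dense and a sparse score can be combined into one ranking. The post does not say how DankGPT fuses the two signals, so the weighted sum, the `alpha` parameter, and every function name below are assumptions for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sparse_dot(q, d):
    """Dot product of two sparse vectors stored as {term: weight} dicts."""
    if len(q) > len(d):
        q, d = d, q  # iterate over the smaller dict
    return sum(w * d.get(t, 0.0) for t, w in q.items())


def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha=0.5):
    """Weighted sum of dense and sparse similarity.

    alpha is a hypothetical tuning knob: 1.0 means dense only, 0.0 sparse only.
    """
    dense_part = cosine(q_dense, d_dense)
    sparse_part = sparse_dot(q_sparse, d_sparse)
    return alpha * dense_part + (1.0 - alpha) * sparse_part


def rank(q_dense, q_sparse, docs, alpha=0.5):
    """docs is a list of (doc_id, dense_vector, sparse_vector); best match first."""
    scored = [
        (hybrid_score(q_dense, dv, q_sparse, sv, alpha), doc_id)
        for doc_id, dv, sv in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]
```

The dense term rewards chunks that are semantically close to the query, while the sparse term rewards exact term overlap, which helps with rare names and identifiers. Real systems often normalize the two scores or use rank-based fusion instead of a plain weighted sum.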
Features
- Significantly faster than the competition (processes a 200-page PDF in <5s)
- Much better answer quality
- Fast summarization tool
- Beta API for end-to-end extractive document QA (hello@dankgpt.com)
Try it out (no login)
- Llama 2 paper https://www.dankgpt.com/chat/346f444d-e286-4671-b157-540f4cb...
- Scott Aaronson Quantum Information Science lectures https://www.dankgpt.com/chat/cc491d72-dc7b-4ace-8e26-60026ae...
- Berkshire 2022 Annual Report https://www.dankgpt.com/chat/068bf85f-b372-46a4-a164-6096f8c...
Why not host it yourself?
- You definitely can! DankGPT is intended as a quick way to ask questions about a research paper, or help students with answering questions from their lecture slides, with an easy way to share your chatbot.

I'm so confused. It seems like a joke (DankGPT and mentions of GPT5) but then it actually works. Is it just a tiny wrapper on top of langchain meant to poke fun at all the thin API-wrapper startups?

Nope, it's a serious project; I mostly made it for personal use during my last semester of college. I rewrote it a few times and packaged it up because I think it's genuinely useful. Langchain gets you 80% of the way there but you run into issues with it very quickly.

What is GPT5?

What is different about this than the other "chat with your documents" things? I've seen so many, even open source ones.

What happens to documents uploaded? Can you access them? Are they used in later training?

Documents actually never get uploaded! PDF text extraction happens on the client using a web worker and MuPDF compiled to WASM.
1. PDF parsed and chunked on the client
2. Sparse vectors are regenerated for the entire document corpus and the existing vectors are updated
3. Dense vectors are generated for the new text and upserted along with the new sparse values
The original documents stay on your device.

Yeah but what is GPT5?

Do you have any plans to open source this?

GPT5?
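The three ingestion steps described in the thread (chunk the text, regenerate sparse vectors for the whole corpus, upsert dense vectors for the new text only) can be sketched roughly as follows. This is an illustration under stated assumptions, not DankGPT's code: the thread does not name the sparse weighting scheme, so TF-IDF is assumed here, and `toy_embed`, `Index`, and the chunk sizes are hypothetical stand-ins. It is written in Python for brevity even though the post describes step 1 as running in the browser.

```python
import math
from collections import Counter


def chunk(text, size=50, overlap=10):
    """Step 1: split text into overlapping word windows."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def tfidf_vectors(chunks):
    """Step 2: sparse vectors over the whole corpus (TF-IDF is an assumption).

    IDF depends on how many chunks contain each term, so adding chunks
    changes the weights of chunks that were already indexed.
    """
    tokenized = [c.lower().split() for c in chunks]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            t: (count / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, count in tf.items()
        })
    return vectors


def toy_embed(text, dim=8):
    """Hypothetical stand-in for a real embedding model (hashes tokens into buckets)."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        vec[sum(ord(ch) for ch in tok) % dim] += 1.0
    return vec


class Index:
    """In-memory stand-in for a vector store holding both vector types."""

    def __init__(self):
        self.chunks = []
        self.sparse = []
        self.dense = []

    def add_document(self, text):
        new_chunks = chunk(text)
        self.chunks.extend(new_chunks)
        # Step 2: recompute sparse vectors for every chunk, old and new.
        self.sparse = tfidf_vectors(self.chunks)
        # Step 3: compute dense vectors only for the newly added chunks.
        self.dense.extend(toy_embed(c) for c in new_chunks)
        return len(new_chunks)
```

The asymmetry between steps 2 and 3 follows from the math: a corpus-level statistic such as IDF shifts whenever the corpus grows, whereas a dense embedding of a chunk depends only on that chunk's text and can be computed once.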