Ask HN: Is there an OCR that might be able to handle field datasheets?

1 point by clamlady a year ago · 3 comments · 1 min read


I am an ecologist looking for OCR that can take .pdf scans of my Rite-in-the-Rain field notebooks (which are sometimes quite dirty) and extract the length measurements recorded in them. I've tried Tesseract in R, but it doesn't handle the scans well. I plan to use the OCR output as an additional QC step after I enter the data by hand. Thanks in advance!
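To make the QC idea concrete, here is a minimal sketch in Python of how OCR output could be checked against hand-entered values. Everything in it is an assumption for illustration: the function names `parse_lengths` and `flag_mismatches` are hypothetical, the sample values are made up, and it assumes the OCR text holds one length per row in the same order as the hand-entered data. It is independent of which OCR tool produces the text.

```python
import re

def parse_lengths(ocr_text):
    """Pull every number (integer or decimal) out of raw OCR text, in order."""
    return [float(m) for m in re.findall(r"\d+(?:\.\d+)?", ocr_text)]

def flag_mismatches(hand_entered, ocr_values, tolerance=0.5):
    """Return (row_index, hand_value, ocr_value) for rows that disagree.

    Rows are compared by position; a missing OCR value is reported as None.
    The tolerance is an arbitrary placeholder, not a recommended threshold.
    """
    flagged = []
    for i, hand in enumerate(hand_entered):
        ocr = ocr_values[i] if i < len(ocr_values) else None
        if ocr is None or abs(hand - ocr) > tolerance:
            flagged.append((i, hand, ocr))
    return flagged

# Made-up example data: OCR misread "18.2" as "78.2" on the third row.
ocr_text = "12.5\n13.0\n78.2\n9.4"
hand = [12.5, 13.0, 18.2, 9.4]

print(flag_mismatches(hand, parse_lengths(ocr_text)))
# prints [(2, 18.2, 78.2)]
```

A flagged row only says that the two sources disagree, so each one still needs a look at the original page to decide whether the typing or the OCR was wrong.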

_bc2z a year ago

I've also tried Tesseract in the past on handwritten notes, and the results weren't very accurate. I then looked into commercial solutions and came across many different tools, but the only one that could handle my handwriting was Klippa DocHorizon: https://www.klippa.com/en/ocr/ It uses machine learning and OCR instead of just plain OCR like Tesseract does, so it might be an option to look into. You could also test it out at https://www.klippa.com/en/ocr/tools/

I've been using it for a while and would highly recommend it. Hopefully it works out for your use case.

solardev a year ago

In my limited experience, Google Cloud Vision API was much better than Tesseract: https://cloud.google.com/vision#demo

atsaloli a year ago

Have you tried AI chatbots? They are pretty good at OCR nowadays.
