Ask HN: Google Vision vs. AWS Textract vs. Tesseract?

2 points by takkatakka 2 years ago · 0 comments · 1 min read

I've been playing around with Tesseract to extract text from PDF files and have found it a little disappointing, even after pre-processing the images (300 dpi, binarization, de-skewing).
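
For readers unfamiliar with the binarization step mentioned above: one common way to do it is Otsu's method, which picks the gray level that best separates dark text from a light background. The post does not say which method was used, so this is only a minimal pure-Python sketch, assuming 8-bit grayscale pixels in a flat list. The names `otsu_threshold` and `binarize` are illustrative and not part of Tesseract or any library.

```python
def otsu_threshold(pixels):
    """Return an Otsu threshold for a list of 8-bit grayscale values (0-255)."""
    # Build a 256-bin histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already in the background class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); larger is better.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map pixels above the threshold to white (255), the rest to black (0)."""
    return [255 if p > threshold else 0 for p in pixels]


if __name__ == "__main__":
    sample = [10, 12, 11, 200, 210, 205]   # made-up values: dark text, light page
    t = otsu_threshold(sample)
    print(t, binarize(sample, t))
```

A single global threshold like this tends to struggle on pages with uneven lighting, which may be part of why pre-processing alone does not rescue poor scans.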

Has anyone had good experiences with OCR tech for extracting text from PDFs? Especially in scenarios where you can't control or improve the initial image.

No comments yet.
