Show HN: Local-first fast CPU image to text for screenshots, PDFs, webpages

github.com

19 points by mrkn1 6 days ago · 18 comments

KetoManx64 6 days ago

What's the performance like compared to tesseract? I don't see tesseract mentioned anywhere in the readme, which is surprising considering that's the number one tool most go to for Image > text OCR.

  • mrkn1OP 6 days ago

    No rigorous eval, and I love Tesseract. Here's the example that motivated me to build textsnap (it's in the GitHub README), parsed with Tesseract:

    https://imgur.com/a/i2eQra8

    • KetoManx64 6 days ago

      Very noticeable difference, and the exact issue I run into repeatedly with Tesseract! Definitely going to try dropping textsnap into my scripts now. Thanks!!

abstract257 6 days ago

Curious how it does on multi-page scanned PDFs vs. single screenshots? The ORT vision/decoder split is the part that usually makes or breaks CPU VLM OCR...

lavaman131 5 days ago

This is awesome! Been needing something like this for some research paper diagrams I've been indexing.

vivzkestrel 6 days ago

How well do you think this'll work with code? I mean taking code screenshots and converting them into actual code for VS Code.

  • mrkn1OP 6 days ago

    Just ran

      textsnap "https://i.ytimg.com/vi/LBNDfxjEYlA/maxresdefault.jpg"
    
    and got this

      $('.count').each(function () {
      $('this').prop('Counter', 0).animate({
        Counter: $('this').text()
      }, {
          duration: 4000,
          easing: 'swing',
          step: 'function (now) {
              $('this").text(Math.ceil(now));
          }
        }); 
      });

monosma 6 days ago

What was the reason for adopting PaddleOCR? Can other OCR models be used as well?

  • mrkn1OP 6 days ago

    No reason other than their Q4 model working reasonably well and fast on my CPU-only laptop. It should work with any ONNX VLM model.

garrett2558 6 days ago

Very cool, I'm building my own local-first product as well

kouru225 6 days ago

Roman alphabet only or does this work with other alphabets?

BIGFOOT_EXISTS 6 days ago

Now this is legit cool, keep up the great work.
