Show HN: Annotate Images and Scans with BIO-Scheme Using Konfuzio SDK

colab.research.google.com

2 points by konfuzio 5 years ago · 1 comment

konfuzioOP 5 years ago

Retraining NLP models such as flair often requires the training data to be in the BIO scheme. For scanned documents or images, we convert visual annotations to the BIO scheme using OCR: we transform each annotation's bounding box into the start and end offsets of the annotation, together with its label. In the new release of our SDK, this conversion can be done using the method get_text_in_bio_scheme() of the Document class.
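To illustrate the idea of going from start and end offsets to BIO tags, here is a minimal sketch in plain Python. It is not the SDK's implementation and does not call the SDK: the function name spans_to_bio, the (start, end, label) tuple layout with an exclusive end offset, the whitespace tokenization, and the sample text are all assumptions made for illustration. It also assumes OCR has already produced the text and the offsets.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical span layout: (start_offset, end_offset, label), end exclusive.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Tag whitespace-separated tokens with B-/I-/O labels from character spans."""
    tagged = []
    previous: Optional[Span] = None  # span that covered the previous token
    for match in re.finditer(r"\S+", text):
        # Pick the first span whose character range overlaps this token.
        current = next(
            (s for s in spans if s[0] < match.end() and match.start() < s[1]),
            None,
        )
        if current is None:
            tag = "O"  # token is outside every annotation
        elif current is previous:
            tag = "I-" + current[2]  # token continues the same annotation
        else:
            tag = "B-" + current[2]  # token begins a new annotation
        tagged.append((match.group(), tag))
        previous = current
    return tagged


# Made-up example text and offsets, as if produced by OCR plus annotation.
text = "Invoice from Acme Corp dated 2021-03-01"
spans = [(13, 22, "ORG"), (29, 39, "DATE")]
for token, tag in spans_to_bio(text, spans):
    print(token, tag)
```

Writing one token and tag per line, as the loop above does, gives the column layout that sequence taggers typically read. A real pipeline would need to use the same tokenizer as the model being retrained, since whitespace splitting is only a stand-in here.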

Find the source code here: https://github.com/konfuzio-ai/document-ai-python-sdk/blob/b...

Many other file types are supported. Have a look at https://dev.konfuzio.com/web/api.html#supported-file-types