Don't read the PDF, write the parser
adriacidre.com
Author here. Worth flagging: this pays back because public hospital reports use the same handful of layouts for years. For a long tail of one-off document types, vision is probably the right call, since the cost of writing a parser wouldn't amortize.
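
To make the amortization point concrete, here is a rough break-even sketch. It is not from the post: the function name, the parameters, and the numbers in the usage lines are all hypothetical, and it assumes a one-time cost to write the parser plus a flat per-document cost for each approach, in whatever unit you like (hours, dollars).

```python
import math


def break_even_documents(parser_write_cost, vision_cost_per_doc, parser_cost_per_doc=0.0):
    """Smallest document count at which the parser is strictly cheaper than vision.

    All three inputs are hypothetical costs in the same arbitrary unit.
    Returns None when the parser never pays back.
    """
    saving_per_doc = vision_cost_per_doc - parser_cost_per_doc
    if saving_per_doc <= 0:
        # Running the parser costs at least as much per document as vision,
        # so the up-front write cost is never recovered.
        return None
    # Parser wins once N * saving_per_doc > parser_write_cost.
    return math.floor(parser_write_cost / saving_per_doc) + 1


# Made-up numbers, only to show the shape of the trade-off.
stable_layout = break_even_documents(parser_write_cost=400, vision_cost_per_doc=0.5)
no_saving = break_even_documents(parser_write_cost=100, vision_cost_per_doc=1.0,
                                 parser_cost_per_doc=1.0)
```

A layout that recurs for years clears that count easily; a document type you see once never does, which is the long-tail case where vision wins.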