parsing - PDFClown - Highlight words in PDF
I have a requirement to search for a set of strings in a PDF and, if they are found, highlight them. I followed the example at the link below to read a PDF page, extract its text, and highlight words. I can parse the PDF and extract the text:

    // 2.1. Extract the page text!
    Map<Rectangle2D,List<ITextString>> textStrings = textExtractor.extract(page);
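Reading the text works, but the issue is this: the PDF page has two paragraphs laid out as two columns, and the extracted "textStrings" come back interleaved. The first 3 lines are read from the 1st column (1st paragraph) and the next 3 lines are read from the 2nd column (2nd paragraph), which is not the correct reading order. Is there a way to make the parser read the first paragraph completely, then the second paragraph, and then, if the page has an index or references section below, read that as a third paragraph?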
I would appreciate any kind of help!
Thanks!
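
For illustration, here is a minimal sketch of one way the extracted strings could be put back into column-by-column order once each string's bounding box is known. It does not call PDFClown at all: `Fragment`, `readingOrder`, and `columnSplitX` are hypothetical names made up for this example, the split between the two columns is assumed to be a known x coordinate, and y is assumed to grow downward (reverse the y comparison if your coordinates grow upward).

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ColumnOrder {
    /** Hypothetical stand-in for one extracted text string plus its bounding box. */
    public static final class Fragment {
        private final String text;
        private final Rectangle2D box;

        public Fragment(String text, Rectangle2D box) {
            this.text = text;
            this.box = box;
        }

        public String text() { return text; }
        public Rectangle2D box() { return box; }
    }

    /**
     * Returns the fragment texts ordered left column first, then right column,
     * each column top to bottom. A fragment belongs to the left column when the
     * horizontal center of its box lies left of columnSplitX (an assumed value).
     */
    public static List<String> readingOrder(List<Fragment> fragments, double columnSplitX) {
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator
            .comparingInt((Fragment f) -> f.box().getCenterX() < columnSplitX ? 0 : 1)
            .thenComparingDouble(f -> f.box().getY())   // top to bottom (y grows downward)
            .thenComparingDouble(f -> f.box().getX())); // left to right within a line

        List<String> result = new ArrayList<>();
        for (Fragment f : sorted) {
            result.add(f.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Fragments arrive interleaved, the way the question describes.
        List<Fragment> interleaved = new ArrayList<>();
        interleaved.add(new Fragment("left line 1", new Rectangle2D.Double(50, 100, 200, 12)));
        interleaved.add(new Fragment("right line 1", new Rectangle2D.Double(320, 100, 200, 12)));
        interleaved.add(new Fragment("left line 2", new Rectangle2D.Double(50, 114, 200, 12)));
        interleaved.add(new Fragment("right line 2", new Rectangle2D.Double(320, 114, 200, 12)));

        for (String line : readingOrder(interleaved, 300)) {
            System.out.println(line);
        }
    }
}
```

With an extraction result like the one in the question, the map keys or the boxes of the individual text strings would supply the rectangles. A full-width section below the columns (such as an index or references block) would not fit this two-way split and would need to be separated out by its vertical position first.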