iTextSharp: reading highlighted text (highlight annotations) from a PDF using C#

Question

I am developing a C# WinForms application that converts PDF content to text. All the required content is extracted except the content found in the highlighted text of the PDF. Please help me with a working sample that extracts the highlighted text from a PDF. I am using iTextSharp.dll in the project.

pdf itextsharp pdf-scraping

Binod Apr 28 14 at 13:31

1 answer

Bruno Lowagie · Answer 1 · 2014-04-28T13:53:17+0000

Assuming you are talking about comments (the text stored with an annotation), try the following:

for (int i = pageFrom; i <= pageTo; i++) {
    PdfDictionary page = reader.GetPageN(i);
    PdfArray annots = page.GetAsArray(PdfName.ANNOTS);
    // A page without annotations has no /Annots entry
    if (annots != null) {
        foreach (PdfObject annot in annots.ArrayList) {
            // Resolve the indirect reference to the annotation dictionary
            PdfDictionary annotation = (PdfDictionary)PdfReader.GetPdfObject(annot);
            PdfString contents = annotation.GetAsString(PdfName.CONTENTS);
            // now use the String value of contents (null if there is no /Contents entry)
        }
    }
}

This is written from memory (I'm a Java developer, not a C# developer).
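
The loop above reads the comment text stored in each annotation's /Contents entry. If the goal is instead the page text that sits underneath a highlight, one possible approach is to take each highlight annotation's /Rect and extract only the text inside that region. The following is a rough, untested sketch assuming iTextSharp 5.x; the class HighlightTextSketch and the method GetHighlightedText are hypothetical names used for illustration, not part of the library.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class HighlightTextSketch
{
    // Hypothetical helper: returns the text found inside the bounding
    // rectangle of every highlight annotation in the document.
    public static List<string> GetHighlightedText(string path)
    {
        var results = new List<string>();
        // Compare subtypes by name; highlight annotations use /Highlight
        PdfName highlight = new PdfName("Highlight");
        PdfReader reader = new PdfReader(path);
        try
        {
            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                PdfArray annots = reader.GetPageN(i).GetAsArray(PdfName.ANNOTS);
                if (annots == null) continue; // page has no annotations

                foreach (PdfObject annot in annots.ArrayList)
                {
                    PdfDictionary annotation =
                        (PdfDictionary)PdfReader.GetPdfObject(annot);
                    if (annotation == null) continue;
                    if (!highlight.Equals(annotation.GetAsName(PdfName.SUBTYPE)))
                        continue; // skip notes, links, and other subtypes

                    // /Rect holds [llx lly urx ury] in page coordinates
                    PdfArray rect = annotation.GetAsArray(PdfName.RECT);
                    if (rect == null || rect.Size < 4) continue;
                    var region = new iTextSharp.text.Rectangle(
                        rect.GetAsNumber(0).FloatValue,
                        rect.GetAsNumber(1).FloatValue,
                        rect.GetAsNumber(2).FloatValue,
                        rect.GetAsNumber(3).FloatValue);

                    // Keep only the text chunks that fall inside the region
                    RenderFilter[] filters = { new RegionTextRenderFilter(region) };
                    ITextExtractionStrategy strategy = new FilteredTextRenderListener(
                        new LocationTextExtractionStrategy(), filters);
                    results.Add(PdfTextExtractor.GetTextFromPage(reader, i, strategy));
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return results;
    }
}
```

Note that /Rect is only the bounding box of the annotation, so a highlight spanning several lines may pick up neighbouring text that is not actually highlighted. The annotation's /QuadPoints entry describes the highlighted areas more precisely and could be used to build tighter regions.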
