PDFBox getFontSize returns -1

I am using PDFBox to get font sizes from PDF files. I extended PDFTextStripper and overrode the writeString function, which gives me access to TextPosition objects.

This works about half the time, but in other cases the font size comes back as -1. Why is this? It breaks the rest of my algorithm.

I've tried getHeight, getHeightDir and getFontSize. I get the same result with all of them.

Here's the writeString function:

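@Override
protected void writeString(String string, List<TextPosition> textPositions) throws IOException {
    // Inspect every glyph position that makes up this chunk of text.
    for (TextPosition text : textPositions) {
        getChar(text);
    }
    // Emit the chunk once, after all positions have been processed.
    writeString(string);
}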

      

The getChar function processes the information.

How can I fix this? Thanks in advance.

EDIT: I am using PDFBox 2.0.2. My application has to convert any file to PDF and then process it with PDFBox. The issue occurs with all table files. I am using Apache POI 3.15 to convert documents to PDF. It works great for doc, docx, ppt, pptx, odt and odp.



1 answer


Since you didn't share a sample document, these findings are based only on your question.

Assuming PDFBox itself is working correctly, getFontSize returning -1 suggests that the font size was not set on the source side, i.e. when the PDF was generated. If, from your observation, the characters for which getFontSize returns -1 all have the same size, you can treat that size as the default.
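To illustrate the default-size idea, here is a minimal sketch of a fallback helper. It is not part of PDFBox: the class name FontSizeFallback, the method effectiveSize and the 12 pt default in the usage note are all assumptions made for this example. It only treats a non-positive or non-numeric reported size as "not set" and substitutes a default you have chosen from your own documents.

```java
/**
 * Hypothetical helper (not a PDFBox class): substitutes a default size
 * whenever the reported font size is unusable, e.g. the -1 from the question.
 */
final class FontSizeFallback {

    private FontSizeFallback() {
        // Static utility, no instances.
    }

    /**
     * @param reported    size as reported for a glyph, e.g. by TextPosition.getFontSize()
     * @param defaultSize size to assume when the reported one is missing (an assumption
     *                    you choose after inspecting your own PDFs)
     * @return the reported size if it is positive and finite, otherwise defaultSize
     */
    static float effectiveSize(float reported, float defaultSize) {
        // NaN fails the "> 0" comparison, so it also falls through to the default.
        boolean usable = reported > 0f && !Float.isInfinite(reported);
        return usable ? reported : defaultSize;
    }
}
```

Inside your loop over textPositions you could then call something like FontSizeFallback.effectiveSize(text.getFontSize(), 12f), where 12f is only a placeholder for whatever default size matches your files.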



If that doesn't help, please provide a sample PDF, as others have asked in the comments, so that a real solution can be found.
