PDFBox getFontSize returns -1
I am using PDFBox to get the font size from PDF files. I extended PDFTextStripper and overrode the writeString function, which gives me access to TextPosition objects.
It works fine about half the time, but in the other cases it returns the font size as "-1". Why is this? It affects the rest of my algorithm.
I've tried the functions getHeight, getHeightDir and getFontSize. They all give the same result.
Here's the writeString function:
    @Override
    protected void writeString(String string, List<TextPosition> textPositions) throws IOException {
        for (TextPosition text : textPositions) {
            getChar(text);
            writeString(string);
        }
    }
The function getChar processes the information.
How can I fix it? Thanks in advance.
EDIT: I am using PDFBox 2.0.2. My application has to convert any file to PDF and then process it with PDFBox. This issue occurs with all table files. I am using Apache POI 3.15 to convert documents to PDF. It works great for doc, docx, ppt, pptx, odt and odp.
Since you didn't share a sample document, here are my findings based on the question alone.
Assuming PDFBox itself is working correctly, a getFontSize result of -1 means the font size was not set on the source side, i.e. when the PDF was generated. If you observe that the characters for which getFontSize returns -1 are all rendered at the same size, that size can be treated as the default size.
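The default-size idea could be sketched roughly as follows. This is only an illustration under the assumption that a non-positive value really means "size not set"; FontSizeFallback and effectiveSize are hypothetical names, not part of PDFBox, and the default value has to come from your own observation of the document.

```java
/**
 * Hypothetical helper (not part of PDFBox): substitutes an assumed default
 * whenever the reported font size looks unset.
 */
final class FontSizeFallback {

    private FontSizeFallback() {
        // static helper only, no instances
    }

    /**
     * Returns reportedSize if it is positive; otherwise returns defaultSize.
     * Assumption: a value such as -1 (or 0, or NaN) means the size was not set
     * when the PDF was generated.
     */
    static float effectiveSize(float reportedSize, float defaultSize) {
        return reportedSize > 0f ? reportedSize : defaultSize;
    }
}
```

Inside writeString this might be called as FontSizeFallback.effectiveSize(text.getFontSize(), 12f), where 12f is just a placeholder for whatever size you observe for the affected characters.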
If that doesn't help, please share a sample PDF, as others asked in the comments, so that a real solution can be found.