Convert XFA PDF to Image with PDFBox
Is there a way to convert an XFA PDF to a set of images (PNG or JPEG) with Apache PDFBox?
I am using version 1.8.6, which should have XFA support.
The PDF file to convert is a dynamic form. Converting a static PDF form does not pose any problems.
That works for me in PDFBox 1.8.6 by calling the method PDPage.convertToImage().
My attempt with the XFA PDF resulted in this image:
Here is the code I used to test the PDF conversion:
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.io.RandomAccessFile;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

public void convertToImages(File sourceFile, File destinationDir) {
    // mkdirs() also creates missing parent directories
    if (!destinationDir.exists() && destinationDir.mkdirs()) {
        System.out.println("Folder Created -> " + destinationDir.getAbsolutePath());
    }
    if (sourceFile.exists()) {
        System.out.println("Writing images to folder: " + destinationDir.getName());
        PDDocument document = null;
        try {
            // The classical way to load the PDF document doesn't work here
            // document = PDDocument.load(sourceFile);
            File scratch = new File(destinationDir, "scratch");
            if (scratch.exists()) {
                scratch.delete();
            }
            // Non-sequential parser; the scratch file holds PDFBox's temporary data
            document = PDDocument.loadNonSeq(sourceFile, new RandomAccessFile(scratch, "rw"));
            // Doesn't seem to have an effect in my case but I keep it ;-)
            document.setAllSecurityToBeRemoved(true);
            @SuppressWarnings("unchecked")
            List<PDPage> list = document.getDocumentCatalog().getAllPages();
            System.out.println("Total pages to be converted -> " + list.size());
            // Strip the extension; keep the whole name if it contains no dot
            String fileName = sourceFile.getName();
            int pos = fileName.lastIndexOf('.');
            if (pos > 0) {
                fileName = fileName.substring(0, pos);
            }
            int pageNumber = 1;
            for (PDPage page : list) {
                File outputfile = new File(destinationDir, fileName + "_" + pageNumber + ".png");
                try {
                    BufferedImage image = page.convertToImage();
                    ImageIO.write(image, "png", outputfile);
                    if (outputfile.exists()) {
                        System.out.println("Image Created -> " + outputfile.getName());
                    } else {
                        System.out.println("Image NOT Created -> " + outputfile.getName());
                    }
                } catch (Exception e) {
                    System.out.println("Error while creating image file " + outputfile.getName());
                    e.printStackTrace();
                }
                // Advance even after a failure so file names stay aligned with page numbers
                pageNumber++;
            }
        } catch (IOException e) {
            System.err.println("Could not load " + sourceFile.getName());
            e.printStackTrace();
        } finally {
            if (document != null) {
                try {
                    document.close();
                } catch (IOException e) {
                    System.err.println("Could not close " + sourceFile.getName());
                    e.printStackTrace();
                }
            }
        }
        System.out.println("Converted Images are saved at -> " + destinationDir.getAbsolutePath());
    } else {
        System.err.println(sourceFile.getName() + " File does not exist");
    }
}
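The per-page output name in the method above is built from the source file name and the page number. As a stand-alone sketch of just that naming step, the helper below strips the extension and guards against a source name that has no dot. The class and method names (PageImageNames, baseName, pageImageFile) are hypothetical and not part of PDFBox.

```java
import java.io.File;

// Hypothetical helper, not part of PDFBox: builds "<name>_<page>.png" output files.
final class PageImageNames {
    private PageImageNames() {}

    // Strips the last extension; names without a dot (or dot-files) are returned unchanged.
    static String baseName(String fileName) {
        int pos = fileName.lastIndexOf('.');
        // pos == -1 means "no dot", pos == 0 means a dot-file such as ".hidden"
        return pos > 0 ? fileName.substring(0, pos) : fileName;
    }

    // Combines the destination folder, the base name and a 1-based page number.
    static File pageImageFile(File destinationDir, File sourceFile, int pageNumber) {
        String name = baseName(sourceFile.getName()) + "_" + pageNumber + ".png";
        return new File(destinationDir, name);
    }

    public static void main(String[] args) {
        File out = pageImageFile(new File("out"), new File("form.pdf"), 1);
        System.out.println(out.getName());
    }
}
```

Keeping the naming separate from the rendering loop makes it possible to check the file names without loading any PDF.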
Is there anything special about this case?
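To narrow down whether a given file really carries XFA data, one rough check that does not depend on PDFBox is to look for the /XFA key in the raw bytes of the file. The sketch below is only a heuristic under stated assumptions. The class and method names (XfaSniffer, containsXfaKey) are hypothetical, it assumes Java 7 or later, and it reads the whole file into memory. It can miss the key when the form dictionary is stored in a compressed object stream, and it can report a match for a file that merely contains the text "/XFA" elsewhere.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of PDFBox: raw-byte heuristic for an /XFA entry.
final class XfaSniffer {
    private XfaSniffer() {}

    // Returns true if the byte sequence "/XFA" occurs anywhere in the data.
    static boolean containsXfaKey(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary content survives the conversion
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/XFA");
    }

    // Convenience overload that reads the whole file into memory first.
    static boolean containsXfaKey(File pdf) throws IOException {
        return containsXfaKey(Files.readAllBytes(pdf.toPath()));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: XfaSniffer <file.pdf>");
            return;
        }
        System.out.println("/XFA key found: " + containsXfaKey(new File(args[0])));
    }
}
```

A positive result only says that the key is present; it does not distinguish a static XFA form from a dynamic one.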
I tried GIMP, even though the conversion is meant to run on a server, but it does not work with dynamic PDFs.
I also tried ImageMagick, but it did not work at all. Since I would have been very surprised if it could handle this case, I did not investigate it further.