Convert XFA PDF to Image with PDFBox

Is there a way to convert an XFA PDF to a set of images (PNG or JPEG) with Apache PDFBox?

I am using version 1.8.6, which should have XFA support.

The PDF file to convert is a dynamic form. Converting a static PDF form does not pose any problems.
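
For sorting test files before running the conversion, a rough heuristic is to look for the `/XFA` name token in the raw file bytes. The sketch below is only that: `XfaSniffer` and `mentionsXfa` are hypothetical names, not part of PDFBox, and the scan can miss the entry when the form dictionary is stored in a compressed object stream. It also only indicates that a file mentions XFA at all; it does not tell static and dynamic XFA forms apart.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; it is not a PDFBox API.
final class XfaSniffer {
    private XfaSniffer() {}

    /**
     * Returns true if the raw bytes of the file contain the token "/XFA".
     * Heuristic only: an /XFA entry inside a compressed object stream
     * will not be found by this scan.
     */
    static boolean mentionsXfa(Path pdf) throws IOException {
        byte[] bytes = Files.readAllBytes(pdf);
        // ISO-8859-1 maps each byte to exactly one char, so binary
        // content survives the conversion and can be searched as text.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        return raw.contains("/XFA");
    }
}
```

A call such as `XfaSniffer.mentionsXfa(sourceFile.toPath())` could be made before loading the document, so that files that mention XFA are at least logged separately.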

With PDFBox 1.8.6, that conversion succeeded by calling the method PDPage.convertToImage().

My attempt with the XFA PDF did not produce the expected result (the resulting image is not reproduced here).

Here is the code I used to test the PDF conversion:

public void convertToImages(File sourceFile, File destinationDir){
    if (!destinationDir.exists()) {
        destinationDir.mkdir();
        System.out.println("Folder Created -> "+ destinationDir.getAbsolutePath());
    }
    if (sourceFile.exists()) {
        System.out.println("Images copied to Folder: "+ destinationDir.getName());             
        PDDocument document = null;
        try {
            //The classical way to load the PDF document doesn't work here
            //document = PDDocument.load(sourceFile);
            File scratch = new File(destinationDir, "scratch");
            if(scratch.exists())scratch.delete();
            document = PDDocument.loadNonSeq(sourceFile, new RandomAccessFile(scratch, "rw"));
            //Doesn't seem to have an effect in my case but I keep it ;-)
            document.setAllSecurityToBeRemoved(true);
            @SuppressWarnings("unchecked")
            List<PDPage> list = document.getDocumentCatalog().getAllPages();
            System.out.println("Total files to be converted -> "+ list.size());

            String fileName = sourceFile.getName();
            int pos = fileName.lastIndexOf('.');
            //Strip the extension only if there is one; substring(0, -1) would throw
            if (pos > 0) {
                fileName = fileName.substring(0, pos);
            }
            int pageNumber = 1;
            for (PDPage page : list) {
                File outputfile = new File(destinationDir, fileName +"_"+ pageNumber +".png");
                try {
                    BufferedImage image = page.convertToImage();
                    ImageIO.write(image, "png", outputfile);
                    pageNumber++;
                    if(outputfile.exists()){
                        System.out.println("Image Created -> "+ outputfile.getName());
                    } else {
                        System.out.println("Image NOT Created -> "+ outputfile.getName());

                    }
                } catch (Exception e) {
                    System.out.println("Error while creating image file "+ outputfile.getName());
                    e.printStackTrace();
                }
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } finally {
            if(document != null){
                try {
                    document.close();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }

        }
        System.out.println("Converted Images are saved at -> "+ destinationDir.getAbsolutePath());
    } else {
        System.err.println(sourceFile.getName() +" File does not exist");
    }
}
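
The output naming used in the method above (base name, underscore, page number, `.png`) can be isolated into a small helper, which makes it easy to check separately from the PDF rendering. This is a minimal sketch; `PageImageNames`, `baseName` and `pageImageName` are hypothetical names introduced for illustration. It also covers source names without an extension.

```java
// Hypothetical helper for illustration; names are not taken from any library.
final class PageImageNames {
    private PageImageNames() {}

    /** Strips the last extension, if any; names without a dot are returned unchanged. */
    static String baseName(String fileName) {
        int pos = fileName.lastIndexOf('.');
        // pos > 0 also leaves dot-files such as ".hidden" untouched
        return pos > 0 ? fileName.substring(0, pos) : fileName;
    }

    /** Builds the per-page image name, e.g. "form.pdf" and page 1 give "form_1.png". */
    static String pageImageName(String sourceFileName, int pageNumber) {
        return baseName(sourceFileName) + "_" + pageNumber + ".png";
    }
}
```

Inside the loop, the output file could then be created with `new File(destinationDir, PageImageNames.pageImageName(sourceFile.getName(), pageNumber))`.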

      

Is there anything special about this case?

I tried GIMP, even though it would have to run on the server, but it doesn't work with dynamic PDFs.

I also tried ImageMagick, but it didn't work at all. Since I would be very surprised if it could handle everything, I gave up researching it further.
