Executing "pdftk my-pdf-form.pdf dump_data_fields" shows nothing

I am using pdftk with an editable PDF, and the documentation says that the dump_data_fields operation should list the form fields.

I am running this command (Windows): pdftk my-pdf-form.pdf dump_data_fields

I am using the server version of pdftk.

Documentation: https://www.pdflabs.com/docs/pdftk-man-page/

The PDF really is editable: it has fields that I can fill in using the Adobe PDF viewer.

+5




4 answers


The problem was that the PDF was created with Adobe LiveCycle Designer and saved as "Adobe Dynamic XML Form". The solution is to save the file as "Adobe Static PDF Form" instead. It seems that pdftk cannot read the fields of these dynamic LiveCycle files.



+7




I thought the accepted answer might be my solution, but it turned out that the PDF document I was working with didn't actually contain any form fields. If the document looks like a form but its fields are not shaded gray in the viewer, then no fields will be detected.

The only fix I found was to open the document in Acrobat Pro and add the fields with its form tool. After that, pdftk worked fine.
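
To tell quickly whether pdftk sees any fields at all in a given file, a small shell check along these lines may help. It is only a sketch: the function names are made up for illustration, it assumes a POSIX shell (for example Git Bash or WSL on Windows) with pdftk on the PATH, and it relies on pdftk printing one "FieldName:" line per field.

```shell
# Hypothetical helper: succeeds only when pdftk reports at least one field.
has_form_fields() {
    pdftk "$1" dump_data_fields 2>/dev/null | grep -q '^FieldName:'
}

# Print a short diagnosis for the PDF given as the first argument.
diagnose_form() {
    if has_form_fields "$1"; then
        echo "$1: form fields found"
    else
        echo "$1: no fields reported (dynamic XFA form, or no real form fields)"
    fi
}
```

Calling diagnose_form my-pdf-form.pdf prints one of the two messages. An empty result does not say which of the two causes applies; that still has to be checked in LiveCycle Designer or Acrobat.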

+1




If you run into the OP's issue on Windows, follow the steps below.

1- Open the PDFtk GUI program. (You can also use the CLI if you prefer.)

2- Click the "Add PDF ..." button and select the fillable PDF file.

3- Scroll down to the bottom of the PDFtk GUI window and click "Create PDF ..." without adding or changing any settings.

4- Save the new fillable PDF file under a new name in a directory of your choice.

5- Finally, run the Windows form of the dump_data_fields command from cmd, using pdftk's "output" keyword instead of ">" to write the result to fields.txt.

6- Open the text file "fields.txt" and you will see the field names.
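
Steps 5 and 6 can be sketched roughly as follows. The function names and file names are placeholders chosen for illustration, not taken from the answer. The pdftk line itself works the same way in cmd; the wrapper functions assume a POSIX shell such as Git Bash.

```shell
# dump_fields INPUT.pdf REPORT.txt
# Writes the field report to REPORT.txt using pdftk's own "output"
# keyword, so no ">" redirection is involved.
dump_fields() {
    pdftk "$1" dump_data_fields output "$2"
}

# list_field_names REPORT.txt
# Prints one field name per line by keeping only the "FieldName:" lines
# and stripping that prefix.
list_field_names() {
    sed -n 's/^FieldName: //p' "$1"
}
```

For example, dump_fields my-pdf-form-resaved.pdf fields.txt followed by list_field_names fields.txt would print the names found in the re-saved file.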

0




I don't know if this helps, but I wrote some C# code to count the data fields in a document. Take a look at the following functions.

  1. Here we pass the path of the file, and the function counts the total number of fields in the document.

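    using System;
    using System.Diagnostics;

    // Counts the form fields that pdftk reports for the PDF at inputFile.
    public int countDataFields(string inputFile)
    {
        int fieldCount = 0;

        using (Process newProcess = new Process())
        {
            // Quote the path so that file names containing spaces survive argument parsing.
            string arguments = "\"" + inputFile + "\" dump_data_fields";
            newProcess.StartInfo = new ProcessStartInfo("pdftk", arguments);
            // Only standard output is redirected; a redirected stream that is
            // never read can fill its buffer and block the child process.
            newProcess.StartInfo.RedirectStandardOutput = true;
            newProcess.StartInfo.UseShellExecute = false;
            newProcess.StartInfo.CreateNoWindow = false;
            newProcess.Start();

            // dump_data_fields prints several lines per field (FieldType, FieldName, ...),
            // so count only the FieldName lines rather than every line.
            string line;
            while ((line = newProcess.StandardOutput.ReadLine()) != null)
            {
                if (line.StartsWith("FieldName:", StringComparison.Ordinal))
                {
                    fieldCount++;
                }
            }

            newProcess.WaitForExit();
            Console.WriteLine("Field Counts: " + fieldCount);
        }

        return fieldCount;
    }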
    
          

  2. If you want to stream the file to pdftk via standard input:

    using System;
    using System.Diagnostics;
    using System.IO;

    // Streams the PDF to pdftk over standard input and prints the number of form fields.
    public void countDataFieldsWhenFilePassedAsBinaryStream(string file1)
    {
        int fieldCount = 0;

        using (Process newProcess = new Process())
        {
            // "-" tells pdftk to read the input PDF from standard input.
            newProcess.StartInfo = new ProcessStartInfo("pdftk", "- dump_data_fields");
            newProcess.StartInfo.UseShellExecute = false;
            newProcess.StartInfo.RedirectStandardInput = true;
            newProcess.StartInfo.RedirectStandardOutput = true;
            newProcess.Start();

            // Copy the file to pdftk's standard input in chunks of 1024 bytes.
            byte[] buffer = new byte[1024];
            using (FileStream input = File.OpenRead(file1))
            {
                int bytesRead;
                while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    newProcess.StandardInput.BaseStream.Write(buffer, 0, bytesRead);
                }
            }

            // Close standard input so that pdftk sees end-of-file.
            newProcess.StandardInput.Close();

            // Read standard output to the end, counting one FieldName line per field.
            string line;
            while ((line = newProcess.StandardOutput.ReadLine()) != null)
            {
                if (line.StartsWith("FieldName:", StringComparison.Ordinal))
                {
                    fieldCount++;
                }
            }

            newProcess.WaitForExit();
            Console.WriteLine(fieldCount);
        }
    }
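
For a quick check without compiling anything, the same two counts can be sketched in a POSIX shell. The function names are hypothetical, and the sketch assumes that pdftk is on the PATH and prints one "FieldName:" line per field.

```shell
# count_fields FILE.pdf
# Counts the fields of a PDF passed by path.
count_fields() {
    pdftk "$1" dump_data_fields | grep -c '^FieldName:'
}

# count_fields_stdin < FILE.pdf
# Counts the fields of a PDF streamed on standard input ("-" means stdin).
count_fields_stdin() {
    pdftk - dump_data_fields | grep -c '^FieldName:'
}
```

Note that grep -c still prints 0 when nothing matches, but it then exits with a non-zero status, which matters if the script runs under set -e.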
    
          

Hope this helps :)

0

