Remove all vector paths from PDF

I am looking for a way to remove all path objects from a PDF file.

I suspect it can probably be done with JavaScript in Adobe Acrobat, but I would really appreciate a hint on how to do it with the Ghostscript or MuPDF tools.

In any case, any working solution is acceptable as the correct answer.



1 answer


To do this with Ghostscript, you will have to modify the pdfwrite device. In fact, you will probably have to do something similar with any PDF interpreter.

What do you consider a "path" object? A shfill, for example? How about text? How about text using a Type 3 font (which creates paths)?

How about clip paths?

If you really want to do this, I can tell you where to change pdfwrite, provided you don't mind recompiling Ghostscript.

This is probably a dumb question, but why would you want to do this? Is it possible that there is another solution to your problem? If all you want to do is delete filled paths (or even stroked paths), one solution would be to run the file through ps2write to get PostScript, add code to override "fill" and "stroke" as no-ops, and then run the file back through pdfwrite to get a PDF.

[Added after reading comments]

PDF does not have a "path" object type, unlike an XObject, which is an object type. Paths are built by a sequence of operations such as "newpath", "moveto", "curveto", and "lineto". Once you have built the path, you operate on it with "fill" or "stroke". Note that PDF also has no "text" object type.

This is why your approach doesn't work: you can't delete "path objects" because they don't exist; paths are created in the content stream. You can use a Form XObject to do something like this, but then the path construction is in the form's content stream, so it is still not a separate object.
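To illustrate, here is a toy Python sketch that labels the path operators in an uncompressed content stream fragment. The operator names (`m`, `l`, `c`, `re`, `f`, `S`, and so on) are the PDF spellings of the operations mentioned above; the sample fragment, the function name, and the whitespace tokenizer are illustrative assumptions, not a real PDF parser.

```python
# PDF path construction and path painting operators.
CONSTRUCT = {"m", "l", "c", "v", "y", "h", "re"}
PAINT = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"}

def classify(stream):
    """Naively split a content stream fragment on whitespace and
    label each path operator as 'construct' or 'paint'."""
    result = []
    for token in stream.split():
        if token in CONSTRUCT:
            result.append((token, "construct"))
        elif token in PAINT:
            result.append((token, "paint"))
    return result

# Hypothetical fragment: set a green fill colour, add a rectangle, fill it.
sample = "0 1 0 rg 10 10 100 50 re f"
print(classify(sample))
```

For the sample, the only path-related tokens are the `re` construction operator followed by the `f` painting operator; the rectangle never exists as a standalone object that could be deleted.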

The same goes for PostScript; these are NOT object-oriented languages. You cannot "discover a vector-type path object" in either language because there are no such objects. In practice, everything that is not an image is a vector object and is built from a path (and with clipping, even some images can be considered paths).

The PostScript fragment you quoted adds a rectangle to the path (paths do not need to be contiguous in PDF or PostScript) and then fills it. Note that, as is usual with PostScript, it does not use the PostScript operators directly, but instead executes procedures that use the operators. The procedures are defined in the prologue of the program.

By the way, it looks like you used the pswrite device here (I can't be sure from such a small sample). If so, you really want to start with ps2write instead. Otherwise, you end up with a huge number of things degenerating into tiny filled rectangles (pswrite does this with many types of images).



I didn't suggest you "decrypt" the ps2write output (it's not encrypted, it's compressed).

I suggested creating a PostScript file, overriding the "stroke" and/or "fill" operators so that they do nothing, and then running the resulting PostScript program back through Ghostscript using the pdfwrite device. This will create a PDF file that omits all stroked and/or filled objects.
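The two Ghostscript runs in this round trip can be scripted. The following is a minimal Python sketch rather than a tested recipe: it assumes a `gs` executable is on the PATH, and the function names and file names are hypothetical. `-o` and `-sDEVICE=` are standard Ghostscript options. The overrides still have to be added to the intermediate PostScript file between the two steps.

```python
import subprocess

def gs_command(device, src, dst):
    """Build a Ghostscript command line that converts src to dst with the given device."""
    # -o sets the output file and implies -dBATCH -dNOPAUSE.
    return ["gs", "-o", dst, f"-sDEVICE={device}", src]

def pdf_to_ps(pdf_path, ps_path):
    """Step 1: PDF to PostScript with the ps2write device."""
    subprocess.run(gs_command("ps2write", pdf_path, ps_path), check=True)

def ps_to_pdf(ps_path, pdf_path):
    """Step 2: (edited) PostScript back to PDF with the pdfwrite device."""
    subprocess.run(gs_command("pdfwrite", ps_path, pdf_path), check=True)
```

Calling `pdf_to_ps("in.pdf", "tmp.ps")`, editing `tmp.ps`, and then calling `ps_to_pdf("tmp.ps", "out.pdf")` corresponds to the procedure described here.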

[final addition]

I took your sample file and looked at it.

I am guessing the error you are seeing arises because the PDF file uses a /Separation colour space (it certainly does for the rectangle fill) with an ICCBased alternate rather than a device space. In this case, the current version of ps2write may solve your problem. It currently (this is subject to change) does not preserve /Separation colours and instead emits them as a device colour, RGB by default. Therefore, simply converting the file to PostScript and back to PDF can completely solve your problem.

If you knew what the problem was, it would have been quicker to tell us, and I could have given you this information and a workaround at the start.

Using ps2write, I then created a PostScript version of the file (note that the Separations are now RGB) and prefixed the PostScript program with two lines:

/fill {newpath} bind def
/stroke {newpath} bind def

      

Note that you must use an editor that preserves binary data when saving. Then, when I run this PostScript program through Ghostscript with the pdfwrite device, I get a PDF in which the green "decoration" that you were having a problem with has disappeared.
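If no binary-safe editor is at hand, the two lines can be prepended programmatically instead. A minimal Python sketch, assuming the intermediate PostScript file fits in memory; the function name and file names are hypothetical:

```python
# The two override lines, kept as bytes so no text decoding is involved.
OVERRIDES = b"/fill {newpath} bind def\n/stroke {newpath} bind def\n"

def prepend_overrides(src_path, dst_path):
    """Write OVERRIDES followed by the untouched bytes of src_path to dst_path."""
    # Binary mode avoids any newline or character-encoding translation.
    with open(src_path, "rb") as src:
        original = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(OVERRIDES + original)
```

Writing to a separate output file keeps the ps2write output intact in case the overrides need adjusting.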

So there is a solution to your question and perhaps a better way to solve your problem.
