Ghostscript or Python: how to combine PDFs with different page sizes into one PDF with a uniform page size?
I searched Stack Overflow for this problem. The closest questions I found:
How to set custom page size with Ghostscript
How to convert multiple PostScript files of different sizes into one PDF?
But these do not solve my problem.
The question is simple: how can we combine multiple PDFs (with different page sizes) into a single PDF in which all pages have the same size?
The two input PDF files:
hw1.pdf with one 5.43x3.26 inch page (size as reported by Adobe Reader)
hw6.pdf with one 5.43x6.51 inch page
PDFs can be found here:
gs -sDEVICE=pdfwrite -r720 -g2347x3909 -dPDFFitPage -o homeworks.pdf hw1.pdf hw6.pdf
PROBLEM: In the output, the first page is portrait and the second page is landscape.
QUESTION: How can we make both pages portrait?
-r720 sets the resolution in pixels per inch.
The size -g2347x3909 was computed with this Python script:
    import numpy as np

    # Page size in inches times the resolution gives the size in pixels.
    wd = int(np.floor(720 * 5.43))
    ht = int(np.floor(720 * 3.26))
    gsize = '-g' + str(ht) + 'x' + str(wd) + ' '
    # this gives: gsize = -g2347x3909
A second attempt, setting the media size in points:

    import subprocess

    commands = ('gs -o homeworks.pdf -sDEVICE=pdfwrite -dDEVICEWIDTHPOINTS=674 '
                '-dDEVICEHEIGHTPOINTS=912 -dPDFFitPage '
                'hw1.pdf hw6.pdf')
    subprocess.call(commands, shell=True)
This makes both pages portrait, but they are not the same size.
The first page is smaller, and the second fills the page when I open the output in Adobe Reader.
In general, how can we make all pages the same size?
You tagged this question with "ghostscript", but since you are already driving it from a Python script, I'm assuming you don't mind using Python.
The PageMerge canvas from the pdfrw Python library can do this. There are examples of using it with pages of different sizes in the examples directory and in the pagemerge.py source. fancy_watermark.py shows an example of working with different page sizes in the context of watermarking.
pdfrw can rotate, scale, or simply position the original pages in the output. If you want rotation or scaling, you can look in the examples directory. (Since this is for homework, for extra credit you can control the scaling and rotation by looking at the different page sizes. :) But if all you want is for the smaller pages to be extended to the size of the largest one, you could do something like this:
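    from pdfrw import PdfReader, PdfWriter, PageMerge

    pages = PdfReader('hw1.pdf').pages + PdfReader('hw6.pdf').pages
    output = PdfWriter()

    # MediaBox is [x0, y0, x1, y1]; find the largest width and height.
    rects = [[float(num) for num in page.MediaBox] for page in pages]
    height = max(x[3] - x[1] for x in rects)
    width = max(x[2] - x[0] for x in rects)

    mbox = [0, 0, width, height]

    for page in pages:
        newpage = PageMerge()
        newpage.mbox = mbox              # Set the boundaries of the output page
        newpage.add(page)                # Add one old page to the new page
        image = newpage[0]               # Get the old page (first item added)
        image.x = (width - image.w) / 2  # Center the old page left/right
        image.y = (height - image.h)     # Move the old page to the top
        output.addpage(newpage.render())

    output.write('homeworks.pdf')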
(Disclaimer: I am the primary author of pdfrw.)
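If the pages should be scaled to fill the common page rather than only padded, the scale factor and offsets can be worked out from the page sizes alone. The following is a minimal sketch of that arithmetic in plain Python. `fit_and_center` is a hypothetical helper, not part of pdfrw, and how its result would be applied to a pdfrw page is not shown here.

```python
def fit_and_center(page_w, page_h, box_w, box_h):
    """Return (scale, x, y) that fits a page inside a box and centers it.

    The aspect ratio is preserved, so the page touches the box on at
    least one axis and is centered on the other.
    """
    scale = min(box_w / page_w, box_h / page_h)
    x = (box_w - page_w * scale) / 2
    y = (box_h - page_h * scale) / 2
    return scale, x, y


# Page sizes from the question, converted to points (72 points per inch).
hw1 = (5.43 * 72, 3.26 * 72)
hw6 = (5.43 * 72, 6.51 * 72)

# The common page is as wide as the widest page and as tall as the tallest.
box = (max(hw1[0], hw6[0]), max(hw1[1], hw6[1]))

for name, (w, h) in (("hw1", hw1), ("hw6", hw6)):
    scale, x, y = fit_and_center(w, h, *box)
    print(f"{name}: scale={scale:.3f} x={x:.2f} y={y:.2f}")
```

Because both pages in the question have the same width, the scale factor comes out as 1 for each of them: hw1 can be padded to the taller size, but it cannot grow without being stretched or cropped.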
The reason (in the first example) that one of the pages is rotated is that it fits better that way round. Since Ghostscript is primarily intended as printing software, it assumes that you want to print the input. If the output goes to a fixed media size, page fitting is requested, and the page fits the requested media size better (i.e., with less scaling) when rotated, then the content will be rotated.
To prevent this, you will need to rewrite the FitPage procedure, which is defined in /ghostpdl/Resource/Init/pdf_main.ps. You can modify this procedure so that it does not rotate the page for a better fit.
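The rotation rule described above can be modelled roughly as follows. This is a simplified sketch, not Ghostscript's actual code: the function names are hypothetical, and "less scaling" is assumed to mean a scale factor closer to 1.

```python
def fit_scale(page_w, page_h, media_w, media_h):
    """Largest uniform scale at which the page fits inside the media."""
    return min(media_w / page_w, media_h / page_h)


def choose_orientation(page_w, page_h, media_w, media_h):
    """Return (rotate, scale), rotating only when that needs less scaling.

    "Less scaling" is modelled as a scale factor closer to 1; this is an
    assumption about the rule, not a copy of Ghostscript's logic.
    """
    upright = fit_scale(page_w, page_h, media_w, media_h)
    rotated = fit_scale(page_h, page_w, media_w, media_h)
    if abs(rotated - 1) < abs(upright - 1):
        return True, rotated
    return False, upright


# Media from the first command: -g2347x3909 pixels at -r720, in points.
media = (2347 / 720 * 72, 3909 / 720 * 72)

# Page sizes from the question, in points.
pages = {"hw1": (5.43 * 72, 3.26 * 72), "hw6": (5.43 * 72, 6.51 * 72)}

for name, (w, h) in pages.items():
    rotate, scale = choose_orientation(w, h, *media)
    print(f"{name}: rotate={rotate} scale={scale:.3f}")
```

Under this model hw1 is rotated and hw6 is not, which is consistent with only one of the two pages being turned in the first attempt.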
In the second case, you have not fixed the media size (-g does this implicitly; -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS do not), so the media size requests in the PDF files override the media size you specified on the command line. This is why the pages are not resized. Since the media then has the size requested by the PDF file, the page fits unchanged, so -dPDFFitPage does nothing. Therefore, you need to fix the media size if you are using -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS together with any of the FitPage switches.
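A sketch of the second attempt with the media size fixed might look like this. It assumes that Ghostscript's `-dFIXEDMEDIA` switch is what fixes the media size here; `build_gs_command` is a hypothetical helper, and the command is only built and printed, not executed.

```python
def build_gs_command(inputs, output, width_pt, height_pt):
    """Build a Ghostscript argument list for fixed-size, fitted output.

    -dFIXEDMEDIA is assumed to be the switch that stops the page sizes
    requested by the PDF files from overriding the command-line size.
    """
    return [
        "gs",
        "-o", output,
        "-sDEVICE=pdfwrite",
        f"-dDEVICEWIDTHPOINTS={width_pt}",
        f"-dDEVICEHEIGHTPOINTS={height_pt}",
        "-dFIXEDMEDIA",
        "-dPDFFitPage",
        *inputs,
    ]


# Sizes and file names are the ones used in the question.
cmd = build_gs_command(["hw1.pdf", "hw6.pdf"], "homeworks.pdf", 674, 912)
print(" ".join(cmd))

# To run it (requires Ghostscript on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids the shell quoting issues of a single command string. Note that page fitting may still rotate a page when it fits the fixed media better that way.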
You are better advised (as in your second attempt) to use -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS for setting the media size, since they do not depend on the resolution (unlike -g), which can be overridden by a PostScript input program. You shouldn't tamper with the resolution without a good reason, so don't set -r720.
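The difference between the two ways of giving the size can be seen from the arithmetic. This is a small illustration with hypothetical helper names; rounding the point sizes to whole numbers is a choice made here for simplicity.

```python
import math

POINTS_PER_INCH = 72


def size_in_points(width_in, height_in):
    """Media size in points; independent of the device resolution."""
    return (round(width_in * POINTS_PER_INCH),
            round(height_in * POINTS_PER_INCH))


def size_in_pixels(width_in, height_in, dpi):
    """Media size in device pixels, as -g expects; changes with dpi."""
    return math.floor(width_in * dpi), math.floor(height_in * dpi)


w_in, h_in = 5.43, 6.51  # the larger page from the question, in inches

print("points:", size_in_points(w_in, h_in))
for dpi in (72, 720):
    print(f"pixels at {dpi} dpi:", size_in_pixels(w_in, h_in, dpi))
```

The point size stays the same whatever the resolution is, whereas a pixel size only describes the intended page if the resolution it was computed for is the one actually in effect.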
Keep in mind that this process is not "merging", "combining", or anything else that implies the content of the input is unchanged in the output. Ghostscript interprets the input and writes a new PDF from it. You should read the documentation on the subject and understand the process before attempting to use this procedure.