wkhtmltopdf segfaults on startup when run from Python

I need to run wkhtmltopdf from Python using subprocess.call(...). From the command line I can generate a PDF without issue, but when it is run from Python it fails with a segfault.

I don't know what causes wkhtmltopdf to segfault.

I even tried passing my terminal's environment variables via env, but it segfaults anyway. I also passed stderr, stdin, and stdout, but nothing helps. What puzzles me is that it works from the terminal but not from Python.

Launching it indirectly through another process makes it segfault too. For example, I put a wrapper script between the application and the binary, and that Python script also gets a segfault from wkhtmltopdf:

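#!/bin/env python
import subprocess
import sys
import pdb
import os

# Wrapper installed in place of wkhtmltopdf; forward the call to the original binary.
sys.argv[0] = "/usr/local/bin/wkhtmltopdf.b"

# Drop --quiet (if present) so that the progress output is visible.
if '--quiet' in sys.argv:
    sys.argv.remove('--quiet')

# The post does not show how env was built; a copy of the current
# environment is assumed here.
env = os.environ.copy()

status = subprocess.call(sys.argv,
    env=env,
    stdin=sys.stdin,
    stdout=open("/tmp/stdout.w", "w"),
    stderr=open("/tmp/stderr.w", "w"))

cmd = " ".join(sys.argv)

# Pause here so the same command can be run by hand in another terminal.
pdb.set_trace()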

      

I pause in pdb so that I have time to run the same command in an external terminal, because OpenERP checks the contents of the PDF file. wkhtmltopdf.b is the original binary. I removed the --quiet parameter because I wanted to see what was going on.

This is the point at which it appears to segfault:

Loading pages (1/6)
[======>                                                     ] 10%

      

And nothing else.

My wkhtmltopdf is the static amd64 build from the wkhtmltopdf.org website:

$ wkhtmltopdf -V
wkhtmltopdf 0.12.1 (with patched qt)

      

I am running one of the Ubuntu amd64 binary packages on my Gentoo box. Compiling wkhtmltopdf with the patched Qt on Gentoo is difficult and slow, and it apparently is not supported by default. However, since it works from the command line, it should also work from Python.

I run it from zsh, but even if my Python program calls something like this:

'/bin/sh -c "%s"' % command

it also segfaults.
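For anyone debugging something similar, it can help to confirm from Python itself that the child really died from a signal rather than exiting with an error status. On POSIX, subprocess reports a child killed by signal N as return code -N, so a segfault normally shows up as -11. The following is only a minimal sketch: describe_exit and run_and_report are hypothetical helper names, and the command in the demo is a stand-in for the real wkhtmltopdf command line.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal %d" % -returncode
        return "killed by %s" % name
    return "exited with status %d" % returncode


def run_and_report(argv):
    """Run argv, capture stdout and stderr together, and describe how it ended."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return describe_exit(proc.returncode), proc.stdout


if __name__ == "__main__":
    # Stand-in child process; replace with the real wkhtmltopdf command line.
    summary, output = run_and_report([sys.executable, "-c", "print('hello')"])
    print(summary)
```

If the summary names SIGSEGV for the Python-launched run but the same argument list works from a shell, the difference is more likely in the process environment or resource limits than in the arguments.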



2 answers


I had the same problem, although through a different stack (Apache and PHP), and I'm not 100% sure how you start your Python process. Anyway, it crashed in exactly the same place as yours and worked fine from the command line, so I thought it might be worth sharing in case it helps anyone ;)

I found out that the problem was that ulimit was set differently when running through Apache than in the shell. In particular, my "virtual memory" limit (ulimit -v) was pretty low. I ended up doing $cmd = "ulimit -v 1073741824; {$this->wkhtmltopdf_path} ...." and that solved my problem! (You can check by running ulimit -a from that context and comparing the values with the same command in the shell.)
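Since the question is about Python rather than PHP, the same check can be sketched with the standard resource module. This is an illustration under assumptions, not a confirmed fix for the question: describe_limits, lift_address_space_limit, and run_with_lifted_limit are hypothetical names, and the sketch only raises the soft address-space limit up to the existing hard limit, so it cannot help if the hard limit itself is too low. Note that resource.RLIMIT_AS is expressed in bytes, whereas ulimit -v in bash counts kilobytes.

```python
import resource
import subprocess
import sys


def describe_limits():
    """Return {name: (soft, hard)} for a few limits that often differ between contexts."""
    names = ["RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK", "RLIMIT_NOFILE"]
    return {name: resource.getrlimit(getattr(resource, name)) for name in names}


def lift_address_space_limit():
    """Runs in the child just before exec: raise the soft RLIMIT_AS to the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (hard, hard))


def run_with_lifted_limit(argv):
    """Run argv with the soft virtual-memory limit raised as far as allowed."""
    return subprocess.call(argv, preexec_fn=lift_address_space_limit)


if __name__ == "__main__":
    # Print these from the failing context and from a terminal, then compare.
    for name, (soft, hard) in sorted(describe_limits().items()):
        print(name, soft, hard)
```

A value of -1 (resource.RLIM_INFINITY) means unlimited. Comparing the output of this script when started by the application with the output when started from a terminal plays the same role as comparing ulimit -a in both places.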



Try passing your HTML string through stdin. Here's an example that renders a template, converts it, and returns the PDF as a download response.

from subprocess import Popen, PIPE, STDOUT
from django.core.files.temp import NamedTemporaryFile
from django.template.loader import render_to_string
from django.http import HttpResponse

def print_view(request):  # hypothetical view name; build `context` as your template needs
    context = {}
    tmp = NamedTemporaryFile()
    html = render_to_string('your-template.html', context)
    # '-' makes wkhtmltopdf read the HTML from stdin; the PDF is written to tmp.name
    p = Popen(['wkhtmltopdf', '-', tmp.name], stdout=PIPE, stdin=PIPE, stderr=STDOUT)
    out, err = p.communicate(input=(html + u'\n').encode('utf-8'))
    # stderr is merged into stdout, so err is None; check 'out' and p.returncode for errors
    with open(tmp.name, 'rb') as pdf:  # a PDF is binary data
        pdfcontent = pdf.read()
    response = HttpResponse(pdfcontent, content_type='application/pdf')
    response['Content-Disposition'] = 'attachment; filename=print.pdf'
    response['Content-Length'] = len(pdfcontent)
    return response

      



You will need to use full (absolute) URLs for static files in your templates; otherwise wkhtmltopdf will not be able to find the static CSS and JS files.
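If editing every template is not practical, one rough alternative is to rewrite relative href and src attributes against a base URL before piping the HTML to wkhtmltopdf. The sketch below is an assumption-laden illustration: absolutize_urls is a hypothetical helper, the base URL is a placeholder, and the regular expression only handles quoted href/src attributes, so an HTML parser would be more robust for real pages.

```python
import re
from urllib.parse import urljoin

# Matches quoted href="..." or src="..." attributes (single or double quotes).
_URL_ATTR = re.compile(r'''(\b(?:href|src)=)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_urls(html, base_url):
    """Return html with relative href/src values resolved against base_url."""
    def fix(match):
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        return prefix + quote + urljoin(base_url, url) + quote
    return _URL_ATTR.sub(fix, html)


if __name__ == "__main__":
    page = '<link rel="stylesheet" href="/static/css/print.css">'
    # http://example.com/ is a placeholder for the site's real base URL.
    print(absolutize_urls(page, "http://example.com/reports/"))
```

URLs that are already absolute pass through unchanged, because urljoin returns the second argument as-is when it carries its own scheme and host.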
