Print a password-protected page to PDF (ideally with wkhtmltopdf using a cookie)
I'm trying to print a password-protected page to PDF with wkhtmltopdf, but I can't get it to load the (working) cookie, so I always just print the login page.
Saving cookie after login
The following code works as expected: if I log in, I can view the correct pages, whether I loaded from a cookie or entered my login details:
    import getpass
    import os

    import mechanize

    class PrintPages(object):
        def __init__(self):
            ...
            self.browser = mechanize.Browser()
            self.cj = mechanize.MozillaCookieJar()
            self.browser.set_cookiejar(self.cj)
            self.login("cookies.txt")

        def login(self, cookie_jar):
            """ Log in, save cookie if doesn't exist. Otherwise, load cookie. """
            if os.path.isfile(cookie_jar):
                self.cj.load(cookie_jar, ignore_discard=True, ignore_expires=True)
            else:
                self.browser.open(self.login_url)
                self.browser.select_form(name="loginform")
                self.browser["username"] = self.username
                self.browser["password"] = getpass.getpass()
                self.browser.submit()
                self.cj.save(cookie_jar, ignore_discard=True, ignore_expires=True)
(cookies.txt)
# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This is a generated file! Do not edit.
sub.example.com FALSE / TRUE JSESSIONID B8307A77925DB287B0346C728BBF8F24
However, when I tell either wget or wkhtmltopdf to load the cookies, I get the login page.
    $ wget -p --load-cookies cookies.txt sub.example.com/page.html
    $ wkhtmltopdf --cookie-jar cookies.txt sub.example.com/page.html page.pdf
What gives? Any solution that lets me print to PDF would be ideal, but I'm also curious what is going on here.
I use:
- wkhtmltopdf: version 0.9.9
- mechanize: version 0.2.5
I do not have a solution to your specific cookie problem, but this is how we print PDFs of pages that require authentication:
- Create a separate view that does not require authentication.
- Create a one-time token for the PDF to be generated.
- In the unauthenticated view, check that the token is correct and hasn't been used yet. If the token is valid, return the HTML to convert to PDF.
- If your view needs to know which user is requesting the PDF (in order to customize the page in some way), you can store the user id along with the token in the database.
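The steps above could be sketched roughly as follows. This is only an illustration, not our actual code: the function names (`issue_token`, `redeem_token`, `pdf_source_view`), the in-memory dict standing in for the database, and the 60-second token lifetime are all assumptions made for the example.

```python
import secrets
import time

# In-memory stand-in for the database table that stores tokens (hypothetical).
# Maps token -> (user_id, expiry timestamp).
_tokens = {}

# Assumed token lifetime; the steps above do not specify one.
TOKEN_TTL_SECONDS = 60


def issue_token(user_id):
    """Create a one-time token tied to the user requesting the PDF."""
    token = secrets.token_urlsafe(32)
    _tokens[token] = (user_id, time.time() + TOKEN_TTL_SECONDS)
    return token


def redeem_token(token):
    """Return the user id for a valid, unused token, otherwise None.

    Popping the entry is what makes the token single-use.
    """
    entry = _tokens.pop(token, None)
    if entry is None:
        return None
    user_id, expires_at = entry
    if time.time() > expires_at:
        return None
    return user_id


def pdf_source_view(token):
    """Unauthenticated view: return (status, body) for the PDF converter."""
    user_id = redeem_token(token)
    if user_id is None:
        return 403, "Forbidden"
    # The stored user id lets the page be customized per user.
    return 200, "<html><body>Report for user %s</body></html>" % user_id
```

The authenticated side would call `issue_token` for the logged-in user and then point the converter at the unauthenticated view with the token in the URL, so no cookie has to be passed to wkhtmltopdf at all.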
We are looking for a better way to do this, but it has worked for us so far.
Hope it helps.