Python and mechanize.open()
I have some code that uses mechanize against a password-protected site. I can log in just fine and get the expected results. However, once I am logged in, I don't want to "click" links; I want to iterate over a list of URLs. Unfortunately, every call to .open() just gets redirected to the login page, which is what I would expect if I had logged out or tried to log in from another browser. This makes me think it is a cookie-handling problem, but I'm at a loss.
def main():
    browser = mechanize.Browser()
    browser.set_handle_robots(False)
    # The code below works perfectly
    page_stats = login_to_BOE(browser)
    print page_stats
    # This code ALWAYS gets the login page again, NOT the desired
    # behaviour of getting the new URL. This is the behaviour I would
    # expect if I had logged out of our site.
    for page in PAGES:
        print '%s%s' % (SITE, page)
        page = browser.open('%s%s' % (SITE, page))
        page_stats = get_page_statistics(page.get_data())
        print page_stats
This is not an answer, but it may point you in the right direction. Try enabling mechanize's extensive debugging facilities using some combination of the statements below:
browser.set_debug_redirects(True)
browser.set_debug_responses(True)
browser.set_debug_http(True)
This will provide a stream of HTTP information, which I found very useful when I developed my one and only mechanize-based application.
I should note that my application does little (if anything) differently from what you showed in your question. I create a browser object in the same way and then pass it to this login function:
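One caveat, as far as I recall from the mechanize documentation: the redirect and response debug output is sent through Python's logging module to a logger named "mechanize", so nothing may show up until that logger has a handler and a level. The sketch below is a minimal setup under that assumption; the helper name enable_mechanize_logging is made up for illustration and is not part of mechanize.

```python
import logging
import sys


def enable_mechanize_logging(level=logging.INFO):
    """Attach a stdout handler to the "mechanize" logger (hypothetical helper).

    Assumption: mechanize routes its redirect/response debug messages
    through the standard logging module under the logger name "mechanize".
    """
    logger = logging.getLogger("mechanize")
    logger.addHandler(logging.StreamHandler(sys.stdout))
    logger.setLevel(level)
    return logger


mechanize_logger = enable_mechanize_logging()
```

Call it once before creating the browser and turning on the set_debug_* options.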
def login(browser):
    browser.open(config.login_url)
    browser.select_form(nr=0)
    browser[config.username_field] = config.username
    browser[config.password_field] = config.password
    browser.submit()
    return browser
I can then open pages that require authentication using browser.open(url), and all cookie handling is done transparently and automatically for me.
Will,
Your suggestion pointed me in the right direction.
Every web browser I have ever used accepts the following:
http://www.foo.com//bar/baz/trool.html
Since I hate it when things are not concatenated correctly, my SITE variable was "http://www.foo.com/".
Also, all the other URLs were of the form "/bar/baz/trool.html".
My calls to open therefore ended up as .open('http://www.foo.com//bar/baz/trool.html'), and the mechanize browser evidently does not massage the URL the way a "real" browser does. Apache didn't like the URL.
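For anyone hitting the same thing, here is a minimal sketch of one way to avoid the doubled slash when building the URLs. The helper name join_url is made up for illustration, and SITE and PAGES simply reuse the example values above; it is plain string handling, not anything mechanize provides.

```python
def join_url(site, path):
    """Join a site root and a path with exactly one slash between them."""
    return site.rstrip('/') + '/' + path.lstrip('/')


# Example values taken from the post above.
SITE = 'http://www.foo.com/'
PAGES = ['/bar/baz/trool.html']

urls = [join_url(SITE, page) for page in PAGES]
print(urls[0])  # http://www.foo.com/bar/baz/trool.html
```

Each resulting URL can then be passed to browser.open() in place of the '%s%s' % (SITE, page) expression.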