Waiting for the website to fully load with WebKitGTK+

Possible duplicate:
Webkit GTK: determine when the document will be loaded

I want to get the HTML content of a website using WebKitGTK+ in order to automatically handle JavaScript redirects.

I am using the following Python code:

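import gtk
import webkit

def scanURL(domain, retries=3):
    browser = webkit.WebView()
    browser.open('http://' + domain)
    # Load status values: 0 provisional, 1 committed, 2 finished,
    # 3 first visually non-empty layout, 4 failed.
    while browser.get_load_status() < 2:
        # Pump the GTK main loop so the load can make progress while waiting.
        gtk.main_iteration()

    # Retry a failed load until the retries are used up.
    if browser.get_load_status() == 4:
        if retries > 0:
            return scanURL(domain, retries - 1)
        return 'Failed'

    return 'Success'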

      

The website loads fine, but some websites redirect the page elsewhere after it has loaded. I tried connecting the load-finished signal to a function, and the function was called twice.

Is there a way to find out when WebKit has fully loaded a web page?

How can I tell if WebKit is executing JavaScript?



1 answer


There is no reliable way to do this programmatically for every website. On some pages the redirect is started from JavaScript, often by a setTimeout that fires after n seconds, and there is no built-in way to detect such "quirks". However, if you are parsing a known group of websites where you are sure such redirects occur, you can keep a list of those URLs together with the number of seconds after which each redirect happens. After the first loadFinished fires, start a QTimer for that delay and connect it to a function that runs your loadFinished handling again; by then the next page will have started loading. Repeat until a page finishes loading and no further redirect follows.
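The waiting logic described above can be sketched independently of the toolkit. This is only a rough illustration, not a definitive implementation: SettleTracker, REDIRECT_DELAYS, DEFAULT_DELAY and the example.com entry are all made-up names and values, and the times are passed in by hand so the sketch runs without WebKit.

```python
# Hypothetical per-site table: seconds to wait for a scripted redirect.
# The host and the delays are assumed values, for illustration only.
REDIRECT_DELAYS = {"example.com": 5.0}
DEFAULT_DELAY = 1.0


class SettleTracker(object):
    """Tracks load events and reports when no redirect followed a load."""

    def __init__(self, host, delays=REDIRECT_DELAYS, default=DEFAULT_DELAY):
        self.grace = delays.get(host, default)
        self.finished_at = None  # time of the most recent finished load
        self.loads = 0           # number of finished loads seen so far

    def on_load_started(self):
        # A new load (for example a redirect) cancels the pending wait.
        self.finished_at = None

    def on_load_finished(self, now):
        self.loads += 1
        self.finished_at = now

    def is_settled(self, now):
        # Settled only if a load finished and the grace period passed quietly.
        return (self.finished_at is not None
                and now - self.finished_at >= self.grace)


tracker = SettleTracker("example.com")
tracker.on_load_finished(now=0.0)      # first page finished loading
early = tracker.is_settled(now=2.0)    # still inside the 5 second window
tracker.on_load_started()              # the scripted redirect kicks in
tracker.on_load_finished(now=4.0)      # redirected page finished loading
late = tracker.is_settled(now=9.5)     # grace period passed with no new load
print(early, late, tracker.loads)
```

In a real program the two on_load_* methods would presumably be called from the view's load signals, and is_settled would be polled from a timer (a QTimer in Qt, or a GLib timeout in GTK). That wiring is left out here because it depends on the binding in use.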


