Waiting for a table to fully load using Selenium with Python

I want to scrape some data from a page, and the data sits in a table, so I only care about the contents of that table. I've used Mechanize before, but sometimes I found data missing, especially at the bottom of the table. Googling, I found out that this might be because Mechanize doesn't handle jQuery/Ajax.

So today I switched to Selenium. How can I wait for one specific table to load, and then fetch all the links from that table, using Selenium and Python? Waiting for the full page to load takes a while; I only need the data in the table. My current code:

driver = webdriver.Firefox()
for page in range(1, 2):
    driver.get("http://somesite.com/page/" + str(page))
    table = driver.find_element_by_css_selector('div.datatable')
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print link.text


2 answers


Use WebDriverWait to wait until the table is located:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

...
wait = WebDriverWait(driver, 10)
table = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div.datatable')))

This is an explicit wait.
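
For example, here is a minimal sketch of your loop with the explicit wait added (reusing the placeholder URL and the div.datatable selector from your question):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
wait = WebDriverWait(driver, 10)  # give up after 10 seconds

for page in range(1, 2):
    driver.get("http://somesite.com/page/" + str(page))
    # blocks until div.datatable is present in the DOM, then returns it
    table = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div.datatable')))
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print(link.text)

driver.quit()

Note that presence_of_element_located only guarantees the container is in the DOM; if the rows are appended asynchronously afterwards, you may want to wait for the rows or links themselves instead.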




Alternatively, you can make the driver wait implicitly:

Implicit wait - tells WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait applies for the lifetime of the WebDriver instance.

from selenium import webdriver

driver = webdriver.Firefox()
driver.implicitly_wait(10) # wait up to 10 seconds while trying to locate elements
for page in range(1, 2):
    driver.get("http://somesite.com/page/"+str(page))
    table = driver.find_element_by_css_selector('div.datatable')
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print link.text

      



Perhaps you could use Selenium expected conditions (http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp), like:



>>> from selenium import webdriver
>>> from selenium.webdriver.common.by import By
>>> from selenium.webdriver.support.ui import WebDriverWait
>>> from selenium.webdriver.support import expected_conditions as EC 
>>> 
>>> ff = webdriver.Firefox()
>>> ff.get("http://www.datatables.net/examples/data_sources/js_array.html")
>>> try:
...     element = WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "example")))
...     print element.text
... finally:
...     ff.quit()
... 

Engine Browser Platform Version Grade
Gecko Firefox 1.0 Win 98+ / OSX.2+ 1.7 A
Gecko Firefox 1.5 Win 98+ / OSX.2+ 1.8 A
Gecko Firefox 2.0 Win 98+ / OSX.2+ 1.8 A
Gecko Firefox 3.0 Win 2k+ / OSX.3+ 1.9 A
Gecko Camino 1.0 OSX.2+ 1.8 A
Gecko Camino 1.5 OSX.3+ 1.8 A
Gecko Netscape 7.2 Win 95+ / Mac OS 8.6-9.2 1.7 A
Gecko Netscape Browser 8 Win 98SE+ 1.7 A
Gecko Netscape Navigator 9 Win 98+ / OSX.2+ 1.8 A
Gecko Mozilla 1.0 Win 95+ / OSX.1+ 1 A
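
If it is the links you are after rather than the table text, you could wait for the anchors themselves with presence_of_all_elements_located. Here is a sketch along the same lines, reusing the placeholder URL and div.datatable selector from the question and assuming that table actually contains <a> elements:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Firefox()
driver.get("http://somesite.com/page/1")
try:
    # waits until at least one matching link is present, then returns all of them
    links = WebDriverWait(driver, 10).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.datatable a")))
    for link in links:
        print(link.text)
finally:
    driver.quit()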

      
