Accessing the Chrome DOM tree with Python
The best way I've found is to use selenium.webdriver:
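import lxml.html as lh
import lxml.html.clean as clean  # lxml 5.2+ ships this module separately as lxml_html_clean
from selenium import webdriver

browser = webdriver.Chrome()  # Start a local Chrome session
browser.get("http://www.webpage.com")  # Load the page
content = browser.page_source  # HTML source of the loaded page
browser.quit()  # Close the browser once the source is captured

cleaner = clean.Cleaner()  # Strips scripts, styles and similar markup
content = cleaner.clean_html(content)
doc = lh.fromstring(content)  # Parse the cleaned HTML into an element tree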
doc now holds the DOM as an lxml.html.HtmlElement.
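Once you have the HtmlElement, you can query it with XPath. Here is a minimal sketch of that step; the page_source string is a made-up stand-in for browser.page_source so the snippet runs without a browser, and the element ids and hrefs are hypothetical.

```python
import lxml.html as lh

# Hypothetical stand-in for browser.page_source (not from the original answer)
page_source = """
<html><body>
  <h1 id="title">Example</h1>
  <a href="/first">First</a>
  <a href="/second">Second</a>
</body></html>
"""

doc = lh.fromstring(page_source)

# xpath() returns a list of matching elements or attribute values
links = doc.xpath("//a/@href")

# text_content() collects the text inside the matched element
heading = doc.xpath("//h1[@id='title']")[0].text_content()

print(links)
print(heading)
```

The same calls should work on the doc built from a real Selenium session, as long as the cleaner has not removed the elements you are looking for.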
Have you tried the BeautifulSoup library? This section of the tutorial may answer your question: http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#The Processing Tree
You will also need to import the requests library.
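# BeautifulSoup 3 under Python 2, as in the linked documentation
from BeautifulSoup import BeautifulSoup
import requests

url = 'http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html'
page = requests.get(url)  # Fetch the raw HTML over HTTP
soup = BeautifulSoup(page.content)  # Parse it into a tree
print soup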
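If you are on Python 3, the same idea can be sketched with Beautiful Soup 4 (the bs4 package), which is the successor to the BeautifulSoup 3 module used above. The HTML string here is invented for illustration so that the snippet runs offline; in practice you would pass page.content from requests, or browser.page_source from Selenium if you need the page as Chrome rendered it.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page
html = (
    "<html><head><title>Demo</title></head>"
    "<body><p class='intro'>Hello</p><a href='/docs'>Docs</a></body></html>"
)

# "html.parser" is the parser bundled with Python, so no extra install is needed
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string                            # text of the <title> tag
hrefs = [a["href"] for a in soup.find_all("a")]      # every link target
intro = soup.find("p", class_="intro").get_text()    # first <p class="intro">

print(title, hrefs, intro)
```

Note that requests only returns the HTML the server sends, so content added later by JavaScript will not appear in the tree unless you take the source from a real browser session.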