Lxml xpath does not find anchor text

Question

I have two XPath expressions, and only one of them pulls the job titles correctly from the URL below. Any idea why xpath1 (which I got from Chrome's "Inspect element" / "Copy XPath" feature) doesn't work while xpath2 does?

import requests
from lxml import html

url = 'http://www.mynextmove.org/find/browse?c=54'

xpath1 = '//*[@id="content"]/table[1]/tbody/tr/td[1]/a/text()'
xpath2 = '//a[contains(@href, "profile")]/text()'

page = requests.get(url)
tree = html.fromstring(page.text)

jobs = tree.xpath(xpath2)
print('jobs:', jobs)

xpath1 returns [], an empty list.

xpath2 returns ['Anthropologists', 'Archaeologists', ...]

python xpath web-scraping lxml

offwhitelotus 21 May '15 at 18:00

1 answer

salparadise · Accepted Answer · 2015-05-21T18:16:30+0000

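There is no `tbody` element in the page source. Chrome adds `tbody` to the DOM when it parses a table, so the XPath it copies includes that step, but lxml works on the raw HTML returned by the server, where the rows sit directly under `table`.

Try dropping `tbody` from the expression:

`xpath1 = '//*[@id="content"]/table[1]/tr/td[1]/a/text()'`

This is what I get with that change: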

In [31]: tree.xpath(xpath1)
Out[31]:
['Anthropologists',
 'Archeologists',
 'Architects',
 'Architectural Drafters',
 'Biochemists & Biophysicists',
 'Civil Drafters',
 'Civil Engineers',
 'Environmental Engineering Technicians',
 'Environmental Engineers',
 'Geodetic Surveyors',
 'Lawyers',
 'Legal Secretaries',
 'Mapping Technicians',
 'Marine Architects',
 'Marine Engineers',
 'Paralegals & Legal Assistants',
 'Survey Researchers',
 'Surveying Technicians',
 'Surveyors',
 'Tax Preparers',
 'Transportation Engineers',
 'Veterinarians',
 'Veterinary Assistants & Laboratory Animal Caretakers',
 'Veterinary Technologists & Technicians',
 'Water/Wastewater Engineers']
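
To see the effect without hitting the network, here is a minimal sketch that runs the same expressions against a small inline table. The markup in `RAW_HTML` is made up for illustration (it is not the real page), and the `//tr` variant is a suggested alternative that was not in the original answer; it matches rows whether or not a `tbody` is present.

```python
from lxml import html

# Hypothetical markup standing in for the server's raw HTML.
# Note that the rows sit directly under <table>, with no <tbody>.
RAW_HTML = """
<div id="content">
  <table>
    <tr><td><a href="/profile/summary/1">Anthropologists</a></td><td>x</td></tr>
    <tr><td><a href="/profile/summary/2">Architects</a></td><td>y</td></tr>
  </table>
</div>
"""

tree = html.fromstring(RAW_HTML)

# The expression copied from Chrome: it expects a tbody step that lxml never sees.
with_tbody = tree.xpath('//*[@id="content"]/table[1]/tbody/tr/td[1]/a/text()')

# The corrected expression from the answer.
without_tbody = tree.xpath('//*[@id="content"]/table[1]/tr/td[1]/a/text()')

# A variant using the descendant axis, so it works with or without tbody.
either = tree.xpath('//*[@id="content"]/table[1]//tr/td[1]/a/text()')

print('with tbody:   ', with_tbody)
print('without tbody:', without_tbody)
print('descendant:   ', either)
```

The descendant form is handy when the same expression has to work both in the browser console and in lxml, though it would also pick up rows from any table nested inside the first one.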
