XPath for a URL in import.io
I am extracting a list of job offers from this site: http://telekom.jobs/global-careers
I'm trying to get an XPath for the "more information" link to work.
Here is the entire XPath for the first link:
/html/body/div[3]/div/div[2]/div[3]/table/tbody/tr[2]/td/div/a/@href
and this is what I have to insert in import.io:
tr[2]/td/div/a/@href
But it doesn't work, and I don't know why.
The links to more information on the job offer pages have these XPaths:
tr[2]/td/div/a/@href
tr[4]/td/div/a/@href
tr[6]/td/div/a/@href
tr[8]/td/div/a/@href
and so on. Could that be why it doesn't work, because the row numbers are 2, 4, 6 rather than 1, 2, 3? Or am I doing something wrong?
If you build the API from URL 2.0 and reload the website with JS on but CSS off, you can see the collapsible menu expanded.
The DOM on this website is structured so that all odd rows contain the job titles, while more information about each job is hidden in the even rows. We can use the XPath position() function for this, so you can use the following XPath to train the rows manually:
/html/body/div[3]/div/div[2]/div[3]/table/tbody/tr[position() mod 2 = 0]
This highlights the "more information" rows and gives you access to the data inside them. From here, you can target the specific elements that hold the title and the link, using XPaths relative to each row.
Link XPath: .//a[@class='forward jobadview']/@href
Title XPath: .//div[@class='info']//h3
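To see what the row filter and the two relative XPaths above select, here is a minimal Python sketch. It is only an illustration under stated assumptions: the markup in SAMPLE is a hypothetical, simplified stand-in for the jobs table (not the real page), extract_jobs is a made-up helper name, and the standard library's ElementTree supports only a subset of XPath, so position() mod 2 = 0 is emulated by list slicing and the href is read with .get() instead of /@href.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the jobs table (not the real markup).
SAMPLE = """
<table>
  <tbody>
    <tr><td>Job A</td></tr>
    <tr><td><div class="info"><h3>Job A details</h3>
      <a class="forward jobadview" href="/job/a">More</a></div></td></tr>
    <tr><td>Job B</td></tr>
    <tr><td><div class="info"><h3>Job B details</h3>
      <a class="forward jobadview" href="/job/b">More</a></div></td></tr>
  </tbody>
</table>
"""


def extract_jobs(markup):
    """Return (title, href) pairs taken from the even rows of the table."""
    root = ET.fromstring(markup)
    rows = root.findall("./tbody/tr")
    # ElementTree has no position() function. rows[1::2] keeps the 2nd, 4th,
    # 6th, ... rows, the same rows tr[position() mod 2 = 0] matches.
    detail_rows = rows[1::2]
    jobs = []
    for row in detail_rows:
        # Relative lookups, evaluated against each detail row.
        link = row.find(".//a[@class='forward jobadview']")
        title = row.find(".//div[@class='info']//h3")
        if link is None or title is None:
            continue  # skip rows that do not have the expected structure
        jobs.append((title.text, link.get("href")))
    return jobs


if __name__ == "__main__":
    for title, href in extract_jobs(SAMPLE):
        print(title, href)
```

The point of the sketch is that the link and title expressions start with ./ and are evaluated per row, so they do not depend on whether the row index is 2, 4 or 6.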
That said, because the website relies heavily on JS, the API may not publish, so we created an API for this request and you can get the same data here:
https://import.io/data/mine/?id=0626d49d-5233-469d-9429-707f73f1757a