XPath for a URL in import.io

I have a list of job offers from this site: http://telekom.jobs/global-careers

I'm trying to get an XPath for the "more information" link to work.

Here is the entire XPath for the first link:

/html/body/div[3]/div/div[2]/div[3]/table/tbody/tr[2]/td/div/a/@href

      

and this is what I have to insert in import.io:

tr[2]/td/div/a/@href

      

But it doesn't work, and I don't know why.

The "more information" links for the job offers have these XPaths:

tr[2]/td/div/a/@href
tr[4]/td/div/a/@href
tr[6]/td/div/a/@href
tr[8]/td/div/a/@href

      

and so on. Is that why it doesn't work, because the row numbers are 2, 4, 6 rather than 1, 2, 3? Or am I doing something wrong?



1 answer


If you build the API from URL 2.0 and reload the website with JS on but CSS off, you can see the collapsed menu.

The DOM on this website is structured so that the odd table rows hold the job titles, while the additional information about each job is hidden in the even rows. The XPath position() function handles this, so you can use the following XPath to train the rows manually:

/html/body/div[3]/div/div[2]/div[3]/table/tbody/tr[position() mod 2 = 0]

      

This highlights the "more information" rows and gives you access to the data inside them. From there, you can target the specific elements and attributes that hold the title and the link, using paths relative to each row.



Link XPath: .//a[@class='forward jobadview']/@href

Title XPath: .//div[@class='info']//h3
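
To see how the even-row selection and the two relative paths fit together, here is a minimal sketch in Python. It is not how import.io works internally. The sample markup is a hypothetical, heavily simplified stand-in for the real job table, and `extract_jobs` is an illustrative name. Python's standard-library ElementTree supports only a subset of XPath and has no position() function, so the `tr[position() mod 2 = 0]` filter is emulated with list slicing, and the href is read with `.get()` instead of `/@href`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the job table on the page.
# Odd rows carry the job title; even rows carry the hidden details.
SAMPLE = """
<table>
  <tbody>
    <tr><td>Job A</td></tr>
    <tr><td><div class="info"><h3>Job A details</h3>
        <a class="forward jobadview" href="/job/a">More</a></div></td></tr>
    <tr><td>Job B</td></tr>
    <tr><td><div class="info"><h3>Job B details</h3>
        <a class="forward jobadview" href="/job/b">More</a></div></td></tr>
  </tbody>
</table>
"""


def extract_jobs(markup):
    """Return (title, href) pairs taken from the even rows of the table."""
    root = ET.fromstring(markup)
    rows = root.findall("./tbody/tr")
    # XPath positions are 1-based, so tr[position() mod 2 = 0] means
    # rows 2, 4, 6, ... In a 0-based Python list those are indexes 1, 3, 5, ...
    detail_rows = rows[1::2]
    jobs = []
    for row in detail_rows:
        # Relative paths, evaluated inside each selected row.
        link = row.find(".//a[@class='forward jobadview']")
        title = row.find(".//div[@class='info']//h3")
        jobs.append((title.text, link.get("href")))
    return jobs


print(extract_jobs(SAMPLE))
```

With a full XPath 1.0 engine, the slicing step would be replaced by the `tr[position() mod 2 = 0]` expression itself.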

That said, because the website relies heavily on JS, it may not publish, so we created an API for it, and you can get the same data here:

https://import.io/data/mine/?id=0626d49d-5233-469d-9429-707f73f1757a
