Import.io - Using XPath to Display "More" Content

I am completely stumped and am turning to you for help!

I am using the Import.io crawler to fetch reviews from TripAdvisor. However, when I train the finder, the "more" button is grayed out.

Here is an example page: http://www.tripadvisor.co.uk/Hotel_Review-g295424-d306662-Reviews-Hilton_Dubai_Jumeirah_Resort-Dubai_Emirate_of_Dubai.html#REVIEWS

Here is the XPath for a full review: //*[@id="UR288083139"]/div[2]/div/div[3]

And the "More" button: //*[@id="review_288083139"]/div[1]/div[2]/div/div/div[3]/p/span

Is it possible to write an XPath so that the full review is captured by Import.io?

+3




2 answers


One way to do this is to use a Crawler and then an Extractor. This splits the process into two parts.

  • Create a Crawler that you train to capture the link to every review on a page. Make sure the column is set to capture the link.


  • Create an Extractor to get the full review from each link the Crawler collected.

  • Voila! You have all the reviews!
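Outside of Import.io, the same two-step idea can be sketched in plain Python. This is only an illustration under stated assumptions, not how Import.io works internally: `collect_review_links`, `extract_full_reviews`, and the `marker` substring used to recognise review links are all hypothetical names, and the `fetch` and `extract` callables are left to the caller.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ReviewLinkParser(HTMLParser):
    """Collects href values of <a> tags whose URL contains a marker string."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # hypothetical substring identifying review links
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep page order, drop duplicates
                self.links.append(url)


def collect_review_links(listing_html, base_url, marker):
    """Step 1 (the "crawler"): gather the link to every review on a page."""
    parser = ReviewLinkParser(base_url, marker)
    parser.feed(listing_html)
    return parser.links


def extract_full_reviews(links, fetch, extract):
    """Step 2 (the "extractor"): fetch each link and pull out the full text.

    fetch(url) returns the page HTML; extract(html) returns the review text.
    Both are supplied by the caller.
    """
    return {url: extract(fetch(url)) for url in links}
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client and makes it easy to try against saved pages first.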



Note: if you already have all the links for the pages you need reviews from, you are better off making an Extractor instead of a Crawler. That way you can chain its API to another Extractor. You only need a Crawler if you don't know all the links.

Hope this helps!

+1




It looks like the HTML for the full review is NOT on the page before this button is clicked, and there is no URL that has this data. Thus, you may be out of luck.



You can try playing around with the developer console to see if the full reviews are available in an XML file or at a dynamically loaded URL. I'm not sure how to do this.
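Before digging through the developer console, one quick check is whether the tail end of a full review already appears in the raw page source. A minimal sketch, assuming you have saved the page HTML to a string and copied a phrase that only shows up after clicking "More"; `text_in_static_html` is a hypothetical helper name, and the tag stripping is deliberately crude:

```python
import re
from html import unescape


def _normalize(text):
    """Collapse whitespace and lowercase so formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def text_in_static_html(raw_html, expected_snippet):
    """Return True if expected_snippet appears in the visible text of raw_html.

    Tags are removed with a simple regex, which is good enough for a quick
    check but is not a real HTML parser.
    """
    without_tags = re.sub(r"<[^>]+>", " ", raw_html)
    visible = _normalize(unescape(without_tags))
    return _normalize(expected_snippet) in visible
```

If this returns False for a phrase from the hidden part of a review, the text is most likely loaded by a separate request after the click, which is what the answer above suspects.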

0








