Dynamically assign columns when crawling a product detail page?

I am brand new to import.io, so please be kind.

I am trying to crawl the product detail pages of an online jewelry store, and each page contains a feature list like this:

Functions

Gender  Men

Technical Style   Quartz

Material   Stainless steel

and so on.

Is it possible to train the crawler to dynamically use the text in bold as the column name and the text next to it as the column value? That is, the column "Gender" gets the value "Men", and so on. Note that on other product detail pages the feature list may not start with "Gender".

Thanks for any help!



2 answers


I haven't tried this, but I think either of these will work:



  • Build one column by highlighting all the bold text, and a second column with the matching values, using XPaths.
  • Build everything as a single row so that all features are always selected.
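
Outside import.io, the first idea (bold text as column names, the text after it as values) can be sketched in plain Python. This is only an illustration: the markup in `SAMPLE_HTML` is an assumption, since the real page's HTML is not shown in the question, and `FeatureParser` and `extract_features` are hypothetical names.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real page's HTML is not shown in the question.
SAMPLE_HTML = """
<ul>
  <li><b>Gender</b> Men</li>
  <li><b>Technical Style</b> Quartz</li>
  <li><b>Material</b> Stainless steel</li>
</ul>
"""


class FeatureParser(HTMLParser):
    """Collect {bold label: text that follows it} pairs."""

    def __init__(self):
        super().__init__()
        self.features = {}     # label -> value, in page order
        self._in_bold = False  # True while inside <b> or <strong>
        self._label = None     # last label seen, still waiting for a value

    def handle_starttag(self, tag, attrs):
        if tag in ("b", "strong"):
            self._in_bold = True
            self._label = ""

    def handle_endtag(self, tag):
        if tag in ("b", "strong"):
            self._in_bold = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # ignore whitespace between tags
        if self._in_bold:
            self._label += text
        elif self._label:
            # First non-bold text after a label becomes its value.
            self.features[self._label] = text
            self._label = None


def extract_features(html):
    parser = FeatureParser()
    parser.feed(html)
    return parser.features


features = extract_features(SAMPLE_HTML)
```

Because the column names come from the page itself, the order of the features and whether "Gender" appears at all do not matter.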


That's quite possible :)

  • Create a column and give it a name, e.g. Gender
  • Then click or highlight the data you want, e.g. Men

If that doesn't work for you, you can use an XPath instead.

How to do it:

Click on the data type next to the column name. In the image below it is the pink text labeled "Text", on the right-hand side of the left pane.

enter image description here

Then, when the "show advanced settings" option appears, click it.



enter image description here

Once there, you can add an "XPath Override" and enter an expression such as:

//*[text()="Gender"]/following-sibling::*

      

This tells import.io exactly where the data is, based on a set of rules that you can enter there.

enter image description here

This article will be of some help: http://support.import.io/knowledgebase/articles/368731-webinar-5-tips-and-tricks

The expression finds every element on the page whose text is exactly "Gender", then selects the sibling elements that follow it in the HTML and puts them in your column.
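
To see what that expression selects, here is a rough Python sketch of the same logic. The standard library's `xml.etree.ElementTree` does not support the `following-sibling` axis, so the function below emulates it by hand rather than running the XPath itself (a library with full XPath 1.0 support, such as lxml, could evaluate it directly). The markup in `SAMPLE` and the name `following_siblings` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup; the real page structure is an assumption.
SAMPLE = """
<table>
  <tr><th>Gender</th><td>Men</td></tr>
  <tr><th>Material</th><td>Stainless steel</td></tr>
</table>
"""


def following_siblings(root, label):
    """Emulate //*[text()="label"]/following-sibling::* by hand.

    Simplification: only an element's leading text is compared, and the
    match is exact (case- and whitespace-sensitive), like the XPath.
    """
    matches = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.text == label:
                # Every sibling after the label element, in document order.
                matches.extend(children[i + 1:])
    return matches


root = ET.fromstring(SAMPLE)
gender = [el.text for el in following_siblings(root, "Gender")]
material = [el.text for el in following_siblings(root, "Material")]
```

Note that the label must match exactly: a label rendered as "Gender:" or with surrounding spaces would not be found, which is worth checking if the override returns nothing.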
