XPath expression

I am new to XPath. I have the HTML source of this web page:

http://london.craigslist.co.uk/com/1233708939.html 

      

I want to extract the following data from the above page:

  • Full date
  • Email - just below the date

I also want to detect whether a "Reply to this message" button is present on this page:

http://sfbay.craigslist.org/sfc/w4w/1391399758.html

      

Can anyone help me write three XPath expressions, one for each of these three pieces of data?

+2




3 answers


You don't need to write them yourself or even work them out yourself. If you are using the Firebug plugin, go to the page, right-click the desired element, and click Inspect Element; Firebug will open the HTML in a viewer at the bottom of the browser. Right-click the desired element in the HTML viewer and click Copy XPath.

However, here is the XPath expression you're looking for (for #3):

/html/body/div[4]/form/button

... obtained with the above method.
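
As a rough sketch of how such a copied path could be checked outside the browser, the snippet below evaluates it with Python's standard-library ElementTree. The SAMPLE markup is a hypothetical, well-formed stand-in (the real page's structure may differ), and ElementTree only supports a subset of XPath with paths relative to the element you search from.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page; the real markup may differ.
SAMPLE = """
<html>
  <body>
    <div>header</div>
    <div>date</div>
    <div>body text</div>
    <div>
      <form action="/reply">
        <button>Reply to this message</button>
      </form>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# ElementTree paths are relative to the starting element, so the absolute
# /html/body/div[4]/form/button is written as ./body/div[4]/form/button here.
buttons = root.findall("./body/div[4]/form/button")

# The button is "present" if the path matched at least one element.
has_reply_button = len(buttons) > 0
print(has_reply_button)
```

Note that a positional path like div[4] depends on the exact page layout, so it may stop matching if another div is added or removed above the form.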

+5




I noticed that the DTD for the first link is HTML 4.01 Transitional, not XHTML, so there is no guarantee that it is a valid XML document, and an XML parser might not load it properly. In fact, I see several tags that are not properly closed (e.g. <hr>).
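
To illustrate the point, here is a small sketch (the HTML fragment is made up, not taken from the page) showing a strict XML parser rejecting an unclosed <hr> while a lenient HTML parser accepts it. It uses only the Python standard library; third-party libraries such as lxml also provide HTML parsers whose results can be queried with XPath.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical HTML 4.01-style fragment with an unclosed <hr>.
HTML = "<html><body><hr><p>some text</p></body></html>"

# A strict XML parser fails because <hr> is never closed.
try:
    ET.fromstring(HTML)
    xml_ok = True
except ET.ParseError:
    xml_ok = False


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


# A lenient HTML parser walks the same input without complaint.
collector = TagCollector()
collector.feed(HTML)
collector.close()

print(xml_ok, collector.tags)
```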



I don't know the first one offhand. The third one was answered simply by Alex, and the second one is /html/body/a[1].
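
One detail worth keeping in mind: XPath counts positions from 1, so the first a element under body is a[1], and in XPath 1.0 a[0] matches nothing. A minimal sketch with the standard-library ElementTree, using invented markup and example.com addresses as placeholders:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: two links directly under <body>.
SAMPLE = """
<html>
  <body>
    <a href="mailto:first@example.com">first@example.com</a>
    <a href="mailto:second@example.com">second@example.com</a>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)  # root is the <html> element

# Positions are 1-based: a[1] is the first link, a[2] the second.
first_link = root.find("./body/a[1]")
second_link = root.find("./body/a[2]")

print(first_link.text)
print(second_link.text)
```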

+4




For your first page, this is simply not possible, because that is not how XPath works: for an XPath expression to select something, that "something" must be a node (e.g. an element).
The second page is pretty simple, but you need an "id" attribute (or anything else that makes your button unique). For example, if you are sure that the text "Reply to this message" correctly identifies the button, simply match on it:
//button[text()="Reply to this message"]
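
A short sketch of matching a button by its label, assuming the label text and markup below (both are invented for illustration). In full XPath 1.0 the comparison would be written as //button[text()="..."]; a bare string literal inside the predicate is treated as true for every button, so it would not narrow the match. ElementTree's limited XPath subset spells the text comparison as [.='...'] (Python 3.7+).

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with two buttons; only one carries the label we want.
SAMPLE = """
<html>
  <body>
    <form><button>Search</button></form>
    <form><button>Reply to this message</button></form>
  </body>
</html>
"""

LABEL = "Reply to this message"  # assumed label text

root = ET.fromstring(SAMPLE)

# Every button anywhere in the document.
all_buttons = root.findall(".//button")

# Only buttons whose complete text equals the label.
matches = root.findall(f".//button[.='{LABEL}']")

print(len(all_buttons), len(matches))
```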

+1

