Finding last line in node using XPath

Question

I was wondering if there is a way to always select the content of a node directly above a specific element.

I have the following code that I want to extract from:

<div id="someDiv">
   <h3>Name</h3>
   Some content1
   <br/>
   <br/>
   Address 12345
   <br/>
   09876 City, Country
   <br/>
   <span id="tel_number">12345</span>
</div>

Here is an XPath that selects everything above the span:

//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::node()

Now I need an XPath that always selects only the content right above the span (one line) and nothing else. It should also work if, for some reason, there is no <br/> above the span.

Hope someone can help with this!

ruby xpath

Severin 16 Aug 13 at 9:06

3 answers

Try:

(//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::text())[last()]

or, if you also want to strip the surrounding whitespace:

normalize-space((//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::text())[last()])

paul trmbrth 16 Aug '13 at 9:34 am

I want to get "09876 City, Country" stripped of any HTML tags

I think the following is what you are looking for:

//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::text()[1]

Using Nokogiri:

require 'nokogiri'

doc = Nokogiri::HTML::Document.parse <<-EOT
<div id="someDiv">
   <h3>Name</h3>
   Some content1
   <br/>
   <br/>
   Address 12345
   <br/>
   09876 City, Country
   <br/>
   <span id="tel_number">12345</span>
</div>
EOT

doc.xpath("normalize-space(//div[@id='someDiv']/span[@id='tel_number']/preceding-sibling::text()[1])").to_s
# => "09876 City, Country"

Arup Rakshit 16 Aug '13 at 9:26

Severin · Accepted Answer · 2013-08-16T10:58:48+0000

I found that the best way to get the zip code is as follows:

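# `page` is assumed to be a Mechanize page (or Nokogiri document).
# String#strip stands in for the non-standard `cleanup` helper used originally.
data = page.search('//div[@id="someDiv"]/span[@id="tel_number"]/preceding-sibling::node()').map { |node| node.text.strip }
data.delete("")  # drop entries that are empty after stripping (<br/> nodes, blank text)
postcode = data.last.match(/\d{5}/).to_s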

From there, it's easy to get everything before or after the match.
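
As a rough illustration of that last step, Ruby's MatchData exposes the text on either side of a match through pre_match and post_match. The sketch below uses only the standard library and assumes the last line has already been extracted into a string; the variable names are illustrative and not taken from the original code.

```ruby
# The last non-empty line, as it would come out of the extraction step.
line = "09876 City, Country"

# Same pattern as in the answer: the first run of five digits.
match = line.match(/\d{5}/)

if match
  postcode = match[0]                 # the matched digits
  before   = match.pre_match.strip    # text before the match
  after    = match.post_match.strip   # text after the match
  puts "postcode=#{postcode.inspect} before=#{before.inspect} after=#{after.inspect}"
end
```

If the line contains no five-digit run, match is nil and the block is skipped, which avoids calling methods on nil.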
