Nokogiri - Get div with class by regex

I am using the Nokogiri gem in my Rails app to extract some HTML nodes. I select a div by its classes, but one of the class names changes from time to time. For example, I am currently using the following:



but, for example, "x15" could be "x13". I could do something like this:

doc.css("div.t.m0.x13.h3.ff2.fs1.fc0.sc0.ls0.ws1", "div.t.m0.x15.h3.ff2.fs1.fc0.sc0.ls0.ws1")


This works, but I think it would be better to specify a range like x13-x15, so that if the class turns out to be x14 I don't have to add yet another verbose selector.

Any tips on how to do this? Thanks!


I cannot simply drop the "x*" class, because there is another div with all the same other classes; the only difference between the two elements is the "x" class. The other div has xa or xb, and the one I am trying to get has x13 or x15.



3 answers

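You can use the .xpath method for this purpose. Because the class attribute holds several class names, test for an individual token rather than comparing the whole attribute value. For example:

doc.xpath("//div[contains(concat(' ', normalize-space(@class), ' '), ' x13 ') or contains(concat(' ', normalize-space(@class), ' '), ' x15 ')]")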


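Or, where the class attribute is a single token and the engine supports XPath 2.0, you can use

//div[starts-with(@class, 'x') and (ends-with(@class, '13') or ends-with(@class, '15'))]


Note that ends-with and regex matching (the matches function) were only introduced in XPath 2.0. Nokogiri evaluates XPath through libxml2, which implements XPath 1.0, so neither is available there.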
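If only XPath 1.0 functions are available, a range such as x13-x15 can be expressed by generating one whole-token test per class name and joining the tests with "or". The following is a minimal sketch; the helper names class_token_test and x_range_xpath are hypothetical and not part of Nokogiri, and the range 13..15 is taken from the question.

```ruby
# Hypothetical helper: builds an XPath 1.0 test that is true when the class
# attribute contains `name` as a whole whitespace-separated token.
def class_token_test(name)
  "contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')"
end

# Hypothetical helper: builds an expression matching div elements that carry
# any one of the classes x<first>..x<last>.
def x_range_xpath(range)
  tests = range.map { |n| class_token_test("x#{n}") }
  "//div[#{tests.join(' or ')}]"
end

xpath = x_range_xpath(13..15)
# With Nokogiri loaded and `doc` parsed, this could be used as: doc.xpath(xpath)
```

The remaining classes (t, m0, h3 and so on) could be added as further token tests joined with "and" if the x class alone is not specific enough.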



You can do the following:

base_classes = '.t.m0.h3.ff2.fs1.fc0.sc0.ls0.ws1'
extra_classes = ['.x15', '.x13']
doc.css(*extra_classes.map { |extra_class| "div#{base_classes}#{extra_class}" })
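The same idea extends to a numeric range, so that x14 is covered without listing it by hand. This is a sketch only, assuming the x class is always "x" followed by a decimal number between 13 and 15 as in the question; the variable names selectors and combined are illustrative.

```ruby
base_classes = '.t.m0.h3.ff2.fs1.fc0.sc0.ls0.ws1'

# Build one selector per class name in the assumed range x13..x15.
selectors = (13..15).map { |n| "div#{base_classes}.x#{n}" }

# A comma-separated selector list matches an element that satisfies any one
# of the individual selectors.
combined = selectors.join(', ')
# With Nokogiri loaded and `doc` parsed, this could be used as: doc.css(combined)
```

The order of class names inside a CSS selector does not matter, so appending the x class at the end matches the same elements as placing it in the middle.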




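If you really want a regex, you should use select:


  doc.css("div").select { |div| div[:class].to_s[/x1[3-5]/] }


Note: this regex may not do what you expect, because it is not anchored and also matches inside longer class names such as x135. Also note that using select turns your NodeSet into an Array.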
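One way to make the regex stricter is to split the class attribute into tokens and anchor the pattern to a whole token. The sketch below uses plain Ruby string handling; the helper name x_class_in_range? is hypothetical, and the x13-x15 range is taken from the question.

```ruby
# Hypothetical helper: true when one whitespace-separated class token is
# exactly x13, x14 or x15. The pattern is anchored with \A and \z, so
# longer names such as "x135" or "ax13" do not match.
def x_class_in_range?(class_attr)
  class_attr.to_s.split.any? { |token| token.match?(/\Ax1[3-5]\z/) }
end

# With Nokogiri loaded and `doc` parsed, this could be used as:
#   doc.css('div').select { |div| x_class_in_range?(div[:class]) }
```

The to_s call keeps the helper from raising on elements that have no class attribute at all.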


