HtmlAgilityPack XPath not working
My XPath expression is not working.
I am trying to get the "Next" link at the bottom of a google.com results page,
but I cannot reach its URL with XPath.
Please help me fix my XPath. What should the expression be?
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument doc = hw.Load("http://www.google.com/search?q=seo");
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//*[@id='pnnext']");
foreach (HtmlNode linkNode in linkNodes)
{
    HtmlAttribute link = linkNode.Attributes["href"];
    MessageBox.Show(link.Value);
}
The weird thing here is that HtmlAgilityPack doesn't seem to recognize the id attribute of the Next link.
This could be a bug in HtmlAgilityPack; you can report it on the HAP issue tracker.
In the meantime, I found this workaround:
- find the table containing the paging links (the table with id="nav"). The id of this element is recognized correctly
- take the first (and only) tr in the table and its last td (using the XPath function last())
- take the a element inside the td obtained in the previous step.
In short, here's the code:
var doc = new HtmlWeb().Load("http://www.google.com/search?q=seo");
var nextLink = doc.DocumentNode
    .SelectSingleNode("//table[@id='nav']/tr/td[last()]/a");
// GetAttributeValue returns the fallback ("err") if the href attribute is missing
Console.WriteLine(nextLink.GetAttributeValue("href", "err"));
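The same XPath can be tried offline before pointing it at a live page. The sketch below is only an illustration: the HTML fragment and the class name NextLinkDemo are made up and are not Google's real markup. It also guards against SelectSingleNode returning null, which happens when nothing matches the expression.

```csharp
using System;
using HtmlAgilityPack;

class NextLinkDemo
{
    static void Main()
    {
        // Hypothetical fragment that imitates a pager table; NOT Google's real markup.
        const string html =
            "<table id='nav'><tr>" +
            "<td><a href='/page1'>1</a></td>" +
            "<td><a href='/page2'>Next</a></td>" +
            "</tr></table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Last td of the row inside the table with id="nav", then the a element inside it.
        var nextLink = doc.DocumentNode
            .SelectSingleNode("//table[@id='nav']/tr/td[last()]/a");

        // SelectSingleNode returns null when no node matches the XPath.
        if (nextLink == null)
        {
            Console.WriteLine("Next link not found");
            return;
        }

        // The second argument is the fallback value used if href is missing.
        Console.WriteLine(nextLink.GetAttributeValue("href", "err"));
    }
}
```

The same null check applies to SelectNodes, which also returns null rather than an empty collection when there are no matches.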
Update
After Simon's comment, I checked this again. The conclusion is that this is not a bug in the HTML Agility Pack: the attribute id="pnnext" is only present when the request is made by a browser (possibly depending on the value of the User-Agent header).
When the request is made from code with HttpWebRequest, the Next link appears in the output without an id:
<a href="/search?q=seo&hl=en&ie=UTF-8&[...]" style="text-align:left">
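If the difference really does come from the User-Agent header, one thing to try is sending a browser-like value through the HtmlWeb.UserAgent property. This is a sketch under that assumption, not a confirmed fix: the User-Agent string is just an example value, the class name UserAgentDemo is made up, and the server may still return markup without id="pnnext".

```csharp
using System;
using HtmlAgilityPack;

class UserAgentDemo
{
    static void Main()
    {
        var web = new HtmlWeb
        {
            // Assumed example value; any current browser User-Agent string could be used instead.
            UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
                        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
        };

        var doc = web.Load("http://www.google.com/search?q=seo");

        // Look for the id-based link again; it may still be absent in this response.
        var next = doc.DocumentNode.SelectSingleNode("//*[@id='pnnext']");

        Console.WriteLine(next == null
            ? "pnnext not present in this response"
            : next.GetAttributeValue("href", ""));
    }
}
```

If the id is still missing, the table-based XPath from the workaround remains the fallback.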