Parsing HTML for a Windows 8 Metro style app using C#, XAML
My app needs to parse HTML and load the content into a list. I can download the HTML via WebClient, but I am stuck on parsing it.
I've heard about HtmlAgilityPack and Fizzler, but couldn't find any tutorials or examples of how to use them.
I need help getting "first_content" and "second_content" from the HTML document shown below into a list box.
<html>
<body>
<div>
<section>
<article>
<header>
<hgroup>
<h1>
first_content
</h1>
</hgroup>
</header>
<ul>
<li>
second_content
</li>
</ul>
</article>
</section>
</div>
</body>
</html>
2 answers
HtmlAgilityPack is the way to go. I have used it in WCF, on Windows Phone, and now on WinRT with complete success. For a tutorial, check out this blog post.
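As a rough illustration of that suggestion, here is a minimal sketch of how HtmlAgilityPack might be used on the HTML from the question. It assumes the HtmlAgilityPack NuGet package is referenced by the project. The names HtmlContentExtractor and Extract are hypothetical, chosen only for this example, and the choice to collect every h1 and li element is an assumption based on the sample markup.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack itself.
public static class HtmlContentExtractor
{
    // Returns the trimmed text of every <h1> and <li> element, in document order.
    public static List<string> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser: does not require well-formed XML

        return doc.DocumentNode
            .Descendants()                                   // every node below the root
            .Where(n => n.Name == "h1" || n.Name == "li")    // element names are lower-cased
            .Select(n => n.InnerText.Trim())                 // drop surrounding whitespace and newlines
            .ToList();
    }
}
```

The resulting list could then be assigned to a list box, for example `myListBox.ItemsSource = HtmlContentExtractor.Extract(html);`, where myListBox is a hypothetical ListBox declared in the XAML page.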
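You can use XPath, as long as the markup is well-formed XML (LoadXml fails on HTML that is not). For example:
// The sample markup from the question is well-formed, so it can be loaded as XML.
var html = "<html><body><div><section><article><header><hgroup><h1>first_content</h1></hgroup></header><ul><li>second_content</li></ul></article></section></div></body></html>";
var doc = new XmlDocument();
doc.LoadXml(html);
// Each XPath expression walks from the root element down to the target element.
var txt1 = doc.SelectSingleNode("/html/body/div/section/article/header/hgroup/h1").InnerText;
var txt2 = doc.SelectSingleNode("/html/body/div/section/article/ul/li").InnerText;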