Parsing HTML for a Windows 8 Metro-style app using C# and XAML

My app needs to parse HTML and load the content into a list. I can fetch the HTML via WebClient, but I'm stuck on parsing it.
I've heard about HtmlAgilityPack and Fizzler, but couldn't find any tutorials or examples showing how to use them.

I need help getting "first_content" and "second_content" from the HTML document shown below into a list box.

<html>
<body> 
<div>
<section>
<article>
   <header> 
       <hgroup> 
           <h1> 
              first_content
           </h1>
       </hgroup>
   </header> 
   <ul> 
        <li> 
           second_content
        </li>
   </ul>
</article> 
</section>
</div>
</body>
</html>

      



2 answers


HtmlAgilityPack is the way to go. I have used it in WCF, Windows Phone and now WinRT with complete success. For a tutorial, check out this blog post.
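Since the tutorial link did not survive here, a minimal sketch of how this could look with HtmlAgilityPack follows. It assumes the NuGet package is referenced; `ContentExtractor` and `Extract` are hypothetical names made up for illustration, and the choice of `<h1>` and `<li>` is based only on the sample markup in the question. It uses `Descendants()` with LINQ rather than XPath, because not every build of the library exposes the XPath methods.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

// Hypothetical helper, not part of HtmlAgilityPack itself.
public static class ContentExtractor
{
    // Returns the trimmed text of every <h1> and <li> element, in document order.
    public static List<string> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser: does not require well-formed XML

        return doc.DocumentNode
            .Descendants()
            .Where(n => n.Name == "h1" || n.Name == "li")
            .Select(n => n.InnerText.Trim())
            .ToList();
    }
}
```

The resulting list could then be bound to the list box, for example `myListBox.ItemsSource = ContentExtractor.Extract(html);`, where `myListBox` stands for whatever the control is called in your XAML.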





You can use XPath. For example...



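using System.Xml;

// Note: XmlDocument is an XML parser, so this only works if the HTML is well-formed XML.
var html = "<html><body><div><section><article><header><hgroup><h1>first_content</h1></hgroup></header><ul><li>second_content</li></ul></article> </section></div></body></html>";
var doc = new XmlDocument();
doc.LoadXml(html); // throws XmlException if the markup is not well-formed
// SelectSingleNode returns null when the path matches nothing, so check before reading InnerText on real pages.
var txt1 = doc.SelectSingleNode("/html/body/div/section/article/header/hgroup/h1").InnerText;
var txt2 = doc.SelectSingleNode("/html/body/div/section/article/ul/li").InnerText;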
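If you want both values collected into one list (for a list box), a variant of the same idea using LINQ to XML might look like the sketch below. This swaps XPath for `XDocument` and `Descendants()`; `XmlContentExtractor` and `Extract` are hypothetical names, and the same caveat applies: the input has to be well-formed XML, which real-world HTML often is not.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class XmlContentExtractor
{
    // Returns the trimmed text of every <h1> and <li> element, in document order.
    public static List<string> Extract(string wellFormedHtml)
    {
        // XDocument.Parse throws XmlException if the markup is not well-formed XML.
        var doc = XDocument.Parse(wellFormedHtml);

        return doc.Descendants()
            .Where(e => e.Name.LocalName == "h1" || e.Name.LocalName == "li")
            .Select(e => e.Value.Trim())
            .ToList();
    }
}
```

Matching on element names instead of a full path means the code keeps working if the wrapping `<div>` or `<section>` elements change, at the cost of also picking up any other `<h1>` or `<li>` on the page.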

      
