How to get HTML from WebBrowser control

There are numerous posts similar to this.

"How do I get the rendered html (rendered Javascript) in a WebBrowser control?" suggests using something like

webBrowser1.Document.GetElementsByTagName("HTML")[0].OuterHtml;

      

Document is treated as an object, so I have no way to use GetElementsByTagName.

"Copy all text from web browser control" suggests using DocumentText. I have Document, but no DocumentText.

This post also suggests webBrowser.Document.Body.InnerText. I have the ability to use webBrowser.Document, but that's all. For some reason webBrowser.Document is an object, and as such I cannot access these methods.

"Retrieving an HTML source via the C# WebBrowser control" also suggests using DocumentStream. Again, I don't have that.

I am doing this in a WPF application and using WebBrowser from System.Windows.Controls.

All I'm trying to do is read the rendered HTML from the webpage.

My code:

public void Begin(WebBrowser wb)
{
    this._wb = wb;
    _wb.Navigated += _wb_Navigated;
    _wb.Navigate("myUrl");
}

private void _wb_Navigated(object sender, System.Windows.Navigation.NavigationEventArgs e)
{
    var html = _wb.Document; // this is where I need help
}

      



1 answer


The samples you found are for the WinForms WebBrowser control. Add a reference to Microsoft.mshtml (via the Add Reference dialog -> search) to your project.

Cast the Document property to HTMLDocument to access its methods and properties (as stated on MSDN).

See also my GitHub sample:

private void WebBrowser_Navigated(object sender, NavigationEventArgs e) {
    // HTMLDocument comes from the Microsoft.mshtml reference (namespace mshtml).
    var document = (HTMLDocument)_Browser.Document;
    _Html.Text = document.body.outerHTML;
}
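
As a variation on the sample above, here is a minimal sketch that reads the whole document, including the html element, without adding the Microsoft.mshtml reference. It assumes .NET 4 or later, so that the dynamic keyword is available, and it subscribes to LoadCompleted rather than Navigated on the assumption that the page body may not be fully loaded when Navigated fires. The class name HtmlReader and the LastHtml property are hypothetical and not part of the question or the GitHub sample.

```csharp
using System.Windows.Controls;
using System.Windows.Navigation;

// Hypothetical helper class, for illustration only.
public class HtmlReader
{
    private WebBrowser _wb;

    // Markup captured after the most recent completed load (hypothetical property).
    public string LastHtml { get; private set; }

    public void Begin(WebBrowser wb, string url)
    {
        _wb = wb;
        // LoadCompleted is raised once the document has finished loading.
        _wb.LoadCompleted += OnLoadCompleted;
        _wb.Navigate(url);
    }

    private void OnLoadCompleted(object sender, NavigationEventArgs e)
    {
        // Document is exposed as object (a COM HTML document underneath);
        // dynamic resolves its members at run time instead of via a cast.
        dynamic document = _wb.Document;
        if (document == null)
        {
            return;
        }

        // documentElement is the root <html> element, so outerHTML covers
        // head and body, unlike body.outerHTML.
        LastHtml = document.documentElement.outerHTML;
    }
}
```

The trade-off is that dynamic gives no compile-time checking or IntelliSense, so a misspelled member name only fails at run time. The cast to HTMLDocument shown above avoids that at the cost of the extra reference.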

      
