Generating HTML from an XML tree (C#/.NET)

I have an HTML document stored in memory as a tree of LINQ to XML objects. How can I serialize an XDocument as HTML, taking the specifics of HTML into account?

For example, empty tags such as <br/> should be serialized as <br>, whereas empty tags such as <div/> should be serialized as <div></div>.

HTML output is possible from an XSLT stylesheet, and XmlWriterSettings has an OutputMethod property that can be set to HTML, but the setter is internal, for use by XSLT or Visual Studio, and I can't seem to find a way to serialize arbitrary XML as HTML.

So, short of using XSLT solely to get HTML output (i.e. running the document through a convoluted XDocument -> XmlReader -> XSLT -> HTML chain), is there a way to serialize a .NET XDocument as HTML?

3 answers


No. XDocument -> XmlReader -> XSLT is the approach you need.

What you're looking for is a specialized serializer that attaches special meaning to tag names like br and div and renders them differently. You would also expect such a serializer to work both ways, in other words to be able to read HTML tag soup and generate an XDocument. Such a thing doesn't exist out of the box.



The XmlReader-to-XSLT route seems simple enough to set up; it is ultimately just a chain of streams.
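For illustration, the chain could be set up roughly as follows. This is a minimal sketch, not a definitive implementation: the class name HtmlSerializer, the method ToHtml, and the constant IdentityXslt are hypothetical names invented for this example. It assumes an identity stylesheet whose only purpose is to select the HTML output method.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Xsl;

// Hypothetical helper (not part of .NET): runs an XDocument through an
// identity XSLT transform so that the XSLT HTML output method does the serializing.
public static class HtmlSerializer
{
    // Identity stylesheet: copies every node and attribute unchanged and
    // only switches the output method to HTML.
    private const string IdentityXslt = @"
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='html' indent='no'/>
  <xsl:template match='@*|node()'>
    <xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    public static string ToHtml(XDocument document)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(IdentityXslt)))
        {
            transform.Load(stylesheet);
        }

        using (var output = new StringWriter())
        using (XmlReader input = document.CreateReader())
        {
            // The TextWriter overload applies the stylesheet's own xsl:output settings.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

Calling HtmlSerializer.ToHtml(XDocument.Parse("<div><br/><div/></div>")) should then give HTML-style serialization of the br and the empty div. In real code the compiled transform would probably be cached rather than rebuilt on every call.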



Like you, I am very surprised that the HTML output method is not exposed, and I am not aware of any way to do this other than the XSLT route you have already identified. When I ran into the same problem a couple of years ago, I wrote an XmlWriter wrapper class that forced WriteEndElement to call WriteFullEndElement on the base XmlWriter if the tag being processed was not in the list {"area", "base", "basefont", "bgsound", "br", "col", "embed", "frame", "hr", "isindex", "image", "img", "input", "link", "meta", "param", "spacer", "wbr"}.



This fixed the <div /> problem and was sufficient for me, as I wanted to write polyglot documents. I haven't found a way to make <br /> come out as <br>, but apart from the fact that the output can't be validated as HTML 4.01, that doesn't cause a real problem. My guess is that if you really need this and don't want to use the XSLT method, you will have to write your own XmlWriter implementation.
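A wrapper along these lines might look like the sketch below. It is a reconstruction of the idea, not the original class: the name HtmlFriendlyXmlWriter is hypothetical, and deriving from XmlTextWriter (rather than wrapping an arbitrary XmlWriter) is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical class: closes every element with a full end tag unless its
// name is in the void-element list quoted above.
public sealed class HtmlFriendlyXmlWriter : XmlTextWriter
{
    private static readonly HashSet<string> VoidElements =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "area", "base", "basefont", "bgsound", "br", "col", "embed",
            "frame", "hr", "isindex", "image", "img", "input", "link",
            "meta", "param", "spacer", "wbr"
        };

    // Names of the elements that are currently open, innermost on top.
    private readonly Stack<string> openElements = new Stack<string>();

    public HtmlFriendlyXmlWriter(TextWriter output) : base(output) { }

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        openElements.Push(localName);
        base.WriteStartElement(prefix, localName, ns);
    }

    public override void WriteEndElement()
    {
        if (VoidElements.Contains(openElements.Pop()))
            base.WriteEndElement();      // void element: keep the short form
        else
            base.WriteFullEndElement();  // anything else: force <tag></tag>
    }

    public override void WriteFullEndElement()
    {
        // Keep the stack in sync when the caller asks for a full end tag directly.
        openElements.Pop();
        base.WriteFullEndElement();
    }
}
```

It can be used by creating the writer over a StringWriter and calling document.Root.WriteTo(writer). As described above, an empty div is then written as <div></div>, while br is still written in self-closing form rather than as <br>.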



Of course there is!

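// Assumes: XDocument document; string filename;
// Requires: using System.Reflection; using System.Xml; using System.Xml.Linq;
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
// OutputMethod has no public setter, so set the private field via reflection.
// This relies on an implementation detail (the field name "outputMethod").
typeof(XmlWriterSettings)
    .GetField("outputMethod", BindingFlags.NonPublic | BindingFlags.Instance)
    .SetValue(settings, XmlOutputMethod.Html);
using (XmlWriter xw = XmlWriter.Create(filename, settings))
{
    document.Save(xw);
}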

      
