Unable to read XML document containing ampersand character
I am writing a C# program that reads an XML file. Loading fails because the file contains an unescaped '&' character, which is not valid in XML.
I need to read the XML, but I cannot modify the document itself. How can I fix up the content from C# so that it can be parsed? My code:
private void button1_Click(object sender, EventArgs e)
{
    XmlDocument doc;
    doc = new XmlDocument();
    // Throws XmlException here because of the unescaped '&' in the file.
    doc.Load("nuevo.xml");
    XmlNodeList Xpersonas = doc.GetElementsByTagName("personas");
    XmlNodeList Xlista = ((XmlElement)Xpersonas[0]).GetElementsByTagName("edad");
    foreach (XmlElement nodo in Xlista)
    {
        string edad = nodo.GetAttribute("edad");
        string nombre = nodo.InnerText;
        textBox1.Text = nodo.InnerXml;
    }
}
As @EBrown suggested, one possibility is to read the file content into a string variable, replace each & character with its XML entity &amp;, and then parse the XML structure. A possible solution might look like this:
// Requires: using System; using System.IO; using System.Xml;
var xmlContent = File.ReadAllText(@"nuevo.xml");
XmlDocument doc;
doc = new XmlDocument();
// Escape the ampersands in memory so the parser accepts the content.
doc.LoadXml(xmlContent.Replace("&", "&amp;"));
XmlNodeList Xpersonas = doc.GetElementsByTagName("personas");
XmlNodeList Xlista = ((XmlElement)Xpersonas[0]).GetElementsByTagName("edad");
foreach (XmlElement nodo in Xlista)
{
    string edad = nodo.GetAttribute("edad");
    string nombre = nodo.InnerText;
    // InnerXml returns the escaped form, so undo the replacement for display.
    Console.WriteLine(nodo.InnerXml.Replace("&amp;", "&"));
}
Output:
34 & 34
If it's okay to use LINQ to XML, the solution is even shorter, and there is no need for the reverse (second) replacement, because the Value property already returns the decoded text:
// Requires: using System; using System.IO; using System.Xml.Linq;
var xmlContent = File.ReadAllText(@"nuevo.xml");
var xmlDocument = XDocument.Parse(xmlContent.Replace("&", "&amp;"));
var edad = xmlDocument.Root.Element("edad").Value;
Console.WriteLine(edad);
The output is the same as above.
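One caveat with a plain Replace("&", "&amp;") is that it also rewrites ampersands that are already part of an entity, so an existing &amp; would become &amp;amp;. If the file might mix escaped and unescaped ampersands, a regular expression with a negative lookahead can escape only the bare ones. The following is a minimal sketch rather than a definitive fix: the class name AmpersandFixer, the method EscapeBareAmpersands, and the inline sample XML are all hypothetical, and it only recognizes the five predefined XML entities plus numeric character references.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class AmpersandFixer
{
    // Matches an '&' that does NOT start one of the predefined entities
    // (&amp; &lt; &gt; &quot; &apos;) or a numeric reference (&#38; / &#x26;).
    private static readonly Regex BareAmpersand =
        new Regex(@"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Hypothetical helper: escapes only the ampersands that would break the parser.
    public static string EscapeBareAmpersands(string xml)
    {
        return BareAmpersand.Replace(xml, "&amp;");
    }

    public static void Main()
    {
        // Assumed sample content: one bare '&' and one that is already escaped.
        string raw = "<personas><edad>34 & 34</edad><edad>1 &amp; 2</edad></personas>";

        XDocument doc = XDocument.Parse(EscapeBareAmpersands(raw));
        foreach (XElement edad in doc.Root.Elements("edad"))
        {
            Console.WriteLine(edad.Value);
        }
        // Prints:
        // 34 & 34
        // 1 & 2
    }
}
```

With a real file, the string passed to EscapeBareAmpersands would come from File.ReadAllText, as in the answers above. Custom entities declared in a DTD are not covered by this pattern and would be escaped as well.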