Parsing slightly nonstandard XML in C# with XmlSerializer

I have been given some "XML" files that don't quite have a proper schema (I think that is the problem), and the medical device that generates them cannot be modified to produce XML that is easy to parse. (With one tantalizingly small modification, namely additional wrapper tags around the Image records, these files would be trivial to read. Isn't that what XML is for?)

I am mostly stuck here. The XML looks like this:

<Series>
   <Metadata1>foo</Metadata1>
   <Metadata2>bar</Metadata2>
   ...
   <Image>...</Image>
   <Image>...</Image>
   ...
</Series>

      

(There can be any number of images, but all the metadata tags are known.) My code looks like this:

public class Image { ... }

public class Series : List<Image>
{
    public Series() { }
    public string Metadata1;
    public string Metadata2;
    ...
}

      

When I run it like this:

            XmlSerializer xs = new XmlSerializer(typeof(Series));
            StreamReader sr = new StreamReader(path);
            Series series = (Series)xs.Deserialize(sr);
            sr.Close();

      

The list of Image objects is read correctly into the Series object, but the Metadata1/2/etc. fields are not read (in fact, viewing the object in the debugger shows all the metadata fields inside the "Raw View" field).

When I change the code:

public class Series    // removed the base class ": List<Image>"
{
    public Series() { }
    public string Metadata1;
    public string Metadata2;
    ...
}

      

and run the reader on the file, I get a Series object with Metadata1/2/etc. filled in correctly, but no image data is read (obviously).

How can I parse both Metadata1/2/etc. and the series of images with the least amount of painful ad hoc code?

Do I have to write a custom (painful? simple?) ReadXml method to implement IXmlSerializable?

I don't much care how the objects are laid out, since the software that consumes these C# classes is completely flexible:

List<Image> Images;

would be fine for the images, or the metadata could be wrapped in some object, whatever...
+2




3 answers


Your classes are missing attributes that allow XML serialization. I believe the following should be sufficient.

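using System.Xml.Serialization;

// Note: [XmlElement] cannot be applied to a class; it targets fields and properties.
public class Image { ... }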

[XmlRoot(ElementName="Series")]
public class Series
{
        public Series() { }

        [XmlElement]
        public string Metadata1;

        [XmlElement]
        public string Metadata2;

        [XmlElement(ElementName="Image")]
        public Image[] Images;
}

      



I'm not sure whether you can use a generic collection type instead of an array of Image, but the documentation for the XML serialization attributes should give you more information on how to apply them to your particular situation.
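On the question of a generic collection: a minimal sketch, assuming each Image carries only text (the real Image layout is elided in the question, so the `Data` field and the `Demo` helper are hypothetical). As far as I know, XmlSerializer accepts `[XmlElement]` on a `List<T>` field the same way it does on an array, reading each `<Image>` as a direct child of `<Series>`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Image
{
    // Hypothetical payload: the question does not show what an Image contains.
    [XmlText]
    public string Data;
}

[XmlRoot("Series")]
public class Series
{
    [XmlElement("Metadata1")]
    public string Metadata1;

    [XmlElement("Metadata2")]
    public string Metadata2;

    // [XmlElement] on a collection treats every <Image> as a direct child
    // of <Series>, with no wrapping <Images> element required.
    [XmlElement("Image")]
    public List<Image> Images = new List<Image>();
}

public static class Demo
{
    // Deserializes a Series from XML text held in memory.
    public static Series Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Series));
        using (var reader = new StringReader(xml))
        {
            return (Series)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Series s = Parse(
            "<Series>" +
            "<Metadata1>foo</Metadata1>" +
            "<Metadata2>bar</Metadata2>" +
            "<Image>a</Image>" +
            "<Image>b</Image>" +
            "</Series>");

        Console.WriteLine(s.Metadata1);
        Console.WriteLine(s.Metadata2);
        Console.WriteLine(s.Images.Count);
    }
}
```

To read from a file instead, the `StringReader` can be swapped for a `StreamReader` over the path, as in the question.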

EDIT: Another option is to write an XML schema by hand that validates the documents generated by the application, and then use XSD.exe to generate the object model. The resulting classes will show how you should tweak your object model to work with the serializer.

+3




Why are you trying to use an XML serializer for this? Serialization is usually the ability to store the "state" of an object in some known format (text or binary) so that it can be recreated at a later point in time. This doesn't sound like what you are trying to do here. The problem here is that the XML data doesn't match your object hierarchy.

You have a hardware device that somehow generates XML data that you want to use. For me, this would be easiest with a simple XmlDocument or XmlReader class instead of trying to go through the serializer.

Perhaps you can do it with code like this:



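using System;
using System.Collections.Generic;
using System.Xml;

public class Image
{
   // Placeholder payload; the real Image layout is not shown in the question.
   public string Data;

   // Constructor used by Series.Load below.
   public Image(string data) { Data = data; }
}

public class Series
{
   public string Metadata1;
   public string Metadata2;
   public List<Image> Images = new List<Image>();

   // XmlDocument.Load expects a file path or URL (use LoadXml for XML text).
   public void Load(string path)
   {
      XmlDocument doc = new XmlDocument();
      doc.Load(path);

      // XPath is evaluated from the document node, so start at the root element.
      XmlNodeList images = doc.SelectNodes("/Series/Image");
      foreach (XmlNode image in images)
      {
         Images.Add(new Image(image.InnerText));
      }

      Metadata1 = GetMetadataValue(doc, "Metadata1");
      Metadata2 = GetMetadataValue(doc, "Metadata2");
   }

   // Returns the text of /Series/<nodeName>, or an empty string if it is absent.
   private string GetMetadataValue(XmlDocument document, string nodeName)
   {
      string value = String.Empty;
      XmlNode metadataNode = document.SelectSingleNode("/Series/" + nodeName);
      if (metadataNode != null)
      {
         value = metadataNode.InnerText;
      }

      return value;
   }
}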

      

* This is untested code, but it should get the idea across.
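The same manual approach can also be sketched with LINQ to XML (`XDocument`), which is a different API from the `XmlDocument` and `XmlReader` classes named in this answer. This is only an illustration: the `Data` field, the `Parse` helper, and the `Program` class are hypothetical names, and it assumes each Image holds nothing but text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Image
{
    // Hypothetical payload: the question does not show what an Image contains.
    public string Data;

    public Image(string data) { Data = data; }
}

public class Series
{
    public string Metadata1;
    public string Metadata2;
    public List<Image> Images = new List<Image>();

    // Builds a Series from XML text; use XDocument.Load(path) to read a file.
    public static Series Parse(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;

        return new Series
        {
            // The explicit string conversion yields null if the element is missing.
            Metadata1 = (string)root.Element("Metadata1"),
            Metadata2 = (string)root.Element("Metadata2"),
            // Elements("Image") returns only direct children named Image.
            Images = root.Elements("Image")
                         .Select(e => new Image(e.Value))
                         .ToList()
        };
    }
}

public static class Program
{
    public static void Main()
    {
        Series s = Series.Parse(
            "<Series>" +
            "<Metadata1>foo</Metadata1>" +
            "<Metadata2>bar</Metadata2>" +
            "<Image>a</Image>" +
            "<Image>b</Image>" +
            "</Series>");

        Console.WriteLine(s.Metadata1);
        Console.WriteLine(s.Images.Count);
    }
}
```

Unlike the `GetMetadataValue` helper above, a missing metadata element here comes back as null rather than an empty string, which may or may not suit the consuming code.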

+2




I think Steve's answer should work. I just want to add that this technique can only read a fixed set of metadata elements, because each one has a different name. What you can do instead is read them as a collection of XmlElements, which you can parse later:

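using System.Linq;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot(ElementName="Series")]
public class Series
{
    public Series() { }

    // Must be public: XmlSerializer only populates public fields and properties.
    // Receives every child element that no other member claims.
    [XmlAnyElement]
    public XmlElement[] UnknownElements;

    private string[] _metadata;
    [XmlIgnore]
    public string[] Metadata
    {
        get
        {
            // Lazily extract the text of each leftover element named Metadata*.
            if (_metadata == null && UnknownElements != null)
            {
                _metadata = UnknownElements
                            .Where(e => e.Name.StartsWith("Metadata"))
                            .Select(e => e.InnerText)
                            .ToArray();
            }
            return _metadata;
        }
    }

    [XmlElement(ElementName="Image")]
    public Image[] Images;
}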
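As a variation on the same idea, the leftover elements could be exposed by name instead of as a flat array. The sketch below is self-contained and makes a few assumptions that are not in the original: the `Data` field, the `MetadataByName` and `Parse` helpers, and the premise that no metadata element name repeats (`ToDictionary` throws on duplicate keys).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

public class Image
{
    // Hypothetical payload: the question does not show what an Image contains.
    [XmlText]
    public string Data;
}

[XmlRoot("Series")]
public class Series
{
    // Collects every child element not claimed by another member.
    [XmlAnyElement]
    public XmlElement[] UnknownElements;

    [XmlElement("Image")]
    public Image[] Images;

    // Hypothetical helper: maps each leftover element name to its text.
    // Assumes element names are unique within one <Series>.
    public Dictionary<string, string> MetadataByName()
    {
        return (UnknownElements ?? new XmlElement[0])
            .ToDictionary(e => e.Name, e => e.InnerText);
    }

    public static Series Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Series));
        using (var reader = new StringReader(xml))
        {
            return (Series)serializer.Deserialize(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Series s = Series.Parse(
            "<Series>" +
            "<Metadata1>foo</Metadata1>" +
            "<Metadata2>bar</Metadata2>" +
            "<Image>a</Image>" +
            "</Series>");

        foreach (KeyValuePair<string, string> pair in s.MetadataByName())
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

The lookup is a method, not a property, so that XmlSerializer does not try to serialize a dictionary, which it does not support.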

      

+1

