How would I dynamically add a new XML node based on the values of other nodes?

Background:
I have an old web CMS that stores content in XML files, one XML file per page. I am importing content from this CMS into a new one, and I know that I will need to augment the existing XML for the import process to work properly.

Existing XML:

<page>
    <audience1>true</audience1>
    <audience2>false</audience2>
    <audience3>true</audience3>
    <audience4>false</audience4>
    <audience5>true</audience5>
</page>

      

Required XML:

<page>
    <audience1>true</audience1>
    <audience2>false</audience2>
    <audience3>true</audience3>
    <audience4>false</audience4>
    <audience5>true</audience5>
    <audiences>1,3,5</audiences>
</page>

      

Question:
The required XML adds a node containing a comma-separated list of the numbers of the audience nodes whose value is "true". I need to do this for many files, so what is the best way? Some of my ideas:

  • Use a text editor with regex search / replace. But what expression? I don't even know where to start.
  • Use a programming language like ASP.NET to parse the files and add the desired node. Again, not sure where to start here as my .NET skills are only average.

Suggestions?



2 answers


I would probably use the XmlDocument class in .NET, but that's just me, because I have never liked regular expressions.

You can then use XPath expressions to select the child nodes of each page node, evaluate them, append the new node after the last child of the page node, and save the XmlDocument when you're done.
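As a rough illustration of those steps (load, inspect the child nodes, append, save), here is a minimal sketch. It uses Python's standard xml.etree.ElementTree purely as a stand-in for .NET's XmlDocument. The function names add_audiences and process_file are made up for this example, and it assumes each file has a single page root whose audienceN children are well-formed.

```python
import re
import xml.etree.ElementTree as ET


def add_audiences(page):
    """Append <audiences> listing the numbers of audienceN children set to 'true'."""
    numbers = []
    for child in page:
        match = re.fullmatch(r"audience(\d+)", child.tag)
        # Tolerate whitespace around the value, e.g. " true "
        if match and (child.text or "").strip() == "true":
            numbers.append(match.group(1))
    summary = ET.SubElement(page, "audiences")
    summary.text = ",".join(numbers)
    return page


def process_file(path):
    """Hypothetical helper: rewrite one page file in place (try it on copies first)."""
    tree = ET.parse(path)
    add_audiences(tree.getroot())
    tree.write(path, encoding="utf-8", xml_declaration=True)


sample = """<page>
    <audience1>true</audience1>
    <audience2>false</audience2>
    <audience3>true</audience3>
    <audience4>false</audience4>
    <audience5>true</audience5>
</page>"""

page = add_audiences(ET.fromstring(sample))
print(page.find("audiences").text)  # prints 1,3,5
```

To cover many files, process_file could be called in a loop over the exported pages. The same load, select, append, save shape should carry over to XmlDocument.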



XSLT is an option too, but the initial learning curve is a bit steep.

There is probably a more elegant way with regex, but if you only run this once, all that matters is that it works.
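For the regex route, one possible starting point is sketched below in Python. It is not a general XML solution: it assumes every audienceN element has a matching closing tag, that the value is exactly "true" apart from surrounding whitespace, and that each file contains a single closing page tag. The name add_audiences_text is made up for this example.

```python
import re

# Capture N from <audienceN>true</audienceN>; \1 requires the same number in the closing tag.
TRUE_AUDIENCE = re.compile(r"<audience(\d+)>\s*true\s*</audience\1>")


def add_audiences_text(xml_text):
    """Insert an <audiences> line just before the closing </page> tag."""
    numbers = TRUE_AUDIENCE.findall(xml_text)
    node = "    <audiences>" + ",".join(numbers) + "</audiences>\n"
    # Replace only the first </page>, since one page per file is assumed
    return xml_text.replace("</page>", node + "</page>", 1)
```

Because this works on raw text, it keeps the original formatting of the file, but it will silently miss anything that deviates from the assumed pattern.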



I would most likely use an XSLT stylesheet to solve this problem. I built the following stylesheet to be a little more general, so it is not exactly what you asked for, but it can easily be modified to produce the exact output you showed if that is what you need.

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <xsl:apply-templates select="/*"/>
  </xsl:template>

  <xsl:template match="/*">
    <xsl:copy>
      <xsl:copy-of select="*"/>

      <xsl:element name="nodes">
        <xsl:apply-templates select="*[normalize-space(.) = 'true']"/>
      </xsl:element>
    </xsl:copy>
  </xsl:template>

  <!-- position() counts within the selected "true" nodes, so there is no
       leading comma even when the first child is not "true" -->
  <xsl:template match="/*/*">
    <xsl:if test="position() != 1">,</xsl:if>
    <xsl:value-of select="local-name()"/>
  </xsl:template>

</xsl:stylesheet> 

      

Applied to the XML in the question, this stylesheet produces output like this (indented here for readability):

<page>
  <audience1>true</audience1>
  <audience2>false</audience2>
  <audience3>true</audience3>
  <audience4>false</audience4>
  <audience5>true</audience5>
  <nodes>audience1,audience3,audience5</nodes>
</page>

      

XSLT is a good fit here because you can apply the stylesheet from just about any programming language, or from Visual Studio. There are also many free tools that can apply transformations.
