C# to convert xml attributes to elements

Posted 2019-02-23 08:18

Question:

I need to convert all attributes to nodes in an XML file, with the exception of attributes in the root node.

I found a similar question here: xquery to convert attributes to tags, but I need to do the conversion in C#.

I have also found a possible solution using XSLT here: Convert attribute value into element. However, that solution essentially renames the node to the attribute name and removes the attribute.

I need to create new sibling nodes with the name and value of the attributes and remove the attributes, but still preserve the node that contained the attributes.

Given the following XML:

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
  <Version>4.0.8</Version>
  <Segments>
    <Segment Name="Test">
      <SegmentField>
        <SegmentIndex>0</SegmentIndex>
        <Name>RecordTypeID</Name>
        <Value Source="Literal">O</Value>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>1</SegmentIndex>
        <Name>OrderSequenceNumber</Name>
        <Value Source="Calculated" Initial="1">Sequence</Value>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>3</SegmentIndex>
        <Name>InstrumentSpecimenID</Name>
        <Value Source="Property">BarCode</Value>
      </SegmentField>
    </Segment>
  </Segments>
</Something>

I need to produce the following XML:

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
  <Version>4.0.8</Version>
  <Segments>
    <Segment>
      <Name>Test</Name>
      <SegmentField>
        <SegmentIndex>0</SegmentIndex>
        <Name>RecordTypeID</Name>
        <Value>O</Value>
        <Source>Literal</Source>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>1</SegmentIndex>
        <Name>OrderSequenceNumber</Name>
        <Value>Sequence</Value>
        <Source>Calculated</Source>
        <Initial>1</Initial>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>3</SegmentIndex>
        <Name>InstrumentSpecimenID</Name>
        <Value>BarCode</Value>
        <Source>Property</Source>
      </SegmentField>
    </Segment>
  </Segments>
</Something>

I have written the following method to create a new XML object, creating new elements from the source element's attributes:

private static XElement ConvertAttribToElement(XElement source)
{
    var result = new XElement(source.Name.LocalName);

    if (source.HasElements)
    {
        foreach (var element in source.Elements())
        {
            var orphan = ConvertAttribToElement(element);

            result.Add(orphan);
        }
    }
    else
    {
        result.Value = source.Value.Trim();
    }

    if (source.Parent == null)
    {
        // ERROR: The prefix '' cannot be redefined from '' to 'http://www.something.com' within the same start element tag.

        //foreach (var attrib in source.Attributes())
        //{
        //    result.SetAttributeValue(attrib.Name.LocalName, attrib.Value);
        //}
    }
    else
    {
        while (source.HasAttributes)
        {
            var attrib = source.LastAttribute;
            result.AddFirst(new XElement(attrib.Name.LocalName, attrib.Value.Trim()));
            attrib.Remove();
        }
    }

    return result;
}

This method produces the following XML:

<Something>
  <Version>4.0.8</Version>
  <Segments>
    <Segment>
      <Name>Test</Name>
      <SegmentField>
        <SegmentIndex>0</SegmentIndex>
        <Name>RecordTypeID</Name>
        <Value>
          <Source>Literal</Source>O</Value>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>1</SegmentIndex>
        <Name>OrderSequenceNumber</Name>
        <Value>
          <Source>Calculated</Source>
          <Initial>1</Initial>Sequence</Value>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>3</SegmentIndex>
        <Name>InstrumentSpecimenID</Name>
        <Value>
          <Source>Property</Source>BarCode</Value>
      </SegmentField>
    </Segment>
  </Segments>
</Something>

There are two immediate problems with the output:
1) The attributes in the root element are lost.
2) The attributes from the 'Value' element are created as child elements instead of siblings.

To address the first issue, I tried to assign the attributes of the source element to the result element, but that caused a "prefix '' cannot be redefined from '' to 'http://www.something.com' within the same start element tag" error. I have commented out the code that caused the error for illustration.
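For readers hitting the same error, here is a minimal sketch of what appears to trigger it. The assumption is that the result element is created from `source.Name.LocalName` (so it sits in no namespace) and then receives an attribute named `xmlns`, which LINQ to XML treats as a default namespace declaration; the conflict surfaces when the element is serialized. The class and method names (`RootCopyDemo`, `UnqualifiedCopyThrows`, `QualifiedCopy`) are hypothetical and only serve the illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class RootCopyDemo
{
    // Mirrors the commented-out code: the element name has no namespace,
    // but an attribute called "xmlns" declares a default namespace for it.
    public static bool UnqualifiedCopyThrows(XElement source)
    {
        var result = new XElement(source.Name.LocalName);
        foreach (var attrib in source.Attributes())
        {
            result.SetAttributeValue(attrib.Name.LocalName, attrib.Value);
        }

        try
        {
            result.ToString(); // serialization is where the conflict is detected
            return false;
        }
        catch (XmlException)
        {
            return true;
        }
    }

    // Keeps the full XName (namespace + local name) and copies the
    // attributes with their own full names, so the declarations agree.
    public static XElement QualifiedCopy(XElement source)
    {
        return new XElement(source.Name, source.Attributes());
    }
}
```

Passing `source.Attributes()` to the constructor copies the attributes rather than moving them, because LINQ to XML clones content that already has a parent.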

To address the second issue, I attempted to add the element created from the attribute to the source.Parent element, but that resulted in the new element not appearing at all.

I also rewrote the method to operate directly on the source element:

private static void ConvertAttribToElement2(XElement source)
{
    if (source.HasElements)
    {
        foreach (var element in source.Elements())
        {
            ConvertAttribToElement2(element);
        }
    }

    if (source.Parent != null)
    {
        while (source.HasAttributes)
        {
            var attrib = source.LastAttribute;
            source.Parent.AddFirst(new XElement(attrib.Name.LocalName, attrib.Value.Trim()));
            attrib.Remove();
        }
    }
}

The rewrite produced the following XML:

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
  <Version>4.0.8</Version>
  <Segments>
    <Name xmlns="">Test</Name>
    <Segment>
      <SegmentField>
        <Source xmlns="">Literal</Source>
        <SegmentIndex>0</SegmentIndex>
        <Name>RecordTypeID</Name>
        <Value>O</Value>
      </SegmentField>
      <SegmentField>
        <Source xmlns="">Calculated</Source>
        <Initial xmlns="">1</Initial>
        <SegmentIndex>1</SegmentIndex>
        <Name>OrderSequenceNumber</Name>
        <Value>Sequence</Value>
      </SegmentField>
      <SegmentField>
        <Source xmlns="">Property</Source>
        <SegmentIndex>3</SegmentIndex>
        <Name>InstrumentSpecimenID</Name>
        <Value>BarCode</Value>
      </SegmentField>
    </Segment>
  </Segments>
</Something>

The rewrite did resolve the first issue of preserving the attributes of the root element. It also partially addressed the second issue, but has produced a new problem: the new elements have a blank xmlns attribute.
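The blank `xmlns` attribute can be reproduced in isolation. The sketch below assumes the cause is that `new XElement(attrib.Name.LocalName, ...)` creates an element in no namespace, so when it is placed under a parent in `http://www.something.com` the serializer has to emit `xmlns=""` to reset the default namespace. `NamespaceDemo` and `Build` are hypothetical names used only for this demonstration.

```csharp
using System.Xml.Linq;

public static class NamespaceDemo
{
    // Builds a parent in the document's namespace and adds one child,
    // either namespace-qualified or not, then returns the serialized XML.
    public static string Build(bool qualify)
    {
        XNamespace ns = "http://www.something.com";
        var parent = new XElement(ns + "Segments");

        // An unqualified name belongs to XNamespace.None, not to the parent's namespace.
        XName childName = qualify ? ns + "Name" : XNamespace.None + "Name";
        parent.Add(new XElement(childName, "Test"));

        return parent.ToString(SaveOptions.DisableFormatting);
    }
}
```

Under that assumption, building the new element name as `source.Name.Namespace + attrib.Name.LocalName` would keep the generated elements in the same namespace as the element that carried the attribute.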

Answer 1:

This XSLT transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:x="http://www.something.com">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:variable name="vNamespace" select="namespace-uri(/*)"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="*/*/@*">
  <xsl:element name="{name()}" namespace="{$vNamespace}">
   <xsl:value-of select="."/>
  </xsl:element>
 </xsl:template>

 <xsl:template match="x:Value">
  <xsl:copy>
   <xsl:apply-templates/>
  </xsl:copy>
  <xsl:apply-templates select="@*"/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
    <Version>4.0.8</Version>
    <Segments>
        <Segment Name="Test">
            <SegmentField>
                <SegmentIndex>0</SegmentIndex>
                <Name>RecordTypeID</Name>
                <Value Source="Literal">O</Value>
            </SegmentField>
            <SegmentField>
                <SegmentIndex>1</SegmentIndex>
                <Name>OrderSequenceNumber</Name>
                <Value Source="Calculated" Initial="1">Sequence</Value>
            </SegmentField>
            <SegmentField>
                <SegmentIndex>3</SegmentIndex>
                <Name>InstrumentSpecimenID</Name>
                <Value Source="Property">BarCode</Value>
            </SegmentField>
        </Segment>
    </Segments>
</Something>

produces exactly the wanted, correct result:

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
   <Version>4.0.8</Version>
   <Segments>
      <Segment>
         <Name>Test</Name>
         <SegmentField>
            <SegmentIndex>0</SegmentIndex>
            <Name>RecordTypeID</Name>
            <Value>O</Value>
            <Source>Literal</Source>
         </SegmentField>
         <SegmentField>
            <SegmentIndex>1</SegmentIndex>
            <Name>OrderSequenceNumber</Name>
            <Value>Sequence</Value>
            <Source>Calculated</Source>
            <Initial>1</Initial>
         </SegmentField>
         <SegmentField>
            <SegmentIndex>3</SegmentIndex>
            <Name>InstrumentSpecimenID</Name>
            <Value>BarCode</Value>
            <Source>Property</Source>
         </SegmentField>
      </Segment>
   </Segments>
</Something>

Explanation:

  1. The identity rule/template copies every node "as is".

  2. The identity rule is overridden by two templates: one matching any attribute of any element that is not the top element of the document, another matching any Value element.

  3. The template matching attributes (the first overriding template) creates in place of the attribute an element with the same local name and value as the matched attribute. In addition, the element name is put in the same namespace as the one that the top element of the document belongs to (this avoids the xmlns="").

  4. The template matching any Value element copies it and processes all of its subtree (descendant nodes), then processes its attributes. In this way the elements generated from the attributes become siblings and not children of the Value element.
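Since the question asks for a C# solution, the stylesheet can be run from C# with `XslCompiledTransform`. This is a minimal sketch that takes the XML and the stylesheet as strings; the `XsltRunner` class and its `Transform` method are hypothetical names, and reading from files instead would only change how the readers are created.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltRunner
{
    // Applies an XSLT 1.0 stylesheet to an XML document, both given as strings,
    // and returns the transformed document as a string.
    public static string Transform(string xml, string stylesheet)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(stylesheet)))
        {
            xslt.Load(styleReader);
        }

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        // OutputSettings carries the xsl:output choices (indent, omit-xml-declaration).
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            xslt.Transform(input, writer);
        }

        return output.ToString();
    }
}
```

`XslCompiledTransform` supports XSLT 1.0, which is the version the stylesheet above declares.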



Answer 2:

Use this method to convert XML attributes to XML nodes:

public static void ReplaceAttributesByNodes(XmlDocument document, XmlNode node)
{
    if (document == null)
    {
        throw new ArgumentNullException("document");
    }

    if (node == null)
    {
        throw new ArgumentNullException("node");
    }

    if (node.HasChildNodes)
    {
        foreach (XmlNode tempNode in node.ChildNodes)
        {
            ReplaceAttributesByNodes(document, tempNode);
        }
    }

    if (node.Attributes != null)
    {
        foreach (XmlAttribute attribute in node.Attributes)
        {
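            // Create the element in the owning node's namespace;
            // a null namespace would serialize with xmlns="" in this document
            XmlNode element = document.CreateNode(XmlNodeType.Element, attribute.LocalName, node.NamespaceURI);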

            element.InnerText = attribute.InnerText;

            node.AppendChild(element);
        }

        node.Attributes.RemoveAll();
    }
}


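// How to use it (requires "using System.Xml;")
static void Main()
{
    // The document declares a default namespace, so the XPath needs a prefix bound to it
    string eventNodeXPath = "/s:Something/s:Segments/s:Segment"; // your Segment nodes only

    XmlDocument document = new XmlDocument();
    document.Load(@"your input file full path"); // your input file

    XmlNamespaceManager namespaces = new XmlNamespaceManager(document.NameTable);
    namespaces.AddNamespace("s", "http://www.something.com");

    XmlNodeList nodes = document.SelectNodes(eventNodeXPath, namespaces);

    if (nodes != null)
    {
        foreach (XmlNode node in nodes)
        {
            ReplaceAttributesByNodes(document, node);
        }
    }

    document.Save("your output file full path");
}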


Answer 3:

You could build an extension method to flatten each element:

public static IEnumerable<XElement> Flatten(this XElement element)
{
    // first return ourselves
    yield return new XElement(
        element.Name,

        // Output our text if we have no elements
        !element.HasElements ? element.Value : null,

        // Or the flattened sequence of our children if they exist
        element.Elements().SelectMany(el => el.Flatten()));

    // Then return our own attributes (that aren't xmlns related)
    foreach (var attribute in element.Attributes()
                                     .Where(aa => !aa.IsNamespaceDeclaration))
    {
        // check if the attribute has a namespace,
        // if not we "borrow" our element's
        var isNone = attribute.Name.Namespace == XNamespace.None;
        yield return new XElement(
            !isNone ? attribute.Name
                    : element.Name.Namespace + attribute.Name.LocalName,
            attribute.Value);
    }
}

You would use it from a document-level extension method like this:

public static XElement Flatten(this XDocument document)
{
    // used to fix the naming of the namespaces
    var ns = document.Root.Attributes()
                          .Where(aa => aa.IsNamespaceDeclaration
                                    && aa.Name.LocalName != "xmlns")
                          .Select(aa => new { aa.Name.LocalName, aa.Value });
    return new XElement(
        document.Root.Name,

        // preserve "specific" xml namespaces
        ns.Select(n => new XAttribute(XNamespace.Xmlns + n.LocalName, n.Value)),

        // place root attributes right after the root element
        document.Root.Attributes()
                     .Where(aa => !aa.IsNamespaceDeclaration)
                     .Select(aa => new XAttribute(aa.Name, aa.Value)),
        // then flatten our children
        document.Root.Elements().SelectMany(el => el.Flatten()));
}

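This produces output close to what you have indicated. One difference is that the element created from the Name attribute of Segment ends up as a sibling following Segment rather than as its first child. The xsi:schemaLocation attribute is also problematic, as I've found: it selects a default namespace name (p1), but ultimately it works.

It produces the following: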

<Something xmlns="http://www.something.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xomething.com segments.xsd">
  <Version>4.0.8</Version>
  <Segments>
    <Segment>
      <SegmentField>
        <SegmentIndex>0</SegmentIndex>
        <Name>RecordTypeID</Name>
        <Value>O</Value>
        <Source>Literal</Source>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>1</SegmentIndex>
        <Name>OrderSequenceNumber</Name>
        <Value>Sequence</Value>
        <Source>Calculated</Source>
        <Initial>1</Initial>
      </SegmentField>
      <SegmentField>
        <SegmentIndex>3</SegmentIndex>
        <Name>InstrumentSpecimenID</Name>
        <Value>BarCode</Value>
        <Source>Property</Source>
      </SegmentField>
    </Segment>
    <Name>Test</Name>
  </Segments>
</Something>
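If the exact layout from the question is required (Name as the first child of Segment, Source and Initial as siblings after Value), an in-place variant is one option. The sketch below is not part of the answer above. It rests on a placement rule inferred from the desired output: an element that has child elements receives the new elements as its first children, while an element holding only text receives them as following siblings. `AttributeConverter` and `ConvertAttributesToElements` are hypothetical names.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class AttributeConverter
{
    // Converts attributes on every descendant of root into elements.
    // The root's own attributes are left untouched.
    public static void ConvertAttributesToElements(XElement root)
    {
        // Snapshot the descendants so that edits do not disturb the enumeration.
        foreach (var element in root.Descendants().ToList())
        {
            var attributes = element.Attributes()
                                    .Where(a => !a.IsNamespaceDeclaration)
                                    .ToList();
            if (attributes.Count == 0)
            {
                continue;
            }

            // Reuse the owning element's namespace to avoid xmlns="" on the new elements.
            var created = attributes
                .Select(a => new XElement(element.Name.Namespace + a.Name.LocalName,
                                          a.Value.Trim()))
                .ToList();

            if (element.HasElements)
            {
                element.AddFirst(created);     // e.g. Segment: Name becomes its first child
            }
            else
            {
                element.AddAfterSelf(created); // e.g. Value: Source and Initial follow it
            }

            foreach (var attribute in attributes)
            {
                attribute.Remove();
            }
        }
    }
}
```

It would be called as `AttributeConverter.ConvertAttributesToElements(document.Root)` on a loaded `XDocument`. Attributes that carry their own namespace prefix are moved into the element's namespace by this sketch, which may not be what every document needs.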