My current task involves writing a class library for processing HL7 CDA files.
These HL7 CDA files are XML files with a defined XML schema, so I used xsd.exe to generate .NET classes for XML serialization and deserialization.
The XML Schema contains various types which contain the mixed="true" attribute, specifying that an XML node of this type may contain normal text mixed with other XML nodes.
The relevant part of the XML schema for one of these types looks like this:
<xs:complexType name="StrucDoc.Paragraph" mixed="true">
    <xs:sequence>
        <xs:element name="caption" type="StrucDoc.Caption" minOccurs="0"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="br" type="StrucDoc.Br"/>
            <xs:element name="sub" type="StrucDoc.Sub"/>
            <xs:element name="sup" type="StrucDoc.Sup"/>
            <!-- ...other possible nodes... -->
        </xs:choice>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:ID"/>
    <!-- ...other attributes... -->
</xs:complexType>
The generated code for this type looks like this:
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")]
public partial class StrucDocParagraph {

    private StrucDocCaption captionField;

    private object[] itemsField;

    private string[] textField;

    private string idField;

    // ...fields for other attributes...

    /// <remarks/>
    public StrucDocCaption caption {
        get {
            return this.captionField;
        }
        set {
            this.captionField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))]
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))]
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))]
    // ...other possible nodes...
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlTextAttribute()]
    public string[] Text {
        get {
            return this.textField;
        }
        set {
            this.textField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string ID {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }

    // ...properties for other attributes...
}
If I deserialize an XML element where the paragraph node looks like this:
<paragraph>first line<br /><br />third line</paragraph>
The result is that the item and text arrays are read like this:
itemsField = new object[]
{
    new StrucDocBr(),
    new StrucDocBr(),
};
textField = new string[]
{
    "first line",
    "third line",
};
From this there is no possible way to determine the exact order of the text and the other nodes.
If I serialize this again, the result looks exactly like this:
<paragraph>
    <br />
    <br />first linethird line
</paragraph>
The default serializer just serializes the items first and then the text.
I tried implementing IXmlSerializable on the StrucDocParagraph class so that I could control the deserialization and serialization of the content, but it's rather complex since there are so many classes involved, and I haven't come to a solution yet because I don't know if the effort pays off.
Is there some kind of easy workaround to this problem, or is it even possible by doing custom serialization via IXmlSerializable? Or should I just use XmlDocument or XmlReader/XmlWriter to process these documents?
I had the same problem as this, and came across this solution of altering the .cs generated by xsd.exe. Although it did work, I wasn't comfortable with altering the generated code, as I would need to remember to do it any time I regenerated the classes. It also led to some awkward code which had to test for and cast to XmlNode[] for the mailto elements.
My solution was to rethink the xsd. I ditched the use of the mixed type, and essentially defined my own mixed type.
I had this
and changed to
My generated code now gives me a class myText:
The order of the elements is now preserved in serialization/deserialization, but I do have to test for, cast to, and program against the types myTextTextMailto and myTextText.
Just thought I'd throw that in as an alternative approach which worked for me.
What about
?
To solve this problem I had to modify the generated classes:
1. Move the XmlTextAttribute from the Text property to the Items property and add the parameter Type = typeof(string).
2. Remove the Text property.
3. Remove the textField field.
As a result the generated code (modified) looks like this:
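A minimal, self-contained sketch of that modification, not the full generated class: it keeps only the br element, drops the urn:hl7-org:v3 namespace, and adds an XmlRoot attribute so that a bare paragraph element can be round-tripped on its own. Those simplifications, and the MixedContentDemo class with its Read and Write helpers, are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Stand-in for the generated StrucDocBr type (assumed empty for this sketch).
public class StrucDocBr { }

// Trimmed version of the modified class: the Text property and textField are gone,
// and XmlTextAttribute now sits on Items with Type = typeof(string).
[XmlRoot("paragraph")] // added only so the sketch can deserialize a standalone element
public class StrucDocParagraph
{
    [XmlElement("br", typeof(StrucDocBr))]
    // ...other possible nodes would be listed here...
    [XmlText(Type = typeof(string))]
    public object[] Items { get; set; }

    [XmlAttribute(DataType = "ID")]
    public string ID { get; set; }
}

// Hypothetical helper class, not part of the generated code.
public static class MixedContentDemo
{
    public static StrucDocParagraph Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(StrucDocParagraph));
        using (var reader = new StringReader(xml))
        {
            return (StrucDocParagraph)serializer.Deserialize(reader);
        }
    }

    public static string Write(StrucDocParagraph paragraph)
    {
        var serializer = new XmlSerializer(typeof(StrucDocParagraph));
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        // Suppress the default xsi/xsd namespace declarations on the root element.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", "");

        var output = new StringBuilder();
        using (var writer = XmlWriter.Create(output, settings))
        {
            serializer.Serialize(writer, paragraph, namespaces);
        }
        return output.ToString();
    }

    public static void Main()
    {
        var paragraph = Read("<paragraph>first line<br /><br />third line</paragraph>");

        // Text and elements arrive interleaved in one array, in document order.
        foreach (var item in paragraph.Items)
        {
            var text = item as string;
            Console.WriteLine(text != null ? "text: " + text : "element: " + item.GetType().Name);
        }

        Console.WriteLine(Write(paragraph));
    }
}
```

Code that consumes Items has to check each entry's runtime type, since strings and element objects now share one array.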
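Now if I deserialize an XML element where the paragraph node looks like this:
<paragraph>first line<br /><br />third line</paragraph>
The result is that the item array is read like this:
itemsField = new object[]
{
    "first line",
    new StrucDocBr(),
    new StrucDocBr(),
    "third line",
};
This is exactly what I need: the order of the items and their content is correct.
And if I serialize this again, the result is again correct:
<paragraph>first line<br /><br />third line</paragraph>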
What pointed me in the right direction was the answer by Guillaume; I also thought that it must be possible like this. And then there was this in the MSDN documentation for XmlTextAttribute: the attribute can also be applied to an array of type Object, provided its Type property is set to string, in which case any strings in the array are serialized as XML text.
So the serialization and deserialization work correctly now, but I don't know if there are any other side effects. Maybe it's not possible to generate a schema from these classes with xsd.exe anymore, but I don't need that anyway.