Generate xml from csv data in conformance with giv

Posted 2019-09-13 03:00

Question:

I have an xml schema and csv data to generate corresponding xml files.

I found this question on how to generate XML files from a given CSV file. But with my schema the mapping is not the same for every row, because of the xs:choice element under DataType: the last 5 columns are mapped according to the DataType column.

I could extend the code with a switch statement and map every element separately in each case, but there should be a simpler way to do this. I am new to XML and need some help here.
P.S.: I tried all the tools, commercial and free, that advertise mapping capabilities, and found nothing that does this. Excel also does not work with a denormalized schema.

Any help appreciated, thanks
Update1: Tools: C# command-line code should be the easiest.
Update2: The objective is to generate XML that conforms to the specified schema, using data from the given CSV file.

CSV data

EntityName,FieldName,SQLType,DataType,Nullable,Caption,ColumnIndex,MinStringLength,MaxStringLength,D_Precision,D_Scale
SOChemistryRequirement,CE_Min,"decimal(7, 5)",Decimal,TRUE,CE_Min,82,,,7,5
SOChemistryRequirement,CE_Max,"decimal(7, 5)",Decimal,TRUE,CE_Max,83,,,7,5
SOTestRequirement,Weldability,bit,bool,FALSE,Weldability,107,,,,
SONumber,SONumber,varchar(6),string,FALSE,SONumber,0,,6,,

schema definition:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema id="DataTypes" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- No empty string data type. -->
<xs:simpleType name="NoEmptyString">
  <xs:restriction base="xs:string">
    <xs:minLength value="1" />
  </xs:restriction>
</xs:simpleType>

<!-- Root element. -->
<xs:element name="ExcelFile">
<xs:complexType>
  <xs:sequence>
    <xs:element name="SheetName" type="NoEmptyString" />
    <xs:element name="CellRange" type="NoEmptyString" />
    <!-- Definition of columns. -->
    <xs:element name="Columns">
      <xs:complexType>
        <xs:sequence maxOccurs="unbounded">
          <!-- Information about one column. -->
          <xs:element name="Column">
            <xs:complexType>
              <xs:sequence>
                <!-- Column index in excel file. -->
                <xs:element name="ColumnIndex" type="xs:unsignedByte" />
                <!-- Caption for current column in grid in importing screen. -->
                <xs:element name="Caption" type="xs:string" />
                <!-- Data type of current column. -->
                <xs:element name="DataType">
                  <xs:complexType>
                    <!-- It can be only one from following: -->
                    <xs:choice>
                      <xs:element name="Boolean"/>
                      <xs:element name="Int16">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="MinValue" type="xs:short" minOccurs="0" />
                            <xs:element name="MaxValue" type="xs:short" minOccurs="0" />
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                      <xs:element name="Int32">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="MinValue" type="xs:int" minOccurs="0" />
                            <xs:element name="MaxValue" type="xs:int" minOccurs="0" />
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                      <xs:element name="Int64">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="MinValue" type="xs:long" minOccurs="0" />
                            <xs:element name="MaxValue" type="xs:long" minOccurs="0" />
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                      <xs:element name="Decimal">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="MinValue" type="xs:decimal" minOccurs="0" />
                            <xs:element name="MaxValue" type="xs:decimal" minOccurs="0" />
                            <xs:element name="Precision" type="xs:int" minOccurs="0" />
                            <xs:element name="Scale" type="xs:int" minOccurs="0" />
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                      <xs:element name="DateTime"/>
                      <xs:element name="String">
                        <xs:complexType>
                          <xs:sequence>
                            <xs:element name="MinLength" type="xs:int" minOccurs="0" />
                            <xs:element name="MaxLength" type="xs:int" minOccurs="0" />
                          </xs:sequence>
                        </xs:complexType>
                      </xs:element>
                      <xs:element name="Custom"/>
                    </xs:choice>
                  </xs:complexType>
                </xs:element>
                <!-- Can be NULL value? -->
                <xs:element name="IsNullable" type="xs:boolean" />
                <!-- Entity name. Cannot be NULL. -->
                <xs:element name="EntityName" type="NoEmptyString" />
                <!-- Field name. It can be NULL because of composite target field. -->
                <xs:element name="FieldName" type="xs:string" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

Answer 1:

A quick solution (if this is a one-off) would be to:

  1. use mysqlimport to load the CSV into a temporary MySQL table
  2. use mysqldump -X to output that table as a simple XML file
  3. process the resulting XML with an XSL stylesheet to map it to your required schema.

If you're doing this regularly then something more robust/scriptable would be better, but the principle is the same:

1) convert your CSV to very simple XML in the same format as the CSV:

<csv>
  <record>
    <EntityName>SOChemistryRequirement</EntityName>
    <FieldName>CE_Min</FieldName>
    <SQLType>decimal(7, 5)</SQLType>
    <DataType>Decimal</DataType>
    <Nullable>TRUE</Nullable>
    <Caption>CE_Min</Caption>
    <ColumnIndex>82</ColumnIndex>
    <MinStringLength></MinStringLength>
    <MaxStringLength></MaxStringLength>
    <D_Precision>7</D_Precision>
    <D_Scale>5</D_Scale>
  </record>
  <!-- etc... -->
</csv>
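
As a rough illustration of this first step, here is a minimal Python sketch that turns CSV text into the flat `<csv>/<record>` shape shown above. The function name `csv_to_flat_xml` and the `SAMPLE` constant are made up for this example. It assumes every header name is a valid XML element name and that no row has more fields than the header.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_flat_xml(csv_text: str) -> ET.Element:
    """Turn each CSV row into a <record> whose children are named after the header."""
    root = ET.Element("csv")
    # csv.DictReader honours CSV quoting, so "decimal(7, 5)" stays one value
    # and the surrounding quotes are dropped.
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, "record")
        for name, value in row.items():
            ET.SubElement(record, name).text = value
    return root


# Two rows taken from the CSV data in the question.
SAMPLE = """\
EntityName,FieldName,SQLType,DataType,Nullable,Caption,ColumnIndex,MinStringLength,MaxStringLength,D_Precision,D_Scale
SOChemistryRequirement,CE_Min,"decimal(7, 5)",Decimal,TRUE,CE_Min,82,,,7,5
SONumber,SONumber,varchar(6),string,FALSE,SONumber,0,,6,,
"""

if __name__ == "__main__":
    flat = csv_to_flat_xml(SAMPLE)
    ET.indent(flat)  # pretty-printing helper, available from Python 3.9
    print(ET.tostring(flat, encoding="unicode"))
```

Empty CSV fields come out as empty elements, which matches the `<MinStringLength></MinStringLength>` entries in the sample above.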

2) process that XML through XSL to get an XML doc formatted following your schema.
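
The answer does this mapping in XSL. As an alternative sketch, not the answer's method, the same per-DataType mapping can be written as a lookup table in Python, which avoids one switch case per type. This version reads the CSV rows directly and skips the intermediate XML. Several things here are assumptions rather than facts from the question: the `DATA_TYPES` table (which CSV column feeds which child element, and the spelling normalization such as `bool` to `Boolean`), the fallback to `Custom` for unknown types, and the placeholder values `Sheet1` and `A1:K5` for `SheetName` and `CellRange`, which the CSV does not supply.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed mapping: CSV DataType value (lower-cased) -> schema element name plus
# (child element, CSV column) pairs, listed in the order of the xs:sequence.
DATA_TYPES = {
    "bool": ("Boolean", []),
    "decimal": ("Decimal", [("Precision", "D_Precision"), ("Scale", "D_Scale")]),
    "string": ("String", [("MinLength", "MinStringLength"),
                          ("MaxLength", "MaxStringLength")]),
    "datetime": ("DateTime", []),
}


def build_column(row: dict) -> ET.Element:
    """Build one <Column>, emitting children in the order the schema requires."""
    column = ET.Element("Column")
    ET.SubElement(column, "ColumnIndex").text = row["ColumnIndex"]
    ET.SubElement(column, "Caption").text = row["Caption"]
    data_type = ET.SubElement(column, "DataType")
    # Unknown types fall back to <Custom/> (an assumption, not from the source).
    name, children = DATA_TYPES.get(row["DataType"].strip().lower(), ("Custom", []))
    chosen = ET.SubElement(data_type, name)
    for child_name, csv_column in children:
        value = row.get(csv_column)
        if value:  # these children are all minOccurs="0", so blanks are skipped
            ET.SubElement(chosen, child_name).text = value
    # xs:boolean accepts "true"/"false" but not "TRUE"/"FALSE".
    ET.SubElement(column, "IsNullable").text = row["Nullable"].strip().lower()
    ET.SubElement(column, "EntityName").text = row["EntityName"]
    ET.SubElement(column, "FieldName").text = row["FieldName"]
    return column


def build_excel_file(csv_text: str, sheet_name: str, cell_range: str) -> ET.Element:
    """Wrap all columns in the <ExcelFile> root element defined by the schema."""
    root = ET.Element("ExcelFile")
    ET.SubElement(root, "SheetName").text = sheet_name
    ET.SubElement(root, "CellRange").text = cell_range
    columns = ET.SubElement(root, "Columns")
    for row in csv.DictReader(io.StringIO(csv_text)):
        columns.append(build_column(row))
    return root


# Three rows taken from the CSV data in the question.
SAMPLE = """\
EntityName,FieldName,SQLType,DataType,Nullable,Caption,ColumnIndex,MinStringLength,MaxStringLength,D_Precision,D_Scale
SOChemistryRequirement,CE_Min,"decimal(7, 5)",Decimal,TRUE,CE_Min,82,,,7,5
SOTestRequirement,Weldability,bit,bool,FALSE,Weldability,107,,,,
SONumber,SONumber,varchar(6),string,FALSE,SONumber,0,,6,,
"""

if __name__ == "__main__":
    # "Sheet1" and "A1:K5" are placeholders; the CSV carries no such values.
    doc = build_excel_file(SAMPLE, sheet_name="Sheet1", cell_range="A1:K5")
    ET.indent(doc)  # available from Python 3.9
    print(ET.tostring(doc, encoding="unicode"))
```

Supporting another type, for example the schema's Int16, Int32 or Int64, means adding one entry to `DATA_TYPES`. The standard library does not validate against an XSD, so checking the result against the schema would need a third-party library such as lxml.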