Export data from Excel into XML with variable elem

2019-09-08 18:45发布

I need to create XML documents but want to automate the process using Excel because I can't copy and paste hundreds of values into the XML document. I am new to this feature of Excel though and am having trouble with the layout of the elements. This is an example of the final XML document:

<?xml version="1.0" encoding="UTF-8"?>
<ANALYSIS_SET>
    <ANALYSIS alias="Test Analysis 1" center_name="NeurOmics" broker_name="EGA" xmlns:com="SRA.common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <TITLE>Bed data hsp 1</TITLE>
        <DESCRIPTION>Test of NeurOmics Tuebingen HSP Data 1</DESCRIPTION>
        <STUDY_REF accession="EGAS00001000697"></STUDY_REF>
        <SAMPLE_REF accession="EGAN00001409012"></SAMPLE_REF>
        <RUN_REF accession="EGAR00001446342"></RUN_REF>
        <SAMPLE_REF accession="EGAN00001409040"></SAMPLE_REF>
        <RUN_REF accession="EGAR00001446343"></RUN_REF>
        <SAMPLE_REF accession="EGAN00001404184"></SAMPLE_REF>
        <RUN_REF accession="EGAR000016818343"></RUN_REF>
        <SAMPLE_REF accession="EGAN0000148740"></SAMPLE_REF>
        <RUN_REF accession="EGAR00001431643"></RUN_REF>
        <ANALYSIS_TYPE>
            <SEQUENCE_ANNOTATION/>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TUE-hpHSPv2_TargetRegions.bed" filetype="bed"/>
        </FILES>
    </ANALYSIS>
    <ANALYSIS alias="Test Analysis 2" center_name="NeurOmics" broker_name="EGA" xmlns:com="SRA.common" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <TITLE>BED data ataxia 1</TITLE>
        <DESCRIPTION>Test of NeurOmics Tuebingen Ataxia Data 1</DESCRIPTION>
        <STUDY_REF accession="EGAS00001000694"></STUDY_REF>
        <SAMPLE_REF accession="EGAN00001408934"></SAMPLE_REF>
        <RUN_REF accession="EGAR00001446272"></RUN_REF>
        <SAMPLE_REF accession="EGAN00001408935"></SAMPLE_REF>
        <RUN_REF accession="EGAR00001446271"></RUN_REF>
        <ANALYSIS_TYPE>
            <SEQUENCE_ANNOTATION/>
        </ANALYSIS_TYPE>
        <FILES>
            <FILE filename="TUE-hpSCAv1_TargetRegions.bed" filetype="bed"/>
        </FILES>
    </ANALYSIS>

The part I'm having trouble with are the elements SAMPLE_REF and RUN_REF as within each ANALYSIS section there can be any number of these paired values. The rest I think I can figure out, for now I'm just trying to get that part in a format I can copy into the XML document as that's the part that will take all the time. Anyway, I set up this XSD file for Excel:

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="ANALYSIS_SET">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="ANALYSIS" minOccurs="0" maxOccurs="unbounded">
                    <xsd:complexType>
                        <xsd:sequence>
                            <!-- single sample information -->
                            <xsd:element name="refname" type="xsd:string"/>
                            <xsd:element name="accession" type="xsd:string"/>
                        </xsd:sequence>
                    </xsd:complexType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

I then link this to an Excel spreadsheet which has a column for refname and a column for accession, and then export to XML. This is what I get:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ANALYSIS_SET>
    <ANALYSIS>
        <refname>GS130178</refname>
        <accession>EGAR00001446342</accession>
    </ANALYSIS>
    <ANALYSIS>
        <refname>GS130627</refname>
        <accession>EGAR00001446343</accession>
    </ANALYSIS>
    <ANALYSIS>
        <refname>GS130781</refname>
        <accession>EGAR00001446344</accession>
    </ANALYSIS>
    <ANALYSIS>
        <refname>GS130782</refname>
        <accession>EGAR00001446345</accession>
    </ANALYSIS>
    <ANALYSIS>
        <refname>GS130783</refname>
        <accession>EGAR00001446346</accession>
    </ANALYSIS>
    <ANALYSIS>
        <refname>GS130843</refname>
        <accession>EGAR00001446348</accession>
    </ANALYSIS>
<ANALYSIS_SET>

The problem I have is that I don't want each pair to be nested within an ANALYSIS tag, I want all the refname-accession pairs to be within a single ANALYSIS section.

标签: xml excel xsd
0条回答
登录 后发表回答