Export denormalized data from excel to xml

2019-04-08 21:28发布

问题:

We are trying to export an excel table with "Denormalized Data" to xml. The table headers are as follows:

| AssetManager Code | AssetManager Date | Portfolio Code | Portfolio Name | MarketValue | NetCashFlow | Field | Field Code | Field Name |

The AssetManager Code and AssetManager Date are the same throughout, the rest of the columns contain variable data.

Here is an example of the xml output we want:

<AssetManager Code="PFM" Date="20130117">                   
    <Portfolios>            
        <Portfolio Code="CC PSP" Name="Consilium Capital">      
            <MarketValue>5548056.51</MarketValue>   
            <NetCashFlow>0</NetCashFlow>    
            <UserFields>    
                <Field Code="AM UCGT" Name="AM daily Unrealised CG">4375</Field>
            </UserFields>   
        </Portfolio>        
        <Portfolio Code="MM (FC)" Name="Money Market UT (FC)">      
            <MarketValue>28975149.6500735</MarketValue> 
            <NetCashFlow>0</NetCashFlow>    
            <UserFields>    
                <Field Code="UCGT" Name="AM daily Unrealised CG">0</Field>
            </UserFields>   
        </Portfolio>        
    </Portfolios>           
</AssetManager> 

And our xsd file containing the mappings:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="AssetManager">
    <xs:complexType>
        <xs:sequence>
                    <xs:element ref="Portfolios" />
            </xs:sequence>
        <xs:attribute name="Code" type="xs:string"/>
            <xs:attribute name="Date" type="xs:string"/>
    </xs:complexType>
</xs:element>
<xs:complexType name="FieldType">
    <xs:simpleContent>
        <xs:extension base="xs:decimal">
            <xs:attribute name="Code" type="xs:string"/>
                <xs:attribute name="Name" type="xs:string"/>
        </xs:extension>
    </xs:simpleContent>
</xs:complexType>
<xs:element name="Portfolios">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Portfolio">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="MarketValue" type="xs:decimal"/>
        <xs:element name="NetCashFlow" type="xs:decimal"/>
        <xs:element name="UserFields">
          <xs:complexType>
            <xs:sequence>
                    <xs:element name="Field" type="FieldType"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="Code" type="xs:string"/>
      <xs:attribute name="Name" type="xs:string"/>
    </xs:complexType>
              </xs:element>
            </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

At the very least we'd like to know why excel considers data denormalised?

Any help will be much appreciated.

回答1:

First of all, you have a problem with the posted XSD. The Portfolio should have the maxOccurs set to a value greater than 1 - otherwise you're not matching the sample XML and you wouldn't get the "denormalized data" error when verifying your map in Excel.

This article should explain common errors you get with Excel maps - yours included.

I guess what you did was to drag-drop the root - this will not work with repeating elements.

You may get around with what I did below; it may not work for your concrete example, but it should give you an idea.

Modified your XSD to account for repeating particles:

<?xml version="1.0" encoding="UTF-8"?>
<!-- XML Schema generated by QTAssistant/XSD Module (http://www.paschidev.com) -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="AssetManager">
        <xs:complexType>
            <xs:sequence>
                <xs:element ref="Portfolios"/>
            </xs:sequence>
            <xs:attribute name="Code" type="xs:string"/>
            <xs:attribute name="Date" type="xs:string"/>
        </xs:complexType>
    </xs:element>
    <xs:complexType name="FieldType">
        <xs:simpleContent>
            <xs:extension base="xs:decimal">
                <xs:attribute name="Code" type="xs:string"/>
                <xs:attribute name="Name" type="xs:string"/>
            </xs:extension>
        </xs:simpleContent>
    </xs:complexType>
    <xs:element name="Portfolios">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="Portfolio" minOccurs="0" maxOccurs="unbounded">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="MarketValue" type="xs:decimal"/>
                            <xs:element name="NetCashFlow" type="xs:decimal"/>
                            <xs:element name="UserFields">
                                <xs:complexType>
                                    <xs:sequence>
                                        <xs:element name="Field" type="FieldType"/>
                                    </xs:sequence>
                                </xs:complexType>
                            </xs:element>
                        </xs:sequence>
                        <xs:attribute name="Code" type="xs:string"/>
                        <xs:attribute name="Name" type="xs:string"/>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Drag the Code and Date only on the first sheet; rename that to something else if you want.

Drag Portfolios to another sheet.

Fill in some data and Export; this is what I got:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<AssetManager Code="a" Date="b">
    <Portfolios>
        <Portfolio Code="aa" Name="bb">
            <MarketValue>10</MarketValue>
            <NetCashFlow>100</NetCashFlow>
            <UserFields>
                <Field/>
            </UserFields>
        </Portfolio>
        <Portfolio Code="aa" Name="bb">
            <MarketValue>10</MarketValue>
            <NetCashFlow>100</NetCashFlow>
            <UserFields>
                <Field/>
            </UserFields>
        </Portfolio>
    </Portfolios>
</AssetManager>

It looks pretty close. It should help you move forward if not with the solution itself, then with your investigations.



回答2:

I wrote up some code to write a pivot table to a primitive XML format. Here I am not following any pre-set schema, just writing the pivot table heirarchy to XML. For this to work, you must use the outline form but not-compact (each new level should start a new column). Also the code expects no subtotals or grand totals, and only one level of numeric data in the data field is expected.

Your PT will be in an acceptable XML format with nodes named according to the PT headers, but the sub group titles come out as attributes unhelpfully named 'name ='. So you get XML that reads like - "Folder contents here".

See code below: one other caveat, this has not been cleaned up very well. there are some lines that will never get hit from old implementations of the code. Also, there is a stop right before the end for debugging - in case you need to make a change to the output and redo the file writing steps. Output is written as a text file named 'txt.txt' in the C: drive.

Edit and re-use as needed.

Private Sub XMLWriter()
Dim sht As Worksheet: Set sht = ActiveSheet
    'Debug.Print "The current Sheet is " & sht.Name
Dim pt As PivotTable: Set pt = sht.PivotTables(1)
    'Debug.Print "Pivot Table name is " & pt.Name
Dim begin As String: begin = pt.TableRange1.Cells(1, 1).Address

Dim rows As Integer: rows = pt.TableRange1.rows.Count
Dim LastCell As Range: Set LastCell = pt.TableRange1.Cells(rows, 1)

 If LastCell.PivotCell.PivotCellType = xlPivotCellGrandTotal Then Set LastCell = LastCell.Offset(-1, 0)
 If LastCell.PivotCell.PivotCellType = xlPivotCellSubtotal Then Stop 'not implemented routine does not expect subtotals in rows - (will not create good xml)

Dim LastRow As Integer: LastRow = LastCell.Row

Dim celly As Range: Set celly = sht.Range(begin)
Dim level As Integer: level = 1
Dim levels As Integer: levels = 0 ' PRECEEDING CODE INITIALIZED VARIABLES - Depends on Pivot table in active worksheet (first on sheet, assumes only one on sheet)

Do 'determines nesting depth
    If celly.Offset(0, levels + 1).Value = "" Then
        levels = levels + 1
        Exit Do
    Else: levels = levels + 1
    End If
Loop
'Stop
Dim dataFieldPresent As Boolean
Dim ShutDown As Boolean
If celly.Offset(levels - 1, levels - 1).PivotCell.PivotCellType = xlPivotCellValue Then
levels = levels - 1
dataFieldPresent = True
End If
'Stop


Dim ary() As String ' initializes array
ReDim ary(1 To levels, 7) As String ' based on nesting depth, seven placeholders set to accomadate data
Dim n As Integer
For n = LBound(ary) To UBound(ary)      ' populates 'folder' names from pivottable headings
    ary(n, 0) = celly.Offset(0, n - 1).Value  ' 0 based folder holds name, or already completed xml group's string/data
    ary(n, 1) = gettabs(n) & Cap(ary(n, 0))          ' 1 based folder holds node's'front cap' following xml syntax
    ary(n, 2) = Cap("/" & ary(n, 0)) & vbCrLf   ' 2 based folder holds 'end cap' to close node
    ary(n, 0) = ""
Next

Set celly = celly.Offset(1, 0)
If celly.Value = "" Then Stop ' error occurred, there must be a cell in first column position at first row under Row Heading

ary(level, 3) = nameElement(celly.Value) & vbCrLf ' get value in current cell to name folder   'ary(level, 4) = nameElement("/" & celly.Value) ' level 4 was created for old implementation, no longer used

Dim tabs As String
'Stop
'tabs = gettabs(level)
ary(level, 6) = ary(level, 2) & vbCrLf
ary(level, 5) = ary(level, 1) & ary(level, 3) & vbCrLf

Dim lvlref As Integer: lvlref = 1
Dim addcrlf As String: addcrlf = vbCrLf

Do
    Set celly = celly.Offset(1, -(celly.Column - 1))
'    If celly.Row = 780 Then Stop

    If celly.Row = LastRow Then ShutDown = True


    If celly.Value = "Liabilities" Then Stop
    If Not celly.Value = "" Then
        closetoplevel
        level = 1
        ary = levelup(ary, level, lvlref, levels)
            ary(level, 3) = nameElement(celly.Value) & vbCrLf
 '           ary(level, 4) = nameElement("/" & celly.Value)
            ary(level, 5) = ary(level, 5) & gettabs(level) & ary(level, 3)
            ary(level, 6) = ary(level, 3)
            ary(level, 7) = celly.Value
        writeout ary(1, 0)
'        Stop
    Else
        level = 2
        Do
            Set celly = celly.Offset(0, 1)
             On Error GoTo Term:
             Nam = celly.PivotCell.PivotCellType ' error trap - should always be in pivot table
            On Error GoTo 0
            If celly.Value = "" Then
                level = level + 1
            Else
                Exit Do
            End If
        Loop

        getPosition (celly.Cells(1))

'        If level = lvlref And level > 2 Then Stop ' update: seems to work fine after refactoring code originally ('not implemented - code does not expect given schema structure"
        If level < lvlref Then
            'Stop
            ary = levelup(ary, level, lvlref, levels)
            'getPosition (celly.Cells(1))
            'Stop
            lvlref = level - 1
            GoTo ReInsertionPoint:


        Else


ReInsertionPoint:







            If level = levels Then
                addcrlf = ""
            Else: addcrlf = vbCrLf
            End If

            ary(level, 3) = nameElement(celly.Value) & addcrlf
            If level = levels And dataFieldPresent = True Then
'                Stop
                ary(level, 3) = ary(level, 3) & CStr(celly.Offset(0, 1).Value)
            End If
            ary(level, 5) = ary(level, 5) & ary(level, 3)
            ary(level, 6) = ary(level, 3)
            ary(level, 7) = celly.Value

         If level = levels Then ' XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX  not operating properly failing to add last item (number & accoiunt) of each section
'            Stop

                Dim nextlevel As Integer: nextlevel = 1
                'Stop
                Dim nextlvlcell As Range: Set nextlvlcell = celly.Offset(1, -(level - 1))
                Debug.Print nextlvlcell.Address
                Do
                    If nextlvlcell.Value = "" Then
                        If nextlvlcell.Row > LastRow Then
                            nextlevel = 1
                            GoTo Closure:
                        Else
                            Set nextlvlcell = nextlvlcell.Offset(0, 1)
                            nextlevel = nextlevel + 1
                        End If
                    Else: Exit Do
                    End If
                Loop
                Debug.Print nextlvlcell.Address
                If level - nextlevel > 1 Then
Closure:
                    'Stop
                    ary = pushup(ary(), level, levels, lvlref)
                    'Stop
                    ary = levelup(ary(), level - 1, levels, lvlref)
                Else

                    ary = pushup(ary, level, levels, lvlref)
                End If
            End If

        'Stop

        End If
    End If
lvlref = level
If ShutDown = True Then
    level = 1
    ary = levelup(ary, level, lvlref, levels)
    Exit Do
End If
Loop

writeout "<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>" & vbCrLf & "<Root xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"">" & vbCrLf & ary(1, 0) & "</Root>"

Stop
End
Term:
Stop

writeout "<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>" & vbCrLf & "<Root xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"">" & vbCrLf & "<xmldoc>" & vbCrLf & ary(1, 0) & "</xmldoc>" & vbCrLf & "</Root>"
'writeout (ary(1, 0))
Stop
Exit Sub
'created by derik bingner Jan 2014 www.dbexcelaccounting.blogspot.com

End Sub
Private Sub getPosition(x As Range)
Debug.Print "Cell addy is " & x.Address & ". Cell level and text is " & x.Column & " - " & x.Value
End Sub
Private Function gettabs(x As Integer, Optional y As Integer) As String
For n = 1 To (x) ' - y) old implementation allowed offsets
gettabs = vbTab & "" & gettabs
Next
'If ((x * 2) - y) = 8 Then Stop

End Function

Private Function cnam(c As Range)
cnam = c.Value
End Function
Private Function Cap(x As String) As String
If Left(x, 1) = "/" Then
Cap = "</" & Right(x, Len(x) - 1) & ">"
Else: Cap = "<" & x & " name="""
End If
End Function
Private Function nameElement(x As String) As String
nameElement = x & """>"
End Function

Private Sub closetoplevel()
'Stop
'not implemented
End Sub

Private Function pushup(r() As String, l As Integer, s As Integer, ref As Integer)
Dim x As Integer: x = ref - l - 1
'Stop



'If ref <> s Then
'    MsgBox "error, structure issue - not implemented"
'    Stop
'End If
Dim y As Integer
If x > 1 Then ' tests if this function needs to be run recursively to step up multiple levels
    For y = 1 To x - 1
    Dim groupnumber As Integer
    'Stop
    If y <> 2 And InStr(1, r(l + 1, 0), "<") > 0 Then
        groupnumber = 2
    Else: groupnumber = 2 + y - 1
    End If
    'If groupnumber = 2 Then Stop
    Call rlevelup(r, l + (x - y), s, ref, groupnumber) ' recursive section
    'Stop
    Next
End If
'Stop
If r(l, 5) = r(l, 3) Then ' this triggers for first item in group
    'Stop
    r(l, 5) = r(l, 1) & r(l, 5) & r(l, 2)
Else
    If y = 0 Then
    r(l, 5) = r(l, 1) & r(l, 5) & r(l, 2) & vbCrLf
    Stop
    Else
        r(l, 5) = r(l, 5) & vbCrLf & r(l + 1, 5) & vbCrLf & gettabs(l, -1) & r(l, 4) & vbCrLf
'        Stop
    End If
End If

'Debug.Print r(l, 5)
Dim PlaceHolder As String: PlaceHolder = r(l, 0)

    If Left(PlaceHolder, 1) = vbTab Or Left(PlaceHolder, 4) = "    " Or Left(PlaceHolder, 1) = "<" Then
        'Debug.Print PlaceHolder
    Else
        PlaceHolder = ""
    End If

r(l, 0) = PlaceHolder & r(l, 5)
'Stop
For n = LBound(r) To UBound(r)
    If n >= l Then
        For i = 3 To 7
            If r(n, i) <> r(l, 5) Then r(n, i) = ""
        Next
    End If
Next

r(l, 3) = r(l, 5)
r(l, 5) = ""


'Stop
'not implemented

pushup = r
End Function

Private Function levelup(r() As String, l As Integer, s As Integer, ref As Integer)
Dim x As Integer: x = s - l - 1
'If x > 3 Then Stop
'r = pushup(r(), s - 1, s, ref)


'If ref <> s Then
'    MsgBox "error, structure issue - not implemented"
'    Stop
'End If
Dim y As Integer
If x > 1 Then ' tests if this function needs to be run recursively to step up multiple levels
    For y = 1 To x - 1
    Dim groupnumber As Integer
    'Stop
    If y <> 2 And InStr(1, r(l + 1, 0), "<") > 0 Then
        groupnumber = 2
    Else: groupnumber = 2 + y - 1
    End If
'Stop
    'If groupnumber = 2 Then Stop
    Call rlevelup(r, l + (x - y), s, ref, groupnumber) ' recursive section
    'Stop
    Next
End If
'Stop
If r(l, 5) = r(l, 3) Then ' this triggers for first item in group
    'Stop
    r(l, 5) = r(l, 1) & r(l, 5) & gettabs(l) & r(l, 2) & vbCrLf
Else
    If y = 0 Then
    r(l, 5) = r(l, 1) & r(l, 5) & gettabs(l) & r(l, 2) & vbCrLf
    Stop
    Else
        r(l, 5) = r(l, 5) & vbCrLf & r(l + 1, 5) & vbCrLf & gettabs(l, -1) & r(l, 4) & vbCrLf
'        Stop
    End If
End If

'Debug.Print r(l, 5)
Dim PlaceHolder As String: PlaceHolder = r(l, 0)

    If Left(PlaceHolder, 1) = vbTab Or Left(PlaceHolder, 4) = "    " Or Left(PlaceHolder, 1) = "<" Then
        'Debug.Print PlaceHolder
    Else
        PlaceHolder = ""
    End If

r(l, 0) = PlaceHolder & r(l, 1) & r(l, 3) & r(l + 1, 0) & gettabs(l) & r(l, 2)
r(l + 1, 0) = ""
'Stop
For n = LBound(r) To UBound(r)
    If n >= l Then
        For i = 3 To 7
            If r(n, i) <> r(l, 5) Then r(n, i) = ""
        Next
    End If
Next

'r(l, 3) = r(l, 5)
r(l, 5) = ""


'Stop
'not implemented

levelup = r
End Function




Private Function rlevelup(r() As String, l As Integer, s As Integer, ref As Integer, Optional groupnumber As Integer)
Dim x As Integer: x = ref - l - 1
'Stop
'called by level up


'If ref <> s Then
'    MsgBox "error, structure issue - not implemented"
'    Stop
'End If
Dim y As Integer
If x > 1 Then ' tests if this function needs to be run recursively to step up multiple levels
    For y = 1 To x - 1
    'Dim groupnumber As Integer
    'Stop
    'If y <> 2 And InStr(1, r(l + 1, 0), "<") > 0 Then
        groupnumber = 2
    'Else: groupnumber = 2 + y - 1
    'End If
    'If groupnumber = 2 Then Stop
    'Call rpushup(r, l + (x - y), s, ref, groupnumber) ' recursive section
    'Stop
    Next
End If
'Stop
If r(l, 5) = r(l, 3) Then ' this triggers for first item in group
    'Stop
    r(l, 5) = r(l, 1) & r(l, 5) & gettabs(l) & r(l, 2) & vbCrLf
Else
    If y = 0 Then
    r(l, 5) = r(l, 1) & r(l, 5) & gettabs(l) & r(l, 2) & vbCrLf
    Stop ' delete this comment when stop hit programmatically - may be deletable
    Else
        r(l, 5) = r(l, 5) & vbCrLf & r(l + 1, 5) & vbCrLf & gettabs(l, -1) & r(l, 4) & vbCrLf
'        Stop
    End If
End If

'Debug.Print r(l, 5)
Dim PlaceHolder As String: PlaceHolder = r(l, 0)

    If Left(PlaceHolder, 1) = vbTab Or Left(PlaceHolder, 4) = "    " Or Left(PlaceHolder, 1) = "<" Then
        'Debug.Print PlaceHolder
    Else
        PlaceHolder = ""
    End If

r(l, 0) = PlaceHolder & r(l, 1) & r(l, 3) & r(l + 1, 0) & gettabs(l) & r(l, 2)
r(l + 1, 0) = ""
'Stop
For n = LBound(r) To UBound(r)
    If n >= l Then
        For i = 3 To 7
            If r(n, i) <> r(l, 5) Then r(n, i) = ""
        Next
    End If
Next

'r(l, 3) = r(l, 5)
r(l, 5) = ""


'Stop
'not implemented
'writeout (r(l, 0))
rlevelup = r
End Function

Private Sub writeout(s As String)

Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Dim oFile As Object
Set oFile = fso.CreateTextFile("c:/txt.txt")
oFile.WriteLine s
oFile.Close
Set fso = Nothing
Set oFile = Nothing

End Sub