I want to get the values in the cells which are in an .ods
file's /content.xml
using an XSLT 2.0 transform. Regardless of blanks cells I want to get the values of column D
for example.
It could be any column to iterate over.
The same thing for all the other columns and I want to put them into different nodes in XML output.
For example, this content.xml
file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<office:document-content
xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0" xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0" xmlns:fo="urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:number="urn:oasis:names:tc:opendocument:xmlns:datastyle:1.0" xmlns:svg="urn:oasis:names:tc:opendocument:xmlns:svg-compatible:1.0" xmlns:msoxl="http://schemas.microsoft.com/office/excel/formula">
<office:font-face-decls>
<style:font-face style:name="Calibri" svg:font-family="Calibri"/>
</office:font-face-decls>
<office:automatic-styles>
<style:style style:name="ce1" style:family="table-cell" style:parent-style-name="Default" style:data-style-name="N0"/>
<style:style style:name="co1" style:family="table-column">
<style:table-column-properties fo:break-before="auto" style:column-width="1.69333333333333cm"/>
</style:style>
<style:style style:name="ro1" style:family="table-row">
<style:table-row-properties style:row-height="15pt" style:use-optimal-row-height="true" fo:break-before="auto"/>
</style:style>
<style:style style:name="ta1" style:family="table" style:master-page-name="mp1">
<style:table-properties table:display="true" style:writing-mode="lr-tb"/>
</style:style>
</office:automatic-styles>
<office:body>
<office:spreadsheet>
<table:calculation-settings table:case-sensitive="false" table:search-criteria-must-apply-to-whole-cell="false"/>
<table:table table:name="Folha1" table:style-name="ta1">
<table:table-column table:style-name="co1" table:number-columns-repeated="16384" table:default-cell-style-name="ce1"/>
<table:table-row table:style-name="ro1">
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item1</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="2" table:style-name="ce1"/>
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item3</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="16380" table:style-name="ce1"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item2</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="16383" table:style-name="ce1"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell table:number-columns-repeated="3" table:style-name="ce1"/>
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item4</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="3" table:style-name="ce1"/>
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item5</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="16376"/>
</table:table-row>
<table:table-row table:number-rows-repeated="5" table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell table:number-columns-repeated="7" table:style-name="ce1"/>
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item5</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="16376"/>
</table:table-row>
<table:table-row table:number-rows-repeated="3" table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
<table:table-row table:style-name="ro1">
<table:table-cell table:number-columns-repeated="7" table:style-name="ce1"/>
<table:table-cell office:value-type="string" table:style-name="ce1">
<text:p>item6</text:p>
</table:table-cell>
<table:table-cell table:number-columns-repeated="16376"/>
</table:table-row>
<table:table-row table:number-rows-repeated="1048561" table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
</table:table>
<table:table table:name="Folha2" table:style-name="ta1">
<table:table-column table:style-name="co1" table:number-columns-repeated="16384" table:default-cell-style-name="ce1"/>
<table:table-row table:number-rows-repeated="1048576" table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
</table:table>
<table:table table:name="Folha3" table:style-name="ta1">
<table:table-column table:style-name="co1" table:number-columns-repeated="16384" table:default-cell-style-name="ce1"/>
<table:table-row table:number-rows-repeated="1048576" table:style-name="ro1">
<table:table-cell table:number-columns-repeated="16384"/>
</table:table-row>
</table:table>
</office:spreadsheet>
</office:body>
</office:document-content>
<?xml version="1.0" encoding="UTF-8"?>
Basic XSLT Example:
<xsl:template match="office:document-content/office:body/office:spreadsheet//table:table[@table:name='Folha1']">
<xsl:variable name="description" select="table:table-row/table:table-cell[4]/text:p"/>
<!--xsl:if test="$description != '.'"-->
<Description>
<xsl:value-of select="$description"/>
</Description>
</xsl:template>
The output of this example gives me "item5", okay, but I really don't know the logic behind this - how it is counting etc.
As far as I know in XSLT empty cells are not really counted!
Am I right?
But somehow an .ods
file does save these empty cells and the structure of the document as well so that it can retrieve it the next time you open the file.
But how?
So what I want to know is:
how do we really select the items with counting these "empty" cells as well and find the certain place of each item in the sheet.
The desired output would be like:
<group1>
<Description>item</Description>
<Description>item2</Description>
<group4>
<Description>item3</Description>
<Description>item4</Description>
<group8>
<Description>item5</Description>
<Description>item6</Description>
<Description>item7</Description>
I extended my code for the solution to this problem. In this SO answer I explained a small part of how
.ods
files are structured and how to find a specific cell. The new code extends that to blocks of cells in the common spreadsheet notation likeand makes use of XSLT-2.0 features. I do not claim this to be the most elegant solution, but the tests so far look promising.
Here is the code to retrieve the content of a block of cells:
The name of the
content.xml
file is specified in theODSpath
variable at the beginning. So the source file is not passed to the XSLT processor but rather via a variable containing its name. The desired cells/columns are specified with the stringA1:A20,D1:D20,H1:h20
in the demo template at the bottom.I tested this with your above
content.xml
which failed because LibreOffice said that the file is damaged(???). But (re)creating the file with the content from the image succeeded and resulted in the following output:To get the exact result you described above just split the three columns to separate calls to
GetMultipleCellsNodes
:The output of this demo template is: