Read Excel and get the row number and file extensi

2020-02-16 02:54发布

问题:

I am using Apache POI with Java 1.8 in my application. In my application, I try to read Excel and get the embedded objects.

I need to know how to get the row number and file extensions for each embedded OLE object.

Workbook workbook = WorkbookFactory.create(new File(path));

XSSFWorkbook fWorkbook = (XSSFWorkbook) workbook;

List<PackagePart> embeddedDocs = fWorkbook.getAllEmbedds();

To get embeddedDocs.getContentType Which returns the application/vnd.openxmlformats-officedocument.oleObject.

But is there anyway where we can get the file extensions (i.e pdf,ppt,mp3) as which is returned by the MimeType. And which way to get row number of embedded objects. Any ideas / Coding logic to resolve this will be very useful.

回答1:

The following is working for - I guess - the usual suspects. I've tested it with .xls/x on the POI trunk, which will be POI 4.1.0, but should work with POI 4.0.1 too.

Known issues are:

  • an object wasn't embedded based on a file, then you don't get a filename. This probably also applies to most .xls files.

  • a .xlsx only contains a vmlDrawing*.xml, then the DrawingPatriach can't be extracted and no shapes can be determined

  • a shape in a .xlsx wasn't anchored via twoCellAnchor, then you don't get a ClientAnchor

Code:

import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.hpsf.ClassIDPredefined;
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.ss.usermodel.ChildAnchor;
import org.apache.poi.ss.usermodel.ClientAnchor;
import org.apache.poi.ss.usermodel.Drawing;
import org.apache.poi.ss.usermodel.ObjectData;
import org.apache.poi.ss.usermodel.Shape;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.junit.Test;

public class TestEmbed {
    @Test
    public void extract() throws IOException {
//        String xlsName = "test-data/spreadsheet/WithEmbeddedObjects.xls";
        String xlsName = "embed.xlsx";
        try (FileInputStream fis = new FileInputStream(xlsName);
             Workbook xls = WorkbookFactory.create(fis)) {
            for (Sheet s : xls) {
                Drawing<?> dp = s.getDrawingPatriarch();
                if (dp != null) {
                    for (Shape sh : dp) {
                        if (sh instanceof ObjectData) {
                            ObjectData od = (ObjectData)sh;
                            String filename = od.getFileName();
                            String ext = null;

                            if (filename != null && !filename.isEmpty()) {
                                int i = filename.lastIndexOf('.');
                                ext = (i > 0) ? filename.substring(i) : ".bin";
                            } else {
                                String ct = null;

                                try {
                                    DirectoryEntry de = od.getDirectory();

                                    if (de != null) {
                                        ClassIDPredefined ctcls = ClassIDPredefined.lookup(de.getStorageClsid());
                                        if (ctcls != null) {
                                            ext = ctcls.getFileExtension();
                                        }
                                    }
                                } catch (Exception ignore) {
                                }
                            }

                            if (ext == null) {
                                ext = ".bin";
                            }

                            ChildAnchor chAnc = sh.getAnchor();
                            if (chAnc instanceof ClientAnchor) {
                                ClientAnchor anc = (ClientAnchor) chAnc;
                                System.out.println("Rows: " + anc.getRow1() + " to " + anc.getRow2() + " - filename: "+filename+" - ext: "+ext);
                            }
                        }
                    }
                }
            }
        }
    }
}