Read Xlsx file in PhpSpreadsheet

2020-06-04 05:47发布

问题:

I want to read an xlsx file that was created in Microsoft Excel, but when I run the following code...

$Source_File = "test.xlsx";
$Spreadsheet = \PhpOffice\PhpSpreadsheet\IOFactory::load($Source_File);

...I receive the following error:

Fatal error: Uncaught PhpOffice\PhpSpreadsheet\Reader\Exception: Unable to identify a reader for this file in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php:163
Stack trace:
  #0 /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php(93): PhpOffice\PhpSpreadsheet\IOFactory::createReaderForFile('file:///home/ar...')
  #1 /var/www/html/Function_Spreadsheet.php(480): PhpOffice\PhpSpreadsheet\IOFactory::load('file:///home/ar...')
  #2 /var/www/html/Function_Home.php(3747): Spreadsheet_Reader_1('/var/www/html/F...', 3745, Array, Array)
  #3 {main} thrown in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php on line 163

I get the same error if I instead use $Spreadsheet = IOFactory::load($Source_File);

I get the following error if I instead use $Spreadsheet = $reader->load($Source_File);

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 311

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 313

Notice: Trying to get property 'Relationship' of non-object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 350

Warning: Invalid argument supplied for foreach() in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 350

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 311

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 313

Notice: Trying to get property 'Relationship' of non-object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 397

Warning: Invalid argument supplied for foreach() in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 397

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 311

Warning: ZipArchive::getFromName(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 313

Notice: Trying to get property 'Override' of non-object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 1855

Warning: Invalid argument supplied for foreach() in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 1855

Warning: ZipArchive::close(): Invalid or uninitialized Zip object in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Xlsx.php on line 1883

The file is readable and open-able by my PHP v7.2 script, in Apache on Ubuntu 18.04. I read several forum posts, which suggest the following, which I have done:

  • Install the required extensions and restart apache
  • Check the file permissions

I tried opening the file in LibreOffice and saving it as an xlsx there, but the same error occurs (no error if I save as xls).

I can create a reader $reader = new \PhpOffice\PhpSpreadsheet\Reader\Xlsx();, but when I do $Spreadsheet = $reader->load($Source_File); or $Spreadsheet = IOFactory::load($Source_File); I get the same error.

Also, I can create an xls reader that can read xls files. I can also create an xlsx reader, but it will not read the xlsx file, it gives the same error when trying to read the xlsx file. So, why is the error occurring with the xlsx file?

Also, I read the source code that the error message points to (IOFactory.php) and found the following location (near line #139) where the error occurs...

//Let's see if we are lucky
if (isset($reader) && $reader->canRead($filename))
{
    return $reader;
}

...and I searched for the definition of canRead, but did not find it anywhere in /vendor/phpoffice/phpspreadsheet/. Where is canRead defined? I think if I could read the definition of canRead, then maybe I will understand what the root cause of the issue is.

UPDATE:

I learned from comments and discussion that canRead() is defined in \PhpSpreadsheet\Reader\Xlsx.php starting around line 65. In canRead(), $zip->open($pFilename) returns an error code, ZipArchive::ER_NOENT, which means "No such file". However, the file exists. So, why is this error occurring?

UPDATE - 2018-12-18

This web page suggests that there are multiple types of xlsx files. So, I ran file test.xlsx, which displayed Microsoft Excel 2007+. Then I opened up the spreadsheet in LibreOffice Calc and saved it as an OOXML type of xlsx file and re-ran file test.xlsx, which displayed Microsoft OOXML. Then I re-ran the PHP script, but got the same error. So, it seems that my xlsx file type is not the issue.

So, I decided to use PHPExcel (even though it is deprecated) to get some necessary work done. When I ran the script using PHPExcel, I received a similar error about canRead() not being able to detect the xlsx file.

So, I continued reading this web page and followed wesood's last suggestion, which was sourced from the accepted answer on this web page. This solution worked for me: In the file /PHPExcel/IOFactory.php, I added PHPExcel_Settings::setZipClass(\PHPExcel_Settings::PCLZIP); immediately before if (isset($reader) && $reader->canRead($filename)).

However, I still want to know how to solve this issue in PhpSpreadsheet. It seems I need to learn more about how pclzip works and if a similar action needs to be done with PhpSpreadsheet.

UPDATE 2019-02-10:

I tried running the script today and it seems that the addition of PHPExcel_Settings::setZipClass(\PHPExcel_Settings::PCLZIP); no longer works. So, I am stuck again...

What am I doing wrong? Any help is welcome!

UPDATE 2019-02-18:

Following recommendations from comments, I tested the script using random XLSX files found via Google search results (e.g., this file), which were either Excel 2007+ or Microsoft OOXML types and the same error displays for PhpSpreadsheet:

Fatal error: Uncaught PhpOffice\PhpSpreadsheet\Reader\Exception: Unable to identify a reader for this file in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php:176 Stack trace: #0 /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php(113): PhpOffice\PhpSpreadsheet\IOFactory::createReaderForFile('file:///var/www...') #1 /var/www/html/Function_Spreadsheet.php(798): PhpOffice\PhpSpreadsheet\IOFactory::identify('file:///var/www...') #2 /var/www/html/Function_Home.php(3748): Spreadsheet_Reader_1('/var/www/html/F...', 3746, Array, Array) #3 {main} thrown in /var/www/html/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/IOFactory.php on line 176

回答1:

From my understanding, you are missing a piece. Why don't you first create a reader and then load the file.

Try the following code. It can identify the extension and create the reader of that type accordingly.

$inputFileName = "Text.xlsx";

/**  Identify the type of $inputFileName  **/
$inputFileType = \PhpOffice\PhpSpreadsheet\IOFactory::identify($inputFileName);

/**  Create a new Reader of the type that has been identified  **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);

/**  Load $inputFileName to a Spreadsheet Object  **/
$spreadsheet = $reader->load($inputFileName);

/**  Convert Spreadsheet Object to an Array for ease of use  **/
$schdeules = $spreadsheet->getActiveSheet()->toArray();

Now you can simply run a foreach loop on the result array.

foreach( $schdeules as $single_schedule )
{               
    echo '<div class="row">';
    foreach( $single_schedule as $single_item )
    {
        echo '<p class="item">' . $single_item . '</p>';
    }
    echo '</div>';
}

This is all tested and working code picked from one of my projects. Cheers.



回答2:

I ran into this same exact error when trying to load in an XLSX file. For me personally, I discovered a very easy fix that fixed my issue. I was manually grabbing the extension off the filename as xlsx. I noticed some other code of mine using the old PHP Spreadsheet library was taking in the extension Xls. So I tried loading in Xlsx and it worked perfectly.

Here is the code I am using to correctly load in the extension. It simply grabs all the characters after the last period and then captilizes the first character of that substring. ucfirst simply uppercases the first letter of the string passed into it. substr returns a substring where the first parameter is the string to grab from and the second parameter is what index to start the substring at in the given string. And finally strrpos finds the last occurrence of a substring in the given string.

https://www.php.net/manual/en/function.ucfirst.php

https://www.php.net/manual/en/function.strrpos

https://www.php.net/manual/en/function.substr.php

$inputFileType = ucfirst(substr($cccFile, strrpos($cccFile, '.') + 1));

/**  Create a new Reader of the type defined in $inputFileType  **/
$reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType);

Once, I added in the ucfirst command, it solved the issue for me.



回答3:

I had the same issue, after adding .xlsx files to a git repository on my Mac.
The problem was that git auto-converted the line endings.

The solution was to add these lines to the .gitattributes file:

*.xls   binary
*.xlsx  binary


回答4:

I was able to read xlsx files that come from Google spreadsheet. However, xlsx files from Microsoft excel give the same error. Running the file command on those files give different file types:

google.xlsx: Zip archive data, at least v2.0 to extract
microsoft.xlsx: Microsoft Excel 2007+

I haven't found out what is the difference between both. Apparently xlsx files are actually zip files, so I don't know why linux distinguish between the source of the file.

Have you been able to solve this issue?