How to skip over corrupt files with PHP libraries

Posted 2019-07-03 22:32

Question:

I am using the PHP libraries TCPDF and FPDI to combine PDF documents, and am getting the following error:

TCPDF ERROR: Unable to find object (10, 0) at expected location

I have the commercial version of FPDI.

It appears that the issue only happens with PDF version 1.3 (Acrobat 4.x) files. Here is a screenshot of the document properties of a file that triggers the error: http://imagebin.org/215041

I'd like to skip over any files with errors instead of letting the script die. I have overridden the error handling in a new class, ErrorIgnoringTCPDF, but it is not working.

Any ideas?

require_once('../../libraries/tcpdf/tcpdf.php');
require_once('../../libraries/fpdi/fpdi.php');

class ErrorIgnoringTCPDF extends FPDI {

   public function Error($msg) {
       // unset all class variables
       $this->_destroy(true);

       // exit program and print error
       //die('<strong>TCPDF ERROR: </strong>'.$msg);
   }

}

$pdf = new ErrorIgnoringTCPDF();
$pdf->setPrintHeader(false);

$prows = fetch_data($id);

foreach ($prows AS $row) {

    $irows = get_imaged_docs($row['pat_id']);

    foreach($irows AS $irow){

        if ($irow['type'] === 'application/pdf'){

            $doc_id = $irow['id'];

            $content = get_pdf_imaged_docs($doc_id);

            $pagecount = $pdf->setSourceFile($content);

            for ($i = 1; $i <= $pagecount; $i++) {
                 $tplidx = $pdf->ImportPage($i);
                 $s = $pdf->getTemplatesize($tplidx);
                 $pdf->AddPage('P', array($s['w'], $s['h']));
                 $pdf->useTemplate($tplidx);
            }    

        } else {

            $pdf->AddPage();

            $doc  = fetch_document_content($irow['id'], $irow['filename']);
            $img = base64_encode($doc);

            $imgdata = base64_decode($img);

            $pdf->Image('@'.$imgdata);

        }

    }

}

$pdf->Output('documents.pdf', 'D');

Answer 1:

If you are on Linux, you can call Ghostscript (gs) through shell_exec to combine the files:

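function combine_pdf($outputName, $fileArray)
{
    // Ghostscript rewrites all input files into a single PDF.
    $cmd = "gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=" . escapeshellarg($outputName);

    foreach ($fileArray as $file) {
        // Quote each path so spaces and shell metacharacters are safe.
        $cmd .= " " . escapeshellarg($file);
    }

    // Return the command output (null if nothing was printed or on failure).
    return shell_exec($cmd);
}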


Answer 2:

Have you tried just suppressing the error?

$pagecount = @$pdf->setSourceFile($content);

if (empty($pagecount))
    continue;  // or whatever you want to do, maybe set $is_invalid = true;


Answer 3:

This simply indicates that the PDF document is corrupt. The message refers to a specific byte offset at which the parser expected to find the object (here object 10, generation 0) and did not.
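
To make the byte-offset point concrete, here is a minimal sketch of a pre-check that reads a file's cross-reference table and verifies that each in-use entry really points at an "N G obj" header. It rests on several assumptions: the function name findBrokenXrefEntries is hypothetical, it only looks at the last classic xref table (the kind a PDF 1.3 file normally has), and it ignores /Prev chains and xref streams. Treat it as a screening aid, not as a validator.

```php
<?php
/**
 * Returns the "num gen" labels of in-use xref entries whose offset does not
 * point at a matching object header, or null when no classic xref table
 * could be read. An empty array means every checked entry looked fine.
 */
function findBrokenXrefEntries($data)
{
    // The last "startxref" keyword is followed by the offset of the xref table.
    $pos = strrpos($data, 'startxref');
    if ($pos === false || !preg_match('/startxref\s+(\d+)/', $data, $m, 0, $pos)) {
        return null;
    }
    $offset = (int) $m[1];
    if (substr($data, $offset, 4) !== 'xref') {
        return null; // xref stream or wrong pointer: outside this sketch
    }

    $cursor = $offset + 4;
    $broken = array();

    // Each subsection starts with "first count", followed by count entries
    // of the form "oooooooooo ggggg n" (in use) or "... f" (free).
    while (preg_match('/\G\s*(\d+)\s+(\d+)\s*/', $data, $h, 0, $cursor)) {
        $cursor += strlen($h[0]);
        $first = (int) $h[1];
        $count = (int) $h[2];

        for ($i = 0; $i < $count; $i++) {
            if (!preg_match('/\G(\d{10}) (\d{5}) ([nf])\s*/', $data, $e, 0, $cursor)) {
                return null; // malformed table
            }
            $cursor += strlen($e[0]);

            if ($e[3] === 'n') {
                $num = $first + $i;
                $gen = (int) $e[2];
                // The object header must start exactly at the recorded offset.
                $pattern = '/\G' . $num . '\s+' . $gen . '\s+obj/';
                if (!preg_match($pattern, $data, $x, 0, (int) $e[1])) {
                    $broken[] = "$num $gen";
                }
            }
        }
    }

    return $broken;
}
```

Calling findBrokenXrefEntries(file_get_contents($path)) before importing a file would let a script skip files for which the result is null or non-empty, although a file that passes this check can still be rejected by the parser for other reasons.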



Answer 4:

I won't say this is the appropriate or best fix, but it may resolve your problem.

In pdf_parser.php, comment out the line:

$this->error("Unable to find object ({$obj_spec[1]}, {$obj_spec[2]}) at expected location");

It should be near line 544.

You'll also likely need to replace:

    if (!is_array($kids))
        $this->error('Cannot find /Kids in current /Page-Dictionary');

with:

    if (!is_array($kids)){
     //   $this->error('Cannot find /Kids in current /Page-Dictionary');
     return;
    }

in the fpdi_pdf_parser.php file.

Hope that helps. It worked for me.



Answer 5:

I had the same problem, and I am using this code to work around it.

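class convertPDF extends FPDI {

   // Throw instead of letting the library die(), so callers can catch it.
   public function error($msg) {
      throw new Exception($msg);
   }
   // ...other stuff...
}

try {
    $convertPdf = new convertPDF();
} catch(Exception $e) {
    // getMessage is a method, so it needs parentheses.
    die($e->getMessage());
}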
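
Note that the try block above only wraps the constructor, while the error in the question is raised later, by setSourceFile(). A minimal sketch of catching it once per file inside the merge loop follows. ThrowingImporter and mergeSkippingCorrupt are hypothetical names, and ThrowingImporter is only a stand-in that simulates a corrupt file so the snippet runs without TCPDF or FPDI installed; in real code the class would extend FPDI and only the Error() override would be needed.

```php
<?php
// Hypothetical stand-in for a class that extends FPDI.
class ThrowingImporter
{
    /** Simulated page counts keyed by file name; a missing key means "corrupt". */
    private $files;

    /** Labels of the pages that were "imported", in order. */
    public $pages = array();

    public function __construct(array $files)
    {
        $this->files = $files;
    }

    // Same idea as the override above: raise an exception instead of die().
    public function Error($msg)
    {
        throw new Exception($msg);
    }

    // Simulates setSourceFile(): returns the page count or reports an error.
    public function setSourceFile($name)
    {
        if (!isset($this->files[$name])) {
            $this->Error('Unable to find object (10, 0) at expected location');
        }
        return $this->files[$name];
    }
}

/**
 * Imports every readable file and returns the skipped ones as
 * array(fileName => errorMessage).
 */
function mergeSkippingCorrupt(ThrowingImporter $pdf, array $names)
{
    $skipped = array();

    foreach ($names as $name) {
        try {
            $pageCount = $pdf->setSourceFile($name);
        } catch (Exception $e) {
            // Record the failure and move on to the next file.
            $skipped[$name] = $e->getMessage();
            continue;
        }

        for ($i = 1; $i <= $pageCount; $i++) {
            // Real code would call importPage(), AddPage() and useTemplate() here.
            $pdf->pages[] = "$name#$i";
        }
    }

    return $skipped;
}
```

Whether a real FPDI instance is still in a usable state after a failed setSourceFile() call may depend on the library version, so that part is worth verifying against the files that actually fail.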

This answer is for people who find this question while searching for the same problem. Good luck!