I'm new to PHPExcel, and I've scoured Google without finding much on my specific problem: updating/evaluating the formulas in my sheet is really slow.
I've got a pretty small Excel file on which I'm doing some very basic goal-seeking (the goal seeking is done in PHP, the end-result calculation in the Excel sheet). I've got it working accurately, but the speed is absolutely killing me. The formula calculation appears to be to blame, so how can formula calculations/updates be sped up?
Unfortunately, I can't post a copy of the Excel file, as the contents are a trade secret for my company, but it's nothing out of the ordinary: very simple arithmetic in the formulas. The only thing I can think of that might have an effect is that some of the cell-dependency chains are fairly long (15-ish dependencies).
As you can see from the output below, we're only executing 11 iterations for the goal seeking, and taking a total of 4-5 seconds. Since this will be an AJAX service, I really need it to be faster than that.
Code
This is very quick-and-dirty proof-of-concept code, so please bear with me:
<?php
// -- PHPExcel and the Stopwatch timing helper are assumed to be loaded already
Stopwatch::start();
$inputFileName = './example.xlsx';
$inputFileType = PHPExcel_IOFactory::identify( $inputFileName );
var_dump( 'FileType: '.$inputFileType );
Stopwatch::rel( 'identify filetype' );

$objReader = PHPExcel_IOFactory::createReader( $inputFileType );
$objReader->setReadDataOnly( true );
$filterSubset = new ReadFilter( 1, 35, range( 'A', 'J' ));
$objReader->setReadFilter( $filterSubset );
Stopwatch::rel( 'create reader' );

$objPHPExcel = $objReader->load( $inputFileName );
Stopwatch::rel( 'load file' );

$data = $objPHPExcel->getSheetByName( 'Data' );
$inputCell = $data->getCell( 'B9' );
$outputCell = $data->getCell( 'B35' );
Stopwatch::rel( 'get cells' );

goalSeek( $inputCell, $outputCell, '0.10', 1, 5 );
function goalSeek( $inputCell, $outputCell, $targetValue, $tolerance, $precision ) { // -- names of the last two parameters are assumed
    $objPHPExcel = $inputCell->getWorksheet()->getParent(); // -- the workbook is not otherwise in scope inside this function
    $cellValue = function() use ( &$outputCell, $precision ) {
        return round( $outputCell->getCalculatedValue(), $precision );
    };
    $setValue = function( $value ) use ( &$inputCell, &$objPHPExcel, $cellValue ) {
        $inputCell->setValue( $value );
        PHPExcel_Calculation::getInstance( $objPHPExcel )->clearCalculationCache(); // -- clear cache so updates are calculated
        Stopwatch::rel( 'goal-seek' );
    };
    // -- very basic goal seeking pseudo-code
    while( $stillHunting ) { // -- outside tolerance
        $setValue( $newInputValue );
    }
}
class ReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;
    private $_columns = [];

    public function __construct( $startRow, $endRow, $columns ) {
        $this->_startRow = $startRow;
        $this->_endRow = $endRow;
        $this->_columns = $columns;
    }

    public function readCell( $column, $row, $worksheetName = '' ) {
        if( $row >= $this->_startRow && $row <= $this->_endRow ) { // -- valid row
            if( in_array( $column, $this->_columns )) { // -- valid column
                return true;
            }
        }
        // else (implicit)
        return false;
    }
}
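The goal-seeking loop in the listing is only pseudo-code. For anyone trying to reproduce the timings, here is a minimal, self-contained sketch of what such a loop could look like. It uses the secant method, which is an assumption made for illustration and not necessarily the algorithm in the real script. `goalSeekSecant`, its parameters, and the linear stand-in formula are all hypothetical; in the real script the closure would set the input cell, clear the calculation cache, and read the output cell.

```php
<?php
/**
 * Secant-method goal seek (assumed algorithm, for illustration only).
 * $evaluate maps an input value to the calculated output value.
 * Returns the input that brings the output within $tolerance of $target,
 * or null if no such input is found within $maxIterations.
 */
function goalSeekSecant( callable $evaluate, $target, $x0, $x1, $tolerance = 1e-6, $maxIterations = 50 ) {
    $f0 = $evaluate( $x0 ) - $target;
    $f1 = $evaluate( $x1 ) - $target;
    for( $i = 0; $i < $maxIterations; $i++ ) {
        if( abs( $f1 ) <= $tolerance ) { // -- inside tolerance
            return $x1;
        }
        if( $f1 == $f0 ) { // -- flat secant line, cannot take another step
            break;
        }
        $x2 = $x1 - $f1 * ( $x1 - $x0 ) / ( $f1 - $f0 );
        $x0 = $x1;
        $f0 = $f1;
        $x1 = $x2;
        $f1 = $evaluate( $x1 ) - $target; // -- one evaluation per iteration
    }
    return abs( $f1 ) <= $tolerance ? $x1 : null;
}

// -- stand-in for the spreadsheet: a made-up linear formula
$result = goalSeekSecant( function( $x ) { return 0.02 * $x + 0.04; }, 0.10, 1.0, 5.0 );
var_dump( $result );
```

Each call to `$evaluate` corresponds to one full recalculation of the sheet, so the number of evaluations is what drives the total time.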
Output
string 'FileType: Excel2007' (length=19)
array (size=2)
  'rel' =>
    array (size=17)
      'identify' => float 0.008597135543823242
      'create reader' => float 0.0001199245452880859
      'load file' => float 0.387645959854126
      'get cells' => float 5.292892456054688E-5
      'goal-seek' => float 0.4194750785827637
      'goal-seek2' => float 0.3829901218414307
      'goal-seek3' => float 0.3478608131408691
      'goal-seek4' => float 0.3471150398254395
      'goal-seek5' => float 0.3569440841674805
      'goal-seek6' => float 0.378180980682373
      'goal-seek7' => float 0.3683559894561768
      'goal-seek8' => float 0.3778479099273682
      'goal-seek9' => float 0.3664979934692383
      'goal-seek10' => float 0.4503841400146484
      '_avg' => float 0.2794940630594889
      '_untilStop' => float 0.5339441299438477
      'total' => float 4.726345062255859
OK, one possible solution that might speed things up, if you're recalculating the same formula with different values in related cells, is to parse the formula once, and only once, but execute it multiple times.
getCalculatedValue() calls two methods. The first is parseFormula(), which accepts the formula as a string and builds a parser stack (as an array) of steps for the execution of that formula. The second (a private method, so you'd need to change that to public in Calculation.php) is processTokenStack(), which accepts three arguments: the token stack generated by the call to parseFormula(), the cell ID (as a string), and the cell object.
It might be possible for you to execute the parseFormula() step only once, and then call processTokenStack() for each iteration, which would eliminate the parse step for all but the first iteration.
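As a rough, untested sketch of that idea: the helper below parses the output cell's formula once and returns a closure that re-executes the cached token stack on each call. It assumes PHPExcel is already loaded and that the private method is named `_processTokenStack` (with a leading underscore, as I believe it is in PHPExcel 1.8), so check Calculation.php in your version first. `makeFormulaEvaluator` is a hypothetical name, and reflection is used here as a stand-in for editing the method's visibility by hand.

```php
<?php
/**
 * Hypothetical helper: parse a cell's formula once, then return a closure
 * that re-runs the stored token stack every time it is called.
 * Assumes PHPExcel is loaded and $cell is a PHPExcel_Cell holding a formula.
 */
function makeFormulaEvaluator( $cell ) {
    $workbook = $cell->getWorksheet()->getParent();
    $calc     = PHPExcel_Calculation::getInstance( $workbook );

    // -- parse step: done exactly once
    $tokens = $calc->parseFormula( $cell->getValue() );

    // -- assumed private method name; reflection avoids editing Calculation.php
    $process = new ReflectionMethod( $calc, '_processTokenStack' );
    $process->setAccessible( true );

    $cellId = $cell->getCoordinate();

    return function() use ( $calc, $process, $tokens, $cellId, $cell ) {
        // -- referenced cells still go through the normal engine, so their
        // -- cached results have to be dropped after each input change
        $calc->clearCalculationCache();
        // -- raw engine result; it may need the same unwrapping that
        // -- getCalculatedValue() normally performs
        return $process->invoke( $calc, $tokens, $cellId, $cell );
    };
}
```

In the goal-seek loop you would build the closure once with `$evaluate = makeFormulaEvaluator( $outputCell );` and then call `$evaluate()` after each `$inputCell->setValue( ... )`. As far as I can tell this only skips the parse for the top-level formula; cells it depends on are still parsed when they are evaluated, so with long dependency chains the saving may be modest.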