Speed up reading an Excel File in Powershell

2019-08-03 04:03发布

I wonder if there is any way to speed up reading an Excel file with powershell. Many would say I should stop using the do until, but the problem is I need it badly, because in my Excel sheet there can be 2 rows or 5000 rows. I understand that 5000 rows needs some time. But 2 rows shouldn't need 90sec+.

$Excel               = New-Object -ComObject Excel.Application
    $Excel.Visible       = $true
    $Excel.DisplayAlerts = $false
$Path = EXCELFILEPATH
$Workbook            = $Excel.Workbooks.open($Path)
$Sheet1 = $Workbook.Worksheets.Item(test)

$URows = @()
Do {$URows += $Sheet1.Cells.Item($Row,1).Text; $row = $row + [int] 1} until (!$Sheet1.Cells.Item($Row,1).Text)
$URows | foreach {
$MyParms = @{};
$SetParms = @{};

And i got this 30 times in the script too:

If ($Sheet1.Cells.Item($Row,2).Text){$var1    = $Sheet1.Cells.Item($Row,2).Text
$MyParms.Add("PAR1",$var1)
$SetParms.Add("PAR1",$var1)}
                 }

I have the idea of running the $MyParms stuff contemporarily, but I have no idea how. Any suggestions?

Or

Increase the speed of reading, but I have no clue how to achieve that without destroying the "read until nothing is there".

Or

The speed is normal and I shouldn't complain.

2条回答
成全新的幸福
2楼-- · 2019-08-03 04:24

Appending to an array with the += operator is terribly slow, because it will copy all elements from the existing array to a new array. Use something like this instead:

$URows = for ($row = 1; !$Sheet1.Cells.Item($row, 1).Text; $row++) {
           if ($Sheet1.Cells.Item($Row,2).Text) {
             $MyParms['PAR1']  = $Sheet1.Cells.Item($Row, 2).Text)
             $SetParms['PAR1'] = $Sheet1.Cells.Item($Row, 2).Text)
           }
           $Sheet1.Cells.Item($Row,1).Text
         }

Your Do loop is basically a counting loop. The canonical form for such loops is

for (init counter; condition; increment counter) {
  ...
}

so I changed the loop accordingly. Of course you'd achieve the same result like this:

$row = 1
$URows = Do {
           ...
           $row += 1
         }

but that would just mean more code without any benefits. This modification doesn't have any performance impact, though.

Relevant in terms of performance are the other two changes:

  1. I moved the code filling the hashtables inside the first loop, so the code won't loop twice over the data. Using index and assignment operators instead of the Add method for assigning values to the hashtable prevents the code from raising an error when a key already exists in the hashtable.
  2. Instead of appending to an array (which has the abovementioned performance impact) the code now simply echoes the cell text in the loop, which PowerShell automatically turns into a list. The list is then assigned to the variable $URows.
查看更多
小情绪 Triste *
3楼-- · 2019-08-03 04:35

Don't use Excel.Application in the first place if you need speed. You can use an Excel spreadsheet as an ODBC data source - the file is analogous to a database, and each worksheet a table. The speed difference is immense. Here's an intro on using Excel spreadsheets without Excel

查看更多
登录 后发表回答