New to PowerShell, so kind of learning by doing.
The process I have created works, but it ends up locking down my machine until it is completed, eating up all memory. I thought I had this fixed by looking into forcing the garbage collector, and also moving from a for-each statement to using %() to loop through everything.
Quick synopsis of process: I need to merge multiple SharePoint log files into single ones to track usage across all of the company's different SharePoint sites. PowerShell loops through all log directories on the SP server and checks whether each file in the directory already exists on my local machine. If it does exist, it appends the file text; otherwise it does a straight copy. Rinse and repeat for each file and directory on the SharePoint log server. Between each loop, I'm forcing the GC because... well, because my basic understanding is that the looped variables are held in memory, and I want to flush them. I'm probably looking at this all wrong. So here is the script in question.
    $FinFiles = 'F:\Monthly Logging\Logs'
    dir -path '\\SP-Log-Server\Log-Directory' | ?{$_.PSISContainer} | %{
        $CurrentDir = $_
        dir $CurrentDir.FullName | ?{-not $_.PSISContainer} | %{
            if($_.Extension -eq ".log"){
                $DestinationFile = $FinFiles + '\' + $_.Name
                if((Test-Path $DestinationFile) -eq $false){
                    New-Item -ItemType file -path $DestinationFile -Force
                    Copy-Item $_.FullName $DestinationFile
                }
                else{
                    $A = Get-Content $_.FullName ; Add-Content $DestinationFile $A
                    Write-Host "Log File"$_.FullName"merged."
                }
            }
            [GC]::Collect()
        }
        [GC]::Collect()
    }
Granted the completed/appended log files get very very large (min 300 MB, max 1GB). Am I not closing something I should be, or keeping something open in memory? (It is currently sitting at 7.5 of my 8 Gig memory total.)
Thanks in advance.
Don't nest Get-ChildItem commands like that. Use wildcards instead. Try: dir "\\SP-Log-Server\Log-Directory\*\*.log" instead. That should improve things to start with. Then move this to a ForEach($X in $Y){} loop instead of a ForEach-Object{} loop (what you're using now). I'm betting that takes care of your problem.

So, re-written just off the top of my head:
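A minimal sketch of such a rewrite, assuming the paths from the question. The function name Merge-LogFile and the batch size of 1000 lines are illustrative choices, not anything prescribed by the thread. It replaces the nested directory loops with a single wildcard call, uses a ForEach statement, and pipes Get-Content -ReadCount straight into Add-Content so no file is held in a variable.

```powershell
# Hypothetical helper: merges every matching log into a same-named file
# under $Destination, appending when the target already exists.
function Merge-LogFile {
    param(
        [string]$Source,       # wildcard path, e.g. '\\server\share\*\*.log'
        [string]$Destination   # local folder that holds the merged logs
    )

    # One wildcard call instead of nested Get-ChildItem loops.
    $logs = Get-ChildItem -Path $Source | Where-Object { -not $_.PSIsContainer }

    foreach ($log in $logs) {
        $target = Join-Path $Destination $log.Name

        if (Test-Path -LiteralPath $target) {
            # Pass lines along in batches of 1000 (arbitrary size) rather
            # than loading the whole file into a variable first.
            Get-Content -LiteralPath $log.FullName -ReadCount 1000 |
                Add-Content -LiteralPath $target
            Write-Host "Log file $($log.FullName) merged."
        }
        else {
            # First occurrence of this file name: a straight copy is enough.
            Copy-Item -LiteralPath $log.FullName -Destination $target
        }
    }
}
```

With the paths from the question, the call would look like: Merge-LogFile -Source '\\SP-Log-Server\Log-Directory\*\*.log' -Destination 'F:\Monthly Logging\Logs'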
Edit: Oh, right, Alexander Obersht may be quite right as well. You may well benefit from a StreamReader approach. At the very least you should use the -ReadCount argument to Get-Content, and there's no reason to save it as a variable; just pipe it right to the Add-Content cmdlet.

To explain my answer a little more, if you use ForEach-Object in the pipeline it keeps everything in memory (regardless of your GC call). Using a ForEach loop does not do this, and should take care of your issue.

You might find this and this helpful.
In short: Add-Content, Get-Content and Out-File are convenient but notoriously slow when you need to deal with large amounts of data or I/O operations. You want to fall back to StreamReader and StreamWriter .NET classes for performance and/or memory usage optimization in cases like yours.
Code sample:
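A minimal sketch of the StreamReader/StreamWriter approach, not a definitive implementation. The function name Add-FileContent and its parameter names are hypothetical. It assumes full (not relative) paths, since .NET resolves relative paths against the process working directory rather than the current PowerShell location, and it assumes the default encodings are acceptable (StreamWriter writes UTF-8 unless told otherwise).

```powershell
# Hypothetical helper: appends the lines of $SourcePath to $TargetPath,
# holding only one line in memory at a time.
function Add-FileContent {
    param(
        [string]$SourcePath,   # full path of the file to read
        [string]$TargetPath    # full path of the file to append to
    )

    $reader = New-Object System.IO.StreamReader($SourcePath)
    try {
        # The second argument ($true) opens the target in append mode;
        # the file is created if it does not exist yet.
        $writer = New-Object System.IO.StreamWriter($TargetPath, $true)
        try {
            # ReadLine returns $null at end of file.
            while ($null -ne ($line = $reader.ReadLine())) {
                $writer.WriteLine($line)
            }
        }
        finally {
            $writer.Dispose()
        }
    }
    finally {
        # Always release the file handles, even if copying fails.
        $reader.Dispose()
    }
}
```

In the script from the question, this would stand in for the Get-Content / Add-Content line, for example: Add-FileContent -SourcePath $_.FullName -TargetPath $DestinationFile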