I understand that PowerShell piping works by taking the output of one cmdlet and passing it to another cmdlet as input. But how does it go about doing this?
Does the first cmdlet finish and then pass all the output variables across at once, which are then processed by the next cmdlet?
Or is each output from the first cmdlet taken one at a time and run through all of the remaining piped cmdlets?
You can see how pipeline order works with a simple bit of script:
function a {begin {Write-Host 'begin a'} process {Write-Host "process a: $_"; $_} end {Write-Host 'end a'}}
function b {begin {Write-Host 'begin b'} process {Write-Host "process b: $_"; $_} end {Write-Host 'end b'}}
function c { Write-Host 'c' }
1..3 | a | b | c
Outputs:
begin a
begin b
process a: 1
process b: 1
process a: 2
process b: 2
process a: 3
process b: 3
end a
end b
c
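Note that c prints only once, at the very end, because it has no process block: its body runs as an end block. As a minimal sketch of what changes when the last stage does have a process block, the variation below gives every stage one. The names a2, b2, c2, Write-Trace and $log are made up for this illustration; $log is only there so the order can be inspected afterwards.

```powershell
# Hypothetical helper: records each event in $log and echoes it to the host.
$log = [System.Collections.Generic.List[string]]::new()
function Write-Trace($Message) { $log.Add($Message); Write-Host $Message }

function a2 { process { Write-Trace "process a2: $_"; $_ } }
function b2 { process { Write-Trace "process b2: $_"; $_ } }
function c2 { process { Write-Trace "process c2: $_" } }   # unlike c, this has a process block

1..2 | a2 | b2 | c2

# Order recorded in $log:
#   process a2: 1, process b2: 1, process c2: 1,
#   process a2: 2, process b2: 2, process c2: 2
```

Each object travels through all three process blocks before the next object enters the pipeline.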
The PowerShell pipeline works in a streaming fashion, which can look asynchronous: the output of the first cmdlet is available to the second cmdlet immediately, one object at a time (even if the first one has not finished executing).
For example if you run the below line:
dir -Recurse | Out-File C:\a.txt
and then stop the execution by pressing Ctrl+C, you will see that part of the directory listing has already been written to the text file.
A better example is the following code (which is also useful for deleting all .tmp files on drive C:):
Get-ChildItem C:\ -Include *.tmp -Recurse | ForEach-Object { Remove-Item $_.FullName }
Each time the script block runs, $_ in the second cmdlet holds a single file object.
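If you would rather not touch the file system to see this, here is a small, harmless sketch of the same behaviour. Get-Sample and $trace are hypothetical names invented for the demonstration; the producer is an ordinary function that emits three numbers from a loop, and the consumer is ForEach-Object.

```powershell
# $trace records the order in which things happen (illustration only).
$trace = [System.Collections.Generic.List[string]]::new()

# Hypothetical producer: emits 1, 2, 3 from a single loop.
function Get-Sample {
    foreach ($i in 1..3) {
        $trace.Add("emit $i")
        $i                      # written to the pipeline right away
    }
}

# The consumer sees each object before the producer's loop continues.
Get-Sample | ForEach-Object { $trace.Add("got $_") }

$trace -join ', '
# emit 1, got 1, emit 2, got 2, emit 3, got 3
```

The consumer handles the first number while the producer is still in the middle of its loop, which is the same reason the text file above is partly written when you interrupt the command.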
Both answers thus far give you some good information about pipelining. However, there is more to be said.
First, to directly address your question, you posited two possible ways the pipeline might work. And they are both right... depending on the cmdlets on either side of the pipe!
However, the way the pipeline should work is closer to your second notion: objects are processed one at a time. (Though there's no guarantee that an object will go all the way through before the next one is started because each component in the pipeline is asynchronous, as S Nash mentioned.)
So what do I mean by "it depends on your cmdlets"?
If you are talking about cmdlets supplied by Microsoft, they likely all work as you would expect, passing each object through the pipeline as efficiently as they can. But if you are talking about cmdlets that you write, it depends on how you write them: it is just as easy to write cmdlets that fail to do proper pipelining as those that succeed!
There are two principal failure modes:
- generating all output before emitting any into the pipeline, or
- collecting all pipeline input before processing any.
What you want to strive for, of course, is to process each input as soon as it is received and emit its output as soon as it is determined. For detailed examples of all of these see my article, Ins and Outs of the PowerShell Pipeline, just published on Simple-Talk.com.
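To make that concrete, here is a rough sketch (not taken from the article) that contrasts the second failure mode with a streaming function. All names (Get-Sample, Convert-Buffered, Convert-Streaming, $log) are invented for the illustration, and the [Parameter(ValueFromPipeline)] shorthand assumes PowerShell 3.0 or later.

```powershell
$log = [System.Collections.Generic.List[string]]::new()

# Hypothetical producer that records when each object is emitted.
function Get-Sample { foreach ($i in 1..3) { $log.Add("emit $i"); $i } }

# Failure mode: collects all pipeline input before processing any.
# Without a process block the body runs as an end block, and $input
# only yields its contents once the upstream command has finished.
function Convert-Buffered {
    $all = @($input)
    foreach ($item in $all) { $log.Add("buffered $item"); $item * 2 }
}

# Streaming: the process block runs once per incoming object and
# emits its result immediately.
function Convert-Streaming {
    param([Parameter(ValueFromPipeline)] $Item)
    process { $log.Add("streaming $Item"); $Item * 2 }
}

$r1 = Get-Sample | Convert-Buffered
$r2 = Get-Sample | Convert-Streaming

$log -join ', '
# emit 1, emit 2, emit 3, buffered 1, buffered 2, buffered 3,
# emit 1, streaming 1, emit 2, streaming 2, emit 3, streaming 3
```

Both functions return the same values (2, 4, 6); only the timing differs. The first failure mode looks similar from the other side: a function that adds results to an array in its process block and writes the array out only in its end block.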