I want to split each line of a pipe on spaces, and then print each token on its own line.
I realise that I can get this result using:
(cat someFileInsteadOfAPipe).split(" ")
But I want more flexibility. I want to be able to do just about anything with each token. (I used to use AWK on Unix, and I'm trying to get the same functionality.)
I currently have:
echo "Once upon a time there were three little pigs" | %{$data = $_.split(" "); Write-Output "$($data[0]) and whatever I want to output with it"}
Which, obviously, only prints the first token. Is there a way for me to for-each over the tokens, printing each in turn?
Also, the %{$data = $_.split(" "); Write-Output "$($data[0])"} part I got from a blog, and I really don't understand what I'm doing or how the syntax works.
I want to google for it, but I don't know what to call it. Please help me out with a word or two to Google, or a link explaining to me what the % and all the $ symbols do, as well as the significance of the opening and closing brackets.
I realise I can't actually use (cat someFileInsteadOfAPipe).split(" "), since the file (or preferably the incoming pipe) contains more than one line.
Regarding some of the answers:
If you are using Select-String to filter the output before tokenizing, you need to keep in mind that the output of the Select-String command is not a collection of strings, but a collection of MatchInfo objects. To get to the string you want to split, you need to access the Line property of the MatchInfo object, like so:
cat someFile | Select-String "keywordFoo" | %{$_.Line.Split(" ")}
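As a minimal sketch of how this combines with a for-each over the individual tokens: the snippet below uses two in-memory sample lines (an assumption, standing in for a file or an incoming pipe) and the hypothetical variable names $lines, $tokens, and $matched. The outer ForEach-Object runs once per line and the inner one runs once per token, so each token can be processed on its own.

```powershell
# Assumed sample input standing in for a file or an incoming pipe.
$lines = 'Once upon a time', 'there were three little pigs'

# Outer ForEach-Object runs once per line; the inner one runs once per token.
$tokens = $lines | ForEach-Object {
    $_.Split(' ') | ForEach-Object { "token: $_" }
}
$tokens

# With Select-String in the pipeline, each object is a MatchInfo,
# so the text to split lives in its Line property.
$matched = $lines | Select-String 'pigs' | ForEach-Object { $_.Line.Split(' ') }
$matched
```

Here $tokens holds one formatted string per word across both lines, while $matched holds only the words of the line that contains "pigs".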
Another way to accomplish this is a combination of Justus Thane's and mklement0's answers. It doesn't make sense to do it this way when you look at a one-liner example, but when you're trying to mass-edit a file or a bunch of filenames, it comes in pretty handy:
This will come out as:
The key is $_, which stands for the current object in the pipeline.
About the code you found online:
% is an alias for ForEach-Object. Anything enclosed inside the brackets is run once for each object it receives. In this case, it's only running once, because you're sending it a single string.
$_.Split(" ") is taking the current variable and splitting it on spaces. The current variable will be whatever is currently being looped over by ForEach-Object.
To complement Justus Thane's helpful answer:
As Joey notes in a comment, PowerShell has a powerful, regex-based -split operator.
In its unary form (-split '...'), -split behaves like awk's default field splitting, which means that leading and trailing whitespace is ignored and any run of whitespace is treated as a single separator.
In PowerShell v4, an expression-based (and therefore faster) alternative to the ForEach-Object cmdlet became available: the .ForEach() collection "operator" (method), as described in this blog post (alongside the .Where() method, a more powerful, expression-based alternative to Where-Object).
Here's a solution based on these features:
Note that the leading and trailing whitespace was ignored, and that the multiple spaces between One and for were treated as a single separator.
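For reference, a minimal sketch of the unary -split and .ForEach() combination described above. The sample string and the variable names $text, $tokens, $result, and $naive are assumptions chosen for illustration, and .ForEach() requires PowerShell v4 or later.

```powershell
# Assumed sample input with leading, trailing, and repeated spaces.
$text = '   One      for the money   '

# Unary -split: splits on runs of whitespace and ignores
# leading and trailing whitespace, much like awk's default field splitting.
$tokens = -split $text

# .ForEach() runs the script block once per element; $_ is the current token.
$result = $tokens.ForEach({ "token: [$_]" })
$result
# token: [One]
# token: [for]
# token: [the]
# token: [money]

# For comparison, the .NET String.Split method splits on every single space,
# so repeated spaces produce empty entries in the result.
$naive = $text.Split(' ')
$naive.Count
```

The comparison at the end is why the -split operator is usually the closer match for awk-style tokenizing: with .Split(' '), the empty entries would have to be filtered out separately.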