I need to select the 7th column from a tab-delimited file, e.g.:
cat filename | awk '{print $7}'
The issue is that the data in the 4th column can contain multiple values with blanks in between. For example, see the last line in the output below:
user \Adminis FL_vol Design 0 - 1 -
group 0 FL_vol Design 19324481 - 3014 -
user \MAK FL_vol Design 16875161 - 2618 -
tree 826 FL_vol Out Global Doc Mark 16875162 - 9618 - /vol/FL_vol/Out Global Doc Mark
If the data is unambiguously tab-separated, then cut will cut on tabs, not spaces. You can certainly do that with awk, too.
Judging by the format of your input file, you can get away with delimiting on - instead of spaces. FS stands for Field Separator; just think of it as the delimiter for input. With - as the delimiter, what was your 7th field now becomes the 2nd field. You also don't need cat: pass filename as an argument to awk instead.
Alternatively, if your data fields are separated by tabs, you can do it more explicitly as follows:
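As a rough sketch of both variants described in this answer: the file name sample.tsv, the two sample rows copied from the question, and the whitespace trimming in the second command are assumptions added for illustration, not the answerer's exact commands.

```shell
# Build a small tab-separated sample (sample.tsv is a hypothetical file name).
printf 'user\t\\Adminis\tFL_vol\tDesign\t0\t-\t1\t-\n' > sample.tsv
printf 'tree\t826\tFL_vol\tOut Global Doc Mark\t16875162\t-\t9618\t-\t/vol/FL_vol/Out Global Doc Mark\n' >> sample.tsv

# Explicit tab separator: the 4th field may contain spaces,
# and the 7th field is still the 7th field.
awk 'BEGIN { FS = "\t" } { print $7 }' sample.tsv

# Delimiting on "-" instead: the wanted value becomes field 2.
# gsub strips the surrounding blanks/tabs (an addition for tidy output).
awk 'BEGIN { FS = "-" } { gsub(/[ \t]/, "", $2); print $2 }' sample.tsv
```

The - variant only works while no earlier field contains a - of its own, so the explicit tab separator is the safer of the two.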
And this will resolve the issue, since Out Global Doc Mark looks to be separated by spaces.
This might work for you (GNU sed):
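The exact command is not shown here, so the following is only a sketch of a substitution that matches the explanation in this answer. The file name sample.tsv, the tab variable, and the second expression that drops the trailing tab are assumptions added for illustration.

```shell
# sample.tsv is a hypothetical file holding one tab-separated row from the question.
printf 'tree\t826\tFL_vol\tOut Global Doc Mark\t16875162\t-\t9618\t-\t/vol/FL_vol/Out Global Doc Mark\n' > sample.tsv

# Keep a literal tab in a variable so the pattern does not depend on
# sed understanding \t inside a bracket expression.
tab=$(printf '\t')

# ([^TAB]*TAB?){7} matches seven fields in turn; \1 keeps only the last
# repetition, i.e. the 7th field plus its trailing tab (if present).
# .* swallows the rest of the line. The second expression removes that tab.
sed -E -e "s/^([^$tab]*$tab?){7}.*/\1/" -e "s/$tab\$//" sample.tsv
```

On a line with fewer than seven fields the group can match the empty string, so the result is only meaningful for rows that really have at least seven columns.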
This substitute command matches the whole line and returns the 7th tab-separated field. In sed, the last thing matched by a (...) group on the left-hand side of the substitution is what its back-reference returns on the right-hand side. In this case the first back-reference returns both the non-tab characters and the tab character, if present (N.B. the ? meta-character matches either one or none of the preceding pattern). The .* just swallows up whatever is left on the line, if anything.
If fields are separated by tabs and your concern is that some fields contain spaces, there is no problem here, just:
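For instance, assuming the data is saved in a hypothetical file named sample.tsv (the row below is copied from the question, with tabs between the columns), a minimal sketch would be:

```shell
# sample.tsv is a hypothetical file name used only for this illustration.
printf 'tree\t826\tFL_vol\tOut Global Doc Mark\t16875162\t-\t9618\t-\t/vol/FL_vol/Out Global Doc Mark\n' > sample.tsv

# -f7 selects the 7th field. No -d option is given, so cut splits on tabs
# and the spaces inside "Out Global Doc Mark" do not create extra fields.
cut -f7 sample.tsv   # prints 9618
```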
(cut defaults to tab-delimited fields.)