I have a space-delimited input text file. I would like to delete every column whose header is size, using sed or awk.
Input File:
id quantity colour shape size colour shape size colour shape size
1 10 blue square 10 red triangle 8 pink circle 3
2 12 yellow pentagon 3 orange rectangle 9 purple oval 6
Desired Output:
id quantity colour shape colour shape colour shape
1 10 blue square red triangle pink circle
2 12 yellow pentagon orange rectangle purple oval
awk command (pretty-printed) and result:
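A minimal sketch of an awk command that selects the columns by header name rather than by position. The wrapper function drop_columns_by_header and the awk variable name are illustrative assumptions, not taken from the answer:

```shell
# Usage: drop_columns_by_header HEADER_NAME FILE
drop_columns_by_header() {
  awk -v name="$1" '
    # Header line: remember the positions of the columns to keep.
    NR == 1 {
      for (i = 1; i <= NF; i++)
        if ($i != name) keep[++n] = i
    }
    # Every line, header included: print only the kept columns.
    {
      for (j = 1; j <= n; j++)
        printf "%s%s", $(keep[j]), (j < n ? OFS : ORS)
    }
  ' "$2"
}
```

Called as drop_columns_by_header size infile, this should print the desired output from the question, with the kept fields joined by single spaces.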
Given a fixed file format:
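If the layout really is fixed, the size columns are always fields 5, 8 and 11, so sed can delete them by position. This is only a sketch under that assumption; the function name drop_size_fixed_sed is made up for illustration:

```shell
# Usage: drop_size_fixed_sed FILE
# Field k (k > 1) is the (k-1)th match of " field", so fields 11, 8 and 5
# are matches 10, 7 and 4. Deleting from right to left keeps the earlier
# match numbers valid.
drop_size_fixed_sed() {
  sed -e 's/ [^ ][^ ]*//10' \
      -e 's/ [^ ][^ ]*//7' \
      -e 's/ [^ ][^ ]*//4' "$1"
}
```

This breaks as soon as a column is added or moved, which is the price of hard-coding the positions.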
A general solution using awk. There is a hard-coded variable (columns_to_delete) in the BEGIN block to indicate the positions of the fields to delete. The script then calculates the width of each field and deletes those that match the positions in the variable.
Assuming infile has the content of the question and script.awk has the following content:
Run it like:
With the following output:
EDIT:
Oh oh, just now realised that the output is not right, because two fields get joined. Fixing that would be too much work, because the maximum column width would have to be checked for every line before processing anything. But I hope this script gives you the idea. No time now; perhaps I can try to fix it later on, but I'm not sure.
EDIT 2: Fixed by adding an additional space for each deleted field. It was easier than expected :-)
EDIT 3: See comments.
I've modified the BEGIN block to check that an extra variable is provided as an argument.
And added to the FNR == 1 pattern the calculation of the numbers of the columns to delete:
Now run it like:
And the result will be the same.
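Roughly, the modified script could look like the sketch below. It is a reconstruction under assumptions, not the original script.awk: the variable name header and the wrapper function delete_named_columns are hypothetical, and it joins the kept fields with a single space instead of computing field widths.

```shell
# Usage: delete_named_columns HEADER_NAME FILE
delete_named_columns() {
  awk -v header="$1" '
    BEGIN {
      # Refuse to run when no header name was passed in.
      if (header == "") {
        print "usage: awk -v header=NAME -f script.awk FILE" > "/dev/stderr"
        exit 1
      }
    }
    FNR == 1 {
      # Work out the numbers of the columns to delete from the header line.
      split("", skip)
      for (i = 1; i <= NF; i++)
        if ($i == header) skip[i] = 1
    }
    {
      # Rebuild the line from the fields that are not marked for deletion.
      out = ""
      sep = ""
      for (i = 1; i <= NF; i++) {
        if (i in skip) continue
        out = out sep $i
        sep = OFS
      }
      print out
    }
  ' "$2"
}
```

For example, delete_named_columns size infile. Because the column numbers are recomputed at FNR == 1, the same call keeps working if the size columns move.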
If you have GNU cut available this can be done like so:
This generates a comma-separated list of field numbers from the heading, and then cuts the complement of that list from INPUT_FILE.
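A sketch of such a pipeline, assuming GNU cut (the --complement option is a GNU extension) and a single space as the delimiter. The function name cut_columns_by_header is only for illustration:

```shell
# Usage: cut_columns_by_header HEADER_NAME INPUT_FILE
cut_columns_by_header() {
  # Put each header on its own line, number the exact matches,
  # keep only the line numbers, and join them with commas (e.g. 5,8,11).
  list=$(head -n 1 "$2" | tr ' ' '\n' | grep -n -x -F -- "$1" | cut -d: -f1 | paste -sd, -)
  # No matching header: pass the file through unchanged.
  if [ -z "$list" ]; then
    cat "$2"
    return
  fi
  # Print every field except the listed ones.
  cut -d ' ' --complement -f "$list" "$2"
}
```

For the file in the question, cut_columns_by_header size INPUT_FILE builds the list 5,8,11 and prints the remaining eight columns.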
Use cut:
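For instance, with the field numbers written out by hand. This is a sketch that assumes the size columns stay at positions 5, 8 and 11, as in the sample; the function name drop_size_fixed_cut is hypothetical:

```shell
# Usage: drop_size_fixed_cut FILE
# Keep fields 1-4, 6-7 and 9-10, i.e. everything except fields 5, 8 and 11.
drop_size_fixed_cut() {
  cut -d ' ' -f 1-4,6-7,9-10 "$1"
}
```

This uses only POSIX cut options, so it does not need the GNU version.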