I have a file which contains lines of the following format:
w1#1#x w2#4#b w3#2#d ...
Each word (token) in the line (e.g. w1#1#x) is made of 3 parts: the first is some index (w1 in this case), the second is an integer (1 in this case), and the third is a character (x in this case).
Now, for each word (token), I need to print an additional field which is calculated from the values of the second and third parts (i.e., the 4th part is a function of the 2nd and 3rd parts), and the output file should look like:
w1#1#x#f1 w2#4#b#f2 w3#2#d#f3 ...
where
f1 = function(1,x), f2 = function(4,b), f3 = function(2,d)
Now, using the sed patterns I can identify the components in every word (token), e.g.,
echo $line | sed "s/\([^#]*\)#\([^#]*\)#\([^# ]*\) /\1#\2#\3 /g"
where \2 and \3 are parts of the pattern (I am calling them parts of the pattern because of this link)
Now, I need to compute the 4th part using \2 and \3. I have defined a shell function getInfo() which takes 2 arguments, does the required computation, and gives me back the 4th part. The problem is inserting this function into the sed command. I tried the following:
echo $line | sed "s/\([^#]*\)#\([^#]*\)#\([^# ]*\) /\1#\2#\3`getInfo \2 \3` /g"
but this does not work. The shell function is not receiving the parts of the pattern as arguments.
So the question is:
How to pass the sed parts of the pattern to a shell (function)?
I can easily write a shell script which would split the line word-by-word, do the required job, and then stitch the file back together, but I would really appreciate it if the shell could receive parts of the pattern as arguments from sed within the sed command.
Regards,
Salil Joshi
This might work for you:
Or if you have GNU sed:
There comes a point at which sed is no longer the correct tool for the job. I think this task has reached that point (but see the clever answer by potong, which shows that it can be done with bash and sed).
Which alternative tool do you use? You don't show the function, but if it can be conveniently calculated in the shell with a shell function, the chances are that awk is powerful enough to do the job. I'd probably fall back on Perl myself, but Python (or Ruby) would also work well. All of these allow you to write a function, to read the data and apply the function to the data before writing the data back out.
The problem with trying to use a function in sed is that it has no mechanism to define functions or to execute shell functions. To use sed, you'd have to think in terms of two passes through the data, the first extracting the (unique) tokens for subsequent processing, which would be to apply the shell function to each token, generating a sed script which simply matches each token and substitutes it with its replacement, followed by applying that script in the second pass over the data.
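As a rough sketch of the awk route suggested above: the question does not show getInfo, so the awk function below is only a placeholder that joins the character and the integer, and the wrapper name add_fourth_field is made up for illustration. The real computation would have to be rewritten in awk in place of the placeholder.

```shell
# Hypothetical wrapper: append "#<getInfo result>" to every token read from stdin.
add_fourth_field() {
    awk '
        # Placeholder for the real computation, which the question does not show.
        # Here it just joins the character and the integer, e.g. (1, x) -> "x1".
        function getInfo(num, chr) {
            return chr num
        }
        {
            for (i = 1; i <= NF; i++) {
                # part[1] = index, part[2] = integer, part[3] = character
                if (split($i, part, "#") == 3)
                    $i = $i "#" getInfo(part[2], part[3])
            }
            print    # modifying a field makes awk re-join the line with single spaces
        }
    '
}

# Example run on the sample line from the question.
printf '%s\n' 'w1#1#x w2#4#b w3#2#d' | add_fourth_field
```

Tokens that do not split into exactly three parts are passed through unchanged. Note that runs of whitespace between tokens are collapsed to single spaces on any line that is modified.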