I'm trying to use a regexp with sed. I've tested my regex with kiki, a GNOME application for testing regexps, and it works in kiki.
date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp; lines: +5 -2; commitid: bvEcb00aPyqal6Uu;
I want to replace author: 00000000000; with nothing. So I created this regexp, which works when I test it in kiki:
author:\s[0-9]{11};
But it doesn't work when I test it in sed:
sed -i "s/author:\s[0-9]{11};//g" /tmp/test_regex.txt
I know regex engines have different implementations, and this could be the issue. My question is: how do I at least try to "debug" what's happening with sed? Why is it not working?
If you want to debug a sed command, you can use the w (write) command to dump which lines sed has matched to a file.

From sed manpages:

Applying to your question
Let's use a file named sed_dump.txt as the sed dump file.
1) Generate the sed dump:
2) Check file sed_dump.txt contents:
Output:
It's empty...
3) Trying to escape '{' regex control character:
4) Check file sed_dump.txt contents:
Output:
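The four steps above can be reproduced with a sketch along these lines. It assumes GNU sed (for \s) and uses the w flag of the s command. The temporary directory, the sample file and the two dump file names are illustrative choices, not the exact commands from the original steps.

```shell
#!/bin/sh
# Sketch only: the paths and file names below are assumptions for illustration.
workdir=$(mktemp -d)
input="$workdir/test_regex.txt"
dump_plain="$workdir/sed_dump_plain.txt"
dump_escaped="$workdir/sed_dump_escaped.txt"

# Sample line taken from the question.
printf '%s\n' 'date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp; lines: +5 -2; commitid: bvEcb00aPyqal6Uu;' > "$input"

# Steps 1-2: unescaped braces. -n suppresses normal output; the w flag writes
# the result of every successful substitution to the dump file.
sed -n "s/author:\s[0-9]{11};//w $dump_plain" "$input"
echo "plain braces:   $(wc -c < "$dump_plain") bytes dumped"

# Steps 3-4: escaped braces, so \{11\} is a repetition count in basic regex.
sed -n "s/author:\s[0-9]\{11\};//w $dump_escaped" "$input"
echo "escaped braces: $(wc -c < "$dump_escaped") bytes dumped"
```

An empty first dump next to a non-empty second one indicates that only the escaped form matched.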
Conclusion
In step 4), a line has been matched, which means that sed matched your pattern in that line. It does not guarantee the correct answer, but it's a way of debugging using sed itself.

Depending on your sed, you may be using the -i flag incorrectly: BSD sed requires a string after -i to use as the suffix of the backup file, while GNU sed treats that suffix as optional. You also need to escape your curly braces.
I usually debug my statement by starting with a regex I know will work (like 's/author//g' in this case). When that works I know that I have the right arguments. Then I expand the regex incrementally.
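That incremental approach might look like the sketch below. The try helper and the shortened sample line are hypothetical names introduced for illustration, and \s assumes GNU sed.

```shell
#!/bin/sh
# Shortened sample line (assumption: only the relevant fields are kept).
line='date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp;'

# try PATTERN: report whether s/PATTERN//g changes the sample line.
# Hypothetical helper; it assumes PATTERN contains no "/" character.
try() {
    result=$(printf '%s\n' "$line" | sed "s/$1//g")
    if [ "$result" = "$line" ]; then
        printf 'no match: %s\n' "$1"
    else
        printf 'match: %s\n' "$1"
    fi
}

# Grow the expression one feature at a time.
try 'author'
try 'author:\s'
try 'author:\s[0-9]'
try 'author:\s[0-9]{11};'
try 'author:\s[0-9]\{11\};'
```

The first attempt that reports "no match" points at the feature that sed interprets differently; with GNU sed that should be the one with unescaped braces.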
The fact that you are substituting author: 00000000000 is already expressed in sed by the s before the first /.

There is a Python script called sedsed by Aurelio Jargas which will show the stepwise execution of a sed script. A debugger like this isn't going to help much in the case of characters being taken literally (e.g. {) versus having special meaning (e.g. \{), especially for a simple substitution, but it will help when a more complex script is being debugged.

The latest SVN version.
The most recent stable release.
Disclaimer: I am a minor contributor to sedsed.

Another sed debugger, sd by Brian Hiles, written as a Bourne shell script (I haven't used this one).

You have to use the -r flag for extended regex:
or you have to escape the {} characters:
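Both variants can be sketched as follows. This assumes GNU sed, where -r switches to extended regular expressions and \s matches whitespace; the shortened sample line and the variable names are illustrative.

```shell
#!/bin/sh
# Shortened sample line (assumption: only the relevant fields are kept).
line='date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp;'

# Extended regex: {11} is a repetition count without any escaping.
extended=$(printf '%s\n' "$line" | sed -r 's/author:\s[0-9]{11};//g')

# Basic regex (the default): the braces must be escaped to act as a count.
escaped=$(printf '%s\n' "$line" | sed 's/author:\s[0-9]\{11\};//g')

printf '%s\n' "$extended"
printf '%s\n' "$escaped"
```

The two commands should print the same line, with the author field removed.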
My version of sed doesn't like the {11} bit. Processing the line with:
works fine.
And the way I debug it is exactly what I did here. I just constructed a command:
and removed the more advanced regex things one at a time:
<space> instead of \s, didn't fix it.
[0-9]{11} with 11 copies of [0-9], that worked.

It pretty much had to be one of those since I've used every other feature of your regex before with sed successfully.

But, in fact, this will actually work without the hideous 11 copies of [0-9]; you just have to escape the braces: [0-9]\{11\}. I have to admit I didn't get around to trying that since it worked okay with the multiples and I generally don't concern myself too much with brevity in sed since I tend to use it more for quick'n'dirty jobs :-)

But the brace method is a lot more concise and adaptable and it's good to know how to do it.
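As a rough sketch of that elimination, the three variants can be compared side by side. It assumes GNU sed for \s; the shortened sample line and the variable names are illustrative, and the eleven copies are built in a loop only to avoid miscounting them by hand.

```shell
#!/bin/sh
# Shortened sample line (assumption: only the relevant fields are kept).
line='date: 2010-10-29 14:46:33 -0200; author: 00000000000; state: Exp;'

# Build eleven copies of [0-9].
digits=''
i=0
while [ "$i" -lt 11 ]; do
    digits="${digits}[0-9]"
    i=$((i + 1))
done

# Variant 1: literal space instead of \s, braces left unescaped.
with_space=$(printf '%s\n' "$line" | sed 's/author: [0-9]{11};//g')

# Variant 2: eleven copies of [0-9] instead of {11}.
with_copies=$(printf '%s\n' "$line" | sed "s/author:\s${digits};//g")

# Variant 3: escaped braces.
with_braces=$(printf '%s\n' "$line" | sed 's/author:\s[0-9]\{11\};//g')

printf 'space only:     %s\n' "$with_space"
printf 'eleven copies:  %s\n' "$with_copies"
printf 'escaped braces: %s\n' "$with_braces"
```

If the first line comes back unchanged while the other two agree, the braces were the culprit rather than \s.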