How to use regex negative lookahead

Posted 2019-07-23 04:55

I'm trying to extract email addresses from a file using egrep -o -e, and I'm having trouble with addresses at the end of a line.

Here is my regex:

egrep -o -e "[._a-zA-Z0-9]+@[._a-zA-Z0-9]+.[._a-zA-Z0-9]+" ~/myfile.txt

I realize this will not catch every variation of an email address, but if the address is at the end of a line this is what I get:

user@_12345@myemail.com\ul

So I figured I'd try a negative lookahead, but I have no idea how to properly use it. I've read a few things online but I'm confused by how it works.

This is what I've tried:

egrep -o -e "(?!\\[._a-zA-Z0-9]+@[._a-zA-Z0-9]+.[._a-zA-Z0-9]+)" ~/myfile.txt

Bash fails with event not found: \\[._a

Any suggestions?

2 Answers
劳资没心,怎么记你
#2 · 2019-07-23 05:10

What does the dot stand for?

"[._a-zA-Z0-9]+@[._a-zA-Z0-9]+.[._a-zA-Z0-9]+"
                              ^
                             here

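An unescaped dot matches any single character, so here it matches the second at-sign. If you remove it, your original regex with no lookahead will work. To match a literal dot instead, escape it as \.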
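
To make the dot's effect concrete, here is a minimal sketch that runs the pattern with and without the escaped dot. The sample line is a hypothetical one modelled on the output in the question, not the asker's actual file, and which of the two results is the "right" address depends on the data.

```shell
# Hypothetical sample line modelled on the output shown in the question.
line='user@_12345@myemail.com'

# Unescaped dot: it matches any character, including the second at-sign,
# so the whole run is reported as one address.
unescaped=$(printf '%s\n' "$line" | grep -E -o '[._a-zA-Z0-9]+@[._a-zA-Z0-9]+.[._a-zA-Z0-9]+')

# Escaped dot: \. matches only a literal dot, so the match has to start
# after the first at-sign.
escaped=$(printf '%s\n' "$line" | grep -E -o '[._a-zA-Z0-9]+@[._a-zA-Z0-9]+\.[._a-zA-Z0-9]+')

printf 'unescaped: %s\n' "$unescaped"   # unescaped: user@_12345@myemail.com
printf 'escaped:   %s\n' "$escaped"     # escaped:   _12345@myemail.com
```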

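Moreover, ! is a special character in interactive bash (history expansion). To use it literally, backslash it outside quotes or put it inside single quotes; inside double quotes the backslash stops the expansion but stays in the string.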

骚年 ilove
#3 · 2019-07-23 05:33

The ! is being interpreted by bash as a history expansion. You should use single quotes rather than double quotes to prevent this.
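
A small sketch of the quoting point, using a made-up pattern (abc is just a placeholder). History expansion only applies to the text you type at an interactive prompt, so once the pattern is stored via single quotes it is safe to expand the variable inside double quotes later.

```shell
# Single quotes pass every character through untouched, so the
# exclamation mark never reaches history expansion.
pattern='(?!abc)'

# Expanding the variable in double quotes is fine: history expansion
# runs on the typed line, before variables are substituted.
printf '%s\n' "$pattern"   # (?!abc)
```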

However, note that negative lookahead may not be supported by your version of grep either: egrep uses POSIX extended regular expressions, which have no lookahead syntax. In that case you need a more powerful regex tool such as perl or ack.
