Regex for finding a letter after an underscore (regex for finding Letter after Underscore)

Posted 2019-09-30 09:12

I want to write a Unix command, using a regular expression, that identifies all strings that do NOT conform to the following format:

First letter is uppercase
Followed by any number of letters
Underscore
Followed by an uppercase letter
Followed by any number of letters
Underscore
and so on ...

The number of underscores is variable.

So valid ones are                                     Invalid ones are
Alpha_Beta_Gamma                                      alph_Beta_Gamma
Alpha_Beta_Gamma_Delta                                Alpha_beta_Gamma
Alppha_Beta                                           Alpha_beta
Aliph_Theta_Pi_Chi_Ming                               Alpha_theta_Pi_Chi_Ming

Answer 1:

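grep has a -v option that inverts the match (i.e. it returns the non-matching lines). The -E option puts grep into extended-regexp mode, which lets you write + and parentheses in the pattern without backslash-escaping them.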
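To see what the two options do in isolation, here is a minimal sketch on three made-up lines (the sample strings and variable names are illustrative only, not from the question). It runs the same extended pattern once normally and once with -v:

```shell
# Three illustrative lines; only the first has the form Alpha followed by one or more _Beta
sample=$(printf '%s\n' 'Alpha_Beta' 'Alpha' 'Gamma_Beta')

# -E: extended syntax, so ( ) and + are written without backslashes
matched=$(printf '%s\n' "$sample" | grep -E '^Alpha(_Beta)+$')

# -v: invert the selection, keeping only the lines that do NOT match
unmatched=$(printf '%s\n' "$sample" | grep -E -v '^Alpha(_Beta)+$')

printf 'matched:\n%s\n' "$matched"
printf 'unmatched:\n%s\n' "$unmatched"
```

Without -E, the same pattern would have to be written in basic syntax with backslashes, e.g. \( and \) for the group.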

You can use this pattern (broken down for clarity):

^              # beginning of string
  [A-Z]        # a single uppercase letter
  [a-z]*       # zero or more lowercase letters
  (            # start a group
    _          # an underscore
    [A-Z]      # a single uppercase letter
    [a-z]*     # zero or more lowercase letters
  )+           # close the group and it can appear one or more times
$              # end of string
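One consequence of the + on the group is that at least one underscore is required, so a single word such as Alpha would be reported as non-conforming. The question does not say whether that is intended. If a word without any underscore should count as valid, replacing + with * is one possible adjustment; the sketch below compares the two (the variable names are hypothetical):

```shell
# The pattern from the breakdown, once with + and once with * on the group
plus='^[A-Z][a-z]*(_[A-Z][a-z]*)+$'
star='^[A-Z][a-z]*(_[A-Z][a-z]*)*$'

# Count how many input lines are rejected (do not match) by each variant.
# "|| true" is there because grep exits with status 1 when it selects no lines.
rejected_plus=$(printf 'Alpha\n' | grep -E -v -c "$plus" || true)
rejected_star=$(printf 'Alpha\n' | grep -E -v -c "$star" || true)

echo "with +: $rejected_plus rejected, with *: $rejected_star rejected"
```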

Assuming you have a file test.dat containing the 8 strings from your question:

grep -E -v "^[A-Z][a-z]*(_[A-Z][a-z]*)+$" test.dat

This returns:

alph_Beta_Gamma
Alpha_beta_Gamma
Alpha_beta
Alpha_theta_Pi_Chi_Ming
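For anyone who wants to reproduce this, the following is a sketch that builds the input file from the eight strings in the question and runs the same command. The temporary file stands in for test.dat, and the LC_ALL=C prefix is an added assumption, not part of the original answer: in some locales a range like [A-Z] can follow collation order rather than plain ASCII, and forcing the C locale avoids that.

```shell
# Scratch file standing in for test.dat (hypothetical; any file name works)
tmpfile=$(mktemp)

# The four valid strings followed by the four invalid ones from the question
printf '%s\n' \
  'Alpha_Beta_Gamma' 'Alpha_Beta_Gamma_Delta' 'Alppha_Beta' 'Aliph_Theta_Pi_Chi_Ming' \
  'alph_Beta_Gamma' 'Alpha_beta_Gamma' 'Alpha_beta' 'Alpha_theta_Pi_Chi_Ming' > "$tmpfile"

# LC_ALL=C (assumption) makes [A-Z] and [a-z] plain ASCII ranges
invalid=$(LC_ALL=C grep -E -v '^[A-Z][a-z]*(_[A-Z][a-z]*)+$' "$tmpfile")

printf '%s\n' "$invalid"
rm -f "$tmpfile"
```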


Source: regex for finding Letter after Underscore
Tags: regex unix grep