What is the best way to match only letters in a re

Posted 2019-04-04 06:59

I would really like to use \w, but it also matches underscores, so I'm going with [A-Za-z], which feels unnecessarily verbose and America-centric. Is there a better way to do this? Something like [\w^_] (I doubt I got that syntax right)?

7 answers
We Are One
#2 · 2019-04-04 07:16

You could just as well use /[a-z]/i or /[[:alpha:]]/. Note that \w also matches digits, so it would not work here even apart from the underscore.

三岁会撩人
#3 · 2019-04-04 07:17

Are you looking for internationalization in your regex? Then you'll need to do something like what this guy did: "JavaScript validation issue with international characters".

That is, explicitly match on all of the moon language letters :)

劳资没心,怎么记你
#4 · 2019-04-04 07:25

Perhaps you mean /[[:alpha:]]/? See perlre for the discussion of POSIX character classes.

劫难
#5 · 2019-04-04 07:30
[^\W0-9_]

# or

[[:alpha:]]

See perldoc perlre
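
A minimal sketch of how the two patterns above behave, assuming a handful of made-up sample characters (the list `@samples` and the variable names are hypothetical, not from the answer):

```perl
use strict;
use warnings;

# Hypothetical sample characters; "\x{444}" is the Cyrillic letter ef,
# written as an escape so the source file can stay plain ASCII.
my @samples = ('a', 'Z', "\x{444}", '_', '7', '-');

# [^\W0-9_] reads as "not a non-word character, not 0-9, not an underscore",
# which leaves the word characters minus ASCII digits and the underscore.
my @by_negation = grep { /\A[^\W0-9_]\z/ } @samples;

# The POSIX class gives the same result for these samples.
my @by_posix = grep { /\A[[:alpha:]]\z/ } @samples;

binmode(STDOUT, ':encoding(UTF-8)');
print "negated class: @by_negation\n";
print "POSIX class:   @by_posix\n";
```

On Unicode text the two are not guaranteed to agree: \w also covers non-ASCII digits and combining marks, and the negated class only removes ASCII 0-9 and the underscore.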

手持菜刀,她持情操
#6 · 2019-04-04 07:30

A few options:

1. /[a-z]/i               # case insensitive
2. /[A-Z]/i               # case insensitive
3. /[A-z]/                # explicit range listing (capital 'A' to lowercase 'z')
4. /[[:alpha:]]/          # POSIX alpha character class

I recommend using either the case-insensitive form or the explicit /[a-zA-Z]/, unless you have a particular language's letters in mind.

Note:

  • Number 3 needs the capital 'A' first and the lowercase 'z' second because of the order of the ASCII values; the reverse, a-Z, is not a valid range. This form also fails the no-underscore criterion, since the range A-z includes [ \ ] ^ _ ` .
  • Number 4 matches letters from other languages as well, but it also matches:
    ʹʺʻˍˎˏːˑˬˮ̀́   (plus many others)
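
To see the problem with number 3 concretely, here is a small sketch that lists every ASCII character matched by [A-z] but not by [A-Za-z] (the variable name `@extras` is made up for the example):

```perl
use strict;
use warnings;

# Walk the 128 ASCII code points and keep the characters that the
# A-z range matches but a proper letters-only class does not.
my @extras = grep { /[A-z]/ && !/[A-Za-z]/ } map { chr($_) } 0 .. 127;

print join(' ', @extras), "\n";   # prints: [ \ ] ^ _ `
```

These are the six characters that sit between 'Z' and 'a' in ASCII, which is why the range silently picks them up.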
够拽才男人
#7 · 2019-04-04 07:37

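Just use \p{L}, which means "any Unicode letter" and works in Perl (/\p{L}/). You need use utf8; only if your source code itself contains UTF-8 literals; text read from files or other input has to be decoded separately.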
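
A short sketch of \p{L} in use, assuming a made-up sample string (`$text` and the other names are hypothetical). The non-ASCII letters are written as \x{...} escapes, so this particular script does not need use utf8;:

```perl
use strict;
use warnings;

# Hypothetical sample: "naive_cafe 42 mir" with accented and Cyrillic letters,
# spelled with escapes so the source file stays plain ASCII.
my $text = "na\x{ef}ve_caf\x{e9} 42 \x{43c}\x{438}\x{440}";

# Runs of letters: the underscore, the space and the digits all act as separators.
my @letter_runs = $text =~ /\p{L}+/g;

# Count individual letters using the list-assignment-in-scalar-context idiom.
my $letter_count = () = $text =~ /\p{L}/g;

binmode(STDOUT, ':encoding(UTF-8)');
print "runs: @letter_runs\n";
print "letters: $letter_count\n";
```

Unlike \w, the pattern skips both the underscore and the digits, so no extra exclusions are needed.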
