Regular expression to allow spaces between words

Posted 2020-01-24 03:34

Question:

I want a regular expression that prevents symbols and only allows letters and numbers. The regex below works great, but it doesn't allow for spaces between words.

^[a-zA-Z0-9_]*$

For example, when using this regular expression "HelloWorld" is fine, but "Hello World" does not match.

How can I tweak it to allow spaces?

Answer 1:

tl;dr

Just add a space in your character class.

^[a-zA-Z0-9_ ]*$

 


Now, if you want to be strict...

The above isn't exactly correct. Because * means zero or more, and the character class now includes the space, it also matches all of the following cases that one would not usually mean to match:

  • An empty string, "".
  • A string consisting entirely of spaces, "      ".
  • A string with leading and/or trailing spaces, "   Hello World  ".
  • A string that contains multiple spaces in between words, "Hello   World".
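
To see these cases concretely, the permissive pattern can be run against them. This is a minimal sketch using Python's re module, chosen only for illustration (the question itself is tagged vb.net); the sample strings are the ones from the list above, plus one ordinary input.

```python
import re

# The permissive pattern from the tl;dr: any mix of word characters and spaces.
permissive = re.compile(r"^[a-zA-Z0-9_ ]*$")

samples = ["", "      ", "   Hello World  ", "Hello   World", "Hello World"]

# Record whether each sample is accepted by the pattern.
results = {s: bool(permissive.match(s)) for s in samples}
for s, ok in results.items():
    print(repr(s), ok)
```

Every sample is accepted, including the ones most people would want to reject.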

Originally I didn't think such details were worth going into, as OP was asking such a basic question that it seemed strictness wasn't a concern. Now that the question has gained some popularity, however, I want to say...

...use @stema's answer.

Which, in my flavor (without using \w) translates to:

^[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*$
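
As a rough check, the same edge cases can be fed to this stricter pattern. A minimal Python sketch; the accepted and rejected lists are illustrative samples, not part of the original answer.

```python
import re

# One or more words, each separated by exactly one space.
strict = re.compile(r"^[a-zA-Z0-9_]+( [a-zA-Z0-9_]+)*$")

accepted = ["HelloWorld", "Hello World", "a b c"]
rejected = ["", "      ", "   Hello World  ", "Hello   World"]

accepted_ok = all(strict.match(s) for s in accepted)
rejected_ok = not any(strict.match(s) for s in rejected)
print(accepted_ok, rejected_ok)
```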

(Please upvote @stema regardless.)

Some things to note about this (and @stema's) answer:

  • If you want to allow multiple spaces between words (say, if you'd like to allow accidental double-spaces, or if you're working with copy-pasted text from a PDF), then add a + after the space:

    ^\w+( +\w+)*$
    
  • If you want to allow tabs and newlines (whitespace characters), then replace the space with a \s+:

    ^\w+(\s+\w+)*$
    

    Here I suggest the + by default because, for example, Windows linebreaks consist of two whitespace characters in sequence, \r\n, so you'll need the + to catch both.
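
The difference between these variants shows up on double spaces and Windows line breaks. A small Python sketch with made-up sample strings; the variable names are only for illustration.

```python
import re

single_space = re.compile(r"^\w+( \w+)*$")      # exactly one space
multi_space = re.compile(r"^\w+( +\w+)*$")      # one or more spaces
any_whitespace = re.compile(r"^\w+(\s+\w+)*$")  # one or more whitespace chars
one_whitespace = re.compile(r"^\w+(\s\w+)*$")   # \s without the +

text_double = "Hello  World"    # two spaces
text_crlf = "Hello\r\nWorld"    # Windows line break

double_rejected = single_space.match(text_double) is None
double_accepted = multi_space.match(text_double) is not None
crlf_accepted = any_whitespace.match(text_crlf) is not None
crlf_rejected_without_plus = one_whitespace.match(text_crlf) is None
```

The last check is the reason for the +: a lone \s consumes the \r but then finds \n where a word character is required.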

Still not working?

Check what dialect of regular expressions you're using.* In languages like Java you'll have to escape your backslashes inside string literals, i.e. \\w and \\s. In older or more basic languages and utilities, like sed, \w and \s may not be defined, so write them out with character classes, e.g. [a-zA-Z0-9_] and [ \f\n\r\t\v], respectively.
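
A short Python sketch of both points: a doubled-backslash string spells the same pattern as a raw string, and the shorthand classes can be written out by hand. The explicit whitespace class here (space, \f, \n, \r, \t, \v) is an assumption about which characters to include, and it only mirrors \w and \s for ASCII input.

```python
import re

# A raw string and a doubled-backslash string are the same pattern text.
raw = r"^\w+(\s+\w+)*$"
escaped = "^\\w+(\\s+\\w+)*$"

# The same idea with the shorthand classes written out (ASCII only).
explicit = r"^[a-zA-Z0-9_]+([ \f\n\r\t\v]+[a-zA-Z0-9_]+)*$"

samples = ["Hello World", "Hello\tWorld", "Hello, World", " Hello"]

# Compare the shorthand and explicit forms on each sample.
same_results = all(
    bool(re.match(raw, s)) == bool(re.match(explicit, s)) for s in samples
)
```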

 


* I know this question is tagged vb.net, but based on 25,000+ views, I'm guessing it's not only those folks who are coming across this question. Currently it's the first hit on Google for the search phrase "regular expression space word".



Answer 2:

One possibility would be to just add the space into your character class, as acheong87 suggested. Whether that is acceptable depends on how strict your pattern needs to be, because it would also allow a string starting with 5 spaces, or strings consisting only of spaces.

The other possibility is to define a pattern:

I will use \w, which in most regex flavours is the same as [a-zA-Z0-9_] (in some it is Unicode-based):

^\w+( \w+)*$

This will allow a series of at least one word, with the words separated by single spaces.

^ Match the start of the string

\w+ Match a series of at least one word character

( \w+)* is a group that is repeated 0 or more times. In the group it expects a space followed by a series of at least one word character

$ matches the end of the string
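
The same breakdown can be written into the pattern itself. This sketch uses Python's re.VERBOSE flag, which ignores unescaped whitespace in the pattern, so the literal space is wrapped in brackets; the non-capturing group (?: ... ) is a substitution for the capturing group in the answer and does not change what matches.

```python
import re

word_list = re.compile(
    r"""
    ^           # start of the string
    \w+         # first word: one or more word characters
    (?:         # group, repeated zero or more times:
        [ ]     #   exactly one space, bracketed so VERBOSE mode keeps it
        \w+     #   followed by another word
    )*
    $           # end of the string
    """,
    re.VERBOSE,
)

one_word = word_list.match("Hello") is not None
two_words = word_list.match("Hello World") is not None
double_space = word_list.match("Hello  World") is None
leading_space = word_list.match(" Hello") is None
```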



Answer 3:

This one worked for me

([\w ]+)
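
One caveat worth checking: without ^ and $ this pattern finds the first run of allowed characters rather than validating the whole string. A minimal Python sketch, with an illustrative input:

```python
import re

unanchored = re.compile(r"([\w ]+)")

# search() succeeds on a string containing symbols, because it only
# needs to find one run of word characters and spaces somewhere.
found = unanchored.search("Hello, World!")
first_run = found.group(1)

# fullmatch() requires the whole string to consist of allowed characters.
whole_string_ok = unanchored.fullmatch("Hello, World!") is not None
```

If the goal is validation, anchoring the pattern or using a whole-string match is the safer choice.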


Answer 4:

Try with:

^(\w+ ?)*$

Explanation:

\w   - alias for [a-zA-Z_0-9]
 ?   - allows an optional single space after each word

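A quick Python sketch of what this pattern accepts. The sample strings and expected values are illustrative; note that the empty string and a trailing space both pass.

```python
import re

optional_space = re.compile(r"^(\w+ ?)*$")

# Expected outcome for each sample string.
checks = {
    "Hello World": True,
    "": True,              # zero repetitions of the group
    "Hello World ": True,  # the optional space may end the string
    " Hello": False,
    "Hello  World": False,
}
actual = {s: bool(optional_space.match(s)) for s in checks}
print(actual == checks)
```

Because the group nests one quantifier inside another, long non-matching inputs may backtrack heavily in some engines, so the form ^\w+( \w+)*$ is likely the safer one for untrusted input.
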

Answer 5:

I assume you don't want leading/trailing space. This means you have to split the regex into "first character", "stuff in the middle" and "last character":

^[a-zA-Z0-9_][a-zA-Z0-9_ ]*[a-zA-Z0-9_]$

or if you use a perl-like syntax:

^\w[\w ]*\w$

Also: If you intentionally worded your regex so that it also allows empty strings, you have to make the entire thing optional:

^(\w[\w ]*\w)?$

If you want to only allow single space chars, it looks a bit different:

^((\w+ )*\w+)?$

This matches 0..n words, each followed by a single space, plus one final word without a trailing space. The entire thing is then made optional to allow empty strings.
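
A small Python sketch comparing the "first, middle, last" pattern with the single-space variant. The sample strings are made up; one thing it shows is that the first pattern needs at least two characters, so a one-letter input is rejected.

```python
import re

first_middle_last = re.compile(r"^\w[\w ]*\w$")
optional_single = re.compile(r"^((\w+ )*\w+)?$")

# First pattern: no leading/trailing space, any number of inner spaces.
fml_multi_space = first_middle_last.match("Hello   World") is not None
fml_one_char = first_middle_last.match("a") is None
fml_trailing = first_middle_last.match("Hello ") is None

# Second pattern: single spaces only, empty string allowed.
opt_empty = optional_single.match("") is not None
opt_one_char = optional_single.match("a") is not None
opt_double = optional_single.match("Hello  World") is None
```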



Answer 6:

This regular expression

^\w+(\s\w+)*$

will only allow a single whitespace character between words (\s matches a space, tab, or line-break character) and no leading or trailing whitespace.

Below is the explanation of the regular expression:

  1. ^ Assert position at start of the string
  2. \w+ Match any word character [a-zA-Z0-9_]
    1. Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
  3. 1st Capturing group (\s\w+)*
    1. Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    2. \s Match any white space character [\r\n\t\f ]
    3. \w+ Match any word character [a-zA-Z0-9_]
      1. Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
  4. $ Assert position at end of the string
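
A minimal Python sketch of steps 3 and 4, with illustrative inputs. It shows that \s accepts a tab or newline as the separator, and it notes one Python-specific detail: $ also matches just before a final newline, so fullmatch() is the stricter test for "end of the string".

```python
import re

single_ws = re.compile(r"^\w+(\s\w+)*$")

# Any single whitespace character works as a separator.
tab_ok = single_ws.match("Hello\tWorld") is not None
newline_ok = single_ws.match("Hello\nWorld") is not None

# Two whitespace characters in a row are rejected.
double_space = single_ws.match("Hello  World") is None
crlf = single_ws.match("Hello\r\nWorld") is None

# Python-specific: $ matches before a trailing newline, fullmatch() does not.
trailing_nl_match = single_ws.match("Hello World\n") is not None
trailing_nl_full = single_ws.fullmatch("Hello World\n") is None
```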


Answer 7:

This does not allow a space at the beginning, but allows spaces between words. It also allows special characters after the first word. A good regex for FirstName and LastName fields. Note the leading ^: without it, a search would simply skip over any leading spaces.

^\w+.*$
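
Whether a leading space is rejected depends on anchoring. A short Python sketch, with made-up names as input, comparing the pattern with and without a leading ^:

```python
import re

unanchored = re.compile(r"\w+.*$")
anchored = re.compile(r"^\w+.*$")

# search() scans forward, so the unanchored form skips the leading space
# and still reports a match starting at "John".
skipped = unanchored.search(" John Smith")

# With ^, the match must begin at the first character, so it is rejected.
rejected = anchored.search(" John Smith")

# Anything may follow the first word, including punctuation.
permissive = anchored.search("John O'Neil-Smith")
```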


Answer 8:

For letters only:

^([a-zA-Z])+(\s)+[a-zA-Z]+$

For alphanumeric value and _:

^(\w)+(\s)+\w+$
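
Worth noting when trying these: both patterns require exactly two words, separated by one or more whitespace characters. A minimal Python sketch with illustrative inputs:

```python
import re

two_words = re.compile(r"^(\w)+(\s)+\w+$")

exactly_two = two_words.match("Hello World") is not None
wide_gap = two_words.match("Hello   World") is not None

# A single word or three words do not fit the word-gap-word shape.
one_word = two_words.match("Hello") is None
three_words = two_words.match("Hello big World") is None
```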


Answer 9:

Try this (Python version):

"^[A-Za-z0-9 ]{2,25}$"

Change the upper limit based on your data set.
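
A hedged Python sketch of a length-bounded pattern, assuming the intent is a bracketed character class followed by a {2,25} bound. It also shows why the bound should be written without a space: Python's re treats "{2, 25}" as literal text rather than a quantifier.

```python
import re

bounded = re.compile(r"^[A-Za-z0-9 ]{2,25}$")

ok = bounded.match("Hello World") is not None
too_short = bounded.match("a") is None
too_long = bounded.match("a" * 26) is None

# With a space inside the braces, the braces are matched literally,
# so ordinary input no longer matches.
spaced = re.compile(r"^[A-Za-z0-9 ]{2, 25}$")
spaced_rejects_words = spaced.match("Hello World") is None
spaced_matches_literal = spaced.match("a{2, 25}") is not None
```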



Answer 10:

Just add a space to the end of your character class, as follows:

[a-zA-Z0-9_ ]


Answer 11:

Had a good look at many of these supposed answers...

...and bupkis after scouring Stack Overflow as well as other sites for a regex that matches any string with no starting or trailing white-space and only a single space between strictly alpha character words.

^[a-zA-Z]+\s([a-zA-Z]+\s)*[a-zA-Z]+$

Thus easily modified to alphanumeric:

^[a-zA-Z0-9]+\s([a-zA-Z0-9]+\s)*[a-zA-Z0-9]+$

(This does not match single words but just use a switch/if-else with a simple ^[a-zA-Z0-9]+$ if you need to catch single words in addition.)

enjoy :D



Answer 12:

I find this one works well for a "FullName" (with a case-insensitive flag, since the class only lists lowercase letters):

([a-z',.-]+( [a-z',.-]+)*){1,70}


Answer 13:

Try .*? to allow whitespace; it worked for me.
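
Note that .*? accepts any characters on a single line, not just letters, digits, and spaces, so on its own it does not keep symbols out as the question asks. A tiny Python sketch with an illustrative input:

```python
import re

anything = re.compile(r".*?")

# Both a string full of symbols and the empty string are accepted.
symbols_ok = anything.fullmatch("Hello, World! #$%") is not None
empty_ok = anything.fullmatch("") is not None
```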