Regex allows a single letter character (it should not). Why?

Posted 2020-05-01 07:10

Question:

Hello, I am trying to create a regex that recognizes money amounts and plain numbers as input. I have to allow plain numbers because I expect unformatted numbers to be entered programmatically, and I will then format them myself. For some reason my regex accepts a single letter as valid input.

[\$]?[0-9,]*\.[0-9][0-9]

I understand that my regex accepts input with multiple commas and requires two digits after the decimal point; I already have an idea of how to fix that. I have narrowed the problem down to possibly the *\. part.

EDIT

I found a regex that works, [\$]?([0-9,])*[\.][0-9]{2}, but I still don't know how or why the first one was failing.

I am using .formatCurrency() to format the input as money. It can be found here, but it still allows alphabetic characters, so I have to mask the input further using $(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" }); where inputmask is found here and $(this) refers to an input element of type text. My code looks something like this:

<input type="text" id="123" data-Money="true">

 // in the script
 .find("input").each(function () {
     // Only mask inputs flagged as money fields
     if ($(this).attr("data-Money") == "true") {
         $(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });
         // Reformat the value as currency when the field loses focus
         $(this).on("blur", function () {
             $(this).formatCurrency();
         });
     }
 });

I hope this helps. I tried creating a JSFiddle, but I don't know how to add external libraries/plugins/extensions.

Answer 1:

The "regular expression" you're using in your example script isn't a RegExp:

$(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });

Rather, it's a String containing a pattern, which at some point your library converts into a true RegExp using something along the lines of

var RE=!(value instanceof RegExp) ? new RegExp(value) : value;

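Within String literals, a backslash \ introduces an escape sequence, like \n for a newline. A backslash in front of a period, i.e. \., is not a recognized escape sequence, so JavaScript simply drops the backslash and leaves a bare period in the String.

Thus, the RegExp created from your String never sees the backslash at all. Outside a character class, a bare . in a regular expression matches any character, which is how a pattern written as \. in a String can end up accepting a letter.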
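To see the difference concretely, here is a minimal sketch using the question's first pattern. The ^ and $ anchors are an assumption added here so that the whole input must match, which is only roughly how a validation mask behaves; the variable names are illustrative and not part of any library.

```javascript
// Pattern written as a String: "\$" and "\." are not recognized string
// escapes, so the backslashes are dropped before RegExp sees the text.
const fromString = new RegExp("^[\$]?[0-9,]*\.[0-9][0-9]$");

// Same pattern written as a regex literal: the backslashes survive.
const fromLiteral = /^[\$]?[0-9,]*\.[0-9][0-9]$/;

console.log(fromString.source);             // the "\." has become a bare "."
console.log(fromString.test("a12"));        // true: "." matches the letter
console.log(fromLiteral.test("a12"));       // false: "\." only matches a period
console.log(fromLiteral.test("$1,234.56")); // true
```

If the pattern has to stay a String, doubling each backslash ("\\.") should produce the same RegExp as the literal form.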

Instead of providing a String as your regular expression, use JavaScript's literal regular expression delimiters.

So rather than:

$(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });

use

$(this).inputmask('Regex', { regex: /[\$]?([0-9,])*[\.][0-9]{2}/ });

And I believe your "regular expression" will perform as you expect.

(Note the use of forward slashes / to delimit your pattern, which JavaScript will use to provide a true RegExp.)



Answer 2:

Firstly, you can replace '[0-9]' with '\d'. So we can rewrite your first regex a little more cleanly as

\$?[\d,]*\.\d\d

Breaking this down:

\$?       - A literal dollar sign, zero or one
[\d,]*    - Either a digit or a comma, zero or more
\.        - A literal dot, required
\d        - A digit, required
\d        - A digit, required

From this, we can see that the minimum legal string is \.\d\d, three characters long. The regex you gave will never validate against any one-character string.
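This is easy to check with a small sketch. It assumes the pattern is written as a regex literal and anchored with ^ and $ so that the entire input must match; the anchors and the names used are additions made for illustration.

```javascript
// The cleaned-up pattern, anchored so the whole string has to match (assumption).
const money = /^\$?[\d,]*\.\d\d$/;

// A handful of one-character inputs; none of them can satisfy "dot plus two digits".
const oneCharInputs = ["a", "5", "$", ".", ","];
const anyOneCharAccepted = oneCharInputs.some((s) => money.test(s));

// The shortest input the pattern accepts is three characters long.
const shortestAccepted = money.test(".00");

console.log(anyOneCharAccepted);       // false
console.log(shortestAccepted);         // true
console.log(money.test("$1,234.50"));  // true
```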

Looking at your second regex,

[\$]?     - A literal dollar sign, zero or one
([0-9,])* - Either a digit or a comma, captured as a group, zero or more
[\.]      - A literal dot, required
[0-9]{2}  - A digit, twice required

This has the exact same minimum matchable string as above - \.\d\d.

Edit: As mentioned, depending on the language you may need to escape backslashes to ensure they aren't misinterpreted by the language when processing the string.

Also, as an aside, the below regex is probably closer to what you need.

[A-Z]{3} ?(\d{0,3}(?:([,. ])\d{3}(?:\2\d{3})*)?)(?!\2)[,.](\d\d)\b

Explanation:

[A-Z]{3}       - Three letters; for an ISO currency code
 ?             - A space, zero or one; for readability
(              - Capture block; to catch the integer currency amount
  \d{0,3}        - A digit, between zero and three; for the first digit block
  (?:            - Non capturing block (NC)
    ([,. ])        - A comma, dot or space; captured as the thousands delimiter
    \d{3}          - A digit, three; the first possible whole thousands
    (?:            - Non capturing block (NC)
      \2             - Match 2; the captured thousands delimiter above
      \d{3}          - A digit, three
    )*             - The above group, zero or more, i.e. as many thousands as we want
  )?             - The above (NC) group, zero or one, i.e. all whole thousands
)              - The above group, i.e. everything before the decimal
(?!\2)         - Negative lookahead; the next character must not be the thousands delimiter
[,.]           - A comma or dot, as a decimal delimiter
(\d\d)         - Capture, a digit, two; i.e. the decimal portion
\b             - A word boundary; to ensure that we don't catch another
                 digit in the wrong place.

The negative lookahead was provided by an answer from John Kugelman in this question.

This correctly matches (matches enclosed in square brackets):

[AUD 1.00]
[USD 1,300,000.00]
[YEN 200 000.00]
I need [USD 1,000,000.00], all in non-sequential bills.

But not:

GBP 1.000
YEN 200,000
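A hedged aside for JavaScript specifically, since that is the language of the question: in ECMAScript, a backreference to a group that did not take part in the match matches the empty string, so (?!\2) always fails when no thousands delimiter was captured, and the pattern as written rejects AUD 1.00 there. The sketch below shows one possible adaptation, which is not part of the original answer: it moves the lookahead inside the optional thousands group so it is only checked when a delimiter was actually captured. The names original, adapted, samples and report are made up for illustration.

```javascript
// The pattern from the answer, unchanged.
const original = /[A-Z]{3} ?(\d{0,3}(?:([,. ])\d{3}(?:\2\d{3})*)?)(?!\2)[,.](\d\d)\b/;

// Assumed adaptation: (?!\2) now sits inside the optional thousands group,
// so it only runs when group 2 has captured a delimiter.
const adapted = /[A-Z]{3} ?(\d{0,3}(?:([,. ])\d{3}(?:\2\d{3})*(?!\2))?)[,.](\d\d)\b/;

const samples = ["AUD 1.00", "USD 1,300,000.00", "YEN 200 000.00", "GBP 1.000", "YEN 200,000"];

// Run both patterns over the examples from the answer and compare.
const report = samples.map((s) => ({
  input: s,
  original: original.test(s),
  adapted: adapted.test(s),
}));

console.log(report);
```

With the adapted pattern, the first three samples match and the last two do not, which is the behaviour the answer describes.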