Specifying a unicode range in an actionscript regu

Posted 2019-05-11 12:06

Question:

I have been trying to write a regular expression that matches all Unicode word characters, something like:

/[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF\w]/gi

But this fails completely and doesn't match anything. I have tried a variety of expressions, and it seems that as soon as I try to specify a range, it fails. Has anyone had better luck than me?

I wish ActionScript offered something like \p{L}, but if anything like it exists, I couldn't find it in the documentation.

Answer 1:

You can use String.fromCharCode with the unicode characters and then the ranges will work correctly in a regular expression. Here is an example using your original problem:

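// Build the character class from real characters. The backslash is doubled so the string passed to RegExp contains \w.
var exp:RegExp = new RegExp("[" + generateRangeForUnicodeVariables(0x00A0, 0xD7FF) + generateRangeForUnicodeVariables(0xF900, 0xFDCF) + generateRangeForUnicodeVariables(0xFDF0, 0xFFEF) + "\\w]", "gi");

// Returns a "first-last" range made of the two characters with the given character codes.
private function generateRangeForUnicodeVariables(var1:uint, var2:uint):String
{
   return String.fromCharCode(var1) + "-" + String.fromCharCode(var2);
}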


Answer 2:

This has been a problem for some time, and I couldn't find any indication that it has been fixed. It was previously asked in:

Restrict input to a specified language

and

How to specify a unicode range in a RegExp?

I know this is a hack, but it does work in JavaScript so you could use ExternalInterface to farm the test out there and pass the result back.
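
A minimal sketch of that workaround, assuming the SWF is embedded in an HTML page that allows scripting. The JavaScript function name matchesUnicodeWord and the class name UnicodeTestExample are hypothetical names chosen for illustration; the page itself has to define the JavaScript function, as shown in the comment.

```actionscript
package {
    import flash.display.Sprite;
    import flash.external.ExternalInterface;

    public class UnicodeTestExample extends Sprite {
        public function UnicodeTestExample() {
            trace("match:", matchesUnicodeWord("\u00E9"));
        }

        // Delegates the test to JavaScript in the host page.
        // Assumption: the page defines a function like this (hypothetical name):
        //   function matchesUnicodeWord(s) {
        //       return /[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF\w]/.test(s);
        //   }
        private function matchesUnicodeWord(text:String):Boolean {
            // ExternalInterface is only usable inside a suitable container (e.g. a browser).
            if (!ExternalInterface.available) {
                return false;
            }
            // call() returns whatever the JavaScript function returned.
            return ExternalInterface.call("matchesUnicodeWord", text) === true;
        }
    }
}
```

Each call crosses into the browser and back, so this is probably only reasonable for occasional checks such as validating a form field, not for matching inside a tight loop.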



Answer 3:

Hmm. Looks like it's not about ranges, it's about multi-byte characters.

This works:

 var exp:RegExp = new RegExp("[\u00A0-\u0FCF]", "gi");
 var str:String = "\u00A1 \u00A2 \u00A3 \u00A3";
 trace("subject:", str);
 trace("match:", str.match(exp));

And this does not:

 var exp:RegExp = new RegExp("[\u00A0-\u0FD0]", "gi");
 var str:String = "\u00A1 \u00A2 \u00A3 \u00A3";
 trace("subject:", str);
 trace("match:", str.match(exp));

Anyway, you can use the RegExp constructor, which converts a string into a matching pattern.