Regex and accents/tildes

Posted 2019-07-02 16:43

Question:

How should I include accented characters in a regex? I'd like to match the letters a-z as well as äáàëéèíìöóòúùñç, but this regex doesn't work:

$pattern = '/^([a-zäáàëéèíìöóòúùñç])/i';

Answer 1:

How about:

cat test.php
<?php
preg_match('/\pL/u', 'é', $m);
print_r($m);
?>


php -q test.php
Array
(
    [0] => é
)


Answer 2:

You can try using a pre-defined class to match all letters, for instance:

[\p{L}]

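This works in most regex engines that support Unicode properties (PCRE, Perl, Java and .NET, among others). In PHP, combine it with the u modifier so that the pattern and the subject are treated as UTF-8.

You can read more about Unicode in regexes here: http://www.regular-expressions.info/unicode.html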
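As a rough sketch of how \p{L} could be used for the original problem, the snippet below checks whether a string starts with a letter. The function name startsWithLetter is made up for illustration, and the code assumes the subject strings are valid UTF-8. Note that \p{L} matches any Unicode letter (Greek, Cyrillic and so on), which is broader than the list in the question.

```php
<?php
// Hypothetical helper (not from the thread): true if $text begins with
// any Unicode letter. Assumes $text is valid UTF-8.
function startsWithLetter(string $text): bool
{
    // \p{L} = any letter; the u modifier makes PCRE read UTF-8 characters.
    return preg_match('/^\p{L}/u', $text) === 1;
}

var_dump(startsWithLetter('ñandú'));   // accented Latin letter
var_dump(startsWithLetter('Ωmega'));   // Greek letter, also matched by \p{L}
var_dump(startsWithLetter('9 lives')); // starts with a digit
```

If only the specific accented letters from the question should be accepted, an explicit character class with the u modifier is probably the better fit.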



Answer 3:

Try adding the u modifier to your regex (see PCRE_UTF8 on the pattern modifiers page of the PHP manual).
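One likely reason the original pattern misbehaves is that, without the u modifier, PCRE treats the pattern and the subject as single bytes, so a multi-byte character such as ñ is only partly matched. The sketch below compares the two variants; it assumes the source file and the subject are UTF-8 encoded, and firstLetter is a hypothetical helper, not something from the question.

```php
<?php
// Hypothetical helper: returns the first captured group, or null on no match.
function firstLetter(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

$withoutU = '/^([a-zäáàëéèíìöóòúùñç])/i';  // pattern from the question
$withU    = '/^([a-zäáàëéèíìöóòúùñç])/iu'; // same pattern plus the u modifier

// Without u the class is a set of bytes, so only the first byte
// of the two-byte "ñ" ends up in the capture group.
var_dump(strlen(firstLetter($withoutU, 'ñandú')));

// With u the class is a set of code points, so the whole
// character is captured.
var_dump(firstLetter($withU, 'ñandú'));
```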



Answer 4:

What happens if you remove the trailing 'i'? I tested it in Rubular and it works without it (and also without the / delimiters, since those are PHP-specific).

So basically, test it on that page like this: ^([a-zäáàëéèíìöóòúùñç])



Answer 5:

The solution to my problem is in the question "Using of regex whith preg_replace_callback". It seems the regex has to look like this: $pattern = '/(\p{L})(.+)/iu';
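The linked question is not reproduced here, so the following is only a guess at how such a pattern might be used with preg_replace_callback. The helper markFirstLetter and its bracket-wrapping behaviour are invented for illustration; the subject is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: wraps the first letter of $text in square brackets.
function markFirstLetter(string $text): string
{
    $result = preg_replace_callback(
        '/(\p{L})(.+)/iu', // pattern from the answer above
        function (array $m): string {
            // $m[1] is the first letter, $m[2] is the rest of the line.
            return '[' . $m[1] . ']' . $m[2];
        },
        $text
    );

    // preg_replace_callback returns null on error (e.g. invalid UTF-8);
    // fall back to the unchanged input in that case.
    return $result ?? $text;
}

echo markFirstLetter('élan vital'), "\n";
echo markFirstLetter('123 ñu'), "\n";
```

Two things to keep in mind: (.+) needs at least one character after the letter, so a one-letter string is left unchanged, and the i modifier is redundant here because \p{L} already covers both upper and lower case.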