Detect non-alphabetic language string in PHP

Posted 2019-08-02 16:12

Question:

I have a similar question to this one

However, I have a less strict requirement.

All I need is to detect whether the input string contains any non-alphabetic characters.

If it does, I will select a different font file.

If it contains ONLY alphabetic characters, I will select a font file like AmericanTypeWriter.

By alphabetic, I also mean to include all kinds of symbols, such as commas and other punctuation.

It is hard to define an alphabetic string precisely.

So let me give an example of a non-alphabetic string instead:

这是中文

Assume the string is UTF-8 encoded.

Another way to put it: any character that does not belong to a non-European script is treated as alphabetic.

I need to do this detection in PHP, by the way.

Answer 1:

Use ctype_alnum($string) (see the PHP manual). It takes a string and returns true only if every character is a letter or a digit, so it returns false for Chinese text, but also for any string that contains spaces or punctuation.

You could also use a regex, which lets you allow spaces as well. The following should work:

preg_match('/^[a-zA-Z0-9 ]*$/u', $string) === 1 is true when the string contains only ASCII letters, digits, and spaces. Note that it still rejects punctuation.
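Since the question wants commas and other punctuation to count as alphabetic, the character class above is probably too narrow. Here is a rough sketch of a broader check based on PCRE Unicode script properties. The function name usesOnlyEuropeanScripts and the choice of scripts (Latin, Greek, Cyrillic, plus the shared Common and Inherited classes) are assumptions made for illustration, not something the question specifies.

```php
<?php
// Hypothetical helper: returns true if every character belongs to a script
// treated here as "European" (Latin, Greek, Cyrillic) or is a shared
// character such as a digit, space, or punctuation mark (Common), or a
// combining mark (Inherited). The script list is an assumption.
function usesOnlyEuropeanScripts(string $text): bool
{
    // The /u flag makes PCRE treat both the pattern and $text as UTF-8.
    // preg_match() returns false on invalid UTF-8, so such input yields false.
    $pattern = '/^[\p{Latin}\p{Greek}\p{Cyrillic}\p{Common}\p{Inherited}]*$/u';
    return preg_match($pattern, $text) === 1;
}

var_dump(usesOnlyEuropeanScripts('Hello, world!'));
var_dump(usesOnlyEuropeanScripts('这是中文'));
```

One caveat: Common also covers punctuation shared by many scripts, so a string made up only of such symbols would still pass this check.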



Answer 2:

I found a better answer, inspired by https://stackoverflow.com/a/4923410/80353:

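// Send output as UTF-8 and make the mbstring functions assume UTF-8.
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('UTF-8');

// Returns true if $text contains at least one Han-script character.
function isThisChineseText($text) {
    // The /u flag makes PCRE treat $text as UTF-8.
    return preg_match('/\p{Han}/u', $text) === 1;
}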
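To tie this back to the original goal of choosing a font, a Han check like this could drive the font selection. The following is only a sketch: the helper names containsHan and pickFontFile and both font paths are hypothetical placeholders, and it only handles Han text, not other non-European scripts.

```php
<?php
// Hypothetical helper: true if $text contains at least one Han character.
function containsHan(string $text): bool
{
    return preg_match('/\p{Han}/u', $text) === 1;
}

// Hypothetical helper: picks a font file path based on the text content.
// Both paths are placeholders; substitute the fonts you actually ship.
function pickFontFile(string $text): string
{
    return containsHan($text)
        ? 'fonts/cjk-fallback.ttf'        // assumed font with Han glyph coverage
        : 'fonts/AmericanTypewriter.ttf'; // assumed path to the Latin font
}

echo pickFontFile('Hello, world!'), "\n";
echo pickFontFile('这是中文'), "\n";
```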