I have some strings that contain a mix of English and non-English letters. For example:
w='_1991_اف_جي2'
How can I recognize this kind of string using a regex or any other fast method in Python?
I prefer not to compare letters of the string one by one with a list of letters, but to do this in one shot and quickly.
You can just check whether the string can be encoded using only ASCII characters (the Latin alphabet plus some other characters). If it cannot be encoded, it contains characters from some other alphabet.
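A minimal sketch of this check, assuming Python 3, where a str holds Unicode text. The helper name is_ascii_only is hypothetical and chosen only for illustration:

```python
def is_ascii_only(s: str) -> bool:
    """Return True if every character of s is an ASCII character."""
    try:
        # Encoding fails as soon as one character falls outside ASCII.
        s.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


w = "_1991_اف_جي2"
print(is_ascii_only(w))             # False: the string contains Arabic letters
print(is_ascii_only("_1991_abc2"))  # True: ASCII letters, digits and underscores only
```

On Python 3.7 and later, the built-in s.isascii() gives the same answer without the try/except.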
Note the # -*- coding: .... comment. It should be at the top of the Python file. Without it, Python 2 rejects a source file that contains non-ASCII literals with an encoding error (Python 3 assumes UTF-8 by default).
It will return:
F, T, F, F, T
If you work with strings (not unicode objects), you can clean the string with translation and check it with isalnum(), which is better than throwing exceptions.
You can also filter the non-ASCII characters out of a string by keeping only the characters whose code point is below 128.
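Both ideas can be sketched as follows. This is an illustration under stated assumptions, not the original answer's code: it targets Python 3, where str.isalnum() also accepts non-English letters, so the check is done on the UTF-8 bytes instead (bytes.isalnum() only accepts ASCII letters and digits). The names is_english_alnum and strip_non_ascii are hypothetical.

```python
import string

# ASCII punctuation as bytes, to be deleted before the isalnum() check.
_PUNCTUATION = string.punctuation.encode("ascii")


def is_english_alnum(s: str) -> bool:
    """True if s, after removing ASCII punctuation, is non-empty and
    consists only of ASCII letters and digits."""
    raw = s.encode("utf-8")
    # bytes.translate(None, delete) drops every byte listed in `delete`.
    cleaned = raw.translate(None, _PUNCTUATION)
    # bytes.isalnum() is False for empty input and for any non-ASCII byte.
    return cleaned.isalnum()


def strip_non_ascii(s: str) -> str:
    """Return s with every character outside the ASCII range removed."""
    return "".join(ch for ch in s if ord(ch) < 128)


print(is_english_alnum("_1991_اف_جي2"))  # False
print(is_english_alnum("_1991_abc2"))    # True
print(strip_non_ascii("_1991_اف_جي2"))
```

Note that whitespace is not punctuation, so a string containing spaces fails the isalnum() check unless the spaces are removed as well.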