I deal with strings that contain Greek and English (Latin) text. I'd like to use a regex to match all the Greek words that contain 4 or more characters.
From the regex manual I figured out that I can use \p{Greek} to match Greek characters and \w{4,} to match words of 4 or more characters. However, in the various tests I ran, the two don't work together.
Is there any way to do what I want with a single regular expression? The strings are UTF-8 and come from tweets.
Regards
Are you using the UTF-8 pattern modifier?
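Beyond that, the two pieces probably fail together because writing them one after the other (\p{Greek}\w{4,}) asks for one Greek character followed by four or more word characters of any script. Putting the quantifier on the Greek class itself, as in \p{Greek}{4,}, should express "four or more Greek characters in a row" in an engine that supports script properties, with UTF-8 mode enabled. Below is a minimal sketch of the same idea in Python. It rests on assumptions: Python's built-in `re` module has no \p{Greek}, so the "Greek and Coptic" and "Greek Extended" blocks are used as an approximation, and the names `GREEK_WORD` and `greek_words` are made up for illustration.

```python
import re

# Assumption: these two Unicode blocks approximate the \p{Greek} script class,
# which Python's built-in re module does not provide.
# U+0370-U+03FF = Greek and Coptic, U+1F00-U+1FFF = Greek Extended.
GREEK_WORD = re.compile(r"[\u0370-\u03FF\u1F00-\u1FFF]{4,}")


def greek_words(text):
    """Return every run of four or more consecutive Greek characters."""
    return GREEK_WORD.findall(text)


if __name__ == "__main__":
    # Mixed Greek/Latin input, similar in spirit to a tweet.
    print(greek_words("λογος and word και κοσμος"))
```

The block ranges also cover a few Greek punctuation marks and Coptic letters, so an engine with a real script class is the more precise choice when one is available.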