In my case word length is "2" and I am using this regex:
text = text.replace(/\b[a-zA-ZΆ-ώἀ-ῼ]{2}\b/g, '');
but I cannot make it work with Greek characters. For your convenience, here is a demo:
text = 'English: the on in to of \n Greek: πως θα το πω';
text = text.replace(/\b[0-9a-zA-ZΆ-ώἀ-ῼ]{2}\b/g, '');
console.log(text);
For the Greek characters, I am trying to use a range that covers two blocks: "Greek and Coptic" and "Greek Extended" (as seen on unicode-table.com).
JavaScript regular expressions have limited Unicode support: \w and \b only recognise ASCII word characters. To make this work, I'd suggest using the XRegExp library, which has stable Unicode support.
MORE: http://xregexp.com/plugins/#unicode
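If pulling in a library is not an option, engines that implement ES2018 support Unicode property escapes natively. What follows is only a sketch of that alternative, not XRegExp's API. The name stripTwoLetterWords is made up for illustration, and the code assumes an engine with the u flag and lookbehind support.

```javascript
// Assumption: ES2018+ engine (Unicode property escapes and lookbehind).
// \p{L} matches any Unicode letter, \p{N} any Unicode number.
// The lookarounds stand in for \b, which only understands ASCII word characters.
const twoLetterWord = /(?<![\p{L}\p{N}])\p{L}{2}(?![\p{L}\p{N}])/gu;

// Hypothetical helper: removes every stand-alone two-letter word.
function stripTwoLetterWords(text) {
  return text.replace(twoLetterWord, '');
}

const demo = 'English: the on in to of \n Greek: πως θα το πω';
console.log(stripTwoLetterWords(demo));
```

The surrounding whitespace is left in place, so runs of spaces remain where words were removed.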
Why use a regex? I think your problem can be solved without one.
Check the example below; it should give you a hint on how to start.
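A minimal sketch of what a regex-free approach could look like (not necessarily the example this answer had in mind). It assumes words are separated by single spaces, as in the question's demo; the name dropShortWords and its length parameter are hypothetical.

```javascript
// Hypothetical helper: drops every space-separated token that is exactly
// `length` characters long. Array.from counts code points rather than
// UTF-16 units, so Greek letters are counted as one character each.
function dropShortWords(text, length = 2) {
  return text
    .split(' ')
    .filter((word) => Array.from(word).length !== length)
    .join(' ');
}

const demo = 'English: the on in to of \n Greek: πως θα το πω';
console.log(dropShortWords(demo));
// prints "English: the", a line break, then " Greek: πως"
```

Punctuation attached to a word counts towards its length here, so "to," would be kept. Whether that matters depends on the input.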
The problem with Greek characters is caused by \b, which treats only ASCII letters, digits and the underscore as word characters. You can take a look here: Javascript - regex - word boundary (\b) issue, where @Casimir et Hippolyte proposes the following solution:
I also added 0-9 inside the first and the third match because it was removing words like "2TB" or "mp3".
Try this:
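As a rough illustration of replacing \b with explicit character classes (a sketch under my own assumptions, not the exact code from the linked answer): the first group and the trailing lookahead include 0-9, while the middle part matches letters only. The Greek ranges are written as U+0386 to U+03CE and U+1F00 to U+1FFC, which is how I read the ranges in the question; the names WORD, LETTER and removeTwoLetterWords are hypothetical.

```javascript
// Assumed ranges: Latin letters plus U+0386..U+03CE and U+1F00..U+1FFC.
const LETTER = 'a-zA-Z\\u0386-\\u03CE\\u1F00-\\u1FFC';
// Digits count as word characters at the boundaries only.
const WORD = '0-9' + LETTER;

const shortWord = new RegExp(
  '(^|[^' + WORD + '])' +   // 1st part: start of text or a non-word character
  '[' + LETTER + ']{2}' +   // 2nd part: exactly two letters
  '(?![' + WORD + '])',     // 3rd part: not followed by a word character
  'g'
);

// Hypothetical helper: the captured leading character is put back via $1.
function removeTwoLetterWords(text) {
  return text.replace(shortWord, '$1');
}

console.log(removeTwoLetterWords('English: the on in to of \n Greek: πως θα το πω'));
console.log(removeTwoLetterWords('2TB mp3')); // prints "2TB mp3"
```

Because the trailing check is a lookahead, the separator after a removed word is not consumed and can serve as the leading character of the next match, so consecutive short words are all removed.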
Edit-------------------
Try this,
This keeps \t and \n, and removes a 2-letter word when it sits between two tabs or two line feeds.
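One way this could look, as a sketch only (the original snippet is not shown here, so the pattern below is my assumption): treat any whitespace, or the start or end of the text, as the delimiter, and put the leading whitespace back after the replacement. The names LETTERS and removeShortWordsKeepWhitespace are hypothetical, and the Greek ranges are assumed to be U+0386 to U+03CE and U+1F00 to U+1FFC.

```javascript
// Assumed letter ranges: Latin plus U+0386..U+03CE and U+1F00..U+1FFC.
const LETTERS = 'a-zA-Z\\u0386-\\u03CE\\u1F00-\\u1FFC';

// (^|\s) captures the leading whitespace so it can be restored with $1;
// (?=\s|$) checks the trailing whitespace without consuming it.
const betweenWhitespace = new RegExp('(^|\\s)[' + LETTERS + ']{2}(?=\\s|$)', 'g');

// Hypothetical helper: tabs and line feeds around removed words are preserved.
function removeShortWordsKeepWhitespace(text) {
  return text.replace(betweenWhitespace, '$1');
}

console.log(JSON.stringify(removeShortWordsKeepWhitespace('a\tθα\tb\nπω\nc')));
// prints "a\t\tb\n\nc"
```

Unlike a version with explicit word-character classes, this one leaves a two-letter word alone when punctuation is attached to it.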