In PHP I would use this:
$text = "Je prends une thé chaud, s'il vous plaît";
$search = array('é','î','è'); // etc.
$replace = array('e','i','e'); // etc.
$text = str_replace($search, $replace, $text);
But the Java String method "replace" doesn't seem to accept arrays as input. Is there a way to do this (without having to resort to a for loop to go through the array)?
Please say if there's a more elegant way than the method I'm attempting.
There's no standard method as far as I know, but here's a class that does what you want:
http://www.javalobby.org/java/forums/t19704.html
There's no method that works identically to the PHP one in the standard API, though there may be something in Apache Commons. You could do it by replacing the characters individually:
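A minimal sketch of that approach, assuming only the three accented characters from the question need handling. The class and method names are made up for illustration, and the accented characters are written as Unicode escapes so the file compiles regardless of source encoding:

```java
public class ChainedReplace {
    // Hypothetical helper: one String.replace(char, char) call per character.
    static String stripAccents(String text) {
        return text.replace('\u00e9', 'e')   // e-acute
                   .replace('\u00ee', 'i')   // i-circumflex
                   .replace('\u00e8', 'e');  // e-grave
    }

    public static void main(String[] args) {
        // Same sentence as in the question.
        String text = "Je prends une th\u00e9 chaud, s'il vous pla\u00eet";
        System.out.println(stripAccents(text));
        // prints: Je prends une the chaud, s'il vous plait
    }
}
```

Each call scans the whole string once, so this gets unwieldy as the list of characters grows.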
A more sophisticated method, which does not require you to enumerate the characters to substitute (and is thus less likely to miss anything) but does require a loop (which will happen internally anyway, whatever method you use), would be to use java.text.Normalizer to separate letters and diacritics and then strip out everything with a character type of Character.NON_SPACING_MARK.
You're going to have to do a loop:
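Something along these lines would work, as a rough sketch: it mirrors the PHP parallel arrays and calls String.replace once per pair. The class name, method name and the particular array entries are illustrative assumptions:

```java
public class LoopReplace {
    // Parallel arrays, as in the PHP example. Entries are illustrative:
    // e-acute, i-circumflex, e-grave, u-umlaut.
    static final String[] SEARCH  = {"\u00e9", "\u00ee", "\u00e8", "\u00fc"};
    static final String[] REPLACE = {"e",      "i",      "e",      "ue"};

    // Hypothetical helper: applies every search/replace pair in order.
    static String replaceEachPair(String text) {
        for (int i = 0; i < SEARCH.length; i++) {
            text = text.replace(SEARCH[i], REPLACE[i]);
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(replaceEachPair("M\u00fcller boit du th\u00e9"));
        // prints: Mueller boit du the
    }
}
```

Because the arrays hold strings rather than chars, a replacement can be longer than the character it replaces.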
Note: Some characters are replaced with multiple characters. In German, for example, u-umlaut is converted to "ue".
Edit: Made it much more efficient.
You'll need a loop.
An efficient solution would be something like the following:
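As a sketch of what such a solution could look like (not necessarily what the original poster wrote): build a lookup map once, then walk the string a single time and append either the mapped replacement or the original character. The class name, method name and map entries are assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class MapReplace {
    // Hypothetical helper; map construction and replacement are kept
    // inline in one method to keep the sketch short.
    static String stripAccents(String text) {
        Map<Character, String> map = new HashMap<>();
        map.put('\u00e9', "e");   // e-acute
        map.put('\u00ee', "i");   // i-circumflex
        map.put('\u00e8', "e");   // e-grave
        map.put('\u00fc', "ue");  // u-umlaut, multi-character replacement

        // Single pass over the input: one map lookup per character.
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String replacement = map.get(c);
            if (replacement != null) {
                sb.append(replacement);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripAccents("Je prends une th\u00e9 chaud, s'il vous pla\u00eet"));
        // prints: Je prends une the chaud, s'il vous plait
    }
}
```

Unlike repeated replace() calls, the cost here does not grow with the number of entries in the map.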
Of course in a real program you would encapsulate both the construction of the map and the replacement in their respective methods.
I'm not a Java guy, but I'd recommend a generic solution using the Normalizer class to decompose accented characters and then remove the Unicode "COMBINING" characters.
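A minimal sketch of the Normalizer idea, assuming Java 6 or later (where java.text.Normalizer is available). The class and method names are hypothetical. The string is decomposed to NFD, so each accented letter becomes a base letter followed by a combining mark, and then every character whose type is Character.NON_SPACING_MARK is dropped:

```java
import java.text.Normalizer;

public class NormalizerStrip {
    // Hypothetical helper: decompose, then skip the combining marks.
    static String stripDiacritics(String text) {
        // NFD splits e.g. e-acute into 'e' followed by U+0301.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        StringBuilder sb = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            // Keep everything except non-spacing (combining) marks.
            if (Character.getType(c) != Character.NON_SPACING_MARK) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripDiacritics("Je prends une th\u00e9 chaud, s'il vous pla\u00eet"));
        // prints: Je prends une the chaud, s'il vous plait
    }
}
```

Letters that have no canonical decomposition, such as ø or ß, pass through unchanged, so they would still need an explicit mapping.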
A really nice way to do it is using the replaceEach() method from the StringUtils class in Apache Commons Lang 2.4.
Results in