I'm looking for a way to do the following PHP code in Ruby in a succinct and efficient manner:
$normalizeChars = array('Š'=>'S', 'š'=>'s', 'Ð'=>'Dj','Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E', 'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss','à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a',
'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'ƒ'=>'f');
$cleanGenre = strtr($this->entryArray['genre'], $normalizeChars);
Here the strtr() function replaces each character on the left with the one on the right in the array. Pretty handy for a cleanup job. But I can't seem to find anything similar in Ruby, that is, a way to specify which characters to replace all in one array rather than with lengthy conditionals for each character.
Note that tr won't work because you can't replace one letter with two (Ð => Dj). Plus it gives me an InvalidByteSequenceError: "\xC5" on US-ASCII for this line:
entry["genre"].tr('ŠšŽž', 'SsZz')
Thanks.
This works the way I suppose you'd like it to: it translates the characters that are in the table and leaves those that are not as they are.
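A minimal sketch of this approach, assuming the mapping is kept in a hash and applied with gsub and a block. The names NORMALIZE_CHARS and normalize are made up for illustration, and the table is abbreviated:

```ruby
# encoding: UTF-8
# (The magic comment matters on Ruby 1.9, where source files default to US-ASCII.)

# Abbreviated lookup table; NORMALIZE_CHARS is a hypothetical name.
# Extend it with the remaining pairs from the PHP array.
NORMALIZE_CHARS = {
  'Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z', 'ß' => 'Ss'
}

# Visit every character: translate it if the table has an entry,
# otherwise hand it back unchanged.
def normalize(str)
  str.gsub(/./) { |ch| NORMALIZE_CHARS.fetch(ch, ch) }
end
```

Because the block returns the replacement string, one-to-many mappings such as Ð to Dj are not a problem here, unlike with tr.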
For example, an input such as 'aŠšŽž' gives you 'aSsZz'.
Or move the block logic into the lookup table itself by giving the hash a default proc (thanks to steenslag for simplifying the default proc solution!). The call then only has to look each character up.
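One way this could look, as a sketch: the hash is created with a default proc that returns the key itself, so a lookup never comes back empty. REPLACEMENTS and replace_chars are hypothetical names and the table is abbreviated:

```ruby
# encoding: UTF-8

# The default proc returns the key, so characters without an entry
# map to themselves. REPLACEMENTS is a hypothetical, abbreviated table.
REPLACEMENTS = Hash.new { |_hash, key| key }
REPLACEMENTS.update('Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z')

# The block no longer needs a conditional; it is a plain lookup.
def replace_chars(str)
  str.gsub(/./) { |ch| REPLACEMENTS[ch] }
end
```

The default proc does not store anything in the hash, so the table stays the size you defined it.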
Or even better (thanks again to steenslag for pointing this out):
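Presumably this refers to passing the table straight to gsub, which accepts a hash as its second argument since 1.9. A sketch under that assumption, with a hypothetical CHAR_MAP table that falls back to the character itself:

```ruby
# encoding: UTF-8

# Hypothetical, abbreviated table. The default proc hands back the
# character itself when there is no entry for it.
CHAR_MAP = Hash.new { |_hash, key| key }
CHAR_MAP.update('Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z')

# gsub looks up each match in the hash, so no block is needed.
def clean(str)
  str.gsub(/./, CHAR_MAP)
end
```

The default proc is what makes this safe with a catch-all pattern: with a plain hash, every character without an entry would be replaced by an empty string.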
I'll make it easy for you to implement
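For instance, the PHP array from the question carries over to a Ruby hash roughly as follows. This is a sketch: NORMALIZE_TABLE and transliterate are made-up names, and the entry for 'ý', which appears twice in the PHP array, is listed once.

```ruby
# encoding: UTF-8

# The question's PHP array as a Ruby hash (hypothetical constant name).
NORMALIZE_TABLE = {
  'Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z',
  'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'A',
  'Å' => 'A', 'Æ' => 'A', 'Ç' => 'C', 'È' => 'E', 'É' => 'E',
  'Ê' => 'E', 'Ë' => 'E', 'Ì' => 'I', 'Í' => 'I', 'Î' => 'I',
  'Ï' => 'I', 'Ñ' => 'N', 'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O',
  'Õ' => 'O', 'Ö' => 'O', 'Ø' => 'O', 'Ù' => 'U', 'Ú' => 'U',
  'Û' => 'U', 'Ü' => 'U', 'Ý' => 'Y', 'Þ' => 'B', 'ß' => 'Ss',
  'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'a',
  'å' => 'a', 'æ' => 'a', 'ç' => 'c', 'è' => 'e', 'é' => 'e',
  'ê' => 'e', 'ë' => 'e', 'ì' => 'i', 'í' => 'i', 'î' => 'i',
  'ï' => 'i', 'ð' => 'o', 'ñ' => 'n', 'ò' => 'o', 'ó' => 'o',
  'ô' => 'o', 'õ' => 'o', 'ö' => 'o', 'ø' => 'o', 'ù' => 'u',
  'ú' => 'u', 'û' => 'u', 'ý' => 'y', 'þ' => 'b', 'ÿ' => 'y',
  'ƒ' => 'f'
}.freeze

# Walk the string character by character and substitute where the
# table has an entry; everything else is kept as it is.
def transliterate(str)
  str.each_char.map { |ch| NORMALIZE_TABLE.fetch(ch, ch) }.join
end
```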
In Ruby 1.9.3 you can use the :fallback option with encode:
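A sketch of how the :fallback option might be used here, assuming the target encoding is US-ASCII. FALLBACK_TABLE and to_ascii are hypothetical names and the table is abbreviated:

```ruby
# encoding: UTF-8

# Hypothetical, abbreviated table of characters that have no
# US-ASCII equivalent.
FALLBACK_TABLE = { 'Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z' }

# Characters that cannot be represented in US-ASCII are looked up in
# the :fallback hash. A character that is missing from the hash raises
# Encoding::UndefinedConversionError, so the table has to cover
# everything the input may contain.
def to_ascii(str)
  str.encode('US-ASCII', fallback: FALLBACK_TABLE)
end
```

Note that the result is tagged as US-ASCII rather than UTF-8, which may or may not be what the rest of your code expects.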
It's also possible to do it with gsub, as it accepts a conversion table as a hash argument in 1.9.x:
Or better yet (by @steenslag):
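A sketch of the gsub form with a hash argument. Building the pattern from the table's keys with Regexp.union is one possible way to do it; whether that matches the variant credited to @steenslag is an assumption. CONVERSION_TABLE, CONVERSION_PATTERN and convert are hypothetical names:

```ruby
# encoding: UTF-8

# Hypothetical, abbreviated conversion table.
CONVERSION_TABLE = { 'Š' => 'S', 'š' => 's', 'Ð' => 'Dj', 'Ž' => 'Z', 'ž' => 'z' }

# Match only characters that have an entry. gsub replaces a match whose
# key is missing from the hash with an empty string, so the pattern must
# not match anything the table does not cover.
CONVERSION_PATTERN = Regexp.union(CONVERSION_TABLE.keys)

def convert(str)
  str.gsub(CONVERSION_PATTERN, CONVERSION_TABLE)
end
```

Deriving the pattern from the keys keeps the regex and the table in sync when you add entries later.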
This sort of character conversion is called transliteration, which is good to know if you wish to google for more solutions (there are many Ruby libraries that support transliteration, but none of the ones I tested supported your character set completely).