Let's say I have the word "Russian" written in Cyrillic. This would be the equivalent of the following in hex:
Русский
My question is: how do I write a function which will go from "Russian" in Cyrillic to its hex value as above? Could this same function also work for single-byte characters?
The &#12345; thingies are called HTML entities. In PHP there is a function that can create these: mb_encode_numericentity. It's part of the Multibyte String (mbstring) extension:
$cyrillic = 'русский';
$encoding = 'UTF-8';
$convmap = array(0, 0xffff, 0, 0xffff); // start, end, offset, mask: encode every code point from U+0000 to U+FFFF
$encoded = mb_encode_numericentity($cyrillic, $convmap, $encoding);
echo $encoded; # русский
However, you need to know the encoding of your Cyrillic string. In this case I've chosen UTF-8; depending on the encoding, you need to modify the $encoding parameter of the function and, if necessary, the $convmap array.
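On the single-byte part of the question: because the conversion map above starts at 0, ASCII characters are turned into entities as well. The following is a minimal sketch, assuming PHP 8 with mbstring loaded. The helper name `toEntities` is hypothetical (not part of PHP), and the map starting at 0x80 is just one possible choice that leaves ASCII untouched. It also shows the optional fourth `$hex` argument of mb_encode_numericentity and the same word supplied as Windows-1251 bytes, where only the `$encoding` argument changes.

```php
<?php
// Hypothetical helper (an illustration, not a built-in): encode only
// non-ASCII characters, optionally as hexadecimal entities.
function toEntities(string $text, string $encoding = 'UTF-8', bool $hex = false): string
{
    // start, end, offset, mask: code points below U+0080 are left as they are
    $convmap = [0x80, 0x10ffff, 0, 0x1fffff];
    return mb_encode_numericentity($text, $convmap, $encoding, $hex);
}

$mixed   = 'abc русский';
$decimal = toEntities($mixed);                 // ASCII part stays readable
$hexForm = toEntities($mixed, 'UTF-8', true);  // &#x...; style entities

// The same word stored as Windows-1251 bytes: only $encoding changes.
$cp1251     = mb_convert_encoding('русский', 'Windows-1251', 'UTF-8');
$fromCp1251 = toEntities($cp1251, 'Windows-1251');

echo $decimal, "\n", $hexForm, "\n", $fromCp1251, "\n";
```

With the map starting at 0, as in the answer's example, the "abc" part would be encoded too, so the same function does handle single-byte characters; whether you want that is a matter of choosing the range.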
Your provided example isn't hex, but if you want to convert to hex, try this:
function strToHex($string)
{
    $hex = '';
    for ($i = 0; $i < strlen($string); $i++)
    {
        // Zero-pad to two digits so bytes below 0x10 (e.g. "\n") still round-trip
        $hex .= sprintf('%02x', ord($string[$i]));
    }
    return $hex;
}
function hexToStr($hex)
{
    $string = '';
    // Read two hex digits at a time and turn each pair back into one byte
    for ($i = 0; $i < strlen($hex) - 1; $i += 2)
    {
        $string .= chr(hexdec($hex[$i] . $hex[$i + 1]));
    }
    return $string;
}
echo strToHex('русский'); // d180d183d181d181d0bad0b8d0b9
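Note that this output is the hex of the string's UTF-8 bytes, two bytes per Cyrillic letter. PHP's built-in bin2hex() and hex2bin() do the same byte-level conversion. If what you want instead is one hex value per character (its Unicode code point), something like the following sketch could work, assuming PHP 7.4 or later with mbstring for mb_str_split() and mb_ord(). The function name `codePointsToHex` is hypothetical.

```php
<?php
$word = 'русский';

// Byte-level hex using the built-ins.
$bytesHex = bin2hex($word);     // hex of the UTF-8 bytes
$back     = hex2bin($bytesHex); // the original string again

// Hypothetical helper: one hex code point per character.
function codePointsToHex(string $text, string $encoding = 'UTF-8'): array
{
    $result = [];
    foreach (mb_str_split($text, 1, $encoding) as $char) {
        // mb_ord() returns the code point as an integer
        $result[] = sprintf('U+%04X', mb_ord($char, $encoding));
    }
    return $result;
}

echo $bytesHex, "\n";                              // d180d183d181d181d0bad0b8d0b9
echo implode(' ', codePointsToHex($word)), "\n";   // U+0440 U+0443 U+0441 U+0441 U+043A U+0438 U+0439
```

The code points in hex (0x440 and so on) are the same numbers as the decimal entities (1088 and so on) in the first answer, just written in a different base.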