So I am playing with this tool:
http://www.unit-conversion.info/texttools/ascii/
When I try this character:
'
I see the value 039 which can be verified from: http://www.asciitable.com
But I am curious about:
’
This character in the same tool will return: 226 128 153
But as far as I know ASCII is 8 bits (or even 7...)
What is 226 128 153 here?
It seems that this is a multi-byte Unicode encoding of the character rather than ASCII. Note that the website is probably not just calling "’".charCodeAt(0); in JavaScript, because that returns the single UTF-16 code unit 8217 rather than these three values.
The character you have is U+2019 RIGHT SINGLE QUOTATION MARK, which is also the typographically correct way of representing the apostrophe in most positions.
What the site does is represent the characters in UTF-8. As you can see in the page I linked, this character is encoded as three bytes, 0xE2 0x80 0x99 in hexadecimal, or 226 128 153 in decimal.
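As a rough sketch of where those three bytes come from, the snippet below encodes the character with the standard `TextEncoder` API and also derives the bytes by hand from the code point using the three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx). The variable names are only illustrative, and this is not necessarily how the site itself does it.

```javascript
// U+2019 RIGHT SINGLE QUOTATION MARK, written as an escape to avoid encoding issues.
const quote = "\u2019";

// The Unicode code point of the character.
const codePoint = quote.codePointAt(0); // 8217 (0x2019)

// UTF-8 bytes as produced by the built-in encoder (browsers and Node.js).
const utf8Bytes = Array.from(new TextEncoder().encode(quote));

// The same bytes derived by hand: code points in the range U+0800..U+FFFF
// are split into 4 + 6 + 6 bits and packed into three bytes.
const manualBytes = [
  0xE0 | (codePoint >> 12),          // 1110xxxx
  0x80 | ((codePoint >> 6) & 0x3F),  // 10xxxxxx
  0x80 | (codePoint & 0x3F),         // 10xxxxxx
];

console.log(codePoint, utf8Bytes, manualBytes);
```

Both arrays should hold 226, 128, 153, which matches what the tool displays.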
Why does that page use UTF-8 instead of ASCII? Simple. First, ASCII is a subset of UTF-8. Second, UTF-8 supports all of Unicode. So there's rarely a reason to use ASCII if UTF-8 can be used instead.
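To illustrate the "ASCII is a subset of UTF-8" point, here is a small check (a sketch using the standard `TextEncoder`; the variable names are made up for the example) that every code from 0 to 127 encodes to a single UTF-8 byte with the same value:

```javascript
const encoder = new TextEncoder();

// The plain ASCII apostrophe encodes to one byte, the same value as its ASCII code.
const apostropheBytes = Array.from(encoder.encode("'"));

// Check every ASCII code: the UTF-8 encoding must be exactly one byte equal to the code.
let asciiIsSingleByte = true;
for (let code = 0; code < 128; code++) {
  const bytes = encoder.encode(String.fromCharCode(code));
  if (bytes.length !== 1 || bytes[0] !== code) {
    asciiIsSingleByte = false;
  }
}

console.log(apostropheBytes, asciiIsSingleByte);
```

This is why the tool can show 39 for the plain apostrophe and three bytes for the curly one without switching encodings.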
The first character is ASCII, code 39. The second is a Unicode character, code point 8217 (U+2019).
See the Unicode character table, specifically the entry for this character.
For more information, read the Unicode article.
$(document).ready(function(){
  // Display the UTF-16 code unit of the character (8217 for U+2019)
  $('#res').html("’".charCodeAt(0));
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id='res'></div>
I had this same issue: while trying to convert a string to uppercase, I ran into this character, and it 'broke' a bunch of methods for converting a string with special characters to uppercase.
I used this solution:
$text = preg_replace("/[`‛′’‘]/u", "'", $text);
(NOT MINE - taken from here: https://stackoverflow.com/a/24925209/6136613)
This then converts it to a regular apostrophe, and you can perform normal PHP functions on it.
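For anyone hitting the same thing on the JavaScript side, a rough equivalent of that replacement might look like the sketch below. The function name `normalizeApostrophes` is hypothetical, and the character class simply mirrors the five characters from the PHP pattern, written as escapes.

```javascript
// Replace apostrophe look-alikes with the plain ASCII apostrophe (U+0027).
// Characters covered, mirroring the PHP pattern above:
// U+0060 grave accent, U+201B, U+2032 prime, U+2019, U+2018.
function normalizeApostrophes(text) {
  return text.replace(/[\u0060\u201B\u2032\u2019\u2018]/g, "'");
}

const normalized = normalizeApostrophes("it\u2019s a \u2018test\u2019");
console.log(normalized.toUpperCase());
```

After the replacement the string contains only ASCII apostrophes, so byte-oriented string functions no longer trip over the multi-byte character.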