I am trying to use the Microsoft Bing API.
$data = file_get_contents("http://api.microsofttranslator.com/V2/Ajax.svc/Speak?appId=APPID&text={$text}&language=ja&format=audio/wav");
$data = stripslashes(trim($data));
The returned data has a ' ' character as the first character of the string. It is not a space, because I trimmed the data before returning it.
The ' ' character turned out to be %EF%BB%BF.
I wonder why this happened. Is it maybe a bug on Microsoft's side?
How can I remove this %EF%BB%BF in PHP?
You should not simply discard the BOM unless you're 100% sure that the stream will: (a) always be UTF-8, and (b) always have a UTF-8 BOM.
The reasons:
I think a more appropriate way to handle this would be something like:
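A minimal sketch of that kind of handling, assuming the raw response bytes are passed in and the mbstring extension is available. The helper name `decodeWithBom` is hypothetical and not part of any library. Rather than discarding the first bytes blindly, it checks which BOM (if any) is present, strips only that BOM, and converts UTF-16 input to UTF-8:

```php
<?php
// Hypothetical helper: inspect the leading bytes instead of blindly discarding them.
function decodeWithBom(string $data): string
{
    // Known BOMs mapped to the encoding they announce.
    // Limitation of this sketch: a UTF-32LE BOM (FF FE 00 00) is not handled
    // and would be mistaken for UTF-16LE.
    $boms = [
        "\xEF\xBB\xBF" => 'UTF-8',
        "\xFF\xFE"     => 'UTF-16LE',
        "\xFE\xFF"     => 'UTF-16BE',
    ];

    foreach ($boms as $bom => $encoding) {
        if (strncmp($data, $bom, strlen($bom)) === 0) {
            // Drop only the BOM bytes that were actually found.
            $body = substr($data, strlen($bom));
            return $encoding === 'UTF-8'
                ? $body
                : mb_convert_encoding($body, 'UTF-8', $encoding);
        }
    }

    // No BOM present: return the data unchanged rather than guessing.
    return $data;
}
```

With this, `$data = decodeWithBom($data);` leaves a BOM-less response untouched, so it keeps working if the service stops sending the BOM.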
You could use substr to only get the rest without the UTF-8 BOM:
I had the same problem today, and fixed it by ensuring the string was set to UTF-8:
http://php.net/manual/en/function.utf8-encode.php
$content = utf8_encode ( $content );
// Match the raw BOM bytes; the literal text '%EF%BB%BF' only exists after urlencode().
$data = str_replace("\xEF\xBB\xBF", '', $data);
You probably shouldn't be using stripslashes: unless the API returns backslashed data (and there is a 99.99% chance it doesn't), take that call out.
It's a byte order mark (BOM), indicating the response is encoded as UTF-8. You can safely remove it, but you should parse the remainder as UTF-8.
To remove it from the beginning of the string (only):
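One way to do that, sketched under the assumption that the string holds the raw response bytes (not a URL-encoded copy). The helper name `removeLeadingUtf8Bom` is hypothetical. An anchored regular expression strips the three BOM bytes only when they are the very first bytes of the string:

```php
<?php
// Hypothetical helper name; $data is assumed to hold raw bytes.
function removeLeadingUtf8Bom(string $data): string
{
    // The ^ anchor limits the match to the start of the string, so the same
    // byte sequence appearing later in the data is left alone.
    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace('/^\xEF\xBB\xBF/', '', $data) ?? $data;
}
```

Usage would be `$data = removeLeadingUtf8Bom($data);` right after the `file_get_contents` call.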