I need to fetch some URLs that contain characters from the Swedish alphabet.
Take https://en.wikipedia.org/wiki/Åland_Islands as an example. Passing that string straight into file_get_contents works just fine. But if you run the URL through urlencode first, the call fails with the message:
failed to open stream: No such file or directory
despite the documentation for file_get_contents saying:
Note: If you're opening a URI with special characters, such as spaces, you need to encode the URI with urlencode().
So for example, if you run the following code:
error_reporting(E_ALL);
ini_set("display_errors", true);
$url = urlencode("https://en.wikipedia.org/wiki/Åland_Islands");
$response = file_get_contents($url);
if ($response === false) {
    die('file get contents has failed');
}
echo $response;
You will get the error. If you remove the urlencode call from the code, it runs just fine.
The problem I am facing is that one parameter in my URL is taken from a submitted form. Since PHP always runs submitted values through urlencode, the Swedish characters in my constructed URL cause the error.
How do I get around this?
use this
The problem is likely due to urlencode escaping your protocol:
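As a minimal sketch of that effect (assuming the source file is saved as UTF-8), encoding the whole URL leaves no recognisable https:// prefix, so file_get_contents treats the result as a local file name:

```php
<?php
// Assumes this file is saved as UTF-8, so "Å" is the two bytes 0xC3 0x85.
$url = "https://en.wikipedia.org/wiki/Åland_Islands";

// urlencode() escapes every reserved character, including ":" and "/".
$encoded = urlencode($url);

echo $encoded, PHP_EOL;
// https%3A%2F%2Fen.wikipedia.org%2Fwiki%2F%C3%85land_Islands

// The "://" that marks the scheme is gone, which is why PHP looks for a local file.
$hasScheme = strpos($encoded, '://') !== false;
var_dump($hasScheme); // bool(false)
```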
I have faced this problem too, and could only fix it by limiting the escaping to the parts that actually need it:
As you can imagine, this can be tricky depending on where the characters are located. I usually opt for an encode-and-patch solution, but some people I have worked with prefer to encode only the dynamic segment of their URLs.
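One way to target the escaping, sketched here under the assumption that only the last path segment is dynamic, is to encode that segment alone and leave the rest of the URL untouched. The variable names $base and $title are illustrative, not from the question:

```php
<?php
// Hypothetical values for illustration; assumes UTF-8 input.
$base  = "https://en.wikipedia.org/wiki/";
$title = "Åland_Islands"; // e.g. a value taken from a submitted form

// rawurlencode() follows RFC 3986 and suits path segments (a space becomes %20).
$url = $base . rawurlencode($title);

echo $url, PHP_EOL;
// https://en.wikipedia.org/wiki/%C3%85land_Islands
```

The resulting string still starts with https://, so it can be passed to file_get_contents as a remote URL.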
Here is my approach:
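A minimal sketch of what an encode-and-patch helper could look like follows. It is an assumption about the general idea, not necessarily the code this answer originally used, and the function name encode_url_patch is hypothetical. It encodes the whole string and then restores the characters that give a URL its structure:

```php
<?php
// Hypothetical helper: encode everything, then put the structural
// characters of the URL back. Assumes UTF-8 input.
function encode_url_patch(string $url): string
{
    // Escapes to undo after encoding (rawurlencode emits uppercase hex).
    $restore = [
        '%3A' => ':', '%2F' => '/', '%3F' => '?',
        '%3D' => '=', '%26' => '&', '%23' => '#',
    ];
    return strtr(rawurlencode($url), $restore);
}

$patched = encode_url_patch("https://en.wikipedia.org/wiki/Åland_Islands");
echo $patched, PHP_EOL;
// https://en.wikipedia.org/wiki/%C3%85land_Islands
```

The trade-off is that a literal "/", "&" or "#" inside a submitted value is restored as well, and an input that is already percent-encoded gets encoded twice, so this only suits URLs whose dynamic parts cannot contain those characters.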
Hope it helps.