I am trying to get the HTML code of a webpage that is not in the same domain as my site. The HTML is then parsed to produce a summary of the recipe found on that page (recipe name, main ingredients, number of steps).
The user can then click the link & go to that webpage outside the domain to view the recipe.
I'm aware of the Same-Origin Policy, but does that apply to getting HTML code from a webpage outside my own domain? I imagine it's exactly the same as getting XML, so this is legal & allowed, isn't it?
Is there a way I can get the HTML text/code from a domain outside my own domain?
Using JavaScript & jQuery, the idea is to limit the number of server requests & the amount of storage by having the user's browser perform the request for each recipe & parse the HTML on the client side. This avoids server-side bottlenecks & also means I don't have to go through the server & delete old, outdated recipe summaries.
I'm open to solutions/suggestions in any programming language, API, etc.
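To make the client-side parsing step concrete, here is a rough sketch of the kind of summarising I have in mind. The class names (.recipe-name, .ingredient, .step) and the function names are purely hypothetical; every real recipe site will use its own markup, so the selectors would need adjusting per site.

```javascript
// Sketch only: assumes hypothetical markup where the recipe title has class
// "recipe-name", each ingredient has class "ingredient" and each step has
// class "step". Real pages will differ.
function summariseRecipe(doc) {
  var nameEl = doc.querySelector(".recipe-name");
  // querySelectorAll returns an array-like list, so borrow Array's map.
  var ingredients = Array.prototype.map.call(
    doc.querySelectorAll(".ingredient"),
    function (el) { return el.textContent.trim(); }
  );
  return {
    name: nameEl ? nameEl.textContent.trim() : null,
    ingredients: ingredients,
    stepCount: doc.querySelectorAll(".step").length
  };
}

// Browser-only helper: turns an HTML string into a detached document
// that can be queried with the function above.
function parseHtml(html) {
  return new DOMParser().parseFromString(html, "text/html");
}
```

In the browser this would be used as summariseRecipe(parseHtml(htmlText)), once the HTML text has been obtained somehow.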
What you are trying to do can't be done with an AJAX library alone. The browser's same-origin policy won't let your script read a response from another domain.
But you can do this with a combination of PHP (or any other server-side language) and AJAX. Create a PHP script like this:
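<?php
// Read the target URL from the POST body; default to an empty string if absent.
$url = isset($_POST['url']) ? $_POST['url'] : '';
// Only fetch well-formed http(s) URLs so the script cannot be used to read local files.
if (filter_var($url, FILTER_VALIDATE_URL) && preg_match('#^https?://#i', $url)) {
    echo file_get_contents($url);
}
?>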
Let us say the script's name is fetch.php.
Now you can make an AJAX call from your jQuery code to this fetch.php and it will fetch the HTML code for you.
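The jQuery side of that call might look roughly like the sketch below. The helper name buildFetchRequest and the example.com URL are made up for illustration; the only things taken from the script above are the file name fetch.php and the POST field url.

```javascript
// Hypothetical helper: builds the settings object passed to $.ajax.
// "fetch.php" and the "url" POST field match the PHP script above.
function buildFetchRequest(recipeUrl) {
  return {
    url: "fetch.php",          // same-origin script, so the browser allows it
    method: "POST",
    data: { url: recipeUrl },  // the remote page the server should fetch
    dataType: "html"
  };
}

// Only runs where jQuery is loaded (i.e. in the browser).
if (typeof $ !== "undefined") {
  $.ajax(buildFetchRequest("http://example.com/some-recipe"))
    .done(function (html) {
      console.log("Fetched " + html.length + " characters");
    })
    .fail(function (jqXHR, textStatus) {
      console.log("Request failed: " + textStatus);
    });
}
```

Because the request goes to your own server, the same-origin restriction no longer applies; the trade-off is that every fetch now passes through your server.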
No, this will not work from client-side JavaScript. The browser prevents it for security reasons. You would need to make AJAX calls to a server-side script on your own domain (PHP, for example), which would then fetch the content (via cURL, for example) and return the HTML you want.
To add something to the answers you already got, I can tell you that HTML is not meant to be used as a way to transmit data "like a service". For that purpose there is XML or JSON exposed through SOAP or REST.
In your scenario, the best approach that I can think of, keeping in mind both technical and legal aspects, is to use an iframe to display the external content and to cite the source of the iframe content, including an external link like you're already doing.
You can still try the server-side approach to fetch the remote HTML, but again, it is not a clean way to do it, surely not good practice, and possibly not legal.
If the author of the content wants it to be reusable outside of their site, they can express this intent by making the unformatted content available through a service or an RSS / Atom feed.
The same-origin policy applies. Try this code and you'll get a security error:
$.get("other web page site", {}, function (content) {
    $("#receipe").html(content);
}, "html");
By the way, you'll quite likely be violating copyright law, so be wary ;-)
I'm not too sure if it counts as a pure JavaScript solution, but http://developer.yahoo.com/yql/ could help you with what you are looking for.