I'm running a crawler on my server and I need to execute JavaScript to access some of the data on my target site (the site I want to crawl). I asked about a different approach to the same problem here, though it isn't needed to answer this question: [Dead]How to successfully POST to an old ASP.NET site utilizing Asynchronous Postback
My JavaScript is executed in the browser from which I call my PHP crawler. The problem is that all JavaScript requests go back to my own server rather than to the target site (I get led to links like /index.php on my own site instead of on the target site).
My experience with JavaScript is pretty minimal, and I'm not sure how to redirect my requests to my target. Here is an example of a JavaScript function from the page that I'm calling:
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['aspnetForm'];
if (!theForm) {
    theForm = document.aspnetForm;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
//]]>
</script>
... and the way that I call it:
echo "<SCRIPT language='javascript'>__doPostBack('-254870369', '')</SCRIPT>";
Is there some way of aliasing the server address from my own server to the target server or doing some other kind of handy workaround that would fix this problem?
There is no need to inject JavaScript into the target page. You can use Wireshark to study all the requests made by the target. Wireshark is powerful but quite hard to master, so you can instead try the Net tab of the Firebug add-on. Once you know how the target sends requests to its server and receives data back, you can use curl to imitate those requests and receive the data yourself. You don't need anything more than that to build a crawler.
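As a rough illustration of imitating the postback without running the page's script, here is a minimal sketch in JavaScript (Node.js 18+ assumed, for the built-in `fetch`). It resolves the form's `action` against the target page's URL, which is what keeps the request from landing on your own server, and rebuilds the body that `__doPostBack` would submit. The helper names, the `target.example` host, and the regex-based HTML parsing are my own assumptions for illustration. A real crawler would probably use a proper HTML parser and would also need to carry the session cookies. The same request can be sent from PHP with curl once you know the URL and fields.

```javascript
// Hypothetical helpers, not part of any library. Regex parsing is a
// simplification that assumes quoted attribute values.

// Read one quoted attribute from a single HTML tag, or null if absent.
function attr(tag, name) {
  const re = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i");
  const m = re.exec(tag);
  return m ? (m[1] ?? m[2]) : null;
}

// Collect all hidden <input> fields (for example __VIEWSTATE) from the page.
function extractHiddenFields(html) {
  const fields = {};
  for (const [tag] of html.matchAll(/<input\b[^>]*>/gi)) {
    const type = attr(tag, "type");
    const name = attr(tag, "name");
    if (type && type.toLowerCase() === "hidden" && name) {
      fields[name] = attr(tag, "value") ?? "";
    }
  }
  return fields;
}

// Build the URL and form body that __doPostBack(eventTarget, eventArgument)
// would submit. pageUrl is the URL of the page on the TARGET site, so a
// relative action such as "index.php" resolves against the target host.
function buildPostback(html, pageUrl, eventTarget, eventArgument) {
  const formTag = /<form\b[^>]*>/i.exec(html); // assumes the first form is the one wanted
  const action = (formTag && attr(formTag[0], "action")) || "";
  const url = new URL(action, pageUrl).toString();
  const body = new URLSearchParams(extractHiddenFields(html));
  body.set("__EVENTTARGET", eventTarget);
  body.set("__EVENTARGUMENT", eventArgument);
  return { url, body: body.toString() };
}

// Send the imitated postback and return the response HTML.
// Cookie handling is omitted here.
async function sendPostback(html, pageUrl, eventTarget, eventArgument = "") {
  const { url, body } = buildPostback(html, pageUrl, eventTarget, eventArgument);
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  });
  return res.text();
}
```

For the call in the question, that would be something like `sendPostback(pageHtml, "http://target.example/some/page.aspx", "-254870369")`, where `pageHtml` is the HTML your crawler already downloaded from that (placeholder) URL.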
If this does not answer your question, please explain the scenario in a little more detail.