I'm trying to use Google Apps Script to log in to an ASP.NET website and scrape some data that I typically have to retrieve manually. I've used Chrome Developer Tools to get the correct payload names (TEXT_Username, TEXT_Password, __VIEWSTATE, __VIEWSTATEGENERATOR), and I also got an ASP.NET session ID to send along with my POST request.
When I run my functions, I get Response Code = 200 if followRedirects is set to false and Response Code = 302 if followRedirects is set to true. In neither case do the functions successfully authenticate with the website; the HTML returned is that of the login page.
I've tried different header variants and parameters, but I can't seem to successfully login.
A couple of other points: when I log in from Chrome with the Developer Tools open, the response code appears to be 302 Found.
Does anyone have any suggestions on how I can successfully log in to this site? Do you see any errors in my functions that could be the cause of my problems? I'm open to any and all suggestions.
My GAS functions follow:
function login(cookie, viewState, viewStateGenerator) {
  var payload = {
    "__VIEWSTATE" : viewState,
    "__VIEWSTATEGENERATOR" : viewStateGenerator,
    "TEXT_Username" : "myUserName",
    "TEXT_Password" : "myPassword",
  };
  var header = {'Cookie': cookie};
  Logger.log(header);
  var options = {
    "method" : "post",
    "payload" : payload,
    "followRedirects" : false,
    "headers" : header
  };
  var browser = UrlFetchApp.fetch("http://tnetwork.trakus.com/tnet/Login.aspx?", options);
  Utilities.sleep(1000);
  var html = browser.getContentText();
  var response = browser.getResponseCode();
  var cookie2 = browser.getAllHeaders()['Set-Cookie'];
  Logger.log(response);
  Logger.log(html);
}
function loginPage() {
  var options = {
    "method" : "get",
    "followRedirects" : false,
  };
  var browser = UrlFetchApp.fetch("http://tnetwork.trakus.com/tnet/Login.aspx?", options);
  var html = browser.getContentText();
  // Utilities.sleep(500);
  var response = browser.getResponseCode();
  var cookie = browser.getAllHeaders()['Set-Cookie'];
  login(cookie);
  var regExpGen = new RegExp("<input type=\"hidden\" name=\"__VIEWSTATEGENERATOR\" id=\"__VIEWSTATEGENERATOR\" value=\"(.*)\" \/>");
  var viewStateGenerator = regExpGen.exec(html)[1];
  var regExpView = new RegExp("<input type=\"hidden\" name=\"__VIEWSTATE\" id=\"__VIEWSTATE\" value=\"(.*)\" \/>");
  var viewState = regExpView.exec(html)[1];
  var response = login(cookie, viewState, viewStateGenerator);
  return response;
}
I call the script by running the loginPage() function. This function obtains the cookie (session id) and then calls the login function and passes along the session id (cookie).
Here is what I see in the Network tab of Chrome Developer Tools when I log in using the Chrome browser:
Remote Address: 66.92.89.141:80
Request URL: http://tnetwork.trakus.com/tnet/Login.aspx
Request Method: POST
Status Code:302 Found
**Request Headers**
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate
Accept-Language: en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length: 252
Content-Type:application/x-www-form-urlencoded
Cookie: ASP.NET_SessionId=jayaejut5hopr43xkp0vhzu4; userCredentials=username=myUsername; .ASPXAUTH=A54B65A54A850901437E07D8C6856B7799CAF84C1880EEC530074509ADCF40456FE04EC9A4E47D1D359C1645006B29C8A0A7D2198AA1E225C636E7DC24C9DA46072DE003EFC24B9FF2941755F2F290DC1037BB2B289241A0E30AF5CB736E6E1A7AF52630D8B31318A36A4017893452B29216DCF2; __utma=260442568.1595796669.1421539534.1425211879.1425214489.16; __utmc=260442568; __utmz=260442568.1421539534.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=190106350.1735963725.1421539540.1425152706.1425212185.18; __utmc=190106350; __utmz=190106350.1421539540.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
Host:tnetwork.trakus.com
Origin:http://tnetwork.trakus.com
Referer:http://tnetwork.trakus.com/tnet/Login.aspx?
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36
**Form Data**
__VIEWSTATE: O7YCnq5e471jHLqfPre/YW+dxYxyhoQ/VetOBeA1hqMubTAAUfn+j9HDyVeEgfAdHMl+2DG/9Gw2vAGWYvU97gml5OXiR9E/9ReDaw9EaQg836nBvMMIjE4lVfU=
__VIEWSTATEGENERATOR:F4425990
TEXT_Username:myUsername
TEXT_Password:myPassword
BUTTON_Submit: Log In
Update: It appears that the website is using an HttpOnly cookie. As a result, I don't think I am capturing the whole cookie, so my Cookie header is not correct. I believe I need to set followRedirects to false and handle the redirect and cookie manually. I'm currently researching this process, but welcome input from anyone who has been down this road.
I notice that the provided Chrome payload includes
BUTTON_Submit: Log In
but your POST payload does not. I have found that for POSTs in GAS things go much more smoothly if I explicitly set a submit variable in my payload objects. In any case, if you're trying to emulate what Chrome is doing, this is a good first step. So in your case, it's a one-line change:
I was finally able to log in to the page successfully. The issue seems to be that UrlFetchApp was unable to follow the redirect. I credit this Stack Overflow post: how to fetch a wordpress admin page using google apps script
This post described the following process that led to my successful login:
Here is the relevant code:
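A minimal sketch of manual redirect handling along those lines, under stated assumptions rather than the code from the original post: the login POST is sent with followRedirects set to false, the cookies from the 302 response are reduced to name=value pairs, and the Location target is then requested with those cookies. The function names toCookieHeader, resolveLocation, and loginAndFollow are hypothetical, and sessionCookie is assumed to already be in name=value form (for example, the result of toCookieHeader applied to the Set-Cookie header of the initial GET).

```javascript
// Hypothetical helper: reduce Set-Cookie value(s) to "name=value" pairs
// for a Cookie request header, dropping attributes such as path and HttpOnly.
function toCookieHeader(setCookie) {
  if (!setCookie) return "";
  var list = Array.isArray(setCookie) ? setCookie : [setCookie];
  return list.map(function (c) { return c.split(";")[0].trim(); }).join("; ");
}

// Hypothetical helper: resolve a possibly relative Location header
// against the URL that produced the redirect.
function resolveLocation(baseUrl, location) {
  if (/^https?:\/\//i.test(location)) return location; // already absolute
  var origin = baseUrl.match(/^https?:\/\/[^\/]+/i)[0];
  if (location.charAt(0) === "/") return origin + location; // root-relative
  // Otherwise treat it as relative to the directory of baseUrl.
  return baseUrl.replace(/[?#].*$/, "").replace(/[^\/]*$/, "") + location;
}

// Sketch: POST the login form without following the redirect,
// then request the redirect target manually with the combined cookies.
function loginAndFollow(loginUrl, payload, sessionCookie) {
  var loginResponse = UrlFetchApp.fetch(loginUrl, {
    "method": "post",
    "payload": payload,
    "headers": { "Cookie": sessionCookie },
    "followRedirects": false
  });
  var headers = loginResponse.getAllHeaders();
  var location = headers["Location"];
  if (loginResponse.getResponseCode() !== 302 || !location) {
    throw new Error("Login did not redirect; check the payload and cookies.");
  }
  // Send the session cookie plus whatever the login response set.
  var cookies = [sessionCookie, toCookieHeader(headers["Set-Cookie"])]
    .filter(function (c) { return c; })
    .join("; ");
  return UrlFetchApp.fetch(resolveLocation(loginUrl, location), {
    "method": "get",
    "headers": { "Cookie": cookies },
    "followRedirects": false
  });
}
```

Only the two helpers are plain JavaScript; loginAndFollow depends on UrlFetchApp and so runs only inside Apps Script. If the redirect target itself redirects again, the same step would need to be repeated.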