I have developed a script whose sole purpose is to check whether a website/service is up and running. It connects to the page using its URL and logs in to the site with the user's credentials. If the login succeeds, the service is considered to be running fine.
The script is written in Java and uses HtmlUnit. Here lies my problem: how do I ensure that the HTML page returned after logging in (clicking the login/sign-in button after filling out the form) is the post-login "Account home page"? In other words, how do I determine whether the login operation was successful?
Here is how I am doing it right now. Account pages usually contain some user-related info. For instance, if I log in to Yahoo Mail, the page has "Welcome, Username" in the top right corner, or it always has "Compose" or "Inbox" on it. I search for such strings to test for success.
This has been my observation while testing this script. I have come across cases where this rule falls apart.
Sometimes the page returned after login is an error page asking you to check your entered credentials.
Sometimes the page returned asks you to turn on JavaScript or enable cookies in your browser.
In one case the page returned by the server was the same pre-login page (with no explanation given as to why).
Some web pages are dynamic, so their content changes from time to time. In such cases a keyword search may produce false negatives, which is why this approach of searching for a string hinges purely on the choice of search string/keywords.
The point I am trying to make is that coding for all these cases up front is not realistic.
I tried comparing the URLs of the pre-login and post-login pages, but found that in many cases both are the same. Hence even this method is not conclusive.
I need a sure-shot way of determining whether the login was successful. I am not a professional web developer. Does the server return any status code with the new page that can be checked? Does HtmlUnit have some ability to test for success and failure?
I appreciate your help/comments. Thank you!
Well... this is kind of a tricky question, because you have no control over the server. If you ask for A you will probably receive A, but you should be prepared to receive B, C and D... and you will probably miss E.
Based on your comments, looking for the "Welcome <Username>" string should be quite a sure shot. In other (more programmatic) words, if you have that string in the result page, then you are logged in. There is your sure shot.
Now, you've mentioned that there are cases in which you try to log in and you don't receive that string. In those cases, based on your examples, you are almost always not logged in.
However, as you said, they can change that string from "Welcome, <Username>" to "There you are again!" and you will get false negatives. It is unlikely, however, that you will ever get false positives by applying that logic.
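To make that reasoning concrete, here is a minimal sketch in plain Java. It assumes you already have the text of the page returned after the login click (for example from HtmlUnit's page-text method, `asText()` in older releases). The `LoginCheck` class, the `Result` enum and both marker lists are hypothetical names made up for illustration, not part of HtmlUnit. The idea is to report three outcomes instead of two, so that a changed welcome string shows up as "unknown" rather than as a failed login.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of HtmlUnit): classifies the text of the page
// that came back after submitting the login form.
class LoginCheck {

    enum Result { LOGGED_IN, LOGIN_FAILED, UNKNOWN }

    // Returns true if the text contains at least one marker, ignoring case.
    static boolean containsAny(String text, List<String> markers) {
        String haystack = text.toLowerCase(Locale.ROOT);
        for (String marker : markers) {
            if (haystack.contains(marker.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // successMarkers and failureMarkers are assumptions you choose per site,
    // e.g. "Welcome, <username>" and "check your credentials".
    static Result classify(String pageText,
                           List<String> successMarkers,
                           List<String> failureMarkers) {
        String text = pageText == null ? "" : pageText;
        if (containsAny(text, successMarkers)) {
            return Result.LOGGED_IN;      // a known post-login string is present
        }
        if (containsAny(text, failureMarkers)) {
            return Result.LOGIN_FAILED;   // a known error or pre-login string is present
        }
        return Result.UNKNOWN;            // neither: possibly a false negative
    }
}
```

A monitoring script could treat `LOGGED_IN` as "service up", `LOGIN_FAILED` as "service reachable but login rejected", and log the page for manual review on `UNKNOWN`. That way the markers can be updated when the site changes its wording.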
So, is there any way to be 100% right when guessing that the user is logged in and also 100% right when guessing that the user is not logged in? No, there isn't. The simplest way to understand this is to think about using the web the way a human does:
Scenario 1:
Scenario 2:
Scenario 3:
Scenario 4:
Those are just a few scenarios, but there are many more. Now think of this: even a human head cannot be 100% sure of the result of a login attempt... how can we expect a headless browser to be? :)