I am making a website that draws graphs of the number of people present in groups (from www.codecademy.com).
To achieve this I came up with a plan.
I will have a server which will poll the CodeCademy groups page (http://www.codecademy.com/groups) every 30 seconds and retrieve the needed information from that HTML.
Then, when a client connects to my website, the server will give the client that information and then the client will use either http://www.chartjs.org/docs/ or http://www.jqplot.com/ to draw the graph based on that information.
However, there is a big problem. If you have clicked any of the CodeCademy links, you will have noticed that you need an account to actually see the website. This can be a Facebook account, a Google account or a Twitter account.
So, long story short, if I want to access the page with the information about the groups, I need a bot account for my server and I need to teach my server to log in to that account.
Thus, I have created a dummy account at Gmail, called codecademybot, and I want my server to use this account to log in to codecademy so it can see that page's content.
By following a quickstart Python tutorial that connects me to Google+, I now also have the code to interact with it.
However, despite all this, I still don't have the slightest idea how to interact with the website. I have the following questions:
- How do I detect whether I am logged in to my Google account?
- How do I connect to that account so I can then access the page?
- Is there a special link for logging in to that website?
I am quite lost and would appreciate any possible help.
Don't let all the code samples and how-tos lead you astray. They are intended for more complicated cases.
This means that you only need to automate what ordinary users do when logging in to codecademy. Play that interaction in the browser a couple of times with a dev tool listening in (IE dev tools, Firebug, whatever) and look at the conversation of HTTP requests.
This is what you wish to emulate.
From what I can see, two cookies (remember_user_token and _session_id) get set in my browser before I get forwarded to http://www.codecademy.com/
That last bit, I think, is interesting. How about you manually log in using your browser, listen in on the conversation and copy these two cookies to your automated code? See if they suffice as authentication tokens and allow you to fetch the data from the website.
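To make the cookie idea concrete, here is a minimal sketch using only the Python standard library. It assumes the two cookies copied from the browser are enough to authenticate, which I have not verified. The helper names (build_cookie_header, build_authenticated_request, fetch_page) are made up for illustration, and the cookie values are placeholders you would paste in yourself.

```python
import urllib.request

# The page the question wants to scrape.
GROUPS_URL = "http://www.codecademy.com/groups"


def build_cookie_header(cookies):
    """Join name/value pairs into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_authenticated_request(url, remember_user_token, session_id):
    """Create a request that carries the two cookies copied from the browser.

    Assumption: these two cookies are sufficient for the site to treat
    the request as logged in.
    """
    cookies = {
        "remember_user_token": remember_user_token,
        "_session_id": session_id,
    }
    request = urllib.request.Request(url)
    request.add_header("Cookie", build_cookie_header(cookies))
    return request


def fetch_page(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (performs a real HTTP request, so it is left commented out):
# req = build_authenticated_request(GROUPS_URL, "PASTE_TOKEN", "PASTE_SESSION_ID")
# html = fetch_page(req)
```

If the returned HTML is the login page rather than the groups page, the cookies were not enough (or have expired) and you will have to emulate the full login conversation instead.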
If not, then I warmly recommend @CrisBee21's answer. Let's hope pyCurl can emulate the browser well enough to do the conversation for you.
One more thing: when I browse around the site, I see one REST API request, namely http://www.codecademy.com/api/v1/notifications/userid/unread_count?authentication_token=some token
Surfing to http://www.codecademy.com/api/v1/users/userid/?authentication_token=the token gives me more info about myself, and
http://www.codecademy.com/api/v1/users/userid/groups?authentication_token=the token gives me the groups I'm in.
If you have more documentation about the codecademy REST API, you could try and take it from there. I couldn't find any documentation and am making this up as I go along.
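For what it's worth, the API URLs I observed could be built like this. This is only a sketch based on the three requests listed above: the function names are hypothetical, the user id and token are values you would have to dig out of your own browser session, and the assumption that the responses are JSON is mine.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Base path taken from the requests seen in the browser.
API_BASE = "http://www.codecademy.com/api/v1"


def api_url(path_parts, authentication_token):
    """Build an API URL with the token passed as a query parameter."""
    path = "/".join(quote(str(part), safe="") for part in path_parts)
    query = urlencode({"authentication_token": authentication_token})
    return f"{API_BASE}/{path}?{query}"


def user_groups_url(user_id, authentication_token):
    """URL that listed the groups I was in."""
    return api_url(["users", user_id, "groups"], authentication_token)


def unread_count_url(user_id, authentication_token):
    """URL of the notifications request seen while browsing."""
    return api_url(
        ["notifications", user_id, "unread_count"], authentication_token
    )


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the body, assuming the API answers in JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(user_groups_url(your_id, your_token)) would then be the first thing to try. Since the API is undocumented, any of these paths may change without notice.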