When I access a page on an IIS server to retrieve XML, passing a query parameter through the browser (using the URL in the example below), I get a pop-up login dialog asking for a username and password (it appears to be a system-standard dialog). Once I submit it, the data arrives as an XML page.
How do I handle this with urllib? When I do the following, I am never prompted for a username and password; I just get a traceback indicating that the server (correctly) identifies me as not authorized. I am using Python 2.7 in an IPython notebook.
import urllib

# Plain urlopen sends no credentials, so the server rejects the request as unauthorized.
f = urllib.urlopen("http://www.nalmls.com/SERetsHuntsville/Search.aspx?SearchType=Property&Class=RES&StandardNames=0&Format=COMPACT&Query=(DATE_MODIFIED=2012-09-28T00:00:00%2B)&Limit=10")
s = f.read()
f.close()
Pointers to documentation are also appreciated; I did not find this exact use case.
I plan to convert the XML to CSV, if that makes a difference.
There are many ways to do it, but I suggest you start with urllib2, which ships with Python's standard library ("batteries included").
You can use requests, BeautifulSoup, mechanize, or selenium if your task gets harder. Googling will give you plenty of examples for each of these.
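As a rough illustration of the urllib2 route, here is a minimal sketch that assumes the server uses HTTP digest authentication. The helper name build_digest_opener and the placeholder credentials are made up for this example; the import fallback is only there so the same code also runs on Python 3, where urllib2 became urllib.request.

```python
try:
    import urllib2 as request_lib  # Python 2.7, as in the question
except ImportError:
    import urllib.request as request_lib  # Python 3 home of the same classes


def build_digest_opener(top_level_url, username, password):
    """Return an opener that answers digest challenges for top_level_url.

    Hypothetical helper for illustration; not part of urllib2 itself.
    """
    password_mgr = request_lib.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None means "use these credentials for any realm at this URL".
    password_mgr.add_password(None, top_level_url, username, password)
    handler = request_lib.HTTPDigestAuthHandler(password_mgr)
    return request_lib.build_opener(handler)


# Placeholder credentials (assumption); substitute your own.
opener = build_digest_opener("http://www.nalmls.com/", "my_user", "my_password")

# The request itself is left commented out because it needs real credentials:
# xml_text = opener.open(search_url).read()
```

If the server turns out to use basic rather than digest authentication, swapping in HTTPBasicAuthHandler should be the only change needed.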
You are dealing with HTTP authentication. I've always found it tricky to get working quickly with the urllib library. The requests Python package makes it super simple.
If you look at the response headers for that URL, you can see that it is using digest authentication:
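One way to check this yourself is to look at the WWW-Authenticate header that comes back with the 401 response; its first token names the scheme. The sketch below only parses such a value. The function name auth_scheme and the sample header are hypothetical and were not captured from the real server.

```python
def auth_scheme(www_authenticate):
    """Return the lowercase scheme name from a WWW-Authenticate header value."""
    parts = www_authenticate.strip().split(None, 1)
    return parts[0].lower() if parts else ""


# Made-up header value for illustration only (assumption).
sample_header = 'Digest realm="example", nonce="abc123", qop="auth"'
print(auth_scheme(sample_header))  # digest
```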
So you will need:
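A minimal sketch of what that could look like with requests and its HTTPDigestAuth helper. The function name fetch_listing_xml, the timeout value, and the placeholder credentials are assumptions for this example; the URL is the one from the question.

```python
import requests
from requests.auth import HTTPDigestAuth

SEARCH_URL = (
    "http://www.nalmls.com/SERetsHuntsville/Search.aspx"
    "?SearchType=Property&Class=RES&StandardNames=0&Format=COMPACT"
    "&Query=(DATE_MODIFIED=2012-09-28T00:00:00%2B)&Limit=10"
)


def fetch_listing_xml(username, password, url=SEARCH_URL):
    """Fetch the URL using digest authentication and return the body as text.

    Hypothetical helper for illustration.
    """
    response = requests.get(url, auth=HTTPDigestAuth(username, password), timeout=30)
    response.raise_for_status()  # raise on 401 and other HTTP error codes
    return response.text


# Needs real credentials, so the call is left commented out:
# xml_text = fetch_listing_xml("my_user", "my_password")
```

requests handles the challenge/response round trip itself, so no opener or password manager has to be set up.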
This can be done in a couple of ways:
urllib/urllib2 and requests, as others have suggested
Mechanize, to simulate manual form-filling and get back the response