Question:

Hi fellow programmers!

I am trying to write a script to log in to my university's "food balance" page using Python and the mechanize module...

This is the page I am trying to log into: http://www.wcu.edu/11407.asp The website has the following form to login:

<FORM method=post action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx><INPUT value=https://cf.wcu.edu/busafrs/catcard/idsearch.cfm type=hidden name=wcuirs_uri> 
<P><B>WCU ID Number<BR></B><INPUT maxLength=12 size=12 type=password name=id> </P>
<P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P>
<P></P>
<P><INPUT value="Request Access" type=submit name=submit> </P></FORM>

From this we know that I need to fill in the following fields: 1. name=id 2. name=PIN

With the action: action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx
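
As a quick check of that reading, the form markup above can be parsed to list every named control it would submit. This is only a minimal sketch using the Python 3 standard library (not mechanize); `InputLister` and `FORM_HTML` are hypothetical names introduced for illustration. It suggests that, besides `id` and `PIN`, the form also carries a hidden `wcuirs_uri` field and the `submit` button value.

```python
from html.parser import HTMLParser

# The login form exactly as quoted in the question.
FORM_HTML = """
<FORM method=post action=https://itapp.wcu.edu/BanAuthRedirector/Default.aspx><INPUT value=https://cf.wcu.edu/busafrs/catcard/idsearch.cfm type=hidden name=wcuirs_uri>
<P><B>WCU ID Number<BR></B><INPUT maxLength=12 size=12 type=password name=id> </P>
<P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P>
<P></P>
<P><INPUT value="Request Access" type=submit name=submit> </P></FORM>
"""


class InputLister(HTMLParser):
    """Record the form action and (name, type, value) of each named <input>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and "name" in attrs:
            self.inputs.append(
                (attrs["name"], attrs.get("type", "text"), attrs.get("value", ""))
            )


lister = InputLister()
lister.feed(FORM_HTML)

print("action:", lister.action)
for name, input_type, value in lister.inputs:
    print(name, input_type, repr(value))
```

Any field listed here with a non-empty value is one the server may expect to receive along with the credentials.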

This is the script I have written thus far:

#!/usr/bin/python2 -W ignore

import mechanize, cookielib
from time import sleep

url   = 'http://www.wcu.edu/11407.asp'
myId  = '11111111111'
myPin = '22222222222'

# Browser
#br = mechanize.Browser()
#br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))
br = mechanize.Browser(factory=mechanize.RobustFactory()) # Use this because of bad html tags in the html...

# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options
br.set_handle_equiv(True)
br.set_handle_gzip(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# Follow refresh 0, but do not hang on refresh > 0
br.set_handle_refresh(mechanize._http.HTTPRefreshProcessor(), max_time=1)

# User-Agent (fake agent to google-chrome linux x86_64)
br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                 ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                 ('Accept-Encoding', 'gzip,deflate,sdch'),                  
                 ('Accept-Language', 'en-US,en;q=0.8'),                     
                 ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]

# The site we will navigate into
br.open(url)

# Go though all the forms (for debugging only)
for f in br.forms():
    print f


# Select the third form (nr is zero-based, so index 2)
br.select_form(nr=2)

# User credentials
br.form['id']  = myId
br.form['PIN'] = myPin

br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'

# Login
br.submit()

# Wait 10 seconds
sleep(10)

# Save to a file
f = open('mycatpage.html', 'w')
f.write(br.response().read())
f.close()

Now the problem...

For some odd reason the page I get back (in mycatpage.html) is the login page and not the expected page that displays my "cat cash balance" and "number of block meals" left...

Does anyone have any idea why? Keep in mind that the headers are all correct, and while the ID and PIN are not really 11111111111 and 22222222222, the correct values do work on the website (using a browser...)

Thanks in advance

EDIT

Another script I tried:

from urllib import urlopen, urlencode                                           
import urllib2                                                                  
import httplib                                                                  

url = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'                    

myId = 'xxxxxxxx'                                                               
myPin = 'xxxxxxxx'                                                              

data = {                                                                        
            'id':myId,                                                          
            'PIN':myPin,                                                        
            'submit':'Request Access',                                          
            'wcuirs_uri':'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'      
        }                                                                       

opener = urllib2.build_opener()                                                 
opener.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'),
                     ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
                     ('Accept-Encoding', 'gzip,deflate,sdch'),                  
                     ('Accept-Language', 'en-US,en;q=0.8'),                     
                     ('Accept-Charset', 'ISO-8859-1,utf-8;q=0.7,*;q=0.3')]      

request = urllib2.Request(url, urlencode(data))                                 
# Read the response body before writing; write() needs a string, not a response object
open("mycatpage.html", 'w').write(opener.open(request).read())

This has the same behavior...

Answer 1:

# User credentials
br.form['id']  = myId
br.form['PIN'] = myPin

I believe these are the problem lines.

Try changing it to

br['id'] = myId
br['PIN'] = myPin

I'm also pretty sure that you don't need br.form.action = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx' because you have already selected the form so just calling submit should work, but I could be wrong.

Additionally, I have done a similar task just using urllib and urllib2, so if this doesn't work I will post that code.

Edit: here is the technique that I used with urllib and urllib2:

import urllib2, urllib

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)
encoded = urllib.urlencode({"PIN":my_pin, "id":my_id})
f = opener.open('http://www.wcu.edu/11407.asp', encoded)
data = f.read()
f.close()

Edit 2:

>>> b = mechanize.Browser(factory=mechanize.RobustFactory())
>>> b.open('http://www.wcu.edu/11407.asp')
<response_seek_wrapper at 0x10acfa248 whose wrapped object = <closeable_response at 0x10aca32d8 whose fp = <socket._fileobject object at 0x10aaf45d0>>>
>>> b.select_form(nr=2)
>>> b.form
<mechanize._form.HTMLForm instance at 0x10ad0dbd8>
>>> b.form.attrs
{'action': 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx', 'method': 'post'}

This could be your problem? Not sure.

Edit 3:

I used an HTML inspector, and I think there's a decent chance you need to set 'wcuirs_uri' to 'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'. I'm 95% sure that will work.

Answer 2:

I suggest the following library: http://docs.python-requests.org/en/latest/

It is a nice, easy-to-use library with good documentation. I have used it for several kinds of scripting, including tasks just like the one you are doing.

You need to do something like this:

import requests 

s = requests.Session()
url = 'https://itapp.wcu.edu/BanAuthRedirector/Default.aspx'                    
myId = 'xxxxxxxx'                                                               
myPin = 'xxxxxxxx'  
headers = {'User-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
           'Accept-Encoding': 'gzip,deflate,sdch',                  
           'Accept-Language': 'en-US,en;q=0.8',                     
           'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3'}
data = {'id':myId,                                                          
        'PIN':myPin,                                                        
        'submit':'Request Access',                                          
        'wcuirs_uri':'https://cf.wcu.edu/busafrs/catcard/idsearch.cfm'} 
response = s.post(url, headers=headers, data=data)

if response.status_code == 200: #Maybe add another constraint to be sure we are logged in
    #Now do any call you want
    response = s.get(another_url)
    print response.text

You can get more info in the requests documentation linked above.
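
A status code of 200 alone does not prove the login worked, since the question reports getting the login page back. One possible extra constraint, sketched here with the standard library only, is to check whether the returned HTML still contains the `PIN` input from the login form. The helper name `looks_like_login_page` is hypothetical, and the check is a heuristic that assumes the post-login page has no input named `PIN`.

```python
import re

# Matches an <input> tag whose name attribute is PIN, quoted or unquoted,
# as in the login form quoted in the question.
_PIN_INPUT = re.compile(r"<input\b[^>]*\bname\s*=\s*[\"']?PIN\b", re.IGNORECASE)


def looks_like_login_page(html):
    """Heuristic: return True if the HTML still contains the login form's PIN field."""
    return bool(_PIN_INPUT.search(html))


# The PIN line from the login form in the question.
login_snippet = "<P><B>PIN<BR></B><INPUT maxLength=20 type=password name=PIN> </P>"
# A placeholder standing in for whatever the real balance page returns.
other_snippet = "<html><body><p>placeholder for the balance page</p></body></html>"

print(looks_like_login_page(login_snippet))
print(looks_like_login_page(other_snippet))
```

With requests, this could be applied as `looks_like_login_page(response.text)` right after the POST, before making further calls on the session.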

Answer 3:

Another solution I've used in messing w/ ASPX is robobrowser.

For example:

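from robobrowser import RoboBrowser

# oc_auth_uri (the URL of the login page) is assumed to be defined elsewhere
def auth(mailbox, password):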
    browser = RoboBrowser(history=False)
    browser.open(oc_auth_uri)

    signin = browser.get_form(id='aspnetForm')
    signin['SubLoginControl:mailbox'].value = mailbox
    signin['SubLoginControl:password'].value = password
    signin['SubLoginControl:javascriptTest'].value = 'true'
    signin['SubLoginControl:btnLogOn'].value = 'Logon'
    signin['SubLoginControl:webLanguage'].value = 'en-US'
    signin['SubLoginControl:initialLanguage'].value = 'en-US'
    signin['SubLoginControl:errorCallBackNumber'].value = 'Entered+telephone+number+contains+non-dialable+characters.'
    signin['SubLoginControl:cookieMailbox'].value = 'mailbox'
    signin['SubLoginControl:cookieCallbackNumber'].value = 'callbackNumber'
    signin['SubLoginControl:serverDomain'].value = ''

    browser.submit_form(signin)
    return browser

Note: You may need to update the form to add hidden form fields such as __VIEWSTATE and friends to the form prior to submitting. See this post for further info.
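
The idea in that note can be sketched as follows: collect every hidden input from the login page and merge the visible credentials on top before posting. This is a minimal illustration using only the Python 3 standard library, not robobrowser's own API; `HiddenFieldCollector`, `build_payload`, and the sample HTML (including its `__VIEWSTATE` and `__EVENTVALIDATION` values) are made-up placeholders.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect name -> value for every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<input ... />) are routed here as well.
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if tag == "input" and is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_payload(page_html, visible_fields):
    """Return the hidden fields found in page_html, overlaid with visible_fields."""
    collector = HiddenFieldCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)
    payload.update(visible_fields)
    return payload


# Placeholder page; a real ASP.NET page would carry much longer values.
SAMPLE_PAGE = """
<form id="aspnetForm" method="post">
<input type="hidden" name="__VIEWSTATE" value="abc123" />
<input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
<input type="text" name="SubLoginControl:mailbox" />
</form>
"""

payload = build_payload(SAMPLE_PAGE, {"SubLoginControl:mailbox": "1234"})
print(payload)
```

The resulting dictionary could then be passed as the POST data to whichever HTTP client is in use.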