Question:

I'm trying to fetch a URL from a Jenkins server. Until recently I was able to use the pattern described on this page (HOWTO Fetch Internet Resources Using urllib2) to create a password manager that correctly responded to BasicAuth challenges with the username and password. All was fine until the Jenkins team changed their security model, and that code stopped working.

# DOES NOT WORK!
import urllib2
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
top_level_url = "http://localhost:8080"

password_mgr.add_password(None, top_level_url, 'sal', 'foobar')
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)

a_url = 'http://localhost:8080/job/foo/4/api/python'
print opener.open(a_url).read()

Stacktrace:

Traceback (most recent call last):
  File "/home/sal/workspace/jenkinsapi/src/examples/password.py", line 11, in <module>
    print opener.open(a_url).read()
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
[Finished in 0.0s with exit code 1]

The problem appears to be that Jenkins responds not with the expected 401 code but with a 403, which urllib2 interprets as the end of the conversation. It never actually sends the password. After some searching on GitHub I found another developer's solution, which works...

# WORKS... SORTA
import base64
import urllib2

def auth_headers(username, password):
    # b64encode, unlike encodestring, inserts no newlines that need stripping
    return 'Basic ' + base64.b64encode('%s:%s' % (username, password))

auth = auth_headers('sal', 'foobar')
top_level_url = "http://localhost:8080"
a_url = 'http://localhost:8080/job/foo/4/api/python'
req = urllib2.Request(a_url)
req.add_header('Authorization', auth)
print urllib2.urlopen(req).read()

But that seems rather unsatisfying. It's not bothering to check whether the domain is relevant to the username and password... it's just sending my login details regardless!

Can anybody suggest a way to make the original script work? I'd like to use a urllib2 password manager in such a way that I can log in to Jenkins.

Answer 1:

See this gist as well: https://gist.github.com/dnozay/194d816aa6517dc67ca1

Jenkins does not return a 401 (Unauthorized) challenge when you request a page that needs authentication; it returns 403 (Forbidden) instead. The wiki page https://wiki.jenkins-ci.org/display/JENKINS/Authenticating+scripted+clients shows that with the command-line tool wget you need to pass wget --auth-no-challenge, which is necessary precisely because of that behavior.
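
To see why the stock handler never gets a chance to answer, it can help to list which HTTP status codes it reacts to. The following is a minimal sketch, assuming Python 3 (where urllib2 was folded into urllib.request); the helper name handled_status_codes is made up for illustration and is not part of the standard library.

```python
import urllib.request


def handled_status_codes(handler):
    """Return the status codes for which `handler` defines an http_error_NNN hook."""
    prefix = "http_error_"
    codes = []
    for name in dir(handler):
        suffix = name[len(prefix):]
        # Only numeric suffixes are per-status hooks (e.g. http_error_401).
        if name.startswith(prefix) and suffix.isdigit():
            codes.append(int(suffix))
    return sorted(codes)


basic = urllib.request.HTTPBasicAuthHandler()
print(handled_status_codes(basic))
```

The basic-auth handler only hooks 401, so a 403 response falls through to the default error handler and is raised as an HTTPError before any credentials are sent.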

Retrying with basic auth when you get a `403 - forbidden`:

Let's say you have defined:

jenkins_url = "https://jenkins.example.com"
username = "johndoe@example.com"
api_token = "my-api-token"

You can subclass a urllib2.HTTPBasicAuthHandler to handle 403 HTTP responses.

import urllib2

class HTTPBasic403AuthHandler(urllib2.HTTPBasicAuthHandler):
    # retry with basic auth when facing a 403 forbidden
    def http_error_403(self, req, fp, code, msg, headers):
        host = req.get_host()
        realm = None
        return self.retry_http_basic_auth(host, req, realm)

Then it is a matter of using that handler, e.g. you can install it so it works for all urllib2.urlopen calls:

def install_auth_opener():
    '''Install the authentication handler.

    This handles non-standard behavior where the server responds with
    403 Forbidden instead of a 401 challenge, which means it does not
    give you the chance to provide your credentials.'''
    auth_handler = HTTPBasic403AuthHandler()
    auth_handler.add_password(
        realm=None,
        uri=jenkins_url,
        user=username,
        passwd=api_token)
    opener = urllib2.build_opener(auth_handler)
    # install it for all urllib2.urlopen calls
    urllib2.install_opener(opener)

and here is a simple test to see if it works okay.

if __name__ == "__main__":
    # test
    install_auth_opener()
    page = "%s/me/api/python" % jenkins_url
    try:
        result = urllib2.urlopen(page)
        assert result.code == 200
        print "ok"
    except urllib2.HTTPError, err:
        assert err.code != 401, 'BAD CREDENTIALS!'
        raise
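
For readers on Python 3, the same idea can probably be ported along the following lines. This is a sketch under assumptions, not the answer's own code: it relies on urllib.request keeping the (undocumented) retry_http_basic_auth method that urllib2 had, and build_jenkins_opener is a hypothetical helper name. Unlike the version above, it returns an opener instead of installing one globally.

```python
import urllib.request


class HTTPBasic403AuthHandler(urllib.request.HTTPBasicAuthHandler):
    """Retry with basic auth when the server answers 403 instead of 401."""

    def http_error_403(self, req, fp, code, msg, headers):
        # Look up credentials by the full request URL, with no realm.
        # The result is None (so the 403 surfaces as an HTTPError) when no
        # stored credentials match or when they were already sent once.
        return self.retry_http_basic_auth(req.full_url, req, None)


def build_jenkins_opener(base_url, user, token):
    """Hypothetical helper: an opener that offers credentials only under base_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, base_url, user, token)
    return urllib.request.build_opener(HTTPBasic403AuthHandler(password_mgr))


# Placeholder values taken from the definitions earlier in this answer.
opener = build_jenkins_opener(
    "https://jenkins.example.com", "johndoe@example.com", "my-api-token")
# opener.open("https://jenkins.example.com/me/api/python") would then retry
# once with an Authorization header if the first response is a 403.
```

Because the password manager is consulted before retrying, URLs outside base_url never receive the credentials.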

Using pre-emptive authentication.

There is a good example in this answer: https://stackoverflow.com/a/8513913/1733117. Rather than retrying when you get a 403 forbidden you would send the Authorization header when the url matches.

import base64
import urllib2

class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
    '''Preemptive basic auth.

    Instead of waiting for a 403 to then retry with the credentials,
    send the credentials if the url is handled by the password manager.
    Note: please use realm=None when calling add_password.'''
    def http_request(self, req):
        url = req.get_full_url()
        realm = None
        # this is very similar to the code from retry_http_basic_auth()
        # but returns a request object.
        user, pw = self.passwd.find_user_password(realm, url)
        if pw:
            raw = "%s:%s" % (user, pw)
            auth = 'Basic %s' % base64.b64encode(raw).strip()
            req.add_unredirected_header(self.auth_header, auth)
        return req

    https_request = http_request
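
On Python 3.5 and later, a custom subclass may not be needed at all: urllib.request ships HTTPPasswordMgrWithPriorAuth, and the stock HTTPBasicAuthHandler sends credentials up front for URIs registered with is_authenticated=True. The sketch below swaps in that standard-library mechanism in place of the handler above; build_preemptive_handler is a hypothetical name, and the URL and credentials are the placeholder values from this answer.

```python
import urllib.request


def build_preemptive_handler(base_url, user, secret):
    """Hypothetical helper: basic-auth handler that authenticates pre-emptively."""
    password_mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
    # is_authenticated=True asks the handler to send the Authorization
    # header on the first request instead of waiting for a 401 challenge.
    password_mgr.add_password(None, base_url, user, secret, is_authenticated=True)
    return urllib.request.HTTPBasicAuthHandler(password_mgr)


handler = build_preemptive_handler(
    "https://jenkins.example.com", "johndoe@example.com", "my-api-token")
opener = urllib.request.build_opener(handler)

# The handler's request hook can be exercised without touching the network.
inside = handler.http_request(
    urllib.request.Request("https://jenkins.example.com/me/api/python"))
outside = handler.http_request(
    urllib.request.Request("https://elsewhere.example.org/"))
print(inside.has_header("Authorization"), outside.has_header("Authorization"))
```

Only requests under the registered base URL pick up the header, which addresses the concern about sending login details to unrelated hosts.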

Answer 2:

Rather than defining your own handler and installing it globally or using it for individual requests, it's a lot easier to just add the header to the request:

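import base64
import urllib
import urllib2

# USERNAME, API_KEY, url and data are assumed to be defined by the caller
auth_header = 'Basic ' + base64.b64encode('%s:%s' % (USERNAME,
                                                      API_KEY)).strip()
headers = {'Authorization': auth_header}

# passing a data argument makes urllib2 send a POST request
request = urllib2.Request(url, urllib.urlencode(data), headers)
result = urllib2.urlopen(request)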
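
If the header is set by hand, a small guard can keep the credentials from going to unrelated hosts, which was the concern raised in the question. The following is a minimal sketch assuming Python 3; basic_auth_header and authorized_request are hypothetical helper names, and the prefix check is deliberately simple (it compares URL strings and does not normalize case or default ports).

```python
import base64
import urllib.request


def basic_auth_header(user, secret):
    """Build the value of a Basic Authorization header."""
    raw = "{}:{}".format(user, secret).encode("utf-8")
    # b64encode works on bytes in Python 3; decode back to str for the header.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def authorized_request(url, trusted_base, user, secret):
    """Return a Request that carries credentials only for URLs under trusted_base."""
    base = trusted_base.rstrip("/")
    request = urllib.request.Request(url)
    # Require an exact match or a "/" right after the base, so that a host
    # such as jenkins.example.com.evil.example does not match.
    if url == base or url.startswith(base + "/"):
        request.add_unredirected_header(
            "Authorization", basic_auth_header(user, secret))
    return request


request = authorized_request(
    "https://jenkins.example.com/me/api/python",
    "https://jenkins.example.com",
    "johndoe@example.com",
    "my-api-token",
)
# urllib.request.urlopen(request) would then send the header on the first try.
```

Using add_unredirected_header rather than add_header means the credentials are not carried along if the server redirects the request elsewhere.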