I'm trying to fetch a URL from a Jenkins server. Until somewhat recently I was able to use the pattern described on this page (HOWTO Fetch Internet Resources Using urllib2) to create a password manager that correctly responded to BasicAuth challenges with the username and password. All was fine until the Jenkins team changed their security model, and that code no longer worked.
# DOES NOT WORK!
import urllib2
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
top_level_url = "http://localhost:8080"
password_mgr.add_password(None, top_level_url, 'sal', 'foobar')
handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(handler)
a_url = 'http://localhost:8080/job/foo/4/api/python'
print opener.open(a_url).read()
Stacktrace:
Traceback (most recent call last):
  File "/home/sal/workspace/jenkinsapi/src/examples/password.py", line 11, in <module>
    print opener.open(a_url).read()
  File "/usr/lib/python2.7/urllib2.py", line 410, in open
    response = meth(req, response)
  File "/usr/lib/python2.7/urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python2.7/urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
[Finished in 0.0s with exit code 1]
The problem appears to be that Jenkins responds not with the expected 401 code but with a 403, which urllib2 interprets as the end of the conversation. It never actually sends the password. After some surfing around GitHub I found another developer's solution, which works...
# WORKS... SORTA
import base64
import urllib2

def auth_headers(username, password):
    # b64encode never inserts line breaks, unlike encodestring
    return 'Basic ' + base64.b64encode('%s:%s' % (username, password))

auth = auth_headers('sal', 'foobar')
top_level_url = "http://localhost:8080"
a_url = 'http://localhost:8080/job/foo/4/api/python'
req = urllib2.Request(a_url)
req.add_header('Authorization', auth)
print urllib2.urlopen(req).read()
But that seems rather unsatisfying. It's not bothering to check whether the domain is relevant to the username and password... it's just sending my login details regardless!
Can anybody suggest a way to make the original script work? I'd like to use a urllib2 password manager in such a way that I can log in to Jenkins.
See this gist as well: https://gist.github.com/dnozay/194d816aa6517dc67ca1
Jenkins does not return a 401 - retry HTTP error code when you access a page that needs authentication; instead it returns 403 - forbidden. The wiki, https://wiki.jenkins-ci.org/display/JENKINS/Authenticating+scripted+clients, shows that with the command-line tool wget you need to use wget --auth-no-challenge, which is exactly because of that behavior.
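As a quick way to see why the stock handler stays silent on a 403, the sketch below lists the status codes that the basic-auth handler defines `http_error_NNN` hooks for. It assumes Python 3, where `urllib2` was merged into `urllib.request`; `handled_error_codes` is a hypothetical helper name, not part of the library.

```python
import urllib.request


def handled_error_codes(handler_cls):
    """Return the HTTP status codes for which handler_cls defines an
    http_error_NNN hook (hypothetical helper, for illustration only)."""
    prefix = "http_error_"
    codes = []
    for name in dir(handler_cls):
        suffix = name[len(prefix):]
        # keep only numeric hooks such as http_error_401
        if name.startswith(prefix) and suffix.isdigit():
            codes.append(int(suffix))
    return sorted(codes)


if __name__ == "__main__":
    # 401 should be listed and 403 should not, so a 403 response
    # falls through to the default handler, which raises HTTPError.
    print(handled_error_codes(urllib.request.HTTPBasicAuthHandler))
```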
Retrying with basic auth when you get a 403 - forbidden:
Let's say you defined:
jenkins_url = "https://jenkins.example.com"
username = "johndoe@example.com"
api_token = "my-api-token"
You can subclass urllib2.HTTPBasicAuthHandler to handle 403 HTTP responses.
import urllib2

class HTTPBasic403AuthHandler(urllib2.HTTPBasicAuthHandler):
    # retry with basic auth when facing a 403 forbidden
    def http_error_403(self, req, fp, code, msg, headers):
        host = req.get_host()
        realm = None
        return self.retry_http_basic_auth(host, req, realm)
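For readers on Python 3, a rough equivalent of the handler above might look like the following sketch. It assumes `urllib.request` in place of `urllib2` and `req.host` in place of `req.get_host()`; `build_jenkins_opener` is a hypothetical helper name introduced here for illustration.

```python
import urllib.request


class HTTPBasic403AuthHandler(urllib.request.HTTPBasicAuthHandler):
    """Retry with basic auth when the server answers 403 instead of 401."""

    def http_error_403(self, req, fp, code, msg, headers):
        # realm=None: look the credentials up by host only.
        # Returns None when no credentials match or when the same
        # credentials were already sent, so there is no retry loop.
        return self.retry_http_basic_auth(req.host, req, None)


def build_jenkins_opener(base_url, username, api_token):
    """Hypothetical helper: build an opener that only knows credentials
    for URLs under base_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, base_url, username, api_token)
    return urllib.request.build_opener(HTTPBasic403AuthHandler(password_mgr))
```

Usage would be along the lines of `build_jenkins_opener(jenkins_url, username, api_token).open(jenkins_url + "/me/api/python")`. Note that each protected URL still costs two round trips: the first request gets the 403, the second carries the credentials.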
Then it is a matter of using that handler, e.g. you can install it so it works for all urllib2.urlopen calls:
def install_auth_opener():
    '''install the authentication handler.

    This handles non-standard behavior where the server responds with
    403 forbidden, instead of 401 retry. Which means it does not give you the
    chance to provide your credentials.'''
    auth_handler = HTTPBasic403AuthHandler()
    auth_handler.add_password(
        realm=None,
        uri=jenkins_url,
        user=username,
        passwd=api_token)
    opener = urllib2.build_opener(auth_handler)
    # install it for all urllib2.urlopen calls
    urllib2.install_opener(opener)
Here is a simple test to see if it works okay.
if __name__ == "__main__":
    # test
    install_auth_opener()
    page = "%s/me/api/python" % jenkins_url
    try:
        result = urllib2.urlopen(page)
        assert result.code == 200
        print "ok"
    except urllib2.HTTPError, err:
        assert err.code != 401, 'BAD CREDENTIALS!'
        raise err
Using pre-emptive authentication.
There is a good example in this answer: https://stackoverflow.com/a/8513913/1733117. Rather than retrying when you get a 403 forbidden, you would send the Authorization header when the url matches.
import base64
import urllib2

class PreemptiveBasicAuthHandler(urllib2.HTTPBasicAuthHandler):
    '''Preemptive basic auth.

    Instead of waiting for a 403 to then retry with the credentials,
    send the credentials if the url is handled by the password manager.
    Note: please use realm=None when calling add_password.'''
    def http_request(self, req):
        url = req.get_full_url()
        realm = None
        # this is very similar to the code from retry_http_basic_auth()
        # but returns a request object.
        user, pw = self.passwd.find_user_password(realm, url)
        if pw:
            raw = "%s:%s" % (user, pw)
            auth = 'Basic %s' % base64.b64encode(raw).strip()
            req.add_unredirected_header(self.auth_header, auth)
        return req

    # apply the same pre-emptive logic to https:// URLs
    https_request = http_request
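On Python 3.5 and later, the standard library appears to cover this case without a custom handler: `urllib.request.HTTPPasswordMgrWithPriorAuth` accepts `is_authenticated=True` in `add_password`, and `HTTPBasicAuthHandler` then sends the `Authorization` header up front for matching URLs only. The following is a minimal sketch under that assumption; `build_preemptive_opener` is a hypothetical helper name.

```python
import urllib.request


def build_preemptive_opener(base_url, username, api_token):
    """Hypothetical helper: send basic-auth credentials pre-emptively,
    but only for URLs that fall under base_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
    # is_authenticated=True tells the handler not to wait for a 401
    password_mgr.add_password(None, base_url, username, api_token,
                              is_authenticated=True)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)
```

Because the password manager still does the URL matching, requests to other hosts made through the same opener should go out without the header, which addresses the concern about sending login details regardless of domain.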
Rather than defining your own handler and installing it globally or using it for individual requests, it's a lot easier to just add the header to the request:
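import base64
import urllib
import urllib2

# USERNAME, API_KEY, url and data are assumed to be defined elsewhere
auth_header = 'Basic ' + base64.b64encode('%s:%s' % (USERNAME, API_KEY)).strip()
headers = {'Authorization': auth_header}
# passing data makes this a POST request
request = urllib2.Request(url, urllib.urlencode(data), headers)
result = urllib2.urlopen(request)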
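If the plain-header approach is used, a small guard can keep the credentials from being attached to unrelated hosts. This is only a sketch, assuming Python 3 (`urllib.request`, `urllib.parse`); `make_request` and its `base_url` parameter are hypothetical names introduced here, not part of the original snippet.

```python
import base64
import urllib.parse
import urllib.request


def make_request(url, base_url, username, api_key, data=None):
    """Hypothetical helper: build a Request and attach basic-auth
    credentials only when url has the same scheme and host as base_url."""
    body = None
    if data is not None:
        # a request with a body is sent as POST
        body = urllib.parse.urlencode(data).encode("ascii")
    request = urllib.request.Request(url, data=body)

    target = urllib.parse.urlsplit(url)
    base = urllib.parse.urlsplit(base_url)
    if (target.scheme, target.netloc) == (base.scheme, base.netloc):
        raw = ("%s:%s" % (username, api_key)).encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", "Basic " + token)
    return request
```

The result can be passed to `urllib.request.urlopen` as usual. The comparison is deliberately strict (exact scheme and host:port), so an `http://` URL will not receive credentials registered for an `https://` base.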