Google Drive Python API resumable upload error 401

Posted 2019-01-20 11:38

Question:

First of all, I'm sorry if this is a silly question. This is the first time I've used any of the technologies involved in this script (Python, the Drive API, OAuth 2.0, etc.). I swear I searched and experimented for about a week before posting the question. hehehe

I'm trying to use google-api-python-client to upload a big file (3.5 GiB) from a terminal-only Debian Linux machine. I've had some success uploading small files, but when I try to upload the big file, the upload stops about 1-2 hours after it starts with an HTTP 401 (Unauthorized) error. I've been looking into how to get a new access token, but with little success.

This is my (updated) code so far:

#!/usr/bin/python

import httplib2
import pprint
import time

from apiclient.discovery import build
from apiclient.http import MediaFileUpload
from apiclient import errors
from oauth2client.client import OAuth2WebServerFlow

# Copy your credentials from the APIs Console
CLIENT_ID = 'myclientid'
CLIENT_SECRET = 'myclientsecret'

# Check https://developers.google.com/drive/scopes for all available scopes
OAUTH_SCOPE = 'https://www.googleapis.com/auth/drive'

# Redirect URI for installed apps
REDIRECT_URI = 'urn:ietf:wg:oauth:2.0:oob'

# Run through the OAuth flow and retrieve credentials
flow = OAuth2WebServerFlow(CLIENT_ID, CLIENT_SECRET, OAUTH_SCOPE, REDIRECT_URI)
authorize_url = flow.step1_get_authorize_url()
print 'Go to the following link in your browser: ' + authorize_url
code = raw_input('Enter verification code: ').strip()
credentials = flow.step2_exchange(code)

# Create an httplib2.Http object and authorize it with our credentials
http = httplib2.Http()
http = credentials.authorize(http)

drive_service = build('drive', 'v2', http=http)

# Insert a file
media_body = MediaFileUpload('bigfile.zip', mimetype='application/octet-stream', chunksize=1024*256, resumable=True)
body = {
    'title': 'bigfile.zip',
    'description': 'Big File',
    'mimeType': 'application/octet-stream'
}

retries = 0
request = drive_service.files().insert(body=body, media_body=media_body)
response = None
while response is None:
    try:
        # Print the current access token to see when it gets refreshed
        print http.request.credentials.access_token
        status, response = request.next_chunk()
        if status:
            print "Uploaded %.2f%%" % (status.progress() * 100)
            retries = 0
    except errors.HttpError, e:
        if e.resp.status == 404:
            # The upload session no longer exists, so it cannot be resumed
            print "Error 404! Aborting."
            exit()
        if retries > 10:
            print "Retries limit exceeded! Aborting."
            exit()
        # Exponential backoff before retrying the same chunk
        retries += 1
        time.sleep(2**retries)
        print "Error (%d)... retrying." % e.resp.status
print "Upload Complete!"

After some digging, I found out that the authorized http object automatically refreshes the access token after receiving a 401. Although the access token really does change, the upload still does not continue as expected. See the output below:

ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.28%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.30%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Error (401)... retrying.
ya29.AHES6ZQqp3_qbWsTk4yVDdHnlwc_7GvPZiFIReDnhIIiHao
Error (401)... retrying.
ya29.AHES6ZSqx90ZOUKqDEP4AAfWCVgXZYT2vJAiLwKDRu87JOs
Error (401)... retrying.
ya29.AHES6ZTp0RZ6U5K5UdDom0gq3XHnyVS-2sVU9hILOrG4o3Y
Error (401)... retrying.
ya29.AHES6ZSR-IOiwJ_p_Dm-OnCanVIVhCZLs7H_pYLMGIap8W0
Error (401)... retrying.
ya29.AHES6ZRnmM-YIZj4S8gvYBgC1M8oYy4Hv5VlcwRqgnZCOCE
Error (401)... retrying.
ya29.AHES6ZSF7Q7C3WQYuPAWrxvqbTRsipaVKhv_TfrD_gef1DE
Error (401)... retrying.
ya29.AHES6ZTsGzwIIprpPhCrqmoS3UkPsRzst5YHqL-zXJmz6Ak
Error (401)... retrying.
ya29.AHES6ZSS_1ZBiQJvZG_7t5uW3alsy1piGe4-u2YDnwycVrI
Error (401)... retrying.
ya29.AHES6ZTLFbBS8mSFWQ9zK8cgbX8RPeLghPxkfiKY54hBB-0
Error (401)... retrying.
ya29.AHES6ZQBeMWY50z6fWXvaCcd5_AJr_AYOuL2aiNKpK-mmyU
Error (401)... retrying.
ya29.AHES6ZTs2mYYSEyOqI_Ms4itKDx36t39Oc5RNZHkV4Dq49c
Retries limit exceeded! Aborting.
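
To put the retry loop in perspective, here is a small sketch (written for Python 3, unlike the Python 2 script above) that reproduces its backoff schedule: after each error the loop sleeps 2**retries seconds, and it gives up once retries exceeds 10. The function name backoff_schedule and its parameter are made up for illustration and are not part of any library.

```python
def backoff_schedule(max_retries=10):
    """Return the sleep durations (in seconds) used by the retry loop.

    Mirrors the loop in the question: on each error, abort if
    retries > max_retries, otherwise increment retries and sleep
    2**retries seconds. (Hypothetical helper, not a library function.)
    """
    delays = []
    retries = 0
    while retries <= max_retries:
        retries += 1
        delays.append(2 ** retries)
    return delays


delays = backoff_schedule()
print(len(delays))          # number of retries before aborting
print(sum(delays) / 60.0)   # total minutes spent sleeping
```

With the limit used in the question this gives 11 retries and 4094 seconds of sleeping (roughly 68 minutes), which matches the 11 "retrying" lines in the output above.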

I'm using Debian Lenny with Python 2.5.2, and I installed the ssl and google-api-python-client packages with pip install about a week ago.

Thanks in advance for any help.

EDIT: Apparently, the problem isn't with the API. I tried the same code as above, but with two small files and a 1-hour pause between them (time.sleep()). The output was:

ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 66.89%
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Upload 1 Complete!
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 57.62%
ya29.AHES6ZQd3o1ciwXpNFImH3CK0-dJAtQba_oeIO9DDbIq154
Upload 2 Complete!

For the second upload, a new access token was used successfully. So, perhaps the resumable session is expiring after some time or is only valid for that specific access token?

Answer 1:

I filed an issue on the google-api-python-client project, and according to Joe Gregorio from Google, the problem is in the backend:

"This is an issue with the backend and not with the API or with your code. As you deduced, if the upload goes too long the access_token expires and at that point the resumable upload can't be continued. There is work on progress to fix this issue right now, I will update this bug once the issue is fixed on the server side."



Answer 2:

I assume the problem is that your access token for the remote server expires after the 1-2 hour limit, which cuts off your connection. I would look at your host's API manual. It should have a section on refresh tokens, which get you a new access token (note that some hosts only allow you to use one refresh token per session). If an unlimited number of refreshes is allowed, you can use a combination of a timer and AJAX to keep requesting new access tokens.

If not, you would have to make an AJAX request for a new authorization code and exchange it for a new access token every hour. That sounds like a laborious process, but I think it is the only way if your token keeps expiring.
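
For reference, a refresh boils down to a single POST to the provider's token endpoint. The sketch below (Python 3, standard library only) just builds the form-encoded body described in the OAuth 2.0 specification (RFC 6749, section 6) and does not send it. The helper name build_refresh_request and the placeholder values are assumptions made for illustration, and the endpoint URL is Google's at the time of writing. As the question notes, the authorized http object in google-api-python-client already performs this refresh on its own after a 401, so this only shows what such a request contains.

```python
from urllib.parse import urlencode, parse_qs

# Google's OAuth 2.0 token endpoint; other providers use a different URL.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Return (url, body) for an OAuth 2.0 refresh-token request.

    Hypothetical helper: it only prepares the request and performs
    no network I/O.
    """
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return TOKEN_URL, body


# Placeholder credentials, in the same spirit as the question's script
url, body = build_refresh_request("myclientid", "myclientsecret", "my-refresh-token")
print(url)
print(parse_qs(body)["grant_type"])
```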

On another note, have you tried other methods of uploading? If the above script ran for 1-2 hours and uploaded only 1.44% of the file, a full upload could take 100+ hours, which is far too long for only 3 gigs.
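
The estimate above is a simple linear extrapolation. A minimal sketch (Python 3), assuming a constant upload rate and taking 1.5 hours as the midpoint of the 1-2 hour range; the function name estimate_total_hours is hypothetical:

```python
def estimate_total_hours(percent_done, hours_elapsed):
    """Extrapolate the total upload time, assuming a constant upload rate."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return hours_elapsed * 100.0 / percent_done


# 1.44% uploaded after roughly 1.5 hours (midpoint of the 1-2 hour range)
total = estimate_total_hours(1.44, 1.5)
print("%.1f hours" % total)  # prints "104.2 hours"
```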