How to upload dataset in google colaboratory?

2019-02-27 22:53发布

I need to upload dataset of images in google colaboratory. It has subfolder inside it which contains images. Whatever I found on the net was for the single file.

from google.colab import files

uploaded = files.upload()

Is there any way to do it?

1条回答
爷的心禁止访问
2楼-- · 2019-02-27 23:19

You need to copy your dataset into Google Drive. Then obtain the DATA_FOLDER_ID. The best way to do so, is to open the folder in your Google Drive and copy the last part of html address. For example the folder id for the link:

https://drive.google.com/drive/folders/xxxxxxxxxxxxxxxxxxxxxxxx is xxxxxxxxxxxxxxxxxxxxxxxx

Then you can create local folders and upload each file recursively.

DATA_FOLDER_ID = 'xxxxxxxxxxxxxxxxxxxxxxxx'
ROOT_PATH = '~/you_path'
!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_root_path = os.path.expanduser(ROOT_PATH)
try:
  os.makedirs(local_root_path)
except: pass

def ListFolder(google_drive_id, destination):
  file_list = drive.ListFile({'q': "'%s' in parents and trashed=false" % google_drive_id}).GetList()
  counter = 0
  for f in file_list:
    # If it is a directory then, create the dicrectory and upload the file inside it
    if f['mimeType']=='application/vnd.google-apps.folder': 
      folder_path = os.path.join(destination, f['title'])
      os.makedirs(folder_path)
      print('creating directory {}'.format(folder_path))
      ListFolder(f['id'], folder_path)
    else:
      fname = os.path.join(destination, f['title'])
      f_ = drive.CreateFile({'id': f['id']})
      f_.GetContentFile(fname)
      counter += 1
  print('{} files were uploaded in {}'.format(counter, destination))

ListFolder(DATA_FOLDER_ID, local_root_path)
查看更多
登录 后发表回答