How to write to XLSX file on Google Cloud Storage

2020-02-16 04:13发布

问题:

How can I write to an xlsx file on Google Cloud Storage using python? I'm doing this right now but unsure how to format my write() to add in a row. I want to add in a row ['Sally', 'Ja', 15] to my names.xlsx file on cloud storage.

import cloudstorage

file = cloudstorage.open('/master/names.xlsx')
file.write(##what goes in here?##)
filehandle.close()

回答1:

Objects in Google Cloud Storage are immutable, so you cannot append a new row to an existing object. You would need to rewrite the entire object.

You can get a limited "append" functionality using the compose API, but this only appends bytes to the end of the object and appending bytes to the end of the object, and I don't believe that can easily achieved for XLSX data.



回答2:

As mentioned by Travis , you cannot append but re-rewrite the entire object ,example below(assuming text.csv is your existing file) , you can read the file in dataframe , add some data and copy it using gsutil command to GCP bucket. This will overwrite the previous version of text.csv.

  import pandas as pd
  data = [['Alex','Feb',10],['Bob','jan',12]]
  df = pd.DataFrame(data,columns=['Name','Month','Age'])
  print df

Output

       Name Month  Age
    0  Alex   Feb   10
    1   Bob   jan   12

Add a row

  row = ['Sally','Oct',15]
  df.loc[len(df)] = row
  print df

output

    Name Month  Age
0   Alex   Feb   10
1    Bob   jan   12
2  Sally   Oct   15

write/copy to GCP Bucket using gsutil

 df.to_csv('text.csv', index = False)
!gsutil cp 'text.csv' 'gs://BucketName/folderName/'

write/copy to GCP Bucket using python

`pip3 install xlsxwriter # install package`

python code

from google.cloud import storage
import pandas as pd

#define configurations
bucket_name = 'my_gcp_bucket'
xlsx_file = 'output.xlsx'

#create dataframe
data = [['Alex','Feb',10],['Bob','jan',12]]
df = pd.DataFrame(data,columns=['Name','Month','Age'])
df.to_excel("output.xlsx")

#Upload to Google cloud storage
client = storage.Client()
bucket = client.get_bucket(bucket_name)
blob = bucket.blob(xlsx_file)
blob.upload_from_filename(xlsx_file)