可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I have a googlesheet where a column may contain no information in it. While iterating through the rows and looking at that column, if the column is blank, it's not returning anything. Even worse, if I do a get of a full row and include that common, say get 5 columns, I get back only 4 columns when any of the columns are empty. How do I return either NULL or an empty string if I'm getting a row of columns and one of the cells in a column is empty?
// Build a new authorized API client service.
Sheets service = GoogleSheets.getSheetsService();
range = "Functional Users!A3:E3";
response = service.spreadsheets().values().get(spreadsheetId, range).execute();
values = response.getValues();
cells = values.get(0);
I am getting 5 cells in the row. cells.size() should ALWAYS return five. However if any of the 5 cells are blank, it will return fewer cells. Say only the cell at B3 is empty. cells.size() will be 4. Next iteration, I get A4:E4 and cell D4 is empty. Again, cells.size() will be 4. With no way to know just which cell is missing. If A4 AND D4 AND E4 are empty, cells.size() will be 2.
How do I get it to return 5 cells regardless of empty cells?
回答1:
I've dabbled in Sheetsv4 and this is indeed the behavior when you're reading a range of cells with empty data. It seems this is the way it has been designed. As stated in the Reading data docs:
Empty trailing rows and columns are omitted.
So if you can find a way to write a character that represents 'empty values', like zero, then that will be one way to do it.
回答2:
The way I solved this issue was converting the values into a Pandas dataframe. I fetched the particular columns that I wanted in my Google Sheets, then converted those values into a Pandas dataframe. Once I converted my dataset into a Pandas dataframe, I did some data formatting, then converted the dataframe back into a list. By converting the list to a Pandas dataframe, each column is preserved. Pandas already creates null values for empty trailing rows and columns. However, I needed to also convert the non trailing rows with null values to keep consistency.
# Authenticate and create the service for the Google Sheets API
credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)
http = credentials.authorize(Http())
discoveryUrl = ('https://sheets.googleapis.com/$discovery/rest?version=v4')
service = discovery.build('sheets', 'v4',
http=http,discoveryServiceUrl=discoveryUrl)
spreadsheetId = 'id of your sheet'
rangeName = 'range of your dataset'
result = service.spreadsheets().values().get(
spreadsheetId=spreadsheetId, range=rangeName).execute()
values = result.get('values', [])
#convert values into dataframe
df = pd.DataFrame(values)
#replace all non trailing blank values created by Google Sheets API
#with null values
df_replace = dataset.replace([''], [None])
#convert back to list to insert into Redshift
processed_dataset = df_replace.values.tolist()
回答3:
I experienced the same issue using V4 of the sheets api but was able to workaround this using an extra column at the end of my range and the valueRenderOption
argument for the values.get
API
Given three columns, A, B and C any of which might contain a null value, add an additional column, D and add an arbitrary value here such as 'blank'.
Ensure you capture the new column in your range and add the additional parameter,
valueRenderOption: 'FORMATTED_VALUE'
.
You should end up with a call similar to this:
sheets.spreadsheets.values.get({
spreadsheetId: SOME_SHEET_ID,
range: "AUTOMATION!A:D",
valueRenderOption: 'FORMATTED_VALUE'
}, (err, res) => {})
This should then give you a consistent length array for each value, returning a blank string "" in the place of the empty cell value.
回答4:
The only solution I could find is writing your own function:
def _safe_get(data, r, c):
try:
return data[r][c]
except IndexError:
return ''
def read(range_name, service):
result = service[0].spreadsheets().values().get(spreadsheetId=service[1],
range=range_name).execute()
return result.get('values', [])
def safe_read(sheet, row, col, to_row='', to_col='', service=None):
range_name = '%s!%s%i:%s%s' % (sheet, col, row, to_col, to_row)
data = read(range_name, service)
if to_col == '':
cols = max(len(line) for line in data)
else:
cols = ord(to_col.lower()) - ord(col.lower()) + 1
if to_row == '':
rows = len(data)
else:
rows = to_row - row + 1
return [[_safe_get(data, r, c)
for c in range(cols)]
for r in range(rows)]
回答5:
I know that this is super late, but just in case someone else who has this problem in the future would like a fix for it, I'll share what I did to work past this.
What I did was increase the length of the range of cells I was looking for by one. Then within the Google Spreadsheet that I was reading off of, I added a line of "."s in the extra column (The column added to the array now that the desired range of cells has increased). Then I protected that line of periods so that it can't be changed from the "."
This way gives you an array with everything you are looking for, including null results, but does increase your array size by 1. But if that bothers you, you can just make a new one without the last index of the arrays.