I'm trying to covert a coordinate value in excel to a row number and column number in openpyxl.
For example if my cell coordinate is D4 I want to find the corresponding row and column numbers to use for future operations, in the case row = 3, column = 3. I can get the row number easily using ws.cell('D4').row
which returns 4
then it's just a matter of subtracting 1. But a similar argument ws.cell('D4').column
returns D
and I don't know how to easily get this into int form for subsequent operations. So I turn to you wise folks of stackoverflow. Can you help me?
What you want is openpyxl.utils.coordinate_from_string()
and openpyxl.utils.column_index_from_string()
from openpyxl.utils import coordinate_from_string, column_index_from_string
xy = coordinate_from_string('A4') # returns ('A',4)
col = column_index_from_string(xy[0]) # returns 1
row = xy[1]
openpyxl has a function called get_column_letter that converts a number to a column letter.
from openpyxl.utils import get_column_letter
print(get_column_letter(1))
1 --> A
50 --> AX
1234-- AUL
I have been using it like:
from openpyxl import Workbook
from openpyxl.utils import get_column_letter
#create excel type item
wb = Workbook()
# select the active worksheet
ws = wb.active
counter = 0
for column in range(1,6):
column_letter = get_column_letter(column)
for row in range(1,11):
counter = counter +1
ws[column_letter + str(row)] = counter
wb.save("sample.xlsx")
This is building off of Nathan's answer. Basically, his answer does not work properly when the row and/or column is more than one character wide. Sorry - I went a little over board. Here is the full script:
def main():
from sys import argv, stderr
cells = None
if len(argv) == 1:
cells = ['Ab102', 'C10', 'AFHE3920']
else:
cells = argv[1:]
from re import match as rematch
for cell in cells:
cell = cell.lower()
# generate matched object via regex (groups grouped by parentheses)
m = rematch('([a-z]+)([0-9]+)', cell)
if m is None:
from sys import stderr
print('Invalid cell: {}'.format(cell), file=stderr)
else:
row = 0
for ch in m.group(1):
# ord('a') == 97, so ord(ch) - 96 == 1
row += ord(ch) - 96
col = int(m.group(2))
print('Cell: [{},{}] '.format(row, col))
if __name__ == '__main__':
main()
Tl;dr with a bunch of comments...
# make cells with multiple characters in length for row/column
# feel free to change these values
cells = ['Ab102', 'C10', 'AFHE3920']
# import regex
from re import match as rematch
# run through all the cells we made
for cell in cells:
# make sure the cells are lower-case ... just easier
cell = cell.lower()
# generate matched object via regex (groups grouped by parentheses)
############################################################################
# [a-z] matches a character that is a lower-case letter
# [0-9] matches a character that is a number
# The + means there must be at least one and repeats for the character it matches
# the parentheses group the objects (useful with .group())
m = rematch('([a-z]+)([0-9]+)', cell)
# if m is None, then there was no match
if m is None:
# let's tell the user that there was no match because it was an invalid cell
from sys import stderr
print('Invalid cell: {}'.format(cell), file=stderr)
else:
# we have a valid cell!
# let's grab the row and column from it
row = 0
# run through all of the characters in m.group(1) (the letter part)
for ch in m.group(1):
# ord('a') == 97, so ord(ch) - 96 == 1
row += ord(ch) - 96
col = int(m.group(2))
# phew! that was a lot of work for one cell ;)
print('Cell: [{},{}] '.format(row, col))
print('I hope that helps :) ... of course, you could have just used Adam\'s answer,\
but that isn\'t as fun, now is it? ;)')
Old topic, but the answer is not correct!
dylnmc method was a good one, but has some errors. The calculated row for cell coords like "AA1" or "AAB1" is not correct.
Below is the corrected version as a function.
NOTE: This function returns the real coordinated. If you want to use it for example in ExcelWriter, both ROW and COL should be deducted by one. So replace the last line with return(row-1,col-1)
For example 'AA1' is [1,27] and 'AAA1' is [1,703]; but the python must have them as [0,26] and [0,702].
import re
def coord2num(coord):
cell = coord.lower()
# generate matched object via regex (groups grouped by parentheses)
m = re.match('([a-z]+)([0-9]+)', cell)
if m is None:
print('Invalid cell: {}'.format(cell))
return [None,None]
else:
col = 0
for i,ch in enumerate(m.group(1)[::-1]):
n = ord(ch)-96
col+=(26**i)*(n)
row = int(m.group(2))
return[row,col]
You could just use pure Python:
cell = "D4"
col = ord(cell[0]) - 65
row = int(cell[1:]) - 1
This uses the ord
function which takes a character and returns its ASCII code. In ASCII the letter A
is 65, B
is 66 and so on.