Why is there still a row limit in Microsoft Excel?

2019-02-01 16:19发布

问题:

Until Office 2007, Excel has a maximum of 65,000 rows. Office 2007 bumped that up to a max of 1 million rows, which is nicer of course; but I'm curious -- why is there a limit at all? Obviously, performance will slow down exponetially as you increase the spreadsheet size; but it shouldn't be very hard to have Excel optimize for that by starting with a small sheet and dynamically "re-sizing" it only as needed. Given how much work it must have been to increase the limit from 65K to 1 million, why didn't they go all the way so it's limited only by the amount of available memory and disk space?

回答1:

(updated because of error... A suggestion to everyone: don't post on SO before you are fully awake)

Probably because of optimizations. Excel 2007 can have a maximum of 16 384 columns and 1 048 576 rows. Strange numbers?

14 bits = 16 384, 20 bits = 1 048 576

14 + 20 = 34 bits = more than one 32 bit register can hold.

But they also need to store the format of the cell (text, number etc) and formatting (colors, borders etc). Assuming they use two 32-bit words (64 bit) they use 34 bits for the cell number and have 30 bits for other things.

Why is that important? In memory they don't need to allocate all the memory needed for the whole spreadsheet but only the memory necessary for your data, and every data is tagged with in what cell it is supposed to be in.

Update 2016:

Found a link to Microsoft's specification for Excel 2013 & 2016

  • Open workbooks: Limited by available memory and system resources
  • Worksheet size: 1,048,576 rows (20 bits) by 16,384 columns (14 bits)
  • Column width: 255 characters (8 bits)
  • Row height: 409 points
  • Page breaks: 1,026 horizontal and vertical (unexpected number, probably wrong, 10 bits is 1024)
  • Total number of characters that a cell can contain: 32,767 characters (signed 16 bits)
  • Characters in a header or footer: 255 (8 bits)
  • Sheets in a workbook: Limited by available memory (default is 1 sheet)
  • Colors in a workbook: 16 million colors (32 bit with full access to 24 bit color spectrum)
  • Named views in a workbook: Limited by available memory
  • Unique cell formats/cell styles: 64,000 (16 bits = 65536)
  • Fill styles: 256 (8 bits)
  • Line weight and styles: 256 (8 bits)
  • Unique font types: 1,024 (10 bits) global fonts available for use; 512 per workbook
  • Number formats in a workbook: Between 200 and 250, depending on the language version of Excel that you have installed
  • Names in a workbook: Limited by available memory
  • Windows in a workbook: Limited by available memory
  • Hyperlinks in a worksheet: 66,530 hyperlinks (unexpected number, probably wrong. 16 bits = 65536)
  • Panes in a window: 4
  • Linked sheets: Limited by available memory
  • Scenarios: Limited by available memory; a summary report shows only the first 251 scenarios
  • Changing cells in a scenario: 32
  • Adjustable cells in Solver: 200
  • Custom functions: Limited by available memory
  • Zoom range: 10 percent to 400 percent
  • Reports: Limited by available memory
  • Sort references: 64 in a single sort; unlimited when using sequential sorts
  • Undo levels: 100
  • Fields in a data form: 32
  • Workbook parameters: 255 parameters per workbook
  • Items displayed in filter drop-down lists: 10,000


回答2:

In a word - speed. An index for up to a million rows fits in a 32-bit word, so it can be used efficiently on 32-bit processors. Function arguments that fit in a CPU register are extremely efficient, while ones that are larger require accessing memory on each function call, a far slower operation. Updating a spreadsheet can be an intensive operation involving many cell references, so speed is important. Besides, the Excel team expects that anyone dealing with more than a million rows will be using a database rather than a spreadsheet.