Office 2007 [and higher] interop: retrieve RGB-col

Posted 2019-02-20 12:16

Question:


UPDATE: If you need to determine the RGB color in an Office document (2007 format), see my answer below.

Have:

  • Interop.Word.dll ver.14 from VS2010 PIA,
  • VS2010 Express Edition
  • MS Word 2010 (ver.14)
  • A .docx file created manually in that version of Word, without Interop. The file contains several tables with colored corner cells.

Purpose: To build another .docx file with Interop that contains those tables filled with a gradient based on the colors in their corners.
Where the problem appears: I need to convert the colors in the tables' corner cells from WdColor to System.Drawing.Color to calculate the gradient, so I work with the cell's Shading.BackgroundPatternColor property. I found that sometimes it contains a correct 24-bit BGR value and sometimes it doesn't.
The second case occurs only when the cell has one of the theme palette colors (standard and RGB-palette colors work fine, but theme palette colors cause the problem). For example, when I set the color 0x00F2F2F2 (the lightest gray), it is stored correctly in document.xml inside the .docx archive, but the Shading.BackgroundPatternColor property is set to 0xDC00F2FF, so ColorTranslator.FromOle returns a different color.
Btw, there's no WdColor member for this gray. According to .NET Reflector, the lightest gray in the enum is wdColorGray05 = 0xF3F3F3, which means that not all default palette colors correspond to enum colors.
Also, if I set the same color manually in Word's RGB palette (i.e. 242, 242, 242), save the file and open it again through Interop, the color is set properly as 0x00F2F2F2!
Question: Has anybody had this problem? How do I properly retrieve the RGB color from the Shading.BackgroundPatternColor property? Why doesn't this property correspond to the value stored in document.xml?

Answer 1:

This is the second time I've had problems retrieving RGB colors from Office documents. The first time it was the Excel 2007 .xlsx file format; now it's Word 2010 .docx (still the 2007 format, though). So after a little searching I decided to answer my own question for everyone who runs into the same trouble.

For a deeper explanation and examples, I refer you to the article that helped me a lot. Since the examples in that article are written in VBA and will probably be harder for C# developers to read, I have attached a link to my implementation of an RGB color retriever.

So. If you open one of the Office programs (particularly Excel or Word), you can set the color of most objects, text, backgrounds etc., and a dialog is shown for selecting it. In Office 2007 or higher you'll see a set of 10 standard colors and a set of 60 theme-based colors. If you click 'More colors..' you'll be able to select a color from a predefined color set or from the RGB palette.

The way you choose a color in that dialog determines the format in which it is stored. The property that stores the color value is a 32-bit integer whose most significant byte (let's call it FormatByte) specifies the format, while the remaining 24 bits hold the color value or something else (let's call these 24 bits ColorValue). Here are the possible format specifications:

  • FormatByte == 0x00
    ColorValue is an ordinary BGR value. In C# you can retrieve the RGB color with ColorTranslator.FromOle(ColorValue);. This format is used when you've selected one of the standard colors or one of the colors from the 'More colors..' dialog (predefined or palette).

  • FormatByte == 0xFF
    ColorValue will be 0x000000. This is the wdColorAutomatic value. It's a contrast color, and that's all I know about it (in my case it was always white for the background and black for the font). I haven't researched it further.

  • FormatByte == 0x80
    ColorValue will be in the range [0x000000, 0x000018]. You can encounter these colors in ActiveX controls in a document. They are system KnownColors (there's a C# superset, System.Drawing.KnownColor, which contains those values). If I understood correctly, you can also retrieve RGB with ColorTranslator.FromOle(_color);, where _color is the whole 32-bit property value, because according to the reflected implementation of ColorTranslator.FromOle() it checks whether the color is from the KnownColor enum. But I have never encountered those values while parsing Office files, so I can't confirm it.

  • FormatByte in range [0xD4, 0xDF]
    In this case you're dealing with a theme-based color. It is represented as the index of a base color plus a tint or shade shift.
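
The four cases above can be sketched as a small decoder. This is an illustrative, language-neutral sketch in Python rather than C#; the function names split_office_color, bgr_to_rgb and classify are made up for this example and are not part of any Office API. It only separates FormatByte from ColorValue and names the case; it does not resolve theme or system colors.

```python
def split_office_color(value):
    """Split a 32-bit Office color into (format_byte, color_value)."""
    # Interop returns a signed 32-bit int, so normalise to unsigned first.
    value &= 0xFFFFFFFF
    return value >> 24, value & 0x00FFFFFF


def bgr_to_rgb(color_value):
    """Decode a 24-bit BGR value (0x00BBGGRR) into an (r, g, b) tuple."""
    return (color_value & 0xFF,
            (color_value >> 8) & 0xFF,
            (color_value >> 16) & 0xFF)


def classify(format_byte):
    """Name the storage format selected by the most significant byte."""
    if format_byte == 0x00:
        return "rgb"        # plain BGR value
    if format_byte == 0xFF:
        return "automatic"  # wdColorAutomatic
    if format_byte == 0x80:
        return "system"     # system color index
    if 0xD4 <= format_byte <= 0xDF:
        return "theme"      # theme base color plus tint/shade
    return "unknown"


# The two values from the question:
fb, cv = split_office_color(0xDC00F2FF)
print(hex(fb), classify(fb))          # 0xdc theme
fb, cv = split_office_color(0x00F2F2F2)
print(classify(fb), bgr_to_rgb(cv))   # rgb (242, 242, 242)
```

In C# the same split is a cast to uint followed by a shift and a mask.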

Let's take a closer look at the last case, because it involves many more difficulties.
As you can see, the high half (nibble) of FormatByte is always 0xD and the low half varies from 0x4 to 0xF. This low half is the index of one of the 10 base colors.
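
Extracting that index is a single mask. The sketch below is in Python for illustration, and theme_color_index is a hypothetical helper name. It assumes the low nibble can be used directly as the numeric index. For the question's value 0xDC00F2FF this gives 12, which as far as I know is wdThemeColorBackground1 in Word's enumeration, but treat that mapping as something to verify against the enum.

```python
def theme_color_index(format_byte):
    """Return the low nibble of a theme FormatByte (0xD4..0xDF)."""
    if (format_byte & 0xF0) != 0xD0 or (format_byte & 0x0F) < 0x4:
        raise ValueError("not a theme color format byte: %#04x" % format_byte)
    return format_byte & 0x0F


print(theme_color_index(0xDC))  # 12
```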

There's a wdThemeColorIndex enumeration in Word for that index, and it can be translated to the more basic Office msoThemeColorSchemeIndex enumeration (it lives in Microsoft.Office.Core.dll, which can be referenced from the COM tab as Microsoft Office XX.0 Object Library, where XX.0 is the Office version). You can look at the article linked above for the translation table or VBA function, or at my C# implementation. Given the msoThemeColorSchemeIndex, we can obtain the RGB value from ActiveDocument.DocumentTheme.
Then we retrieve the tint or shade from ColorValue, convert the base color to HSL (hue, saturation, lightness), apply the tint or shade to it and convert the result back to RGB.
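
A minimal sketch of that last step, in Python for illustration. The byte layout is an assumption, not something stated above: the lowest byte of ColorValue is taken as a tint (lightening) factor and the next byte as a shade (darkening) factor, with 0xFF meaning "no change". That assumption is at least consistent with the question's example, where a white base color and ColorValue 0x00F2FF end up as 0xF2F2F2. The base color is passed in as a plain (r, g, b) tuple; reading it from the document theme is left out, and apply_tint_shade is a hypothetical name.

```python
import colorsys


def apply_tint_shade(base_rgb, color_value):
    """Apply the tint/shade encoded in ColorValue to a theme base color."""
    shade = (color_value >> 8) & 0xFF  # assumed: darkness factor, 0xFF = none
    tint = color_value & 0xFF          # assumed: lightness factor, 0xFF = none

    r, g, b = (c / 255.0 for c in base_rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)  # note: colorsys uses H, L, S order

    if shade != 0xFF:
        # Darken: scale lightness towards black.
        l = l * shade / 255.0
    elif tint != 0xFF:
        # Lighten: move lightness towards white.
        l = l + (1.0 - l) * (1.0 - tint / 255.0)

    r, g, b = colorsys.hls_to_rgb(h, l, s)
    return tuple(round(c * 255) for c in (r, g, b))


# White base color with the ColorValue from the question (0xDC00F2FF):
print(apply_tint_shade((255, 255, 255), 0x00F2FF))  # (242, 242, 242)
```

Only lightness is changed here; hue and saturation pass through untouched, which is why the conversion to HSL is needed in the first place.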