Reading a Word file in C#

Posted 2019-04-07 17:19

Question:

I have a Word document I want to parse with C#. There are plenty of tutorials out there, but I have a hard time deciding which library to use. I found the following DLLs:

  1. Microsoft.Office.Interop.Word
  2. Microsoft.Office.Tools.Word
  3. Microsoft.Office.Tools.Word.v4.0.Utilities
  4. COM Microsoft Word 12.0 Object Library
  5. Open XML SDK

These are all I found on the web. Which one should I use? Which of those are obsolete?

Answer 1:

You can also do it using NetOffice.

Site: http://netoffice.codeplex.com/

With it you don't need to worry about Office versions, and it is "Syntactically and semantically identical to the Microsoft Interop Assemblies", so you write your code the same way.

Some other advantages:

  • Office integration without version limitations
  • All objects, methods, properties and events of Office versions 2000, 2002, 2003, 2007 and 2010 are included
  • Attributes and XML source documentation indicate which Office version(s) offer a particular method or property
  • No training needed if you already know the Office object model; reuse your existing PIA code
  • Reduced and more readable code with automatic management of COM proxies
  • No deployment hurdles, no problematic registration, no dependencies, no interop assemblies, no need for VSTO
  • Usable with .NET version 2.0 or higher
  • Easy add-in development
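
For a sense of the Interop-style code this answer refers to, here is a minimal sketch that prints each paragraph of a document through Microsoft.Office.Interop.Word. It assumes Word is installed on the machine, that the project references the interop assembly, and that the document path is passed on the command line; the class name is made up for illustration. Since NetOffice describes itself as syntactically identical, the NetOffice version should look much the same under its own namespace.

```csharp
using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

class InteropReadSketch
{
    static void Main(string[] args)
    {
        // Assumption: the document path is the first command-line argument.
        // Word resolves relative paths against its own working directory,
        // so convert to a full path first.
        string path = Path.GetFullPath(args[0]);

        var app = new Word.Application();   // starts a Word process via COM
        Word.Document doc = null;
        try
        {
            doc = app.Documents.Open(path, ReadOnly: true);
            foreach (Word.Paragraph paragraph in doc.Paragraphs)
            {
                // Range.Text includes the trailing paragraph mark, so trim it.
                Console.WriteLine(paragraph.Range.Text.TrimEnd('\r', '\a'));
            }
        }
        finally
        {
            // Close without saving and shut Word down, even if reading failed.
            if (doc != null) doc.Close(SaveChanges: false);
            app.Quit();
        }
    }
}
```

The try/finally matters here: if the Word process is not quit explicitly, it tends to linger in the background after the program exits.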


Answer 2:

Beth Massi has written several articles on the Open XML SDK on her blog, http://blogs.msdn.com/b/bethmassi, and has even done several screencasts on www.dnrtv.com. Those could give you an idea of what you're up against.
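
To give a rough idea of what Open XML SDK code looks like, here is a minimal sketch that collects the text of every paragraph in a .docx file. It assumes the DocumentFormat.OpenXml package is referenced and that the file is in the .docx format (the SDK does not read the older binary .doc format); `OpenXmlReadSketch` and `ReadParagraphs` are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class OpenXmlReadSketch
{
    // Hypothetical helper: returns the text of every paragraph in the document body.
    public static List<string> ReadParagraphs(string path)
    {
        // Second argument false = open the package read-only.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            // Descendants (rather than Elements) also reaches paragraphs nested in tables.
            return body.Descendants<Paragraph>()
                       .Select(p => p.InnerText)
                       .ToList();
        }
    }

    static void Main(string[] args)
    {
        // Assumption: the document path is the first command-line argument.
        foreach (string text in ReadParagraphs(args[0]))
            Console.WriteLine(text);
    }
}
```

Unlike the Interop route, this reads the file directly and does not need Word installed, which is usually the deciding factor for server-side parsing.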