I have a Word document I want to parse with C#. There are plenty of tutorials out there, but I have a hard time deciding which library to use. I found the following assemblies and SDKs:
- Microsoft.Office.Interop.Word
- Microsoft.Office.Tools.Word
- Microsoft.Office.Tools.Word.v4.0.Utilities
- COM Microsoft Word 12.0 Object Library
- Open XML SDK
These are all I found on the web. Which one should I use? Which of those are obsolete?
You can also do it using NetOffice.
Site: http://netoffice.codeplex.com/
With it you don't need to worry about Office versions, and because it is "Syntactically and semantically identical to the Microsoft Interop Assemblies", you write your code the same way you would against the interop assemblies.
Some other advantages:
- Office integration without version limitations
- All objects, methods, properties and events of the Office versions 2000, 2002, 2003, 2007, 2010 are included
- Attributes and XML source documentation that tell you which Office version(s) offer a particular method or property
- No training if you already know the Office object model, use your existing PIA code
- Reduced and more readable code with automatic management of COM proxies
- No deployment hurdles, no problematic registration, no dependencies, no interop assemblies, no need for VSTO
- Usable with .NET version 2.0 or higher
- Easy Addin Development
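To give an idea of what "the same way" looks like, here is a minimal sketch that prints the text of each paragraph. It assumes the NetOffice Word assemblies are referenced and that Word is installed on the machine, since NetOffice still automates the Word application over COM. The class name and the file path are made up for illustration.

```csharp
using System;
using Word = NetOffice.WordApi;

public static class NetOfficeSketch
{
    public static void Main()
    {
        // Hypothetical path; replace with a real document.
        string path = @"C:\temp\sample.docx";

        // Starts a Word instance in the background.
        Word.Application app = new Word.Application();
        try
        {
            Word.Document doc = app.Documents.Open(path);

            // Same object model as the interop assemblies: Paragraphs, Range, Text.
            foreach (Word.Paragraph paragraph in doc.Paragraphs)
            {
                Console.WriteLine(paragraph.Range.Text);
            }

            doc.Close(false); // close without saving changes
        }
        finally
        {
            // Quit Word and let NetOffice release the COM proxies it created.
            app.Quit();
            app.Dispose();
        }
    }
}
```

The try/finally is there so that a failure while reading does not leave a hidden Word process running.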
Beth Massi has written several articles on the Open XML SDK on her blog, http://blogs.msdn.com/b/bethmassi, and has recorded several screencasts on www.dnrtv.com. They should give you an idea of what you're up against.
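For a first taste of the Open XML SDK, here is a minimal sketch that reads the text of the top-level paragraphs. It assumes the DocumentFormat.OpenXml assembly is referenced and that the file is a .docx (the SDK does not read the old binary .doc format). `DocxReader` and `ReadParagraphs` are names invented for this example, not part of the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class DocxReader
{
    // Returns the plain text of every top-level paragraph in a .docx file.
    public static List<string> ReadParagraphs(string path)
    {
        // Open read-only (second argument: isEditable = false).
        // Word does not need to be installed for this to work.
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            // Elements<Paragraph>() yields direct children of the body only;
            // use Descendants<Paragraph>() to include paragraphs inside tables.
            return body.Elements<Paragraph>()
                       .Select(p => p.InnerText)
                       .ToList();
        }
    }
}
```

Calling `DocxReader.ReadParagraphs(@"C:\temp\sample.docx")` (path hypothetical) returns one string per paragraph. Unlike the interop-style libraries, this works on the file itself rather than driving Word, which is why it is usually the better fit for server-side parsing.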