IFilter or SDK for many file types?

Posted 2019-08-04 12:39

Question:

Does anybody know of an API/SDK or IFilter in .NET that can read the subject ('title' metadata) and text from the following files:

.PDF .DOC .XLS .PPT .CSV .TXT .DOCX .XLSX .PPTX + the OpenOffice and Open Document standards.

Open source would be awesome... but commercial is OK too.

I can't find anything anywhere!

Answer 1:

I don't think you will find a single IFilter that can read the contents of all of those types. Typically, an IFilter targets one specific technology, so you would need several of them side by side.

For example, Adobe has one for PDFs, and Microsoft provides one for Office that handles Word, Excel, PowerPoint, and CSV (I believe that one comes pre-installed with Windows).
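
To make the "one filter per technology" point concrete, here is a minimal sketch of the routing you would end up writing: map each file extension to the filter family that handles it, and flag the files nothing covers. It is plain Python used only to illustrate the idea; the family labels (`adobe-pdf`, `ms-office`) and the function names are made up for this example, and the table contains only the pairings mentioned above. The same shape carries over to a dictionary lookup in .NET.

```python
from pathlib import Path

# Hypothetical labels for the two filter families named in the answer.
# Only the pairings the answer mentions are listed; an extension missing
# from this table means "not covered by this sketch", not "unsupported".
FILTER_FAMILY = {
    ".pdf": "adobe-pdf",
    ".doc": "ms-office",
    ".xls": "ms-office",
    ".ppt": "ms-office",
    ".csv": "ms-office",
}


def filter_family_for(filename):
    """Return the filter family label for a file, or None if unmapped."""
    ext = Path(filename).suffix.lower()
    return FILTER_FAMILY.get(ext)


def group_by_family(filenames):
    """Group file names by filter family; unmapped files go under None."""
    groups = {}
    for name in filenames:
        groups.setdefault(filter_family_for(name), []).append(name)
    return groups


if __name__ == "__main__":
    files = ["Report.PDF", "budget.xls", "notes.odt"]
    for family, names in group_by_family(files).items():
        print(family, names)
```

Files that land under `None` are the ones you would still need to find another filter or library for.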