Using full-text search with PDF files in SQL Server

Posted 2020-07-11 10:27

Question:

I have SQL Server 2008 R2 and am trying to implement full-text search on a PDF BLOB.

I have installed the iFilter from Adobe and confirmed that it is registered using:

EXEC sp_help_fulltext_system_components 'filter';

filter .pdf E8978DA6-047F-4E3D-9C78-CDBE46041603
C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll
11.0.1.36 Adobe Systems, Inc.

I then created a full-text catalog and a full-text index on the table:

CREATE FULLTEXT INDEX ON Compliance_Updates
(
    FileDesc LANGUAGE 1033,
    FileData TYPE COLUMN FileDataType
)
KEY INDEX PK_Compliance_Updates
ON FT_Compliance_Updates;

I then forced a rebuild of the index after adding some PDFs to the table. The catalog properties show:

Catalogue Size : 0MB
Item Count : 2
Unique Key Count : 7
Name : FT_Compliance_Updates
Last Population Date : 12/11/2013 09:36
Population Status : Idle

However, when I run the following search, I get zero results:

SELECT FileID, FileDesc, PubDate 
FROM Compliance_Updates 
WHERE CONTAINS(FileData, 'mortgage')
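
One way to narrow this down is to look at what the index actually contains. The following is a minimal diagnostic sketch, assuming the table and column names from the question; it is not a fix in itself. The first query lists the indexed terms per column using sys.dm_fts_index_keywords, and the second shows the values in the type column, which has to hold the file extension (for example .pdf, with the leading dot) for SQL Server to pick the matching filter. The catalog figures above (7 unique keys for 2 items) would be consistent with only the FileDesc text having been indexed, though that is an inference rather than something the question confirms.

```sql
-- Which terms made it into the full-text index, and from which column?
-- If only words from FileDesc show up, the PDF text was never extracted.
SELECT k.display_term,
       c.name AS column_name,
       k.document_count
FROM sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID('Compliance_Updates')) AS k
JOIN sys.columns AS c
  ON c.object_id = OBJECT_ID('Compliance_Updates')
 AND c.column_id = k.column_id
ORDER BY k.document_count DESC;

-- What does the type column contain? The filter is chosen from this value,
-- so it should read '.pdf' rather than 'pdf' or a MIME type.
SELECT FileDataType, COUNT(*) AS row_count
FROM Compliance_Updates
GROUP BY FileDataType;
```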

I've tried deleting the catalog, removing all the table records and indexes (including the PK), re-running the iFilter install, and running:

exec sp_fulltext_service 'load_os_resources', 1;
exec sp_fulltext_service 'verify_signature', 0;

I've also tried restarting SQL Server and re-creating the indexes and the FT catalog, but nothing seems to work.
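
For reference, the reload-and-repopulate sequence can be sketched as below. It assumes the object names from the question and uses only documented sp_fulltext_service actions; whether it helps depends on the filter itself loading correctly. If the population finishes but the PDF text is still missing, any errors raised by the filter should appear in the full-text crawl log (the SQLFT*.LOG files in the instance's Log directory).

```sql
-- Allow SQL Server to load filters registered with the operating system,
-- and skip signature verification for those components.
EXEC sp_fulltext_service 'load_os_resources', 1;
EXEC sp_fulltext_service 'verify_signature', 0;

-- Restart the filter daemon hosts so a newly installed filter is picked up.
EXEC sp_fulltext_service 'restart_all_fdhosts';

-- Re-crawl every row in the indexed table.
ALTER FULLTEXT INDEX ON Compliance_Updates START FULL POPULATION;

-- Poll the catalog until PopulateStatus returns 0 (idle), then compare counts.
SELECT FULLTEXTCATALOGPROPERTY('FT_Compliance_Updates', 'PopulateStatus') AS populate_status,
       FULLTEXTCATALOGPROPERTY('FT_Compliance_Updates', 'ItemCount')      AS item_count,
       FULLTEXTCATALOGPROPERTY('FT_Compliance_Updates', 'UniqueKeyCount') AS unique_key_count;
```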

Answer 1:

  • Version 11.x didn't work for me, but 9.x did.
  • You also need to append C:\Program Files\Adobe\Adobe PDF iFilter 9 for 64-bit platforms\bin\ to the end of the system PATH variable: Start > Control Panel > System > Advanced > Environment Variables > System variables > PATH.


Answer 2:

Version 11.x didn't work for me either. 9.x works :) The 64-bit 9.x installer is hard to find on Adobe's website, but it is available on their FTP server: ftp://ftp.adobe.com/pub/adobe/acrobat/win/9.x/



Answer 3:

iFilter generally works, but on some machines it does not. I installed it successfully at work but failed on my personal laptop. You can try the following:

  • Install iFilter to a short path that contains no spaces or national (non-ASCII) characters.
  • Grant all users full access to the directory where iFilter is installed. Once it works, you can gradually restrict access.
  • Make sure you add the iFilter bin path to the SYSTEM PATH, not the USER one.

A video walkthrough of these steps can be found here: http://dba-presents.com/index.php/sql-server/48-full-text-search-with-pdf-documents-in-sql-server-2014
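
After reinstalling the filter or changing PATH (and restarting the SQL Server service so it sees the new environment), it may be worth confirming which PDF filter the instance has registered. A minimal check, assuming nothing beyond the standard catalog view sys.fulltext_document_types:

```sql
-- Shows the DLL path and version SQL Server has registered for .pdf documents.
-- No row here means no PDF filter is visible to the instance at all.
SELECT document_type, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';
```

The path column should point at the bin directory of the version you intended to use (9.x rather than 11.x, going by the answers here).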



Answer 4:

FWIW, even with SQL Server 2014, I was not able to get Version 11.x to work and so downloaded Version 9.x from the FTP link kindly provided above. Version 9.x still seems to be the way to go as it also worked for me! :^)