Wednesday, May 19, 2004 9:45 PM Jan Tielens

Searching in SharePoint: IFilter & Indexing PDF Documents

I always tell everyone that SharePoint is very extensible and customizable, and this is really true. For example, let's take a look at the search functionality in SharePoint. By default, only Office documents (stored in a document library, for example) are indexed by the Indexing Service so that they can be found through SharePoint search. Of course, in the real world many more document types are in use; a lot of companies have PDF documents, for example. So I get quite a lot of questions from people asking whether PDF documents can be indexed too. The good news is that the Indexing Service can be extended by using the IFilter interface:

The IFilter interface scans documents for text and properties (also called attributes). It extracts chunks of text from these documents, filtering out embedded formatting and retaining information about the position of the text. It also extracts chunks of values, which are properties of an entire document or of well-defined parts of a document. IFilter provides the foundation for building higher-level applications such as document indexers and application-independent viewers.
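To make the idea of "chunks" a bit more concrete, here is a minimal sketch in Python. It is not the real IFilter COM interface (which exposes methods such as Init, GetChunk, GetText and GetValue); it only imitates the behaviour described above on a made-up markup format. The `Chunk` class, the `toy_filter` function and the `@key: value` syntax are all assumptions invented for this illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    kind: str     # "text" for body text, "value" for a document property
    name: str     # property name, or "body" for text chunks
    content: str  # extracted content with formatting removed
    offset: int   # character position of the chunk in the source


def toy_filter(source: str) -> Iterator[Chunk]:
    """Yield chunks from a hypothetical markup format.

    Lines of the form "@key: value" are treated as properties;
    any other non-empty line is body text, with * and _ markers stripped.
    """
    offset = 0
    for line in source.splitlines(keepends=True):
        stripped = line.strip()
        match = re.match(r"@(\w+):\s*(.*)", stripped)
        if match:
            yield Chunk("value", match.group(1), match.group(2), offset)
        elif stripped:
            plain = re.sub(r"[*_]", "", stripped)  # drop embedded formatting
            yield Chunk("text", "body", plain, offset)
        offset += len(line)  # keep track of where each chunk started


if __name__ == "__main__":
    for chunk in toy_filter("@title: Report\n\nHello *world*\n"):
        print(chunk)
```

An indexer built on top of something like this would store the text chunks in a full-text index and the value chunks as searchable properties, using the offsets to point back into the document.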

Even better news is that Adobe has a free IFilter DLL for PDF documents!

Adobe PDF IFilter is a free, downloadable Dynamic Link Library (DLL) file that provides a bridge between a Microsoft indexing client and a library of Adobe PDF files. It consists of code that understands the Adobe PDF file format as well as code that can interface with the indexing client. When an indexing client needs to index content from PDF documents, it will look in its registry for an appropriate DLL and it will find the Adobe PDF IFilter. Adobe PDF IFilter will return text to the indexing client. The indexing client will then index the results and return the appropriate results to the user.
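The registry lookup mentioned above can be sketched roughly as follows. This is a simplified model of the lookup chain as I understand it (file extension, then persistent handler, then the filter registered for the IFilter interface ID, then the DLL path), not a definitive description of what every indexing client does. The GUIDs for the handler and filter, the DLL path and the function names are hypothetical placeholders, and the registry is faked with a dictionary so the sketch runs anywhere.

```python
from typing import Callable, Optional

# Interface ID of IFilter, used as a subkey name in the lookup chain.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"

# A reader takes a key path under HKEY_CLASSES_ROOT and returns the
# default value of that key, or None if the key does not exist.
Reader = Callable[[str], Optional[str]]


def key(*parts: str) -> str:
    """Join registry path segments with backslashes."""
    return "\\".join(parts)


def find_filter_dll(extension: str, read_default: Reader) -> Optional[str]:
    """Follow extension -> persistent handler -> filter CLSID -> DLL path."""
    handler = read_default(key(extension, "PersistentHandler"))
    if handler is None:
        return None
    filter_clsid = read_default(
        key("CLSID", handler, "PersistentAddinsRegistered", IID_IFILTER)
    )
    if filter_clsid is None:
        return None
    return read_default(key("CLSID", filter_clsid, "InprocServer32"))


# Hypothetical registry contents, for illustration only.
HANDLER = "{00000000-0000-0000-0000-000000000001}"
FILTER = "{00000000-0000-0000-0000-000000000002}"
FAKE_REGISTRY = {
    key(".pdf", "PersistentHandler"): HANDLER,
    key("CLSID", HANDLER, "PersistentAddinsRegistered", IID_IFILTER): FILTER,
    key("CLSID", FILTER, "InprocServer32"): r"C:\hypothetical\pdffilter.dll",
}

if __name__ == "__main__":
    print(find_filter_dll(".pdf", FAKE_REGISTRY.get))
    print(find_filter_dll(".xyz", FAKE_REGISTRY.get))
```

On a real Windows machine the reader could be backed by the standard `winreg` module instead of a dictionary. An extension with no registered filter simply yields nothing, which is why PDF contents are not searchable until the IFilter is installed.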

For more info on how to install it, take a look at Eric Legault's post. If you look on the internet you'll find plenty of other IFilter implementations, for example this one for JPEG files. There's even an IFilter Shop! Some other cool IFilter implementations: Visio 2003, XML, MP3.


Comments

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Wednesday, May 19, 2004 4:11 PM by Patrick Tisseghem

And don't forget that 'forgotten' pdf icon you have to add to the icons.xml file :)

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Friday, May 21, 2004 2:26 AM by Mike Walsh Helsinki

So I was wondering why you were bothering to tell us about the pdf IFilter (as it ought by now to be old news) and nearly about to skip the rest, when I saw the last sentence. Now *that* was new to me! Thanks, Jan.

(People looking for a pdf icon and where the icons.xml file is can search the WSS FAQ - www.wssfaq.com - for "pdf" or "icons" which will give them the item.)

# Where to install the iFilter?

Thursday, June 03, 2004 5:35 AM by Rohan Cragg

It's not clear from any of the documentation I have found so far whether iFilters need to be installed on the SQL Server (where the content database is) or on the IIS Server (where Windows SharePoint Services is running).

Anyone know?

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Monday, June 07, 2004 5:35 AM by sr301

Hi Jan,

I downloaded the IFilter from Adobe's site and installed it. Will my existing query now work as before and also search PDF files? Do I need to add some lines of code to search for text inside PDF files, or is that handled automatically once the IFilter is installed?
I use MS Indexing Service, IIS 6 and Win XP.
My ASP search page uses the
ixsso.query object.
Could you please tell me whether I need to make changes, or whether it should run the same way and simply give me results for PDF files as well?

cheers
sr301

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Wednesday, June 23, 2004 4:35 AM by Rohan Cragg

UPDATE: By trial and error it has become clear that if you have SQL Server on a different machine, you need to install the iFilter on the SQL Server, not on the IIS server.

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Wednesday, June 23, 2004 4:51 AM by NickPark

sr301,
Check the Adobe site... there is an issue with ifilter on XP - you need to make some registry changes to fix it.

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Thursday, June 22, 2006 7:44 AM by Lakshmi

Hi Jan,

I am trying to search data inside the documents uploaded to a document library using SharePoint search in SPS 2003. I have excluded and included all the sites and subsites and also enabled the Advanced Search option. I have also selected the Full Update and Full Crawl options on the site.

But I am not able to see any search results, and I get the error message

"No results were found that match your query. Please consider the following:
Is your query spelled correctly? "

Please let me know how I can enable search to look at content inside the documents in a document library.

Looking forward to your help
Regards
Lakshmi Murthy (MVP - BizTalk)

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Friday, October 13, 2006 5:02 PM by James McDowell

I have been completely unable to implement ifilters on WSS 3.0, but have been able to do at least the pdf iFilter on WSS 2.0 on SQL 2005.

Can anyone help?

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Wednesday, November 29, 2006 1:25 PM by jay.adamson

What specific information can be searched for in SharePoint Services ver 2? Can all fields in the "properties" of a Word document be searched for? Does the Adobe PDF ifilter allow the searching of file properties?

Thanks

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Tuesday, February 06, 2007 1:40 AM by Orgho

Hi Jan,

I built a search application that my client uses without problems, but the client now wants to integrate it with SharePoint.

The client wants to search from inside SharePoint using web parts.

Please suggest a solution.

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Tuesday, March 06, 2007 10:26 PM by Alex

In short, to enable PDF in SharePoint search:

1. Install the IFilter

2. Restart the PC

3. Add a pdf icon to the images folder

4. Add the pdf type in the site config and in the XML doc used for mapping

5. Reindex the site
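Step 4's icon mapping could be scripted along these lines. This is only a sketch: the `DocIcons`/`ByExtension`/`Mapping` layout with `Key` and `Value` attributes is my assumption about what the mapping file looks like, the icon file name `pdficon.gif` is a placeholder, and the function name is made up. It works on an in-memory sample rather than the real file, so check your own server's file before adapting it.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the icon mapping file; a real one has many more entries.
SAMPLE = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif"/>
  </ByExtension>
</DocIcons>"""


def add_icon_mapping(xml_text: str, extension: str, icon_file: str) -> str:
    """Return xml_text with a Mapping for extension added or updated."""
    root = ET.fromstring(xml_text)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        by_ext = ET.SubElement(root, "ByExtension")
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == extension.lower():
            mapping.set("Value", icon_file)  # update an existing entry
            break
    else:
        # No entry for this extension yet, so append a new one.
        ET.SubElement(by_ext, "Mapping", {"Key": extension, "Value": icon_file})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_icon_mapping(SAMPLE, "pdf", "pdficon.gif"))
```

Running it twice with the same extension updates the existing entry instead of adding a duplicate, so it is safe to re-run. The icon image itself still has to be copied to the images folder (step 3).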

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Wednesday, April 25, 2007 6:09 PM by sinnister

Hi Jan,

Does a PDF store a thumbnail view of the document, and if it does, can the IFilter retrieve it?

# re: Searching in SharePoint: IFilter & Indexing PDF Documents

Friday, January 04, 2008 7:16 PM by Ricardo

I followed the steps to do this, and it didn't work. I'm using Windows SharePoint Services, not the SharePoint Portal version.

Can somebody help me?
