Last week I had a few bad days with SharePoint incremental crawls. I would like to share the problem, and the reason behind it, with you all.
We have a SharePoint application that contains a huge amount of data. The main function of the application is searching that data through a custom search page. Availability of the data is very important: anything added or updated must be searchable within 2 minutes of the change.
So we provided the client with a way to start a SharePoint incremental crawl on demand. I had earlier published an article on the topic. You can read it here: Starting an incremental crawl in Share point programmatically.
An incremental crawl would normally take about 1 minute, because it would usually find only one updated document. After we implemented this in the application, it worked great for at least 10 days. Then we found another problem with SharePoint. The incremental crawl usually ran fine, but sometimes, out of the blue, it would take about 7 hours. During such a crawl no documents were indexed (we checked the crawl log to confirm this). After 7 hours or so the crawl would complete, and the following incremental crawls would work as we expected again.
Also, this happened only in the production environment (twice), and at first we could not reproduce it in the test environment.
Later, after some testing, we found that this long, delayed incremental crawl took place only when we were adding users to or removing users from the site. After we added or removed a user, the next incremental crawl would normally take a very long time. In fact, with this experiment we could reproduce the issue in the test and dev environments as well.
After doing some more research we found the reason: after a user is added to or removed from the SharePoint web application, the next incremental crawl is a security-only crawl. When that crawl starts, the security changes (the updated ACLs) must be pushed down to all affected items in the index. Hence this incremental crawl takes a very long time, depending on the size of the index.
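To make the cost a bit more concrete, here is a toy model in Python. It is only an illustration of the idea, not how SharePoint actually stores or updates its index: the `IndexedItem` class, the `inherits_site_acl` flag and the `push_acl_update` function are all names I made up for this sketch. It assumes every indexed item carries its own copy of the ACL, so one membership change at site level means touching every item that inherits site permissions.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class IndexedItem:
    """Hypothetical index entry that keeps its own copy of the ACL."""
    url: str
    inherits_site_acl: bool
    acl: Set[str]


def push_acl_update(index: List[IndexedItem], site_acl: Set[str]) -> int:
    """Copy the new site ACL onto every inheriting item; return how many were touched."""
    touched = 0
    for item in index:
        if item.inherits_site_acl and item.acl != site_acl:
            item.acl = set(site_acl)  # each item gets its own copy
            touched += 1
    return touched


# Build a small index: 1000 inheriting documents plus one with unique permissions.
old_acl = {"alice", "bob"}
index = [IndexedItem(f"/docs/{i}", True, set(old_acl)) for i in range(1000)]
index.append(IndexedItem("/docs/private", False, {"alice"}))

# Adding a single user changes the site ACL and touches every inheriting item.
touched = push_acl_update(index, old_acl | {"carol"})
```

In this model the work grows with the number of items in the index, not with the number of documents that actually changed, which matches what we saw: nothing new indexed, but a very long crawl.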
So if you are having the same problem with incremental crawls, the only simple workaround that I know of right now is to make sure users are added or removed at a time when a long-running incremental crawl does not affect the system very much.
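One way to enforce that is to queue user changes and only apply them during an off-peak window. The sketch below is in Python and is not tied to any SharePoint API: `MembershipChangeQueue`, the 22:00 to 05:00 window and the `apply_change` callback are all assumptions for illustration, and the callback is where your own code for adding or removing the user would go.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import Callable, List, Tuple


@dataclass
class MembershipChangeQueue:
    """Holds add/remove requests until an off-peak window (hypothetical helper)."""
    window_start: time = time(22, 0)  # assumed off-peak start
    window_end: time = time(5, 0)     # assumed off-peak end
    pending: List[Tuple[str, str]] = field(default_factory=list)

    def in_window(self, now: datetime) -> bool:
        t = now.time()
        if self.window_start <= self.window_end:
            return self.window_start <= t < self.window_end
        # The window wraps past midnight, e.g. 22:00 to 05:00.
        return t >= self.window_start or t < self.window_end

    def request(self, action: str, user: str) -> None:
        """Record a change ('add' or 'remove') without applying it yet."""
        self.pending.append((action, user))

    def flush(self, now: datetime, apply_change: Callable[[str, str], None]) -> int:
        """Apply all queued changes if we are inside the window; return how many ran."""
        if not self.in_window(now) or not self.pending:
            return 0
        batch, self.pending = self.pending, []
        for action, user in batch:
            apply_change(action, user)
        return len(batch)


queue = MembershipChangeQueue()
queue.request("add", "alice")
queue.request("remove", "bob")

applied: List[Tuple[str, str]] = []
record = lambda action, user: applied.append((action, user))

daytime_count = queue.flush(datetime(2024, 1, 1, 14, 0), record)  # outside the window
night_count = queue.flush(datetime(2024, 1, 1, 23, 30), record)   # inside the window
```

Batching the changes this way also means several user changes are followed by one long security crawl at night, rather than each change risking a long crawl during working hours.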