Hi,
Last week I had a few bad days with the SharePoint incremental crawl.
I would like to share the problem, and the reason behind it, with
you all.
We have a SharePoint application which contains a huge amount of
data. The main functionality of the application is to search the data
with the help of a custom search page. The availability of data is
very important: anything added or updated must be searchable within
2 minutes of the update.
So we provided the client with a way to start the SharePoint
incremental crawl on demand. I had earlier published an article on the
topic. You can read it here: Starting an incremental crawl in Share point programmatically.
An incremental crawl would normally take about 1 minute, because it
would usually find only one updated document. But after we had
implemented this in the application and seen it working great for at
least 10 days, we found another problem with SharePoint. The
incremental crawl was running fine, but sometimes, out of the blue, it
would take about 7 hours. During such a crawl no document would be
indexed (we checked the crawl log for that). After 7 hours or so the
crawl would complete, and after that the incremental crawl would work
as we expected again.
This happened only in the production environment (twice), and at first we could not replicate it in the test environment.
Later, after some testing, we found that this long, delayed
incremental crawl took place only when we were adding or removing
users from the site. After we added or removed a user, the next
incremental crawl would normally take a very long time. In fact, by
experimenting we could reproduce the issue in the test and dev
environments as well.
After doing some more research we found the reason: after a user is
added to or removed from the SharePoint web application, the next
incremental crawl is a security-only crawl. When that incremental
crawl starts, these security changes (the “updated ACLs”) must be
pushed down to all affected items within the index. Hence this
incremental crawl takes a very long time, depending on the size of
the index.
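
To make the difference in cost concrete, here is a toy model in Python. It is not SharePoint code and does not reflect how the crawler is actually implemented; the `ToyIndex` class and all names in it are made up for illustration. It only shows why a content change touches one item, while an ACL change on a site touches every indexed item under that site.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedItem:
    url: str
    site: str
    acl: frozenset  # users allowed to see this item in search results


@dataclass
class ToyIndex:
    """Hypothetical in-memory stand-in for a search index."""
    items: list = field(default_factory=list)

    def content_crawl(self, changed_urls):
        """Content-only incremental crawl: count the changed documents."""
        changed = set(changed_urls)
        return sum(1 for item in self.items if item.url in changed)

    def security_crawl(self, site, new_acl):
        """Security-only crawl: push the new ACL to every item of the site."""
        touched = 0
        for item in self.items:
            if item.site == site:
                item.acl = frozenset(new_acl)
                touched += 1
        return touched


def build_index(site, count, users):
    """Create a toy index with `count` documents under one site."""
    return ToyIndex(
        [IndexedItem(f"{site}/doc{i}", site, frozenset(users)) for i in range(count)]
    )
```

With 1000 documents under one site, `content_crawl` for a single updated document touches 1 item, whereas `security_crawl` after adding one user touches all 1000. The work grows with the size of the index, not with the number of changed documents.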
So if you are having the same problem with the incremental crawl, the
only simple workaround that I know of right now is to make sure that
users are added or removed at a time when a long-running incremental
crawl does not affect the system very much.
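
One way to enforce this is to queue user changes and apply them only inside an off-peak window. The following is a minimal Python sketch of that idea, not something we actually deployed. The `MembershipChangeQueue` class, the 22:00 to 05:00 window and the `apply_change` callback (which would wrap whatever really adds or removes the user in SharePoint) are all assumptions made for illustration.

```python
from datetime import datetime, time


class MembershipChangeQueue:
    """Holds user add/remove requests until an off-peak window opens."""

    def __init__(self, window_start=time(22, 0), window_end=time(5, 0)):
        # Assumed window: 22:00 to 05:00; adjust to your own quiet hours.
        self.window_start = window_start
        self.window_end = window_end
        self.pending = []

    def in_window(self, now):
        """Return True if `now` falls inside the off-peak window."""
        t = now.time()
        if self.window_start <= self.window_end:
            return self.window_start <= t < self.window_end
        # The window spans midnight.
        return t >= self.window_start or t < self.window_end

    def request(self, action, user):
        """Record a change ('add' or 'remove') without applying it yet."""
        self.pending.append((action, user))

    def flush(self, now, apply_change):
        """Apply all queued changes if inside the window; return the count."""
        if not self.in_window(now):
            return 0
        applied = 0
        while self.pending:
            action, user = self.pending.pop(0)
            apply_change(action, user)  # hypothetical hook into SharePoint
            applied += 1
        return applied
```

Calling `flush` from a scheduled job means that all membership changes of the day land together, so at most one long security crawl is triggered, and it runs while nobody is waiting on the 2 minute search freshness.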
Vikram