Miscellaneous Debris

Avner Kashtan's Frustrations and Exultations

November 2004 - Posts

Some thoughts on SPS Content Indexing [3/3]

...and just one more.

Various indexing problems I've encountered.

  • "Cannot create the file because it already exists - 0x800700B7"

This error was reported in the gatherer log for my Portal_Content index. It occurred several times - once for each Portal Area I had that was derived from a certain Site Definition I had customized. No content in these areas was indexed. This had me stumped for a while, since I couldn't understand what file the message referred to.
The problem, it turned out, was in the Site Definition's XML files.
The sitedef contained a custom list definition, which had its own SCHEMA.XML file with the list's configuration. When customizing this file, I accidentally left the "DefaultView=TRUE" attribute set on two different Views of the list. This apparently caused SPS to choke when trying to index sites based on the definition, leaving them unindexed.
A telltale sign (which should have alerted me sooner) is that when viewing the list of Lists in an area, instances of that custom list are shown twice - with links leading to both default views.
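
A quick way to catch this before deploying a site definition is to count the default views in the file. The following is only a sketch: the class and method names are mine, and it assumes the Views in SCHEMA.XML are <View> elements carrying a DefaultView="TRUE" attribute spelled exactly that way (the comparison is case-sensitive).

```csharp
using System.Xml;

public static class SchemaCheck
{
    // Counts <View> elements that carry DefaultView="TRUE".
    // local-name() is used so the query works whether or not the
    // document declares a default namespace.
    public static int CountDefaultViews(string schemaXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(schemaXml);
        XmlNodeList views =
            doc.SelectNodes("//*[local-name()='View'][@DefaultView='TRUE']");
        return views.Count;
    }
}
```

Feed it the text of SCHEMA.XML (for instance via File.ReadAllText); any result above 1 means the list has more than one default view and is worth fixing before areas are created from it.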

After fixing the SCHEMA.XML, new areas based on the template were indexed properly. To fix existing areas (which already had data in them), I wrote a small console application that connects to the server and uses the Object Model to iterate over all problematic Areas and set the "DefaultView" property of the extra View to false:
// currArea is the Area currently being fixed in the loop.
// "Notices" and "MyCustomView" are the names of my list and its extra view.
SPView non_default_view 
      = currArea.Web.Lists["Notices"].Views["MyCustomView"];
non_default_view.DefaultView = false;
// Update() saves the change back to the server.
non_default_view.Update();

  • Contents of WSS sites listed in a Site Directory aren't indexed - only the site's existence can be found.

A two-step problem here. The first problem I had with indexing my WSS contents was that I was using a custom Site Directory rather than the default one. This was fixed using the method described in the previous article.

The second problem I had was that the sites were added to the portal when I was connected locally to the server, and browsed the http://localhost URL. This caused the newly-created WSS site's internally-saved URL to be under the localhost hostname, rather than my server name. When the crawling engine reached the sites, it noticed that their URLs differed - it was searching under servername, and found localhost instead. Since the default crawling rules don't allow server-hops while crawling, the sites were skipped.
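
The mismatch is easy to see in code. This is only an illustration of the rule as I described it, not the gatherer's actual logic; the class name and the sample URLs are made up.

```csharp
using System;

public static class CrawlHostCheck
{
    // True when both absolute URLs name the same host (case-insensitive).
    // A crawl that does not allow server hops would only follow links
    // for which this holds.
    public static bool IsSameHost(string startAddress, string storedSiteUrl)
    {
        Uri start = new Uri(startAddress);
        Uri stored = new Uri(storedSiteUrl);
        return string.Equals(start.Host, stored.Host,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With a start address of http://servername/ and a site stored as http://localhost/sites/team, IsSameHost returns false, which is exactly the case where my sites were skipped.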

I could not find a way to change the URLs of existing sites through the interface (changing the link in the Sites list only changed the pointer, not the SPSite itself) so I resorted to some quick DB editing - simply fix up the FullUrl column in the Sites table in your content database and all is well.

Well, that's it for my late-night verbal assault. Hope I helped someone.

Posted: Nov 29 2004, 09:55 PM by AvnerK | with no comments
Some thoughts on SPS Content Indexing [2/3]

..and we're back.

  • Enabling multiple Site Directories

By default, a SharePoint installation comes with a Site Directory where WSS sites are listed and found. Each Site Directory holds a custom list called Sites which contains metadata about each site - Owner, Division and Region are among the default properties. We will often wish to add custom fields to the site metadata (which is easy, same as any WSS list), but sometimes we will wish to have different sets of metadata - internal company sites might have a Department or TargetAudience property, while external sites have a Rating property. Maybe we just want to have a logical, visual separation of the two types of site. Whatever.

Creating a second Site Directory is easy:

  •  Enable different Area Templates to be created under the Portal root (by default only Topic areas are created):
    Home Page -> Settings (from the sidebar) -> Page (tab) -> Sub areas can use any template. Now, when we create a sub-area, we can choose the Site Directory template.
  • Create a new Site Directory and customize its metadata. This metadata will automatically be crawled and indexed in the Portal_Content index.
  • By default, all sites defined in the main Site Directory are crawled and their contents added to the index. To have SPS index sites in our new Site Directory, we'll have to create a Content Source for it:
    • Site Settings -> Search Settings -> Other Content Sources -> Add Content Source
    • Choose Sharepoint Portal Server Site Directory in the Non_Portal_Content index.
    • Write the full URL to your new Site Directory. If you're working on the server itself, make sure to write the real URL and not http://localhost/ in there. Add it to the Site Directory source-group, or create a new one if you prefer.
  • Run an update of the Non_Portal_Content index to have sites under the new Directory indexed.
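
If the content source URL comes from a script or a config file rather than being typed in by hand, the localhost trap can be checked up front. A minimal sketch, with a made-up class name; it only relies on the standard Uri.IsLoopback property, which is true for localhost and 127.0.0.1 among others.

```csharp
using System;

public static class ContentSourceUrl
{
    // Returns false for anything that is not an absolute URL,
    // and for URLs that point at the loopback host.
    public static bool IsUsableForCrawl(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return false;
        }
        return !uri.IsLoopback;
    }
}
```

IsUsableForCrawl("http://localhost/SiteDirectory2") is rejected, while the same path under the real server name passes (SiteDirectory2 is just a placeholder path).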
Posted: Nov 29 2004, 09:41 PM by AvnerK | with no comments
Some thoughts on SPS Content Indexing [1/3]

[Warning: I only have Hebrew SPS installations around, so if I make some mistakes when translating the menu options, try to guess what I meant :]

  • Advanced Search Mode

Most of the features mentioned require Sharepoint to be in Advanced Search mode - this is enabled in the Site Settings -> Search Settings page.
SPS will ominously warn about the change being irreversible, but it is harmless - it simply adds some more complexity to the Search administration screens.

  • Crawling custom list items.

A default installation of SPS will automatically create a content index called Portal_Content, which indexes all the data stored in the SPS storage - this includes the main page, Portal Areas, listings, document libraries and custom lists saved in Areas.

For some reason, the default behavior when indexing lists is to crawl over all the items therein, but when searched to return the complete list as the search result, rather than the individual list item. I assume it's done for efficiency, or whatever. We'll often want to change that:

Go to Site Settings -> Search Settings -> Manage Indexes, edit the Portal_Content index and go to the Inclusion Rules page.
We'll see two rules grouped under a heading - the first is an Exclusion rule that prevents ASPX pages from being indexed, the second is the general Inclusion rule to index everything else.

Editing the inclusion rule properties will show two checkboxes that will enable custom item crawls - "Enable Alerts for Individual Items" and "Crawl individual list items". The problem is that they're both disabled.

I don't know why they are, but the solution is simple - erase the rule and recreate it.
Remember to make sure the rule is after the Exclusion rule, points at http://server:port/, and has the bottom two checkboxes set (Be careful - the fourth checkbox automatically sets the third too, so don't uncheck it by accident).

Now run an update on the index to make sure the custom list items are indexed.
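
Why the order of the rules matters can be sketched like this. I am assuming first-match-wins evaluation here, which is my reading of the rule list rather than anything documented; the class names and patterns are illustrative only, and only the * wildcard is supported.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class PathRule
{
    public readonly string Pattern;  // e.g. "http://server/*.aspx"
    public readonly bool Include;    // true = inclusion rule, false = exclusion rule

    public PathRule(string pattern, bool include)
    {
        Pattern = pattern;
        Include = include;
    }
}

public static class RuleList
{
    // Walks the rules in order; the first rule whose pattern matches decides.
    // URLs that match no rule are not indexed.
    public static bool ShouldIndex(IList<PathRule> rules, string url)
    {
        foreach (PathRule rule in rules)
        {
            // Turn the wildcard pattern into an anchored regular expression.
            string regex = "^" + Regex.Escape(rule.Pattern).Replace("\\*", ".*") + "$";
            if (Regex.IsMatch(url, regex, RegexOptions.IgnoreCase))
            {
                return rule.Include;
            }
        }
        return false;
    }
}
```

With the exclusion rule for http://server/*.aspx first and the inclusion rule for http://server/* second, ASPX pages are skipped and everything else is indexed. Swap the two and the inclusion rule matches first, so the exclusion never gets a chance.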

Posted: Nov 29 2004, 09:29 PM by AvnerK | with no comments
Crash and Burn
An annoying bug that had one developer here tearing his hair out (or would have, had he any) - an ASCX user-control that had been working fine for a few weeks started to cause strange Visual Studio crashes or hangs.
The project compiles - and runs - fine, but if he opens the code in Visual Studio in Design mode (or in Code mode and then switches to Design), the Studio would crash when switching to Code view, or hang when saving.
 
We tried this on different machines, and I tried erasing the code-behind file - no go.
When I started going over the ASCX manually, in Notepad (it's a huge ASCX, with lots of different tables that get alternately shown or hidden), I finally spotted the problem - inside a small TD tag somewhere in the middle, the text had been corrupted. The style attribute had about 8000 spaces inserted between the opening quote mark and the first CSS property - a bit hard to see when you're working without word-wrap and you can't see the end of the line.
Erased it in Notepad and reopened it in VS - works like a charm.
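
In hindsight, a few lines of code would have found this faster than scrolling through Notepad. A rough sketch (the names are mine) that reports lines containing a suspiciously long run of spaces or tabs:

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WhitespaceScan
{
    // Returns the 1-based numbers of lines that contain a run of at least
    // minRun consecutive spaces or tabs.
    public static List<int> FindLongRuns(string text, int minRun)
    {
        List<int> hits = new List<int>();
        Regex pattern = new Regex("[ \\t]{" + minRun + ",}");
        string[] lines = text.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            if (pattern.IsMatch(lines[i]))
            {
                hits.Add(i + 1);
            }
        }
        return hits;
    }
}
```

Run it over the contents of the ASCX with a generous threshold such as 100, so ordinary indentation is not reported.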
 
I don't know who to blame for the text corruption - my guess would be ClearCase.
 
[Update: It seems there's a known bug that causes Visual Studio to corrupt files located in ClearCase views]
 
 
Posted: Nov 26 2004, 11:50 AM by AvnerK | with 7 comment(s)