Deconstructing the QueryProvider

SharePoint search capabilties are pretty powerful but when you're dealing with clients, more often than not, you need to build your own search mechanism because the client wants a custom page to look a certain way or they don't want to train users in how to use the advanced search (yeah, it happens) so I thought I would take you down the QueryProvider lane for a bit. Some of this information is documented in the SDK (and repeated here), however you need to sacrifice a small marsupial or skink to the great one if you want to actually try to pull it together to make any use of it. Hence the reason for this blog.

We'll start with the QueryProvider, a simple class that you use to issue searching against SharePoint. The creation of it is documented in the SDK but briefly you build one like this:

TopologyManager topology = new TopologyManager();
PortalSite portal = topology.PortalSites[new Uri("http://servername")];
PortalContext context = PortalApplication.GetContext(portal);
QueryProvider queryProvider = new QueryProvider(context.SearchApplicationName);

Note that the server name in the PortalSite creation must be the NETBIOS name of the server, not "localhost" because that won't work.

Once you have your QueryProvider created the challenge of actually getting a query working is the main cause of anyone tearing their hair out. Why? Because a) the syntax is SQL-like, but not quite and b) trying to do a complex query will blow your head out the back of your skull. I could blog for days just on all the options in the Search query but there are 3 main groups that you need to worry about.

SELECT
This is the basic SELECT statement you would make in SQL and will return the fields you want in your output. Examples are "DAV:href" and "urn:schemas-microsoft-com:office:office#Author". URNs must be specified as delimited identifiers, enclosed in double quotation marks. You have to have at least one field so include something like "DAV:href" or something that will be there for any type of object. Otherwise, your query is going nowhere. Any other fields you add are up to you and will be returned in the DataSet once we execute the query. So at a minimum we're going to have the SELECT statement look like this:

SELECT "DAV:href"

It's also good to put in the "urn:schemas.microsoft.com:fulltextqueryinfo:sdid" column so you don't get an empty DataSet returned if there's no match in your query.

FROM
Okay, now that we have the fields we want, we have to tell SharePoint where we want to select from. This is generally two things:

  • Portal Content (i.e. stuff that exists in the portal)
  • Non-Portal Content (i.e. you know what this should be)

That scope will cover off everything that's in SharePoint and everything in your WSS sites, the Profile Database (for searching people) and other goodies (like if you have your SharePoint server crawling a file share or something). You specify it like this:

FROM ( TABLE Portal_Content..Scope() UNION ALL TABLE Non_Portal_Content..Scope() )

WHERE
Finally we need to specify the WHERE clause, which filters the results of what we specified in our SELECT statement (otherwise we would get the entire contents of the portal coming back).

There are a couple of ways of specifying search criteria. First you can just specify the column name and matching value:

WHERE ("DAV:contentclass" = 'urn:content-class:SPSPeople')

This will give you results of all the profiles in SharePoint. Another way is to find text using the CONTAINS or FREETEXT predicates. Here's a simple example:

WHERE Contains('computer')

This finds anything with the exact word of "computer" in it. CONTAINS allows you to use a wildcard for multiple matches, but only for prefix matching so this will work:

WHERE Contains('computer*')

This won't:

WHERE Contains('*computer')

Here's an example that only finds matches where the FirstName property of the users profile matches "gary".

WHERE Contains("urn:schemas-microsoft-com:sharepoint:portal:profile:FirstName", '"gary"')

This uses the column name (the first parameter to the Contains predicate) which tells us to only compare against the FirstName column. You can use "ALL" or "*" if you want to search all columns. The CONTAINS predicate is better suited for "exact matches, in contrast to the FREETEXT predicate, which is better suited to finding documents containing combinations of the search words spread throughout the column.

You can also combine the search values (like looking for people with Gary as their first name) with the content-class so, for example, add this to your where clause to only search People records (rather than documents or sites by Gary):

WHERE (Contains("urn:schemas-microsoft-com:sharepoint:portal:profile:FirstName", '"gary"')
AND
("DAV:contentclass" = 'urn:content-class:SPSPeople'))

Again, basic SQL syntax so remember to get your quotes, double quotes, and brackets correct as that's a common mistake trying to debug a query. There are other SQL statements that you can use here like ORDER BY (to specify how the returned values are sorted) and RANK BY. Additionaly you can also weight words using the IsAbout type, and other neat tricks like "this" NEAR "that". Check out the SDK Topic "General Query Language Information" for all the dirt. This posting is just meant to get your started.

TOOLS
There are some great tools out there to help you with building queries. One of them is the SharePoint Query Builder Tool. Invaluable as it will look at your SharePoint server, grab all the content sources and properties and build the SQL for you. You can then just use this as-is in your code or update it to accept values from say a TextBox control in a Web Part. Very handy to explore the Search features of SharePoint and see what works (and what doesn't) so I recommend you getting this. It's available free from here on GotDotNet.

EXAMPLE
Here's a simple example that gives you the bare minimum. It searches the Profile directory for people with first name of "John" and returns the link to the users MySite page.

SELECT
"DAV:href"
FROM
( TABLE Portal_Content..Scope() UNION ALL TABLE Non_Portal_Content..Scope() )
WHERE
Contains("urn:schemas-microsoft-com:sharepoint:portal:profile:FirstName", '"John"') AND
("DAV:contentclass" = 'urn:content-class:SPSPeople')

Now take this query and call QueryProvider.Execute(query) with it (where query is the query above). The result is a DataSet but since I can't show you a DataSet, here's the result in XML (from the Web Service, but same results either way):

<ResponsePacket xmlns="urn:Microsoft.Search.Response">
<Response domain="QDomain">
<Copyright>Microsoft (c) Office SharePoint (tm) Portal Server 2003</Copyright> 
<Range>
<StartAt>1</StartAt> 
<Count>40</Count> 
<TotalAvailable>1</TotalAvailable> 
<Results>
<Document xmlns="urn:Microsoft.Search.Response.Document">
<Action>
<LinkUrl fileExt="aspx">http://spsdev/MySite/Public.aspx?guid=3751B6E1-34C0-4B93-963B-B827BCFDF8A1</LinkUrl> 
</Action>
<Properties xmlns="urn:Microsoft.MSSearch.Response.Document.Document">
<Property>
<Name>DAV:href</Name> 
<Type>String</Type> 
<Value>http://spsdev/MySite/Public.aspx?guid=3751B6E1-34C0-4B93-963B-B827BCFDF8A1</Value> 
</Property>
</Properties>
</Document>
</Results>
</Range>
<Status>SUCCESS</Status> 
</Response>
</ResponsePacket>

You can take this and use XSLT to format it. If you're using the DataSet, just bind it to a grid and you're off to the races. A more advanced blog would feature connecting this to the Search Results web part, but I'm not advanced and it's late on a Sunday evening so you should be able to take this and run with it, building your own queries and getting the results you want.

To get to really understand what parameters you need, you'll need to check out the mapped properties in your search. Just crack open your portal and got to the Manage Properties of Crawled Content page (http://servername//_layouts/1033/Properties.aspx). This will show you the names of the properties that have been brought over from ActiveDirectory (for people) and ones that have been crawled from documents and content (like everything under urn:schemas-microsoft-com:office:office for Office documents).

Also you might want to consider building a class or some helper methods to build your queries as you can get quite complex. Imagine if you were searching for all employees with a first name of "John" who reported to a manager named "Bob" and had a phone number starting with "416". Yeah, it can be a pretty ugly SQL statement at the end (and if you do manage to create one that complex send it to me so I have something to scare the kids next Halloween).

Have fun!

No Comments