Erik Porter's Blog

Life and Development at Microsoft and Other Technology Discussions

News

    SQL WHERE List Matches Any or All

    I saw a cool post recently from Jon Galloway called "Passing lists to SQL Server 2005 with XML Parameters".  This is a pattern I've used several times while building the new version of Channel 9.  If you'd like to learn how to pass in lists to stored procedures, check out Jon's post.

    One of the times I've used it is to search our database for all entries from two of our forums.  For this example, we'll say Techoff and Sandbox.  Once you have a temp table with the two forum ids (actually forums in our system are just tags too) you can just do a where in statement like the following:

    SELECT e.* FROM Entry e INNER JOIN EntryForum ef ON e.EntryID = ef.EntryID WHERE ef.ForumID IN (SELECT ForumID FROM ForumList)

    Note: This is all pseudo code to represent the basics of how we do this.  This is not the exact code.

    This selects all the entries (or posts) from our database that are from the list of forums I passed into the ForumList temp table.  WHERE IN specificies that all rows be returned that match ANY of the records in my temp table.  The following statement would be equivalent and work exactly the same.

    SELECT e.* FROM Entry e INNER JOIN EntryForum ef ON e.EntryID = ef.EntryID WHERE ef.ForumID = @ForumID1 OR ef.ForumID = @ForumID2

    Note: In the above example, @ForumID1 and @ForumID2 have the values that were stored in the ForumList temp table in the example above that one.

    This works pretty well.  The other thing we do with passing in lists though is selecting only the entries that match ALL (not ANY) of the list we pass in to the stored procedure.  The example of this is when searching by multiple tags.  So for instance, you want to search on our site for all content that contains information on WPF AND WCF.  The previous example won't work.  It would instead need to be something like this...

    SELECT e.* FROM Entry e INNER JOIN EntryTag et ON e.EntryID = et.EntryID WHERE et.TagID = @TagIDWPF AND et.TagID = @TagIDWCF

    Using WHERE IN, we can't do this (at least I couldn't find anything in the docs or internet searching to say otherwise).  Duncan helped figure out the idea on how to do this and here is the implentation I came up with:

    DECLARE @TagCount int
    DECLARE @Tags TABLE (TagID bigint)
    DECLARE @Entries TABLE (EntryID bigint)

    SELECT @TagCount = COUNT(*) FROM @Tags
    ;
    WITH Entries(EntryID, MatchCount) AS
    (
        SELECT
            e.EntryID,
            COUNT(DISTINCT t.TagID) AS MatchCount
        FROM
            Entry e
                INNER JOIN
            EntryTag et
                ON
                    e.EntryID = et.EntryID
                INNER JOIN
            @Tags t
                ON
                    et.TagID = t.TagID
        GROUP BY
            e.EntryID
    )
    INSERT INTO @Entries (EntryID) SELECT EntryID FROM Entries WHERE MatchCount = @TagCount

    What is this code doing?  Well, first, it's doing a count on the tags that were passed in (again, from XML turned into a temp table) and storing it in a variable.  Then, it creates a Common Table Expression or CTE around a query that returns all the entries that match the tag list and how many of those tags it matches up with.  If you're not familiar with CTEs, they're basically a wrapper around a query so you can write a query against it.  Kind of like a subquery, but much more organized.  Recursive CTEs are particularly powerful and cool, but that's another blog post.  So then after creating the CTE, fill another temp table with everything from the CTE where the MatchCount equals the count of how many tags were passed in originally.  This means that the entry returned had ALL the tags passed in associated with it.  So this will now only return entries that match ALL of the tags from the list that I passed in (stored in @Tags).  I hope this helps someone.  :)

    UPDATE: Check out the first comment from Bryan.  He points out a slightly better implementation.  Thanks, Bryan!

    Comments

    Bryan said:

    You can replace your entire query with this and it will even work on 2000.

    DECLARE @TagCount int

    DECLARE @Tags TABLE (TagID bigint)

    DECLARE @Entries TABLE (EntryID bigint)

    SELECT @TagCount = COUNT(*) FROM @Tags

    INSERT INTO @Entries (EntryID)

    SELECT

       e.EntryID

    FROM

       Entry e

           INNER JOIN

       EntryTag et

           ON

               e.EntryID = et.EntryID

           INNER JOIN

       @Tags t

           ON

               et.TagID = t.TagID

    GROUP BY

       e.EntryID

    HAVING COUNT(DISTINCT t.TagID) = @MatchCount

    # March 27, 2007 4:56 PM

    HumanCompiler said:

    Thanks, Bryan!  HAVING is exactly what I needed for that.  The original code I posted wouldn't be too much slower than what you have there, but a little.  HAVING cleans it up nicely.  Thanks!

    # March 28, 2007 4:14 PM

    AL|EN said:

    A view example without any SPs, I hope it helps, worked for me ;)

    CREATE VIEW dbo.ithaller_calisanlar

    AS

     select

    TOP 100 PERCENT

    * from dbo.mlz_kart where fabrika IN (SELECT Depo2007_car_kart.r_sayac from Depo2007_car_kart where LEFT(Depo2007_car_kart.kod,2) like 'SI%' or LEFT(Depo2007_car_kart.kod,2) like 'KI%') and dbo.mlz_kart.devam_durumu='C'

    order by dbo.mlz_kart.kod,dbo.mlz_kart.ozel_kod

    # April 9, 2007 3:08 AM