-[Danny Chen]- Blog of an ASP.NET QA tester

Tips and info about Site Navigation, ImageMap, Menu and other cool ASP.NET v2.0 features.

Dealing with concurrency issues in custom SiteMapProviders

I spend a lot of time on the ASP.NET forums.  Hopefully my efforts over there help steer users into the right direction of how to use my features and ASP.NET properly and most effectively.  But sometimes I see some really alarming things.  Recently, the most scary thing I've seen is some quite improperly written SiteMapProviders.  Unfortunately, I'm not saying there are just plain buggy or wrongly implemented.  Those are the obvious kinds of issues that the developers will find easily through testing and don't really bother me that much.  What does bother me is code that will only fail when it's deployed and stressed. 

Here's an example, what's wrong with the following code:

' BAD CODE EXAMPLE
Public Class BadProvider
    
Inherits StaticSiteMapProvider

    
Private _rootNode As SiteMapNode

    
Public Overrides Function BuildSiteMap() As SiteMapNode
        
If _rootNode Is Nothing Then            
            AddNode(newNode1,
Nothing)
            AddNode(newNode2, newNode1)
            AddNode(newNode3, newNode1)

            _rootNode = newNode1
        
End If

        Return _rootNode
    
End Function

    Protected Overrides Function GetRootNodeCore() As SiteMapNode
        
Return BuildSiteMap()
    
End Function

    Protected Overrides Sub Clear()
        _rootNode =
Nothing
        MyBase.Clear()
    
End Sub

    Protected Sub MyInvalidationMethod()
        Clear()
    
End Sub
End
Class

Give up yet?  Simply and generally stated, the problem is that it's not Thread Safe Why not?  Well, perhaps this is the hardest concept for users to either grasp or accept about the Provider Model.  Providers are instantiated as a single instance across the app domain.  This means a single provider is shared by all of the worker processes (ie all the requests) that come into the site. 

In this example, one particular request triggers the Clear() event while the others all query data.  Each query to data implicitly calls into BuildSiteMap out of necessity.  There are several potential problems here:

   1) A race condition between BuildSiteMap and Clear trying to modify the data at the same time
   2) Simultaneous threads executing the BuildSiteMapCode at the same time
   3) A race condition between BuildSiteMap, Clear, and the calling page performing future accesses

These are not new issues presented by the SiteMapProvider model, these are fundamental issues that need to be dealt with in any multithreaded design.  Ok, I've talked the talk, how do I walk the walk.  With ASP.NET, it's relatively easy.  Simply use the lock(object) (C#) or SyncLock object (VB) features.  The important part is to know what to lock.  The StaticSiteMapProvider protects all of it's APIs but anytime an API is overridden that can modify the data, it needs to be protected.  This means:  AddNode, RemoveNode, Clear, and BuildSiteMap.  Here's the previous example properly coded.  If AddNode or RemoveNode needs to be overriden, they should be overridden the same way Clear has been.

Public Class GoodProvider
    
Inherits StaticSiteMapProvider

    
Private lockObj As New Object
    Private _rootNode As SiteMapNode

    Public Overrides Function BuildSiteMap() As SiteMapNode
        
Dim _tempRoot As SiteMapNode = _rootNode

        
If _tempRoot IsNot Nothing Then
            Return _tempRoot
        
End If

        SyncLock lockObj
            
If _rootNode Is Nothing Then
                MyBase.Clear()
                AddNode(newNode1,
Nothing)
                AddNode(newNode2, newNode1)
                AddNode(newNode3, newNode1)

                _rootNode = newNode1
            
End If
            Return _rootNode
        
End SyncLock
    End Function


    Protected Overrides Function GetRootNodeCore() As SiteMapNode
        
Return BuildSiteMap()
    
End Function

    Protected Overrides Sub Clear()
        
SyncLock lockObj
            _rootNode =
Nothing
        End SyncLock
    End Sub

    Protected Sub MyInvalidationMethod()
        Clear()
    
End Sub
End
Class

public class GoodProvider : StaticSiteMapProvider
{
    
private object _lockObj = new object();
    
private SiteMapNode _rootNode;

    public override SiteMapNode BuildSiteMap()
    {
        
SiteMapNode _tempRoot = _rootNode;

        
if (_tempRoot != null)
            
return _tempRoot;

        
lock (_lockObj)
        {
            
if (_rootNode == null)
            {
                
base.Clear();
                AddNode(newNode1,
null);
                AddNode(newNode2, newNode1);
                AddNode(newNode3, newNode1);

                _rootNode = newNode1;
            }
            
return _rootNode;
        }
    }


    
protected override SiteMapNode GetRootNodeCore()
    {
        
return BuildSiteMap();
    }

    
protected override void Clear()
    {
        
lock (_lockObj)
        {
            _rootNode =
null;
        }
    }

    
protected void MyInvalidationFunction()
    {
        Clear();
    }

}

One thing to look more closely at in this example is the BuildSiteMap method.  As I previously mentioned, all the data reading APIs of the SiteMapProvider call into BuildSiteMap to ensure the provider data has been populated.  In short, it gets called a lot.  Because of this, it's important that the primary flow of that function not be locked and only the creation portion does.

Update 1/12/2006: I was alerted to a couple remaining race conditions in my good code so I've updated BuildSiteMap to eliminate them.  Thanks Joe for looking at this.

Comments

hagay said:

Thanks for the examples , i beleive that you missed one line in the C# example :
after
if (_rootNode == null)
{
base.Clear();
AddNode(newNode1, null);
AddNode(newNode2, newNode1);
AddNode(newNode3, newNode1);
}
there should be _rootNode = newNode1;
# January 8, 2006 10:15 AM

Danny Chen said:

Yes, you are correct. I will correct the code sample. Thanks for pointing this out to me.
--
Danny
# January 9, 2006 11:54 AM

AndrewSeven said:

Could it be that people are expecting the framework to take care of concurrency issues with the providers?
# January 20, 2006 4:24 PM

Danny Chen said:

The underlying providers themselves for any concurrency in calls to the APIs inside themselves. In fact, this example was taken by observing the strategy being used in our providers and applying it to an extended provider.

However, the framework cannot account for code that the user may write. In this example, there are two potential risk areas:

1) Multiple threads entering BuildSiteMap simultaneously before the _rootNode value is set
2) A race condition between Clear and BuildSiteMap

Since both of these methods contain implementations only known to the coder, I hope it's clear that there is no way the framework can account for what is written in these functions.

I really want encourage users to extend the functionality of our base classes into objects that solve their challenges. But, as a tester, I also want to encourage the use of good techniques and awareness of potential issues to stop bugs before they cause failures. The effort of this blog was to address a trend I had seen in many samples posted by users to the ASP.NET forums and hopefully help them design with concurrency in mind instead of spending even more effort to debug intermittent failures.
# January 20, 2006 4:47 PM

Paul said:

Great article but I have noticed a potential problem....

SiteMapNode _tempRoot = _rootNode;

if (_tempRoot != null)

   return _tempRoot;

Since _tempRoot is a reference to _rootNode and _rootNode then the statement could return an incomplete SiteMap since it's being populated within the lock further down the BuildSiteMap method.

I can't see anyway to short circuit the lock in this instance?

# November 14, 2007 11:22 AM
Leave a Comment

(required) 

(required) 

(optional)

(required)