Follow me on Twitter at Twitter.com/wbm
FYI, I'm blogging most of my stuff over at More Wally now.
You might want to add my rss feed to your reader at:http://morewally.com/cs/blogs/wallym/rss.aspx
TSql vs. SQL CLR Performance Analysis - Wallace B. McClure

Wallace B. McClure

All About Wally McClure - The musings of Wallym on Web, HTML5, Mobile, Xamarin.iOS, Xamarin.Android, and Windows Azure.

News

Visual Studio Magazine Column Personal Blog

.NET

Book Authors

Business

Family

Friends

Georgia Tech Bloggers

Personal

Archives

TSql vs. SQL CLR Performance Analysis

Original Post: http://morewally.com/cs/blogs/wallym/archive/2006/12/27/616.aspx


I went and spoke to the Michiana .NET User Group on November 21, 2006.  The talk was in South Bend, IN.  The subject of the talk was Sql Server 2005 CLR Objects.  Whenever I do the talk, I got get the question of “Which should I do?  TSql or CLR objects.”  Personally, my generic answer has always been, that if a developer can perform something in TSql, its best to do it in TSql.  If you can perform the necessary operations using TSql, in general the performance will be better.  This is before one takes into account the project overhead of managing the source code for stored procedures, triggers, and other database objects.

 

Once you decide that your scenario actually needs a CLR Object, what’s the best way to perform the operation?  Should this be done entirely within a CLR Object?  I have decided to take a look at this question based on the WebSearch spider and its use of CLR Objects to perform certain data operations.

 

Question: What is the best way to implement a CLR Object within a Sql Server 2005 Database?  Which will perform the best?


Comment: Terms like "best" are highly subjective.  As a result, the question that was answered is which performs the best.  Obviously, that type of question is very dependent on the scenario.  As a result, please test this in your own shop to get the best results for your application.

 

Short Answer:  Do as much as you possibly can in TSql.  Use CLR Objects judiciously.  Once the need is determined, perform as many operations in TSql and use CLR Objects only as absolutely needed.

 

Much Longer Answer: First off, not all situations are the same.  In the following example, we are looking at this from the basis standpoint of keeping data synchronized within a single database table.  To answer this question, I have built three different solutions to this problem:

  • Mostly TSql.  A TSQL trigger which creates a cursor and calls out to some custom CLR Functions on an individual basis.
  • TSql Cursor.  A TSQL trigger that creates a cursor that calls out to some custom CLR Functions through a look within the trigger.
  • A CLR Trigger which uses a Generic List <String> to hold the pending data along with a SqlDataReader to iterate through data.  Afterwards, each url is updated by iterating through the Generic List.
  • A CLR Trigger which uses a DataSet/DataTable to hold the data and then the Sql Data Adapter’s .Update method to transfer the data back.

The relative performance of each was measured:

 

Average time

Standard Deviation

Mostly TSql (base)

1x

10 / 1x

TSql Cursor

1.42x

.61x

CLR with Generic List

2.04x

10.4x

CLR with Datasets

4.65x

15.6x

 

Performance Analysis

Now that we have seen the performance differences of the four solutions, let’s try and analyze the differences between them. 

The “Mostly TSql” solution performs the best.  This is not a surprise.  TSql has been around for a number of years.  It has been embedded within SQL Server and has a highly optimized relationship with the database.    While embedding the CLR with the database, it provides its own overhead, so its not a shock the “Mostly TSql” solution performs the best.

The TSql Cursor solution takes 1.42 times as long as the “Mostly TSql” solution to perform the same work.  This is not surprising as cursors generally take longer to perform a given operation compared to comparable sql commands.

The real surprise in this test is the performance of the CLR trigger which used a Generic List to hold it’s the sql commands that are to be sent back to the database.  The Generic List Trigger has performed more than twice as fast as the Dataset Trigger.  Personally, I would think that the Dataset would be more optimized in this scenario, however, it does not perform as well.  This is probably due to much more overhead with than a ‘simpler” Generic List.

 

Note: BobB suggested this reason for the Dataset not performing as well:

-- IMHO --

Arrays (especially multi-dimensional arrays) *might* be faster if you're doing array-only processing, preferrable (sometimes) over gobs of temp tables. But using a heavyweight object like a DataSet takes memory management away from the SQL engine. SQL engine is always best when it managed all the data buffers.

-- IMHO --

How was the data created?  It is important to communicate how the above data was calculated. 

1.      In each situation, approximately 15 updates were performed against the 1000 records in the table.  The command is “UPDATE TBLSEARCHRESULTS SET SERVERNAME=NULL”

2.      When testing was performed, the triggers that were not being tested were not in installed on the database table.

3.      The first 5 updates were not used for the calculation.  This is due to the necessary startup and JIT operations that are necessary for the objects within the database.  The idea is to simulate a running system.  As a result, startup isn’t a major concern for this test.

4.      The sql commands AVG() and STDEV() were used to create the data in the above table.

 

The test platform for the scenario was:

  • Laptop.
  • 1.8 GHz Pentium 4.
  • 1 gigabyte of RAM.
  • 60 gigabyte hard drive.
  • Windows 2003 Server with Service Pak 1 and upto date security updates through Windows Update.
  • Sql Server 2005 Service Pak 1.

Areas of concern/problems

There are a number of issues and assumptions made within this test that could cause problems with the data that was generated.

A number of Assumptions have been made in this test:

  • There is no need to go outside of the single database.  If there was a need to go outside of the database, this test would not be applicable to your needs.
  • Only updates are tracked.  No inserts were tracked during this testing.
  • This was tested using a laptop, not a multi-processor database server with an appropriate drive system layout.  Different results for different hardware layout would not be unexpected.
  • I don’t believe I have any Sql Injection attack openings, but I might be wrong.

Note: Bob Beauchemin took a look at this before I published it.  Many thanks to him for taking the time to look at it and provide a couple of suggestions and insight.

 

The Source Code

The source code for the three scenarios is included below.

TSQL Trigger

 

USE [dbWebSearchYukon]

GO

/****** Object:  Trigger [dbo].[Trigger1]    Script Date: 11/28/2006 11:42:55 ******/

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE TRIGGER [dbo].[Trigger1]

ON [dbo].[tblSearchResults]

FOR INSERT

AS

BEGIN

            UPDATE TBLSEARCHRESULTS SET

                        SEARCHCODE=DBO.CALCULATESEARCHCODE(SEARCHURL),

                        SERVERNAME=DBO.CALCULATESERVERNAME(SEARCHURL),

                        DATEENTERED=GETDATE(),

                        DATEUPDATED=GETDATE()

                        WHERE TBLSEARCHRESULTSID IN (SELECT TBLSEARCHRESULTSID FROM INSERTED)

END

 

USE [dbWebSearchYukon]

GO

/****** Object:  Trigger [dbo].[Trigger2]    Script Date: 11/28/2006 11:43:29 ******/

SET ANSI_NULLS ON

GO

SET QUOTED_IDENTIFIER ON

GO

CREATE TRIGGER [dbo].[Trigger2]

ON [dbo].[tblSearchResults]

FOR UPDATE

AS

BEGIN

            UPDATE TBLSEARCHRESULTS SET

                        SEARCHCODE=DBO.CALCULATESEARCHCODE(SEARCHURL),

                        SERVERNAME=DBO.CALCULATESERVERNAME(SEARCHURL),

                        DATEUPDATED=GETDATE()

                        WHERE TBLSEARCHRESULTSID IN (SELECT TBLSEARCHRESULTSID FROM INSERTED)

END

 

TSql Cursor

set ANSI_NULLS ON

set QUOTED_IDENTIFIER ON

GO

ALTER TRIGGER [dbo].[TR_TBLSEARCHRESULTS_UPDATE] ON [dbo].[tblSearchResults] FOR UPDATE

AS

BEGIN

            SET NOCOUNT ON;

            DECLARE @SEARCHURL VARCHAR(4096)

            DECLARE @SEARCHRESULTSID INT

            DECLARE SEARCHRESULTSCURSOR CURSOR FOR SELECT TBLSEARCHRESULTSID, SEARCHURL FROM INSERTED

            OPEN SEARCHRESULTSCURSOR

            FETCH NEXT FROM SEARCHRESULTSCURSOR INTO

                        @SEARCHRESULTSID, @SEARCHURL

 

            WHILE @@FETCH_STATUS = 0

            BEGIN

                        UPDATE TBLSEARCHRESULTS SET SERVERNAME=DBO.CALCULATESERVERNAME(@SEARCHURL),

                                    SEARCHCODE=DBO.CALCULATESEARCHCODE(@SEARCHURL), DATEUPDATED=GETDATE() WHERE TBLSEARCHRESULTSID=@SEARCHRESULTSID

                        FETCH NEXT FROM SEARCHRESULTSCURSOR INTO @SEARCHRESULTSID, @SEARCHURL

            END

            CLOSE SEARCHRESULTSCURSOR

            DEALLOCATE SEARCHRESULTSCURSOR

END

 

VB.NET CLR Functions

Imports System

Imports System.Data

Imports System.Data.SqlClient

Imports System.Data.SqlTypes

Imports Microsoft.SqlServer.Server

 

Partial Public Class UserDefinedFunctions

    <Microsoft.SqlServer.Server.SqlFunction(DataAccess:=DataAccessKind.None, SystemDataAccess:=SystemDataAccessKind.None)> _

    Public Shared Function CalculateServerName(ByVal pstrUrl As String) As String

        Dim strTemp As String

        Dim strHttp As String = "http://"

        Dim iLengthHttp As Integer = strHttp.Length

        Dim iLengthTillSlash As Integer = pstrUrl.IndexOf("/", iLengthHttp, pstrUrl.Length - iLengthHttp)

        If (iLengthTillSlash = -1) Then

            strTemp = pstrUrl.Substring(strHttp.Length, pstrUrl.Length - iLengthHttp)

        Else

            strTemp = pstrUrl.Substring(iLengthHttp, iLengthTillSlash - iLengthHttp)

        End If

        Return (strTemp)

    End Function

 

    <Microsoft.SqlServer.Server.SqlFunction(DataAccess:=DataAccessKind.None, SystemDataAccess:=SystemDataAccessKind.None)> _

    Public Shared Function CalculateSearchCode(ByVal pstrurl As String) As Long

        Return (pstrurl.GetHashCode())

    End Function

End Class

 

 

VB.NET Trigger with Dataset

Imports System

Imports System.Data

Imports System.Data.SqlClient

Imports System.Data.SqlTypes

Imports Microsoft.SqlServer.Server

 

 

Partial Public Class Triggers

    ' Enter existing table or view for the target and uncomment the attribute line

    <SqlTrigger(Name:="tr_tblSearchResults_InsertUpdate1", Event:="FOR INSERT, UPDATE", Target:="tblSearchResults")> _

    Public Shared Sub Trigger1()

        Dim strSql As String = String.Empty

        Dim glResults As New Generic.List(Of String)

        Dim sqlCn As New SqlConnection("Context Connection=true")

        Dim sqlCmd As New SqlCommand

        Dim sqlUpd As New SqlCommand

        Dim sqlInsert As New SqlCommand

        Dim sqlTrCtx As SqlTriggerContext = SqlContext.TriggerContext

        Dim sqlDa As New SqlDataAdapter()

        Dim dtData As New DataTable

 

        Try

            strSql = "select SearchUrl, tblSearchResultsId, SearchCode, ServerName, DateEntered, DateUpdated from tblSearchResults where tblSearchResultsId in (select tblSearchResultsId from inserted)"

            sqlCmd.Connection = sqlCn

            sqlCn.Open()

            sqlCmd.CommandText = strSql

            sqlCmd.CommandType = CommandType.Text

            sqlDa.SelectCommand = sqlCmd

            sqlDa.Fill(dtData)

            For Each drData As DataRow In dtData.Rows

                drData("SearchCode") = CalculateSearchCode(Convert.ToString(drData("SearchUrl")))

                drData("ServerName") = CalculateServerName(Convert.ToString(drData("SearchUrl")))

                drData("DateUpdated") = DateTime.Now

                If sqlTrCtx.TriggerAction = TriggerAction.Insert Then

                    drData("DateEntered") = DateTime.Now

                End If

            Next

            sqlUpd.Parameters.Add("@SearchCode", SqlDbType.Int, Nothing, "SearchCode")

            sqlUpd.Parameters.Add("@ServerName", SqlDbType.VarChar, 100, "ServerName")

            sqlUpd.Parameters.Add("@DateUpdated", SqlDbType.DateTime, Nothing, "DateUpdated")

            sqlUpd.Parameters.Add("@tblSearchResultsId", SqlDbType.Int, Nothing, "tblSearchResultsId")

 

            sqlInsert.Parameters.Add("@SearchCode", SqlDbType.Int, Nothing, "SearchCode")

            sqlInsert.Parameters.Add("@ServerName", SqlDbType.VarChar, 100, "ServerName")

            sqlInsert.Parameters.Add("@DateUpdated", SqlDbType.DateTime, Nothing, "DateUpdated")

            sqlInsert.Parameters.Add("@DateEntered", SqlDbType.DateTime, Nothing, "DateEntered")

            sqlInsert.Parameters.Add("@tblSearchResultsId", SqlDbType.Int, Nothing, "tblSearchResultsId")

            sqlUpd.CommandText = "update tblSearchResults set SearchCode=@SearchCode, ServerName=@ServerName, DateUpdated=@DateUpdated where tblSearchResultsId=@tblSearchResultsId"

            sqlUpd.Connection = sqlCn

            sqlInsert.CommandText = "insert into tblSearchResults (SearchCode, ServerName, DateEntered, DateUpdated) values (@SearchCode, @ServerName, @DateEntered, @DateUpdated)"

            sqlInsert.Connection = sqlCn

            sqlDa.UpdateCommand = sqlUpd

            sqlDa.InsertCommand = sqlInsert

            sqlDa.Update(dtData)

        Catch sqlExc As SqlException

            Dim sqlPipe As SqlPipe = SqlContext.Pipe

            sqlPipe.Send("Error: " & sqlExc.Message.ToString())

            sqlPipe = Nothing

            Throw

        Finally

            If sqlCn.State <> ConnectionState.Closed Then

                sqlCn.Close()

            End If

            sqlCmd.Dispose()

        End Try

    End Sub

 

    Private Shared Function CalculateServerName(ByVal pstrUrl As String) As String

        Dim strTemp As String

        Dim strHttp As String = "http://"

        Dim iLengthHttp As Integer = strHttp.Length

        Dim iLengthTillSlash As Integer = pstrUrl.IndexOf("/", iLengthHttp, pstrUrl.Length - iLengthHttp)

        If (iLengthTillSlash = -1) Then

            strTemp = pstrUrl.Substring(strHttp.Length, pstrUrl.Length - iLengthHttp)

        Else

            strTemp = pstrUrl.Substring(iLengthHttp, iLengthTillSlash - iLengthHttp)

        End If

        Return (strTemp)

    End Function

 

    Private Shared Function CalculateSearchCode(ByVal pstrurl As String) As Long

        Return (pstrurl.GetHashCode())

    End Function

End Class

 

VB.NET Trigger with Generic List

Option Explicit On

Option Strict On

Imports System

Imports System.Data.Sql

Imports System.Data.SqlClient

Imports System.Data.SqlTypes

Imports System.Diagnostics

Imports Microsoft.SqlServer.Server

Partial Public Class Triggers

    <SqlTrigger(Name:="tr_tblSearchResults_InsertUpdate", Event:="FOR INSERT, UPDATE", Target:="tblSearchResults")> _

    Public Shared Sub InsertUpdateSearchResultsTrigger()

        Dim i As Integer

        Dim iSearchCode As Long

        Dim strServerName As String

        Dim strSql As String = String.Empty

        Dim glResults As New Generic.List(Of String)

        Dim sqlCn As New SqlConnection("Context Connection=true")

        Dim sqlCmd As New SqlCommand

        Dim sqlTrCtx As SqlTriggerContext = SqlContext.TriggerContext

        Dim dsData As New DataSet()

        Dim dtData As New DataTable

        Dim sqlDr As SqlDataReader

 

        Try

            strSql = "select SearchUrl, tblSearchResultsId from inserted"

            sqlCmd.Connection = sqlCn

            sqlCn.Open()

            sqlCmd.CommandText = strSql

            sqlCmd.CommandType = CommandType.Text

 

            sqlDr = sqlCmd.ExecuteReader()

            While sqlDr.Read()

                iSearchCode = CalculateSearchCode(Convert.ToString(sqlDr("SearchUrl")))

                strServerName = CalculateServerName(Convert.ToString(sqlDr("SearchUrl")))

                strSql = "update tblSearchResults set SearchCode=" & iSearchCode.ToString() & ", ServerName='" & _

                    strServerName.Replace("'", "''") & "', DateUpdated=getdate() where tblSearchResultsId=" & _

                    Convert.ToString(sqlDr("tblSearchResultsId"))

                If sqlTrCtx.TriggerAction = TriggerAction.Insert Then

                    strSql = strSql & ";update tblSearchResults set DateEntered=getdate() where tblSearchresultsId=" & _

                        Convert.ToString(sqlDr("tblSearchResultsId"))

                End If

                glResults.Add(strSql)

            End While

            If Not (sqlDr.IsClosed) Then

                sqlDr.Close()

            End If

 

            For i = 0 To (glResults.Count - 1)

                sqlCmd.CommandText = glResults.Item(i)

                sqlCmd.ExecuteNonQuery()

            Next

            sqlDr.Dispose()

        Catch sqlExc As SqlException

            Dim sqlPipe As SqlPipe = SqlContext.Pipe

            sqlPipe.Send("Error: " & sqlExc.Message.ToString())

            sqlPipe = Nothing

            Throw

        Finally

            If sqlCn.State <> ConnectionState.Closed Then

                sqlCn.Close()

            End If

            sqlCmd.Dispose()

        End Try

    End Sub

 

    Private Shared Function CalculateServerName(ByVal pstrUrl As String) As String

        Dim strTemp As String

        Dim strHttp As String = "http://"

        Dim iLengthHttp As Integer = strHttp.Length

        Dim iLengthTillSlash As Integer = pstrUrl.IndexOf("/", iLengthHttp, pstrUrl.Length - iLengthHttp)

        If (iLengthTillSlash = -1) Then

            strTemp = pstrUrl.Substring(strHttp.Length, pstrUrl.Length - iLengthHttp)

        Else

            strTemp = pstrUrl.Substring(iLengthHttp, iLengthTillSlash - iLengthHttp)

        End If

        Return (strTemp)

    End Function

 

    Private Shared Function CalculateSearchCode(ByVal pstrurl As String) As Long

        Return (pstrurl.GetHashCode())

    End Function

End Class

Comments

No Comments