SQL for the .NET Developer

There are a lot of .NET Developers out there that are not taking advantage of all the SQL has to offer. If you are writing .NET code to iterate over database tables to aggregate data, ie, summing, averaging, etc., then you need to learn more about SQL. Why write all that code in VB.NET or C# when you could use ANSI SQL commands to do the job for you. You will not only save programming time, but also memory and processing time for your users.

While the following is a simple example it does illustrate how a lot of developers tend to perform batch type processing.

VB.NET
Private Sub SumCode()
 Dim ds As New DataSet
 Dim strSQL As String
 Dim decSales As Decimal = 0

 strSQL = "SELECT SalesAmt FROM SalesData"

 ds = DataLayer.GetDataSet(strSQL, AppConfig.ConnectString)

 For Each dr As DataRow In ds.Tables(0).Rows
  decSales += Convert.ToDecimal(dr("SalesAmt"))
 Next

 MessageBox.Show("Total Sales: " & decSales.ToString())
End Sub

C#
private void SumCode()
{
 DataSet ds = new DataSet();
 string strSQL;
 decimal decSales = 0;

 strSQL = "SELECT SalesAmt FROM SalesData";

 ds = DataLayer.GetDataSet(strSQL, AppConfig.ConnectString);

 foreach (DataRow dr in ds.Tables[0].Rows)
 {
  decSales += Convert.ToDecimal(dr["SalesAmt"]);
 }

 MessageBox.Show("Total Sales: " + decSales.ToString());
}

The above code shows using a For Each construct to loop through rows in a DataTable and perform some operation on the data. While hopefully everyone knows about the ANSI SQL aggregate functions Sum() and Avg() and Min() and Max() to do these types of operations, I still come across many who do not. Below is the same code that uses a Sum() aggregate function to get the same result as above.

VB.NET
Private Sub SumFunction()
 Dim strSQL As String
 Dim decSales As Decimal = 0

 strSQL = "SELECT Sum(SalesAmt) FROM SalesData"

 decSales = Convert.ToDecimal( _
    DataLayer.ExecuteScalar(strSQL, _
    AppConfig.ConnectString))

 MessageBox.Show("Total Sales: " & decSales.ToString("c"))
End Sub

C#
private void SumFunction()
{
 string strSQL;
 decimal decSales = 0;

 strSQL = "SELECT Sum(SalesAmt) FROM SalesData";

 decSales = Convert.ToDecimal(
  DataLayer.ExecuteScalar(strSQL,
  AppConfig.ConnectString));

 MessageBox.Show("Total Sales: " + decSales.ToString("c"));
}

This code is much more efficient that the previous code.

Anytime you are doing a loop in code to iterate over data, you should stop and think about how you can perform that same operation with SQL code instead. The result is your code will be much faster. Sometimes you may not think you can perform some operation because you have to do several steps within the loop. If this is the case, think about breaking down the steps into discreet SQL operations and maybe creating temporary tables of intermediate results. By using "Set" operations instead of "record processing" operations, your code will be much more efficient and easier to maintain.

There are many other things that you can do with SQL statements as well. For example GROUP BY and HAVING clauses make grouping data very simple. Using CASE statements in T-SQL and TOP n in T-SQL are also great constructs that will help speed up your code and eliminate network traffic.

Another helpful SQL technique is Correlated Subqueries. This technique comes in handy when you wish to eliminate duplicates from a single table. Below is an example of removing duplicate emails from a table named MarketingList.

DELETE FROM MarketingList
WHERE EMailAddress IN
  (SELECT EMailAddress FROM MarketingList GROUP BY EMailAddress HAVING Count(*) > 1)

By using the IN statement within your WHERE clause, and using a GROUP BY and the HAVING clause, you can eliminate duplicate records with a single SQL statement. Now that is POWER!

Hope this inspires you to explore the SQL language a little bit more.

Sincerely,
Paul

Past Blog Content

Blog Archive

2 Comments

  • Rod,

    Thanks for catching that. I guess the rest of my query got chopped off when I copied and pasted it from my stored proc into the blog post.

    You do indeed need to make sure that you check the primary key field to make sure you do not delete the final 1 row.

    Thanks,
    Paul

  • No problem, thanks for the post! The following is the SQL approach I would use, assuming the first entry entered is the one I wanted to keep.

    DELETE m1
    FROM marketinglist m1
    INNER JOIN
    ( SELECT emailaddress, MIN(marketinglistid) marketinglistid
    FROM marketinglist
    GROUP BY emailaddress HAVING Count(*) > 1
    ) m2
    ON m1.emailaddress = m2.emailaddress
    WHERE m1.marketinglistid > m2.marketinglistid

Comments have been disabled for this content.