LINQ looks good, but DLINQ scares me

One of Microsoft's announcements at the PDC this week has been LINQ (Language INtegrated Query). Here's the elevator speech version: "LINQ enables developers to query objects, databases and XML using a unified programming model because LINQ makes data transforms and queries first class NET citizens."

There's a lot more about it on the MSDN LINQ page, including 101 LINQ code samples. Paul Vick has a pretty good introductory article on LINQ, as well. Barry Gervin has come up with a nice summary of how the devlopment community is reacting to LINQ .

On the whole, I think it's a nice advance. While it's nice to simplify things a bit mentally by separating procedural logic into applications and declarative, set based logic into the database, that line has been bluring lately with the introduction of the .NET CLR to SQL Server 2005 (and of course, JServer back with Oracle 8i). I'll probably cling to this distinction where performance and / or maintainability are driving factors, but I definitely see how SQL syntax queries against collections can boost productivity.

XLINQ (extensions to allow LINQ to operate on XML data) looks good, too. It's still way too hard to query XML, and XLINQ looks like it would solve that.

DLINQ scares me, though. Not for the reasons Frans Bouma and Paul Wilson are talking about, although I do respect their OR/M expertise. Paul says:What's wrong with DLinq?  Here's the list I have so far: attribute-based, MS Sql Server only, overly complicated, poor stored proc support, no server-side paging, very limited functionality, and no WinFS/OPath integration."

I'm just scared about code maintainability.

I actually have some relevant experience here - several years ago I was tasked with migrating a legacy PowerBuilder application to VB COM and T-SQL. PowerBuilder allows you to intermingle data access SQL with your procedural logic, and the application developers had made full use of this capability. Over the years, this application had grown to the point that it was mission critical, but very difficult to maintain. Additionally, there was a need to migrate from inline SQL to stored procedures for performance and data management reasons.

The project to upgrade to VB COM and T-SQL failed. The program flow was nearly impossible to follow, and the set-based and procedural logic were intermingled in such a way that separating them was just too expensive.

This is being pushed as a productivity enhancement for developers, and I totally agree with that. DLINQ makes it very easy to write database oriented applications very quickly. Properly architected applications could, of course, isolate the data access to layers or components to future-proof. DLINQ has support for custom UpdateMethods which could call stored procedures, for example. Developers and projects focused on productivity probably won't bother with any of that, in my experience. They'll code their SQL inline and move on to the next project.

I've had a similar concern with the SQL-CLR integration, but I think DLINQ is much more subject to abuse. If the SQL-CLR makes it easy to shoot yourself in the foot, DLINQ is a Junior H-Bomb Activity Kit. Take a look at the following code samples (from the LINQ Code Samples page and the DLINQ Hands On Lab), and imagine a three year old application with thousands of lines of code which use DLINQ for data access:

DLINQ - Simple Select example
// DataContext takes a connection string 
DataContext db = new DataContext("c:\\northwind\\northwnd.mdf");

// Get a typed table to run queries
Table<Customer> Customers = db.GetTable<Customer>();

// Query for customers from London
var q =
      from c 
in Customers
      where c.City == "London"
      select c;

foreach (var cust in q)
      Console.WriteLine("id = {0}, City = {1}", cust.CustomerID, cust.City);

That's not too bad, but it wouldn't be very easy to migrate a complex query to a stored procedure or move the Cities into a link table. Here's a database update:

DLINQ - Demonstration of a database update
// Use a standard connection string for updates
Northwind db = new Northwind(@"C:\Temp\northwnd.mdf");

using(TransactionScope ts = new TransactionScope()) 
{
    var q =
      from p 
in db.Products
      where p.ProductID == 15
      select p;
   
    Product prod = q.First();
                   
    
// Show UnitsInStock before update
    
Console.WriteLine("In stock before update: {0}", prod.UnitsInStock);   
    
if (prod.UnitsInStock > 0) prod.UnitsInStock--;
    db.SubmitChanges();
    ts.Complete();
    Console.WriteLine("Transaction successful");
}
Console.ReadLine();

Now imagine we've got multiple nested selects, inserts, and updates. Complicated, right?

Hey, PM,I can't figure out what's causing this weird bug and QA doesn't want to retest the binary, but I can fix it with an INSERT TRIGGER on the database...Pretty soon we've got an unmaintainable application on our hands.

12 Comments

  • Expression trees are general enough to allow stored procedures to be included into queries, in the same manner that DLink maps several CLR functions to corresponding functions in SQL.

  • True. There's also attribute based support for custom methods - [UpdateMethod], etc.



    My concern is that there's nothing but common sense to enforce using them. At first glance, DLINQ seems to make it far too easy to mix inline SQL with code such that it would be very difficult to un-mix (into stored procedures, other database systems, etc.).

  • Why would it be complicated? If you want to schedule actions, use a unitofwork.. DLinq doesn't offer one though.



    Though I don't see why nested selects, inserts and updates are complicated for the developer. For the O/R mapper developer, yes, calculating Insert/update queues prior to commiting the changes is not that simple (Topology sort of a directed graph while you don't know the graph before you begin, have fun! :)) though for the developer USING the O/R mapper... it's a piece of cake.

  • So what else is new? you can get the same spageti ball today, i've seens apps written in VB6 and ADO that were a total mess... It all depends on the programmer, and a good programmer will be able to be much more productive with this.

  • There's nothing to be &quot;scared of&quot; in DLinq. I've been moaning for over a year (since I spent six months in hell writing SQL queries) that we needed queries integrated into the language. I used to do COBOL and Dbase programming in my youth, and havin the data access built into the language resulted in BETTER code, not worse.



    Now, you have a point where it comes to weak support for stored procedures and updates/deletes. This is something that someone (perfereably Anders' team) needs to work on. But, what we got at PDC doesn't even qualify as a beta. It's a C# 3.0 feature, and C# 2.0 is still in beta, so there's a lot of time for improvement there.

  • Sure. I write data access code, and this will make that part of my life easier. I also enhance or upgrade existing applications (legacy), and I think there's a good chance that part might get a little harder. I just wanted to raise the issue of code maintainability in hopes of

    (1) some best practices and guidance

    (2) dissuading developers from &quot;code and go&quot; data access

    (3) seeing if the the smart people can think of a way to give us a DLINQ that's a little more &quot;distinguishable from magic&quot;.



    What do I mean about the magic thing? I've liked how VS.NET has made things simple, but accessible if need be. For instance, big chunks of generated code that are hidden in a region, but can be edited if necessary. Generated XSD's. That sort of thing.



    DLINQ is built on attributes and compiler magic. That's great for initial development (as long as my attributes are right), but makes it a harder to troubleshoot or change. If DLINQ used a mapping file or generated some code, I'd be able to see what was going and change what I needed.



    Kind of guessing on that last one. DLINQ might be the exception when it comes to gen'd code or a mapping file, but that general pattern has worked pretty well in the past.

  • I've done a little research into the scary aspects of Dlinq, and (although I really hate it when people do this...) if you follow me to my website, you may find an answer.

  • Your fears are justified, whenver a new powerful developer feature comes along, there are risks of misuse.



    Linq is good. DLinq is good. But they are NOT meant to be used together (as in the same class). Database code (Dlinq) should still be separated from business logic (Linq). As separate technologies, they are quite brilliant.



    And by the way, use an O/R mapper! The best for .NET is most likely www.llblgen.com.

  • I have written a small project about LINQ and DLinq. This might an interesting read for some of you guys.

  • True OO is data and logic in the same class.

  • I have the same concern as you. I just got the beta 2 drop and I'm horrified that after creating a "DataClasses.dbml" file, a developer can simply whip out this line of code:

    foreach (Product p in db.Products)

    And it goes and does the mysterious "GetTable" behind the scenes.

    And as far as I can see, even though you can map insert/update/delete, I see no way to disable this.

    And I'm worried about the attribute based methods used to assign the insert/update/delete as well.

    Since this post, have you found anything new about using the built-in O/R, but specifying all queries that will take place explicitly?

  • I understand your point. Even with DLinq’s command line utility called SqlMetal, there are still bunch of problems that go with it. It is indeed a missed chance for MS to create something great.

Comments have been disabled for this content.