LLBLGen Pro QuerySpec: the basics

Thursday, April 21, 2011

Last time I introduced LLBLGen Pro QuerySpec, a new fluent API for specifying queries for LLBLGen Pro. As promised I'll write a couple of blogposts about certain aspects of the new API and how it works. Today I'll kick off with the basics.

Two types of queries: EntityQuery<T> and DynamicQuery

There are two types of queries in QuerySpec: entity queries (specified with objects of type EntityQuery<T>, where T is the type of the entity to return) and dynamic queries, which are queries with a custom projection (specified with objects of type DynamicQuery or its typed variant DynamicQuery<T>). The difference between them is mainly visible in the number of actions you can specify on the query. For example, an entity query doesn't have a way to specify a group by, simply because fetching entities is about fetching rows from tables/views, not rows from a grouped set of data. Similarly, DynamicQuery doesn't have a way to specify an entity type filter, simply because it's about fetching a custom projection, not about fetching entities. This difference guides the user of the API with writing the final query: the set of actions to specify, e.g. Where, From, OrderBy etc., is within the scope of what the query will result in.

A custom projection is any projection (the list of elements in the 'SELECT' statement returned by a query) which isn't representing a known, mapped entity. This distinction between entity queries and dynamic queries might sound odd at first, but it will be straightforward once you've worked with a couple of queries.

QuerySpec queries are specifications, they're not executed when enumerated, in fact you can't enumerate them. To obtain the result-set, you have to execute the queries. I'll dig deeper in how to execute QuerySpec queries in a follow up post.

The starting point: the QueryFactory

To get started with a query, the user has to decide what kind of objects the query has to produce: entity objects or objects which contain the result of a custom projection. The question is rather easy if you formulate it like this: "Do I want to obtain one (or more) instances of a known entity type, or something else?". If the answer to that is: "One (or more) instances of a known entity type" the query you'll need is an EntityQuery<T>, in all other cases you'll need a DynamicQuery.

If you change your mind half-way writing your query, no worries: you can create a DynamicQuery from an EntityQuery<T> with the .Select(projection) method and can define a DynamicQuery to return entity class instances, so there's always a way to formulate what you want.

To create a query, we'll use a factory. This factory is a small generated class called QueryFactory. It's the starting point for all your queries in QuerySpec: to create a query, you need an instance of the QueryFactory:

var qf = new QueryFactory();

The QueryFactory instance offers a couple of helper methods and a property for each known entity type, which returns an EntityQuery<entitytype> instance, as well as a method to create a DynamicQuery instance.

We'll first focus on entity queries.

Formulating your query, getting started.

QuerySpec is defined to be close to SQL, so it contains the same building blocks for queries as SQL does. Not all elements are mandatory. For example, not all queries need ordering, a where filter etc. and e.g. entity queries have a fixed projection so you don't have to specify it.

A projection which defines the result-set shape (the 'SELECT' list). Automatic for Entity Queries, specified with .Select(projection) for Dynamic Queries.
A source set from which the result-set has to be obtained (the 'FROM' clause). Automatic for a query if a single source (entity) is used, otherwise required for both query types, specified with .From(join list).
A filter on the source set to obtain a subset of the complete set (the 'WHERE' clause). Optional, specified with .Where(predicate).
An ordering definition to return the result-set rows in a given order (the 'ORDER BY' clause). Optional, but recommended when a limit or paging directive is specified. Specified with .OrderBy(sort clauses).
A grouping specification to group source set rows on a subset of fields (the 'GROUP BY' clause). Optional, only available in Dynamic Queries. Specified with .GroupBy(field list).
A filter on the grouped set of rows after a grouping action (the 'HAVING' clause). Optional, only available in Dynamic Queries, and only used with queries with a grouping specification specified with .GroupBy(). Specified with .Having(expression list).
A limit specification to limit the number of rows in the result-set (the 'TOP', 'LIMIT', 'FIRST' or equivalent clause). Optional, specified with .Limit(number).
A specification which filters out duplicate rows in the result-set (the 'DISTINCT' clause). Optional. Automatic for entity queries, if the engine decides it's more efficient to use distinct. Specified with .Distinct().
A specification to obtain a specific range of rows in the result-set (a Paging directive). Optional, specified with .Page(pageNumber, pageSize).

There are other specific elements per query type, e.g. prefetch paths for entity queries, a typed projection lambda mechanism for dynamic queries, but we'll get to that in a later post.

Entity Queries, getting started.

To create an entity query in its most simplest form, we can simply use one of the properties exposed by a QueryFactory instance. Say we want to formulate a query for Customer entities. The QueryFactory class, which is generated for our domain, exposes a property for each entity in the domain and each property returns an EntityQuery<entitytype>, so the Customer property returns an EntityQuery<CustomerEntity>:

var qf = new QueryFactory();
var q = qf.Customer;

the variable 'q' contains an EntityQuery<CustomerEntity> instance and can be used to fetch customer entity instances. As it doesn't contain any filter, executing it will fetch all customer instances. This is the base line: a query which specifies the complete set of all instances of a given entity type, in this case Customer entities.

In general, you'd want to specify a subset of this complete set. To do so, you specify a Where clause. For example, let's specify a query for all customer entities from the USA:

var qf = new QueryFactory();
var q = qf.Customer
              .Where(CustomerFields.Country=="USA");

This query specification is similar to our first specification, but it now defines a filter on the complete set of 'Customer' entities. The .Where method accepts a predicate object, which can be a single predicate (e.g. field == value) or a predicate expression, which can be a combination of predicates concatenated with AND, OR, AND NOT, or OR NOT.

The mechanism to formulate the predicates is the native LLBLGen Pro query api mechanism, and which is documented extensively in the runtime framework documentation. The generated code base contains helper classes to specify a field of a given entity, like the CustomerFields.Country specification above, and the runtime framework uses operator overloading to formulate predicates under the hood. QuerySpec offers additional methods to help you formulate predicates, as shown in the following example which is equal to the one above:

var qf = new QueryFactory();
var q = qf.Customer
             .Where(CustomerFields.Country.Equal("USA"));

For now we'll leave the predicate specification semantics and focus on the query definitions themselves.

Now that we can filter our set of entities, we can look at how we can do more advanced things. For example, let's specify a query for all Customer entities which have an order on file which was filed by employee with employee id '2'. This is a query which has a filter on a related entity. To do that we can use several mechanisms, one of them is a join. This comes down to: defining a large set with Customer and Order data (the source set), filter that data with a filter and obtain the result-set from that filtered set. In short: join customer with order, use a where clause on the order fields, select the customer fields:

var qf = new QueryFactory();
var q = qf.Customer
             .From(QueryTarget.InnerJoin(qf.Order)
                         .On(CustomerFields.CustomerId==OrderFields.CustomerId))
             .Where(OrderFields.EmployeeId==2);

We need to specify the source set to be a joined set between Customer and Order, so when filtering on the fields from Order in that set, we automatically filter the rows of Customer. To do that in SQL you'd use Customer INNER JOIN Order ON .... Here we do the same thing, we specify an INNER JOIN between Customer and Order in the From() call. As the query we're specifying it on is already a Customer query, we use QueryTarget instead of re-specifying qf.Customer. In case you're wondering why the .InnerJoin isn't defined on qf.Customer instead of the From() clause, it's because it introduced ambiguity which is explained more in detail in the document in the QuerySpec beta archive available to LLBLGen Pro v3 customers.

In a follow up post we'll go deeper into what's available in QuerySpec with respect to joins and filters.

Dynamic Queries, getting started.

After we've seen how to create entity queries, it's now time to show how to create Dynamic Queries. This is done roughly in the same way as entity queries:

var qf = new QueryFactory();
var q = qf.Create();

The variable 'q' now contains a DynamicQuery instance, but we can't really do much with it: it's completely empty. To specify a query, we have to specify a projection using the .Select() method. Let's specify a query for all country names available in the Customer set, with the additional requirement that duplicates have to be filtered out. LLBLGen Pro is clever enough that when you specify CustomerFields.Country, you're fetching from 'Customer' so you don't have to specify a source set.

var qf = new QueryFactory();
var q = qf.Create()
             .Select(CustomerFields.Country)
             .Distinct();

This query is an untyped query, the resultset will be a list of object arrays (with 1 value in each array). To make it a typed query, we have to specify a typed projection. In the untyped variant, 'q' is a DynamicQuery. In the typed variant, 'q' will be a DynamicQuery<T>, where 'T' is the type of the objects returned by the query when it's executed. To create a typed variant, we use a different overload of .Select(), namely the one which accepts a lambda:

var qf = new QueryFactory();
var q = qf.Create()
             .Select(()=>CustomerFields.Country.ToValue<string>())
             .Distinct();

The variable 'q' is now a DynamicQuery<string>, due to the lambda specification in the Select(). One other new element in the Select call is the .ToValue<T>() call. QuerySpec will parse the lambda specification in the Select() method and will transform the objects which have calls to known methods to elements for the projection and leave all other code alone. This allows you to formulate complex projection statements with code which isn't convertible to SQL statements. For each row to project into the resultset, the lambda is ran at runtime, executing all code you've specified there. In our simple example above, it simply returns the value for CustomerFields.Country.

Executing the query above will result in a List<string> with all country values from the Customer set.

Dynamic Queries can, similar to Entity Queries, contain definitions for larger source sets from which you project your final result-set. Let's say you want all CompanyName values for all customers who have an order on file filed by employee with employee id '2':

var qf = new QueryFactory();
var q = qf.Create()
             .Select(()=>CustomerFields.CompanyName.ToValue<string>())
             .From(qf.Customer.InnerJoin(qf.Order)
                         .On(CustomerFields.CustomerId == OrderFields.CustomerId))
             .Where(OrderFields.EmployeeId==2)
             .Distinct();

Here we first specify the projection, using a .Select() method call. We then specify the source-set by using a .From() method call and after that we specify the filter on that source-set by using the .Where() method call. To limit duplicates, we specify .Distinct().

There's much more possible than this tiny scratch of the surface. I'll go deeper into the possibilities of QuerySpec in a follow up post.

QuerySpec is a free add-on for LLBLGen Pro v3 customers and is currently in beta. The complete archive with sourcecode is downloadable from the customer area on our website.

I was really hoping all the 'CustomerFields', 'CustomerFieldIndex', 'CustomerRelations' generated entity twin classes would only be there for backwards compatibility and could be removed if only using linq to generate less code at some point in the future, now you're actually extending their life span?

wtijsma - Tuesday, April 26, 2011 4:23:52 PM

@wtijsma: the *Fields classes can be removed if you only use linq, but as linq isn't always what you need (it can't do everything), you need the classes. They won't be marked deprecated, simply because a lot of people's code relies on them. The relation classs and field indexes are used internally btw, so they're not removable

FransBouma - Tuesday, April 26, 2011 5:05:01 PM

Hi Frans,

I'm not sure I like the QuerySpec API, partially because of the Fields class dependency, in my opinion a LINQ expression is a prettier way to specify a property reference.

I knew about the Relation/FieldIndex classes being used internally, it would be nice if the runtime would be able to function without them, so we can minimize the generated code.

The Fields classes are indeed still useful (not required) in cases where you need predicates (concurrency, direct updates/deletes etc.), but they can be easily replaced with a LINQ expression.

Will give it a try though, thanks!

wtijsma - Tuesday, April 26, 2011 7:18:25 PM

@wtijsma

In my previous post I gave 2 reasons why Linq isn't going to work in some (and for some people: many) cases.

Linq expressions might look great to you, and that's fine, the linq provider won't go away, if you like Linq more, by all means: use it :). The problem is though that it's not always usable, and it can lead to cumbersome SQL and queries. It can also lead to very complex expression trees with all its ambiguities and weirdness which aren't mappable to SQL so strange errors occur. With QuerySpec you don't have that: the query will work, period. The sql you'll get is predictable, in 100% of the cases.

Some people like that, and I'm among them. Others might not like it (like you), but that's fine: there's choice, so use whatever you like.

Limiting the amount of generated code: that's not really an option in the area where you're looking at. We cut code to generate in every version, but it won't become a POCO framework or close to it: we generate code so no magic happens at runtime. Code is cheap in .net: you share all code among all instances of objects, there's no real penalty for it, so it's not a real issue how much code is generated: if it helps creating better software for your client, the better.

After all, that's the only goal :)

FransBouma - Tuesday, April 26, 2011 9:33:30 PM

I remember developing in 4GL environments like Clipper and FoxPro and how cosy (with the DB) my code used to feel. I had hoped Linq would bring that to .NET and indeed I tried my best to use tools like LinqPad to force myself into thinking Linq. I have to say that although Linq goes along way into bringing that 4GL experience into our .NET code it just does not feel SQL because you can never think in SQL when you write Linq queries hence defeating one of the goals declared by its creators.
Reading both of Frans's blog posts I realize how QuerySpec does feel like SQL as for each query example I was really reading natural flow SQL not just an SQL like language as the case in Linq.

I would hope Frans could provide a tool like LinqPad (QSPad maybe :)) to further demonstrate the effectiveness of the syntax and provide a great development helper tool that can be used to build and test QuerySpec queries in the same manner that Management Studio can be used to test and run SQL queries.

For the claim of LLBL's excessive code, I never experienced LLBL's generated DAL to be bloated and indeed generate code is always cheap and better...its maintainable :)

Omar Shraim - Thursday, May 12, 2011 11:34:24 PM

Two types of queries: EntityQuery<T> and DynamicQuery

The starting point: the QueryFactory

Formulating your query, getting started.

Entity Queries, getting started.

Dynamic Queries, getting started.

5 Comments