EF Core Performance Optimisations

Monday, May 6, 2024

Introduction

We all know O/RMs aren't the fastest tools when it comes to loading data, and Entity Framework Core is surely no exception. After all, they need to do a lot of operations, such as translating LINQ queries to SQL, instantiating the right entities, hydrating them (setting their properties from the database-retrieved values), taking a snapshot for the change-tracking functionality, etc, etc. But there, however, are a few things that we can do to help speed things a bit! Here are a few tips for you to make the best usage of Entity Framework Core.

Use Compiled Queries

One of the things that can take some time is the translation from LINQ to SQL. This needs to happen all the time we execute a query from LINQ, unless, of course, we use a compiled query. Compiled queries are not exactly new in EF, and you can read more about them here: https://learn.microsoft.com/en-us/ef/core/performance/advanced-performance-topics.

In a nutshell, here is the deal: you pick up a LINQ query that you are going to execute many times and you compile it so that the SQL is generated only once. Here is out it works:

private static readonly Func<ProductContext, int, IEnumerable<Product>> GetProductsByType = EF.CompileQuery((ProductContext ctx, int type) => ctx.Set<Product>().Where(p => p.Type == type));

As you can see, this takes two parameters: one context (an hypothetical ProductContext) and an int, which represents the product type. Mind you, this is just an example, the query can be arbitrarily complex!

Here's how you'd use it:

var jumpers = GetProductsByTitle(ctx, 10).ToList();

Assuming that jumpers are product of type 10. A few remarks:

Your query can be terminal or open-ended, meaning, it can either end in a Count(), ToList(), First(), FirstOrDefault(), Single(), SingleOrDefault(), etc, or just return an IQueryable<T>, which means it can be composable
The query can be asynchronously compiled

Here's an example of a compiled asynchronous query:

private static readonly Func<ProductContext, int, Task<ValueTask<Product?>>> GetProductById = EF.CompileAsyncQuery((ProductContext ctx, int id) => ctx.Set<Product>().FindAsync(id));

And its usage (a bit more clunky):

var product = await GetProductById(ctx, 1212).Result;

Disable Lazy Loading

By now, you should already be aware of the evil that is involved in lazy loading. Don't worry, if you don't do anything, like, you don't configure anything, it just won't happen!

Essentially, lazy loading allows you to retrieve related entities (one-to-one, one-to-many, many-to-one, many-to-many) just at the moment you access the navigation property, meaning, you defer the execution of the SQL and the retrieving of the data to that moment. This may cause unexpected results (for example, when you are no longer connected to the database) and affect the performance (for example, if it happens in a loop). Read more about it here https://learn.microsoft.com/en-us/ef/core/querying/related-data/lazy.

Use Eager Loading Judiciously

Eager loading, also known as explicit loading, forces related entities to be loaded at the same time as the main entity. This is the opposite of lazy loading, because you get everything you want at the same time at the expense of possibly a lot of columns being returned, because of the joins. This is called a cartesian product. Read about eager loading here: https://learn.microsoft.com/en-us/ef/core/querying/related-data/eager. In general, this is something that you want to have, when you need to access at least some of the properties of related entities, but this should be used together with projections and/or query splitting. Read on!

Query Splitting

So, instead of using joins to fetch related entities and collections, when using eager/explicit loading, you can force EF Core to issue multiple queries. Depending on the actual scenario, although it is true that, in general, sending multiple queries can slow things a bit, because we are avoiding cartesian products of rows and columns, it can actually speed up things! There's no other way than to actually test the two and and see how they perform. The way to do enable query splitting can be either global:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseSqlServer("<myconnectionstring>", option => option.UseQuerySplittingBehavior(QuerySplittingBehavior.SplitQuery));
}

Or per query:

var productsAndCategories = ctx.Products.Include(x => x.Category).AsSplitQuery().ToList();

I stress that you need to measure carefully the two scenarios before making a decision. Check out https://learn.microsoft.com/en-us/ef/core/querying/single-split-queries for more info.

Use Projections

When we query using EF Core, or pretty much any other O/RM, we get lists of entities, which means, we get all of the properties that are defined for the entity, minus, of course, the navigation properties (one-to-one, many-to-one, one-to-many, and many-to-many). But, if we only need a subset of these properties, we can project just the ones we're interested in, and, instead of having the whole entity, return an anonymous type:

var productNameAndPrice = ctx.Products.Select(x => new { x.Name, x.Price }).ToList();

If we are eager loading, it works too:

var productNameAndPrice = ctx.Products.Include(x => x.Category).Select(x => new { x.Name, x.Price, Category = x.Category.Name }).ToList();

Read all about it here: https://learn.microsoft.com/en-us/ef/core/performance/efficient-querying.

Use Paging

This one is very intuitive: instead of retrieving all records at the same time, only retrieve a few. To do this, use the Skip and Take methods:

var currentPage = 0; //start from here
var pageSize = 10;

var records = ctx.Products
    .Skip(currentPage)
    .Take(pageSize)
    .ToList();

The initial page starts at 0 and the page size is up for you to decide, whatever makes sense to show. When the results return less than the page size, you know that you've reached the end of the records. To learn more, see: https://learn.microsoft.com/en-us/ef/core/querying/pagination.

Disable Change Tracking

You may know that EF Core uses change tracking, a pattern by which it knows what properties have been modified when it comes to saving the context. It does this by taking a snapshot of the values as they are loaded from the database and the entities are hydrated. If we don't need to make modifications to the entities, for example, we just need to display some values on a page, we can disable change tracking, and save some memory and processing cycles. Now, there are two possible ways to do this:

Using identity resolution: with this option, EF Core knows each loaded entity by its identity, and will always return the same instance whenever the same identity is requested; this is generally what we want as it prevents some instantiations, but it does take some more memory and processing (QueryTrackingBehavior.NoTrackingWithIdentityResolution)
Not using identity resolution: EF Core does not care about an entity's identity (QueryTrackingBehavior.NoTracking)

We can do this globally (generally not what we want, unless we have a context just for querying), on the OnConfiguring method or when we configure the options for the DbContext, for example, AddDbContext:

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking); //or NoTrackingWithIdentityResolution
 
    base.OnConfiguring(optionsBuilder);
}

Or per query:

var products = ctx.Products.Where(x => x.Provider == "MyProvider").AsNoTrackingWithIdentityResolution().ToList(); //or AsNoTracking()

Here's the documentation for the functionality: https://learn.microsoft.com/en-us/ef/core/change-tracking.

Disable Detecting Changes

This one is not the same the previous tip, what we want here is to disable the capacity of EF Core to periodically check for changes. For this use use the AutoDetectChangesEnabled property:

ctx.ChangeTracker.AutoDetectChangesEnabled = false;

If we are still making changes and we want them detected, we can call DetectChanges at all times, and if there is a SaveChanges or SaveChangesAsync call, they will be persisted. Read more about it here: https://learn.microsoft.com/en-us/ef/core/change-tracking/change-detection.

Disable Auto-Inclusions

Auto-inclusions is a feature of EF Core by which you can force it to automatically load related entities (one-to-one, many-to-one) and collections (one-to-many, many-to-many). This, of course, has a performance penalty, which you may want to avoid, at least, sometimes.

On a query, if you wish to prevent the loading of auto-inclusions, you apply IgnoreAutoIncludes, as this:

var products = ctx.Products.Where(x => x.Provider == "MyProvider").IgnoreAutoIncludes().ToList();

See more here: https://learn.microsoft.com/en-us/ef/core/querying/related-data/eager.

Use Batch Update Methods

In a nutshell: instead of making modifications one entity at a time (or deleting it), you can do it in a batch, without needing to load the entity into memory. You start from a LINQ query and then do what you need, for example, for deleting:

var deleteCount = ctx.Products.Where(x => x.Provider == "MyProvider").ExecuteDelete();

ExecuteDelete will return a count of the records that were actually deleted.

And for updates:

var updateCount = ctx.Products.Where(x => x.Id == 123).ExecuteUpdate(p => p.SetProperty(e => e.Price, e => e.Price * 1.10).SetProperty(e => e.UpdatedAt = DateTime.Now));

For updates, as you can see in this example, we can update multiple properties at the same time and ExecuteUpdate will return the number of affected records. See more here: https://learn.microsoft.com/en-us/ef/core/performance/efficient-updating.

Using Client-generated Ids

When you use server-generated ids, such as IDENTITY in SQL Server, EF Core needs to do some trick to get the generated ids back. In recent versions of SQL Server, this is easier done, but, this prevents, for example, batching, that is, inserting multiple records at the same time. To avoid this, you have a couple of options:

Use one of the Guid variants such as (BinaryValueGenerator, GuidValueGenerator, SequentialGuidValueGenerator, StringValueGenerator), in .NET, or a database function such as NEWID() or NEWSEQUENTIALID()
use the HiLo generator (HiLoValueGenerator<TValue>)

The latter uses a database sequence and fills it with numbers stored on the client, so, until these are exhausted (the lows), no additional query to retrieve the next high will be made.

GUIDs (uniqueidentifier in SQL Server) have the disadvantage of taking too much space (16 bytes) and can cause index fragmentation, which you want to avoid.

Use Find Instead of Where, FirstOrDefault, etc

Both Find and Where, FirstOrDefault, SingleOrDefault, etc, methods locate an entity with a given primary key, if it exists. The advantage Find has is, if the entity with the id has already been loaded by the context (first level cache), then it is returned immediately, no query is sent to the database. Now, this may or may not be what you want, for example, the entity's record may have changed in the database after the entity was loaded.

//using Find
var post = ctx.Posts.Find(1);

//using SingleOrDefault
var post = ctx.Posts.SingleOrDefault(x => x.PostId == 1);

Use Asynchronous Methods

Asynchronous methods don't necessarily speed things up, they actually can even slow things down, but they are good for scalability, meaning, they help in serving more requests at the same time. Make sure you use the asynchronous versions of every terminal operation (CountAsync(), ToListAsync(), FirstAsync(), FirstOrDefaultAsync(), SingleAsync(), SingleOrDefaultAsync(), etc, and also the bulk operations ExecuteDeleteAsync() and ExecuteUpdateAsync()), and always pass a CancellationToken, so that if your web request is aborted, EF Core aborts the database operation too. Read more about this here: https://learn.microsoft.com/en-us/ef/core/miscellaneous/async.

Choose the Best Table Inheritance Strategy

One last advise: if you're using table inheritance, make sure you chose the right strategy, as it can greatly impact your querying. I wrote a blog post about it here: https://weblogs.asp.net/ricardoperes/table-inheritance-with-ef-core. Also read the Microsoft documentation on the subject here: https://learn.microsoft.com/en-us/ef/core/modeling/inheritance.

Conclusion

These are just some tips that hopefully will get you started. Of course, nothing can replace good architectural design and good programming practices that take performance into consideration, and also a good database design and tuning. Use the database profiler tools that are available to identify the bottlenecks and then make adjustments as necessary. But, first, always establish a baseline and measure all your "improvements", otherwise you won't know the real gains, and may actually be going backwards - BenchmarkDotNet is your friend! As always, always looking forward to hearing from you!

Very nice article, I enjoyed reading your post, very nice share, I want to twit this to my followers. Thanks!

Walking Dead Rick Jacket - Friday, July 5, 2024 6:45:06 AM

Your piece was interesting, and I would want to share it with my followers on Twitter because of how well written it is. Much obliged.

geometry dash - Friday, July 26, 2024 2:43:00 AM

This is a fantastic post with valuable insights into optimizing EF Core performance! The tips provided are clear and practical, making them easy to implement for developers looking to enhance their applications. Thanks for sharing such helpful information!

lilyrozy - Thursday, October 24, 2024 7:20:27 PM