I recently ran into an issue with Entity Framework Core.
The query I was running was constantly timing out.
I tried to scale up the application server, and it didn't help.
I tried to scale up the database server, and it didn't help.
So how did I solve the problem?
I'm working on an application in the e-commerce domain. To be specific, it's an order management system for a kitchen cabinet manufacturer.
The table that I frequently query on is the
Order can have one or more
Order will contain 50
LineItems have a table that contains the valid dimensions -
This is the query I was trying to run:
dbContext .Orders .Include(order => order.LineItems) .ThenInclude(lineItem => lineItem.Dimensions) .First(order => order.Id == orderId);
When EF Core converts this into SQL, this is what it will send to the database:
SELECT o.*, li.*, d.* FROM Orders o LEFT JOIN LineItems li ON li.OrderId = o.Id LEFT JOIN LineItemDimensions d ON d.LineItemId = li.Id WHERE o.Id = @orderId ORDER BY o.Id, li.Id, d.Id;
In most cases, this query will execute just fine.
However, in my situation I was running into the problem of Cartesian Explosion.
This is mainly because of the join to the
And this is what's causing my query to fail, and time out.
So how did I solve this problem?
With the release of EF Core 5.0 we got a new feature called Query Splitting.
This allows us to specify that a given LINQ query should be split into multiple
To use Query Splitting, all you need to do is call the
dbContext .Orders .Include(order => order.LineItems) .ThenInclude(lineItem => lineItem.Dimensions) .AsSplitQuery() .First(order => order.Id == orderId);
In this case, EF Core will generate the following SQL queries:
SELECT o.* FROM Orders o WHERE o.Id = @orderId; SELECT li.* FROM LineItems li JOIN Orders o ON li.OrderId = o.Id WHERE o.Id = @orderId; SELECT d.* FROM LineItemDimensions d JOIN LineItems li ON d.LineItemId = li.Id JOIN Orders o ON li.OrderId = o.Id WHERE o.Id = @orderId;
Notice that for each
Include statement we have a separate
The benefit here is that we are not duplicating data when fetching from the database,
as we were in the previous case.
You can enable Query Splitting at the database context level.
When configuring your database context you need to call the
services.AddDbContext<ApplicationDbContext>(options => options.UseSqlServer( "CONNECTION_STRING", o => o.UseQuerySplittingBehavior( QuerySplittingBehavior.SplitQuery)));
This will cause all queries that EF Core generates to be split queries.
To revert back to a single query, you need to call the
dbContext .Orders .Include(o => o.LineItems) .ThenInclude(li => li.Dimensions) .AsSingleQuery() .First(o => o.Id == orderId);
Although query splitting is an excellent addition to EF Core, there are a few things you need to be aware of.
There is no consistency guarantee for multiple SQL queries. You may run into a problem if you have a concurrent update going through at the same time when you query your data. To mitigate this, you can wrap the queries inside of a transaction, but this will only introduce performance issues elsewhere.
Each query will require one network round trip. This can degrade performance if your latency to the database is high.
Now that you are armed with this knowledge, go and make your EF queries faster!