Entity Framework – improve query performance – limit roundtrips to database

Introduction

In the first post, I described query execution steps that can be potentially improved when the query is executed. In this third article, I will focus on the second phase named “Request to a database”. I will show you how to solve the following problem:

“Each create and update action send a SQL query in a separate round-trip. If your application requires 30ms to connect to a database and you create for example ten new objects, then application spends just 300ms on connecting to a database. The same problem exists for every read query that you make.”

Initially, I described only the issue with latency, but now I know that some batch operations can be faster processed by the SQLServer Server.

The only way to limit the number of roundtrips is to send many operations during one roundtrip to the database. There is no built-in mechanism in the Entity Framework to group queries and send them during one roundtrip to a database, but there are two external libraries that can help us with it:

  1. Entity Framework Plus – it’s a free library which has a “Query Future” mechanism that can help with improvements on a read side
  2. Entity Framework Extensions – it’s a paid library with a mechanism that can improve create and update side.

My entire series describe improvement with the Entity Framework 6, but It’s worth to mention that both described libraries support Entity Framework Core 2.x too.

Continue reading “Entity Framework – improve query performance – limit roundtrips to database”

Choosing technology – useful questions from architects perspective

Over a week ago, a friend of my asked me to review their proposal to the project they wanted to participate. They had some three options, and they couldn’t decide which one is the best to propose to their potential a new client. I received the document, and we had some short discussion about project, company and client. When I started revving this deck, I was reading it with great pleasure because it was a typical technical document, unfortunately only a technical document…

Continue reading “Choosing technology – useful questions from architects perspective”

Entity Framework – improve query performance – query compilation

Introduction

In the first post, I described query execution steps that can be potentially improved when the query is executed. In this article, I will focus on the first phase named “query compilation”. I will show you how to solve the following problem:

“When the query is executed the first time, internal plan compiler compiles the query and stores it in a query plan cache. When the query is executed the second time, the compiled query should be taken from this cache. If it does not exist, then compilation must be executed once again. Unfortunately, if you have some specific syntax in your LINQ query, then the query compilation phase is made with every query execution. Microsoft claims that this stage has a medium impact on performance but from my perspective, it has an extremely high influence on CPU and memory resources.”

Cause of issues

There are mainly two reasons why query compilation phase may be executed too often. If it’s executed every time for a specific query then based on my experience, you probably use IEnumerable<>.Contains() method in your query, if it’s not executed with every call but still often then probably you use constants in your query.

There is one more case when query can be compiled every X amount of time but this only exists in a really big application when throughput is very high. If you have such kind of application and you noticed a compilation issue even if you are sure that queries were created correctly, then this can indicate that query eviction algorithm has been executed. By default EF can store only 800 elements in a cache when this threshold is exceeded then clean is executed but there is a way to increase the number of the cached element. There is a great article describing it: https://blogs.msdn.microsoft.com/premier_developer/2017/03/23/entity-framework-query-caching/

IEnumerable<>.Contains()

IEnumerable<>.Contains() – EF has a rule that he doesn’t cache queries containing Contains() method in the expression. He has such rule because.Contains() is used with the collection and is hard to predict the number of elements in your collection. In our project to solve this problem, we made a few extensions methods with our custom implementation of the Contains() and not Contains()., the code is available here: https://gist.github.com/adamlepkowski/f8073dc45de498d4f7c6299f40dcd519. That extension will not work in every scenario that you can imagine building EF query but they may be very handy.

If you use a default implementation of the contains method:

The generated SQL looks like this:

Now let’s change our LINQ query to use our custom extensions method:

Then, the generated SQL statement looks like this:

As you can notice SQL queries are a bit different, in the first query elements from a collection are put inside a SQL query and in the second query collection elements are parameters of the query so that values can be easily replaceable and the same execution plan is used. Extensions that we developed work but it have some limitation regarding a caching strategy – cache is made for a specific amount of elements in the array. It means that if you execute a query twice with the same amount of elements in the collection, then the second query will not be compiled because has the same amount of elements as the collection from the first query had.

Constants

Let’s take a look at the sample LINQ query:

var contacts = dbContext.Contacts.Where(x => x.Name == "Adam").ToList();

The generated SQL looks like this:

As you can see the name that we used for filters is inside the SQL query as a constant. So, let’s change our LINQ query a bit:

Then SQL query looks like this:

Now the query is completely different there is no static ‘Adam’ value inside SQL query but there is a parameter. If we decide to run the second query with a different name than ‘Adam’ then the query plan will be reused. If you try to execute the first query multiple times in different places of your application, then it will be reused only when the constant value is the same.

Provided example is pretty straightforward but It should let understand how to recognise potential problematic queries and analyse a SQL query to find an issue. Interesting is that some LINQ extensions methods use constants instead of variables by default (Skip() and Take()). Those two are mainly used to make paging, and if you run a query to retrieve the first page with 20 elements on the page then the query will be compiled with constants values if you change page size or page number query will be compiled once again. To solve this specific problem you can start using those two methods but from System.Data.Entity.QueryableExtensions namespace.

Problem recognition

The problem is mainly revealed by high CPU consumption or jumping of a CPU usage. If the issue exists in multiple queries then CPU consumption is high, if only a small percentage of your queries have this problem then CPU consumption increase when that problematic query is executed. When it comes to a memory consumption then allocation occur very often and Garbage collector is often fired.

There are three ways to find problematic queries and confirm a suspicion  that problem with query compilation exists:

  1. doTrace / ANTS Profiler / PerfView can be used, if you notice that GetExecutionPlan() method from EntityFramework namespace is often executed then it means that problem exists
  2. based on what I described further in a document you can open your favourite code editor and try to manually search for issues based on a syntax that can be problematic (Contains(), Skip(), Take())
  3. check you generated SQL queries using a SQL Profiler or any other tools to find queries that are not using a constants

Conclusion

I hope that after reading this article you know the impact of the query compilation on your application and you should know how to investigate and solve this problem occurring in your queries. The most important is to remember that this issue is mainly connected with a constant expression used in your LINQ query expression tree and you should always use a parameter / variable expression instead.

 

Entity Framework – Contains() on empty collection

Recently when I was doing a performance analysis I noticed an interesting query and I need to admit it was first time when I saw it.

My stackify profiler presented me the following SQL query:

The structure of this query can vary depending on what additional statements you use in your query (i.e. include, select) but they key part presented below always exists in every scenario.

FROM ( SELECT 1 AS X ) AS [SingleRowTable1]

It appeared that such SQL query is generated when you make a contains on an empty collection, in short your LINQ query looks like below:

To improve it you just need to verify whether your collection that you use to filter data has any elements. If has then make a query, if not then it doesn’t make any sense to execute a query in this scenario.

 

UPDATE (01/12/2017 – referring to Marek comment)
It’s tough to say whether you should introduce described change but based on my experience and scenarios I had to solve you should know about the following potential problems:

  • This kind of query is compiled whenever executed. When you have a few include statements, the query compilation phase can take even up to 500ms. You can read more about it here http://devmate.net/2017/10/entity-framework-the-query-plan-cache-story/
  • Additional roundtrip is made to a database – take a look at you network latency when communication is made to a database.
  • Database processing – when your DB is not busy then query execution time is very low, but when you have a very high load in a database, then execution time is much higher for such “dummy” query.

Entity Framework – the query plan cache story

Introduction

This post describes my adventure with investigation of performance issues in an application with Entity Framework as data access technology. I can just tell you that I have really improved my entity framework knowledge, especially compilation phase, because I had multiple cases where my queries were compiled multiple times and sometimes randomly. Continue reading “Entity Framework – the query plan cache story”