Optimize .NET API Performance with SQL Server


Publish Date: May 13

In this article we're going to improve the performance of a .NET API application backed by a SQL Server database. We're going to use millions of rows to simulate a realistic, huge catalog of products, around 5 million of them. To make it even more interesting, we'll simulate 10-20 concurrent users browsing our website and searching for products.

When dealing with products in retail and e-commerce, customers need to filter by title, description, pricing, or a category such as shoes. Our goal is to ensure that even with a huge amount of data, our users get very good performance and a nice experience. These techniques also matter from a cost perspective: every time we are careless about queries, resource usage can skyrocket, and cloud providers like Azure or AWS tie pricing directly to the usage of those components.

First, I'll explain how you can set up this project on your local computer. Then I'll start with a very basic baseline project, without any optimizations, and we'll gather insights using k6, the load-testing tool made by Grafana. Our test scenarios will replicate customer actions like browsing products and filtering by name, description, or price, but we'll also test admin actions like removing a product from the catalog or creating and updating existing products.

Then, step by step, we will optimize our SQL database and C# logic to make sure we stay within our thresholds and limits. For each step we'll spend some time testing the outcome and analyzing the numbers.

Tech stack

Below is a list of tech stack and tools we’re going to use in this project:

  • .NET 9
  • SQL Server Express / Developer Edition
  • Entity Framework Core 9
  • Redis
  • k6 (Grafana) load-testing tool
  • Visual Studio 2022

Development

You can clone my GitHub repository at any time and check the final project here.

First, let's structure the solution as a clean architecture, splitting the project into layers like this:

The Domain project contains only the Products.cs class, which represents our SQL table and is used mainly by EF Core and the CRUD operations. I think it's always a good idea to have a domain layer and keep our core logic separate from the domain model.
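
For reference, here is a minimal sketch of what that entity could look like; apart from the properties used later in the article (Name, Description, Price, Category), the exact shape is my assumption, not the repository code:

// Domain/Products.cs - simplified sketch of the product entity
public class Product
{
    public int Id { get; set; }                 // primary key, used by EF Core
    public string Name { get; set; } = string.Empty;
    public string Description { get; set; } = string.Empty;
    public decimal Price { get; set; }
    public string Category { get; set; } = string.Empty;
}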

The Infrastructure project contains the SQL database context. Having a dedicated project for infrastructure concerns is important: it lets us swap, for example, one database for another type, and it makes it clear where the infrastructure lives.
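
As a rough sketch (class and namespace names here are assumptions), the context can be as simple as:

// Infrastructure/CatalogDbContext.cs - simplified sketch of the EF Core context
using Microsoft.EntityFrameworkCore;

public class CatalogDbContext : DbContext
{
    public CatalogDbContext(DbContextOptions<CatalogDbContext> options)
        : base(options) { }

    // Maps the Product entity to the Products table
    public DbSet<Product> Products => Set<Product>();
}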

The Seeder application is used to seed (import) data into our database. This is key for this project because we want to test a realistic scenario with a huge amount of data. The Seeder uses the database context defined in the Infrastructure layer to communicate with the database; that is this project's only responsibility.

The API project contains the core logic and the endpoints. It is worth spending more time on this project. The endpoints we're going to use are very basic CRUD operations that create, update, delete, and read an entity such as a product.
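
To give a rough idea of their shape (a sketch with assumed route and class names, not the exact repository code), a couple of those endpoints could look like this:

// API/ProductsController.cs - simplified sketch of two CRUD endpoints
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/products")]
public class ProductsController : ControllerBase
{
    private readonly CatalogDbContext _db;

    public ProductsController(CatalogDbContext db) => _db = db;

    [HttpGet("{id:int}")]
    public async Task<IActionResult> GetById(int id)
    {
        var product = await _db.Products.FindAsync(id);
        if (product is null) return NotFound();
        return Ok(product);
    }

    [HttpPost]
    public async Task<IActionResult> Create(Product product)
    {
        _db.Products.Add(product);
        await _db.SaveChangesAsync();
        return CreatedAtAction(nameof(GetById), new { id = product.Id }, product);
    }
}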

SQL Server will be used to store our entities permanently.

Initial Setup

If you clone my GitHub project, you first need to run the EF Core migrations so that the Products table is created in your SQL database.

Important: please do not use a Docker SQL Server database. Later in the article we will implement replication, and most Docker images of SQL Server don't provide this functionality. I strongly recommend using SQL Server Express or Developer Edition for this project.

Once you have a SQL Server instance installed, your connection string should look like this:


Server=localhost;Database=dbName;User Id=sa;Password=YourStrong!Passw0rd;TrustServerCertificate=True;MultipleActiveResultSets=true;


Ensure you have all the EF Core packages installed in the Infrastructure and API projects:

  • Microsoft.EntityFrameworkCore
  • Microsoft.EntityFrameworkCore.Design
  • Microsoft.EntityFrameworkCore.SqlServer

Next, run the migration commands from the Package Manager Console, making sure the Default project dropdown is set to Infrastructure, then run:


Add-Migration InitialCreate -StartupProject CatalogX.API


and then


Update-Database -StartupProject CatalogX.API


If there are no errors, your database has been created and you can log in to the SQL Server instance using SQL Server Management Studio:

Now we are ready to seed our 5 million products by running the Seeder application. The logic is very basic: we loop 5 million times with a FOR loop, creating batches of products and generating random data for each product (price, title, description, and category). Run the application and wait until it completes; this could take 5 to 10 minutes depending on your machine.
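
The loop itself is roughly like the sketch below; the batch size, value ranges, and context setup are my assumptions, and the repository's Seeder may differ:

// Seeder/Program.cs - simplified sketch of batched seeding
using Microsoft.EntityFrameworkCore;

var options = new DbContextOptionsBuilder<CatalogDbContext>()
    .UseSqlServer("your connection string here")
    .Options;

const int totalProducts = 5_000_000;
const int batchSize = 10_000;
var random = new Random();

for (var i = 0; i < totalProducts; i += batchSize)
{
    // Build one batch of products with random data
    var batch = Enumerable.Range(i, batchSize).Select(n => new Product
    {
        Name = $"Product {n}",
        Description = $"Description for product {n}",
        Price = (decimal)Math.Round(random.NextDouble() * 500, 2),
        Category = $"Category {random.Next(1, 500)}"
    }).ToList();

    // A fresh context per batch keeps EF Core's change tracker small
    await using var context = new CatalogDbContext(options);
    context.Products.AddRange(batch);
    await context.SaveChangesAsync();
}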

Done? Alright, now you should have 5 million fake products in your database, like this:

Optional: InfluxDB and Grafana setup

If you want visual graphs instead of plain numbers when testing, you need to set up InfluxDB, a time-series database that Grafana reads to visualize the data collected during the test run. Basically, every time an endpoint is hit and something happens, like a check passing or a request failing, we capture it in InfluxDB. Grafana then reads from this database and presents all the insights in a nice way.

To set up InfluxDB and Grafana, the easiest way is to use Docker images. You can use the Docker Compose YAML file from my GitHub page here:

compose-docker.yml

Note: the YAML file also enables Redis in your Docker environment. We will talk about Redis later in this article.

For testing we will use the k6 tool, so please install it from this page for your operating system.

Now we’re ready to start phase 1!

Phase 1 (Baseline – No Optimizations)

Before you start k6 testing, I would suggest first becoming comfortable with the API calls using Postman. This will give you context and make the k6 JavaScript in this chapter easier to follow.

Now we can dive into the actual testing with k6. k6 scripts are written in JavaScript, and the tool is very flexible and powerful when you need to simulate real-world scenarios and high traffic on your local computer. You can think of a k6 script as an automation of your Postman calls: instead of a nice UI and manual clicks, you drive everything with JavaScript code.

You can find the final JavaScript code here on my GitHub page.

The script simulates six different user-like scenarios (actually seven, I added another one later in this article…), with multiple virtual users calling the product API endpoints with query parameters or JSON payloads in random order. For example, the scenario that fetches products using pagination is defined like this:


paginationTest: {
    executor: 'constant-vus',
    exec: 'pagination',
    vus: 10,
    duration: '1m',
}


For this scenario we want to measure performance numerically, and the best way is to use percentiles. In these tests I focus on the p95 percentile, which tells us how the system behaves for 95% of users, with the remaining 5% allowed to exceed the limit we set. For the pagination test, for example, I expect this:


'http_req_duration{scenario:paginationTest}': ['p(95)<300']


This means that 95% of users should receive the paginated list of products in less than 300ms. I can tolerate the remaining 5% getting a response slower than 300ms.

Each exported function defines an API call and performs some basic logic: picking random IDs, accessing a random page, building the body payload for product creation, and so on. For pagination I decided to pick a page number between 1 and 500 with a page size of 20, meaning I want at most 20 products per page.


const page = Math.floor(Math.random() * 500) + 1;


For each call we want to verify the outcome and capture the result. This is where the check() function provided by k6 comes in handy. In our pagination test we check that the response has status 200 and that some products were returned.


check(res, {
    'pagination 200': (r) => r.status === 200,
    'has items': (r) => {
        if (r.status !== 200) return false;
        const body = r.json();
        return Array.isArray(body.data) && body.data.length > 0;
    },
});


The other tests follow similar logic, but each focuses on a specific endpoint.

So let's recap where we are. We have the API with its endpoints ready, a SQL Server instance with the Products table, and Docker containers running InfluxDB and Grafana. I hope you have also tried some API calls with Postman at this point. Now we're ready to execute our JavaScript with k6 using the command below from the Visual Studio 2022 Developer PowerShell:


k6 run --out influxdb=http://admin:adminpass@localhost:8086/k6 random-test.js


This command tells k6 to execute our random-test.js script and write all the results into the InfluxDB database.

As we can see, the threshold results look like this:

What these numbers tell us is that most of the GET requests are extremely slow. Not only are we outside the range we consider acceptable, we are far away from it. The worst one is SearchTest, and that is where I would like to focus the first optimization.

Phase 2 – SQL Server Full-Text Search

SQL Server has a very powerful feature called Full-Text Search. You can read the official Microsoft documentation, but let me briefly explain what it does.

SQL Server Full-Text Search is a tool that helps you quickly find words or phrases inside large text fields, like product descriptions, articles, or documents.

It would take a whole separate post to explain in detail how Full-Text Search works behind the scenes. If you want to dig deeper, check this article on sqlhack.com.

Right now, the API GET products endpoint does the following:


if (!string.IsNullOrEmpty(queryParams.Search))
{
    query = query.Where(p => p.Name.Contains(queryParams.Search) ||
    p.Description.Contains(queryParams.Search));
}


This translates into SQL along the lines of:


WHERE Name LIKE '%searchTerm%' OR Description LIKE '%searchTerm%'


This results in a very slow query because the leading wildcard prevents any index from being used, so SQL Server scans every row in the table looking for the search term.

Now let's switch to the Full-Text Search feature. First we need to create a full-text catalog and index in our database with a script like this:


-- Create a full-text catalog if it doesn’t exist
IF NOT EXISTS (SELECT * FROM sys.fulltext_catalogs WHERE name = 'ProductCatalog')
BEGIN
    CREATE FULLTEXT CATALOG ProductCatalog AS DEFAULT;
END
GO

-- Create a full-text index on the Products table
IF NOT EXISTS (SELECT * FROM sys.fulltext_indexes WHERE object_id = OBJECT_ID('dbo.Products'))
BEGIN
    CREATE FULLTEXT INDEX ON dbo.Products(Name, Description)
    KEY INDEX PK_Products -- Replace with the actual name of your primary key index
    ON ProductCatalog;
END
GO


Then we can use the feature by changing our C# code from this:


 if (!string.IsNullOrEmpty(queryParams.Search))
 {
     query = query.Where(p => p.Name.Contains(queryParams.Search) ||
     p.Description.Contains(queryParams.Search));
 }


to this:


if (!string.IsNullOrEmpty(queryParams.Search))
{
    var quotedSearch = $"\"{queryParams.Search}\"";
    // This leverages SQL Server's CONTAINS function behind the scenes
    query = query.Where(p => EF.Functions.Contains(p.Name, quotedSearch)
                             || EF.Functions.Contains(p.Description, quotedSearch));
}


EF.Functions.Contains changes the generated SQL to something like this:


SELECT * FROM Products WHERE CONTAINS(Description, 'some description');


Let's run the tests again and check the SearchTest thresholds:

We went from 40s to 1.3s. Not bad! The threshold still fails, since I set it to less than 500ms, but considering 5 million products, 95% of users now get their results in under 1.3s. Quite impressive!

If you remember our first run, the second-worst p95 timing came from the category filtering test. Let's focus on that now.

Phase 2 – Non Clustered Indexing

Imagine a user who wants to see the products in a specific category. In our C# code, the query happens here:


if (!string.IsNullOrEmpty(queryParams.Category))
    query = query.Where(p => p.Category == queryParams.Category);


This translates into a database query like this:


SELECT * 
FROM Products 
WHERE Category = @Category;


This means the database engine scans the whole table looking for products whose category equals @Category. That is usually bad for performance, especially when the row count is huge. What we want instead is a non-clustered index on the columns we know in advance will be queried by end users.

What is a non-clustered index? You can think of it as the table of contents of a book: it stores pointers used to quickly locate the data. In most cases it is implemented as a B-tree, a structure that is extremely popular in relational databases and well worth reading about in more detail.

So let's create a non-clustered index on the Category and Price columns, including Name as a covered column so we can return the product name faster when needed.


CREATE NONCLUSTERED INDEX IX_Products_Category_Price ON dbo.Products(Category, Price) INCLUDE (Name);


Let's run the test again and compare.

Before creating the non-clustered index:

and after:

We improved performance roughly tenfold for 95% of users. That is quite impressive for a single non-clustered index.

Indexes can improve read performance, but they consume additional storage and memory, and they can slow down writes because every insert, update, and delete also has to maintain the index. So be mindful when adding them.

Phase 3 – Redis caching

The endpoint that filters products and queries the database implements pagination. Without pagination it would be practically impossible to run this kind of query: with this much data the products would never come back in time, and .NET would abort the call because SqlCommand applies a command timeout (30 seconds by default), resulting in timeout errors.
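
The pagination itself is most likely the familiar Skip/Take pattern in EF Core; roughly something like this (a sketch, not the exact repository code):

// Paginated query over the filtered products (requires Microsoft.EntityFrameworkCore)
var page = queryParams.PageNumber;   // 1-based page index
var size = queryParams.PageSize;     // e.g. 20 products per page

var products = await query
    .OrderBy(p => p.Id)              // stable ordering so OFFSET/FETCH is deterministic
    .Skip((page - 1) * size)
    .Take(size)
    .AsNoTracking()
    .ToListAsync();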

Even with pagination, however, performance is still not great. Imagine you run an e-commerce website where some products are very popular, or a top-seller category that almost every customer opens. Does it make sense to query the database and serialize the result to JSON every single time? Of course not. A much better solution is to use some kind of caching.

A cache is a fast, in-memory (RAM) store. It is key-value based, meaning you can access a piece of content very quickly if you know its exact key. If you are familiar with the C# dictionary, it is the same principle.

There are several benefits to using a caching system:

  • improved performance
  • reduced database load
  • lower latency

Although there are many tools we could use, the most famous and popular is Redis. It is an in-memory key-value database that integrates very well with the .NET ecosystem. In the setup section we mentioned that Redis is enabled in Docker through the Compose file, so you should already have it up and running.

I won't explain how to register Redis in .NET and set up the connection string, because you can simply clone my GitHub project or follow the Redis documentation for .NET.
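
For context, with StackExchange.Redis the registration typically boils down to a single line in Program.cs (the connection string value below is an assumption, adjust it to your Docker setup):

// Program.cs - share one Redis connection across the application
using StackExchange.Redis;

builder.Services.AddSingleton<IConnectionMultiplexer>(_ =>
    ConnectionMultiplexer.Connect("localhost:6379"));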

So how are we going to use Redis in our scenario?

We assume that some categories and pages are more popular than others. Say our most popular products sit on page 1 and belong to Category 1; this means we know exactly what the cache key should look like.


var cacheKey = $"products:adv:pg={queryParams.PageNumber}:sz={queryParams.PageSize}"
         + (string.IsNullOrEmpty(queryParams.Category) ? "" : $":cat={queryParams.Category}")
         + (string.IsNullOrEmpty(queryParams.Search) ? "" : $":q={queryParams.Search}");


Even if the catalog changes and other products become the most popular, this approach still works well. Every time a customer requests a given page and category, the key and its value are saved in the cache for later use. If the request is completely new, the key doesn't exist yet, we miss the cache, and we have to read from the SQL database (this is called a cold cache).

If the same request was performed before, we read from Redis instead:


var cached = await db.StringGetAsync(cacheKey);
if (cached.HasValue)
{
    _logger.LogInformation("Cache hit for {Key}", cacheKey);
    return Content(cached, "application/json");
}

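On a cache miss we fall through to SQL Server, store the serialized page in Redis, and return it, so the next identical request is served from memory. Roughly like this (a sketch; using System.Text.Json and a 5-minute expiry are my assumptions, not values from the repository):

// Cache miss: query SQL Server, cache the serialized page, return it
var items = await query
    .OrderBy(p => p.Id)
    .Skip((queryParams.PageNumber - 1) * queryParams.PageSize)
    .Take(queryParams.PageSize)
    .AsNoTracking()
    .ToListAsync();

var json = JsonSerializer.Serialize(items);   // System.Text.Json

// A bounded expiry means stale entries disappear eventually even without invalidation
await db.StringSetAsync(cacheKey, json, TimeSpan.FromMinutes(5));

return Content(json, "application/json");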

This is our current implementation with Redis in a sequence diagram:

Cold Cache

Warm Cache (Hot Cache)

But there is a gap in our flow. What happens if we add, update, or delete a product in the database? The cached data would become inconsistent: customers would keep receiving the old products, because the key doesn't change and the results are still read from Redis even after the database is updated.

To avoid this, we invalidate the cache every time the catalog changes. Say we insert a new product through the POST endpoint in the products controller: all keys related to products are deleted from Redis, and the next paginated request will add a fresh key and value with the most up-to-date content.


var server = _redis.GetServer(_redis.GetEndPoints()[0]);
foreach (var key in server.Keys(pattern: "products:adv:*"))
{
    await _redis.GetDatabase().KeyDeleteAsync(key);
}


The same logic needs to be added to the UPDATE and DELETE endpoints.
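
One way to avoid repeating the loop (a sketch with assumed names) is to pull it into a small helper that the POST, PUT, and DELETE actions all call:

// Deletes every cached product page; call this after any catalog change.
// server.Keys() iterates with SCAN on modern Redis servers, so it does not block.
private async Task InvalidateProductCacheAsync()
{
    var server = _redis.GetServer(_redis.GetEndPoints()[0]);
    var db = _redis.GetDatabase();

    foreach (var key in server.Keys(pattern: "products:adv:*"))
    {
        await db.KeyDeleteAsync(key);
    }
}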

OK, so now we have an in-memory cache, data that stays consistent when the catalog changes, and reduced database load. This last point is very important because it allows companies to reduce infrastructure costs: database load is usually tied directly to pricing on Azure or AWS.

Let's do some tests.

To test the caching system we need to tweak our k6 scenarios a little. First, let's add this scenario:


cacheTest: {
    executor: 'constant-vus',
    exec: 'cacheTest',
    vus: 5,
    duration: '50s',
    startTime: '0s',
}


As you can see, duration and startTime are 50s and 0s respectively. The idea is to start cacheTest before the CREATE/UPDATE/DELETE scenarios run, so the cache isn't invalidated in the middle of the test.

Let's implement the function that tests the paginated endpoint, hitting it with the same pageNumber, pageSize, and Category every time.


export function cacheTest() {
    const url = `${API_BASE}`
        + '?PageNumber=1'
        + '&PageSize=20'
        + '&Category=Category%201';

    const res1 = http.get(url, { tags: { scenario: 'cacheTest' } });

    check(res1, {
        'cacheTest first fetch status 200': (r) => r.status === 200
    });

    sleep(0.2);

    const res2 = http.get(url, { tags: { scenario: 'cacheTest' } });
    check(res2, {
        'cacheTest second fetch status 200': (r) => r.status === 200,
        'cacheTest fast response <50ms': (r) => r.timings.duration < 50
    });

    sleep(0.2);
}


The function always hits the same endpoint with parameters like:

PageNumber=1&PageSize=20&Category=Category%201

and we check two scenarios:

  1. The cold-cache request, which hits the database and is expected to have high latency (around a second)
  2. The Redis-served request, which should have very low latency, around 50-100ms

Let's run the tests and go through the results together:

Let's compare paginationTest and cacheTest; these two tests give us an idea of what is happening after the caching implementation.

Pagination latency is still quite high. That's because the pagination test always hits different pages and filter parameters, so it only ever sees a cold cache. In practice this means that users browsing random, unpopular products still wait a while for results.

For cacheTest, however, the situation improved considerably. An important indicator is that the average request time dropped under 100ms. The maximum is still high, 1.08s, which is the cold cache: the first request has to hit the database. Over the one-minute test the p95 is under 300ms, meaning 95% of users receive the products in under 300ms, which is good and acceptable. For pagination alone, 95% of customers would still have to wait almost 2 seconds.

Conclusion

In this post we introduced important concepts, techniques, and tools that can help if you are concerned about the performance of your system or if your traffic keeps growing.

These topics matter especially in distributed systems, where performance, latency, throughput, and data consistency are crucial to keep the system scalable and to maintain acceptable performance even as traffic and data load increase.

What next?

Even though we have improved performance, there is still room to make the system better. Replication, for example, can be very beneficial if your database takes a lot of writes, or if you want to distribute your data across different nodes and protect it against failures, keeping the system reliable even when errors occur. Replication can also increase performance, because we could split reads and writes across two databases: one used only for reading, the other only for writing.
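
As a rough sketch of what that split could look like in .NET (the connection string names and the read-only context class are assumptions, not existing project code):

// Program.cs - route writes to the primary and reads to the replica
using Microsoft.EntityFrameworkCore;

builder.Services.AddDbContext<CatalogDbContext>(o =>
    o.UseSqlServer(builder.Configuration.GetConnectionString("Primary")));

// Hypothetical second context class over the same model, used only for queries
builder.Services.AddDbContext<ReadOnlyCatalogDbContext>(o =>
    o.UseSqlServer(builder.Configuration.GetConnectionString("ReadReplica"))
     .UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking));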

SQL Server also supports partitioning. Partitioning lets us physically divide a table into smaller parts: in our case we could split the Products table by ProductID ranges while logically it remains a single table. This can speed queries up even further, because each query only has to touch specific partitions.

Happy coding! :)
