Tag Archives: Performance optimization

Indexes in Sql Server – Part I

If you need your query to run faster, then you must know about the indexes. In this series of articles on Indexes, I will walk you through Sql Server Indexes with examples and explain how Sql Server Indexes come handy to resolve query performance issue. This series of articles will be helpful for both Sql beginners and advance users. I will try to keep this series of articles on Indexes as simple as possible by keeping in mind the beginner audience.

In this Part-I article of the series of articles on Sql Server Indexes, I will explain how to enable the execution plan, IO and Time statistics. It will also cover HEAP tables and problems while querying data from it.

The main purpose of Sql Server indexes is to facilitate the faster retrieval of the rows from a table. Sql Server indexes are similar to indexes at the end of the book whose purpose is to find a topic quickly.

To understand Sql Server Indexes with example, let us create a Customer table as shown in the below image. Execute the following script to create the Customer table with sample 100,000 records.

--Create Demo Database
CREATE DATABASE SqlHintsIndexTutorial
GO
USE SqlHintsIndexTutorial
GO
--Create Demo Table Customer
CREATE TABLE dbo.Customer (
    CustomerId INT IDENTITY(1,1) NOT NULL,
    FirstName VARCHAR(50), LastName  VARCHAR(50),
    PhoneNumber VARCHAR(10), EmailAddress VARCHAR(50),
    CreationDate DATETIME
)
GO
--Populate 1 million dummy customer records
INSERT INTO dbo.Customer (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 1000000 REPLACE(NEWID(),'-','') FirstName, REPLACE(NEWID(),'-','') LastName, 
    CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10))
	PhoneNumber,
    REPLACE(NEWID(),'-','') + '@gmail.com' EmailAddress,     
    DATEADD(DAY,CAST(RAND(CHECKSUM(NEWID())) * 3650 as INT) + 1 ,'2006-01-01') CreationDate
FROM sys.all_columns c1
        CROSS JOIN sys.all_columns c2
GO
--Update one customer record with some know values
UPDATE dbo.Customer
SET FirstName = 'Basavaraj', LastName = 'Biradar', 
    PhoneNumber = '4545454545', EmailAddress = 'basav@sqlhints.com'
WHERE CustomerId = 50000
GO

Enable Execution Plan

Enable the display of query execution plan in Sql Server Management Studio (SSMS) by clicking on the Actual Execution Plan option as shown in the below image. You can as well enable/disable the Actual Execution Plan by pressing the key stroke CTRL + M.

Enable Actual Execution Plan

What is execution plan?

Beginners may be wondering what is this execution plan? In Sql Server execution plan is nothing but the visual representation of steps or operations which Sql Server performs internally to execute a query and return the result.

Enable the display of IO and Time Statistics for a Query

Execute the following statements to enable both IO and Time statistics in one statement.

SET STATISTICS IO,TIME ON

The SET STATISTICS IO ON statement displays the disk activity performed to execute the Sql query. In Sql Server table data is stored in 8 KB data pages on the disk. Whenever we try to read the data from a table, the Sql Server query engine first checks whether data page is already in memory. If the page is already in memory then sql uses that, this operation is shown as logical read in SET STATISTICS IO output of the query. If sql doesn’t find the data page in the memory then it reads it from the disk, this operation is shown as physical read in SET STATISTICS IO output of the query. Both logical and physical reads of a data page is a costly operation. A query should have minimal page reads. The SET STATISTICS TIME ON output of the query shows the time taken by the query to complete the execution. In the following sections, I will explain both these settings results with an example. It will be more clear once you go through the examples in the following sections.

Execute the following query to get the details of a Customer whose CustomerId is 50000.

SELECT * 
FROM dbo.Customer WITH(NOLOCK) 
WHERE CustomerId = 50000

From the above result, we can see that we have one customer record in the Customer table with CustomerId as 50000.

Let us now go to the Execution Plan tab of the result and see the execution plan of the query.

From the above execution plan we can see that Sql Server is doing the table scan of the Customer table. Table scan means Sql server is reading all the data pages and rows of the table to find the records in the customer table. Even after finding the first record with CustomerId = 50000, Sql server will not stop searching till it reads the last row as it doesn’t know that there is only one record with CustomerId = 50000 unless it reads the last row.

Let us now go to the Messages tab and see the IO and Time Statistics.

From the above IO and Time statistics of the query, we can see that it is reading 18,182 data pages. Each data page is of 8KB size, so to get one Customer record Sql server is reading 142 MB of data. And it is taking 250 millisecond CPU time.

From the above result we can observe that Sql Server is doing lot of IO and consuming CPU resource to fetch just one Customer record. If you see such things in your environment on any transactional table then there is something terrible wrong. You should immediately solve such problems. next sections will guide you through how such problems can be solved using Indexes.

HEAP Table

A table without a Clustered Index is called as a Heap Table. Frankly speaking we should never have a HEAP table in an Online Transaction Processing System (OLTP). There are some .01% edge case scenario where we may go for heap table if we need faster DML (INSERT, UPDATE, DELETE) operations. If we see any transactional table without a Clustered index then you can assume the table is badly designed.

The Customer table at the current state is a HEAP table, as it doesn’t have any Clustered index. In the above execution plan we have seen that Sql Server is doing a Table Scan, we see this operation of ” Table Scan” only for the HEAP Table. In case of a Table without a Clustered index , table data is stored in a heap structure. It means data is stored in an unordered way.

What is the Solution for the above query problem where it is reading 142 MB of data and using a quarter of a second to return just one customer record? Clustered Index is the Solution for the above query problems. In the next article in this series of articles on Indexes I will explain how Clustered Index solves these problems. Will post this article on Clustered Index next weekend, till then bye and be safe.

Query Store in Sql Server 2016

Introduction to Query Store

Many a time we come across a scenario where suddenly in production without any release or changes some query which was working perfectly alright till yesterday is taking too long to execute or consuming lot of resources or timing out.

Most of the times such issue are related to execution plan change (commonly referred as Plan Regression). Till yesterday the query was running fine as it was running with good cached execution plan and today a bad plan is generated and cached. Because of this bad cached plan the query which was executing perfectly alright suddenly starts misbehaving.

To identify and fix such performance problems due to the execution plan change the Query Store feature introduced in Sql Server 2016 will be very handy.

Query Store basically captures and stores the history of query execution plans and its performance data. And provides the facility to force the old execution plan if the new execution plan generated was not performing well.

To understand Query Store with an example let us create a demo database SqlhintsQSDemo and a Customer table having sample data as shown in the following image by executing the following statement:

Customer Table for Query Store Demo

Script:

--------Create Demo database--------
CREATE DATABASE SqlhintsQSDemo
GO
USE SqlhintsQSDemo
GO
--------Create a Customer Table------
CREATE TABLE dbo.Customer( 
 Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
 FirstName NVARCHAR(50), LastName NVARCHAR(50))
GO
--Populate 100,000 customers with unique FirstName 
INSERT INTO dbo.Customer (FirstName, LastName)
SELECT TOP 100000 NEWID(), NEWID()
FROM SYS.all_columns SC1 
	CROSS JOIN SYS.all_columns SC2
GO 
--Populate 15000 customers with FirstName as Basavaraj
INSERT INTO dbo.Customer (FirstName, LastName)
SELECT TOP 15000 'Basavaraj', NEWID()
FROM SYS.all_columns SC1
        CROSS JOIN SYS.all_columns SC2

Execute the following statement to create a non-clustered index on the FirstName column of the customer table and the stored procedure GetCustomersByFirstName to get the customer details by FirstName

-- Create non-clustered index on the FirstName column
CREATE INDEX IX_Customer_FirstName on dbo.Customer (FirstName)
GO
-- Create stored procedure to get customer details by FirstName
CREATE PROCEDURE dbo.GetCustomersByFirstName
(@FirstName AS NVARCHAR(50))
AS
BEGIN
	SELECT * FROM dbo.Customer 
	WHERE FirstName = @FirstName
END

Enabling Query Store

Query store is a database level feature, which is disabled by default. Follow the following steps to enable it by using Management Studio

Step 1: Right click on the demo database SqlHintsQSDemo and select the Properties menu option as shown in the below image

Enable Query Store DB Properties window

Step 2: Step 1 Pops-up the properties window. In the properties window from the left navigation options select the new option Query Store. Below is the view of the properties window after selecting the Query Store option.

Query Store Properties window

Step 3: From the Query Store option window of Step 2 we can see that the Operation Mode (Requested) setting has three different options. Selecting the ReadWrite option will enable the Query Store and starts capturing the query execution data such as execution plan and it’s performance stats. Selecting Read option will only allow the data to read from the query store but no new query data is captured.
Below image shows the various query store options after selecting the ReadWrite Operation Mode. To know each of these options select the option it will display the details about it below.

Query Store Properties window Post Selecting ReadWrite

Query store captured data is stored in the respective query store enabled database. As we can see from the above image 100 MB is reserved for query store data in the SqlhintsQSDemo database. This value can be changed by changing the Max Size (MB) option value.

Click OK after selecting the desired query store options, it will enable the query store. And if we refresh the SqlhintsQSDemo database in the object explorer we can see the Query Store folder as shown in the below image

Query Store folder in the object explorer

Now query store feature is enabled let us understand how it comes handy in resolving performance issues

Example Usage of Query Store feature to resolve Performance issue

This example explains the scenario where suddenly in production without any release or changes some query which was working perfectly alright, starts taking too long to execute or consuming lot of resources or timing out. And how the query store comes handy in such situation

To explain this we will use the stored procedure GetCustomersByFirstName to fetch the records from the Customer table based on FirstName. And the Customer table has unique FirstName for all the customers except for the FirstName ‘Basavaraj’. There are 15,000 customers with FirstName as ‘Basavaraj’, so fetching a record where CustomerName is ‘Basavaraj’ will be better with table scan (i.e. clustered index scan) than using the non-clustered index on FirstName. But fetching the Customer records where FirstName value is anything other than ‘Basavaraj’, it makes sense to use the non-clustered index on FirstName. In this case where the majority queries will be better of executing by using the Non-clustered index on FirstName

Let us execute the following statement to get the details of the Customer whose FirstName is Xyz. While executing the below statement select the “Include Actual Execution Plan” option.

EXEC dbo.GetCustomersByFirstName @FirstName = N'Xyz'

Result:
Query doing Non Clustered Index Seek

From the execution plan we can see that the Stored Procedure is using the Non-clustered index seek on the FirstName column to fetch the Customer details, which is the optimal option too.

From the object explorer select the “Top Resource Consuming Queries” option as shown in the below image to view the query plan and the performance stats capture for the above SP execution in the Query Store

Top Resource Consuming Queries Option

Below is the Query Store view of the “Top Resource Consuming Queries”. Here we can see that Plan Id 6 is corresponding to the above stored procedure execution which was using Non-Clustered Index Seek. Mouse over on the bar corresponding to the query which got executed by the SP GetCustomersByFirstName shows that it is executed ones, has one execution plan, it took 16129 micro seconds and query id is 6.

Qury Store After First Execution of the Query

Now the execution of the GetCustomersByFirstName will always use the Non-Clustered index on the FirstName column, as the first time execution of it cached this Non-Clustered Index Seek plan. So as long as this plan is cached the query where we need to get the Customer data by FirstName will always execute optimally. The only way this cached query plan is removed from the cache if the server is re-started or the memory pressure on the server etc. We can also execute the following DBCC statement to remove all the cached plans from the database server

DBCC FREEPROCCACHE

After executing the above DBCC statement all the cached plans are removed from the cache. Now try executing the same stored procedure i.e. GetCustomersByFirstName but this time pass the @FirstName parameter value as Basavaraj instead of Xyz. While executing the following statement select the “Include Actual Execution Plan” option.

EXEC dbo.GetCustomersByFirstName @FirstName = N'Basavaraj'

Result:
Query doing Clustered Index San

From the result we can see that instead of using Non-Clustered Index on FirstName, it is doing the Clustered Index Scan. The reason it is doing clustered index scan is, there are 15,000 records whose FirstName is ‘Basavaraj’ and there are no previously cached plans. It makes sense for the Sql Server to use Clustered Index scan, instead of using a Non-Clustered index seek which uses bookmark look-up for each of the record. Now this plan is cached and each time the SP: GetCustomersByFirstName is executed it will use the clustered index scan

Below is the view of the Query Store after executing the SP: GetCustomersByFirstName with @FirstName parameter value as ‘Basavaraj’. You need to refresh the query store if it is already open. Here we can see a new plan with plan id 11 is generated which corresponds to the clustered index scan. So we can see that now we have two query plans corresponding to the query executed by the stored procedure GetCustomersByFirstName.

Query Store View with two plans for the Same query 1

Let us again execute the following statement to get the details of the Customer whose FirstName is Xyz. While executing the below statement select the “Include Actual Execution Plan” option.

EXEC dbo.GetCustomersByFirstName @FirstName = N'Xyz'

Result:
Using worst Cached Plan

From the result we can see that this time the stored procedure execution is using the Clustered index scan instead of the non-clustered index scan which it was doing previously. To get the customer details whose FirstName is ‘Xyz’, from the result we can see it is using the cached plan which is optimized/compiled for the @FirstName parameter value ‘Basavaraj’. This is how a perfectly working query starts misbehaving without any release or changes.

Below is the view of the Query Store after executing the SP: GetCustomersByFirstName with @FirstName parameter value as ‘Xyz’. You need to refresh the query store if it is already open. Mouse over on the bar corresponding to the query which got executed by the SP GetCustomersByFirstName shows that it is executed 3 times and has two execution plans i.e. 6 and 11 and plan 11 which is a clustered index scan is the current plan. In the below image we can clearly see than plan 11 is taking more time to execute compared to plan 6. So, in this case it is optimal to execute the query with plan 6 compared to plan 11.

Query Store View after  third execution

From the above query store view it is clear that plan 6 is the optimal plan for the execution of the stored procedure GetCustomersByFirstName compared to plan 11 as majority of the time the stored procedure is executed with a parameter value other than ‘Basavaraj’.

As shown in the below image query store gives an option to force the execution plan for the future execution of the queery. To force the plan select the plan 6 (i.e. light blue color circle) and then click on the Force Plan option.

Query Store force the execution plan 1

After forcing the plan 6 as shown in the above image. Now execute the GetCustomersByFirstName one more time with @FirstName parameter value as ‘Xyz’.

EXEC dbo.GetCustomersByFirstName @FirstName = N'Xyz'

Result:
Execution plan after forcing it by the query store

From the above result we can see that execution of the SP: GetCustomersByFirstName with @FirstName parameter value as ‘Xyz’ is doing a non-clustered index scan compared to clustered index which it was doing previously.

Below is the view of the Query Store after forcing the execution plan 6 and executing the SP: GetCustomersByFirstName with @FirstName parameter value as ‘Xyz’. We can see that plan 6 circle has a right tick mark, which means this is the force plan which now will not change even after server re-start. As shown in the below image adjacent to Force Plan option we have an Un-force Plan which an be used to un-force this forced plan.

Query store view post forcing the plan

New features in SQL SERVER 2016