Tag Archives: Sql

How to find the list of all Pages that belongs to a Table and Index? Tip 4: Sql Server 101 Performance Tuning Tips and Tricks

I was writing an article on when and why we have to use included columns and wanted to explain it by showing how key column and included column values are stored in the Index Pages. That time, I got realized that first I need to explain how we can look into the Sql internal storage. Because of this I have decided to explain: How to find all the pages allocated to a Table and Index? How to identify Root, Intermediate and Leaf Pages of the Index? and How to inspect the Page data?. As a result of this realization I have written this and the next two articles. These articles will be referred in most of the articles in this series of 101 Performance Tuning Tips and Tricks.

From Sql Server 2012 onwards we have an un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result) dynamic management function sys.dm_db_database_page_allocations to get the list of pages associated with one or more Tables and Indexes. This function returns one row for each page associated with the Table/Index.

This dynamic Management function takes following 5 parameters/arguments

  • @DatabaseId: Database ID of the Database whose tables or indexes assocaited pages information is required. This is a mandatory required parameter. We can use the DB_ID() function to get the Database ID of the Database by Database Name
  • @TableId: Table Id of the table whose assocaited pages information is required. We can use the OBJECT_ID() functio to get the table id by table name. It is an optional parameter, if it is Passed as NULL then it returns the pages associated with all tables in the Database and when it is NULL then next two parameters (i.e. @IndexId and @PartionId) values are ignored
  • @IndexId: Index Id of the Index whose associated pages information we are looking for. We can use the sys.indexes catalog view to get the Index Id. It is an optional parameter, if it is passed as NULL then it returns the pages associated with all the indexes.
  • @PartitionId: Id of the partition whose associated pages information we are looking for. It is an optional parameter, if it is passed as NULL then it returns the pages associated with all the partitions.
  • @Mode: This is mandatory Parameter, it’s value can be either ‘LIMITED’ or ‘DETAILED’. ‘LIMITED’ returns less information. However, ‘DETAILED’ returns detailed/more information. Obivously the ‘DETAILED’ mode uses more resources.

Let us understand the usage of this Dynamic Management Function sys.dm_db_database_page_allocations with examples

To understand how we can use this DMF to identify the pages belonging to the Table and Index, let us create a Customer table as shown in the below image with 35K records. Execute the following script to Create the Customer Table with Clustered Index on the CustomerId column and Composite Non-Clustered Index on the FirstName and LastName Column with sample 35K records.

--Create Demo Database
CREATE DATABASE SqlHints101PerfTips4
GO
USE SqlHints101PerfTips4
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
	FirstName VARCHAR(50), LastName  VARCHAR(50),
	PhoneNumber VARCHAR(10), EmailAddress VARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 35K dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 35000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
    CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
    REPLACE(NEWID(),'-','') + '@gmail.com',     
    DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
        CROSS JOIN sys.all_columns c2
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)
GO
--Create a Non-Clustered Composite Index on the FirstName and LastName column
CREATE NONCLUSTERED INDEX IX_Customers_FirstName_LastName
    ON dbo.Customers (FirstName, LastName)

Example 1: Get all the Pages Associated with the Table Customers

To get all the Pages associated with the table Customers we need to pass only the @DatabaseId, @TableId and @Mode Parameter values of the DMF sys.dm_db_database_page_allocations. We can write a query like below to get this information:

SELECT DB_NAME(PA.database_id) [DataBase], 
	OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
	is_allocated, allocated_page_file_id [file_id], 
	allocated_page_page_id [page_id], page_type_desc, 
	page_level, previous_page_page_id [previous_page_id], 
	next_page_page_id [next_page_id]
FROM sys.dm_db_database_page_allocations 
	(DB_ID('SqlHints101PerfTips4'), 
         OBJECT_ID('Customers'),NULL, NULL, 'DETAILED') PA
	     LEFT OUTER JOIN sys.indexes SI 
		ON SI.object_id = PA.object_id 
                   AND SI.index_id = PA.index_id
ORDER BY page_level DESC, is_allocated DESC,
         previous_page_page_id

RESULT:
How to get all the Pages Associated with a Table in Sql Server

From the above result we can see that, this DMV has returned all the pages associated with all the indexes of the Table when we have not specified (i.e. NULL is passed here) the @IndexId parameter value. In the result we can see the pages associated with the non-clustered index IX_Customers_FirstName_LastName and as well as the clustered index PK_Customers_CustomerId.

The records with page_level as 0 are the leaf pages and INDEX_PAGE with previous_page_id and next_page_id as NULL are the ROOT pages of the Index. And the root pages page_level value will be the maximum value for that index.

Clustered Index leaf pages store the actual table data, so the page type of the leaf pages of the Clustered Index is a DATA_PAGE. The Page Type of the non-leaf page (i.e. Root Page and Intermediate Page) of the Clustered Index is an INDEX_PAGE. Where as for the Non-Clustered index all the pages including leaf pages are of Type INDEX_PAGE, as the leaf page of the Non-Clustered index again points to the Clustered Index where actual Table Data is stored.

allocated_page_file_id (i.e. file_id) and allocated_page_page_id (i.e. page_id) are the frequently used columns from the result set of this DMV.

Example 2: Get all the Pages Associated with an Index

To get all the Pages associated with a specific Index of the table, we need to pass the @DatabaseId, @TableId, @IndexId and @Mode Parameter values of the DMF sys.dm_db_database_page_allocations.

Assume that we want to find all the pages associated with the Non-Clustered Index IX_Customers_FirstName_LastName. Then first question comes in the mind is how to get the Index Id for the Non-Clustered Index IX_Customers_FirstName_LastName. Answer to this question is very simple we can use the sys.indexes catalog view to get the Index Id. We can write a query like below to get Customers tables all the Indexes and their Index Id

SELECT OBJECT_NAME(object_id) table_name, object_id, 
     name index_name, index_id, type, type_desc
FROM sys.indexes
WHERE OBJECT_NAME(object_id) = 'Customers'

RESULT:
How to get Index Id in Sql Server

From the above result we can see that Index Id for the Non_Clustered
Index IX_Customers_FirstName_LastName is 2.

Note:

  • Clustered Index will always will have the Index Id as 1.
  • Index Id for the Non-Clustered Index will be >=2.
  • Index Id for the Heap Table is 0.

We can write a query like below to get all the pages associated with the Non-Clustered Index IX_Customers_FirstName_LastName

SELECT DB_NAME(PA.database_id) [DataBase], 
	OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
	is_allocated, allocated_page_file_id [file_id], 
	allocated_page_page_id [page_id], page_type_desc, 
	page_level, previous_page_page_id [previous_page_id], 
	next_page_page_id [next_page_id]
FROM sys.dm_db_database_page_allocations 
	(DB_ID('SqlHints101PerfTips4'), 
         OBJECT_ID('Customers'), 2, NULL, 'DETAILED') PA
	     LEFT OUTER JOIN sys.indexes SI 
		ON SI.object_id = PA.object_id 
                   AND SI.index_id = PA.index_id
ORDER BY page_level DESC, is_allocated DESC,
         previous_page_page_id

RESULT:
How to get all the Pages Associated with an Index

Following are the list of columns returned by this DMF

database_id
object_id
index_id
partition_id
rowset_id
allocation_unit_id
allocation_unit_type
allocation_unit_type_desc
data_clone_id
clone_state
clone_state_desc
extent_file_id
extent_page_id
allocated_page_iam_file_id
allocated_page_iam_page_id
allocated_page_file_id
allocated_page_page_id
is_allocated
is_iam_page
is_mixed_page_allocation
page_free_space_percent
page_type
page_type_desc
page_level
next_page_file_id
next_page_page_id
previous_page_file_id
previous_page_page_id
is_page_compressed
has_ghost_records

Does the order of Columns in a Composite Index matters? Tip 3: Sql Server 101 Performance Tuning Tips and Tricks

The order of the columns in a composite index does matter on how a query against a table will use it or not. A query will use a composite index only if the where clause of the query has at least the leading/left most columns of the index in it.

Let us understand how the order of a column in a composite index matters with an example. Let us create a Customer table as shown in the below image with sample one million records by executing the following script.

Performance Issue Function on Index Column Customers Demo Table

CREATE DATABASE SqlHints101PerfTips3
GO
USE SqlHints101PerfTips3
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
        FirstName VARCHAR(50), 
	LastName  VARCHAR(50),
	PhoneNumber VARCHAR(10),
	EmailAddress VARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 1 million dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 1000000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
	CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
	REPLACE(NEWID(),'-','') + '@gmail.com',		
	DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
		CROSS JOIN sys.all_columns c2
GO
--Update one customer record with some know values
UPDATE dbo.Customers 
SET FirstName = 'Basavaraj', LastName = 'Biradar', 
    PhoneNumber = '4545454545', EmailAddress = 'basav@gmail.com'
WHERE CustomerId = 100000
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)
GO

Create a Composite Non-Clustered Index on FirstName and LastName columns

Create a Composite Non-Clustered Index on FirstName and LastName column of the Customers Table by executing following statement:

--Create a non-clustered index on FirstName column
CREATE NONCLUSTERED INDEX IX_Customers_FirstName_LastName
	ON dbo.Customers (FirstName, LastName)
GO

First let us now enable the execution plan in the Sql Server Management Studio by pressing the key stroke CTRL + M

Let us now verify how the Order of the columns in the above composite index impacts it’s usage for the queries against this table

Scanerio 1: Query having FirstName and LastName column in the WHERE clause in the same order

SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj' AND LastName = 'Biradar' 

Below is the execution plan for the above query execution

Query With FirstName and LastName

From the above execution plan we can see that the composite index IX_Customers_FirstName_LastName is used and it is performing the INDEX SEEK of it.

Scanerio 2: Query having LastName and FirstName column in the WHERE clause in the same order

SELECT CustomerId, FirstName, LastName
FROM dbo.Customers
WHERE LastName = 'Biradar' AND FirstName = 'Basavaraj'

Below is the execution plan for the above query execution

Query With LastName and FirstName

From the above execution plan we can see that the composite index IX_Customers_FirstName_LastName is used and it is performing the INDEX SEEK of it.

Scanerio 3: Query having only FirstName column in the WHERE clause

SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj'

Below is the execution plan for the above query execution

Query With FirstName Only

From the above execution plan we can see that the composite index IX_Customers_FirstName_LastName is used and it is performing the INDEX SEEK of it.

Scanerio 4: Query having only LastName column in the WHERE clause

SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE LastName = 'Biradar' 

Below is the execution plan for the above query execution

Query With LastName Only

From the above execution plan we can see that the composite index IX_Customers_FirstName_LastName is used but it is performing the INDEX SCAN of it. Also we can see from the execution plan that, Sql Server has given missing index hint and it is suggesting to add Index on the LastName column.

Verify the Cost of the Query

Execute the following four statement in one batch and verify the cost of the query

SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj' AND LastName = 'Biradar' 
GO
SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE LastName = 'Biradar' AND FirstName = 'Basavaraj'
GO
SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj'
GO
SELECT CustomerId, FirstName, LastName
FROM dbo.Customers WITH(NOLOCK)
WHERE LastName = 'Biradar' 
GO

Execution Plan Cost

From the above execution plan we can see that the queries which have the leading column FirstName of the Index in the WHERE clause have 0% cost compared to the last query which has 100% cost which has only LastName non-leading column of the index in the WHERE clause.

Conclusion:

In Sql Server the order of the columns in a composite index does matter on how a query against a table will use it or not. A query will use a composite index only if the where clause of the query has at least the leading/left most columns of the index in it. In the example listed in this article we can see that all the queries having the leading Index column FirstName in the WHERE clause are using the Index. And the query having only the non-leading column LastName of the Index in the WHERE clause is not doing the Index Seek

Evils of using function on an Index Column in the WHERE clause– Tip 2: Sql Server 101 Performance Tuning Tips and Tricks

Using a function on an Indexed Column in the WHERE clause leads to an Index/Table Scan instead of an Index Seek. This issue occurs in majority of the scenarios, but in some cases Sql Server does Index Seek even when you are using function on an indexed column in the WHERE clause.

Let us understand how using function on an Indexed Column in the WHERE clause results in the performance issue with an example. Let us create a Customer table as shown in the below image with sample one million records by executing the following script.

Performance Issue Function on Index Column Customers Demo Table

CREATE DATABASE SqlHints101PerfTips2
GO
USE SqlHints101PerfTips2
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
    FirstName NVARCHAR(50), 
	LastName  NVARCHAR(50),
	PhoneNumber VARCHAR(10),
	EmailAddress NVARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 1 million dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 1000000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
	CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
	REPLACE(NEWID(),'-','') + '@gmail.com',		
	DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
		CROSS JOIN sys.all_columns c2
GO
--Update one customer record with some know values
UPDATE dbo.Customers 
SET FirstName = 'Basavaraj', LastName = 'Biradar', 
	PhoneNumber = '4545454545', EmailAddress = 'basav@gmail.com'
WHERE CustomerId = 100000
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)
GO

Create a Non-Clustered index on the DateOfBirth column of the Customers Table by executing following statement:

--Create a non-clustered index on FirstName column
CREATE NONCLUSTERED INDEX IX_Customers_FirstName
	on dbo.Customers (FirstName)
GO

First let us now enable the execution plan in the Sql Server Management Studio by pressing the key stroke CTRL + M and also enable the IO and TIME statistics by executing the following statement

SET STATISTICS IO,TIME ON

Use Case Scenario

Asssume that we need to write a query to return all the customers whose FirstName starts with Basav. For this requirement we can write a query like below:

SELECT *
FROM  dbo.Customers
WHERE LEFT(FirstName,5) = 'Basav'

Let us execute the above query and observe the result, execution plan and IO and Time Statistics.

RESULT:
Performance Issue Function on Index Column Result

Let us go to the Execution Plan tab of the result and see the execution plan.

Execution Plan
Performance Issue Function on Index Column Execution Plan

From the above execution plan, we can see that the index IX_Customers_FirstName on FirstName column is used, but it is doing an Index Scan instead of Index Seek. Let us hover the mouse over the Index Scan node in the execution plan and and observe the node properties

Performance Issue Function on Index Column Execution Plan Properties

In the node properties we can clearly see that index IX_Customers_FirstName on FirstName column is used but it is doing an index scan, because of this we can see Number of Rows Read as one million records.

Let us go to Messages tab of the result and go over the IO and TIME statistics.

IO and TIME Statistics
Performance Issue Function on Index Column IO and TIME

Why Index Scan?

Using function on the Index column in the where caluse causes the function to be evaluated against each row and thus forcing optimizer not to use the Index.

What is the solution?

One solution in this case is to re-write the query using LIKE clause as below instead of using the LEFT function.

SELECT *
FROM  dbo.Customers
WHERE FirstName LIKE 'Basav%' 

Execute the above re-written query and observe the Result, execution plan and IO and Time stats

RESULT:
Performance Issue Function on Index Column Result After Fix

From the above result we can see that there is no change in the result returned from the re-written query.

Execution Plan
Performance Issue Function on Index Column Execution Plan After Fix

Now from the execution plan we can see that it is doing the Index Seek of the non-clustered index IX_Customers_FirstName on FirstName column. Also the execution plan is very simplified and un-necessary parallelism and other overheads are no-longer present. Let us hover the mouse over the Index Seek node in the execution plan and and observe the node properties

Performance Issue Function on Index Column Execution Plan Properties After Fix

In the properties pop-up now we see the seek of the index IX_Customers_FirstName on FirstName column and also Number of Rows Read has reduced from one million rows to one row.

IO and Time Statistics
Performance Issue Function on Index Column IO and TIME After Fix

We can see that Logical reads have been reduced from 10341 to 8, CPU time has been reduced from 233 millseconds to 0 milliseconds and query elapsed time has reduced from 173 milliseconds to 71 milliseconds.

Comparison of the Execution Plan

Side By Side Comparision

From the above execution plan comparison, we can see that query with function on an index column has 99% cost, where as the query without function on an index column has 1% cost

Execute the following statement to stop displaying the IO and TIME statistics when query is executed

SET STATISTICS IO,TIME OFF

Conclusion:

In Sql Server in majority of the scenario using a function on an Indexed Column in the WHERE caluse leads to an Index/Table Scan instead of an Index Seek. If you are using function on an index column in the WHERE clause, verify the execution plan and if it is resulting in un-desired results then try to re-write the query without using the function on a Index Column in the WHERE clause.