Tag Archives: Sql

Do we need to include a Clustered Index Column too in a Non-Clustered Index to make it as a Covering Index for a Query? Tip 7: Sql Server 101 Performance Tuning Tips and Tricks

Answer to this question is NO, because by default when we create a Non-Clustered Index in Sql Server it adds the Clustered Index Key Column values too in the Non-Clustered Index.

Let us understand this with an example. Let us create a Customer table with Clustered Index on the CustomerId column as shown in the below image with sample 35K records by executing the following script.

Performance Issue Function on Index Column Customers Demo Table

CREATE DATABASE SqlHints101PerfTips7
GO
USE SqlHints101PerfTips7
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
        FirstName VARCHAR(50), 
	LastName  VARCHAR(50),
	PhoneNumber VARCHAR(10),
	EmailAddress VARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 35K dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 35000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
	CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
	REPLACE(NEWID(),'-','') + '@gmail.com',		
	DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
		CROSS JOIN sys.all_columns c2
GO
--Update one customer record with some known values
UPDATE dbo.Customers 
SET FirstName = 'Basavaraj', LastName = 'Biradar', 
    PhoneNumber = '4545454545', EmailAddress = 'basav@gmail.com'
WHERE CustomerId = 10000
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)
GO

Assume that we have a requirement to get the CustomerId for the given customer FirstName and LastName. We can write a query like below to get the CustomerId by Customers FirstName and LastName. Let us now enable the execution plan in the Sql Server Management Studio by pressing the key stroke CTRL + M and then execute the following query.

SELECT CustomerId
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj' AND LastName = 'Biradar'

RESULT:
TIP 7 Result of a Clustered Index Scan Query
EXECUTION PLAN:
TIP 7 Execution Plan Clustered Index Scan

From the above execution plan we can see that this query is doing Clustered Index Scan (i.e. Table Scan). Now let us create a covering index (i.e. an index containing all queried columns) for this query so that it gives result from this Covering Index itself without requiring it to be looked-up into the Clustered Index. Now from the query we can see that it is referring to three columns i.e. FirstName, LastName and CustomerId. Out of these three columns CustomerId column is a Clustered Index column and other two columns (i.e. FirstName and LastName) are non-indexed column. Now comes doubt in the mind, to make a new index a Covering Index for the above query do I need to add the CustomerId column too in the list of columns in the new index apart from FirstName and LastName columns. The answer to this question is, we don’t need to add CustomerId column to the new index as it is a Clustered Index column. Let us understand this with an example. Now create a Composite non-clustered index on the FirstName and LastName column without CustomerId column by executing the following script.

--Create a non-clustered index on FirstName and LastName column
CREATE NONCLUSTERED INDEX IX_Customers_FirstName_LastName
    ON dbo.Customers (FirstName, LastName)
GO

Now execute the above select statement once again and verify whether this new index is a Covering index for this query

SELECT CustomerId
FROM dbo.Customers WITH(NOLOCK)
WHERE FirstName = 'Basavaraj' AND LastName = 'Biradar'

RESULT:
TIP 7 Execution Plan Covering Index

From the above execution plan we can see that Sql Server is doing the Seek of the newly created composite index IX_Customers_FirstName_LastName and we don’t see any key look-up into the Clustered index. So, this index is a covering index for the above query even though CustomerId column refereed in the SELECT statement is not part of the non-clustered index.

Let us inspect the index pages of the non-clustered index to see how Sql Server is adding Clustered Index Key Column values too in the Non-Clustered Index.

To explain this I will be using the below function which I have Created and explained in the previous article How to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information?. This function uses the un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result. So avoid using it in the Production environment) Dynamic Management Function sys.dm_db_database_page_allocations which is available from Sql Server 2012 onwards.

CREATE FUNCTION dbo.GetPagesOfBPlusTreeLevel(
    @DBName VARCHAR(100), @TableName VARCHAR(100) = NULL, @IndexName VARCHAR(100) = NULL, 
    @PartionId INT = NULL, @MODE VARCHAR(20), @BPlusTreeLevel VARCHAR(20) 
)
RETURNS
@IndexPageInformation TABLE (
    [DataBase] VARCHAR(100), [Table] VARCHAR(100), [Index] VARCHAR(100), 
    [partition_id] INT, [file_id] INT, [page_id] INT, page_type_desc VARCHAR(100), 
    page_level INT, [previous_page_id] INT, [next_page_id] INT)
AS
BEGIN
 
    DECLARE @MinPageLevelId INT = 0 , @MaxPageLevelId INT = 0, @IndexId INT = NULL
         
    SELECT  @IndexId = index_id
    FROM sys.indexes
    WHERE OBJECT_NAME(object_id) = @TableName AND name = @IndexName
 
    IF @IndexId IS NULL
        RETURN
     
    IF @BPlusTreeLevel IN ('Root', 'Intermediate')  
    BEGIN  
        SELECT  @MaxPageLevelId = (CASE WHEN  @BPlusTreeLevel ='Intermediate' THEN MAX(page_level) - 1 ELSE MAX(page_level) END), 
                @MinPageLevelId = (CASE WHEN  @BPlusTreeLevel ='Intermediate' THEN 1 ELSE MAX(page_level) END)
        FROM sys.dm_db_database_page_allocations 
            (DB_ID(@DBName), OBJECT_ID(@TableName), @IndexId, @PartionId, 'DETAILED') PA
                    LEFT OUTER JOIN sys.indexes SI 
                ON SI.object_id = PA.object_id AND SI.index_id = PA.index_id
        WHERE is_allocated = 1 AND page_type in (1,2)  -- INDEX_PAGE and DATA_PAGE Only
 
        IF @MaxPageLevelId IS NULL OR @MaxPageLevelId = 0 
            RETURN
    END
  
    INSERT INTO @IndexPageInformation
    SELECT DB_NAME(PA.database_id) [DataBase], OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
        [partition_id], allocated_page_file_id [file_id],  allocated_page_page_id [page_id], page_type_desc, 
        page_level, previous_page_page_id [previous_page_id], next_page_page_id [next_page_id]
    FROM sys.dm_db_database_page_allocations 
            (DB_ID(@DBName), OBJECT_ID(@TableName), @IndexId, @PartionId, 'DETAILED') PA
             LEFT OUTER JOIN sys.indexes SI 
            ON SI.object_id = PA.object_id AND SI.index_id = PA.index_id
    WHERE is_allocated = 1 AND page_type in (1,2) -- INDEX_PAGE and DATA_PAGE Only
            AND page_level between @MinPageLevelId AND @MaxPageLevelId
    ORDER BY page_level DESC, previous_page_page_id
 
    RETURN
END

Inspect the content of the Non-Culstered Index B+ tree’s ROOT Page

Let us use the above function GetPagesOfBPlusTreeLevel to get the Root page of a Non-Clustered Index. Execute the following statement to get the Root page of the Non-Clustered Index IX_Customers_FirstName_LastName on the Customers table.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel ('SqlHints101PerfTips7', 
	'Customers', 'IX_Customers_FirstName_LastName',
	 NULL, 'DETAILED', 'Root')

RESULT:
TIP 7 Root Page of a Non-Clustered Index

Let us use the DBCC PAGE Command which I have explained in the previous article How to Inspect the content of a DATA PAGE or INDEX PAGE? to inspect the content of the Root Page with FileId:1 and PageId:2810 of the Non-Clustered Index. Note: DBCC PAGE is an un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result. So avoid using it in the Production environment) command.

DBCC PAGE ('SqlHints101PerfTips7', 1, 2810, 3) WITH TABLERESULTS

RESULT:
TIP 7 Non-Clustered Index Root Page Content

From the above result we can see that Clustered Index Key Column CustomerId value corresponding to the the Non-Clustered Index Key Columns is added in the non-clustered Index Root Page.

Inspect the content of the Non-Culstered Index B+ tree’s INTERMEDIATE Page

Let us use execute the following statement to get the INTERMEDIATE pages of a Non-Clustered Index IX_Customers_FirstName_LastName on the Customers table.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel ('SqlHints101PerfTips7', 
	'Customers', 'IX_Customers_FirstName_LastName',
	 NULL, 'DETAILED', 'Intermediate')

RESULT:
TIP 7 Intermediate Pages of a Non-Clustered Index

From the above result we can see that there are four Intermediate Level Pages for a Non-Clustered Index. Let us use the DBCC PAGE Command to inspect the content of the one of the Intermediate Level Page with FileId:1 and PageId:2811 of the Non-Clustered Index.

DBCC PAGE ('SqlHints101PerfTips7', 1, 2811, 3) WITH TABLERESULTS

RESULT:
TIP 7 Non-Clustered Index Intermediate Page Content

From the above content of the Non-Clustered Index Intermediate Page, we can see that Clustered Index Key Column CustomerId value corresponding to the the Non-Clustered Index Key Columns is added in the non-clustered Index Intermediate Level Page.

Inspect the content of the Non-Culstered Index B+ tree’s LEAF Page

Let us use execute the following statement to get all the LEAF Pages of a Non-Clustered Index IX_Customers_FirstName_LastName on the Customers table.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel ('SqlHints101PerfTips7', 
	'Customers', 'IX_Customers_FirstName_LastName',
	 NULL, 'DETAILED', 'Leaf')

RESULT:
TIP 7 Leaf Pages of a Non-Clustered Index

In this case there are 347 leaf pages. Let us inspect the content of the one of the Leaf Level Page with FileId:1 and PageId:3000 of the Non-Clustered Index.

DBCC PAGE ('SqlHints101PerfTips7', 1, 3000, 3) WITH TABLERESULTS

RESULT:
TIP 7 Leaf Page of a Non-Clustered Index

From the above content of the Non-Clustered Index Leaf Page, we can see that Clustered Index Key Column CustomerId value corresponding to the the Non-Clustered Index Key Columns is added in the non-clustered Index Leaf Level Page.

Conclusion:

From the above examples we can see that when we create a Non-Clustered Index, Sql Server adds the Clustered Index Key Column values Corresponding to the Non-Clustered Index Key columns in the Index Root, Intermediate and Leaf Level Pages. So, we don’t explicitly need to include a Clustered Index Key Column in a Non-Clustered Index to make it as a Covering Index for a Query. In case of an unique Non-Clustered Index the Root and Intermediate level pages will not have the Clustered Index Key Column Values in it, but it’s Leaf Page will have the Clustered Index Key Column value. In this way again for Unique Non-Clustered Index also we don’t explicitly need to add Clustered Index Key Column.

How to Inspect the content of a DATA PAGE or INDEX PAGE ? Tip 6: Sql Server 101 Performance Tuning Tips and Tricks

In the previous articles “How to find the list of all Pages that belongs to a Table and Index?” and “How to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information?” I have explained how to get the pages of a Table/Index. In this Article I will explain how we can inspect the content of these Pages (i.e. DATA PAGE or INDEX PAGE).

We can use the DBCC PAGE command to look into the content of any page in Sql Server. DBCC PAGE command is an Sql Server un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result. So avoid using it in the Production environment.) feature.

Before we look into the DBCC PAGE command, let us look into the Sql Server Page Structure

In Sql Server Each Page size is 8KB. A page is tied to single object, it can’t be shared. In each page 96 bytes is reserved for the Page Header information and remaining 8096 bytes is Page Body which actually contains the Table/Index data. Page will also have a row offset array (Page Slot Array). The slot array is a list of 2 byte pointers to the beginning of each row. So if there are two rows in the table then this array will have two items in it and each one pointing to the beginning of their corresponding row. For clustered Index common assumption is that, data is stored physically in the sorted order as per the clustered index key column value. This is partially correct, because across the pages the records are sorted in a sorted order as per Clustered Index Key Column value but within a data page the data is not stored in a sorted order, instead the slot array is used to maintain the sorting of the records. When we insert a new record as per the sorting order it is inserted into the correct data page, but within data page it is just appended at the end of the last record. But actually the slot array maintains the sorting order by pointing to the rows as per the index columns sorted value. For example as shown in the below image, the row offset array’s first element is pointing to the second row in the data page where as the second slot array item is pointing to the first row stored on the page, because as per the index column value sorted order of second row comes first and first row comes second. Sql Server fetches the records from the Data Page as per the order of the rows pointed by the Slot Array pointers. So, records on the page are not sorted but the slot array values are stored in the sorted order as per the Clustered Index Key Column value.

Sql Server Page Structure Explained

Some of the important attributes which are maintained in the Page Header are highlighted in the above Sql Server Page Structure and below is the description of it. The above image is the also shows the content of the Clustered Index Root Page:

  • m_pageId = (1:968) : This is the id of the page which we are inspecting. In Sql Server Pages it is stored as FileId:PageId. Here 1 is FileId and 968 is the PageId
  • m_type = 2 Here m_type value 2 means it is an Index Page, where as value 1 means it is a Data Page and value 10 means it is an IAM page.
  • m_level = 2 This indicates the level of the page. For example it’s value 0 means it is an leaf level page, page with maximum m_level value with Previous and Next Page id value as NULL (i.e. (0:0)) means it is an Root Page. And pages whose m_level value in-between leaf page and root page m_level values are called intermediate pages. For more details on the Index Structure you can refer to the previous article.
  • m_prevPage = (0:0) It is the id of the previous page to the current page. Here (0:0) means there are no previous page for this page.
  • m_nextPage = (0:0) It is the id of the next page to the current page. Here (0:0) means there are no next page for this page.
  • m_slotCnt = 2 This attribute tells that there are two row offset array (i.e. Page Slot Array) elements pointing to two rows on the page.
  • m_freeCnt = 8070 This attribute tells the free available space on the page. In this case it is 8070 bytes.
  • m_ghostRecCnt = 0 When we delete a record, Sql Server doesn’t delete the record on the page. Instead it marks the record as ghost record and removes the pointer to the row from the slot array. This attribute maintains the count of Such ghost records on the page, so that it can be used by the Ghost Clean Process to remove these records later.

Syntax of DBCC PAGE Command:

DBCC PAGE ({@DatabaseId|@DatabaseName}, @FileId, @PageId 
            [,@OutPutOption={0|1|2|3}]) [WITH TABLERESULTS]

Parameters:

  • @DatabaseId|@DatabaseName : For the first parameter we can pass either the Database Id or Database Name.
  • @FileId : Id of the file which contains the Page which we want to Inspect.
  • @PageId : Id of the Page which we want to reflect.
  • @OutPutOption : This option specifies that, what information to be returned in the output of the DBCC PAGE command. Following are the possible values for this parameter:
    • 0-Returns the page header
    • 1-Returns the Page header, per row hex dump and Row Offset Array (i.e. page slot array).
    • 2-Returns the Page header and whole page hex dump
    • 3-Returns the Page header, per row hex dump and detailed per row explanation.
  • WITH TABLERESULTS: This clause is optional, this returns the output in tabular format. Without this option by default you will not be able to see the output of this DBCC command. This is because by default Sql Server sends the DBCC PAGE command output to the errorlog. To get the output of this command without the WITH TABLERESULTS clause to the current connection SSMS window, you need to enable the Trace flag 3604 by executing the command: DBCC TRACEON (3604). We can turn off this Trace flag by executing the command DBCC TRACEOFF (3604).

To explain how we can use the DBCC PAGE command to look into the content of Data/Index Page, let us create a Customer table as shown in the below image with 35K records. Execute the following script to Create the Customer Table with Clustered Index on the CustomerId column with sample 35K records. Note this is the same database and table which is created in the previous article, if you already have this database no-need to re-create it.

--Create Demo Database
CREATE DATABASE SqlHints101PerfTips5
GO
USE SqlHints101PerfTips5
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
	FirstName VARCHAR(50), LastName  VARCHAR(50),
	PhoneNumber VARCHAR(10), EmailAddress VARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 35K dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 35000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
    CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
    REPLACE(NEWID(),'-','') + '@gmail.com',     
    DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
        CROSS JOIN sys.all_columns c2
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)

As explained in the previous article we can use Dynamic Management Function sys.dm_db_database_page_allocations to get the list of all the pages associated with a Table/Index. Note sys.dm_db_database_page_allocations is an un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result) DMF. So avoid using it in the Production environment.

Get all the Pages Associated with the Clustered Index PK_Customers_CustomerId

To get all the Pages associated with the Index PK_Customers_CustomerId of the Customers table, we need to pass the @DatabaseId, @TableId, @IndexId and @Mode Parameter values of the DMF sys.dm_db_database_page_allocations. We can use sys.indexes catalog view as shown below to get the Index Id for a Index.

SELECT OBJECT_NAME(object_id) table_name, object_id, 
     name index_name, index_id, type, type_desc
FROM sys.indexes
WHERE OBJECT_NAME(object_id) = 'Customers'
	AND name = 'PK_Customers_CustomerId'

RESULT:
Use sys.indexes to get the indexid for the index

As explained in the previous article:

  • Clustered Index will always will have the Index Id as 1.
  • Index Id for the Non-Clustered Index will be >=2.
  • Index Id for the Heap Table is 0.

We can execute the below query to get all the pages associated with the Clustered Index PK_Customers_CustomerId

SELECT DB_NAME(PA.database_id) [DataBase], 
    OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
    is_allocated, allocated_page_file_id [file_id], 
    allocated_page_page_id [page_id], page_type_desc, 
    page_level, previous_page_page_id [previous_page_id], 
    next_page_page_id [next_page_id]
FROM sys.dm_db_database_page_allocations 
    (DB_ID('SqlHints101PerfTips5'), 
         OBJECT_ID('Customers'), 1, NULL, 'DETAILED') PA
         LEFT OUTER JOIN sys.indexes SI 
        ON SI.object_id = PA.object_id 
                   AND SI.index_id = PA.index_id
WHERE is_allocated = 1  
	and page_type in (1,2) -- INDEX_PAGE and DATA_PAGE Only
ORDER BY page_level DESC, is_allocated DESC,
         previous_page_page_id

RESULT:
Clustered Index All Pages

As shown in the above image the Root Page File Id is 1 and Page Id is 968. Root Page is the one with maximum Page_level (here in this example it is 2) having previous_page_id and next_page_id as NULL. Intermediate Pages File Id is 1 and Page Id’s are: 2016 and 2032. Intermediate pages are the one whose Page_Level is less than the root page level and greater than 0 (i.e. Leaf Page Level (i.e. 0) < Intermediate Page level (here it is 1) < Root Page level (here it is 2)). And leaf level pages are the the ones whose page level is 0, here the leaf level pages File Id is 1 and Page Id's are 1760, 1761, 1762, 1763, 1764 etc.

Example 1: Let us inspect the Clustered Index Root Page with File Id:1 and Page Id:968 using DBCC PAGE Command

DBCC PAGE ('SqlHints101PerfTips5',1,968,3) WITH TABLERESULTS

RESULT:
DBCC PAGE OUTPUT For The Root Page of a Clustered Index

From the above result we can see that the Index Root Page has Page Header Information and hex dump of the two INDEX_RECORD. The second result set shows these two Index records. In this result set, we can see that Sql Server is storing Clustered Index Key Column CustomerId value and the Child Page Id. Here the first record’s CustomerId (i.e. Index Key) Column value is NULL and it’s Child Page Id is 2016 and Second Record’s CustomerId (i.e. Index key) column value is 26245 and it’s Child Page Id is 2032. As the m_level (i.e. Page level) of this clustered index root page value is 2, it means the child page is an Index Intermediate page. It means the the Child Page 2016 contains the CustomerId records whose value is NULL <= CustomerId < 26244. And the child page id 2032 contains the CustomerId records whose value is 26245 <= Customer <= max(CustomerId)

Example 2: Let us inspect the Clustered Index Intermediate Page with File Id:1 and Page Id:2016 using DBCC PAGE Command

DBCC PAGE ('SqlHints101PerfTips5',1,2016,3) WITH TABLERESULTS

RESULT:
Intermediate Page of a Clustered Index

From the above result we can see that this Clustered Index’s Intermediate Page has Page Header Information and hex dump of INDEX_RECORD’s. Here the m_type is 2, it means it is an Index Page. m_level value is 1, it means the child page of this page will be a leaf page with m_level value as 0. And the leaf level page type will be data page, as it is a Clustered Index contains. Here the m_slotCnt is 480, it means there are 480 Index Records on this Intermediate Page pointing to 480 child leaf level Data Pages.

The second result set shows these 480 Index records. In this result set, we can see that Sql Server is storing Clustered Index Key Column CustomerId value and the Child Page Id. Here the first record’s CustomerId (i.e. Index Key) Column value is NULL and it’s Child Page Id is 1760. Second Record’s CustomerId (i.e. Index key) column value is 56 and it’s Child Page Id is 1761 and So On. It means the the Child Page 1760 contains the table records whose CustomerId value is NULL <= CustomerId < 56. Similarly, the child page id 1761 contains the table records whose CustomerId value is 56 <= CustomerId < 111 and so on.

Example 3: Let us inspect the Clustered Index Leaf Page (i.e. Data Page) with File Id:1 and Page Id:1760 using DBCC PAGE Command

Leaf Page of a Clustered Index

Page Header: m_type value as shown above is 1, it means it is a data page. m_level value as 0, means it is leaf level page. m_slotCnt is 55, it means this page has 55 records of the table. As it is the first leaf page, it doesn’t have a previous page, but it has a next page (1:1761). The pages are linked by doubly linked list, each page point’s to previous and next page in Sql Server.

Page Body: As shown above we can see the complete row values for the Customers table are stored in the page. So, the leaf pages of a Clustered Index contains the actual table data.

How to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information? Tip 5: Sql Server 101 Performance Tuning Tips and Tricks

In this article, I will explain you on How to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information. Prior to that let us get some insights into the Index structure.

In Sql Server an index is made up of a set of pages (index nodes) that are organized in a B+ tree structure. This structure is hierarchical in nature. The top node is called the ROOT node (Root Page). The bottom level of nodes in the index are called the leaf nodes (leaf Pages). Any index levels between the root and the leaf nodes are collectively known as intermediate levels. In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list. In case of Clustered Index, data in a table is stored in a sorted order by the Clustered Index key Column. As a result, there can be only one clustered index on a table or view. In addition, data in a table is sorted only if a clustered index has been defined on a table.

Non-Clustered Index structure is similar to that of a B+ Tree Structure of Clustered Index, but the leaf nodes of a Non-Clustered index contain only the values from the indexed columns and row locators that point to the actual data rows, rather than contain the data rows themselves. Because of this in case of Non-Clustered Index, Sql Server takes additional steps to locate the actual data. The row locators structure depends on whether underlying table is a clustered table (Table with Clustered Index) or a heap table (Table without Clustered Index). In case the underlying table is a Clustered table the row locator points to the Clustered Index key value corresponding to the Non-Clustered Key and it is used to look-up into the Clustered Index to navigate to the actual data row. If underlying table is a heap table, the row locator points to the actual data row address. In case of Non-Clustered Index, the data rows of the underlying table are not sorted and stored in order based on their Non-Clustered keys.

B+ Tree Structure of a Clustered Index

Structure of Clustered Index

Let us explain how to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information with an example

To explain how to get an Index’s Root Page, Intermediate Pages and Leaf Pages Information, let us create a Customer table as shown in the below image with 35K records. Execute the following script to Create the Customer Table with Clustered Index on the CustomerId column with sample 35K records.

--Create Demo Database
CREATE DATABASE SqlHints101PerfTips4
GO
USE SqlHints101PerfTips4
GO
--Create Demo Table Customers
CREATE TABLE dbo.Customers (
	CustomerId INT IDENTITY(1,1) NOT NULL,
	FirstName VARCHAR(50), LastName  VARCHAR(50),
	PhoneNumber VARCHAR(10), EmailAddress VARCHAR(50),
	CreationDate DATETIME
)
GO
--Populate 35K dummy customer records
INSERT INTO dbo.Customers (FirstName, LastName, PhoneNumber, EmailAddress, CreationDate)
SELECT TOP 35000 REPLACE(NEWID(),'-',''), REPLACE(NEWID(),'-',''), 
    CAST( CAST(ROUND(RAND(CHECKSUM(NEWID()))*1000000000+4000000000,0) AS BIGINT) AS VARCHAR(10)),
    REPLACE(NEWID(),'-','') + '@gmail.com',     
    DATEADD(HOUR,CAST(RAND(CHECKSUM(NEWID())) * 19999 as INT) + 1 ,'2006-01-01')
FROM sys.all_columns c1
        CROSS JOIN sys.all_columns c2
GO
--Create a PK and a Clustered Index on CustomerId column
ALTER TABLE dbo.Customers 
ADD CONSTRAINT PK_Customers_CustomerId 
PRIMARY KEY CLUSTERED (CustomerId)

As explained in the previous article we can use Dynamic Management Function sys.dm_db_database_page_allocations to get the list of all the pages associated with an Index. Note sys.dm_db_database_page_allocations is an un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result) DMF. So avoid using it in the Production environment.

Get all the Pages Associated with the Clustered Index PK_Customers_CustomerId

To get all the Pages associated with the Index PK_Customers_CustomerId of the Customers table, we need to pass the @DatabaseId, @TableId, @IndexId and @Mode Parameter values of the DMF sys.dm_db_database_page_allocations. We can use sys.indexes catalog view as shown below to get the Index Id for a Index.

SELECT OBJECT_NAME(object_id) table_name, object_id, 
     name index_name, index_id, type, type_desc
FROM sys.indexes
WHERE OBJECT_NAME(object_id) = 'Customers'
	AND name = 'PK_Customers_CustomerId'

RESULT:
Use sys.indexes to get the indexid for the index

As explained in the previous article:

  • Clustered Index will always will have the Index Id as 1.
  • Index Id for the Non-Clustered Index will be >=2.
  • Index Id for the Heap Table is 0.

We can execute the below query to get all the pages associated with the Clustered Index PK_Customers_CustomerId

SELECT DB_NAME(PA.database_id) [DataBase], 
    OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
    is_allocated, allocated_page_file_id [file_id], 
    allocated_page_page_id [page_id], page_type_desc, 
    page_level, previous_page_page_id [previous_page_id], 
    next_page_page_id [next_page_id]
FROM sys.dm_db_database_page_allocations 
    (DB_ID('SqlHints101PerfTips5'), 
         OBJECT_ID('Customers'), 1, NULL, 'DETAILED') PA
         LEFT OUTER JOIN sys.indexes SI 
        ON SI.object_id = PA.object_id 
                   AND SI.index_id = PA.index_id
WHERE is_allocated = 1  
	and page_type in (1,2) -- INDEX_PAGE and DATA_PAGE Only
ORDER BY page_level DESC, is_allocated DESC,
         previous_page_page_id

RESULT:
Clustered Index All Pages

As shown in the above image the Root Page File Id is 1 and Page Id is 968. Root Page is the one with maximum Page_level (here in this example it is 2) having previous_page_id and next_page_id as NULL. Intermediate Pages File Id is 1 and Page Id’s are: 2016 and 2032. Intermediate pages are the one whose Page_Level is less than the root page level and greater than 0 (i.e. Leaf Page Level (i.e. 0) < Intermediate Page level (here it is 1) < Root Page level (here it is 2)). And leaf level pages are the the ones whose page level is 0.

We can use the below function to get the pages of the different levels of a B+ Tree Index:

We can use the below function to get the B+ Tree Clustered/Non-Clustered Index Root Page, Intermediate Level Pages and Leaf Level Pages. Note I am using sys.dm_db_database_page_allocations un-documented (i.e. feature that may change or removed without any notice or may produce un-expected result) DMF. So avoid using it in the Production environment.

CREATE FUNCTION dbo.GetPagesOfBPlusTreeLevel(
	@DBName VARCHAR(100), @TableName VARCHAR(100) = NULL, @IndexName VARCHAR(100) = NULL, 
	@PartionId INT = NULL, @MODE VARCHAR(20), @BPlusTreeLevel VARCHAR(20) 
)
RETURNS
@IndexPageInformation TABLE (
	[DataBase] VARCHAR(100), [Table] VARCHAR(100), [Index] VARCHAR(100), 
	[partition_id] INT, [file_id] INT, [page_id] INT, page_type_desc VARCHAR(100), 
	page_level INT, [previous_page_id] INT, [next_page_id] INT)
AS
BEGIN

	DECLARE @MinPageLevelId INT = 0 , @MaxPageLevelId INT = 0, @IndexId INT = NULL
		
	SELECT	@IndexId = index_id
	FROM sys.indexes
	WHERE OBJECT_NAME(object_id) = @TableName AND name = @IndexName

	IF @IndexId IS NULL
		RETURN
	
	IF @BPlusTreeLevel IN ('Root', 'Intermediate') 	
	BEGIN 	
		SELECT  @MaxPageLevelId = (CASE WHEN  @BPlusTreeLevel ='Intermediate' THEN MAX(page_level) - 1 ELSE MAX(page_level) END), 
				@MinPageLevelId = (CASE WHEN  @BPlusTreeLevel ='Intermediate' THEN 1 ELSE MAX(page_level) END)
		FROM sys.dm_db_database_page_allocations 
			(DB_ID(@DBName), OBJECT_ID(@TableName), @IndexId, @PartionId, 'DETAILED') PA
 					LEFT OUTER JOIN sys.indexes SI 
				ON SI.object_id = PA.object_id AND SI.index_id = PA.index_id
		WHERE is_allocated = 1 AND page_type in (1,2)  -- INDEX_PAGE and DATA_PAGE Only

		IF @MaxPageLevelId IS NULL OR @MaxPageLevelId = 0 
			RETURN
	END
 
	INSERT INTO @IndexPageInformation
	SELECT DB_NAME(PA.database_id) [DataBase], OBJECT_NAME(PA.object_id) [Table], SI.Name [Index], 
		[partition_id], allocated_page_file_id [file_id],  allocated_page_page_id [page_id], page_type_desc, 
		page_level, previous_page_page_id [previous_page_id], next_page_page_id [next_page_id]
	FROM sys.dm_db_database_page_allocations 
			(DB_ID(@DBName), OBJECT_ID(@TableName), @IndexId, @PartionId, 'DETAILED') PA
			 LEFT OUTER JOIN sys.indexes SI 
			ON SI.object_id = PA.object_id AND SI.index_id = PA.index_id
	WHERE is_allocated = 1 AND page_type in (1,2) -- INDEX_PAGE and DATA_PAGE Only
			AND page_level between @MinPageLevelId AND @MaxPageLevelId
	ORDER BY page_level DESC, previous_page_page_id

	RETURN
END

How to get an Index’s Root Page Information?

We can use the above function like below to get the root page of the Clustered Index PK_Customers_CustomerId on the Customers Table in the SqlHints101PerfTips5 Database.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel 
	('SqlHints101PerfTips5', 'Customers', 
	 'PK_Customers_CustomerId', NULL, 'DETAILED', 'Root')

RESULT:
How to Get Root Page of B Tree Index

There will be always on Root Page. Root Page is the one with maximum Page_level having previous_page_id and next_page_id as NULL. If the number of records in the table are very less and just one page is enough to store it. Then Sql Server doesn’t create the B+ Tree structure. So, in such scenario, the above function will not return the Root Page information.

How to get an Index’s Intermediate Pages Information?

We can use the above function like below to get the Index’s Intermediate pages of the Clustered Index PK_Customers_CustomerId on the Customers Table in the SqlHints101PerfTips5 Database.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel 
	('SqlHints101PerfTips5', 'Customers', 
	 'PK_Customers_CustomerId', NULL, 'DETAILED', 'Intermediate')

RESULT:
How to Get Intermediate Pages of a B Tree Index

Intermediate pages are the one whose Page_Level is less than the root page level and greater than Leaf Page Level (i.e. 0) (i.e. Leaf Page Level < Intermediate Page level < Root Page level). In the following situations an Index will not have intermediate pages: 1) If the number of records in the table are very less and just one page is enough to store it. Then Sql Server doesn't create the B+ Tree structure. 2) If the number records stored in the table are very less and it is spanning only couple of pages in such scenario Sql Server can have B+ Tree Index with just Root Page and the Leaf Level Pages. In such scenarios this function will not return any information. If we have a very large Table, then we can multiple Intermediate Levels.

How to get an Index’s Leaf Page Information?

We can use the above function like below to get the Index’s Leaf pages of the Clustered Index PK_Customers_CustomerId on the Customers Table in the SqlHints101PerfTips5 Database.

SELECT *
FROM dbo.GetPagesOfBPlusTreeLevel 
	('SqlHints101PerfTips5', 'Customers', 
	 'PK_Customers_CustomerId', NULL, 'DETAILED', 'Leaf')

RESULT:
How to Get Leaf Pages of a B Tree Index

In case of Clustered Index the Page_Type_Desc of Leaf Pages is DATA_PAGE where as in case of Non-Clustered Index it will be INDEX_PAGE.

In the Next Article I will explain how we can use the DBCC PAGE command to look into what actually Sql Server stores in the Page.