This article compares the legacy Large OBject (LOB) data type TEXT with the VARCHAR(MAX) LOB data type introduced in Sql Server 2005.
[ALSO READ] Difference Between Sql Server VARCHAR and VARCHAR(MAX)
TEXT | VarChar(MAX) |
Basic Definition | |
It is a non-Unicode, large, variable-length character data type which can store a maximum of 2147483647 non-Unicode characters (i.e. the maximum storage capacity is 2GB). |
It is a non-Unicode, large, variable-length character data type which can store a maximum of 2147483647 non-Unicode characters (i.e. the maximum storage capacity is 2GB). |
Sql Server version in which it was introduced? | |
The Text data type has been present since very old versions of Sql Server. If I remember correctly, it was present even in the Sql Server 6.5 days. |
The VarChar(Max) data type was introduced in Sql Server 2005. |
Which one to Use? | |
As per the MSDN link, Microsoft suggests avoiding the Text data type, and it will be removed in a future version of Sql Server. |
Varchar(Max) is the suggested data type for storing large string values instead of the Text data type. |
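To make the suggestion above concrete, here is a minimal sketch of moving an existing Text column to VarChar(Max). The table dbo.LegacyNotes and its Notes column are hypothetical names used only for this illustration, and the final UPDATE is a commonly suggested step rather than something this article tests, so try it on a copy of the data first.

```sql
--Hypothetical table, used only for this illustration
CREATE TABLE dbo.LegacyNotes
(
    Id INT IDENTITY(1,1),
    Notes TEXT NULL
)
GO
--Change the column type in place from TEXT to VARCHAR(MAX)
ALTER TABLE dbo.LegacyNotes
ALTER COLUMN Notes VARCHAR(MAX) NULL
GO
--Assumption: values written while the column was TEXT keep their
--old out-of-row storage until they are rewritten. Updating the
--column to itself is a commonly suggested way to rewrite them.
UPDATE dbo.LegacyNotes
SET Notes = Notes
GO
```

On a large table the UPDATE can be expensive, so it may be better to run it in batches.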
In-Row or Out-of-Row Storage? | |
Data of a Text type column is stored out-of-row in separate LOB data pages. The row in the table data page holds only a 16-byte pointer to the LOB data page where the actual data is present. |
Data of a Varchar(max) type column is stored in-row if it is less than or equal to 8000 bytes. If the Varchar(max) column value exceeds 8000 bytes, it is stored in separate LOB data pages and the row holds only a 16-byte pointer to the LOB data page where the actual data is present. |
When the LOB column value is less than 8000 bytes and fits in the available space in the row, is it stored in-row or out-of-row? | |
Execute the following script to create the demo database SqlHintsLOBDemo if it doesn’t already exist. In the demo database it creates a table TextTable with a Text LOB type column LargeString. Finally, it inserts 100 records, where the LargeString column value in each row is 4000 ‘B’ characters (i.e. 4000 bytes).
    --Create a Demo Database
    IF DB_ID('SqlHintsLOBDemo') IS NULL
        CREATE DATABASE SqlHintsLOBDemo
    GO
    USE SqlHintsLOBDemo
    GO
    --Create a table with Text type column
    CREATE TABLE dbo.TextTable
    (
        Id INT IDENTITY(1,1),
        LargeString TEXT
    )
    GO
    --INSERT 100 records, where LargeString column value in
    --each row is 4000 B characters (i.e. 4000 bytes)
    INSERT INTO dbo.TextTable (LargeString)
    VALUES(REPLICATE('B', 4000))
    GO 100 --Loop 100 times
[ALSO READ] GO Statement can also be used to execute a batch of T-Sql statements multiple times
Execute the following statement to check whether the Text type column value is stored in-row or out-of-row:
    SELECT alloc_unit_type_desc, page_count
    FROM sys.dm_db_index_physical_stats
        (DB_ID('SqlHintsLOBDemo'),
         OBJECT_ID('dbo.TextTable'),
         NULL, NULL, 'DETAILED')
From the above result we can see that even though one row (4 bytes for the Integer Id column value + 4000 bytes for the Text type column value) can fit into one 8KB data page, Sql Server by design still stores the Text type column value in the LOB data pages. Whether we store 1 byte or 2GB of data in a Text type column, Sql Server by default always stores the value out-of-row in the LOB data pages, and the row holds a 16-byte pointer to the LOB data pages where the data is stored. |
Execute the following script to create the demo database SqlHintsLOBDemo if it doesn’t already exist. In the demo database it creates a table VarMaxTable with a VarChar(Max) LOB type column LargeString. Finally, it inserts 100 records, where the LargeString column value in each row is 4000 ‘B’ characters (i.e. 4000 bytes).
    --Create a Demo Database
    IF DB_ID('SqlHintsLOBDemo') IS NULL
        CREATE DATABASE SqlHintsLOBDemo
    GO
    USE SqlHintsLOBDemo
    GO
    --Create a table with a Varchar(Max) type column
    CREATE TABLE dbo.VarMaxTable
    (
        Id INT IDENTITY(1,1),
        LargeString VARCHAR(MAX)
    )
    GO
    --INSERT 100 records, where LargeString column value in
    --each row is 4000 B characters (i.e. 4000 bytes)
    INSERT INTO dbo.VarMaxTable (LargeString)
    VALUES(REPLICATE('B', 4000))
    GO 100
Execute the following statement to check whether the VarChar(Max) type column value is stored in-row or out-of-row:
    SELECT alloc_unit_type_desc, page_count
    FROM sys.dm_db_index_physical_stats
        (DB_ID('SqlHintsLOBDemo'),
         OBJECT_ID('dbo.VarMaxTable'),
         NULL, NULL, 'DETAILED')
From the above result we can see that the LOB VarChar(MAX) type column value is stored in-row. For a VarChar(MAX) type column, Sql Server by default always tries to store the data in-row. Only if the value exceeds 8000 bytes or the available space in the row does it store the value out-of-row in LOB data pages, leaving a 16-byte pointer in the row to the LOB data pages where the actual column value is stored. |
When the LOB column value is more than 8000 bytes or exceeds the available space in the row, is it stored in-row or out-of-row? | |
Execute the following script to remove the previously inserted records and insert 100 records where the LargeString column value in each row is 10,000 ‘B’ characters (i.e. 10,000 bytes).
    --TRUNCATE the table to remove all the
    --previously inserted records
    TRUNCATE TABLE dbo.TextTable
    GO
    INSERT INTO dbo.TextTable (LargeString)
    VALUES(REPLICATE(CAST('B' AS VARCHAR(MAX)), 10000))
    GO 100
Re-run the sys.dm_db_index_physical_stats query shown earlier for dbo.TextTable to check whether the Text type column value is stored in-row or out-of-row. The result further reaffirms that whether we store 1 byte or 2GB of data in a Text type column, Sql Server by default always stores the value out-of-row in the LOB data pages, and the row holds a 16-byte pointer to the LOB data pages where the data is stored. |
Execute the following script to remove the previously inserted records and insert 100 records where the LargeString column value in each row is 10,000 ‘B’ characters (i.e. 10,000 bytes).
    --TRUNCATE the table to remove all the
    --previously inserted records
    TRUNCATE TABLE dbo.VarMaxTable
    GO
    INSERT INTO dbo.VarMaxTable (LargeString)
    VALUES(REPLICATE(CAST('B' AS VARCHAR(MAX)), 10000))
    GO 100
Re-run the sys.dm_db_index_physical_stats query shown earlier for dbo.VarMaxTable to check whether the VarChar(Max) type column value is stored in-row or out-of-row. From the result we can see that the LOB VarChar(MAX) type column value is stored out-of-row in LOB data pages. For a VarChar(MAX) type column, Sql Server by default always tries to store the data in-row. Only if the value exceeds 8000 bytes or the available space in the row does it store the value out-of-row in LOB data pages, leaving a 16-byte pointer in the row to the LOB data pages where the actual column value is stored. |
Do we have an option to change the default In-Row and Out-Of-Row Storage behavior? | |
As we have already seen above, whether we store 1 byte or 2GB of data in a Text type column, Sql Server by default stores it out-of-row in the LOB data pages, and the row holds a 16-byte pointer to the LOB data pages where the data is stored. Sql Server provides a mechanism to change this default behavior, so that the Text type column value is stored in-row when there is sufficient free space in the row: the sp_tableoption system stored procedure with the option ‘text in row’. Execute the below statement to store the Text type column value in-row if it is at most 7000 bytes and enough space is available in the row.
    EXEC sp_tableoption
        @TableNamePattern = 'dbo.TextTable',
        @OptionName = 'text in row',
        @OptionValue = 7000
The @OptionValue parameter value can be: OFF or 0 (the default, Text values are always stored out-of-row), ON or 1 (Text values of up to 256 bytes are stored in-row), or an integer from 24 through 7000 (the maximum number of bytes to store in-row).
Execute the following script to remove the previously inserted records and insert 100 records where the LargeString column value in each row is 4,000 ‘B’ characters (i.e. 4,000 bytes).
    TRUNCATE TABLE dbo.TextTable
    GO
    INSERT INTO dbo.TextTable (LargeString)
    VALUES(REPLICATE('B', 4000))
    GO 100
Execute the following statement to check whether the Text type column value is stored in-row or out-of-row:
    SELECT alloc_unit_type_desc, page_count
    FROM sys.dm_db_index_physical_stats
        (DB_ID('SqlHintsLOBDemo'),
         OBJECT_ID('dbo.TextTable'),
         NULL, NULL, 'DETAILED')
From the above result we can now see that the Text column values are stored in-row. So we can use the ‘text in row’ option of the sp_tableoption system stored procedure to change the Text data type’s default behavior of always storing out-of-row. With this option we can force a Text type column value to be stored in-row up to 7000 bytes, as long as enough space is available in the row.
Execute the following statement to change the Text type column’s storage behavior back to the default, where Text type column values are always stored out-of-row even when there is sufficient space in the row.
    EXEC sp_tableoption
        @TableNamePattern = 'dbo.TextTable',
        @OptionName = 'text in row',
        @OptionValue = 'OFF' |
As we have already seen above, for a VarChar(MAX) type column Sql Server by default always tries to store the data in-row. Only if the value exceeds 8000 bytes or the available space in the row does it store the value out-of-row in LOB data pages, leaving a 16-byte pointer in the row to the LOB data pages where the actual column value is stored. Sql Server provides a mechanism to change this default storage behavior for a VarChar(Max) type column: the sp_tableoption system stored procedure with the option ‘large value types out of row’. Execute the below statement to always store the Varchar(Max) column value out-of-row, whether it is 1 byte or 2GB, even when enough space is available in the row.
    EXEC sp_tableoption
        @TableNamePattern = 'dbo.VarMaxTable',
        @OptionName = 'large value types out of row',
        @OptionValue = 1
The @OptionValue parameter value can be: 1 (values are always stored out-of-row, with a 16-byte pointer in the row) or 0 (the default, values of up to 8000 bytes are stored in-row as long as they fit in the row).
Execute the following script to remove the previously inserted records and insert 100 records where the LargeString column value in each row is 4,000 ‘B’ characters (i.e. 4,000 bytes).
    TRUNCATE TABLE dbo.VarMaxTable
    GO
    INSERT INTO dbo.VarMaxTable (LargeString)
    VALUES(REPLICATE('B', 4000))
    GO 100
Execute the following statement to check whether the VarChar(Max) type column value is stored in-row or out-of-row:
    SELECT alloc_unit_type_desc, page_count
    FROM sys.dm_db_index_physical_stats
        (DB_ID('SqlHintsLOBDemo'),
         OBJECT_ID('dbo.VarMaxTable'),
         NULL, NULL, 'DETAILED')
From the above result we can see that the VarChar(Max) type column values are stored out-of-row even though sufficient space was available in the row. So we can use the ‘large value types out of row’ option of the sp_tableoption system stored procedure to change the default storage behavior of Varchar(Max) type columns.
Execute the following statement to change the Varchar(Max) type column’s storage behavior back to the default, where Sql Server always tries to store the data in-row and stores it out-of-row in LOB data pages only if it exceeds 8000 bytes or the available space in the row.
    EXEC sp_tableoption
        @TableNamePattern = 'dbo.VarMaxTable',
        @OptionName = 'large value types out of row',
        @OptionValue = 0 |
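As a quick check of which storage options are currently in effect, the sys.tables catalog view exposes both settings. The following is a small sketch against the demo tables used in this article; the column comments describe the usual reading of the values.

```sql
USE SqlHintsLOBDemo
GO
SELECT name,
       --0 when 'text in row' is OFF, otherwise the
       --in-row byte limit set through sp_tableoption
       text_in_row_limit,
       --1 when 'large value types out of row' is enabled
       large_value_types_out_of_row
FROM sys.tables
WHERE name IN ('TextTable', 'VarMaxTable')
GO
```

Running this before and after each sp_tableoption call is a simple way to confirm that the option change was applied to the intended table.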
Supported/Unsupported Functionalities | |
Some of the string functions, operators or constructs which work on a VarChar(Max) type column may not work on a Text type column. Below are two such examples:
1. = Operator on Text type column
    SELECT *
    FROM TextTable WITH(NOLOCK)
    WHERE LargeString = 'test string'
RESULT: Msg 402, Level 16, State 1, Line 1
2. Group by clause on Text type column
    SELECT LargeString, COUNT(1)
    FROM TextTable WITH(NOLOCK)
    GROUP BY LargeString
RESULT: Msg 306, Level 16, State 2, Line 3
From the above examples we can see that we can’t use the ‘=’ operator or the Group By clause on a Text type column. |
Some of the string functions, operators or constructs which don’t work on a Text type column do work on a VarChar(Max) type column. Below are two such examples:
1. = Operator on VarChar(Max) type column
    SELECT *
    FROM VarMaxTable WITH(NOLOCK)
    WHERE LargeString = 'test string'
RESULT: the statement executes without raising an error.
2. Group by clause on VarChar(Max) type column
    SELECT LargeString, COUNT(1)
    FROM VarMaxTable WITH(NOLOCK)
    GROUP BY LargeString
RESULT: the statement executes without raising an error. |
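When a Text column cannot be migrated yet, the restrictions above can usually be worked around per query. The sketch below, written against the demo table dbo.TextTable, uses LIKE (which is permitted on Text columns) and an explicit CAST to VARCHAR(MAX). It is an illustration rather than a recommendation, since casting every row adds cost on large tables.

```sql
USE SqlHintsLOBDemo
GO
--LIKE is permitted on a Text column. Without wildcards it behaves
--much like '=', although trailing spaces are treated differently.
SELECT Id
FROM dbo.TextTable
WHERE LargeString LIKE 'test string'
GO
--Cast to VARCHAR(MAX) to use the '=' operator
SELECT Id
FROM dbo.TextTable
WHERE CAST(LargeString AS VARCHAR(MAX)) = 'test string'
GO
--Cast to VARCHAR(MAX) to use GROUP BY
SELECT CAST(LargeString AS VARCHAR(MAX)) AS LargeString,
       COUNT(1) AS RecordCount
FROM dbo.TextTable
GROUP BY CAST(LargeString AS VARCHAR(MAX))
GO
```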
System IO Considerations | |
As we know, Text type column values are by default always stored out-of-row in LOB data pages, and the row holds only a 16-byte pointer to the root LOB data page. So if a query doesn’t include the LOB column, fewer pages need to be read to retrieve the data, because the column data is out-of-row. But if the query includes the LOB column, more pages need to be read to retrieve the data. |
As we know, VarChar(Max) type column values are stored out-of-row only if the value is larger than 8000 bytes or there is not enough space in the row; otherwise they are stored in-row. So if most of the values stored in the VarChar(Max) column are large and stored out-of-row, the data retrieval behavior will be almost the same as that of a Text type column. But if most of the values are small enough to be stored in-row, a query that doesn’t include the LOB column has to read more data pages, because the LOB column value sits in the same data page as the non-LOB column values. A query that does include the LOB column, however, has to read fewer pages than it would for a Text type column. |
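One way to observe the page reads described above is SET STATISTICS IO. The following sketch assumes both demo tables currently hold the 4,000-byte rows and use their default storage options. It shows no expected numbers, since the actual read counts depend on the data and the Sql Server version.

```sql
USE SqlHintsLOBDemo
GO
SET STATISTICS IO ON
GO
--Queries that do not touch the LOB column
SELECT Id FROM dbo.TextTable
SELECT Id FROM dbo.VarMaxTable
GO
--Queries that include the LOB column
SELECT Id, LargeString FROM dbo.TextTable
SELECT Id, LargeString FROM dbo.VarMaxTable
GO
SET STATISTICS IO OFF
GO
```

In the Messages output, compare the "logical reads" and "lob logical reads" values reported for each table.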