Tag Archives: Difference Between TEXT and VARCHAR(MAX) Data Type

Differences Between Sql Server TEXT and VARCHAR(MAX) Data Type

The objective of this article is to compare the legacy Large OBject (LOB) data type TEXT and the Sql Server 2005 introduced VARCHAR(MAX) LOB data type.

[ALSO READ] Difference Between Sql Server VARCHAR and VARCHAR(MAX)

TEXT VarChar(MAX)
Basic Definition

It is a Non-Unicode large Variable Length character data type, which can store maximum of 2147483647 Non-Unicode characters (i.e. maximum storage capacity is: 2GB).

It is a Non-Unicode large Variable Length character data type, which can store maximum of 2147483647 Non-Unicode characters (i.e. maximum storage capacity is: 2GB).

Version of the Sql Server in which it is introduced?

Text data type was present from the very old versions of Sql Server. If I remember correctly it was present even in Sql Server 6.5 days.

VarChar(Max) data type was introduced in Sql Server 2005.

Which one to Use?

As per MSDN link Microfost is suggesting to avoid using the Text datatype and it will be removed in a future versions of Sql Server.

Varchar(Max) is the suggested data type for storing the large string values instead of Text data type.

In-Row or Out-of-Row Storage?

Data of a Text type column is stored out-of-row in a separate LOB data pages. The row in the table data page will only have a 16 byte pointer to the LOB data page where the actual data is present.

Data of a Varchar(max) type column is stored in-row if it is less than or equal to 8000 byte. If Varchar(max) column value is crossing the 8000 bytes then the Varchar(max) column value is stored in a separate LOB data pages and row will only have a 16 byte pointer to the LOB data page where the actual data is present.

When LOB column value is less than 8000 bytes or available space in the row, then whether LOB column value is stored in-row or out-of-row?

Execute the following script to create a demo database SqlHintsLOBDemo if it doesn’t exists already. In the demo data base it creates a table TextTable with a Text LOB type column LargeString. Finally it inserts 100 records, where LargeString column value in each row is 4000 B characters (i.e. 4000 bytes).

--Create a Demo Database
IF DB_ID('SqlHintsLOBDemo')
     IS NULL
CREATE DATABASE SqlHintsLOBDemo
GO
USE SqlHintsLOBDemo
GO
--Create a table with Text type 
--column
CREATE TABLE dbo.TextTable
(
	Id INT IDENTITY(1,1),
	LargeString TEXT
)
GO
--INSERT 100 records, where  
--LargeString column value in 
--each row is 4000 B characters
-- (i.e. 4000 bytes)
INSERT INTO dbo.TextTable
(LargeString)
VALUES(REPLICATE('B', 4000))
GO 100 --Loop 100 times

[ALSO READ] GO Statement can also be used to excute batch of T-Sql statement multiple times

Execute the following statement to check whether the Text type column value is stored in-row or out-of-row:

SELECT alloc_unit_type_desc, 
 page_count
FROM
 sys.dm_db_index_physical_stats 
   (DB_ID('SqlHintsLOBDemo'),
    OBJECT_ID('dbo.TextTable'),
    NULL, NULL , 'DETAILED')

RESULT:
TextData Typ Page Allocation Column Value Less Than 8000 Bytes

From the above result we can see that even though one row (4 Byte for Integer Id column value + 4000 Bytes for Text type column value) can fit into one 8KB data page, but still as per design Sql Server always stores the Text type column value in the LOB data pages. Whether we are storing 1 byte or 2GB data in a Text type column Sql Server always stores the Text type column value out-of-row in the LOB data pages and the row will have a 16 byte pointer pointing to the LOB data pages where the data is stored.

Execute the following script to create a demo database SqlHintsLOBDemo if it doesn’t exists already. In the demo data base it creates a table VarMaxTable with a VarChar(Max) LOB type column LargeString. Finally it inserts 100 records, where LargeString column value in each row is 4000 B characters (i.e. 4000 bytes).

--Create a Demo Database
IF DB_ID('SqlHintsLOBDemo')
   IS NULL
CREATE DATABASE SqlHintsLOBDemo
GO
USE SqlHintsLOBDemo
GO
--Create a table with a 
--Varchar(Max) type column
CREATE TABLE dbo.VarMaxTable
(
	Id INT IDENTITY(1,1),
	LargeString VARCHAR(MAX)
)
GO
--INSERT 100 records, where
--LargeString column value in 
--each row is 4000 B characters 
--(i.e. 4000 bytes)
INSERT INTO dbo.VarMaxTable
 (LargeString)
VALUES(REPLICATE('B', 4000))
GO 100

Execute the following statement to check whether the VarChar(Max) type column value is stored in-row or out-of-row:

SELECT alloc_unit_type_desc, 
 page_count
FROM
 sys.dm_db_index_physical_stats 
   (DB_ID('SqlHintsLOBDemo'),
    OBJECT_ID('dbo.VarMaxTable'),
    NULL, NULL , 'DETAILED')

RESULT:
VarcharMax Type Page Allocation Column Value Less Than 8000 Bytes

From the above result we can see that LOB VarChar(MAX) type column value is stored in-row. For VarChar(MAX) type column Sql Server by default always tries to store the data in-row. Only if it is exceeding 8000 bytes or available space in the row, then only it stores out-of-row in a LOB data pages and in-row it will have 16 byte pointer to the LOB data pages where actual column value is stored.

When LOB column value is more than 8000 bytes or available space in the row, then whether LOB column value is stored in-row or out-of-row?

Execute the following script to remove the previously inserted records and insert 100 records where LargeString column value in each row is 10,000 B characters (i.e. 10,000 bytes).

--TRUNCATE the table to
--remove all the previously
--Inserted records
TRUNCATE TABLE dbo.TextTable
GO
INSERT INTO dbo.TextTable
 (LargeString)
VALUES(REPLICATE(
 CAST('B' AS VARCHAR(MAX)), 
 10000))
GO 100

Execute the following statement to check whether the Text type column value is stored in-row or out-of-row:

RESULT:
TextData Typ Page Allocation Column Value more Than 8000 Bytes

The above result further re-affirms that: whether we are storing 1 byte or 2GB data in a Text type column Sql Server always stores the Text type column value out-of-row in the LOB data pages and the row will have a 16 byte pointer pointing to the LOB data pages where the data is stored.

Execute the following script to remove the previously inserted records and insert 100 records where LargeString column value in each row is 10,000 B characters (i.e. 10,000 bytes).

--TRUNCATE the table to
--remove all the previously
--Inserted records
TRUNCATE TABLE dbo.VarMaxTable
GO
INSERT INTO dbo.VarMaxTable
 (LargeString)
VALUES(REPLICATE(
 CAST('B' AS VARCHAR(MAX)), 
 10000))
GO 100

Execute the following statement to check whether the VarChar(Max) type column value is stored in-row or out-of-row:

RESULT:
VarcharMax Type Page Allocation Column Value more Than 8000 Bytes

From the above result we can see that LOB VarChar(MAX) type column value is stored out-of-row in a LOB data pages. For VarChar(MAX) type column Sql Server by default always tries to store the data in-row. Only if it is exceeding 8000 bytes or available space in the row, then only it stores out-of-row in a LOB data pages having 16 byte pointer in-row pointing to LOB data pages where actual column value is stored.

Do we have an option to change default In-Row and Out-Of-Row Storage behavior?

As we have already seen above whether we are storing 1 byte or 2GB data in a Text type column Sql Server always stores it out-of-row in the LOB data pages and the row will have a 16 byte pointer pointing to the LOB data pages where the data is stored.

Sql Server provides a mechanism where we can change this default behavior of storing the data out-of-row even when we have a sufficient free space in the row to accommodate the Text type column value, by means of sp_tableoption system stored procedure with the option ‘text in row’.

Execute the below statement to store the Text Type Column value in Row if Text Type column value is less than 7000 bytes or enough space is available in the row.

EXEC sp_tableoption 
 @TableNamePattern =
      'dbo.TextTable',
 @OptionName = 'text in row', 
 @OptionValue = 7000

The @OptionValue parameter value can be:
0/OFF (Default Value): Text type column value is stored out-of-row
ON: Text Type Column value is stored in-row as long as the Text type column value is less than or equal to 256 bytes
integer value from 24 through 7000: specifies the number bytes up to which the text type column value is stored in-row.

Execute the following script to remove the previously inserted records and insert 100 records where LargeString column value in each row is 4,000 ‘B’ characters (i.e. 4,000 bytes).

TRUNCATE TABLE dbo.TextTable
GO
INSERT INTO dbo.TextTable
(LargeString)
VALUES(REPLICATE('B', 4000))
GO 100

Execute the following statement to check whether the Text type column value is stored in-row or out-of-row:

SELECT alloc_unit_type_desc, 
 page_count
FROM
 sys.dm_db_index_physical_stats 
   (DB_ID('SqlHintsLOBDemo'),
    OBJECT_ID('dbo.TextTable'),
    NULL, NULL , 'DETAILED')

RESULT:
Changing Text Data Type Default Storgae Behaviour

From the above result now we can see that the Text column values are stored in-row. So we can use sp_tableoption system stored procedures option ‘text in row’ to change the text data types default storage behavior of always storing out-of-row. With this option we an force text data type column value to store in-row up-to 7000 bytes or till the enough space is available in the row.

Execute the following statement to change back the Text type columns storage behavior to the default behavior where Text type columns values are always stored out-of-row even we have sufficient space in the row.

EXEC sp_tableoption
 @TableNamePattern =
      'dbo.TextTable',
 @OptionName = 'text in row',
 @OptionValue = 'OFF'

As we have already seen above for VarChar(MAX) type column Sql Server by default always tries to store the data in-row. Only if it is exceeding 8000 bytes or available space in the row, then only it stores out-of-row in a LOB data pages and in-row it will have 16 byte pointer to the LOB data pages where actual column value is stored.

Sql Server provides a mechanism where we can change this default behavior of storing the data for VarChar(Max) type column, by means of sp_tableoption system stored procedure with the option ‘large value types out of row’.

Execute the below statement to always store Varchar(Max) column value out-of-Row whether it is 1 byte or 2GB even when enough space is available in the row.

EXEC sp_tableoption 
 @TableNamePattern =
  'dbo.VarMaxTable',
 @OptionName = 
  'large value types out of row',
 @OptionValue = 1

The @OptionValue parameter value can be:
0 (Default) : Varchar(Max) column values are stored in-row as long as the value length is <= 8000 bytes and enough space is available in the row. 1 : VarChar(Max) column values are always stored out-of-row even when enough space is available in the row.

Execute the following script to remove the previously inserted records and insert 100 records where LargeString column value in each row is 4,000 ‘B’ characters (i.e. 4,000 bytes).

TRUNCATE TABLE dbo.VarMaxTable
GO
INSERT INTO dbo.VarMaxTable
 (LargeString)
VALUES(REPLICATE('B', 4000))
GO 100

Execute the following statement to check whether the VarChar(Max) type column value is stored in-row or out-of-row:

SELECT alloc_unit_type_desc, 
 page_count
FROM
 sys.dm_db_index_physical_stats 
   (DB_ID('SqlHintsLOBDemo'),
    OBJECT_ID('dbo.VarMaxTable'),
    NULL, NULL , 'DETAILED')

RESULT:
Changing VarcharMax column Default Storgae Behaviour

From the above result we can see that VarChar(Max) type column values are stored out-of-row even when there was a sufficient space available in the row. So we can use the sp_tableoption system stored procedures option ‘large value types out of row’ to change the Varchar(Max) data type columns default storage behavior.

Execute the following statement to change back the Varchar(Max) type columns storage behavior to the default behavior where Sql Server by default always tries to store the data in-row. Only if it is exceeding 8000 bytes or available space in the row, then only it stores out-of-row in a LOB data pages.

EXEC sp_tableoption 
 @TableNamePattern =
  'dbo.VarMaxTable',
 @OptionName = 
  'large value types out of row',
 @OptionValue = 0
Supported/Unsupported Functionalities
Some of the string functions, operators or the constructs which work on VarChar(Max) type column may not work on the Text type column.
Below are two such example functions, operators or constructs:
1. = Operator on Text type column

SELECT *
FROM TextTable WITH(NOLOCK)
WHERE LargeString = 'test string'

RESULT:

Msg 402, Level 16, State 1, Line 1
The data types text and varchar are incompatible in the equal to operator.

2. Group by clause on Text type column

SELECT LargeString, COUNT(1)
FROM VarMaxTable WITH(NOLOCK)
GROUP BY LargeString

RESULT:

Msg 306, Level 16, State 2, Line 3
The text, ntext, and image data types cannot be compared or sorted, except when using IS NULL or LIKE operator.

From the above examples we can see that we can’t use ‘=’ operator on a Text type column and also the Group By clause on the Text type column.

Some of the string functions, operators or the constructs which doesn’t work on the Text type column, but they do work on VarChar(Max) type column.
Below are two such example functions, operators or constructs:
1. = Operator on VarChar(Max) type column

SELECT *
FROM VarMaxTable WITH(NOLOCK)
WHERE LargeString = 'test string'

RESULT:
Equal to Operator on a VarCharMax type column
2. Group by clause on VarChar(Max) type column

SELECT LargeString, COUNT(1)
FROM VarMaxTable WITH(NOLOCK)
GROUP BY LargeString

RESULT:
Group By Clause on VarCharMax Type Column
From the above examples we can see that we use ‘=’ operator and Group By clause on the VarChar(Max) type column, but not on the Text type column. If data stored in the Varchar(Max) column is a very large strings, then using these functions may lead to performance issues.

System IO Considerations

As we know that the Text type column values are always stored out-of-row in LOB data pages and in-row it will have a 16 byte pointer pointing to the root LOB data page. So if the query doesn’t include the LOB columns then the number of pages required to read to retrieve the data will be less as the column data is out-of-row. But if the query includes the LOB columns, then the number of pages required to retrieve the data will be more.

As we know that the VarChar(Max) type column values are stored out-of-row only if the length of the value to be stored in it is greater than 8000 bytes or there is not enough space in the row, otherwise it will store it in-row. So if most of the values stored in the VarChar(Max) column are large and stored out-of-row, the data retrieval behavior will almost similar to the one that of the Text type column.

But if most of the values stored in VarChar(Max) type columns are small enough to store in-row. Then retrieval of the data where LOB columns are not included requires the more number of data pages to read as the LOB column value is stored in-row in the same data page where the non-LOB column values are stored. But if the select query includes LOB column then it requires less number of pages to read for the data retrieval compared to the Text type columns.

ALSO READ