Everyone who works with Data warehouse data loads should be familiar with Minimal Logging. Minimal Logging helps tremendously in improving the performance of data loads as the amount of logging is reduced enormously by keeping track of only the extent allocations but not individual row allocations as happens in Fully logged recovery model. For smaller data loads, improvement may be small but it balloons up in high volume data loads. To get a primer on this topic, a good place is to start using the below links.
Minimal Logging doesn’t happen if you use only BULK_LOGGED recovery model. There are certain additional requirements that should be met for utilizing the Minimal Logging. The following link should be referred to understand the complete requirements necessary for Minimal Logging. For this post, we will take few cases and interrogate using Internals of SQL Server to figure out if Minimal Logging is happening or NOT.
Test 1.a : WITHOUT TABLOCK for an Empty Table
To start with, let’s go ahead with a simple scenario of loading data into an empty table using OPENROWSET (BULK) with the help of a format file.
The following script, creates a database with BULK_LOGGED recovery model, takes a full back up to avoid being in Pseudo simple recovery model, a log backup and finally creates a table.
USE master; GO IF DB_ID('Bulk2008R2') IS NOT NULL DROP DATABASE Bulk2008R2 GO CREATE DATABASE Bulk2008R2 ON ( NAME = Bulk2008R2_data, FILENAME = 'L:\Bulk2008R2_Data.mdf', SIZE = 10240, FILEGROWTH = 500 ) LOG ON ( NAME = Bulk2008R2_log, FILENAME = 'L:\Bulk2008R2_log.ldf', SIZE = 1024MB, FILEGROWTH = 500MB ) ; GO --Make sure the database is in BULK_LOGGED recover model ALTER DATABASE Bulk2008R2 SET RECOVERY BULK_LOGGED GO --Backup the database to avoid the pseudo simple recovery model --Unless a Full database backup is taken, new databases are always in pseudo simple recovery model BACKUP DATABASE Bulk2008R2 TO DISK = 'L:\Bulk2008R2.bak' GO BACKUP LOG Bulk2008R2 TO DISK = 'L:\Bulk2008R2_log.trn' GO USE Bulk2008R2 IF OBJECT_ID('dbo.TestBulk') IS NOT NULL DROP TABLE dbo.TestBulk GO CREATE TABLE dbo.TestBulk ( c1 BIGINT NOT NULL , c2 CHAR(4000) NOT NULL , c3 DATETIME NOT NULL , CONSTRAINT PK_TestBulk PRIMARY KEY CLUSTERED (c1) ) GO
Sample data file can be accessed from the below link. For some reason Formatfile can’t be uploaded to WordPress and failing for security reasons. A screenshot of the format file has been added below as well.
Sample data for TestBulk: testBulk.txt
Note: Rename the extension from .txt to .xml to use this format file.
Let’s go ahead and add data to the table WITHOUT using TABLOCK hint like below and check if the data has been added using Minimal Logging.
USE Bulk2008R2; GO INSERT dbo.TestBulk (c1, c2, c3) SELECT c1, c2, c3 FROM OPENROWSET(BULK 'L:\TestBulk.txt', FORMATFILE='L:\TestBulk_FormatFile.xml' ) AS TB; GO
Instead of looking at the Transaction log size or the log operation (logop) names, let’s delve into some internals of the pages to figure out if Minimal Logging happened for the above data load. Using DBCC EXTENTINFO, we can figure out the page id numbers of the table. And using DBCC PAGE, look at the allocation status information as listed below.
--Lets use DBCC EXTENTINFO to look at the pages allocated to the table DECLARE @DBID INT , @ObjectID INT SELECT @DBID = DB_ID() , @ObjectID = OBJECT_ID('dbo.TestBulk') DBCC EXTENTINFO (@DBID, @ObjectID, -1) GO
using DBCC PAGE, we can look at the allocation status of the page.
--Pick a page id from the above results and use it below. DECLARE @DBID INT SELECT @DBID = DB_ID() DBCC TRACEON (3604) DBCC PAGE (@DBID, 1, 153, 3) GO
You need to quickly hop on over to the below links and look at the ML map pages.
In short, in BULK_LOGGED recovery model whenever a minimally logged operation changes data, those extents are tracked in ML map pages that happened from the last transaction log backup. Using DBCC PAGE, one can look at the “Allocation Status” information to figure out if the data is modified using a Minimally Logged operation or NOT.
Test 1.b : WITH TABLOCK for an Empty Table
With the knowledge of the Allocation Status and the ML Map pages and let’s drop and re-create the table and load the same data using the TABLOCK hint this time and look at the page id numbers using the DBCC EXTENTINFO, and using DBCC PAGE look at the allocation status information.
USE Bulk2008R2 IF OBJECT_ID('dbo.TestBulk') IS NOT NULL DROP TABLE dbo.TestBulk GO CREATE TABLE dbo.TestBulk ( c1 BIGINT NOT NULL , c2 CHAR(4000) NOT NULL , c3 DATETIME NOT NULL , CONSTRAINT PK_TestBulk PRIMARY KEY CLUSTERED (c1) ) GO USE Bulk2008R2; GO INSERT dbo.TestBulk WITH (TABLOCK) (c1, c2, c3) SELECT c1, c2, c3 FROM OPENROWSET(BULK 'L:\TestBulk.txt', FORMATFILE='L:\TestBulk_FormatFile.xml' ) AS TB; GO
--Pick a page id from the above results and use it below. DECLARE @DBID INT SELECT @DBID = DB_ID() DBCC TRACEON (3604) DBCC PAGE (@DBID, 1, 153, 1) GO
From the above, it is clear that the data load when TABLOCK hint is specified, Minimal logging has happened. Voila! This is one way of making sure your data loads are doing exactly what you intended them for.
In our simple test case, for Minimal Logging to happen the below conditions had to be met.
- Database is in BULK_LOGGED recovery model.
- Table is empty with a Clustered Index + no additional indexes.
- TABLOCK hint is specified.
In the next few posts, I will delve into some complex scenarios involving TF 610 + others and dig into the internals if Minimal Logging is happening or NOT. Until then, take care!