Executing Heavy Set-Based Operations Against VLDBs in SQL Server

Executing Heavy Set-Based Operations Against VLDBs in SQL Server

Share on Facebook0Share on Google+0Tweet about this on TwitterShare on LinkedIn0Share on Reddit0
Recently I had to design and execute some heavy set-based T-SQL operations against a large number of very large databases (VLDBs) within a SQL Server 2005 instance.

I won’t enter the debate of Row-Based vs. Set-Based processing right now, as I plan to write an article on this in the near future, but I will rather discuss some concerns that need to be taken into account when performing such operations.

First of all, in terms of hardware resources you will need enough RAM and of course, more than one processor (remember to check the Processor affinity setting in your SQL Server instance).

Now, the most important aspect when executing such operations (set-based) against large databases is the log space.

As the execution of the operation continues, a significant amount of transaction log space is being reserved thus requiring more storage. The reason this is happening is that the operation is not committed until it completes its execution against the entire data set. So, if your target database’s recovery model is set to “FULL” it means that any changes performed by the set-based operation will be logged up to the latest detail thus requiring more storage.

So, how can you ensure that the set-based operation will be completed without any problems such as full disk spaces, etc. and thus avoid the possibility of having the operation abnormally terminated?

Well, there are two approaches (if not more); the easy one and the complex one.

The complex approach is to break-up the set-based operation thus targeting smaller data sets within the database. Not so good right? I don’t like it either! In my opinion, this should be the very last option as “segmenting” the target data set is not a trivial task, mainly in terms of data consistency.

And here’s the simple approach:

1. Find a date and time where there are not operations targeting the database (this is actually a downtime 🙂

2. Set the database’s recovery model to SIMPLE as in the following example:

ALTER DATABASE [Database_Name] SET RECOVERY SIMPLE
GO

3. Execute the set-based operation.

4. Set the database’s recovery model to FULL as in the following example:

ALTER DATABASE [Database_Name] SET RECOVERY FULL
GO

When your database uses the Full Recovery Model, SQL Server preserves the transaction log until you back it up. By doing this, it allows you to recover the database to a specific point of time based on the entries in the transaction log. That’s why when this recovery model is selected SQL Server logs everything in the database’s log file. Of course this requires more disk space and slightly affects performance but it enables you to fully recover your database in the case of a disaster.

When you use the Simple Recovery Model, SQL Server keeps only a minimal amount of information in the log file. Also the file is automatically truncated by SQL Server whenever the database reaches a transaction checkpoint. Even though this is not the best choice for disaster recovery purposes, it allows the processing on the database to be faster.

So, as I had to execute some really heavy set-based T-SQL operations I changed the recovery model to Simple in order for the processing to be faster and to ensure (up to the possible limit) that I wouldn’t experience any disk space issues due to the size increase of the transaction log files.

Hope this helps!


Recommended eBooks on SQL Server:

Tuning SQL Server: eBook by SQL Server MVP Artemakis Artemiou
Tuning SQL Server: eBook by SQL Server MVP Artemakis Artemiou
Administering SQL Server: eBook by SQL Server MVP Artemakis Artemiou
Administering SQL Server: eBook by SQL Server MVP Artemakis Artemiou
Artemakis Artemiou
Artemakis Artemiou is a Senior SQL Server Architect, Author, Software Developer and a Microsoft Data Platform MVP. He has over 15 years of experience in the IT industry in various roles. Among other, via his initiative SQLEBooks.com, Artemakis authors and publishes eBooks on different topics on SQL Server. Artemakis currently serves as the President of the Cyprus .NET User Group (CDNUG) and the International .NET Association Country Leader for Cyprus (INETA). Additionally he is the founder of the SQLArtBits initiative that aims to provide the technical community with simple, yet powerful and high-quality SQL Server tools. Currently, the highlights of these tools are DBA Security Advisor and In-Memory OLTP Simulator. Artemakis's official website can be found at aartemiou.com. Artemakis's blogs can be found at: SQLNetHub.com and TechHowTos.com.

3 thoughts on “Executing Heavy Set-Based Operations Against VLDBs in SQL Server

  1. Artemakis Artemiou [MVP]

    Hi Jack,

    Simple and Bulk-Logged Recovery models indeed both allow high-performance bulk copy operations.

    However they have some differences.

    The Bulk-Logged Recovery Model minimally logs bulk operations. However it fully logs all other transactions.

    The Simple Recovery Model does not backup the transaction log at all. It actually logs only minimal data related to the user that performs operations against the database.

    Also the Simple Recovery Model requires the least administration.

    The reason I used the Simple Recovery Model for this post's example, is that I just
    wanted to fully avoid the transaction log backup in order for executing the heavy set-based operations and switching back to the Full Recovery Model.

    Now, when it comes to selecting the recovery model for a proper backup strategy on a Production Environment, you first need to
    analyze the business requirements that govern the target database(s) in order to make the right choice.

    Cheers,
    Artemis