In this article, we will discuss executing heavy set-based operations against Very Large Databases (VLDBs) in SQL Server.
Introduction
Recently I had to design and execute some heavy set-based T-SQL operations against a large number of very large databases (VLDBs) within a SQL Server instance.
I won’t enter the debate of Row-Based vs. Set-Based processing right now, as I plan to write an article on this in the near future. Instead, I will discuss some concerns that need to be taken into account when performing such operations.
First of all, in terms of hardware resources, you will need enough RAM and, of course, more than one processor (remember to check the Processor Affinity setting on your SQL Server instance).
Managing the Transaction Log Space
Now, the most important aspect when executing such set-based operations against large databases is transaction log space.
As the operation executes, a significant amount of transaction log space is reserved, thus requiring more storage. This happens because the operation is not committed until it completes its execution against the entire data set. So, if your target database’s recovery model is set to FULL, every change performed by the set-based operation is fully logged, requiring even more storage.
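If you want to keep an eye on how much log space such an operation consumes while it runs, the query below is a minimal sketch using the sys.dm_db_log_space_usage DMV (available from SQL Server 2012 onwards; [Database_Name] is a placeholder):

-- Report transaction log size and usage for the current database.
USE [Database_Name];
GO
SELECT total_log_size_in_bytes / 1048576.0 AS total_log_MB,
       used_log_space_in_bytes / 1048576.0 AS used_log_MB,
       used_log_space_in_percent
FROM sys.dm_db_log_space_usage;
GO

On older versions, DBCC SQLPERF(LOGSPACE) returns similar information for all databases on the instance.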
How to Ensure that the Set-Based Operation Executes without Issues
So, how can you ensure that the set-based operation will complete without problems such as running out of disk space, and thus avoid the possibility of having the operation abnormally terminated?
Well, there are (at least) two approaches: a complex one and a simple one.
The Complex Approach
The complex approach is to break up the set-based operation so that it targets smaller data sets within the database. Not so good, right? I don’t like it either! In my opinion, this should be the very last option, as “segmenting” the target data set is not a trivial task, mainly in terms of data consistency.
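For illustration purposes only, here’s a minimal sketch of what such a break-up could look like: a batched DELETE where each batch commits separately, so log space can be reused between batches (after a log backup under FULL recovery, or automatically at checkpoint under SIMPLE). The table, column, and cutoff value are hypothetical, and, as noted above, you would still need to carefully handle data consistency across batches:

-- Delete rows in batches of 10,000 so that each batch commits on its own.
-- [dbo].[BigTable] and [ArchiveDate] are hypothetical placeholders.
DECLARE @BatchSize INT = 10000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM [dbo].[BigTable]
    WHERE [ArchiveDate] < '20200101';

    -- Fewer affected rows than the batch size means the last batch is done.
    IF @@ROWCOUNT < @BatchSize
        BREAK;
END;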
The Simple Approach
And here’s the simple approach:
1. Find a date and time when there are no operations targeting the database (this is actually a downtime :)

2. Set the database's recovery model to SIMPLE, as in the following example:

ALTER DATABASE [Database_Name]
SET RECOVERY SIMPLE
GO

3. Execute the set-based operation.

4. Set the database's recovery model back to FULL, as in the following example (see also the backup note right after this list):

ALTER DATABASE [Database_Name]
SET RECOVERY FULL
GO
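One important note on step 4: switching a database from SIMPLE back to FULL breaks the log backup chain, so point-in-time recovery is not possible again until you take a full (or differential) backup. A minimal sketch follows; the backup file path is just a placeholder:

-- Re-establish the log backup chain after switching back to FULL.
-- The backup path below is a hypothetical placeholder.
BACKUP DATABASE [Database_Name]
TO DISK = N'D:\Backups\Database_Name_Full.bak'
WITH INIT, STATS = 10;
GO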
How SQL Server Recovery Models Affect Heavy Set-Based Operations
When your database uses the Full Recovery Model, SQL Server preserves the transaction log until you back it up. This allows you to recover the database to a specific point in time based on the entries in the transaction log. That’s why, when this recovery model is selected, SQL Server logs everything in the database’s log file. Of course, this requires more disk space and slightly affects performance, but it enables you to fully recover your database in the case of a disaster.
When you use the Simple Recovery Model, SQL Server keeps only a minimal amount of information in the log file. Also, the log is automatically truncated by SQL Server whenever the database reaches a transaction checkpoint. Even though this is not the best choice for disaster recovery purposes, it allows processing on the database to be faster.
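Before and after such an operation, you can quickly verify each database’s current recovery model with a simple query against sys.databases:

-- List the recovery model of every database on the instance.
SELECT [name], recovery_model_desc
FROM sys.databases
ORDER BY [name];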
So, as I had to execute some really heavy set-based T-SQL operations, I changed the recovery model to Simple so that processing would be faster and to ensure (as far as possible) that I wouldn’t experience any disk space issues due to the growth of the transaction log files.
Hope this helps!
Comments

Well said, I agree with you.
What exactly do you mean when you say that Simple Recovery “keeps only a minimal amount of information in the log file”? Actions are logged just like in Bulk_Logged: http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx#id0060077
Hi Jack,
Simple and Bulk-Logged Recovery Models indeed both allow high-performance bulk operations.
However, they have some differences.
The Bulk-Logged Recovery Model minimally logs bulk operations; however, it fully logs all other transactions.
The Simple Recovery Model does not support transaction log backups at all. The log is automatically truncated at each checkpoint, so only a minimal amount of log information is retained in the log file at any given time.
Also, the Simple Recovery Model requires the least administration.
The reason I used the Simple Recovery Model for this post’s example is that I just wanted to fully avoid transaction log backups while executing the heavy set-based operations, before switching back to the Full Recovery Model.
Now, when it comes to selecting the recovery model for a proper backup strategy in a Production Environment, you first need to analyze the business requirements that govern the target database(s) in order to make the right choice.
Cheers,
Artemis