Wishing you all the best, a new year full of health, happiness, success and even more SQL Server knowledge!!! :)
As this is my first blog post in 2010, I thought that it should focus on a really hot topic! I hope you will enjoy it!
I called this post “The PowerPivot Experiment”. Here’s a little background information on how I decided writing this article.
When PowerPivot (formerly known as "Gemini") was released, I really wanted to test its real capabilities by trying to import a really large amount of information right into excel and perform aggregations. To this end, I designed a simple relational database with five tables, populated it with data and performed my experiment using PowerPivot. For the record, the total amount of data was more than 100 Million Rows.
In this post I will explain step-by-step the experiment, starting from the database design, then explaining the import process and finally, provide some examples on how I processed this huge amount of data.
Before describing the steps, I would like to list the several software technologies used, as well as the hardware and O/S configuration of my infrastructure.
Hardware –O/S Configuration:
- OS Name: Microsoft Windows Server 2008 R2 Enterprise
- OS Version: 6.1.7600 Build 7600
- System Type: x64-based PC
- Processor: Intel(R) Core(TM)2 Duo CPU T7250 @ 2.00GHz, 2001 Mhz, 2 Core(s), 2 Logical Processor(s)
- Installed Physical Memory (RAM): 4.00 GB
- Total Physical Memory: 4.00 GB
- Available Physical Memory: 2.47 GB
- Total Virtual Memory: 8.00 GB
- Available Virtual Memory: 6.10 GB
- Page File Space: 4.00 GB
- MS SQL Server 2008 R2
- MS SQL Server PowerPivot
- MS Excel 2010
- Red Gate’s SQL Data Generator (for populating my database with sample data)
The following database diagram displays the tables participating in the DB Schema used in my experiment along with the table sizes in terms of total number of records:
Database size and other statistics:
Importing the Data Using PowerPivot
Step 1: Starting up Excel 2010 and launching PowerPivot Window
Step 2: Establishing a connection to a SQL Server Instance
Step 3: Selecting SQL Server Database Tables for importing it into Excel 2010
Step 4: Start Importing Data (Time: 15:46)
Let’s take a look on the status of the import process as it is occurring:
Step 5: Import completed! Total Time: 15:54 – 15:46 = 8 minutes!!!
It only took 8 minutes for the entire import process to complete! I personally believe that this is awesome! :)
Total number of records imported: 40,000 + 60,000 + 500,000 + 100,000,000 + 5,000 = 100,605,000
Working with the pivot tables
Right after importing the data, I proceeded with creating two pivot tables (by clicking on the “PivotTable” button in the PowerPivot window) for being able to easily perform aggregations:
The following two screenshots are examples of some basic processing I did against the data I imported into Excel using PowerPivot. Using the PivotChart tools, I visually built OLAP queries answering business questions like: (i) What is the total sales amount by product, and (ii) What is the total sales amount by customer:
Note that you can also modify the charts as you could also do in earlier versions of Excel. For example, in the below screenshot you can see that the chart “Total Sales by Customer ID” was modified in order to illustrate the data using a pie chart:
Saving the workbook
After I finished with aggregating the data using the PivotChart tools, I saved the workbook. Note that by saving the workbook, the data that were imported earlier into the workbook using PowerPivot, are stored IN the workbook and you can access it anywhere as they are contained in the workbook file as a binary object.
In my example, the database size was around 2300 MB. If you take a look at the following screenshot which presents the properties of the saved workbook, you can see that the file size is 732 MB:
After performing the above practice, I was able to see in action and test some of the powerful capabilities of PowerPivot.
The data compression, the speed of the import process, the ease of creating rich Pivot Tables and OLAP queries are some of the great features of PowerPivot that take Business Intelligence a step further!
Well, that was a description of my “PowerPivot Experiment” :)
I hope you found this post useful!
There is more to come on PowerPivot so make sure you check back my blog soon!