Using a Numbers Table in SQL Server to insert test data (T-SQL Tuesday #65)

T-SQL TuesdayT-SQL Tuesday #65 is hosted by Mike Donnelly (@SQLMD). There is no specific topic to write about this month, Mike simply wants us to learn something new and then write a blog post to teach it to others. I want to share something that I only recently learned, something I wish I had known about years ago, something that became part of my toolbox as soon as I discovered it: the Numbers Table. It is so simple and solves so many problems that everyone should know about it :)

A Numbers Table (perhaps most known as a Tally Table, sometimes called an Auxiliary Table of Numbers and even referred to as the Swiss Army Knife of SQL Server) is a one-column helper table that contains the numbers 1, 2, 3, 4, 5 and so on all the way up to the-highest-number-you-could-possibly-need.

It can be used to replace slower loops and row-by-row operations with faster set-based operations, generate dates, split strings, find gaps in data sets, expand data sets, insert test data and probably hundreds of other things. There are so many great and detailed articles already published about this topic, so I will stick to the T-SQL Tuesday topic of “Teach Something New” and share the two most recent things I learned: different ways to create a numbers table, and different ways to quickly insert test data by using a numbers table.

Create a Numbers Table
My preferred way of creating a numbers table is to use Itzik Ben-Gan’s solution described in the Virtual Auxiliary Table of Numbers article. The solution uses CROSS JOINs to generate rows and then a ROW_NUMBER() to create the actual sequence of numbers.

You can use Itzik Ben-Gan’s solution to generate a virtual numbers table when needed, you can create a function that returns a numbers table with a certain number of rows, or you can create an actual numbers table that live in your database.

Notice the highlighted lines. By creating the function or table in your model database, they will automatically be created in new databases for you. Handy, right? :)

Insert Test Data
I learned about the numbers table while researching table partitioning. All I needed was to insert a specified number of rows in each partition so I could test things like partition switching, splitting and merging. I didn’t want my tests to depend on existing data.

Writing every single insert statement was not an option. I first tried GO [count] to insert a row a specified number of times. The syntax is easy to remember and it works ok on a small amount of data:

However, GO [count] works like a WHILE loop, it executes a batch a specified number of times. The more times you execute a batch, the longer it takes. On average, inserting 1000 rows took 0-1 seconds. Inserting 10000 rows took 3-4 seconds. Inserting 100000 rows took 30-35 seconds. And inserting 1 million rows took 5.5 minutes!

After some research I discovered the numbers table, created the Numbers Table Function, and rewrote my queries:

What a difference! It was blazing fast compared to using GO [count]. On average, inserting 1000 rows took 2 milliseconds. Inserting 10000 rows took 25 milliseconds. Inserting 100000 rows took 140 milliseconds. And inserting 1 million rows took 1.5 seconds instead of 5.5 minutes!

This worked great until I wanted to insert 1000 rows per day instead of year. I didn’t have to write every single insert statement, but I still had to write an insert statement for every single day. That was not an option, so I rewrote my queries yet again. This time I used the same CROSS JOIN logic as Itzik Ben-Gan used in his numbers table solution. I first created rows for all the days and then cross joined to get 1000 rows for each day:

Each of these queries only take a couple of seconds. My preferred way of inserting test data is the highlighted query that uses the Numbers Table Function. It is short and the syntax is easy to remember.

Summary
Create a Numbers Table and / or a Numbers Table Function in the model database so they are instantly created in all new databases. Use CROSS JOIN logic to write short and powerful queries to insert test data. Finally, inserting data is just one use case, have fun and play around with these things to see if you can come up with completely new solutions to solve old problems!

And as a bonus, if you thought this article was interesting, you should absolutely watch Itzik Ben-Gan’s Boost your T-SQL with the APPLY Operator session from SQLBits XIV. He explains how to use APPLY instead of JOIN for more advanced use cases, and I promise some really cool demos. Thank you so much, Itzik! :)

Who is Cathrine Wilhelmsen?

Cathrine is a Microsoft Data Platform MVP, BimlHero, speaker, blogger and chronic volunteer who loves teaching and sharing knowledge. She works as a consultant, technical architect and developer, focusing on Data Warehouse and Business Intelligence projects. She loves sci-fi, chocolate, cat gifs and smilies :)

4 thoughts on “Using a Numbers Table in SQL Server to insert test data (T-SQL Tuesday #65)”

Pingback: Using a Numbers Table in SQL Server to insert test data | just dave info

Pingback: T-SQL Tuesday #065 – Teach Something New (Roundup) | Mike Donnelly, SQLMD

hi catherine, what is the ‘n’ here, i couldn’t understand. i couldn’t understand the expression
dateadd(day, dates.n-1,’2012-01-01′) could you please explain the two part ‘n’ convention?

N is the number that you get from the Numbers table or GetNums function. I have aliased these as dates, which is why I use dates.n. I encourage you to pick apart the code examples and work through them piece by piece to understand the details :)

Share Your Thoughts?