I got to interview Buck Woody about Data Science at Microsoft Ignite 2017! 🥳
In this interview, we chat about Microsoft Business Analytics and AI (formerly known as Cortana Intelligence Suite), Artificial Intelligence in Excel, intent-based programming, Predictive Analytics, DevOps for Data Scientists and life-long learning.
Data Science with Buck Woody - Microsoft Ignite 2017
Inserts, updates and deletes on large tables can be very slow and expensive, cause locking and blocking, and even fill up the transaction log. One of the main benefits of table partitioning is that you can speed up loading and archiving of data by using partition switching.
Partition switching moves entire partitions between tables almost instantly. It is extremely fast because it is a metadata-only operation that updates the location of the data, no data is physically moved. New data can be loaded to separate tables and then switched in, old data can be switched out to separate tables and then archived or purged. All data preparation and manipulation can be done in separate tables without affecting the partitioned table.
T-SQL Tuesday #65 is hosted by Mike Donnelly (@SQLMD). There is no specific topic to write about this month, Mike simply wants us to learn something new and then write a blog post to teach it to others. I want to share something that I only recently learned, something I wish I had known about years ago, something that became part of my toolbox as soon as I discovered it: the Numbers Table. It is so simple and solves so many problems that everyone should know about it 🤓
A Numbers Table (perhaps most known as a Tally Table, sometimes called an Auxiliary Table of Numbers and even referred to as the Swiss Army Knife of SQL Server) is a one-column helper table that contains the numbers 1, 2, 3, 4, 5 and so on all the way up to the-highest-number-you-could-possibly-need.
It can be used to replace slower loops and row-by-row operations with faster set-based operations, generate dates, split strings, find gaps in data sets, expand data sets, insert test data and probably hundreds of other things. There are so many great and detailed articles already published about this topic, so I will stick to the T-SQL Tuesday topic of “Teach Something New” and share the two most recent things I learned: different ways to create a numbers table, and different ways to quickly insert test data by using a numbers table.
There are many benefits of partitioning large tables. You can speed up loading and archiving of data, you can perform maintenance operations on individual partitions instead of the whole table, and you may be able to improve query performance. However, implementing table partitioning is not a trivial task and you need a good understanding of how it works to implement and use it correctly.
Being a business intelligence and data warehouse developer, not a DBA, it took me a while to understand table partitioning. I had to read a lot, get plenty of hands-on experience and make some mistakes along the way. (The illustration to the left is my Table Partitioning Cheat Sheet.) This post covers the basics of partitioned tables, partition columns, partition functions and partition schemes.
The table below is a simplified mapping between SQL Server, SSIS and Biml Data Types. It does not include all possible mappings or all data types, but is meant as a quick reference while developing and learning Biml. I mainly created it as a cheat sheet for myself, but I hope other Biml developers will find it useful!