Skip to content

Tag: SQL Server

Data Science with Buck Woody

I got to interview Buck Woody about Data Science at Microsoft Ignite 2017! 🥳

In this interview, we chat about Microsoft Business Analytics and AI (formerly known as Cortana Intelligence Suite), Artificial Intelligence in Excel, intent-based programming, Predictive Analytics, DevOps for Data Scientists and life-long learning.

Data Science with Buck Woody - Microsoft Ignite 2017

Table Partitioning in SQL Server - Partition Switching

Partition Switching in SQL Server.

Inserts, updates and deletes on large tables can be very slow and expensive, cause locking and blocking, and even fill up the transaction log. One of the main benefits of table partitioning is that you can speed up loading and archiving of data by using partition switching.

Partition switching moves entire partitions between tables almost instantly. It is extremely fast because it is a metadata-only operation that updates the location of the data, no data is physically moved. New data can be loaded to separate tables and then switched in, old data can be switched out to separate tables and then archived or purged. All data preparation and manipulation can be done in separate tables without affecting the partitioned table.

Using a Numbers Table in SQL Server to insert test data (T-SQL Tuesday #65)

T-SQL Tuesday logo.

T-SQL Tuesday #65 is hosted by Mike Donnelly (@SQLMD). There is no specific topic to write about this month, Mike simply wants us to learn something new and then write a blog post to teach it to others. I want to share something that I only recently learned, something I wish I had known about years ago, something that became part of my toolbox as soon as I discovered it: the Numbers Table. It is so simple and solves so many problems that everyone should know about it 🤓

A Numbers Table (perhaps most known as a Tally Table, sometimes called an Auxiliary Table of Numbers and even referred to as the Swiss Army Knife of SQL Server) is a one-column helper table that contains the numbers 1, 2, 3, 4, 5 and so on all the way up to the-highest-number-you-could-possibly-need.

It can be used to replace slower loops and row-by-row operations with faster set-based operations, generate dates, split strings, find gaps in data sets, expand data sets, insert test data and probably hundreds of other things. There are so many great and detailed articles already published about this topic, so I will stick to the T-SQL Tuesday topic of “Teach Something New” and share the two most recent things I learned: different ways to create a numbers table, and different ways to quickly insert test data by using a numbers table.

Table Partitioning in SQL Server - The Basics

Cheat Sheet of Table Partitioning in SQL Server.

There are many benefits of partitioning large tables. You can speed up loading and archiving of data, you can perform maintenance operations on individual partitions instead of the whole table, and you may be able to improve query performance. However, implementing table partitioning is not a trivial task and you need a good understanding of how it works to implement and use it correctly.

Being a business intelligence and data warehouse developer, not a DBA, it took me a while to understand table partitioning. I had to read a lot, get plenty of hands-on experience and make some mistakes along the way. (The illustration to the left is my Table Partitioning Cheat Sheet.) This post covers the basics of partitioned tables, partition columns, partition functions and partition schemes.

SQL Server, SSIS and Biml Data Types

Biml Wheel.

The table below is a simplified mapping between SQL Server, SSIS and Biml Data Types. It does not include all possible mappings or all data types, but is meant as a quick reference while developing and learning Biml. I mainly created it as a cheat sheet for myself, but I hope other Biml developers will find it useful!

Cheat Sheet for SQL Server, SSIS and Biml Data Types

SQL Server SSIS Biml
bigint DT_I8 Int64
binary DT_BYTES Binary
bit DT_BOOL Boolean
char DT_STR AnsiStringFixedLength
date DT_DBDATE Date
datetime DT_DBTIMESTAMP DateTime
datetime2 DT_DBTIMESTAMP2 DateTime2
datetimeoffset DT_DBTIMESTAMPOFFSET DateTimeOffset
decimal DT_NUMERIC Decimal
float DT_R8 Double
geography DT_IMAGE Object
geometry DT_IMAGE Object
hierarchyid DT_BYTES Object
image (*) DT_IMAGE Binary
int DT_I4 Int32
money DT_CY Currency
nchar DT_WSTR StringFixedLength
ntext (*) DT_NTEXT String
numeric DT_NUMERIC Decimal
nvarchar DT_WSTR String
nvarchar(max) DT_NTEXT String
real DT_R4 Single
rowversion DT_BYTES Binary
smalldatetime DT_DBTIMESTAMP DateTime
smallint DT_I2 Int16
smallmoney DT_CY Currency
sql_variant DT_WSTR Object
text (*) DT_TEXT AnsiString
time DT_DBTIME2 Time
timestamp (*) DT_BYTES Binary
tinyint DT_UI1 Byte
uniqueidentifier DT_GUID Guid
varbinary DT_BYTES Binary
varbinary(max) DT_IMAGE Binary
varchar DT_STR AnsiString
varchar(max) DT_TEXT AnsiString
xml DT_NTEXT Xml

(* These data types will be removed in a future version of SQL Server. Avoid using these data types in new projects, and try to change them in current projects.)