
Tag: Azure Data Factory

Templates in Azure Data Factory

This post is part 20 of 26 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at setting up source control. Once we did that, a new menu popped up under factory resources: templates! In this post, we will take a closer look at this feature. What is the template gallery? How can you create pipelines from templates? And how can you create your own templates?

Let’s hop straight into Azure Data Factory!

Using Templates from the Template Gallery

From the Home page, you can create pipelines from templates:

Screenshot of the Azure Data Factory Home page, highlighting the create pipeline from template option

Source Control in Azure Data Factory

This post is part 19 of 26 in the series Beginner's Guide to Azure Data Factory

Raise your hand if you have wondered why you can only publish and not save anything in Azure Data Factory 🙋🏼‍♀️ Wouldn’t it be nice if you could save work in progress? Well, you can. You just need to set up source control first! In this post, we will look at why you should use source control, how to set it up, and how to use it inside Azure Data Factory.

And yeah, I usually recommend that you set up source control early in your project, and not on day 18… However, it does require some external configuration, and in this series I wanted to get through the Azure Data Factory basics first. But by now, you should know enough to decide whether or not to commit to Azure Data Factory as your data integration tool of choice.

Get it? Commit to Azure Data Factory? Source Control? Commit? 🤓

Ok, that was terrible, I know. But hey, I’ve been writing these posts for 18 days straight now, let me have a few minutes of fun with Wil Wheaton 😂

Aaaaanyway!


Executing SSIS Packages in Azure Data Factory

This post is part 18 of 26 in the series Beginner's Guide to Azure Data Factory

Two posts ago, we looked at the three types of integration runtimes and created an Azure integration runtime. In the previous post, we created a self-hosted integration runtime for copying SQL Server data. In this post, we will complete the integration runtime part of the series. We will look at what SSIS Lift and Shift is, how to create an Azure-SSIS integration runtime, and how you can start executing SSIS packages in Azure Data Factory.

(And if you don’t work with SSIS, today is an excellent day to take a break from this series. Go do something fun! Like eat some ice cream. I’m totally going to eat ice cream after publishing this post 🙃)


Copy SQL Server Data in Azure Data Factory

This post is part 17 of 26 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at the three different types of integration runtimes. In this post, we will first create a self-hosted integration runtime. Then, we will create a new linked service and dataset using the self-hosted integration runtime. Finally, we will look at some common techniques and design patterns for copying data from and into an on-premises SQL Server.

And when I say “on-premises”, I really mean “in a private network”. It can be either a SQL Server instance on a physical on-premises server, or one running “on-premises” in a virtual machine.

Or, in my case, “on-premises” means a SQL Server 2019 instance running on Linux in a Docker container on my laptop 🤓


Integration Runtimes in Azure Data Factory

This post is part 16 of 26 in the series Beginner's Guide to Azure Data Factory

So far in this series, we have only worked with cloud data stores. But what if we need to work with on-premises data stores? After all, Azure Data Factory is a hybrid data integration service :) To do that, we need to create and configure a self-hosted integration runtime. But before we do that, let’s look at the different types of integration runtimes!

(Pssst! Integration runtimes have been moved into the management page. I’ll be updating the descriptions and screenshots shortly!)

Integration Runtimes

An integration runtime (IR) specifies the compute infrastructure an activity runs on or gets dispatched from. Depending on its type, it can access resources in public networks only, or in both public and private networks.

Or, in Cathrine-speak, using less precise words: An integration runtime specifies what kind of hardware is used to execute activities, where this hardware is physically located, who owns and maintains the hardware, and which data stores and services the hardware can connect to.
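
If it helps to see that definition as code, here is a minimal sketch in Python. It is a toy model, not the Azure Data Factory object model or SDK: the class, the function, and the runtime names (`PublicOnlyIR`, `PublicAndPrivateIR`) are all made up for illustration. The only thing it tries to capture is that a runtime can reach either public networks only, or both public and private networks, and that you need to pick one that can reach your data store.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, Optional


class Network(Enum):
    PUBLIC = "public"
    PRIVATE = "private"


@dataclass(frozen=True)
class IntegrationRuntime:
    """Toy model of an integration runtime (not the real ADF object model)."""
    name: str                      # hypothetical name, for illustration only
    reachable: FrozenSet[Network]  # networks whose resources it can access

    def can_reach(self, network: Network) -> bool:
        return network in self.reachable


def pick_runtime(
    runtimes: Iterable[IntegrationRuntime], network: Network
) -> Optional[IntegrationRuntime]:
    """Return the first runtime that can access the given network, or None."""
    for runtime in runtimes:
        if runtime.can_reach(network):
            return runtime
    return None


# Two hypothetical runtimes matching the two cases in the definition above.
public_only = IntegrationRuntime("PublicOnlyIR", frozenset({Network.PUBLIC}))
public_and_private = IntegrationRuntime(
    "PublicAndPrivateIR", frozenset({Network.PUBLIC, Network.PRIVATE})
)

runtimes = [public_only, public_and_private]
print(pick_runtime(runtimes, Network.PUBLIC).name)   # PublicOnlyIR
print(pick_runtime(runtimes, Network.PRIVATE).name)  # PublicAndPrivateIR
```

A data store in a private network can only be served by the second runtime, which is why the on-premises scenarios need a different runtime than the cloud-only ones.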
