Skip to content

Tag: Azure Data Factory

Variables in Azure Data Factory

This post is part 21 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we talked about why you would want to build a dynamic solution, then looked at how to use parameters. In this post, we will look at variables, how they are different from parameters, and how to use the set variable and append variable activities.

Variables

Parameters are external values passed into pipelines. They can’t be changed inside a pipeline. Variables, on the other hand, are internal values that live inside a pipeline. They can be changed inside that pipeline.

Parameters and variables can be completely separate, or they can work together. For example, you can pass a parameter into a pipeline, and then use that parameter value in a set variable or append variable activity.

Continue reading →

Parameters in Azure Data Factory

This post is part 20 of 25 in the series Beginner's Guide to Azure Data Factory

In the last mini-series inside the series (🙃), we will go through how to build dynamic pipelines in Azure Data Factory. In this post, we will look at parameters, expressions, and functions. Later, we will look at variables, loops, and lookups. Fun!

But first, let’s take a step back and discuss why we want to build dynamic pipelines at all.

Hardcoded Solutions

Back in the post about the copy data activity, we looked at our demo datasets. The LEGO data from Rebrickable consists of nine CSV files. So far, we have hardcoded the values for each of these files in our example datasets and pipelines.

Now imagine that you want to copy all the files from Rebrickable to your Azure Data Lake Storage account. Then copy all the data from your Azure Data Lake Storage into your Azure SQL Database. What will it look like if you have to create all the individual datasets and pipelines for these files?

Like this. It will look like this:

Screenshot of nine different datasets connecting to the Rebrickable website
Screenshot of nine different datasets connecting to Azure Data Lake Storage
Screenshot of nine different datasets connecting to Azure SQL Database
Screenshot of nine different pipelines copying data from the Rebrickable website to Azure Data Lake Storage
Screenshot of nine different pipelines copying data from Azure Data Lake Storage to Azure SQL Database

Hooboy! I don’t know about you, but I do not want to create all of those resources! 🤯

(And I mean, I have created all of those resources, and then some. I currently have 56 hardcoded datasets and 72 hardcoded pipelines in my demo environment, because I have demos of everything. And I don’t know about you, but I never want to create all of those resources again! 😂)

So! What can we do instead?

Dynamic Solutions

We can build dynamic solutions!

Continue reading →

Templates in Azure Data Factory

This post is part 19 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at setting up source control. Once we did that, a new menu popped up under factory resources: templates! In this post, we will take a closer look at this feature. What is the template gallery? How can you create pipelines from templates? And how can you create your own templates?

Let’s hop straight into Azure Data Factory!

Using Templates from the Template Gallery

From the Home page, you can create pipelines from templates:

Screenshot of the Azure Data Factory Home page, highlighting the create pipeline from template option
Continue reading →

Source Control in Azure Data Factory

This post is part 18 of 25 in the series Beginner's Guide to Azure Data Factory

Raise your hand if you have wondered why you can only publish and not save anything in Azure Data Factory 🙋🏼‍♀️ Wouldn’t it be nice if you could save work in progress? Well, you can. You just need to set up source control first! In this post, we will look at why you should use source control, how to set it up, and how to use it inside Azure Data Factory.

And yeah, I usually recommend that you set up source control early in your project, and not on day 18… However, it does require some external configuration, and in this series I wanted to get through the Azure Data Factory basics first. But by now, you should know enough to decide whether or not to commit to Azure Data Factory as your data integration tool of choice.

Get it? Commit to Azure Data Factory? Source Control? Commit? 🤓

Ok, that was terrible, I know. But hey, I’ve been writing these posts for 18 days straight now, let me have a few minutes of fun with Wil Wheaton 😂

Aaaaanyway!

Continue reading →

Executing SSIS Packages in Azure Data Factory

This post is part 17 of 25 in the series Beginner's Guide to Azure Data Factory

Two posts ago, we looked at the three types of integration runtimes and created an Azure integration runtime. In the previous post, we created a self-hosted integration runtime for copying SQL Server data. In this post, we will complete the integration runtime part of the series. We will look at what SSIS Lift and Shift is, how to create an Azure-SSIS integration runtime, and how you can start executing SSIS packages in Azure Data Factory.

(And if you don’t work with SSIS, today is an excellent day to take a break from this series. Go do something fun! Like eat some ice cream. I’m totally going to eat ice cream after publishing this post 🙃)

Continue reading →
Secured By miniOrange