Series: Beginner's Guide to Azure Data Factory

Welcome to this Beginner’s Guide to Azure Data Factory! In this series, I’m going to cover the fundamentals of Azure Data Factory in fun, casual, bite-sized blog posts that you can read through at your own pace and reference later. You may not be new to data integration or SQL, but we’re going to start completely from scratch in this series.

How do you get started building data pipelines? What if you need to transform or re-shape data? How do you schedule and monitor your data pipelines? Can you make your solution dynamic and reusable? Join me in this Beginner’s Guide to Azure Data Factory to learn all of these things – and maybe more :) Let’s go!

  1. Introduction to Azure Data Factory
  2. Creating an Azure Data Factory
  3. Overview of Azure Data Factory Components
  4. Copy Data Wizard
  5. Pipelines
  6. Copy Data Activity
  7. Datasets
  8. Linked Services
  9. Data Flows
  10. Orchestrating Pipelines
  11. Debugging Pipelines
  12. Triggers
  13. Monitoring
  14. Annotations and User Properties
  15. Integration Runtimes
  16. Copy SQL Server Data
  17. Executing SSIS Packages
  18. Source Control
  19. Templates
  20. Parameters
  21. Variables
  22. ForEach Loops
  23. Lookups
  24. Understanding Pricing
  25. Resources

P.S. This series will always be a work-in-progress. Yes, always. Azure changes often, so I keep coming back to tweak, update, and improve content :)

Debugging Pipelines in Azure Data Factory

This post is part 11 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at orchestrating pipelines using branching, chaining, and the execute pipeline activity. In this post, we will look at debugging pipelines. How do we test our solutions?

You debug a pipeline by clicking the debug button:

[Screenshot: the Azure Data Factory interface with a pipeline open and the debug button highlighted]

Tadaaa! Blog post done? :D

I joke, I joke, I joke. Debugging pipelines is a one-click operation, but there are a few more things to be aware of. In the rest of this post, we will look at what happens when you debug a pipeline, how to see the debugging output, and how to set breakpoints.

Debugging Pipelines

Let’s start with the most important thing:

Continue reading →

Triggers in Azure Data Factory

This post is part 12 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at testing and debugging pipelines. But how do you schedule your pipelines to run automatically? In this post, we will look at the different types of triggers in Azure Data Factory.

Let’s start by looking at the user interface, then dig into the details of the different trigger types.

Creating Triggers

First, click Triggers. Then, on the triggers tab, click New:

[Screenshot: the Azure Data Factory user interface with Triggers open, highlighting the button for creating a new trigger]

The New Trigger pane will open. The default trigger type is Schedule, but you can also choose Tumbling Window and Event:

[Screenshot: the Azure Data Factory user interface with the New Trigger pane open and the different trigger types highlighted]

Let’s look at each of these trigger types and their properties :)
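Before we dig in, here’s a little sneak peek: behind the scenes, a trigger is just a small JSON definition. As a minimal sketch (with placeholder trigger and pipeline names, not the exact definitions we use in this series), a schedule trigger that runs hourly could look something like this:

```json
{
    "name": "HourlyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Hour",
                "interval": 1,
                "startTime": "2020-01-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "OrchestrationPipeline",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}
```

The recurrence block defines the schedule, and the pipelines array attaches the trigger to one or more pipelines.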

Continue reading →

Monitoring Azure Data Factory

This post is part 13 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at the three different trigger types, as well as how to trigger pipelines on demand. In this post, we will look at what happens after that. How does monitoring work in Azure Data Factory?

Now, if we want to look at monitoring, we probably need something to monitor first. I mean, I could show you a blank dashboard, but I kind of already did that, and that wasn’t really interesting at all 🤔 So! In the previous post, I created a schedule trigger that runs hourly, added it to my orchestration pipeline, and published it.

Let’s take a look at what has happened since then!

Continue reading →

Annotations and User Properties in Azure Data Factory

This post is part 14 of 25 in the series Beginner's Guide to Azure Data Factory

In the previous post, we looked at how monitoring and alerting work. But what if we want to customize the monitoring views even further? There are a few ways to do that in Azure Data Factory. In this post, we will add both annotations and user properties.

But before we do that, let’s look at a few more ways to customize the monitoring views.

Customizing Monitoring Views

In the previous post, we mainly looked at how to configure the monitoring and alerting features. We saw that we could change filters and switch between list and Gantt views, but it’s possible to tweak the interface even more to our liking.
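As a sneak peek at where we’re going: annotations and user properties both end up in the JSON definitions. Annotations are an array on the pipeline itself, while user properties are added per activity. Here’s a minimal sketch with placeholder names (and the activity details left out):

```json
{
    "name": "OrchestrationPipeline",
    "properties": {
        "annotations": [ "BeginnersGuide", "Hourly" ],
        "activities": [
            {
                "name": "Copy Data",
                "type": "Copy",
                "userProperties": [
                    { "name": "Source", "value": "AdventureWorks" }
                ]
            }
        ]
    }
}
```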

Continue reading →

Integration Runtimes in Azure Data Factory

This post is part 15 of 25 in the series Beginner's Guide to Azure Data Factory

So far in this series, we have only worked with cloud data stores. But what if we need to work with on-premises data stores? After all, Azure Data Factory is a hybrid data integration service :) To do that, we need to create and configure a self-hosted integration runtime. But before we do that, let’s look at the different types of integration runtimes!

Integration Runtimes

An integration runtime (IR) specifies the compute infrastructure an activity runs on or gets dispatched from. It has access to resources either in public networks only, or in both public and private networks.

Or, in Cathrine-speak, using less precise words: An integration runtime specifies what kind of hardware is used to execute activities, where this hardware is physically located, who owns and maintains the hardware, and which data stores and services the hardware can connect to.
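Integration runtimes are also just small JSON definitions behind the scenes. A self-hosted integration runtime, for example, is little more than a name and a type (the name and description here are placeholders):

```json
{
    "name": "OnPremisesIR",
    "properties": {
        "type": "SelfHosted",
        "description": "Runs activities on our own hardware and connects to data stores in our private network"
    }
}
```

The interesting part happens afterwards, when you install the self-hosted integration runtime software on your own server and register it with this definition.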

Continue reading →