I’m a data geek :) In fact, I like data so much that I have made it my career! I work with Azure Data and the Microsoft Data Platform, focusing on Data Integration using Azure Data Factory (ADF) and SQL Server Integration Services (SSIS).
In this category, I write technical posts and guides, and share my experiences with certification exams. You can also find a few interviews with Azure and SQL Server experts!
Azure Data posts cover topics like Azure Data Factory, Azure SQL Databases, Azure Data Lake Storage, and Azure Synapse Analytics. Microsoft Data Platform posts may cover topics like SQL Server, T-SQL, and SQL Server Management Studio (SSMS). You may even find the occasional Power BI post in here!
In the previous post, we peeked at the two different data flows in Azure Data Factory, then created a basic mapping data flow. In this post, we will look at orchestrating pipelines using branching, chaining, and the execute pipeline activity.
Let’s continue where we left off in the previous post. How do we wire up our solution and make it look something like this?
We need to make sure that we get the data before we can transform that data.
One way to build this solution is to create a single pipeline with a copy data activity followed by a data flow activity. But! Since we have already created two separate pipelines, and this post is about orchestrating pipelines, let’s go with the second option :D
So far in this Azure Data Factory series, we have looked at copying data. We have created pipelines, copy data activities, datasets, and linked services. In this post, we will peek at the second part of the data integration story: using data flows for transforming data.
But first, I need to make a confession. And it’s slightly embarrassing…
In the previous post, we looked at datasets and their properties. In this post, we will look at linked services in more detail. How do you configure them? What are the authentication options for Azure services? And how do you securely store your credentials?
Let’s start by creating a linked service to an Azure SQL Database. Yep, that linked service you saw screenshots of in the previous post. Mhm, the one I sneakily created already so I could explain using datasets as a bridge to linked services. That one :D
Creating Linked Services
First, click Connections. Then, on the linked services tab, click New:
The New Linked Service pane will open. The Data Store tab shows all the linked services you can get data from or read data to:
In the previous post, we looked at the copy data activity and saw how the source and sink properties changed with the datasets used. In this post, we will take a closer look at some common datasets and their properties.
Let’s start with the source and sink datasets we created in the copy data wizard!
First, a quick note. If you use the copy data wizard, you can change the dataset names by clicking the edit button on the summary page…
In the previous post, we went through Azure Data Factory pipelines in more detail. In this post, we will dig into the copy data activity. How does it work? How do you configure the settings? And how can you optimize performance while keeping costs down?
Copy Data Activity
The copy data activity is the core (*) activity in Azure Data Factory.
(* Cathrine’s opinion 🤓)
You can copy data to and from more than 80 Software-as-a-Service (SaaS) applications (such as Dynamics 365 and Salesforce), on-premises data stores (such as SQL Server and Oracle), and cloud data stores (such as Azure SQL Database and Amazon S3). During copying, you can define and map columns implicitly or explicitly, convert file formats, and even zip and unzip files – all in one task.
Yeah. It’s powerful :) But how does it really work?