Please note: Since I wrote this post, Wrangling Data Flows have been renamed to Power Queries, and there have been many updates in Azure Data Factory. I'm keeping this post as-is, so please make sure you also read the official documentation.
In 2019, the Azure Data Factory team announced two exciting features. The first was Mapping Data Flows (currently in Public Preview), and the second was Wrangling Data Flows (currently in Limited Private Preview). Since then, I have heard many questions. One of the more common questions is “which should I use?” In this blog post, we will be comparing Mapping and Wrangling Data Flows to hopefully make it a little easier for you to answer that question.
Should you use Mapping or Wrangling Data Flows?
Now, we all know that the consultant answer to “which should I use?” is It Depends ™ 😄 But what does it depend on?
To me, it boils down to a few key questions you need to ask:
What is the task or problem you are trying to solve?
Where and how will you use the output?
Which tool are you most comfortable using?
Before we dig further into these questions, let’s start with comparing Mapping and Wrangling Data Flows.
On April 4th, 2019, I presented my Pipelines and Packages: Introduction to Azure Data Factory session at 24 Hours of PASS. I was excited to show some cool features and use cases, including how to handle schema drift in the new Mapping Data Flows feature.
Aaaaand… I failed! 🤦🏼‍♀️
Or, more specifically, my demo failed…
...when you test your demo three times and everything is fine, then it fails in your live session, but runs perfectly again once the session is over... 😂
In January 2019, I was honored to be asked to contribute to the PASS Insights BI Edition Newsletter. I said yes, of course! 😊 I chose to create an Azure Data Factory Data Flows introduction video. This is a sneak preview of the upcoming Data Flows feature, with a quick walkthrough of how easy it can be to create scalable data transformations in the cloud - without writing any code!
Last year at Microsoft Ignite, I was fortunate enough to interview Mike Flasko and Sanjay Krishnamurthi. This year, I got to have a follow-up chat with Mike Flasko and Sharon Lo! We talked about the recent and upcoming Azure Data Factory updates 🤓
In this interview, Mike and Sharon share the highlights from their session at Microsoft Ignite 2018. What are visual Data Flows? How are Azure Data Factory Data Flows different from the recently announced Power BI Dataflows? What’s on the Azure Data Factory roadmap? And finally, how can you provide feedback and get involved in private previews?
Azure Data Factory Updates with Mike Flasko and Sharon Lo
(I apologize for the unsteady video 😔 Unfortunately, I didn’t see how shaky it was until post-production. If it gets too distracting to watch, please just listen. Mike and Sharon share a lot of interesting things!)
Thank you so much to Mike and Sharon for chatting with me on a busy day 😃
One of the sessions I was most looking forward to at Microsoft Ignite 2017 was New capabilities for data integration in the cloud with Mike Flasko. In that session, he talks about Azure Data Factory (ADF) v2 and its new first-class SSIS support.
After the session, I convinced Mike Flasko and Sanjay Krishnamurthi to have a chat with me 🤓 We talked about what’s new in Azure Data Factory v2, including the updated pipeline application model with a new visual design canvas, new Software Development Kits (SDKs) for working with Azure Data Factory, the new Integration Runtime, and the ability to run SSIS packages inside Azure Data Factory v2.