Fast Track Your Azure Data Factory v2 ETL Development
By Dawie Kruger, Azure Practice Lead, Australia
In my previous blog, "Using SSIS within Azure Data Factory v2", we looked at the Azure Data Factory SSIS Runtime and how easily the Altis ETL Jumpstart Kit was ported to run in the new environment. In this blog, I will take you through some new developments Altis has made for the Azure Data Factory v2 platform.
Azure Data Factory v2 is a cloud-based data integration service that allows you to create data-driven workflows for orchestrating and automating data movement and data transformation. Using Azure Data Factory, you can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores. It can then process and transform the data using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics and Azure Machine Learning.
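To make the idea of a pipeline more concrete, here is a minimal sketch in Python that builds a dictionary shaped like an Azure Data Factory v2 pipeline definition containing a single Copy activity. The pipeline, activity and dataset names are hypothetical placeholders, and the source and sink types are only one possible combination; this is an illustration, not part of the Altis templates.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return a dict shaped like an ADF v2 pipeline with one Copy activity.

    All names passed in are hypothetical; the datasets they refer to
    would have to be defined separately in the data factory.
    """
    return {
        "name": name,
        "properties": {
            "activities": [
                {
                    "name": "CopySourceToStaging",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [
                        {"referenceName": source_dataset, "type": "DatasetReference"}
                    ],
                    "outputs": [
                        {"referenceName": sink_dataset, "type": "DatasetReference"}
                    ],
                    # Assumed example: blob storage source, SQL sink.
                    "typeProperties": {
                        "source": {"type": "BlobSource"},
                        "sink": {"type": "SqlSink"},
                    },
                }
            ]
        },
    }


pipeline = build_copy_pipeline(
    "IngestSalesData", "SalesBlobDataset", "SalesStagingTable"
)
print(json.dumps(pipeline, indent=2))
```

Generating definitions from a small function like this is one way to keep naming consistent across many pipelines, although the same JSON can equally be authored by hand in the Data Factory editor.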
The next step in the evolution of data integration
Over the past six months, we have been inundated with requests from clients looking to use this platform to extend an existing data warehouse and ETL platform or to implement a new one. In response, we have developed a set of standards and guidelines to streamline Azure Data Factory v2 development. Whether implementing a pure Azure Data Factory v2 ETL, an Azure Data Factory SSIS ETL or a hybrid solution, Altis can effectively deliver repeatable results.
Some of the most popular features of our Azure Data Factory v2 Templates include:
- Greater overview of your data integration process, with error reporting
- Highly Configurable
- Strong Naming Conventions
- Support for Kimball, Inmon, Data Vault and other design practices
- Integration with on-premises sources using the Azure Data Factory self-hosted integration runtime
- Checkpoints and Restart Points – Pick up where you left off
- Source control (TFS, VSTS, Git) and deployment guidelines
- Native support for JSON and XML data
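The "Checkpoints and Restart Points" item in the list above can be illustrated with a small, generic sketch. The Python below is not the Altis implementation; it only shows the general idea under simple assumptions: steps run in order, each successful step is recorded in a local JSON file, and a rerun skips whatever has already completed. The function name `run_with_checkpoints` and the checkpoint file format are hypothetical.

```python
import json
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_file):
    """Run (name, callable) steps in order, skipping completed ones.

    Returns the names of the steps executed in this run. The checkpoint
    file is a JSON list of completed step names (an assumed format).
    """
    path = Path(checkpoint_file)
    completed = json.loads(path.read_text()) if path.exists() else []
    executed = []
    for name, action in steps:
        if name in completed:
            continue  # already done in an earlier run
        # If this raises, the checkpoint still reflects the last success.
        action()
        completed.append(name)
        path.write_text(json.dumps(completed))
        executed.append(name)
    return executed
```

If a step raises an exception, the run stops, and calling the function again with the same checkpoint file resumes from the failed step rather than from the beginning.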
What does this mean for you?
Using these kits shortens turnaround times for your data needs: less time is spent on setup and configuration, and more on working with your data to deliver insights to your business.