Azure Data Factory Templates with Data Lake
Altis have been working on a number of projects to help our customers move their data and analytics workloads to the Azure Cloud. In this blog, James Mitchell from our Canberra office shares details and insights from one of these successful projects, in which we established a new data warehouse environment in Microsoft Azure using our Azure Data Factory v2 Templates.
Our client needed to provide KPI reporting to internal and external stakeholders. At the same time, there was a drive to establish a new cloud-based analytics platform as a potential replacement for the legacy on-premises data warehouse. Our client chose this project to establish the new platform and trial the new technologies.
This is how we approached the solution:
Azure Data Factory v2 was used via the Self-Hosted Integration Runtime to extract data from Excel and Oracle data sources and populate an Azure Data Lake. The Data Lake was then used as the source for a Data Warehouse built on Azure SQL Database, where the data was further transformed and enriched.
Power BI reports were implemented on top of the Azure SQL Database. A future requirement is to use Power BI's Azure Data Lake connector to leverage insights directly from the Azure Data Lake.
A challenge we experienced in this project was that Azure Data Factory has no native support for Excel files, so a method needed to be developed to automatically convert these files to CSV. We opted to use PowerShell for this task. As PowerShell has a set of cmdlets for Azure Data Factory, the script evolved into the trigger mechanism for the Azure Data Factory pipelines: it would intermittently poll for new source files, then start the required Data Factory pipelines.
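The polling-and-trigger pattern above can be sketched in outline. The project's actual script was PowerShell (where the Az.DataFactory cmdlets, such as `Invoke-AzDataFactoryV2Pipeline`, can start pipeline runs); the sketch below uses Python purely to illustrate the control flow, and the `trigger_pipeline` callback is a hypothetical stand-in for the real pipeline-start call.

```python
import time
from pathlib import Path

def poll_for_new_files(source_dir, trigger_pipeline, seen):
    """One polling pass: detect files not seen on earlier passes and
    fire the pipeline trigger once for each new file.

    trigger_pipeline is a callback standing in for the real call that
    starts an ADF pipeline run; 'seen' tracks already-processed names.
    """
    for path in sorted(Path(source_dir).glob("*.csv")):
        if path.name not in seen:
            trigger_pipeline(path)  # in the real script: start the ADF pipeline
            seen.add(path.name)
    return seen

def watch(source_dir, trigger_pipeline, interval_seconds=60):
    """Poll indefinitely, the way the trigger script did."""
    seen = set()
    while True:
        poll_for_new_files(source_dir, trigger_pipeline, seen)
        time.sleep(interval_seconds)
```

Keeping the "seen" bookkeeping separate from the trigger call means each new file starts exactly one pipeline run, even though the directory is re-scanned on every pass.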
Follow this link for more information on our Azure Data Factory v2 Templates.