

Azure Data Factory for Beginners - Build Data Ingestion

via Udemy

Overview

Learn Azure Data Factory by building an industry-standard, Metadata-Driven Ingestion Framework

What you'll learn:
  • Azure Data Factory
  • Azure Blob Storage
  • Azure Gen 2 Data Lake Storage
  • Azure Data Factory Pipelines
  • Data Engineering Concepts
  • Data Lake Concepts
  • Metadata Driven Frameworks Concepts
  • Industry Example on How to build Ingestion Frameworks
  • Dynamic Azure Data Factory Pipelines
  • Email Notifications with Logic Apps
  • Tracking of Pipelines and Batch Runs
  • Version Management with Azure DevOps
  • An in-depth introduction to Infrastructure as Code with the Azure DevOps platform
  • A definition of DevOps and how Azure, as a SaaS (Software as a Service) platform, facilitates the practice of the DevOps methodology
  • An Introduction to YAML pipelines on the Azure DevOps platform
  • An Introduction to BICEP and ARM templates for developing Infrastructure as Code (IaC) on the Azure DevOps Platform
  • An overview of Industry leading DevOps tools
  • The creation of a local Git Repository
  • Learn how to stage and commit single and multiple files
  • Branching management with Git including Merging
  • Git with Bash and Visual Studio Code
  • Learn how to time travel and undo changes
  • Set up Billing for Microsoft-Hosted and Self-Hosted pipeline agents
  • Installation and Set Up of a Self-Hosted pipeline agent
  • Setting up of a Personal Access Token
  • Configuration of a Self-Hosted Agent
  • How to Create an Azure Service Connection
  • Cloning an Azure DevOps Repository
  • Writing PowerShell Script to Provision a Resource Group
  • How to Add Stages, Jobs and Steps in a YAML pipeline template
  • Running the YAML pipeline on Azure DevOps
  • How to develop Azure Variable Groups and pass them into YAML templates
  • How to override BICEP parameters using YAML
  • Creating Project Structures for a DevOps and BICEP project using Bash and Git
  • Establish a standard naming convention for resources using BICEP and PowerShell
  • Development of a BICEP template to provision Log Analytics and Data Factory
  • How to add Input Parameters to a BICEP template
  • How to create BICEP Modules for Log Analytics and Data Factory
  • How to add Tagging Information to BICEP modules
  • How to structure a naming convention with BICEP
  • How to use run time and compile time variables and parameters
  • How to write a PowerShell Script to Transpile BICEP to an ARM template
  • How to Manage Dependencies between Resources with BICEP
  • How to manage BICEP template errors

The main objective of this course is to help you learn the Data Engineering techniques for building Metadata-Driven frameworks with Azure Data Engineering tools such as Data Factory, Azure SQL, and others.


Building frameworks is now an industry norm, and it has become an important skill to know how to visualize, design, plan, and implement data frameworks.


The framework that we are going to build together is referred to as the Metadata-Driven Ingestion Framework.


Data ingestion from disparate source systems into the data lake is a key requirement for a company that aspires to be data-driven, and finding a common way to ingest the data is a necessary requirement.


Metadata-Driven Frameworks allow a company to develop the system just once; it can then be adopted and reused by various business clusters without additional development, saving the business time and costs. Think of it as a plug-and-play system.


The first objective of the course is to onboard you onto the Azure Data Factory platform and help you assemble your first Azure Data Factory Pipeline. Once you get a good grip on the Azure Data Factory development pattern, it becomes easier to adopt the same pattern to onboard other sources and data sinks.


Once you are comfortable with building a basic Azure Data Factory pipeline, the second objective is to build a fully-fledged, working metadata-driven framework that makes the ingestion more dynamic. Furthermore, we will build the framework so that you can audit every batch orchestration and individual pipeline run for business intelligence and operational monitoring.
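
To make the idea concrete, a metadata-driven framework typically keeps source-to-sink definitions in control tables that a single dynamic pipeline reads at run time. A minimal T-SQL sketch of such a control table follows; the table and column names are hypothetical, not the course's exact schema.

```sql
-- Hypothetical control table: one row per source-to-sink ingestion feed.
CREATE TABLE dbo.IngestionMetadata (
    SourceId        INT IDENTITY(1,1) PRIMARY KEY,
    SourceSystem    NVARCHAR(100) NOT NULL,  -- e.g. 'SalesDB'
    SourceContainer NVARCHAR(200) NOT NULL,  -- blob container holding the file
    SourceFile      NVARCHAR(200) NOT NULL,  -- file name or pattern to ingest
    SinkContainer   NVARCHAR(200) NOT NULL,  -- data lake zone, e.g. 'raw'
    SinkFolder      NVARCHAR(200) NOT NULL,  -- target folder in the lake
    IsEnabled       BIT NOT NULL DEFAULT 1   -- switch a feed off without redeploying
);

-- A single dynamic pipeline loops over the enabled rows, so onboarding a
-- new source is an INSERT into this table rather than new development.
SELECT SourceSystem, SourceContainer, SourceFile, SinkContainer, SinkFolder
FROM dbo.IngestionMetadata
WHERE IsEnabled = 1;
```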


Creating your first Pipeline


What will be covered is as follows:

1. Introduction to Azure Data Factory

2. Unpack the requirements and technical architecture

3. Create an Azure Data Factory Resource

4. Create an Azure Blob Storage account

5. Create an Azure Data Lake Gen 2 Storage account

6. Learn how to use the Storage Explorer

7. Create Your First Azure Pipeline (a minimal sketch follows this list).
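
To give a feel for the end product of this section, here is a minimal, hypothetical Azure Data Factory pipeline definition (JSON) with a single Copy activity moving a delimited file from Blob Storage to Data Lake Gen 2; the pipeline and dataset names are placeholders, not the course's exact artifacts.

```json
{
  "name": "PL_CopyBlobToLake",
  "properties": {
    "activities": [
      {
        "name": "CopySourceFileToRaw",
        "type": "Copy",
        "inputs":  [ { "referenceName": "DS_BlobSourceFile", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "DS_LakeRawFolder",  "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink":   { "type": "DelimitedTextSink" }
        }
      }
    ]
  }
}
```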


Metadata Driven Ingestion


1. Unpack the theory on Metadata Driven Ingestion

2. Describing the High-Level Plan for building the User

3. Creation of a dedicated Active Directory User and assigning appropriate permissions

4. Using Azure Data Studio

5. Creation of the Metadata Driven Database (Tables and T-SQL Stored Procedure)

6. Applying business naming conventions

7. Creating an email notifications strategy

8. Creation of Reusable utility pipelines

9. Develop a mechanism to log data for every data ingestion pipeline run and for the batch itself (a T-SQL sketch follows this list)

10. Creation of a dynamic data ingestion pipeline

11. Apply the orchestration pipeline

12. Explanation of T-SQL Stored Procedures for the Ingestion Engine

13. Creating an Azure DevOps Repository for the Data Factory Pipelines


Event-Driven Ingestion

1. Enabling the Event Grid Provider

2. Use the Get Metadata Activity

3. Use the Filter Activity

4. Create Event-Based Triggers (a trigger sketch follows this list)

5. Create and Merge new DevOps Branches
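
For orientation, an event-based trigger in Data Factory fires on Storage events delivered through Event Grid. A minimal, hypothetical trigger definition might look like the following; the names, paths, and scope are placeholders.

```json
{
  "name": "TR_OnBlobCreated",
  "properties": {
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/landing/blobs/",
      "blobPathEndsWith": ".csv",
      "events": [ "Microsoft.Storage.BlobCreated" ],
      "scope": "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>"
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "PL_IngestLandedFile", "type": "PipelineReference" } }
    ]
  }
}
```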


Bonus Course: Provision Infra with Azure BICEP


The goal of this course is to help students learn how to write and develop Azure DevOps Infrastructure as Code professionally with BICEP, YAML, Git, and PowerShell.

Azure DevOps is a leading automation and DevOps platform, and students will be taken through the following:

  • An in-depth introduction to Infrastructure as Code with the Azure DevOps platform

  • A definition of DevOps and how Azure, as a SaaS (Software as a Service) platform, facilitates the practice of the DevOps methodology

  • An Introduction to YAML pipelines on the Azure DevOps platform

  • An Introduction to BICEP and ARM templates for developing Infrastructure as Code (IaC) on the Azure DevOps Platform

  • An overview of Industry leading DevOps tools

Git is an industry-leading distributed version control system and a critical component of Azure DevOps, so students will be taken through a Git Crash Course that covers the following basic aspects (a command sketch follows the list):

  • The creation of a local Git Repository

  • Learn how to stage and commit single and multiple files

  • Branching management with Git including Merging

  • Git with Bash and Visual Studio Code

  • Learn how to time travel and undo changes
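
The crash-course topics above map onto a handful of everyday commands. A hypothetical walk-through (file names and the commit hash are placeholders; `git init -b` requires Git 2.28+ and `git switch` requires 2.23+):

```bash
git init -b main my-project && cd my-project   # create a local repository
echo "hello" > a.txt
echo "world" > b.txt
git add a.txt                                  # stage a single file
git commit -m "Add a.txt"
git add .                                      # stage multiple files at once
git commit -m "Add remaining files"

git switch -c feature/demo                     # create and switch to a branch
echo "change" >> a.txt
git commit -am "Update a.txt on feature branch"
git switch main && git merge feature/demo      # merge the branch back

git log --oneline                              # find a commit to "time travel" to
git checkout <commit-hash> -- a.txt            # restore a file from an old commit
git revert HEAD                                # undo the last commit safely
```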

Students may find it necessary to set up Azure DevOps Pipeline Agents as Self-Hosted agents for running CI/CD pipelines, whether to save costs in a work environment or to run a cost-effective personal environment. Students will therefore learn the following (a setup sketch follows the list):

  • Set up Billing for Microsoft-Hosted and Self-Hosted pipeline agents

  • Installation and Set Up of a Self-Hosted pipeline agent

  • Setting up of a Personal Access Token

  • Configuration of a Self-Hosted Agent
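
For reference, configuring a self-hosted agent on Linux follows this general shape; the organization URL, PAT, and pool are placeholders, and the agent package itself is downloaded from the Agent pools page in Azure DevOps.

```bash
# Unpack the downloaded agent package into its own directory.
mkdir myagent && cd myagent
tar zxvf ~/Downloads/vsts-agent-linux-x64-*.tar.gz

# Register the agent against your organization using a Personal Access Token.
./config.sh --url https://dev.azure.com/<your-organization> \
            --auth pat --token <your-pat> \
            --pool Default --agent my-self-hosted-agent

# Run interactively, or install as a service with ./svc.sh install.
./run.sh
```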

YAML is a leading configuration management technology for developing CI/CD pipelines, and perhaps the best way to learn how to write YAML pipelines is to be taken through provisioning infrastructure with YAML, PowerShell, and BICEP. The initial focus will be provisioning the resource group, and therefore students will learn the following (a pipeline sketch follows the list):

  • How to Create an Azure Service Connection

  • Cloning an Azure DevOps Repository

  • Writing PowerShell Script to Provision a Resource Group

  • How to Add Stages, Jobs, and Steps in a YAML pipeline template

  • Running the YAML pipeline on Azure DevOps

  • How to develop Azure Variable Groups and pass them into YAML templates

  • How to override BICEP parameters using YAML
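
To make the structure concrete, here is a minimal, hypothetical Azure DevOps YAML pipeline with a stage, a job, and steps that create a resource group with PowerShell and then deploy a BICEP file with a parameter override. The service connection, variable group, and file path are placeholders, and $(resourceGroupName), $(location), and $(environmentName) are assumed to come from the variable group.

```yaml
trigger:
  - main

variables:
  - group: iac-demo-variables            # hypothetical variable group from the Library

stages:
  - stage: Provision
    jobs:
      - job: DeployInfrastructure
        pool:
          vmImage: ubuntu-latest         # or the name of a self-hosted pool
        steps:
          - task: AzurePowerShell@5
            displayName: Create resource group
            inputs:
              azureSubscription: my-service-connection   # placeholder
              azurePowerShellVersion: LatestVersion
              ScriptType: InlineScript
              Inline: |
                New-AzResourceGroup -Name $(resourceGroupName) -Location $(location) -Force

          - task: AzureCLI@2
            displayName: Deploy BICEP with a parameter override
            inputs:
              azureSubscription: my-service-connection   # placeholder
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az deployment group create \
                  --resource-group $(resourceGroupName) \
                  --template-file infra/main.bicep \
                  --parameters environment=$(environmentName)
```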

One aspect of professionalism in coding is how projects are structured for coding efficiency and ease of management; the other is the naming convention for resources. The course will take students through the following (a naming sketch follows the list):

  • Creating Project Structures for a DevOps and BICEP project using Bash and Git

  • Establish a standard naming convention for resources using BICEP and PowerShell
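
As an illustration of a resource naming convention expressed in BICEP (the `<prefix>-<workload>-<environment>-<region>` pattern here is a common one, not necessarily the course's exact convention):

```bicep
// Hypothetical convention: <resource-prefix>-<workload>-<environment>-<region>
param workload string = 'ingest'
param environment string = 'dev'
param regionCode string = 'weu'

// Derived names; each resource type gets its conventional prefix.
var dataFactoryName  = 'adf-${workload}-${environment}-${regionCode}'
var logAnalyticsName = 'log-${workload}-${environment}-${regionCode}'

output dataFactoryNameOut string = dataFactoryName    // adf-ingest-dev-weu
output logAnalyticsNameOut string = logAnalyticsName  // log-ingest-dev-weu
```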

The heart of provisioning and deploying infrastructure in Azure is the adoption of BICEP, and students will learn the following about developing BICEP in a professional manner (a template sketch follows the list):

  • Development of a BICEP template to provision Log Analytics and Data Factory

  • How to add Input Parameters to a BICEP template

  • How to create BICEP Modules for Log Analytics and Data Factory

  • How to add Tagging Information to BICEP modules

  • How to structure a naming convention with BICEP

  • How to use run time and compile time variables and parameters

  • How to write a PowerShell Script to Transpile BICEP to an ARM template

  • How to Manage Dependencies between Resources with BICEP

  • How to manage BICEP template errors
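
Tying several of these points together, here is a minimal, hypothetical main template that passes parameters and tags into a Log Analytics module and provisions a Data Factory; the module path is a placeholder, and the module is assumed to declare a `workspaceName` output. Referencing that output in the Data Factory's tags creates an implicit dependency between the two resources. Running `az bicep build --file main.bicep` then transpiles the template into its ARM (JSON) equivalent.

```bicep
// main.bicep — hypothetical sketch, not the course's exact template.
param environment string = 'dev'
param location string = resourceGroup().location
param tags object = {
  owner: 'data-platform'
  environment: environment
}

// Module whose definition is assumed to live in modules/logAnalytics.bicep
// and to declare a 'workspaceName' output.
module logAnalytics 'modules/logAnalytics.bicep' = {
  name: 'logAnalyticsDeployment'
  params: {
    location: location
    tags: tags
  }
}

// Using the module output in the tags makes Data Factory
// implicitly depend on the Log Analytics deployment.
resource dataFactory 'Microsoft.DataFactory/factories@2018-06-01' = {
  name: 'adf-demo-${environment}'
  location: location
  tags: union(tags, { logAnalyticsWorkspace: logAnalytics.outputs.workspaceName })
  identity: {
    type: 'SystemAssigned'
  }
}
```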


Taught by

David Charles Academy

Reviews

4.4 rating at Udemy based on 1342 ratings

