Implement a data engineering solution with Azure Databricks
(DP-3027)
Coming Soon
Learn how to harness Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.
Audience Profile
Data engineers, data scientists, and ELT developers who want to harness Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.
Course Modules
Perform incremental processing with Spark Structured Streaming
In this module, you explore features and tools that help you understand and work with incremental processing using Spark Structured Streaming.
- Introduction
- Set up real-time data sources for incremental processing
- Optimize Delta Lake for incremental processing in Azure Databricks
- Handle late data and out-of-order events in incremental processing
- Monitoring and performance tuning strategies for incremental processing in Azure Databricks
- Exercise - Real-time ingestion and processing with Delta Live Tables in Azure Databricks
- Module assessment
- Summary
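The late-data handling and incremental-processing ideas in this module can be sketched as follows. This is a minimal, hypothetical example, assuming a Databricks cluster where `spark` is the active SparkSession; the table names, timestamp column, and checkpoint path are illustrative, not part of the course materials.

```python
# Hypothetical sketch: incremental processing with Spark Structured Streaming.
# Assumes a Databricks cluster where `spark` is the active SparkSession.
from pyspark.sql.functions import window, col

events = (
    spark.readStream
    .format("delta")           # read newly arriving rows from a Delta table
    .table("raw_events")       # assumed source table
)

# Handle late and out-of-order events: keep state for data up to 10 minutes late,
# then drop it so state does not grow without bound.
counts = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"), col("device_id"))
    .count()
)

(
    counts.writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", "/tmp/checkpoints/events_agg")  # enables restart/recovery
    .toTable("events_agg")     # assumed sink table
)
```

The checkpoint location is what makes the processing incremental across restarts: Spark records which source data it has already consumed, so only new rows are processed on each trigger.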
Implement streaming architecture patterns with Delta Live Tables
You explore different features and tools to help you develop architecture patterns with Azure Databricks Delta Live Tables.
- Introduction
- Event-driven architectures with Delta Live Tables
- Ingest data with structured streaming
- Maintain data consistency and reliability with structured streaming
- Scale streaming workloads with Delta Live Tables
- Exercise - End-to-end streaming pipeline with Delta Live Tables
- Module assessment
- Summary
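A minimal sketch of the ingestion and data-consistency ideas above. It is hypothetical and only runs inside a Databricks Delta Live Tables pipeline, which provides the `dlt` module and `spark`; the table names and landing path are assumptions for illustration.

```python
# Hypothetical Delta Live Tables pipeline definition (runs only inside a
# Databricks DLT pipeline, where `dlt` and `spark` are provided).
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw clickstream ingested incrementally with Auto Loader")
def clicks_bronze():
    return (
        spark.readStream.format("cloudFiles")   # Auto Loader for incremental file ingestion
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/clicks")            # assumed landing path
    )

@dlt.table(comment="Cleaned clicks")
@dlt.expect_or_drop("valid_user", "user_id IS NOT NULL")  # data-quality expectation
def clicks_silver():
    return dlt.read_stream("clicks_bronze").where(col("event_type") == "click")
```

The expectation decorator is how DLT maintains consistency and reliability declaratively: rows that fail the predicate are dropped, and the violation counts surface in the pipeline's event log.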
Optimize performance with Spark and Delta Live Tables
Learn how to optimize performance with Spark and Delta Live Tables in Azure Databricks.
- Introduction
- Optimize performance with Spark and Delta Live Tables
- Perform cost-based optimization and query tuning
- Use change data capture (CDC)
- Use enhanced autoscaling
- Implement observability and data quality metrics
- Exercise - Optimize data pipelines for better performance in Azure Databricks
- Module assessment
- Summary
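The file-compaction and cost-based-optimization topics above can be illustrated with a short Databricks SQL sketch; the `sales.orders` table and `customer_id` column are hypothetical names.

```sql
-- Hypothetical tuning sketch, assuming a Delta table named sales.orders.
-- Compact small files and co-locate rows frequently filtered by customer_id:
OPTIMIZE sales.orders ZORDER BY (customer_id);

-- Let Databricks optimize writes and compact files automatically:
ALTER TABLE sales.orders
  SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true,
                     delta.autoOptimize.autoCompact   = true);

-- Inspect the statistics the cost-based optimizer would use for this query:
EXPLAIN COST
SELECT customer_id, SUM(amount) FROM sales.orders GROUP BY customer_id;
```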
Implement CI/CD workflows in Azure Databricks
Learn how to implement CI/CD workflows in Azure Databricks to automate the integration and delivery of code changes.
- Introduction
- Implement version control and Git integration
- Perform unit testing and integration testing
- Manage and configure your environment
- Implement rollback and roll-forward strategies
- Exercise - Implement CI/CD workflows
- Module assessment
- Summary
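One practical way to make the unit-testing step above concrete is to keep transformation logic in plain functions rather than notebook cells, so CI can test it without a cluster. The function and its business rule below are hypothetical, purely for illustration.

```python
# Hypothetical example of unit-testable pipeline logic: keeping transformations
# as plain Python functions lets a CI job test them before deploying to Databricks.

def normalize_country(code: str) -> str:
    """Map free-form country codes to ISO alpha-2 (hypothetical business rule)."""
    aliases = {"UK": "GB", "U.S.": "US", "USA": "US"}
    cleaned = code.strip().upper()
    return aliases.get(cleaned, cleaned)

# A CI job (e.g., triggered by a Git push) would run tests like this with pytest:
def test_normalize_country():
    assert normalize_country(" usa ") == "US"
    assert normalize_country("UK") == "GB"
    assert normalize_country("DE") == "DE"
```

When such tests pass in CI, the same code can be promoted to the workspace, which is the integration/delivery half of the workflow.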
Automate workloads with Azure Databricks Jobs
Learn how to orchestrate and schedule data workflows with Azure Databricks Jobs. Define and monitor complex pipelines, integrate with tools like Azure Data Factory and Azure DevOps, and reduce manual intervention, leading to improved efficiency, faster insights, and adaptability to business needs.
- Introduction
- Implement job scheduling and automation
- Optimize workflows with parameters
- Handle dependency management
- Implement error handling and retry mechanisms
- Explore best practices and guidelines
- Exercise - Automate data ingestion and processing
- Module assessment
- Summary
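The retry mechanisms covered above follow a common pattern, sketched below in plain Python. The helper is hypothetical; in a real job you would typically rely on the retry settings of Databricks Jobs rather than hand-rolling this, but the logic is the same.

```python
import time

# Hypothetical retry helper illustrating the retry-with-backoff pattern
# that job schedulers apply to transient task failures.
def run_with_retries(task, max_retries=3, backoff_seconds=1.0):
    """Run `task`, retrying on failure with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise                                        # retries exhausted: surface the error
            time.sleep(backoff_seconds * (2 ** attempt))     # 1s, 2s, 4s, ...

# Example: a flaky task that succeeds on its third invocation.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky, max_retries=3, backoff_seconds=0.01)  # -> "ok"
```

Exponential backoff gives a transient fault (a throttled API, a briefly unavailable cluster) time to clear instead of hammering it with immediate retries.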
Manage data privacy and governance with Azure Databricks
In this module, you explore features and approaches that help you secure and manage your data within Azure Databricks using tools such as Unity Catalog.
- Introduction
- Implement data encryption techniques in Azure Databricks
- Manage access controls in Azure Databricks
- Implement data masking and anonymization in Azure Databricks
- Use compliance frameworks and secure data sharing in Azure Databricks
- Use data lineage and metadata management
- Implement governance automation in Azure Databricks
- Exercise - Practice the implementation of Unity Catalog
- Module assessment
- Summary
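The masking and anonymization unit above can be illustrated with a small pseudonymization sketch, assuming salted hashing as the masking technique; the function, salt, and email address are hypothetical.

```python
import hashlib

# Hypothetical pseudonymization helper: replaces a direct identifier with a
# salted SHA-256 token, a common masking technique before sharing data.
def pseudonymize(value: str, salt: str) -> str:
    """Deterministically mask an identifier; same input + salt -> same token."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]

token = pseudonymize("alice@example.com", salt="s3cret")
# Deterministic: repeated calls agree, so joins on the token still work.
assert token == pseudonymize("alice@example.com", salt="s3cret")
# A different salt yields an unlinkable token.
assert token != pseudonymize("alice@example.com", salt="other")
```

Keeping the salt secret (for example, in a Databricks secret scope) is what prevents recipients of the shared data from reversing the mapping by hashing candidate identifiers themselves.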
Use SQL Warehouses in Azure Databricks
Azure Databricks provides SQL Warehouses that enable data analysts to work with data using familiar relational SQL queries.
- Introduction
- Get started with SQL Warehouses
- Create databases and tables
- Create queries and dashboards
- Exercise - Use a SQL Warehouse in Azure Databricks
- Module assessment
- Summary
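The database, table, and query workflow above might look like the following in the SQL editor of a warehouse; the `retail.orders` schema is a hypothetical example, not from the course.

```sql
-- Hypothetical queries a data analyst might run against a SQL Warehouse.
CREATE DATABASE IF NOT EXISTS retail;

CREATE TABLE IF NOT EXISTS retail.orders (
  order_id   BIGINT,
  order_date DATE,
  amount     DECIMAL(10, 2)
);

-- A query suitable for a dashboard visualization: revenue per day.
SELECT order_date, SUM(amount) AS revenue
FROM retail.orders
GROUP BY order_date
ORDER BY order_date;
```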
Run Azure Databricks Notebooks with Azure Data Factory
Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.
- Introduction
- Understand Azure Databricks notebooks and pipelines
- Create a linked service for Azure Databricks
- Use a Notebook activity in a pipeline
- Use parameters in a notebook
- Exercise - Run an Azure Databricks Notebook with Azure Data Factory
- Module assessment
- Summary
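The parameter-passing unit above can be sketched as a notebook cell; it runs only inside a Databricks notebook, where `dbutils` and `spark` are provided, and the widget name, mount path, and table name are hypothetical.

```python
# Hypothetical notebook cell: receiving a parameter from an Azure Data Factory
# Notebook activity (runs only in Databricks, where `dbutils` is provided).
dbutils.widgets.text("folder", "raw")       # declare the parameter with a default
folder = dbutils.widgets.get("folder")      # value supplied by the ADF pipeline

df = spark.read.parquet(f"/mnt/data/{folder}")            # assumed mount path
df.write.mode("overwrite").saveAsTable(f"staging_{folder}")

# Optionally return a value to the calling pipeline:
dbutils.notebook.exit(f"processed {df.count()} rows")
```

In Azure Data Factory, the value for `folder` is set under the Notebook activity's base parameters, which is how one notebook can be reused across many pipeline runs.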