Implement a data engineering solution with Azure Databricks (DP-3027) Coming Soon

Learn how to harness the power of Apache Spark and powerful clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.


Audience Profile

Data engineers, data scientists, and ELT developers who want to harness the power of Apache Spark and clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.


Course Modules

Perform incremental processing with Spark Structured Streaming

You explore features and tools that help you understand and work with incremental processing using Spark Structured Streaming.

  • Introduction
  • Set up real-time data sources for incremental processing
  • Optimize Delta Lake for incremental processing in Azure Databricks
  • Handle late data and out-of-order events in incremental processing
  • Monitoring and performance tuning strategies for incremental processing in Azure Databricks
  • Exercise - Real-time ingestion and processing with Delta Live Tables in Azure Databricks
  • Module assessment
  • Summary
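Watermarking is Structured Streaming's mechanism for handling late and out-of-order events while keeping state bounded: the engine tracks the maximum event time it has seen and discards events older than that maximum minus a threshold. As a rough illustration, here is a pure-Python sketch (no Spark required; the per-event watermark update and the field names are simplifications — Spark advances the watermark once per micro-batch):

```python
from datetime import timedelta

def process_with_watermark(events, threshold=timedelta(minutes=10)):
    """Mimic a Structured Streaming watermark: track the maximum event time
    seen so far and drop events that fall behind (max_event_time - threshold)."""
    watermark = None               # unset until the first event arrives
    accepted, dropped = [], []
    for event in events:           # events arrive in processing-time order
        t = event["event_time"]
        if watermark is not None and t < watermark:
            dropped.append(event)  # too late: behind the watermark
            continue
        accepted.append(event)
        candidate = t - threshold
        if watermark is None or candidate > watermark:
            watermark = candidate  # a watermark only moves forward
    return accepted, dropped
```

In Spark itself the equivalent is declaring `df.withWatermark("event_time", "10 minutes")` before a windowed aggregation.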


Implement streaming architecture patterns with Delta Live Tables

You explore different features and tools to help you develop architecture patterns with Azure Databricks Delta Live Tables.

  • Introduction
  • Event-driven architectures with Delta Live Tables
  • Ingest data with structured streaming
  • Maintain data consistency and reliability with structured streaming
  • Scale streaming workloads with Delta Live Tables
  • Exercise - End-to-end streaming pipeline with Delta Live Tables
  • Module assessment
  • Summary
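A typical Delta Live Tables pattern is a medallion flow: a bronze table ingests raw files incrementally with Auto Loader, and a silver table reads the bronze stream and enforces data-quality expectations. A sketch of such a pipeline definition (this runs only inside a DLT pipeline, where `dlt` and `spark` are provided; the paths, table names, and the `event_type` filter are illustrative assumptions):

```python
import dlt  # available only inside a Delta Live Tables pipeline
from pyspark.sql.functions import col

# Bronze: ingest raw JSON files incrementally with Auto Loader.
@dlt.table(comment="Raw events ingested with Auto Loader")
def events_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/events")  # illustrative source path
    )

# Silver: read the bronze stream and enforce a data-quality expectation.
@dlt.table(comment="Cleaned click events")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
def events_silver():
    return dlt.read_stream("events_bronze").where(col("event_type") == "click")
```

`@dlt.expect_or_drop` discards rows that violate the expectation and records the violation counts in the pipeline's event log, which is how DLT surfaces data consistency issues.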


Optimize performance with Spark and Delta Live Tables

Learn how to optimize performance with Spark and Delta Live Tables in Azure Databricks.

  • Introduction
  • Optimize performance with Spark and Delta Live Tables
  • Perform cost-based optimization and query tuning
  • Use change data capture (CDC)
  • Use enhanced autoscaling
  • Implement observability and data quality metrics
  • Exercise - optimize data pipelines for better performance in Azure Databricks
  • Module assessment
  • Summary
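Delta Live Tables applies a change data capture feed with `APPLY CHANGES INTO`, which keeps, for each key, only the change with the highest sequencing value and then applies upserts and deletes. A pure-Python sketch of that merge logic (a dict of rows stands in for a Delta table; the `op` and `seq` field names are assumptions, not the platform's schema):

```python
def apply_changes(target, changes, key="id", sequence_by="seq"):
    """Merge a CDC feed into a target table (dict keyed by row id), mimicking
    APPLY CHANGES INTO: keep the change with the highest sequence number per
    key, then apply it as an upsert or a delete."""
    latest = {}
    for change in changes:                 # pick the winning change per key
        k = change[key]
        if k not in latest or change[sequence_by] > latest[k][sequence_by]:
            latest[k] = change
    for k, change in latest.items():       # apply upserts and deletes
        if change.get("op") == "DELETE":
            target.pop(k, None)
        else:
            row = {c: v for c, v in change.items() if c not in ("op", sequence_by)}
            target[k] = row
    return target
```

Sequencing by a monotonic column is what makes the merge safe when the feed delivers changes out of order.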


Implement CI/CD workflows in Azure Databricks

Learn how to implement CI/CD workflows in Azure Databricks to automate the integration and delivery of code changes.

  • Introduction
  • Implement version control and Git integration
  • Perform unit testing and integration testing
  • Manage and configure your environment
  • Implement rollback and roll-forward strategies
  • Exercise - Implement CI/CD workflows
  • Module assessment
  • Summary
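Unit testing notebook code is much easier when the transformation logic is factored into plain functions that a CI job can run without a cluster. A hedged example: a small cleansing function (the currency format it parses is an assumption) together with the kind of test the CI stage would execute on every commit:

```python
def clean_amount(raw):
    """Parse a currency string such as '$1,234.50' into a float;
    return None for missing or malformed input."""
    try:
        return float(raw.replace("$", "").replace(",", "").strip())
    except (AttributeError, ValueError):
        return None

def test_clean_amount():
    # The kind of assertion a CI pipeline (e.g. pytest) runs before deploying.
    assert clean_amount("$1,234.50") == 1234.5
    assert clean_amount("  99 ") == 99.0
    assert clean_amount("n/a") is None
    assert clean_amount(None) is None
```

Keeping such functions free of Spark and `dbutils` dependencies is what lets the same code be unit-tested locally and imported into notebooks for integration testing.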


Automate workloads with Azure Databricks Jobs

Learn how to orchestrate and schedule data workflows with Azure Databricks Jobs. Define and monitor complex pipelines, integrate with tools like Azure Data Factory and Azure DevOps, and reduce manual intervention, leading to improved efficiency, faster insights, and adaptability to business needs.

  • Introduction
  • Implement job scheduling and automation
  • Optimize workflows with parameters
  • Handle dependency management
  • Implement error handling and retry mechanisms
  • Explore best practices and guidelines
  • Exercise - Automate data ingestion and processing
  • Module assessment
  • Summary
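Databricks Jobs let you configure per-task retries and retry intervals; the underlying pattern is retry with backoff. A pure-Python sketch of that pattern (the exponential growth and the default values are illustrative choices, not the platform's exact policy):

```python
import time

def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run a task, retrying on failure with exponential backoff: wait
    base_delay, then 2*base_delay, then 4*base_delay, ... between attempts.
    The sleep function is injectable so the policy can be tested quickly."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise                      # retries exhausted: surface failure
            sleep(base_delay * (2 ** attempt))
```

On the platform itself you would express this declaratively (a task's max retries and retry interval in the job definition) rather than in code, but the failure-handling semantics are the same.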


Manage data privacy and governance with Azure Databricks

In this module, you explore different features and approaches to help you secure and manage your data within Azure Databricks using tools such as Unity Catalog.

  • Introduction
  • Implement data encryption techniques in Azure Databricks
  • Manage access controls in Azure Databricks
  • Implement data masking and anonymization in Azure Databricks
  • Use compliance frameworks and secure data sharing in Azure Databricks
  • Use data lineage and metadata management
  • Implement governance automation in Azure Databricks
  • Exercise - Practice the implementation of Unity Catalog
  • Module assessment
  • Summary
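In Azure Databricks, column masks and row filters are typically attached to tables as Unity Catalog SQL UDFs; the underlying techniques are simple. A pure-Python sketch of two of them (function names and the email format are illustrative): partial masking for display, and keyed pseudonymization, which stays deterministic so joins still work but is not reversible without the key.

```python
import hashlib
import hmac

def pseudonymize(value, secret):
    """Replace a direct identifier with a keyed HMAC-SHA256 digest:
    the same input and key always give the same token, so the column
    remains joinable, but the original value cannot be recovered
    without the secret key."""
    return hmac.new(secret.encode(), value.encode(), hashlib.sha256).hexdigest()

def mask_email(email):
    """Partially mask an email for display: keep the first character
    of the local part and the full domain."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain
```

The same logic, written as a SQL UDF and bound to a column with a masking policy, is how Unity Catalog enforces it for every query.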


Use SQL Warehouses in Azure Databricks

Azure Databricks provides SQL Warehouses that enable data analysts to work with data using familiar relational SQL queries.

  • Introduction
  • Get started with SQL Warehouses
  • Create databases and tables
  • Create queries and dashboards
  • Exercise - Use a SQL Warehouse in Azure Databricks
  • Module assessment
  • Summary
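A SQL Warehouse is also a programmatic endpoint, not only the SQL editor UI. A hedged sketch using the `databricks-sql-connector` package (the hostname, HTTP path, access token, and table name are placeholders you would replace with your workspace's values):

```python
from databricks import sql  # pip install databricks-sql-connector

# All connection values below are placeholders for your own workspace.
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT order_id, total FROM sales.orders LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```

The warehouse's hostname and HTTP path are shown on its Connection details tab in the Databricks workspace.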


Run Azure Databricks Notebooks with Azure Data Factory

Using pipelines in Azure Data Factory to run notebooks in Azure Databricks enables you to automate data engineering processes at cloud scale.

  • Introduction
  • Understand Azure Databricks notebooks and pipelines
  • Create a linked service for Azure Databricks
  • Use a Notebook activity in a pipeline
  • Use parameters in a notebook
  • Exercise - Run an Azure Databricks Notebook with Azure Data Factory
  • Module assessment
  • Summary
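Inside the notebook, parameters passed from the Notebook activity's `baseParameters` surface as widgets, and a value can be handed back to the pipeline with `dbutils.notebook.exit`. A sketch of the notebook side (this runs only in a Databricks notebook, where `dbutils` and `spark` are predefined; the widget name, paths, and table name are illustrative):

```python
# Read a parameter passed from the ADF Notebook activity's baseParameters;
# the second argument is the default used when no value is supplied.
dbutils.widgets.text("folder", "data")
folder = dbutils.widgets.get("folder")

# Use the parameter in the processing logic (illustrative paths/table).
df = spark.read.parquet(f"/mnt/landing/{folder}")
df.write.mode("append").saveAsTable("staging.events")

# Return a value to the pipeline; ADF reads it from the activity's
# output as activity('Notebook1').output.runOutput.
dbutils.notebook.exit(str(df.count()))
```

This round trip, parameters in via `baseParameters` and a result out via `runOutput`, is what lets one parameterized notebook serve many pipeline runs.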