Training: Data Engineering with Databricks

Data engineering is a vital component of any data-driven organization. As the volume and complexity of data grow, a powerful and efficient platform to manage and process that data has become essential. This is where Databricks comes in. Databricks is a cloud-based platform that gives data engineers a unified environment for big data processing and machine learning. It combines the power of Apache Spark with an easy-to-use interface, making it an indispensable tool in any data engineer's toolbox.
As a data engineer, you want to know how to process large datasets quickly and easily, build and train machine learning models, and perform advanced analytics.
This training is aimed at any data engineer who wants to build a robust and scalable data infrastructure.

Training description

This 3-day hands-on Databricks training equips you with the latest skills to design, build, and optimize data pipelines on Azure Databricks using the Lakehouse architecture.
You’ll learn not only the core Databricks concepts but also how to apply modern engineering practices such as Delta Lake, Unity Catalog, Lakeflow ingestion, Medallion architecture, and CI/CD deployment. By the end, you’ll be able to set up robust data pipelines that are secure, automated, and ready for production.

Duration & Agenda

The three days cover end-to-end data engineering with Databricks.

Day 1 - Core Foundations:

Get a strong foundation in the Databricks platform:

Databricks Overview & Evolution from Data Warehouse → Data Lakehouse → Delta Lakehouse
Platform essentials: workspaces, notebooks, clusters, repos & catalog
Data access through Unity Catalog
Working with Spark DataFrames: Transformations vs. Actions, Lazy Execution
Hands-on labs: create clusters, load & transform datasets

Day 2 - Lakehouse Build:

Learn how to structure, govern, and secure a modern Lakehouse:

Delta Lake essentials: ACID transactions, Delta table structure & time travel
Unity Catalog deep dive: centralized governance, object model, managed vs. external tables, and secure data sharing across the platform
Databricks in a Modern Data Platform: how Azure Databricks fits into a larger architecture and interacts with other components
Medallion Architecture: applying Bronze / Silver / Gold design patterns for ingestion, cleansing, and aggregation
Data security in Databricks: security levels and controls, including row-level security (RLS), object-level security (OLS), column-level security (CLS), and attribute-based access control (ABAC)
Hands-on labs: working with Delta tables, building governed schemas in Unity Catalog, and implementing the Medallion design pattern

Day 3 - Advanced Engineering & Orchestration:

Build end-to-end, production-ready pipelines and optimize performance:

Modern ingestion with Lakeflow Connect
Streaming pipelines with Auto Loader & Change Data Feed
Workflows & orchestration with DLT (Delta Live Tables) & Lakeflow pipelines
Monitoring, debugging, and Spark UI
Delta & Spark optimizations (Z-ORDER, predictive optimization, Liquid Clustering)
CI/CD & deployment with repos & Databricks Asset Bundles (DAB) integration
User & identity management (Entra ID, workspace access)
Hands-on labs: orchestrate pipelines, optimize datasets, deploy code to production

After completing this training, you will understand both basic and advanced optimization techniques and be able to design, build, and run data engineering solutions with Databricks on Azure.

Target audience

You are an (aspiring) BI professional with knowledge of data modeling & data lakehouse development. You know SQL or Python, and you have a basic understanding of dimensional data concepts.
- Note: If you are new to BI & data warehousing, we highly recommend following the Dimensional Data Modeling Training before this Databricks training.
You want to find your way around the Azure Cloud and get practical tips (rather than reading online documentation).
- Note: We recommend that all participants follow the Azure Fundamentals training before this Databricks training, as it gives a broad overview of the Azure Cloud, its key resources, and cloud data analytics concepts.

Format

The training combines plenary lectures with a hands-on lab environment. The course can be taught in English or Dutch, and can also be delivered on-site at the customer's premises.

Cost

2.000 € per participant for 3 days

More information or registration

For more information, contact academy@element61.be
The training schedule can be found in the Academy Calendar (PDF)
For a complete overview of all trainings, visit our Academy page