Databricks Continuous Integration and Delivery Using Azure DevOps

So you’ve created notebooks in your Databricks workspace, collaborated with your peers, and now you’re ready to operationalize your work. This is simple if you only need to copy notebooks to another folder within the same workspace. But what if you need to keep your DEV and PROD environments separate?
Things get more complicated when it comes to automating code deployment; hence CI/CD.

Implementing CI/CD in Azure Databricks

The following steps make up a complete CI/CD workflow:
1. Develop and commit your code in the develop branch.
2. Merge the develop branch into the master branch.
3. Deploy notebooks to the different environments (Dev → UAT → Prod) using CI/CD pipelines in Azure DevOps.
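The Dev → UAT → Prod promotion can be sketched as a small deployment helper against the Databricks Workspace API (`POST /api/2.0/workspace/import`). This is a minimal sketch: the workspace URLs, notebook path, and token handling below are illustrative placeholders, not part of any specific project setup.

```python
import base64
import json
import urllib.request

# Placeholder workspace URLs for each environment (assumed values for illustration).
ENVIRONMENTS = {
    "dev": "https://adb-dev-example.azuredatabricks.net",
    "uat": "https://adb-uat-example.azuredatabricks.net",
    "prod": "https://adb-prod-example.azuredatabricks.net",
}

def build_import_payload(notebook_path: str, source: str) -> dict:
    """Build the body for POST /api/2.0/workspace/import; source is base64-encoded."""
    return {
        "path": notebook_path,
        "format": "SOURCE",
        "language": "PYTHON",
        "overwrite": True,  # replace the notebook if it already exists
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
    }

def deploy(env: str, notebook_path: str, source: str, token: str) -> None:
    """Import (create or overwrite) a notebook in the target environment's workspace."""
    req = urllib.request.Request(
        f"{ENVIRONMENTS[env]}/api/2.0/workspace/import",
        data=json.dumps(build_import_payload(notebook_path, source)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # Databricks personal access token
            "Content-Type": "application/json",
        },
    )
    urllib.request.urlopen(req)  # raises urllib.error.HTTPError on failure
```

A call such as `deploy("uat", "/Shared/etl/load", source_code, token)` would create or overwrite that notebook in the UAT workspace; the release pipeline would run the same call per environment after each approval gate.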


Databricks Notebook configuration

Step 1: Dev Env – Code Development + Git Link
Create a notebook:
Develop code and unit tests in a Databricks notebook.
To link it to Git:
1. From Revision history, click Git: Not linked
2. Status: select Link
3. Link: enter the repository URL of your project
4. Branch: create a new branch from your project’s master branch
5. Save: to link the notebook to Git
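In newer workspaces, a similar outcome can be scripted with the Databricks Repos API (`POST /api/2.0/repos`), which clones a Git repository into the workspace rather than linking a single notebook. A hedged sketch of the request body; the repository URL and `/Repos` path are placeholders:

```python
def build_repo_link_payload(repo_url: str, repos_path: str) -> dict:
    """Build the body for POST /api/2.0/repos.

    repo_url and repos_path are illustrative placeholders;
    "azureDevOpsServices" is the provider value for Azure DevOps Git repos.
    """
    return {
        "url": repo_url,
        "provider": "azureDevOpsServices",
        "path": repos_path,
    }
```

Sending this payload (with a bearer token, as in any other Databricks REST call) would make the repository's notebooks available under the given `/Repos` path.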


Step 2: Dev Env – Git Commit
To commit once development is done:

1. From Revision history, click Save now
2. Add your commit message and tick the checkbox
3. Once you save, the change is committed to your feature branch

DevOps – Release pipeline

1. Create a pull request from your feature branch to merge it into the master branch.
Add the Team Lead / Release Manager as a reviewer.
Once the pull request is merged into master, the CI/CD pipeline is triggered automatically,
and the Team Lead / Release Manager receives an email requesting pre-deployment approval into UAT and Production.

2. Deploy to UAT/PROD
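The deployment step of the release pipeline typically walks the notebooks checked out from master and imports each one into the target workspace. A minimal sketch of the path-mapping half of that step, assuming (purely for illustration) that notebooks live under a local `notebooks/` folder and should land under a `/Shared` folder in the workspace:

```python
from pathlib import Path

def map_notebooks(local_root: str, workspace_root: str) -> list:
    """Pair each local .py notebook with its target workspace path.

    Workspace notebook paths carry no file extension, so .py is stripped.
    """
    pairs = []
    for path in sorted(Path(local_root).rglob("*.py")):
        rel = path.relative_to(local_root).with_suffix("")
        pairs.append((str(path), f"{workspace_root}/{rel.as_posix()}"))
    return pairs
```

Each resulting pair would then be fed to a Workspace API import call (one per notebook), so the same script serves UAT and Production by changing only the target workspace URL and token supplied by the pipeline stage.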
