Databricks Continuous Integration and Delivery Using Azure DevOps

So you’ve created notebooks in your Databricks workspace, collaborated with your peers, and now you’re ready to operationalize your work. This is simple if you only need to copy notebooks to another folder within the same workspace. But what if you need to keep your DEV and PROD environments separate?
Things get more complicated when it comes to automating code deployment; hence CI/CD.

Implementing CI/CD in Azure Databricks

The following steps make up a complete CI/CD workflow:
1. Develop and commit your code in the develop branch.
2. Merge the develop branch into the master branch.
3. Deploy notebooks to the different environments (Dev → UAT → Prod) using CI/CD pipelines in Azure DevOps.
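The Dev → UAT → Prod promotion can be sketched as a small deployment helper against the Databricks Workspace API (`POST /api/2.0/workspace/import`). This is a minimal sketch: the workspace URLs, notebook path, and token handling below are illustrative placeholders, not part of any specific project setup.

```python
import base64
import json
import urllib.request

# Placeholder workspace URLs for each environment (assumed values for illustration).
ENVIRONMENTS = {
    "dev": "https://adb-dev-example.azuredatabricks.net",
    "uat": "https://adb-uat-example.azuredatabricks.net",
    "prod": "https://adb-prod-example.azuredatabricks.net",
}

def build_import_payload(notebook_path: str, source: str) -> dict:
    """Build the body for POST /api/2.0/workspace/import; source is base64-encoded."""
    return {
        "path": notebook_path,
        "format": "SOURCE",
        "language": "PYTHON",
        "overwrite": True,  # replace the notebook if it already exists
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
    }

def deploy(env: str, notebook_path: str, source: str, token: str) -> None:
    """Import (create or overwrite) a notebook in the target environment's workspace."""
    req = urllib.request.Request(
        f"{ENVIRONMENTS[env]}/api/2.0/workspace/import",
        data=json.dumps(build_import_payload(notebook_path, source)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # Databricks personal access token
            "Content-Type": "application/json",
        },
    )
    urllib.request.urlopen(req)  # raises urllib.error.HTTPError on failure
```

A call such as `deploy("uat", "/Shared/etl/load", source_code, token)` would create or overwrite that notebook in the UAT workspace; the release pipeline would run the same call per environment after each approval gate.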


Databricks Notebook configuration

Step 1: Dev Env – Code Development + Git Link
Create a notebook:
Develop code and unit tests in a Databricks notebook.
To link it to Git:
1. From Revision history, click Git: Not linked
2. Status: select Link
3. Link: enter the repository URL of your project
4. Branch: create a new branch from your project’s master branch
5. Save: to link the notebook to Git
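In newer workspaces, a similar outcome can be scripted with the Databricks Repos API (`POST /api/2.0/repos`), which clones a Git repository into the workspace rather than linking a single notebook. A hedged sketch of the request body; the repository URL and `/Repos` path are placeholders:

```python
def build_repo_link_payload(repo_url: str, repos_path: str) -> dict:
    """Build the body for POST /api/2.0/repos.

    repo_url and repos_path are illustrative placeholders;
    "azureDevOpsServices" is the provider value for Azure DevOps Git repos.
    """
    return {
        "url": repo_url,
        "provider": "azureDevOpsServices",
        "path": repos_path,
    }
```

Sending this payload (with a bearer token, as in any other Databricks REST call) would make the repository's notebooks available under the given `/Repos` path.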


Step 2: Dev Env – Git Commit
To commit once development is done:

1. From Revision history, click Save now
2. Add your commit message and tick the checkbox
3. Once you save, the change is committed to your feature branch

DevOps – Release pipeline

1. Create a pull request from your feature branch to merge it into the master branch.
Add the Team Lead / Release Manager as a reviewer.
Once the pull request is merged into master, the CI/CD pipeline is triggered automatically,
and the Team Lead / Release Manager receives an email requesting pre-deployment approval into UAT and Production.

2. Deploy to UAT/PROD
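The deployment step of the release pipeline typically walks the notebooks checked out from master and imports each one into the target workspace. A minimal sketch of the path-mapping half of that step, assuming (purely for illustration) that notebooks live under a local `notebooks/` folder and should land under a `/Shared` folder in the workspace:

```python
from pathlib import Path

def map_notebooks(local_root: str, workspace_root: str) -> list:
    """Pair each local .py notebook with its target workspace path.

    Workspace notebook paths carry no file extension, so .py is stripped.
    """
    pairs = []
    for path in sorted(Path(local_root).rglob("*.py")):
        rel = path.relative_to(local_root).with_suffix("")
        pairs.append((str(path), f"{workspace_root}/{rel.as_posix()}"))
    return pairs
```

Each resulting pair would then be fed to a Workspace API import call (one per notebook), so the same script serves UAT and Production by changing only the target workspace URL and token supplied by the pipeline stage.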
