Create encrypted zones in HDFS

To create an HDFS encryption zone, you first need to set up the HDFS Data at Rest Encryption service. For the Cloudera distribution, follow the steps below: Select Service.

Configure Proxy for HiveServer2 and Impala

Here we will use HAProxy, an open-source HA load balancer and proxy server for TCP- and HTTP-based applications. Nginx is not recommended, as we do not have web server traffic to load balance. Let’s first install HAProxy on the proxy server. yum -y inst…

Databricks Continuous integration and delivery using Azure DevOps

So you’ve created notebooks in your Databricks workspace, collaborated with your peers and now you’re ready to operationalize your work. This is a simple process if you only need to copy to another folder within the same workspace. But what if you needed to separate …

Real-time Anomaly detection on Azure

Anomaly detection is a very powerful pattern, used about 70% of the time. Azure Stream Analytics has built-in ML-based anomaly detection. It is based on an unsupervised learning model, i.e. the model does not come with any pre-training; it starts learning with no o…

Real-Time Data Stream Processing In Azure Part: 2

This article is an extension of my previous post, “Real-Time Data Stream Processing in Azure Part: 1”, on real-time analytics. Before I get to the core topic that brought you to this blog, let’s quickly review a few basic concepts of Stream Analytics.

Real-Time Data Stream Processing In Azure Part: 1

There are multiple ways to do real-time analytics on Azure, depending on the source type and the analytics we want to perform. This article introduces the major Azure services involved in a real-time data solution and, at the end, compares them all to kno…