Showing posts with the label Cloudera

Create encrypted zones in HDFS

To create an HDFS encryption zone, you first need to set up the HDFS Data at Rest Encryption service. For the Cloudera distribution, follow the steps below: Select Service.
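
Once the encryption service is in place, creating the zone itself can be sketched roughly as follows. This is a minimal illustration, not the post's own procedure: the key name `demo-key`, the path `/data/secure`, and the `run` and `create_encryption_zone` helpers are hypothetical. The key command typically needs a key administrator and the zone command an HDFS superuser.

```shell
#!/usr/bin/env bash
# Sketch only: key name and path are hypothetical placeholders.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: create_encryption_zone <key-name> <hdfs-path>
create_encryption_zone() {
  local key_name="$1" zone_path="$2"
  run hadoop key create "$key_name"      # create the zone key in the KMS
  run hdfs dfs -mkdir -p "$zone_path"    # the zone root must exist and be empty
  run hdfs crypto -createZone -keyName "$key_name" -path "$zone_path"
  run hdfs crypto -listZones             # confirm the zone is registered
}

create_encryption_zone demo-key /data/secure
```

Setting DRY_RUN=0 executes the commands instead of printing them.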

Configure Proxy for HiveServer2 and Impala

Here we will use HAProxy, an open-source high-availability load balancer and proxy server for TCP- and HTTP-based applications. Nginx is not recommended because we have no web server traffic to load balance. Let’s first install HAProxy on the proxy server. yum -y inst…

Lambda vs Delta Architecture - Realtime Analytics on Delta Lake

Before I get into the details of Delta Architecture, let’s recap Lambda Architecture first; then you will be able to appreciate the beauty of Delta Architecture. Lambda architecture is a popular technique where records are processed by a batch system and streaming system…

How to expand Linux OS Disk

First, let’s understand the three different disk roles: the data disk, the OS disk, and the temporary disk. A data disk is a managed disk that's attached to a virtual machine to store application data or other data you need to keep. The OS disk has pre-insta…

Hadoop Cluster Setup

Installing a Hadoop cluster typically involves unpacking the software on all the machines in the cluster or installing it via a packaging system as appropriate for your operating system. It is important to divide up the hardware into functions. Typically one machin…

Benchmark Hadoop

When we install Hadoop, we get a few JARs for testing the installation and for benchmarking. In the Cloudera distribution: /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar /opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.…
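
As a rough illustration of how the examples JAR is typically used, the sketch below runs the `pi` smoke test followed by the TeraGen, TeraSort, and TeraValidate sequence. The JAR path is the one quoted in the excerpt; the work directory `/tmp/benchmark`, the row count, and the `run` and `run_benchmarks` helpers are assumptions made for illustration, not taken from the post.

```shell
#!/usr/bin/env bash
# Sketch only: DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Path as quoted in the excerpt (CDH 6.1.1 parcel); adjust for your install.
EXAMPLES_JAR="/opt/cloudera/parcels/CDH-6.1.1-1.cdh6.1.1.p0.875250/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar"

# Usage: run_benchmarks <hdfs-work-dir> <rows>
run_benchmarks() {
  local work_dir="$1" rows="$2"
  # Quick smoke test: 10 map tasks, 100 samples per map.
  run hadoop jar "$EXAMPLES_JAR" pi 10 100
  # Generate <rows> records of 100 bytes each.
  run hadoop jar "$EXAMPLES_JAR" teragen "$rows" "$work_dir/input"
  # Sort the generated data, then check that the output is ordered.
  run hadoop jar "$EXAMPLES_JAR" terasort "$work_dir/input" "$work_dir/sorted"
  run hadoop jar "$EXAMPLES_JAR" teravalidate "$work_dir/sorted" "$work_dir/report"
}

run_benchmarks /tmp/benchmark 1000000
```

Setting DRY_RUN=0 submits the jobs to the cluster; the TeraGen output directory must not already exist.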

Install Apache Web Server on RedHat/CentOS 7

Apache is available in CentOS’s default software repositories, which means you can install it with the yum package manager: yum install -y httpd. You might need to allow HTTP (port 80) and HTTPS (port 443) through the firewall; use the commands below:
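
The firewall commands themselves are cut off in this excerpt. As a rough sketch, assuming the host uses firewalld (the CentOS 7 default), opening both ports could look like the following; the `run` and `open_web_ports` helpers are hypothetical and may differ from what the post actually uses.

```shell
#!/usr/bin/env bash
# Sketch only, assuming firewalld is the active firewall.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

open_web_ports() {
  run firewall-cmd --permanent --add-service=http    # port 80
  run firewall-cmd --permanent --add-service=https   # port 443
  run firewall-cmd --reload                          # apply the permanent rules
}

open_web_ports
```

Setting DRY_RUN=0 applies the rules; this requires root privileges.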

How to Connect to Hadoop Cloudera Hive Cluster

This article explains how to address the Cloudera connection errors shown below. If you encounter them, the following steps to restart all Cloudera services will most likely resolve the issue.
