# HashPrompt

After configuring an automatic failover HA with ZooKeeper and NFS, we will now configure an automatic failover HA with ZooKeeper and QJM.

We will use the Quorum Journal Manager (QJM) to share edit logs between the active and standby namenodes. Every namespace modification made by the active namenode is recorded on the journal nodes, and the standby namenode reads the edits from them. The journal node daemons can run alongside other daemons such as the namenode or jobtracker. Run at least three journal nodes: each edit must be written to a majority of them, so three journal nodes tolerate the loss of one.
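The majority rule can be sketched as simple arithmetic. The function names below are my own (hypothetical helpers, not part of Hadoop); they only illustrate how many journal nodes must acknowledge a write and how many may fail for a given count N.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating quorum arithmetic for N journal nodes.
# Shell arithmetic uses integer division.

# Number of journal nodes that must acknowledge each edit.
qjm_majority() {
  local n=$1
  echo $(( n / 2 + 1 ))
}

# Number of journal nodes that can be down while writes still succeed.
qjm_tolerated_failures() {
  local n=$1
  echo $(( (n - 1) / 2 ))
}
```

For the three journal nodes used in this setup, `qjm_majority 3` prints 2 and `qjm_tolerated_failures 3` prints 1. A fourth journal node would not raise the number of tolerated failures, which is why odd counts are the usual choice.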

Setup Summary

The below table describes my setup.

Role                       Hostname     IP-Address

namenode1/journal-node1    ha-nn01      192.168.56.101

namenode2/journal-node2    ha-nn02      192.168.56.102

namenode3/journal-node3    ha-nn03      192.168.56.103

datanode1                  ha-dn01      192.168.56.104

datanode2                  ha-dn02      192.168.56.105

client                     ha-client    192.168.56.101

As the table shows, the three namenode hosts each run a journal node as well. Of these, ha-nn01 and ha-nn02 are configured as the HA namenode pair in hdfs-site.xml below. The cluster also has two datanodes and one client machine.

Downloads

Download the below packages and place them in all nodes.

Download Apache Hadoop 2.6 here.

Download Oracle JDK 8 here.

Note: Before moving directly towards the installation and configuration part, read the pre-checks listed below.

1. Disable firewall on all nodes.

2. Disable selinux on all nodes.

3. Add the hostnames and respective IP addresses of all nodes to the /etc/hosts file on every node.

4. It is recommended to use Oracle JDK.
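As a rough sketch of pre-check 3, the entries for the namenode and datanode hosts from the setup table could be generated as shown here. The function name `print_cluster_hosts` is my own, and the client entry is left out; add it with whatever address your client host uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print /etc/hosts entries for the cluster nodes
# listed in the setup table (addresses copied from that table).
print_cluster_hosts() {
  cat <<'EOF'
192.168.56.101 ha-nn01
192.168.56.102 ha-nn02
192.168.56.103 ha-nn03
192.168.56.104 ha-dn01
192.168.56.105 ha-dn02
EOF
}
```

Running `print_cluster_hosts >> /etc/hosts` as root on each node would append the entries. Review the file afterwards to make sure no hostname is listed twice.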

INSTALLATION & CONFIGURATION

Hadoop installation & configuration includes user settings, java installation, passwordless ssh configuration and lastly, hadoop installation and configuration.

User & Group Settings

Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

First we will create a group named 'hadoop'. Next we create a user 'huser' to perform all hadoop administrative tasks and set up a password for it.

# groupadd hadoop

# useradd -m -d /home/huser -g hadoop huser

# passwd huser

Note: Henceforth we will be using the newly created 'huser' user to perform all hadoop tasks.

Java Installation

Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

To install java, you can refer to the blog here.

Passwordless SSH Configuration

Passwordless ssh environment is needed by the namenode to start HDFS & MapReduce related daemons in all nodes.

Location: ha-nn01, ha-nn02, ha-nn03

huser@ha-nn01:~$ ssh-keygen -t rsa
huser@ha-nn01:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub huser@ha-nn01
huser@ha-nn01:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub huser@ha-nn02
huser@ha-nn01:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub huser@ha-nn03
huser@ha-nn01:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub huser@ha-dn01
huser@ha-nn01:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub huser@ha-dn02

Testing Passwordless SSH
Run the below commands from ha-nn01, ha-nn02 & ha-nn03 to test passwordless logins.
huser:~$ ssh ha-nn01
huser:~$ ssh ha-nn02
huser:~$ ssh ha-nn03
huser:~$ ssh ha-dn01
huser:~$ ssh ha-dn02
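
The manual logins above can also be checked in one pass. This is only a sketch: the function name and the `CLUSTER_HOSTS` variable are my own, and it assumes the hostnames from the setup table resolve on the machine where it runs.

```shell
#!/usr/bin/env bash
# Hosts taken from the setup table; adjust to your own cluster.
CLUSTER_HOSTS="ha-nn01 ha-nn02 ha-nn03 ha-dn01 ha-dn02"

# Hypothetical helper: try a non-interactive login to every host.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a host without a working key shows up as FAIL.
check_passwordless_ssh() {
  local host failed=0
  for host in $CLUSTER_HOSTS; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Run `check_passwordless_ssh` as 'huser' on each of ha-nn01, ha-nn02 and ha-nn03. It returns a non-zero status if any host could not be reached without a password.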

ZooKeeper Installation

Location: ha-nn01, ha-nn02, ha-nn03

For zookeeper quorum installation and configuration, you can refer to the blog here.

Note: Zookeeper needs to be installed and configured only on the three namenode hosts.

Hadoop Installation

We will be installing the latest stable release of Apache Hadoop, 2.6.0.

Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

We will first place the downloaded tarball in /opt directory, untar it and change the ownership of that directory to 'huser' user.

root:~# cd /opt

root:~# tar -xzvf hadoop-2.6.0.tar.gz

root:~# chown -R huser:hadoop hadoop-2.6.0/

Next we will login as 'huser' user and set the environment variables in .bashrc file.

huser:~$ vi ~/.bashrc

###JAVA CONFIGURATION###

export JAVA_HOME=/usr/java/jdk1.8.0_25/

export PATH=$PATH:$JAVA_HOME/bin

###HADOOP CONFIGURATION###

export HADOOP_PREFIX=/opt/hadoop-2.6.0/

export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

After making necessary changes to the .bashrc file activate the configured environment settings for 'huser' user by running the below command.

huser:~$ exec bash

Testing Hadoop Installation

Execute the below command to test the hadoop installation. It should produce output similar to the following.

huser:~$ hadoop version

Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a

This command was run using /opt/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar

Note: Installation of hadoop has to be done on all nodes.

Hadoop Configuration

A few files need to be configured to bring up a hadoop cluster with automatic failover using QJM. All our configuration files reside in the /opt/hadoop-2.6.0/etc/hadoop/ directory.

hadoop-env.sh
Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

huser:~$ vi /opt/hadoop-2.6.0/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/opt/jdk1.8.0_25

export HADOOP_LOG_DIR=/var/log/hadoop/

Create a directory for logs as specified in hadoop-env.sh file with required 'huser' user permissions.

root:~# mkdir /var/log/hadoop

root:~# chown -R huser:hadoop /var/log/hadoop

core-site.xml

Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

huser:~$ vi /opt/hadoop-2.6.0/etc/hadoop/core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://auto-ha</value>
</property>
</configuration>

hdfs-site.xml

Location: ha-nn01, ha-nn02, ha-nn03, ha-dn01, ha-dn02, ha-client

<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///hdfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>auto-ha</value>
</property>
<property>
<name>dfs.ha.namenodes.auto-ha</name>
<value>nn01,nn02</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn01</name>
<value>ha-nn01:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn01</name>
<value>ha-nn01:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.auto-ha.nn02</name>
<value>ha-nn02:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.auto-ha.nn02</name>
<value>ha-nn02:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://ha-nn01:8485;ha-nn02:8485;ha-nn03:8485/auto-ha</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hdfs/journalnode</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/huser/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.auto-ha</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>ha-nn01.hadoop.lab:2181,ha-nn02.hadoop.lab:2181,ha-nn03.hadoop.lab:2181</value>
</property>
</configuration>

Note:

1. Replication factor is set to '2' as I have only two datanodes.

2. Create a directory /hdfs/name in all namenodes with required 'huser' user permissions.

root:~# mkdir -p /hdfs/name

root:~# chown -R huser:hadoop /hdfs/name

3. Create a directory /hdfs/data in all datanodes with required 'huser' user permissions.

root:~# mkdir -p /hdfs/data
root:~# chown -R huser:hadoop /hdfs/data

4. Create a directory /hdfs/journalnode in all namenodes with required 'huser' user permissions.
root:~# mkdir /hdfs/journalnode
root:~# chown -R huser:hadoop /hdfs/journalnode

5. In ha-client host add the below property to hdfs-site.xml file.

<property>
<name>dfs.client.failover.proxy.provider.auto-ha</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

6. We can explicitly enable automatic-failover for the nameservice-id 'auto-ha' by setting the property 'dfs.ha.automatic-failover.enabled.auto-ha' to 'true'.
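
Once the files are in place, the values Hadoop actually resolves can be printed with `hdfs getconf -confKey`. The wrapper below is only a sketch (the function name is my own) and it assumes the `hdfs` command is on the PATH as set up in .bashrc earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the effective value of a few HA-related
# keys from the configuration above, as resolved by 'hdfs getconf'.
show_ha_config() {
  local key
  for key in \
      dfs.nameservices \
      dfs.ha.namenodes.auto-ha \
      dfs.namenode.shared.edits.dir \
      ha.zookeeper.quorum; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}
```

Running `show_ha_config` on each node is a quick way to spot a node whose hdfs-site.xml differs from the others, for example a missing nameservice or a mistyped journal node address.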

slaves

This file contains only the hostnames of datanodes.

Location: ha-nn01, ha-nn02, ha-nn03

huser:~$ vi /opt/hadoop-2.6.0/etc/hadoop/slaves

ha-dn01

ha-dn02

Finally after completing the configuration part, we will be initializing and starting our automatic-failover hadoop cluster.

Initializing HA state in ZooKeeper

The required HA state must be initialized in ZooKeeper by running the below command from any one of the namenodes. The ZooKeeper quorum must already be running.

Location: ha-nn01

huser@ha-nn01:~$ hdfs zkfc -formatZK

Starting Journal Nodes

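When the Quorum Journal Manager is used to share edit logs, the journalnode daemon must be started on every journal node host (all three namenode hosts in this setup) before the namenode is formatted.

Location: ha-nn01, ha-nn02, ha-nn03

huser@ha-nn01:~$ hadoop-daemon.sh start journalnode

huser@ha-nn02:~$ hadoop-daemon.sh start journalnode

huser@ha-nn03:~$ hadoop-daemon.sh start journalnode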

Formatting & Starting Namenodes

Only the first namenode is formatted. The second namenode then copies the formatted metadata from the first with -bootstrapStandby, so the first namenode must be running before the second is bootstrapped.

Location: ha-nn01

huser@ha-nn01:~$ hdfs namenode -format

huser@ha-nn01:~$ hadoop-daemon.sh start namenode

Location: ha-nn02

huser@ha-nn02:~$ hdfs namenode -bootstrapStandby
huser@ha-nn02:~$ hadoop-daemon.sh start namenode

Note: By default both the namenodes will be in 'standby' state.

Starting ZKFC Services

The Zookeeper Failover Controller (ZKFC) service needs to be started so that one of the namenodes is elected as 'active'. Run the below command on both namenodes.

huser@ha-nn01:~$ hadoop-daemon.sh start zkfc

huser@ha-nn02:~$ hadoop-daemon.sh start zkfc

Note: As soon as the zkfc service is started, one of the namenodes becomes active. You can check this with the below commands from any one of the namenodes.

huser@ha-nn01:~$ hdfs haadmin -getServiceState nn01

huser@ha-nn01:~$ hdfs haadmin -getServiceState nn02
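
The two checks can be wrapped into one small function. This is a sketch with a function name of my own; it assumes the namenode IDs nn01 and nn02 from dfs.ha.namenodes.auto-ha and relies on `hdfs haadmin -getServiceState` printing `active` or `standby`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ID of the namenode that reports
# 'active'. Returns non-zero if neither namenode is active.
find_active_namenode() {
  local nn state
  for nn in nn01 nn02; do
    # A namenode that is down makes haadmin fail; treat that as "not active".
    state="$(hdfs haadmin -getServiceState "$nn" 2>/dev/null || true)"
    if [ "$state" = "active" ]; then
      echo "$nn"
      return 0
    fi
  done
  return 1
}
```

`find_active_namenode` prints either nn01 or nn02, which is handy later when deciding which host to fail during the failover test.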

Starting Datanodes

To start the datanodes run the below mentioned command from any one of the namenodes.

huser@ha-nn01:~$ hadoop-daemons.sh start datanode

Verifying Automatic Failover

To verify the automatic failover, we need to locate the active namenode using command line or by visiting the namenode web interfaces.

Using command line

huser@ha-nn01:~$ hdfs haadmin -getServiceState nn01
huser@ha-nn01:~$ hdfs haadmin -getServiceState nn02

Using web interface

ha-nn01:50070

ha-nn02:50070

After locating the active namenode, we can cause a failure on that node to trigger an automatic failover. One way to fail the active namenode is to run the 'jps' command, find the PID of the namenode daemon and kill it. Within a few seconds the other namenode will automatically become active.
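
The jps-and-kill step might look like the sketch below. The function names are my own, and it assumes `jps` from the JDK is on the PATH and lists the namenode JVM as `<pid> NameNode`. Run it only on the host of the currently active namenode, and only on a test cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the PID of the local NameNode JVM.
# jps prints one "<pid> <main class>" pair per line.
namenode_pid() {
  jps | awk '$2 == "NameNode" { print $1 }'
}

# Hypothetical helper: kill the local NameNode abruptly to simulate a crash.
kill_local_namenode() {
  local pid
  pid="$(namenode_pid)"
  if [ -z "$pid" ]; then
    echo "no NameNode process found" >&2
    return 1
  fi
  kill -9 "$pid"
}
```

After `kill_local_namenode`, running `hdfs haadmin -getServiceState` against the other namenode should report it as active within a few seconds.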


9 comments:

Subhankar Chattopadhyay, April 1, 2015 at 4:31 PM
Hi,

I am trying to setup a HA Hadoop cluster following your blog but facing some difficulties. Details are here. -

http://stackoverflow.com/questions/29346348/hadoop-ha-setup-not-able-to-connect-to-zookeeper?noredirect=1#comment46921872_29346348

Could you please help me?

Regards,
Subhankar

Asier, December 1, 2015 at 9:39 PM
Hi,

I am trying to follow your tutorial, but I have some problems.
Here you can see my problem: http://stackoverflow.com/questions/34024215/error-executing-hdfs-zkfc-command

Could you help me?
I am new to hdfs and I don't know where is the problem.

Best regards,

Unknown, January 15, 2016 at 5:55 PM
Hi,

As per your configuration, my namenode status is active but it shows live nodes count 0 .
My cluster info:
total 4 machines.
2 Namenodes.
1 for NFS.
1 for datanode.

My slaves entry is the my machine name for data node.

Datanode log info: 2016-01-15 17:16:40,920 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = datanode1/192.x.x.117
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.6.0

please help me, how to start datanode.



Monday, January 12, 2015

Fully Distributed Hadoop Cluster - Automatic Failover HA Cluster with Zookeeper & QJM
