Sunday, January 11, 2015

ZooKeeper Installation & Configuration

In my previous post we configured a manual failover Hadoop cluster. Now that we are clear on Hadoop's high availability features, we will proceed towards configuring an automatic failover cluster. Before that, we need ZooKeeper.

What is ZooKeeper?

An excerpt from Apache ZooKeeper website -

ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Each time they are implemented there is a lot of work that goes into fixing the bugs and race conditions that are inevitable. Because of the difficulty of implementing these kinds of services, applications initially usually skimp on them, which make them brittle in the presence of change and difficult to manage. Even when done correctly, different implementations of these services lead to management complexity when the applications are deployed.
Learn more about ZooKeeper on the ZooKeeper Wiki website.

Apache ZooKeeper is an open source Apache Software Foundation project that provides solutions for various coordination problems in large distributed systems. It serves as a highly reliable, centralised coordination service running on a cluster of servers. It is strongly recommended to run ZooKeeper in replicated mode on a minimum of three separate servers; this group of ZooKeeper servers is called a ZooKeeper ensemble. Production environments across the world normally opt for a five-node ZooKeeper ensemble.

Within this ZooKeeper ensemble, the replicated group of servers serving the same application is called a quorum. The servers in a quorum share the same ZooKeeper configuration file and run in a leader-follower model: one of the ZooKeeper servers acts as the leader and the others as followers. If the leader fails, one of the followers is elected as the new leader.

Apache ZooKeeper is implemented in Java. It also ships with C, Java, Perl and Python client bindings.

We have downloaded the latest stable release i.e. zookeeper-3.4.8.

Setup Summary

It is recommended to run an odd number of ZooKeeper servers so that a majority of servers remains available if any ZooKeeper server fails (an ensemble of three tolerates one failed server; an ensemble of five tolerates two). Below mentioned is the setup summary.

ZooKeeper Installation

After downloading the tarball, we will place it in the system directory /opt and extract it. Perform the below mentioned steps on all three ZooKeeper nodes. We are using the "huser" user (with ownership of the /opt directory) to install ZooKeeper on all ZooKeeper server nodes.

Set the permissions for 'huser' user.
Location: zkdn1, zkdn2, zkdn3

root~# chown -R huser:hadoop /opt/zookeeper-3.4.8

Next, create a zoo.cfg file in the conf directory with the settings described below.
Location: zkdn1, zkdn2, zkdn3

huser~$ vi /opt/zookeeper-3.4.8/conf/zoo.cfg
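As a sketch, our zoo.cfg would look roughly like this, based on the parameters explained below — the dataDir, clientPort and server entries match our setup, while the tickTime, initLimit and syncLimit values shown here are typical defaults, not values taken from our cluster:

```
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zoodata
clientPort=2181
server.1=zkdn1:2888:3888
server.2=zkdn2:2888:3888
server.3=zkdn3:2888:3888
```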



Add the ZooKeeper executable path to the .bashrc file on all three ZooKeeper nodes, then reload it.
huser~$ vi ~/.bashrc
export PATH=$PATH:/opt/zookeeper-3.4.8/bin
huser~$ source ~/.bashrc


  • tickTime determines the basic time unit in milliseconds used by ZooKeeper for heartbeats. The minimum session timeout is twice the tickTime parameter.
  • initLimit is the maximum time, specified in number of ticks, that a ZooKeeper follower server in the quorum may take to connect to the leader.
  • syncLimit is the maximum time, in ticks, that a follower server may be out of sync with the leader.
  • dataDir points to the directory where ZooKeeper stores its data (in-memory data snapshots and transaction logs of data updates).
  • clientPort is the port number that listens for client connections; it is where ZooKeeper clients initiate a connection.
  • The last three parameters are specified in the "server.id=host:port:port" format. In our configuration we have used identifier "1" for the quorum server "zkdn1", identifier "2" for "zkdn2" and identifier "3" for "zkdn3". These identifiers must be unique for each server and must also be specified in a file named "myid". Port 2888 is used for peer-to-peer communication in the quorum, such as connecting followers to the leader, and port 3888 is used for leader election. All of this communication happens over TCP.

Create a "zoodata" directory in /opt with the required 'huser' permissions, as declared in the /opt/zookeeper-3.4.8/conf/zoo.cfg file, and create a file named 'myid' in it. Add 1 to the myid file on zkdn1, 2 to the myid file on zkdn2 and 3 to the myid file on zkdn3.
Location: zkdn1, zkdn2, zkdn3

root:~# mkdir -p /opt/zoodata
root:~# chown -R huser:hadoop /opt/zoodata/

Location: zkdn1
huser@zkdn1~$ vi /opt/zoodata/myid

Location: zkdn2
huser@zkdn2~$ vi /opt/zoodata/myid

Location: zkdn3
huser@zkdn3~$ vi /opt/zoodata/myid

Note: The myid file contains a unique number between 1 and 255 that represents the ZooKeeper server id.
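Equivalently, the myid file can be created non-interactively. A minimal sketch for zkdn1 (the id written must match that host's server.N entry in zoo.cfg):

```shell
# Write this node's ZooKeeper server id into the myid file.
mkdir -p /opt/zoodata
echo 1 > /opt/zoodata/myid   # on zkdn2 write 2, on zkdn3 write 3
cat /opt/zoodata/myid
```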

Starting ZooKeeper

Run the below command to start the ZooKeeper service on all three ZooKeeper nodes.
huser~$ zkServer.sh start

One can confirm that the ZooKeeper service is running by executing the jps command and finding QuorumPeerMain listed on each node.

Run the zkServer.sh script with the status argument to check each ZooKeeper server's status.
huser~$ zkServer.sh status

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower

ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: leader

Now the ZooKeeper instance is running on all servers, with "zkdn3" being the "leader" and zkdn1 and zkdn2 the "followers".

We can also connect to a running ZooKeeper instance using the zkCli.sh script.
huser~$ zkCli.sh -server zkdn1:2181

Using the ZooKeeper data model, various distributed processes coordinate with each other through a shared hierarchical namespace of data registers. These data registers, known as znodes, can each store at most 1 MB of data. Znodes can be persistent or ephemeral. A persistent znode exists for the lifetime of ZooKeeper's namespace; it can only be deleted explicitly, and the znode and its data persist even after the client that created it dies. Persistent znodes are used by applications that want to store configuration data. An ephemeral znode, in contrast, ceases to exist and is deleted once the creating client's session ends; unlike persistent znodes, ephemeral znodes are not allowed to have children. Services such as distributed group membership can be implemented using ephemeral znodes.
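The difference between persistent and ephemeral znodes can be seen in a zkCli.sh session. The transcript below is an illustrative sketch with made-up znode names, not output captured from our cluster:

```
[zk: zkdn1:2181(CONNECTED) 0] create /myapp "config-data"
Created /myapp
[zk: zkdn1:2181(CONNECTED) 1] create -e /myapp/worker1 ""
Created /myapp/worker1
[zk: zkdn1:2181(CONNECTED) 2] ls /myapp
[worker1]
```

Here /myapp is persistent and survives until deleted explicitly, while the ephemeral /myapp/worker1 (created with the -e flag) disappears as soon as this client's session closes.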
