In my last post we configured a Hadoop Federation cluster in fully distributed mode. In this post we will set up a fully distributed, manual-failover Hadoop HA cluster. I will skip the Hadoop and Java installation steps, as we have already gone through them a couple of times in my previous posts. We will use the hardware configuration listed in the table below.
Role        Hostname    IP address
namenode1   ha-nn01     192.168.56.101
namenode2   ha-nn02     192.168.56.102
datanode1   ha-dn01     192.168.56.103
datanode2   ha-dn02     192.168.56.104
client      ha-client   192.168.56.105
We already have two namenodes, two datanodes and a client node, all running CentOS release 5.11 and ready with the required user configuration, a passwordless SSH environment, the appropriate Java configuration, and a Hadoop installation with all variables and paths declared.
Note: If this is not yet in place, please follow my last post, “Fully Distributed Hadoop Federation Cluster”, up to the “hadoop installation and testing” step. Also make sure that you follow the note in the “Downloads” section.
Hadoop Configuration
Let us move directly to the configuration required to set up manual-failover HA.
hadoop-env.sh
Location: ha-nn01, ha-nn02, ha-dn01, ha-dn02, ha-client
huser:~$ vi /opt/hadoop-2.6.0/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_25/
export HADOOP_LOG_DIR=/var/log/hadoop/
Create the log directory at the path specified by the HADOOP_LOG_DIR parameter in hadoop-env.sh and change the ownership of the directory to the “huser” user.
$ sudo mkdir /var/log/hadoop
$ sudo chown -R huser:hadoop /var/log/hadoop
Note: On the client machine there is no need to specify or create a log directory. Similarly, there is no need to declare Java's home directory if Java is pre-installed.
core-site.xml
Location: ha-nn01, ha-nn02, ha-dn01, ha-dn02, ha-client
huser:~$ sudo vi /opt/hadoop-2.6.0/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://man-ha</value>
  </property>
</configuration>
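Note that hdfs://man-ha points to a logical nameservice, not a host; the name must match the nameservice id declared in hdfs-site.xml below. As an optional sanity check (assuming HADOOP_CONF_DIR resolves to this configuration directory), the following should print hdfs://man-ha on any of the nodes:
huser:~$ hdfs getconf -confKey fs.default.name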
hdfs-site.xml
Location: ha-nn01, ha-nn02, ha-dn01, ha-dn02
huser:~$ sudo vi /opt/hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///hdfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>man-ha</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.man-ha</name>
    <value>nn01,nn02</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.man-ha.nn01</name>
    <value>ha-nn01:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.man-ha.nn01</name>
    <value>ha-nn01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.man-ha.nn02</name>
    <value>ha-nn02:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.man-ha.nn02</name>
    <value>ha-nn02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>file:///mnt/</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/huser/.ssh/id_rsa</value>
  </property>
</configuration>
Copy this hdfs-site.xml to the client node and add the property mentioned below to it.
Location: ha-client
huser@ha-client:~$ sudo vi /opt/hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.client.failover.proxy.provider.man-ha</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
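With this failover proxy provider in place, and since the copied hdfs-site.xml already carries the nameservice and namenode addresses, the client should also be able to address HDFS by the logical nameservice name rather than a specific namenode host. A minimal sketch, assuming the client configuration described above:
huser@ha-client:~$ hadoop dfs -ls hdfs://man-ha/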
Note: The value of the "dfs.namenode.shared.edits.dir" property should point to a shared NFS-mounted directory that is available on both namenodes. A minimal sketch for creating a permanently shared NFS mount follows.
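This is only a rough illustration; the NFS server hostname ha-nfs and the export path /export/sharededits are assumptions and not part of this setup, so substitute your own NFS server details.
On the NFS server (assumed host ha-nfs):
$ sudo mkdir -p /export/sharededits
$ echo "/export/sharededits 192.168.56.0/24(rw,sync)" | sudo tee -a /etc/exports
$ sudo exportfs -ra
On ha-nn01 and ha-nn02, mount the export at /mnt (the path used in dfs.namenode.shared.edits.dir) and add it to /etc/fstab so that the mount survives a reboot:
$ sudo mount -t nfs ha-nfs:/export/sharededits /mnt
$ echo "ha-nfs:/export/sharededits /mnt nfs defaults 0 0" | sudo tee -a /etc/fstab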
slaves
Location: ha-nn01, ha-nn02
huser:~$ vi /opt/hadoop-2.6.0/etc/hadoop/slaves
ha-dn01
ha-dn02
This completes the configuration required to deploy manual-failover Hadoop HA. Get ready to fire it up. The administration steps below will help us bring up and manage the cluster.
Formatting, starting & activating namenodes & datanodes
We will now prepare the namenodes one at a time and start the namenode daemon manually on each: the first namenode is formatted, and the second is bootstrapped from it. Make sure that the shared edits directory is mounted on both namenodes before proceeding.
Location: ha-nn01
huser@ha-nn01:~$ hadoop namenode -format
huser@ha-nn01:~$ hadoop-daemon.sh start namenode
Location: ha-nn02
huser@ha-nn02:~$ hadoop namenode -bootstrapStandby
huser@ha-nn02:~$ hadoop-daemon.sh start namenode
At this point both namenodes will be in standby state. To transition the desired namenode to the “active” state, run the command below from any namenode machine.
huser@ha-nn02:~$ hdfs haadmin -transitionToActive nn01
Note: In the example above we transitioned the namenode with the namenode-id “nn01” to the active state. Use the namenode-id in this field, not the hostname of the namenode.
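Later, when you want to switch roles manually (for example, to take ha-nn01 down for maintenance), a minimal sketch of a manual failover from nn01 to nn02 is the single command below; the -failover subcommand coordinates the transition and applies the configured sshfence fencing method to the old active namenode if required.
huser@ha-nn02:~$ hdfs haadmin -failover nn01 nn02
The same effect can be achieved with explicit transitions, first moving nn01 to standby and then making nn02 active:
huser@ha-nn01:~$ hdfs haadmin -transitionToStandby nn01
huser@ha-nn02:~$ hdfs haadmin -transitionToActive nn02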
Finally, we will start the datanode daemon on all slave nodes using the command below.
huser@ha-nn01:~$ hadoop-daemons.sh start datanode
Alternatively, we can start the datanode daemon independently on each slave node.
huser@ha-dn01:~$ hadoop-daemon.sh start datanode
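To confirm that the daemons came up, a quick jps check on each machine is enough: the output should include a NameNode process on ha-nn01 and ha-nn02 and a DataNode process on each datanode (the PIDs will of course differ).
huser@ha-nn01:~$ jps
huser@ha-dn01:~$ jps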
Monitoring
One can check the status of the namenodes using the command below from any namenode.
huser@ha-nn02:~$ hdfs haadmin -getServiceState nn01
where nn01 is the namenode-id for ha-nn01.
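Checking both namenode-ids gives a quick picture of the whole cluster. Assuming nn01 was transitioned to active as shown earlier, the commands below should report active and standby respectively:
huser@ha-nn02:~$ hdfs haadmin -getServiceState nn01
huser@ha-nn02:~$ hdfs haadmin -getServiceState nn02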
It is also possible to view the status of a namenode in a browser from any namenode or the client machine by pointing the URL to ha-nn01:50070 or ha-nn02:50070.
Filesystem operations like create, copy, list and delete should be done using the absolute path, i.e. a full hdfs:// URI.
Example: to copy a file from the local filesystem on the client to the active namenode ha-nn01:
huser@ha-client:~$ hadoop dfs -copyFromLocal largefile hdfs://ha-nn01/test/
Related links
Single-Node Hadoop Cluster on Ubuntu 14.04
Multi-Node Hadoop Cluster on Ubuntu 14.04
Multi-Node Hadoop Cluster on Oracle Solaris 11 using Zones
Fully Distributed Hadoop Cluster - Automatic Failover HA with ZooKeeper & NFS