If you're initializing a cluster for the first time, it's likely that you'll want to start small with just one or two slaves (DataNodes).
Over time, as you operate your cluster, gaining more experience and confidence, you'll likely want to add more slaves.
I started with a simple NameNode and dual-DataNode cluster configuration. This article details the steps I took to add an extra node to my cluster.
Outline
- Clone an existing VM (DataNode)
- Update the /etc/hosts file
- Add the new slave's node name and IP address
- Copy this file to each DataNode in the cluster
- Update the slaves file and copy it to each DataNode
- Start the DataNode
Clone an existing DataNode
If you're cloning a DataNode that's already been used within a cluster, you'll want to clean out the $HADOOP_DATA_DIR. An easy way to do this is to simply delete the existing directory and re-create it (reference "Configuring your Base" for more information on this step):
sudo rm -rf $HADOOP_DATA_DIR
mkdir -p $HADOOP_DATA_DIR/data
mkdir -p $HADOOP_DATA_DIR/name
mkdir -p $HADOOP_DATA_DIR/local
sudo chmod 755 $HADOOP_DATA_DIR
If you plan to clone more than one node, I recommend creating the additional clones from this point onward (that is, after the data directory has been cleaned).
Once my data directory is cleared, I use VirtualBox to create my clone:
VBoxManage clonevm "%~1" --name "%~2" --register --basefolder %vm%
Substitute the node names for the parameters above. Reference "VirtualBox for Virtualization" for automated cloning recipes.
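For example, to clone an existing VM named slave2 into a new VM named slave3, a direct invocation might look like the following (a hypothetical example; substitute your own VM names and base folder):
VBoxManage clonevm "slave2" --name "slave3" --register --basefolder "C:\VMs"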
Updating the Hosts File
On your NameNode, update the hosts file:
sudo gedit /etc/hosts
and add the node name and IP address for each newly created slave node.
My /etc/hosts file now looks like this:
127.0.0.1 localhost
127.0.1.1 CVB
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.15 slave3
192.168.1.17 slave4
192.168.1.18 slave5
192.168.1.14 dev
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
The new lines in my file are the entries for the newly added slaves. The information in your file will not be identical.
Copying the Hosts File
Do I need to copy the hosts file on my NameNode to each DataNode in the cluster?
Yes, you should do this. There are cases where DataNodes will talk to each other; they do this when they are replicating data. Also, when adding a new DataNode to an existing cluster, data re-balancing (addressed below) will occur. This requires DataNodes to address one another.
It is important that the DataNodes be able to address one another in a consistent fashion. I recommend maintaining (and editing) the master copy of the hosts file on the NameNode.
On my NameNode, I've created a script in my home directory that will copy the hosts file to each DataNode in my cluster:
cd ~
gedit copy-hosts.sh
Enter this script, making the appropriate modifications for your own user name and slave count:
#!/bin/bash
for i in {1..5}
do
scp /etc/hosts craigtrim@slave$i:/etc/hosts
done
Don't forget to make your shell script executable:
chmod +x copy-hosts.sh
Operational Output
When I execute this script, it simply reports back to me that all the files were copied successfully:
craigtrim@CVB:~$ ./copy-hosts.sh
hosts 100% 355 0.4KB/s 00:00
hosts 100% 355 0.4KB/s 00:00
hosts 100% 355 0.4KB/s 00:00
hosts 100% 355 0.4KB/s 00:00
hosts 100% 355 0.4KB/s 00:00
It doesn't hurt to log on to at least one of the slaves and verify that the file was copied correctly:
craigtrim@CVB:~$ ssh slave3
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-32-generic x86_64)
* Documentation: https://help.ubuntu.com/
229 packages can be updated.
87 updates are security updates.
Last login: Wed Nov 26 14:26:43 2014 from master
craigtrim@CVB:~$ cat /etc/hosts
127.0.0.1 localhost
127.0.1.1 CVB
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2
192.168.1.15 slave3
192.168.1.17 slave4
192.168.1.18 slave5
192.168.1.14 dev
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Once you gain confidence that the script is doing what it's supposed to, you can likely skip this verification step in the future.
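A quicker spot-check that avoids the interactive login (a minimal sketch, assuming passwordless SSH is already configured):
md5sum /etc/hosts
ssh slave3 md5sum /etc/hosts
The two checksums should match.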
Updating the Slaves File
This section is almost identical to the one above. Just as we maintain a master /etc/hosts on the designated NameNode, we'll maintain our master slaves file there.
Let's go ahead and edit it, and add the two new slaves (DataNodes):
cd $HADOOP_CONF_DIR
gedit slaves
My slaves file now looks like this:
slave1
slave2
slave3
slave4
slave5
The new lines in my file are the entries for the newly added slaves. That's about as simple as it gets. Use of the hosts file prevents us from having to enter actual IP addresses in this file.
In like manner, we can also write a script to copy our slaves file to each DataNode in the cluster. I call this script "copy-slaves.sh"; create it, enter the following, and make it executable in the same manner as the last script we wrote:
#!/bin/bash
for i in {1..5}
do
scp $HADOOP_CONF_DIR/slaves craigtrim@slave$i:$HADOOP_CONF_DIR
done
Operational Output
The script output contains nothing surprising:
craigtrim@CVB:~$ ./copy-slaves.sh
slaves 100% 35 0.0KB/s 00:00
slaves 100% 35 0.0KB/s 00:00
slaves 100% 35 0.0KB/s 00:00
slaves 100% 35 0.0KB/s 00:00
slaves 100% 35 0.0KB/s 00:00
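Both copy scripts hardcode the slave count in the {1..5} loop. If you'd rather drive the loop from the slaves file itself, a variant of copy-hosts.sh might look like this (a sketch, assuming one hostname per line and passwordless SSH):
#!/bin/bash
# push /etc/hosts to every host listed in the slaves file,
# skipping blank lines and comments
while read -r host; do
  [[ -z "$host" || "$host" == "#"* ]] && continue
  scp /etc/hosts "craigtrim@$host:/etc/hosts"
done < "$HADOOP_CONF_DIR/slaves"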
Starting Up
If the entire cluster has been stopped, you can simply start it using the start-dfs.sh and start-yarn.sh shell scripts, as described in the article "Flipping the Switch".
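For reference, these are run on the master (assuming Hadoop's sbin directory is on your PATH):
start-dfs.sh
start-yarn.sh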
If your cluster is already operational, and you want to "hot deploy" a new DataNode, then execute this command on the new DataNode:
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
Notice in the log below that once the DataNode starts up, it immediately begins heartbeating to the NameNode:
2014-11-26 13:08:56,141 INFO [main] datanode.DataNode (StringUtils.java:startupShutdownMessage(619)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = CVB/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.5.2
STARTUP_MSG: classpath = /usr/lib/apache/hadoop/2.5.2/conf:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-net-3.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/mockito-all-1.8.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jersey-core-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-cli-1.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-lang-2.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-math3-3.1.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/stax-api-1.0-2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jettison-1.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/hamcrest-core-1.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/junit-4.11.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/activation-1.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/zookeeper-3.4.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/paranamer-2.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-logging-1.1.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-codec-1.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/apacheds-i18n-2.0.0-M15.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jsch-0.1.42.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/hadoop-auth-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-compress-1.4.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jackson-xc-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-el-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/hadoop-annotations-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jackson-core-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jsp-api-2.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jersey-server-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-configuration-1.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-collections-3.2.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/log4j-1.2.17.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/xz-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/netty-3.6.2.Final.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/apacheds-kerberos-codec-2.0.0-M15.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/httpclient-4.2.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jets3t-0.9.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/api-util-1.0.0-M20.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-digester-1.8.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/java-xmlbuilder-0.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jersey-json-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/asm-3.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/apache/hadoop
/2.5.2/share/hadoop/common/lib/jetty-util-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-httpclient-3.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/httpcore-4.2.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/xmlenc-0.52.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/api-asn1-api-1.0.0-M20.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/avro-1.7.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/commons-io-2.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/servlet-api-2.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jsr305-1.3.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jackson-jaxrs-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/jetty-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/lib/guava-11.0.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/hadoop-common-2.5.2-tests.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/hadoop-nfs-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/common/hadoop-common-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-lang-2.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-el-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/asm-3.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/commons-io-2.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/lib/guava-11.0.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/hadoop-hdfs-nfs-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/hadoop-hdfs-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/hdfs/hadoop-hdfs-2.5.2-tests.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jersey-core-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-cli-1.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-lang-2.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/leveldbjni-all-1.8.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib
/aopalliance-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/stax-api-1.0-2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jettison-1.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/javax.inject-1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/activation-1.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/zookeeper-3.4.6.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-logging-1.1.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-codec-1.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jackson-xc-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jackson-core-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jersey-server-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jline-0.9.94.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/guice-3.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-collections-3.2.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/log4j-1.2.17.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/xz-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jersey-json-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/asm-3.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jetty-util-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jaxb-api-2.2.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-httpclient-3.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jaxb-impl-2.2.3-1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/commons-io-2.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/servlet-api-2.5.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jsr305-1.3.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jackson-jaxrs-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jetty-6.1.26.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/guava-11.0.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/lib/jersey-client-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-client-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-common-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-common-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-api-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/yarn/hadoop-yarn-server-tests-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib
/leveldbjni-all-1.8.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/javax.inject-1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/hamcrest-core-1.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/junit-4.11.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/hadoop-annotations-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/jackson-core-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/guice-3.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/xz-1.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/asm-3.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.9.13.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/commons-io-2.4.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.2-tests.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.5.2.jar:/usr/lib/apache/hadoop/2.5.2/contrib/capacity-scheduler/*.jar:/usr/lib/apache/hadoop/2.5.2/contrib/capacity-scheduler/*.jar:/usr/lib/apache/hadoop/2.5.2/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r cc72e9b000545b86b75a61f4835eb86d57bfafc0; compiled by 'jenkins' on 2014-11-14T23:45Z
STARTUP_MSG: java = 1.8.0_25
************************************************************/
2014-11-26 13:08:56,157 INFO [main] datanode.DataNode (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
2014-11-26 13:08:56,699 WARN [main] impl.MetricsConfig (MetricsConfig.java:loadFirst(124)) - Cannot locate configuration: tried hadoop-metrics2-datanode.properties,hadoop-metrics2.properties
2014-11-26 13:08:56,794 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:startTimer(345)) - Scheduled snapshot period at 10 second(s).
2014-11-26 13:08:56,794 INFO [main] impl.MetricsSystemImpl (MetricsSystemImpl.java:start(184)) - DataNode metrics system started
2014-11-26 13:08:56,799 INFO [main] datanode.DataNode (DataNode.java:<init>(291)) - Configured hostname is CVB
2014-11-26 13:08:56,800 INFO [main] datanode.DataNode (DataNode.java:startDataNode(766)) - Starting DataNode with maxLockedMemory = 0
2014-11-26 13:08:56,825 INFO [main] datanode.DataNode (DataNode.java:initDataXceiver(567)) - Opened streaming server at /0.0.0.0:50010
2014-11-26 13:08:56,829 INFO [main] datanode.DataNode (DataXceiverServer.java:<init>(75)) - Balancing bandwith is 1048576 bytes/s
2014-11-26 13:08:56,829 INFO [main] datanode.DataNode (DataXceiverServer.java:<init>(76)) - Number threads for balancing is 5
2014-11-26 13:08:57,069 INFO [main] mortbay.log (Slf4jLog.java:info(67)) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2014-11-26 13:08:57,073 INFO [main] http.HttpRequestLog (HttpRequestLog.java:getRequestLog(80)) - Http request log for http.requests.datanode is not defined
2014-11-26 13:08:57,084 INFO [main] http.HttpServer2 (HttpServer2.java:addGlobalFilter(699)) - Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer2$QuotingInputFilter)
2014-11-26 13:08:57,086 INFO [main] http.HttpServer2 (HttpServer2.java:addFilter(677)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context datanode
2014-11-26 13:08:57,087 INFO [main] http.HttpServer2 (HttpServer2.java:addFilter(684)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2014-11-26 13:08:57,087 INFO [main] http.HttpServer2 (HttpServer2.java:addFilter(684)) - Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2014-11-26 13:08:57,103 INFO [main] http.HttpServer2 (HttpServer2.java:addJerseyResourcePackage(603)) - addJerseyResourcePackage: packageName=org.apache.hadoop.hdfs.server.datanode.web.resources;org.apache.hadoop.hdfs.web.resources, pathSpec=/webhdfs/v1/*
2014-11-26 13:08:57,106 INFO [main] http.HttpServer2 (HttpServer2.java:openListeners(887)) - Jetty bound to port 50075
2014-11-26 13:08:57,106 INFO [main] mortbay.log (Slf4jLog.java:info(67)) - jetty-6.1.26
2014-11-26 13:08:57,394 INFO [main] mortbay.log (Slf4jLog.java:info(67)) - Started HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:50075
2014-11-26 13:08:57,559 INFO [main] datanode.DataNode (DataNode.java:startDataNode(783)) - dnUserName = craigtrim
2014-11-26 13:08:57,559 INFO [main] datanode.DataNode (DataNode.java:startDataNode(784)) - supergroup = supergroup
2014-11-26 13:08:57,608 INFO [main] ipc.CallQueueManager (CallQueueManager.java:<init>(53)) - Using callQueue class java.util.concurrent.LinkedBlockingQueue
2014-11-26 13:08:57,626 INFO [Socket Reader #1 for port 50020] ipc.Server (Server.java:run(593)) - Starting Socket Reader #1 for port 50020
2014-11-26 13:08:57,660 INFO [main] datanode.DataNode (DataNode.java:initIpcServer(442)) - Opened IPC server at /0.0.0.0:50020
2014-11-26 13:08:57,673 INFO [main] datanode.DataNode (BlockPoolManager.java:refreshNamenodes(152)) - Refresh request received for nameservices: null
2014-11-26 13:08:57,698 INFO [main] datanode.DataNode (BlockPoolManager.java:doRefreshNamenodes(197)) - Starting BPOfferServices for nameservices: <default>
2014-11-26 13:08:57,708 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:run(805)) - Block pool <registering> (Datanode Uuid unassigned) service to master/192.168.1.10:9000 starting to offer service
2014-11-26 13:08:57,713 INFO [IPC Server Responder] ipc.Server (Server.java:run(815)) - IPC Server Responder: starting
2014-11-26 13:08:57,713 INFO [IPC Server listener on 50020] ipc.Server (Server.java:run(662)) - IPC Server listener on 50020: starting
2014-11-26 13:08:57,947 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (DataStorage.java:recoverTransitionRead(173)) - Data-node version: -55 and name-node layout version: -57
2014-11-26 13:08:57,962 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (Storage.java:tryLock(706)) - Lock on /home/craigtrim/HADOOP_DATA_DIR/data/in_use.lock acquired by nodename 5450@CVB
2014-11-26 13:08:57,964 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (DataStorage.java:recoverTransitionRead(197)) - Storage directory /home/craigtrim/HADOOP_DATA_DIR/data is not formatted
2014-11-26 13:08:57,964 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (DataStorage.java:recoverTransitionRead(198)) - Formatting ...
2014-11-26 13:08:58,000 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(108)) - Analyzing storage directories for bpid BP-1847084755-127.0.1.1-1416961177933
2014-11-26 13:08:58,000 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (Storage.java:lock(666)) - Locking is disabled
2014-11-26 13:08:58,001 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(130)) - Storage directory /home/craigtrim/HADOOP_DATA_DIR/data/current/BP-1847084755-127.0.1.1-1416961177933 is not formatted.
2014-11-26 13:08:58,001 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (BlockPoolSliceStorage.java:recoverTransitionRead(131)) - Formatting ...
2014-11-26 13:08:58,001 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (BlockPoolSliceStorage.java:format(183)) - Formatting block pool BP-1847084755-127.0.1.1-1416961177933 directory /home/craigtrim/HADOOP_DATA_DIR/data/current/BP-1847084755-127.0.1.1-1416961177933/current
2014-11-26 13:08:58,004 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] common.Storage (BlockPoolSliceStorage.java:doTransition(254)) - Restored 0 block files from trash.
2014-11-26 13:08:58,006 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (DataNode.java:initStorage(977)) - Setting up storage: nsid=1137176454;bpid=BP-1847084755-127.0.1.1-1416961177933;lv=-55;nsInfo=lv=-57;cid=CID-822421ef-e4d6-49d7-9b25-e0b81ae32b7d;nsid=1137176454;c=0;bpid=BP-1847084755-127.0.1.1-1416961177933;dnuuid=null
2014-11-26 13:08:58,009 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (DataNode.java:checkDatanodeUuid(812)) - Generated and persisted new Datanode UUID 68bd5310-85cb-49bf-9bfa-a34be7531d58
2014-11-26 13:08:58,026 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsDatasetImpl.java:<init>(214)) - Added volume - /home/craigtrim/HADOOP_DATA_DIR/data/current, StorageType: DISK
2014-11-26 13:08:58,036 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsDatasetImpl.java:registerMBean(1376)) - Registered FSDatasetState MBean
2014-11-26 13:08:58,041 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DirectoryScanner (DirectoryScanner.java:start(329)) - Periodic Directory Tree Verification scan starting at 1417045060041 with interval 21600000
2014-11-26 13:08:58,042 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsDatasetImpl.java:addBlockPool(1745)) - Adding block pool BP-1847084755-127.0.1.1-1416961177933
2014-11-26 13:08:58,044 INFO [Thread-30] impl.FsDatasetImpl (FsVolumeList.java:run(208)) - Scanning block pool BP-1847084755-127.0.1.1-1416961177933 on volume /home/craigtrim/HADOOP_DATA_DIR/data/current...
2014-11-26 13:08:58,061 INFO [Thread-30] impl.FsDatasetImpl (FsVolumeList.java:run(213)) - Time taken to scan block pool BP-1847084755-127.0.1.1-1416961177933 on /home/craigtrim/HADOOP_DATA_DIR/data/current: 17ms
2014-11-26 13:08:58,062 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsVolumeList.java:addBlockPool(237)) - Total time to scan all replicas for block pool BP-1847084755-127.0.1.1-1416961177933: 19ms
2014-11-26 13:08:58,062 INFO [Thread-32] impl.FsDatasetImpl (FsVolumeList.java:run(109)) - Adding replicas to map for block pool BP-1847084755-127.0.1.1-1416961177933 on volume /home/craigtrim/HADOOP_DATA_DIR/data/current...
2014-11-26 13:08:58,063 INFO [Thread-32] impl.FsDatasetImpl (FsVolumeList.java:run(114)) - Time to add replicas to map for block pool BP-1847084755-127.0.1.1-1416961177933 on volume /home/craigtrim/HADOOP_DATA_DIR/data/current: 0ms
2014-11-26 13:08:58,063 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsVolumeList.java:getAllVolumesMap(137)) - Total time to add all replicas to map: 2ms
2014-11-26 13:08:58,067 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:register(765)) - Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid null) service to master/192.168.1.10:9000 beginning handshake with NN
2014-11-26 13:08:58,084 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:register(778)) - Block pool Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid null) service to master/192.168.1.10:9000 successfully registered with NN
2014-11-26 13:08:58,085 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:offerService(637)) - For namenode master/192.168.1.10:9000 using DELETEREPORT_INTERVAL of 300000 msec BLOCKREPORT_INTERVAL of 21600000msec CACHEREPORT_INTERVAL of 10000msec Initial delay: 0msec; heartBeatInterval=3000
2014-11-26 13:08:58,127 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPOfferService.java:updateActorStatesFromHeartbeat(439)) - Namenode Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid 68bd5310-85cb-49bf-9bfa-a34be7531d58) service to master/192.168.1.10:9000 trying to claim ACTIVE state with txid=1193283
2014-11-26 13:08:58,127 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPOfferService.java:updateActorStatesFromHeartbeat(451)) - Acknowledging ACTIVE Namenode Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid 68bd5310-85cb-49bf-9bfa-a34be7531d58) service to master/192.168.1.10:9000
2014-11-26 13:08:58,153 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:blockReport(498)) - Sent 1 blockreports 0 blocks total. Took 1 msec to generate and 24 msecs for RPC and NN processing. Got back commands org.apache.hadoop.hdfs.server.protocol.FinalizeCommand@7c7470f4
2014-11-26 13:08:58,154 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPOfferService.java:processCommandFromActive(618)) - Got finalize command for block pool BP-1847084755-127.0.1.1-1416961177933
2014-11-26 13:08:58,160 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] util.GSet (LightWeightGSet.java:computeCapacity(354)) - Computing capacity for map BlockMap
2014-11-26 13:08:58,160 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] util.GSet (LightWeightGSet.java:computeCapacity(355)) - VM type = 64-bit
2014-11-26 13:08:58,161 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] util.GSet (LightWeightGSet.java:computeCapacity(356)) - 0.5% max memory 889 MB = 4.4 MB
2014-11-26 13:08:58,161 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] util.GSet (LightWeightGSet.java:computeCapacity(361)) - capacity = 2^19 = 524288 entries
2014-11-26 13:08:58,162 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:<init>(186)) - Periodic Block Verification Scanner initialized with interval 504 hours for block pool BP-1847084755-127.0.1.1-1416961177933
2014-11-26 13:08:58,166 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataBlockScanner (DataBlockScanner.java:addBlockPool(264)) - Added bpid=BP-1847084755-127.0.1.1-1416961177933 to blockPoolScannerMap, new size=1
2014-11-26 13:09:01,671 INFO [DataXceiver for client at /192.168.1.12:47651 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835150_94326]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835150_94326 src: /192.168.1.12:47651 dest: /192.168.1.15:50010
2014-11-26 13:09:01,671 INFO [DataXceiver for client at /192.168.1.12:47650 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835151_94327]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835151_94327 src: /192.168.1.12:47650 dest: /192.168.1.15:50010
2014-11-26 13:09:01,672 INFO [DataXceiver for client at /192.168.1.11:38074 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835153_94329]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835153_94329 src: /192.168.1.11:38074 dest: /192.168.1.15:50010
2014-11-26 13:09:01,675 INFO [DataXceiver for client at /192.168.1.11:38075 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835152_94328]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835152_94328 src: /192.168.1.11:38075 dest: /192.168.1.15:50010
2014-11-26 13:09:01,702 INFO [DataXceiver for client at /192.168.1.12:47651 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835150_94326]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835150_94326 src: /192.168.1.12:47651 dest: /192.168.1.15:50010 of size 923
2014-11-26 13:09:01,702 INFO [DataXceiver for client at /192.168.1.12:47650 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835151_94327]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835151_94327 src: /192.168.1.12:47650 dest: /192.168.1.15:50010 of size 1367
2014-11-26 13:09:01,703 INFO [DataXceiver for client at /192.168.1.11:38074 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835153_94329]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835153_94329 src: /192.168.1.11:38074 dest: /192.168.1.15:50010 of size 1902
2014-11-26 13:09:01,702 INFO [DataXceiver for client at /192.168.1.11:38075 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835152_94328]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835152_94328 src: /192.168.1.11:38075 dest: /192.168.1.15:50010 of size 1980
2014-11-26 13:09:03,055 INFO [Thread-26] datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(430)) - Verification succeeded for BP-1847084755-127.0.1.1-1416961177933:blk_1073835151_94327
2014-11-26 13:09:03,057 INFO [Thread-26] datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(430)) - Verification succeeded for BP-1847084755-127.0.1.1-1416961177933:blk_1073835153_94329
2014-11-26 13:09:03,059 INFO [Thread-26] datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(430)) - Verification succeeded for BP-1847084755-127.0.1.1-1416961177933:blk_1073835150_94326
2014-11-26 13:09:03,060 INFO [Thread-26] datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(430)) - Verification succeeded for BP-1847084755-127.0.1.1-1416961177933:blk_1073835152_94328
2014-11-26 13:09:04,479 INFO [DataXceiver for client at /192.168.1.12:47652 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835155_94331]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835155_94331 src: /192.168.1.12:47652 dest: /192.168.1.15:50010
2014-11-26 13:09:04,479 INFO [DataXceiver for client at /192.168.1.12:47653 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835154_94330]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835154_94330 src: /192.168.1.12:47653 dest: /192.168.1.15:50010
2014-11-26 13:09:04,482 INFO [DataXceiver for client at /192.168.1.12:47653 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835154_94330]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835154_94330 src: /192.168.1.12:47653 dest: /192.168.1.15:50010 of size 776
2014-11-26 13:09:04,484 INFO [DataXceiver for client at /192.168.1.12:47652 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835155_94331]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835155_94331 src: /192.168.1.12:47652 dest: /192.168.1.15:50010 of size 381
2014-11-26 13:09:04,617 INFO [DataXceiver for client at /192.168.1.11:38076 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835157_94333]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835157_94333 src: /192.168.1.11:38076 dest: /192.168.1.15:50010
2014-11-26 13:09:04,619 INFO [DataXceiver for client at /192.168.1.11:38077 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835156_94332]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835156_94332 src: /192.168.1.11:38077 dest: /192.168.1.15:50010
2014-11-26 13:09:04,621 INFO [DataXceiver for client at /192.168.1.11:38076 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835157_94333]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835157_94333 src: /192.168.1.11:38076 dest: /192.168.1.15:50010 of size 555
2014-11-26 13:09:04,624 INFO [DataXceiver for client at /192.168.1.11:38077 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835156_94332]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835156_94332 src: /192.168.1.11:38077 dest: /192.168.1.15:50010 of size 7514
2014-11-26 13:09:07,480 INFO [DataXceiver for client at /192.168.1.12:47654 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339 src: /192.168.1.12:47654 dest: /192.168.1.15:50010
2014-11-26 13:09:07,486 INFO [DataXceiver for client at /192.168.1.12:47655 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835162_94338]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835162_94338 src: /192.168.1.12:47655 dest: /192.168.1.15:50010
2014-11-26 13:09:07,488 INFO [DataXceiver for client at /192.168.1.12:47654 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339 src: /192.168.1.12:47654 dest: /192.168.1.15:50010 of size 6221
2014-11-26 13:09:07,496 INFO [DataXceiver for client at /192.168.1.12:47655 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835162_94338]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835162_94338 src: /192.168.1.12:47655 dest: /192.168.1.15:50010 of size 7424
2014-11-26 13:09:07,621 INFO [DataXceiver for client at /192.168.1.11:38079 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835158_94334]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835158_94334 src: /192.168.1.11:38079 dest: /192.168.1.15:50010
2014-11-26 13:09:07,621 INFO [DataXceiver for client at /192.168.1.11:38078 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835159_94335]] datanode.DataNode (DataXceiver.java:writeBlock(600)) - Receiving BP-1847084755-127.0.1.1-1416961177933:blk_1073835159_94335 src: /192.168.1.11:38078 dest: /192.168.1.15:50010
2014-11-26 13:09:07,632 INFO [DataXceiver for client at /192.168.1.11:38078 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835159_94335]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835159_94335 src: /192.168.1.11:38078 dest: /192.168.1.15:50010 of size 4788
2014-11-26 13:09:07,633 INFO [DataXceiver for client at /192.168.1.11:38079 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835158_94334]] datanode.DataNode (DataXceiver.java:writeBlock(766)) - Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835158_94334 src: /192.168.1.11:38079 dest: /192.168.1.15:50010 of size 9118
... etc ...
Notice how the automatic re-balancing occurs. It looks like this:
2014-11-26 13:09:07,488 INFO [DataXceiver for client at /192.168.1.12:47654 [Receiving block BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339]] datanode.DataNode (DataXceiver.java:writeBlock(766)) -
Received BP-1847084755-127.0.1.1-1416961177933:blk_1073835163_94339
src: /192.168.1.12:47654
dest: /192.168.1.15:50010
of size 6221
Likewise, the summary in the web interface should now show the extra node.
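If you prefer the command line, the standard HDFS admin report (run from the master) should list the new node among the live datanodes:
hdfs dfsadmin -report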
For more information on how DataNodes join the cluster, read up on the heartbeat mechanism in the Hadoop Architecture. Brad Hedlund has one of the best posts on this topic I've come across.
Troubleshooting
Node is expected to serve this storage
If you clone a VM and don't clean out the $HADOOP_DATA_DIR, you're likely to get this error:
2014-11-26 12:56:16,660 WARN [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:offerService(731)) - Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid 9b460762-eba0-45f2-b0b4-f00e11572ed6) service to master/192.168.1.10:9000 is shutting down
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Data node DatanodeRegistration(192.168.1.15, datanodeUuid=9b460762-eba0-45f2-b0b4-f00e11572ed6, infoPort=50075, ipcPort=50020, storageInfo=lv=-55;cid=CID-822421ef-e4d6-49d7-9b25-e0b81ae32b7d;nsid=1137176454;c=0) is attempting to report storage ID 9b460762-eba0-45f2-b0b4-f00e11572ed6. Node 192.168.1.12:50010 is expected to serve this storage.
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanode(DatanodeManager.java:475)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1702)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1049)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:152)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28061)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy12.blockReport(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy12.blockReport(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReport(DatanodeProtocolClientSideTranslatorPB.java:214)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.blockReport(BPServiceActor.java:476)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:699)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:834)
at java.lang.Thread.run(Thread.java:745)
2014-11-26 12:56:16,664 WARN [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BPServiceActor.java:run(845)) - Ending block pool service for: Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid 9b460762-eba0-45f2-b0b4-f00e11572ed6) service to master/192.168.1.10:9000
2014-11-26 12:56:16,766 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataNode (BlockPoolManager.java:remove(103)) - Removed Block pool BP-1847084755-127.0.1.1-1416961177933 (Datanode Uuid 9b460762-eba0-45f2-b0b4-f00e11572ed6)
2014-11-26 12:56:16,766 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] datanode.DataBlockScanner (DataBlockScanner.java:removeBlockPool(273)) - Removed bpid=BP-1847084755-127.0.1.1-1416961177933 from blockPoolScannerMap
2014-11-26 12:56:16,766 INFO [DataNode: [[[DISK]file:/home/craigtrim/HADOOP_DATA_DIR/data/]] heartbeating to master/192.168.1.10:9000] impl.FsDatasetImpl (FsDatasetImpl.java:shutdownBlockPool(1755)) - Removing block pool BP-1847084755-127.0.1.1-1416961177933
2014-11-26 12:56:18,768 WARN [main] datanode.DataNode (DataNode.java:secureMain(2019)) - Exiting Datanode
2014-11-26 12:56:18,770 INFO [main] util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 0
2014-11-26 12:56:18,772 INFO [Thread-1] datanode.DataNode (StringUtils.java:run(645)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at CVB/127.0.1.1
************************************************************/
The solution is to clear out the directory as mentioned above, and restart the DataNode.
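The commands are the same ones shown earlier; on the cloned DataNode:
sudo rm -rf $HADOOP_DATA_DIR
mkdir -p $HADOOP_DATA_DIR/data
mkdir -p $HADOOP_DATA_DIR/name
mkdir -p $HADOOP_DATA_DIR/local
sudo chmod 755 $HADOOP_DATA_DIR
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode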
References
- Inter DataNode Communication
- https://wiki.apache.org/hadoop/DataNode
- http://www.fromdev.com/2010/12/interview-questions-hadoop-mapreduce.html
- The NameNode orchestrates the replication of data blocks from one DataNode to another. The replication data transfer happens directly between DataNodes and the data never passes through the NameNode.
- http://stackoverflow.com/questions/9790784/how-are-the-datanodes-communicates-with-each-other-in-hadoop
- "the main communications between datanodes is data transfers, replicating blocks from one datanode to one or more other nodes"
- http://opensource.com/life/14/8/intro-apache-hadoop-big-data
- Data nodes can talk to each other to rebalance data, to move copies around, and to keep the replication of data high
- The Heartbeat Mechanism
- http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/
- Data Nodes send heartbeats to the Name Node every 3 seconds via a TCP handshake, using the same port number defined for the Name Node daemon, usually TCP 9000.