Just the command lines to get Hadoop 2 installed on Ubuntu. These are all cribbed from the source notes below, and I am preserving them here for my own benefit so I can quickly repeat what I did. Note that many of these instructions are also in the main Hadoop docs from Apache.

Source material

Use Michael Noll's guide for version 1 and ssh: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/


Or this one for Hadoop 2: http://jugnu-life.blogspot.com/2012/05/hadoop-20-install-tutorial-023x.html
And the Apache docs: http://hadoop.apache.org/docs/r2.0.5-alpha/

Create the hadoop user and ssh

sudo apt-get install openssh-server openssh-client

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
su - hduser

If you cannot ssh to localhost without a passphrase, execute the following commands:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Test your SSH setup:

ssh localhost
# Say yes when prompted, then:
exit

Get hadoop all set up

As the hduser, after downloading the tar

tar -xvf hadoop-2.0.5-alpha.tar.gz
ln -s hadoop-2.0.5-alpha hadoop
#edit .bashrc
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_21/
export HADOOP_PREFIX="/home/hduser/hadoop"
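
The later commands (hdfs, hadoop-daemon.sh, yarn-daemon.sh) are typed without a path, so they have to be on PATH. The notes above don't show that line, so this is my assumption of what else .bashrc needs: a minimal sketch that appends the bin and sbin directories of the install.

```shell
# Assumption: Hadoop lives where the symlink above points.
export HADOOP_PREFIX="${HADOOP_PREFIX:-/home/hduser/hadoop}"
# bin holds hdfs/hadoop/yarn; sbin holds the *-daemon.sh and start-*.sh scripts.
export PATH="$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin"
```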



Stolen entirely from JJ, but with the paths changed for my Ubuntu: http://jugnu-life.blogspot.com/2012/05/hadoop-20-install-tutorial-023x.html Please click on his blog.

Log in again so bash picks up the paths above. In Hadoop 2.x, etc/hadoop under the install directory (here ~/hadoop/etc/hadoop) is the default conf directory. We need to modify or create the following property files in that directory.

cd ~
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/dfs/name
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/dfs/data
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/mapred/system
mkdir -p /home/hduser/workspace/hadoop_space/hadoop23/mapred/local

Edit core-site.xml with the following contents

<description>The name of the default file system. Either the literal
string "local" or a host:port for NDFS.</description>
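
Only the description survived in the notes above; the property itself is missing. That description matches Hadoop's fs.default.name property, so a core-site.xml along these lines is a reasonable guess. The hdfs://localhost:9000 value and the write_core_site helper are my assumptions, not part of the original.

```shell
# Hypothetical helper: writes a minimal core-site.xml into the directory
# given as $1, overwriting any existing file there.
write_core_site() {
  cat > "$1/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- Assumed value for a single-node setup; the port is a guess. -->
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```

Usage would be something like write_core_site ~/hadoop/etc/hadoop, after backing up the file that is already there.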

Edit hdfs-site.xml with the following contents

<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.</description>

<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>




The paths, for example

file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/name

are folders on your computer that provide space to store the data and the name edit files.

Each path should be specified as a URI.
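
The property names for hdfs-site.xml did not survive above, only their descriptions. In Hadoop 2.x those descriptions go with dfs.namenode.name.dir and dfs.datanode.data.dir, so here is a sketch that points them at the dfs/name and dfs/data directories created by the mkdir commands earlier. The write_hdfs_site helper and the dfs.replication value of 1 (single node) are my assumptions.

```shell
# Hypothetical helper: writes a minimal hdfs-site.xml into the directory
# given as $1, overwriting any existing file there.
write_hdfs_site() {
  cat > "$1/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/dfs/data</value>
  </property>
  <property>
    <!-- Assumption: one datanode, so one replica per block. -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```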

Create a file mapred-site.xml inside ~/hadoop/etc/hadoop with the following contents





The paths, for example

file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/system

are folders on your computer that provide space to store data.

Each path should be specified as a URI.
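
The mapred-site.xml body is missing from these notes as well. A plausible reconstruction, and only that, tells MapReduce to run on YARN and points the system and local directories at the mapred/system and mapred/local folders created earlier. The write_mapred_site helper and the choice of the Hadoop 2 property names below are my assumptions; the original may have used the older mapred.system.dir and mapred.local.dir names.

```shell
# Hypothetical helper: writes a minimal mapred-site.xml into the directory
# given as $1, overwriting any existing file there.
write_mapred_site() {
  cat > "$1/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Run MapReduce jobs on YARN rather than the local runner. -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.system.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/system</value>
  </property>
  <property>
    <name>mapreduce.cluster.local.dir</name>
    <value>file:/home/hduser/workspace/hadoop_space/hadoop23/mapred/local</value>
  </property>
</configuration>
EOF
}
```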

Edit yarn-site.xml with the following contents
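
The contents of yarn-site.xml did not survive in these notes. For a single-node setup on the 2.0.x alpha line, the usual minimum is to enable the MapReduce shuffle auxiliary service, so here is a sketch under that assumption. The write_yarn_site helper is hypothetical. Note that from Hadoop 2.2 on, the service name is spelled mapreduce_shuffle (and the class key changes to match) instead of mapreduce.shuffle.

```shell
# Hypothetical helper: writes a minimal yarn-site.xml into the directory
# given as $1, overwriting any existing file there.
write_yarn_site() {
  cat > "$1/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <!-- Service name as used by 2.0.x alpha releases (assumption for 2.0.5). -->
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
EOF
}
```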


Edit ~/hadoop/etc/hadoop/hadoop-env.sh to set JAVA_HOME

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_21/

Format the namenode

hdfs namenode -format

Say Yes and let it complete the format

Time to start the daemons

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode

You can also start both of them together with

start-dfs.sh

Start Yarn Daemons

yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager

You can also start all the YARN daemons together with

start-yarn.sh

Time to check if Daemons have started

Enter the command jps. The output should look something like this:

2539 NameNode
2744 NodeManager
3075 Jps
3030 DataNode
2691 ResourceManager
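
If you would rather not eyeball that list, a small check like the following can do it. This is only a sketch: check_daemons is a name I made up, and it assumes the four daemons listed above are the ones you expect on a single node.

```shell
# Reads jps-style output ("<pid> <name>" per line) on stdin and reports
# which of the four daemons from this guide are absent.
check_daemons() {
  out=$(cat)
  status=0
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    # Compare the second column exactly, so SecondaryNameNode
    # does not count as NameNode.
    if ! printf '%s\n' "$out" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "not running: $daemon"
      status=1
    fi
  done
  return "$status"
}
```

Run it as jps | check_daemons; it prints nothing and returns 0 when all four are up.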

Time to launch UI

Open http://localhost:8088 to see the ResourceManager page

