Hadoop Installation in UNIX

Prerequisites for Hadoop Installation:


Hadoop requires Java 1.5 or above to run, but Java 1.6 is recommended.
So the first thing you need on your machine is Java 1.6.

Step 1. Check whether Java 1.6 is installed.


Run the command $ java -version and press Enter.

Step 2. If it is not installed, you can install it with the command below:

$ sudo apt-get install openjdk-6-jre

 


After installing Java, check that it is installed properly by running the $ java -version command again.
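
The output should look roughly like the following (this is just an illustrative sample; the exact version and build strings will vary with your system):

java version "1.6.0_27"
OpenJDK Runtime Environment (IcedTea6 ...)
OpenJDK 64-Bit Server VM (build ..., mixed mode)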

 


 

If output similar to the above appears, Java is installed properly on your system.

 

You can also check the installation directory under /usr/lib/jvm/
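
For example (the exact directory name is system-dependent; with the openjdk-6-jre package it is typically something like java-6-openjdk):

$ ls /usr/lib/jvm/

Make a note of the directory name here; you will need it later when setting JAVA_HOME.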

 

 

Step 3. Add a dedicated system user:

 

 
It helps to separate the Hadoop installation from other software applications and from other user accounts running on the single node.


So, to create a separate group and user, you can use the commands below:

$ sudo addgroup hadoop

where hadoop is the group name.



Add a user to the hadoop group:


$ sudo adduser --ingroup hadoop hduser 


 

This will add the user hduser, with primary group hadoop, to your local machine.
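
You can verify the new account with the id command (the numeric uid/gid values shown here are only illustrative):

$ id hduser
uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)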


Step 4. Configure SSH (Secure Shell) access to localhost:


Hadoop requires SSH access to manage its nodes, so for this single-node installation of Hadoop we need to configure SSH access to localhost.

We will be creating this access for the hduser we created in the previous step. First, install the OpenSSH server:

$ sudo apt-get install openssh-server
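
If you want to confirm that the SSH server is running after installation (assuming the standard Ubuntu service name), you can check:

$ sudo service ssh status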



Step 5. Switch to the user hduser using the command 'su':
 
$ su - hduser



Step 6. After the SSH server installation, we have to generate an SSH key for hduser:


  $ ssh-keygen -t rsa -P ""
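
Here -P "" creates the key with an empty passphrase, so Hadoop can open SSH connections without prompting for one. The output will look roughly like this (file paths and fingerprint will vary):

Generating public/private rsa key pair.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.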
 




Step 7. Now that the key pair is generated, we have to enable SSH access to the local machine with this newly created key. For that, run the command below.

hduser@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
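
Depending on your system's default permissions, you may also need to restrict access to this file, since the SSH daemon refuses authorized_keys files that are writable by group or others:

$ chmod 600 $HOME/.ssh/authorized_keys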
 
 
 
 
Step 8. Finally, you can test the SSH setup using the command
$ ssh localhost
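
On the very first connection, SSH will ask you to confirm the host's fingerprint; type yes to add localhost to the list of known hosts (the fingerprint itself differs on every machine):

The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is ...
Are you sure you want to continue connecting (yes/no)? yes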
 
 

Hadoop Installation:

 

Download & Extract Hadoop :

 
So if you have all the above prerequisites on your machine, you are good to go with the Hadoop installation.
First download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/core and extract it at any location; I kept it at /usr/local. You also need to change the owner of all files to hduser and the group to hadoop.

$ cd /usr/local
$ sudo tar xzf hadoop-1.2.1.tar.gz
$ sudo mv hadoop-1.2.1 hadoop
$ sudo chown -R hduser:hadoop hadoop

Update $HOME/.bashrc

Add the following lines at the end of the $HOME/.bashrc file of user hduser. If you are using a shell other than bash, update the appropriate configuration file instead.
# Hadoop installation directory
export HADOOP_HOME=/usr/local/hadoop

# Java installation directory (adjust to the directory name you found under /usr/lib/jvm/)
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk

# Handy aliases for HDFS commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# Preview the first 1000 lines of an LZO-compressed file stored in HDFS
lzohead () {
    hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}

# Put the Hadoop binaries on the PATH
export PATH=$PATH:$HADOOP_HOME/bin
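
Reload the file so that the new settings take effect in your current shell:

$ source $HOME/.bashrc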

 

Configuration File Setup


By now we are almost done with the Hadoop installation. What remains is to change a few properties in the configuration files provided in Hadoop's conf folder.
But before that we have to create a directory where our single-node cluster will store its data; this is where HDFS will keep its files.

So let's create the directory and set the required ownership and permissions.

$ sudo mkdir /tmp/hadoop_data
$ sudo chown hduser:hadoop /tmp/hadoop_data
$ sudo chmod 777 /tmp/hadoop_data

Now let's change the required configuration files.

Note: you will find all these configuration files inside the conf directory of your Hadoop installation. In my case that is /usr/local/hadoop/conf.

hadoop-env.sh

Open the hadoop-env.sh file and change the only environment variable required for a local machine installation: JAVA_HOME. Just uncomment the line below and set JAVA_HOME to your JDK/JRE directory (the same path you used in .bashrc above).

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk

core-site.xml

In between <configuration> ... </configuration> put the below code:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop_data</value>
  <description>Base directory for Hadoop's local data.</description>
</property>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>URI of the default file system (the HDFS NameNode).</description>
</property>

mapred-site.xml

Again, put the code below between <configuration> ... </configuration>:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce JobTracker runs at.</description>
</property>



hdfs-site.xml

Finally, put the code below between <configuration> ... </configuration>:

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication; 1 is enough for a single node.</description>
</property>


Formatting and Starting the Single-Node Cluster

So if you have succeeded this far, you are done with the installation part. Now we just have to format the NameNode and start the cluster.

hduser@ubuntu:~$ /usr/local/hadoop/bin/hadoop namenode -format
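
If formatting succeeds, the log output should end with lines roughly like these (timestamps, hostnames and paths will differ on your machine):

INFO common.Storage: Storage directory /tmp/hadoop_data/dfs/name has been successfully formatted.
INFO namenode.NameNode: SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1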

 

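After formatting, start the single-node cluster and check that all five Hadoop daemons have come up. jps ships with the JDK, and the process IDs in its output will differ:

hduser@ubuntu:~$ /usr/local/hadoop/bin/start-all.sh
hduser@ubuntu:~$ jps

jps should list NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. Once they are up, you can try a quick HDFS command, for example the fs alias defined in .bashrc earlier:

$ fs -ls /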


