INSTALLING HADOOP, HIVE, DERBY ON CentOS 6

Please subscribe to my site www.jamesjara.com to get more tutorials.

INSTALLING HADOOP ON CentOS 6
INSTALLING HIVE ON CentOS 6
INSTALLING DERBY ON CentOS 6
hadoop-0.20.203.0rc1

This guide covers the installation of the Hadoop ecosystem (Hadoop, Hive and Derby).
It is long, so please follow it step by step.

====INSTALLATION=====

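1. Installing Java
    #sun-java6-jdk is the Debian/Ubuntu package name; the CentOS 6 repos provide OpenJDK 6 under the name below
    yum install java-1.6.0-openjdk-devel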
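
Before moving on it can help to confirm that the java command really ended up on the PATH. This is only a minimal sketch; require_cmd is a hypothetical helper name, not part of any package.

```shell
# Hypothetical helper: succeed only if the given command is on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found, check the installation" >&2
        return 1
    fi
}

# Example:
#   require_cmd java && java -version
```

If the check fails right after the install, opening a new shell is usually enough for the PATH to pick up the new binaries.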

2. Adding a dedicated user for Hadoop
These commands, run as root, create the user hdoopuser and the group hdoopgroup on your local machine and add the user to the group.
    /usr/sbin/useradd hdoopuser
    groupadd hdoopgroup
    usermod -a -G hdoopgroup hdoopuser
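
To confirm that the membership was applied, the group list printed by id -nG can be checked. A small sketch, assuming the user and group names from this step; has_group is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if group $1 appears in the
# space-separated group list $2 (the format printed by `id -nG`).
has_group() {
    for g in $2; do
        if [ "$g" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Example:
#   has_group hdoopgroup "$(id -nG hdoopuser)" && echo "membership ok"
```

The comparison is done word by word so that a name like hdoop does not match hdoopgroup by accident.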

3. Configuring SSH
    su - hdoopuser        #log in as hdoopuser
    ssh-keygen -t rsa -P ""    #generate a key pair with an empty passphrase
    cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys    #authorize the new key for logins to this machine
    chmod 0600 $HOME/.ssh/authorized_keys    #make the file readable and writable by its owner only
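
The Hadoop start scripts log in to localhost over SSH, so it is worth checking the key file before going further. A minimal sketch follows; is_private_file is a hypothetical helper name, and it simply compares the permission string shown by ls.

```shell
# Hypothetical helper: succeed if the file is a regular file with mode 0600
# (read/write for the owner, nothing for group and others).
is_private_file() {
    [ "$(ls -ld "$1" 2>/dev/null | cut -c1-10)" = "-rw-------" ]
}

# Example, run as hdoopuser:
#   is_private_file "$HOME/.ssh/authorized_keys" || chmod 0600 "$HOME/.ssh/authorized_keys"
#   ssh -o BatchMode=yes localhost true && echo "passwordless login works"
```

BatchMode=yes makes ssh fail instead of prompting, so the second command is only meaningful after the host key of localhost has been accepted once with a normal ssh localhost.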

4. Disabling IPv6
    #set NETWORKING_IPV6, not NETWORKING: NETWORKING=no would switch off networking altogether
    sed -i 's/^\(NETWORKING_IPV6\s*=\s*\).*$/\1no/' /etc/sysconfig/network
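
The key that controls IPv6 in /etc/sysconfig/network is NETWORKING_IPV6, and a search-and-replace with sed changes nothing when that key is not in the file yet. The sketch below replaces the line if it exists and appends it otherwise; set_sysconfig is a hypothetical helper name and it assumes simple KEY=VALUE lines without spaces around the equals sign.

```shell
# Hypothetical helper: set KEY=VALUE in a sysconfig-style file,
# replacing an existing KEY= line or appending a new one.
set_sysconfig() {
    key=$1
    value=$2
    file=$3
    if grep -q "^${key}=" "$file" 2>/dev/null; then
        sed -i "s/^${key}=.*/${key}=${value}/" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example (as root):
#   set_sysconfig NETWORKING_IPV6 no /etc/sysconfig/network
```

Because the pattern includes the equals sign, a NETWORKING=yes line is left untouched when the key is NETWORKING_IPV6.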

5. Installation/configuration/startup of Hadoop
    mkdir /hadoop
    chown -R hdoopuser /hadoop
    cd /hadoop/
    wget http://mirrors.abdicar.com/Apache-HTTP-Server//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz
    tar -xvzf hadoop-0.20.203.0rc1.tar.gz
    ln -s /hadoop/hadoop-0.20.203.0rc1/ /hadoop/hadoop
    cd /hadoop/hadoop
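
One detail about the version-independent symlink: if ln -s is run a second time against a link that already points to a directory, the new link is created inside the old target instead of replacing it. A small sketch that can be re-run safely, assuming GNU ln; link_current is a hypothetical helper name.

```shell
# Hypothetical helper: point the symlink $2 at the directory $1,
# replacing the link if it already exists.
link_current() {
    # -n treats an existing symlink to a directory as a plain file,
    # so it is replaced rather than followed; -f removes the old link.
    ln -sfn "$1" "$2"
}

# Example:
#   link_current /hadoop/hadoop-0.20.203.0rc1 /hadoop/hadoop
```

This makes a later upgrade a matter of unpacking the new release next to the old one and re-pointing the link.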

    #basic config
    1)
    vim conf/core-site.xml
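        #Add the following inside the <configuration> tag
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000/</value>
        </property>

        <property>
          <name>dfs.permissions</name>
          <value>false</value>
        </property>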

    2)
    vim conf/hdfs-site.xml
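        #Add the following inside the <configuration> tag
        <property>
          <name>dfs.name.dir</name>
          <value>/hadoop/hdfs/name</value>
        </property>

        <property>
          <name>dfs.data.dir</name>
          <value>/hadoop/hdfs/data</value>
        </property>

        <property>
          <name>dfs.replication</name>
          <value>2</value>
        </property>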

    3)
    vim conf/mapred-site.xml
        #Add the following inside the <configuration> tag
        <property>
          <name>mapred.job.tracker</name>
          <value>localhost:9001</value>
        </property>

    4)
    vim conf/hadoop-env.sh
        #set these two variables; JAVA_HOME must point to your own Java installation
        export JAVA_HOME=/opt/jre/
        export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
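
Hadoop's *-site.xml files store every setting as a property element with a name and a value, all wrapped in one configuration element. Instead of editing each file by hand, the files can be generated. The following is just a sketch: hadoop_property and write_site_xml are hypothetical helper names, and the values are written as given, without escaping XML special characters.

```shell
# Hypothetical helper: print one Hadoop <property> element.
hadoop_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

# Hypothetical helper: write a complete site file.
# $1 is the output file, the remaining arguments are name/value pairs.
write_site_xml() {
    out=$1
    shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ $# -ge 2 ]; do
            hadoop_property "$1" "$2"
            shift 2
        done
        echo '</configuration>'
    } > "$out"
}

# Example, run from /hadoop/hadoop:
#   write_site_xml conf/mapred-site.xml mapred.job.tracker localhost:9001
```

Note that this overwrites the target file, so it only fits a fresh install where the site files are still the empty defaults.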
    5)
    Format the namenode
        su - hdoopuser
        cd /hadoop/hadoop
        bin/hadoop namenode -format
    6) Start Hadoop
        bin/start-all.sh
        Notes: HTTP consoles of Hadoop
            http://localhost:50030/ for the jobtracker
            http://localhost:50070/ for the namenode
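
After start-all.sh returns, the five Hadoop daemons of this release (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) should each show up as a Java process. A quick way to spot a missing one is to filter the output of jps, which ships with the JDK (a plain JRE does not include it). This is a sketch; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper: read `jps` output on stdin ("<pid> <name>" per line)
# and print the name of every expected Hadoop daemon that is not listed.
missing_daemons() {
    jps_output=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second column exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}

# Example, run as hdoopuser:
#   jps | missing_daemons
```

An empty result means all five are running; any name printed is a daemon whose log under /hadoop/hadoop/logs is worth reading.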

6. Installation/configuration/startup of Hive/Derby
    cd /hadoop
    wget http://mirrors.ucr.ac.cr/apache//hive/stable/hive-0.8.1-bin.tar.gz
    tar -xvzf hive-0.8.1-bin.tar.gz
    ln -s /hadoop/hive-0.8.1-bin/ /hadoop/hive
    export HADOOP_HOME=/hadoop/hadoop/
    cd /hadoop/hive
    mv conf/hive-default.xml.template conf/hive-site.xml
    #test hive
    bin/hive
        > show tables;
    #installing the Derby metastore
    cd /hadoop
    wget http://archive.apache.org/dist/db/derby/db-derby-10.4.2.0/db-derby-10.4.2.0-bin.tar.gz
    tar -xzf db-derby-10.4.2.0-bin.tar.gz
    ln -s db-derby-10.4.2.0-bin derby
    mkdir derby/data
    export DERBY_INSTALL=/hadoop/derby/
    export DERBY_HOME=/hadoop/derby/
    export HADOOP=/hadoop/hadoop/bin/hadoop  

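    vim /hadoop/hadoop/bin/start-dfs.sh
    #add the next 2 lines at the end of start-dfs.sh
        cd /hadoop/derby/data
        nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &

    vim /hadoop/hadoop/bin/start-all.sh
    #add the next 2 lines at the end of start-all.sh
    #note: start-all.sh itself calls start-dfs.sh, so with both edits Derby is launched twice and the second launch fails because port 1527 is already taken; adding the lines to only one of the two scripts avoids this
        cd /hadoop/derby/data
        nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &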
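
Because the Derby server is started in the background, it may not be accepting connections yet when the next command runs. A small retry loop can wait for it. This is a sketch; wait_until is a hypothetical helper name, and the usage line assumes an nc build that supports the -z option.

```shell
# Hypothetical helper: run a command up to $1 times, one second apart,
# and succeed as soon as the command succeeds.
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Example: wait up to 30 seconds for Derby's network port (1527)
#   wait_until 30 nc -z localhost 1527 && echo "Derby is listening"
```

The same helper works for any other check in this guide, for example waiting for a web console to answer before opening it.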

    #HIVE CONF
    vim /hadoop/hive/conf/hive-site.xml    #point the Hive metastore at the Derby server
    #search for "javax.jdo.option.ConnectionURL" and edit it like the following
        <property>
          <name>javax.jdo.option.ConnectionURL</name>
          <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
          <description>JDBC connect string for a JDBC metastore</description>
        </property>

    #HTTP CONSOLE OF HIVE
    /hadoop/hive/bin/hive --service hwi &
        URL: http://localhost:9999/hwi/

    #create new file
    vim /hadoop/hive/conf/jpox.properties
    #add the following
        javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
        org.jpox.autoCreateSchema=false
        org.jpox.validateTables=false
        org.jpox.validateColumns=false
        org.jpox.validateConstraints=false
        org.jpox.storeManagerType=rdbms
        org.jpox.autoCreateSchema=true
        org.jpox.autoStartMechanismMode=checked
        org.jpox.transactionIsolation=read_committed
        javax.jdo.option.DetachAllOnCommit=true
        javax.jdo.option.NontransactionalRead=true
        javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
        javax.jdo.option.ConnectionURL=jdbc:derby://localhost:1527/metastore_db;create=true
        javax.jdo.option.ConnectionUserName=APP
        javax.jdo.option.ConnectionPassword=mine
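
A hand-typed properties file like this one is easy to get subtly wrong, for example by defining the same key twice; java.util.Properties keeps only the last value when a key repeats. The sketch below lists repeated keys so they can be reviewed; duplicate_keys is a hypothetical helper name and it assumes plain key=value lines.

```shell
# Hypothetical helper: print every key that is defined more than once
# in a .properties file (comment lines and blank lines are ignored).
duplicate_keys() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" |
        cut -d= -f1 |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        sort |
        uniq -d
}

# Example:
#   duplicate_keys /hadoop/hive/conf/jpox.properties
```

Only the text before the first equals sign is treated as the key, so values that contain an equals sign themselves, like the JDBC URL, do not confuse the check.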
    #now copy the Derby client jars to the Hive lib directory
    cp /hadoop/derby/lib/derbyclient.jar /hadoop/hive/lib
    cp /hadoop/derby/lib/derbytools.jar  /hadoop/hive/lib

    #HTTP CONSOLE OF HIVE
    http://localhost:9999/hwi/ for the Hive web interface

7. START CLUSTER
    /hadoop/hadoop/bin/start-all.sh
    /hadoop/hive/bin/hive --service hwi &   #hwi = Hive web interface
  

8. Keeping the settings for every future login: add them to the system profile
    vi /etc/profile
    #add the following lines
    export JAVA_HOME=/opt/jre/
    export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
    export HADOOP_HOME=/hadoop/hadoop/
    export DERBY_INSTALL=/hadoop/derby/
    export DERBY_HOME=/hadoop/derby/
    export HADOOP=/hadoop/hadoop/bin/hadoop
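
As an alternative to editing /etc/profile directly, CentOS sources every *.sh file found in /etc/profile.d at login, so the exports can live in a file of their own. A minimal sketch, using the same values as this step; write_hadoop_profile and the file name hadoop.sh are assumptions made for the example.

```shell
# Hypothetical helper: write the environment settings of this guide
# to the file given as $1.
write_hadoop_profile() {
    cat > "$1" <<'EOF'
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_HOME=/hadoop/hadoop/
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop
EOF
}

# Example (as root):
#   write_hadoop_profile /etc/profile.d/hadoop.sh
```

Keeping the settings in a separate file makes them easy to remove later and leaves the stock /etc/profile untouched.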


======RUNNING======
PANELS:
http://localhost:50030/ for the jobtracker
http://localhost:50060/ for the tasktracker
http://localhost:50070/ for the namenode
http://localhost:9999/hwi/ for the Hive web interface
