Hadoop 2.7.2 Fully Distributed Cluster Deployment

hadoop   fully distributed  

1. System and Software Environment

master.hdp.imdst.com      NameNode   SecondaryNameNode  
1.slave.hdp.imdst.com     DataNode  
2.slave.hdp.imdst.com     DataNode  
  • In production, run the NameNode and SecondaryNameNode on separate hosts

2. Installation

  • Resolve each hostname to the server's internal IP address (e.g. in /etc/hosts)
  • Create a hadoop user and set its password on all three hosts
useradd hadoop  
echo "123456" |passwd --stdin hadoop  
  • Configure passwordless SSH login on master
su hadoop -c "mkdir -p /home/hadoop/.ssh"  
su hadoop -c "ssh-keygen -t rsa"  #press Enter at each prompt  
su hadoop -c "touch /home/hadoop/.ssh/authorized_keys"  
su hadoop -c "cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys"

scp /home/hadoop/.ssh/id_rsa.pub hadoop@1.slave.hdp.imdst.com:/home/hadoop/.ssh/authorized_keys   #enter the hadoop user's password when prompted; repeat for 2.slave.hdp.imdst.com  
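If passwordless login still prompts for a password after this step, the usual culprit is permissions: sshd silently ignores `authorized_keys` when `~/.ssh` or the file itself is group/world-writable. A minimal sketch (`fix_ssh_perms` is a hypothetical helper, not part of Hadoop or OpenSSH):

```shell
# fix_ssh_perms: hypothetical helper that tightens the permissions
# sshd's StrictModes check requires before it will trust the key file.
fix_ssh_perms() {
    local dir="$1"
    chmod 700 "$dir"                    # .ssh must be private to the user
    chmod 600 "$dir/authorized_keys"    # key file: owner read/write only
    stat -c '%a' "$dir" "$dir/authorized_keys"   # print the resulting modes
}
# On each slave, after the scp above: fix_ssh_perms /home/hadoop/.ssh
```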
  • Install the JDK and configure JAVA_HOME
rpm -Uvh jdk-7u79-linux-x64.rpm  
echo 'export PATH=$PATH:/usr/java/jdk1.7.0_79/bin' >> /etc/profile  #single quotes, so $PATH expands at login rather than being baked in now  
su hadoop -c "echo 'export JAVA_HOME=/usr/java/jdk1.7.0_79' >> /home/hadoop/.bashrc"  
source /etc/profile  
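Before moving on, it is worth confirming that JAVA_HOME actually points at a JDK on each host. A sketch (`check_java_home` is a hypothetical helper):

```shell
# check_java_home: hypothetical sanity check that the given JAVA_HOME
# contains an executable java binary.
check_java_home() {
    local jh="$1"
    if [ -x "$jh/bin/java" ]; then
        echo "OK: $jh"
    else
        echo "MISSING: $jh/bin/java"
        return 1
    fi
}
# check_java_home /usr/java/jdk1.7.0_79
```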
  • Unpack Hadoop for the cluster
mkdir /home/hadoop/src  
tar zxf hadoop-2.7.2.tar.gz -C /home/hadoop/src  
ln -s /home/hadoop/src/hadoop-2.7.2 /home/hadoop/src/hadoop  #so the src/hadoop paths used below resolve  
chown -R hadoop:hadoop /home/hadoop  

3. Edit the configuration files under "$HOME/src/hadoop/etc/hadoop/"

  • core-site.xml
<configuration>  
        <property>  
                <name>fs.defaultFS</name>  
                <value>hdfs://master.hdp.imdst.com:9000</value>  
        </property>  
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/data/hadoop/tmp</value>
        </property>

        <property>
                <name>hadoop.proxyuser.hadoop.groups</name>
                <value>hadoop</value>
                <description>Allow the hadoop superuser to impersonate members of the hadoop group</description>
        </property>

        <property>
                <name>hadoop.proxyuser.hadoop.hosts</name>
                <value>master.hdp.imdst.com,1.slave.hdp.imdst.com,2.slave.hdp.imdst.com</value>
                <description>The hadoop superuser may impersonate users only when connecting from the listed hosts</description>
        </property>

</configuration>  
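A quick sanity check on the finished file saves a confusing failure later, since every daemon reads fs.defaultFS to find the NameNode. A grep-based sketch (`check_defaultfs` is a hypothetical helper; `xmllint`, where available, is the more thorough well-formedness check):

```shell
# check_defaultfs: hypothetical helper that verifies core-site.xml
# carries the expected fs.defaultFS value before it is shipped to slaves.
check_defaultfs() {
    local file="$1" expected="$2"
    if grep -q "<value>$expected</value>" "$file"; then
        echo "fs.defaultFS ok"
    else
        echo "fs.defaultFS missing or wrong"
        return 1
    fi
}
# check_defaultfs /home/hadoop/src/hadoop/etc/hadoop/core-site.xml hdfs://master.hdp.imdst.com:9000
```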
  • hdfs-site.xml
<configuration>  
        <!-- Replication factor; with only two DataNodes it cannot exceed 2 -->
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <!-- SecondaryNameNode (a checkpoint node, not a hot standby) -->
        <property> 
                <name>dfs.namenode.secondary.http-address</name> 
                <value>master.hdp.imdst.com:9001</value> 
        </property> 


        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
</configuration>  
  • mapred-site.xml
<configuration>  
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master.hdp.imdst.com:10020</value>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master.hdp.imdst.com:19888</value>
        </property>

        <property>
                <name>mapred.map.tasks.speculative.execution</name>
                <value>false</value>
        </property>

        <property>
                <name>mapred.reduce.tasks.speculative.execution</name>
                <value>false</value>
        </property>
</configuration>  
  • yarn-site.xml
<configuration>

    <!-- Site specific YARN configuration properties -->
    <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
    </property>
    <property>                                                               
            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
            <name>yarn.resourcemanager.address</name>
            <value>master.hdp.imdst.com:8032</value>
    </property>
    <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>master.hdp.imdst.com:8030</value>
    </property>
    <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>master.hdp.imdst.com:8031</value>
    </property>
    <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>master.hdp.imdst.com:8033</value>
    </property>
    <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>master.hdp.imdst.com:8088</value>
    </property>
</configuration>  
  • slaves: list every DataNode hostname, one per line
1.slave.hdp.imdst.com  
2.slave.hdp.imdst.com  
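Hadoop's slaves.sh reads this file line by line, so the format matters. A sketch that generates it (`write_slaves` is a hypothetical helper):

```shell
# write_slaves: hypothetical helper that writes one hostname per line,
# the format Hadoop's slaves.sh expects.
write_slaves() {
    local file="$1"; shift
    printf '%s\n' "$@" > "$file"
}
# write_slaves /home/hadoop/src/hadoop/etc/hadoop/slaves 1.slave.hdp.imdst.com 2.slave.hdp.imdst.com
```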

4. Deploy the configured Hadoop to all three servers

  • Run on master
su - hadoop  
cd src && tar zcf hadoop-2.7.2.tar.gz hadoop-2.7.2  
scp hadoop-2.7.2.tar.gz hadoop@1.slave.hdp.imdst.com:/home/hadoop/src  
scp hadoop-2.7.2.tar.gz hadoop@2.slave.hdp.imdst.com:/home/hadoop/src
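The copied tarball still has to be unpacked on each slave; a symlink keeps the src/hadoop path used throughout this guide valid there too. A sketch (`unpack_hadoop` is a hypothetical helper; on the slaves the src directory is /home/hadoop/src):

```shell
# unpack_hadoop: hypothetical helper that extracts the tarball and
# symlinks src/hadoop -> hadoop-2.7.2 so version-independent paths work.
unpack_hadoop() {
    local src="$1"
    (cd "$src" && tar zxf hadoop-2.7.2.tar.gz && ln -sfn hadoop-2.7.2 hadoop)
}
# Run it on each slave over the passwordless SSH set up earlier, e.g.:
#   ssh hadoop@1.slave.hdp.imdst.com 'cd /home/hadoop/src && tar zxf hadoop-2.7.2.tar.gz && ln -sfn hadoop-2.7.2 hadoop'
```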

5. Format the NameNode

  • Run on master
su - hadoop  
cd src/hadoop  
bin/hdfs namenode -format  
# "successfully formatted" in the output means the format succeeded
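The format command produces a lot of scrollback, so it is easy to miss a failure. A sketch that checks the captured output instead of eyeballing it (`format_ok` is a hypothetical helper):

```shell
# format_ok: hypothetical helper that greps a saved format log for the
# success marker the NameNode prints.
format_ok() {
    grep -q "successfully formatted" "$1"
}
# bin/hdfs namenode -format 2>&1 | tee /tmp/nn-format.log
# format_ok /tmp/nn-format.log && echo "format succeeded"
```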

6. Start and stop the Hadoop cluster

  • Start (note: `su - hadoop && cd …` on one line would run the cd only after the su shell exits, in the wrong shell)
su - hadoop  
cd src/hadoop  
sbin/start-all.sh  #deprecated in 2.x; sbin/start-dfs.sh followed by sbin/start-yarn.sh also works  
  • Stop
su - hadoop  
cd src/hadoop  
sbin/stop-all.sh
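After starting, `jps` on each host should show the expected daemons. A sketch that checks for them (`check_daemons` is a hypothetical helper; the daemon names are the standard Hadoop 2.x ones):

```shell
# check_daemons: hypothetical helper; reads jps output from stdin and
# verifies every named daemon appears as a whole word.
check_daemons() {    # usage: jps | check_daemons NameNode DataNode ...
    local out d
    out=$(cat)
    for d in "$@"; do
        echo "$out" | grep -qw "$d" || { echo "missing: $d"; return 1; }
    done
    echo "all daemons running"
}
# On master:  jps | check_daemons NameNode SecondaryNameNode ResourceManager
# On slaves:  jps | check_daemons DataNode NodeManager
```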