CDH configuration (ZooKeeper, HADOOP, Hive)
Step 1: ZOOKEEPER (multiple machines, clocks synchronized)
- Create the cdh-5.3.6 directory under /opt:
sudo mkdir /opt/cdh-5.3.6
- Change its owner and group:
sudo chown beifeng:beifeng /opt/*
- Upload the tarballs to /opt/software: zookeeper-3.4.5-cdh5.3.6.tar.gz, hadoop-2.5.0-cdh5.3.6.tar.gz, hive-0.13.1-cdh5.3.6.tar.gz, sqoop-1.4.5-cdh5.3.6.tar.gz
Extract them:
tar -zxf hadoop-2.5.0-cdh5.3.6.tar.gz -C /opt/cdh-5.3.6/
tar -zxf hive-0.13.1-cdh5.3.6.tar.gz -C /opt/cdh-5.3.6/
tar -zxf zookeeper-3.4.5-cdh5.3.6.tar.gz -C /opt/cdh-5.3.6/
- Configure pseudo-distributed ZooKeeper
Earlier post covering the fully distributed setup:
http://blog.youkuaiyun.com/haoyuexihuai/article/details/53080133
- In the conf directory, set the data path in zoo.cfg (see the sketch below): dataDir=/opt/cdh-5.3.6/zookeeper-3.4.5-cdh5.3.6/datas
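A fresh tarball has no zoo.cfg; the usual approach (a sketch, assuming the stock zoo_sample.cfg that ships with ZooKeeper) is to copy the sample and create the data directory before editing:
cd /opt/cdh-5.3.6/zookeeper-3.4.5-cdh5.3.6
cp conf/zoo_sample.cfg conf/zoo.cfg
mkdir -p datas
# then edit conf/zoo.cfg and set dataDir as above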
- Start and check ZooKeeper, as shown below
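For a single pseudo-distributed node, the standard zkServer.sh commands are:
bin/zkServer.sh start
bin/zkServer.sh status   # should report Mode: standalone
bin/zkCli.sh             # optional: connect a CLI client to verify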
Step 2: HADOOP
HDFS
- hadoop-env.sh
Set export JAVA_HOME=/opt/modules/jdk1.7.0_67
core-site.xml
<!-- Location of the NameNode (master) and its RPC port. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-senior01.ibeifeng.com:8020</value>
</property>
<!-- Override the default hadoop.tmp.dir temporary directory. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/data</value>
</property>
hdfs-site.xml
<!-- Replication factor. -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<!-- Disable HDFS permission checking; this is a test environment. -->
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
After configuring, format the NameNode:
bin/hdfs namenode -format
Start the daemons:
sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode
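Both daemons should show up in jps, and the NameNode web UI listens on port 50070 by default in Hadoop 2.x (the PIDs below are illustrative):
jps
# 3042 NameNode
# 3135 DataNode
# then browse to http://hadoop-senior01.ibeifeng.com:50070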
HDFS operations: create the data warehouse directories
- Create the two directories /tmp and /user/hive/warehouse:
bin/hdfs dfs -mkdir -p /tmp
bin/hdfs dfs -mkdir -p /user/hive/warehouse
- Grant group write permission:
bin/hdfs dfs -chmod g+w /tmp
bin/hdfs dfs -chmod g+w /user/hive/warehouse
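To confirm the directories and permissions took effect, list them (expected layout, not captured output):
bin/hdfs dfs -ls /
bin/hdfs dfs -ls -R /user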
- Configure the native libraries (a verification sketch follows)
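One way to check the result is hadoop checknative, which reports whether libhadoop and the compression codecs were loaded (the paths below are what I'd expect on this layout, not guaranteed):
bin/hadoop checknative
# hadoop: true /opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6/lib/native/libhadoop.so
# zlib:   true /lib64/libz.so.1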
YARN
- yarn-env.sh
Set export JAVA_HOME=/opt/modules/jdk1.7.0_67
yarn-site.xml
<!-- Host that runs the ResourceManager. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop-senior01.ibeifeng.com</value>
</property>
<!-- How reducers fetch map output (the shuffle service). -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<!-- How long aggregated logs are retained on HDFS, in seconds. -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>106800</value>
</property>
Start the daemons:
sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager
- Open the web UI: http://hadoop-senior01.ibeifeng.com:8088
MAPREDUCE
- mapred-env.sh
Set: export JAVA_HOME=/opt/modules/jdk1.7.0_67
Rename mapred-site.xml.template to mapred-site.xml, then edit it:
<!-- Run MapReduce on YARN. -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<!-- JobHistoryServer addresses. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>hadoop-senior01.ibeifeng.com:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hadoop-senior01.ibeifeng.com:19888</value>
</property>
Start the daemon:
sbin/mr-jobhistory-daemon.sh start historyserver
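With HDFS, YARN, and the history server all up, a quick end-to-end smoke test is to run the bundled examples jar (the path below is the usual location inside the CDH tarball; adjust if yours differs):
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar pi 1 10
# the finished job should then appear at http://hadoop-senior01.ibeifeng.com:19888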
Step 3: HIVE configuration
Blog post: http://blog.youkuaiyun.com/haoyuexihuai/article/details/53290274
Rename hive-env.sh.template to drop the .template suffix
HADOOP_HOME=/opt/cdh-5.3.6/hadoop-2.5.0-cdh5.3.6
export HIVE_CONF_DIR=/opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/conf
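The renames amount to the following (a sketch; Hive 0.13 ships no hive-site.xml out of the box, so it is typically created from hive-default.xml.template or written from scratch):
cd /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/conf
mv hive-env.sh.template hive-env.sh
cp hive-default.xml.template hive-site.xml   # then reduce it to the properties below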
hive-site.xml
<!-- JDBC URL: MySQL host, port, and metastore database name. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://hadoop-senior01.ibeifeng.com:3306/cdhmetastore?createDatabaseIfNotExist=true</value>
</property>
<!-- MySQL JDBC driver class. -->
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<!-- MySQL username and password. -->
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
<!-- Show the current database and column headers in the CLI. -->
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
<!-- Run Hive as a service (HiveServer2). -->
<property>
  <name>hive.server2.thrift.port</name>
  <value>10000</value>
</property>
<property>
  <name>hive.server2.thrift.bind.host</name>
  <value>hadoop-senior01.ibeifeng.com</value>
</property>
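This assumes MySQL is already running on hadoop-senior01 and that root/123456 may connect from that host; if not, grant access first (credentials here mirror hive-site.xml above, not a recommendation):
mysql -u root -p123456 -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'hadoop-senior01.ibeifeng.com' IDENTIFIED BY '123456'; FLUSH PRIVILEGES;"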
Copy the MySQL JDBC driver into Hive's lib directory:
cp mysql-connector-java-5.1.27-bin.jar /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/lib/
Configure the log directory in hive-log4j.properties
Create a logs directory under the Hive home first, then set:
hive.root.logger=INFO,DRFA
hive.log.dir=/opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs
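With everything in place, smoke-test the CLI and HiveServer2 (standard Hive 0.13 entry points; the beeline URL assumes the host/port set in hive-site.xml above):
bin/hive                 # prompt should show the current database, per hive.cli.print.current.db
bin/hiveserver2 &        # starts the Thrift service on port 10000
bin/beeline -u jdbc:hive2://hadoop-senior01.ibeifeng.com:10000 -n beifeng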