I. Server plan
Host IP: 192.168.1.235  hostname: master  HDFS role: namenode
Big data components: hadoop, hive, hbase, zookeeper, sqoop
Host IP: 192.168.1.234  hostname: slave   HDFS role: datanode
Big data components: hadoop, hbase, zookeeper
Note: the two hosts must be separate, independent machines.
II. Create a new user and grant it root privileges
Create the user hadoop with password hadoop:
adduser hadoop
passwd hadoop
Grant root privileges:
Edit /etc/sudoers, find the line below, and add a matching line for hadoop under the root entry:
## Allow root to run any commands anywhere
root    ALL=(ALL)     ALL
hadoop  ALL=(ALL)     ALL
Once this is saved, you can log in as the hadoop account and run commands with root privileges via sudo (or switch to root with su -).
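A quick sanity check that the sudoers change took effect (a minimal check, nothing more):
sudo whoami    # run as the hadoop user; should print "root"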
III. Set the hostnames
vi /etc/hostname
Set it to master or slave, matching the plan above.
vi /etc/hosts
192.168.1.235 master
192.168.1.234 slave
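To confirm that name resolution works from each host, a simple check such as the following can be used:
ping -c 1 master    # should resolve to 192.168.1.235
ping -c 1 slave     # should resolve to 192.168.1.234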
IV. Passwordless SSH login
ssh-keygen -t rsa -P ""                        # generate the key pair; the public key is /root/.ssh/id_rsa.pub
vi /root/.ssh/authorized_keys                  # create the authorized_keys file
cat id_rsa.pub >> authorized_keys              # append master's public key to authorized_keys
scp authorized_keys root@slave:/root/.ssh      # copy master's authorized_keys to slave's /root/.ssh
cat id_rsa.pub >> authorized_keys              # on slave, append slave's public key to authorized_keys
scp authorized_keys root@master:/root/.ssh     # copy the combined authorized_keys back to master's /root/.ssh
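A minimal verification, assuming the hostnames above; neither command should prompt for a password:
ssh root@slave hostname     # run on master, should print "slave"
ssh root@master hostname    # run on slave, should print "master"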
V. Install Java 1.7
1. Download jdk-7u25-linux-x64.tar.gz.
2. Extract it under /home/hadoop and configure the Java environment:
tar -zxvf jdk-7u25-linux-x64.tar.gz
mv jdk1.7.0_25 jdk1.7
vi /etc/profile
export JAVA_HOME=/home/hadoop/jdk1.7
export JAVA_BIN=$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME JAVA_BIN PATH CLASSPATH
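After saving /etc/profile, the JDK can be checked with the following (the version string assumes the 7u25 package downloaded above):
source /etc/profile
java -version    # should report java version "1.7.0_25"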
VI. Download hadoop-2.2.0.tar.gz
Extract it under /home/hadoop/:
tar -zxvf hadoop-2.2.0.tar.gz
mv hadoop-2.2.0 hadoop
vi /etc/profile
export HADOOP_HOME="/home/hadoop/hadoop"
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
Configure hadoop-env.sh and point it at the JDK:
cd /home/hadoop/hadoop/etc/hadoop
vi hadoop-env.sh
export JAVA_HOME=/home/hadoop/jdk1.7
Configure core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/data/hadoop/tmp</value>
  </property>
</configuration>
Configure mapred-site.xml (if it does not exist, copy it from mapred-site.xml.template):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
Configure hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
</configuration>
Configure the slaves file and list the datanode hostname:
vi slaves
slave
Copy the hosts file, profile, and Hadoop directory to the slave node:
scp /etc/hosts root@slave:/etc/hosts
scp /etc/profile root@slave:/etc/profile
scp -r /home/hadoop/hadoop root@slave:/home/hadoop
Format HDFS:
bin/hdfs namenode -format
Start Hadoop:
sbin/start-all.sh
Verify with jps.
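If startup succeeded, jps on master typically shows NameNode, SecondaryNameNode and ResourceManager, and on slave DataNode and NodeManager. A quick HDFS smoke test (the directory name is only an example):
bin/hdfs dfs -mkdir -p /tmp/smoketest    # create a test directory in HDFS
bin/hdfs dfs -ls /                       # list the HDFS root; /tmp should appear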
VII. Download apache-hive-0.13.1-bin.tar.gz
1. Install and start MySQL.
2. Extract Hive under /home/hadoop:
mv apache-hive-0.13.1-bin hive
vi /etc/profile
export HIVE_HOME=/home/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin
Configure hive-site.xml:
cp hive-default.xml.template hive-site.xml
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8&amp;createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
  </property>
</configuration>
Verify: run the hive command and make sure the CLI starts.
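As a further check that the MySQL-backed metastore works, a throwaway table can be created from the command line (the table name is only an example):
hive -e "CREATE TABLE IF NOT EXISTS hive_smoketest (id INT, name STRING); SHOW TABLES;"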
VIII. Download sqoop-1.4.6.bin__hadoop-2.0.4.tar.gz
1. Extract it under /home/hadoop:
mv sqoop-1.4.6.bin__hadoop-2.0.4 sqoop
2. Configure the environment variables and the configuration file:
cp conf/sqoop-env-template.sh conf/sqoop-env.sh
Add the following to sqoop-env.sh:
export HADOOP_COMMON_HOME=/home/hadoop/hadoop
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop
export HBASE_HOME=/home/hadoop/hbase
export HIVE_HOME=/home/hadoop/hive
export ZOOCFGDIR=/home/hadoop/zookeeper
(If the data transfers do not involve HBase or Hive, the corresponding HBASE_HOME and HIVE_HOME entries can be left out; configure ZOOCFGDIR only if the cluster runs a standalone ZooKeeper ensemble, otherwise it is not needed.)
3. Copy the required driver jars into sqoop/lib
Required jar: the MySQL JDBC driver (or the Oracle JDBC driver, etc.)
cp mysql-connector-java-5.1.18.jar /home/hadoop/sqoop/lib/
4. Add the environment variables:
vi /etc/profile
export SQOOP_HOME=/home/hadoop/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
export LOGDIR=$SQOOP_HOME/logs
5. Test and verify
--List all databases on the MySQL server
sqoop list-databases --connect jdbc:mysql://IPHOST:3306 --username xxx --password xxx
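A typical next step is importing a table into HDFS; the database (testdb) and table (t_user) names below are placeholders:
sqoop import --connect jdbc:mysql://IPHOST:3306/testdb --username xxx --password xxx \
  --table t_user --target-dir /sqoop/t_user -m 1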
IX. Download zookeeper-3.4.6.tar.gz
Extract it under /home/hadoop:
mv zookeeper-3.4.6 zookeeper
vi /etc/profile
export ZOOKEEPER_HOME=/home/hadoop/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Edit the configuration file zoo.cfg (create it from the shipped zoo_sample.cfg first):
cd /home/hadoop/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
dataDir=/home/hadoop/zookeeper/zkdata
dataLogDir=/home/hadoop/zookeeper/zkdatalog
server.1=master:2888:3888
server.2=slave:2888:3888
Copy the ZooKeeper directory to the other node:
scp -r zookeeper root@slave:/home/hadoop/
On both master and slave, create a myid file under /home/hadoop/zookeeper/zkdata whose value matches the N in the corresponding server.N line:
master: myid is 1; slave: myid is 2
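For example (run the echo line that matches the host):
mkdir -p /home/hadoop/zookeeper/zkdata
echo 1 > /home/hadoop/zookeeper/zkdata/myid    # on master
echo 2 > /home/hadoop/zookeeper/zkdata/myid    # on slave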
ZooKeeper verification:
cd /home/hadoop/zookeeper/bin
zkServer.sh start
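After zkServer.sh start has been run on both nodes, the ensemble can be checked with:
zkServer.sh status    # one node should report "Mode: leader", the other "Mode: follower"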
X. Download hbase-0.98.4-hadoop2-bin.tar.gz
HBase depends on Hadoop and ZooKeeper.
Extract it under /home/hadoop:
mv hbase-0.98.4-hadoop2-bin hbase
vi /etc/profile
export HBASE_HOME=/home/hadoop/hbase
export PATH=$PATH:$HBASE_HOME/bin
Configure hbase-site.xml:
cd /home/hadoop/hbase/conf
vi hbase-site.xml
<configuration>
  <property>
    <name>hbase.tmp.dir</name>
    <value>/home/hadoop/data/hbase/tmp</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper/zkdata</value>
  </property>
  <property>
    <name>hbase.master.maxclockskew</name>
    <value>180000</value>
  </property>
</configuration>
Edit regionservers and list the region server hostname:
vi regionservers
slave
Since HBase uses the standalone ZooKeeper ensemble set up above, it is also advisable to set export HBASE_MANAGES_ZK=false in hbase-env.sh so that HBase does not start its own ZooKeeper.
Copy the HBase directory to the other node:
scp -r hbase root@slave:/home/hadoop/
Start HBase and verify: start-hbase.sh
Check the cluster state in the HBase shell:
hbase shell
status
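A minimal read/write round trip in the HBase shell (table and column family names are only examples):
create 't1', 'cf'                     # create a table with one column family
put 't1', 'row1', 'cf:a', 'value1'    # write one cell
scan 't1'                             # read it back
disable 't1'
drop 't1'                             # clean up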
XI. Connecting to Hive from MyEclipse
Add to the project's build path the jars under Hadoop/share/hadoop/common, Hadoop/share/hadoop/common/lib, and hive/lib.
Start HiveServer2: hive --service hiveserver2
The MySQL driver jar mysql-connector-java-5.1.22-bin.jar must be copied into $HIVE_HOME/lib/ and into the lib directory of the new Hive project.
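Before writing any JDBC code, the server can be checked from the command line with beeline, using the same connection string as the Java code below:
beeline -u jdbc:hive2://192.168.1.235:10000 -n root -e "show tables;"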
Connection code:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class hive {
    private static ResultSet res;

    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver and open a connection.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://192.168.1.235:10000", "root", "");
        Statement stmt = conn.createStatement();
        try {
            String tablename = "testHiveDriverTable";
            // Full table scan.
            String sql = "select * from " + tablename;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            System.out.println("Result of the select * query:");
            while (res.next()) {
                System.out.println(res.getInt(1) + "\t" + res.getString(2));
            }
            // Row count via a prepared statement.
            sql = "select count(*) from " + tablename;
            PreparedStatement pstmt = conn.prepareStatement(sql);
            ResultSet rs = pstmt.executeQuery();
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            conn.close();
        }
    }
}
XII. Connecting to HBase from MyEclipse
(1) Add the JAR packages
There are two ways to add the JARs; the simpler one is: right-click the HBase project and choose Build Path -> Configure Build Path, open the Libraries tab, click Add External JARs, and add all jars under $HBASE_HOME/lib together with the jars under Hadoop/share/hadoop.
(2) Add the hbase-site.xml configuration file
Create a conf folder in the project and copy $HBASE_HOME/conf/hbase-site.xml into it. Then right-click the project, choose Properties -> Java Build Path -> Libraries -> Add Class Folder, and tick the conf folder.
The connection code is as follows:
package hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseTestCase {
    // Picks up hbase-site.xml from the conf folder on the classpath.
    static Configuration cfg = HBaseConfiguration.create();

    // Create a table with a single column family.
    public static void create(String tablename, String columnFamily)
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        HBaseAdmin admin = new HBaseAdmin(cfg);
        System.out.println(admin.toString());
        if (admin.tableExists(tablename)) {
            System.out.println("table exists");
            System.exit(0);
        } else {
            HTableDescriptor tableDesc = new HTableDescriptor(tablename);
            tableDesc.addFamily(new HColumnDescriptor(columnFamily));
            admin.createTable(tableDesc);
            System.out.println("create table success");
        }
    }

    // Write a single cell.
    public static void put(String tablename, String row, String columnFamily,
            String column, String data) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Put p1 = new Put(Bytes.toBytes(row));
        p1.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(data));
        table.put(p1);
        System.out.println("put '" + row + "','" + columnFamily + ":" + column + "','" + data + "'");
    }

    // Read back a single row by key.
    public static void get(String tablename, String row) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Get g = new Get(Bytes.toBytes(row));
        Result result = table.get(g);
        System.out.println("Get: " + result);
    }

    // Scan the whole table.
    public static void scan(String tablename) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Scan s = new Scan();
        ResultScanner rs = table.getScanner(s);
        for (Result r : rs) {
            System.out.println("Scan: " + r);
        }
    }

    // Disable and drop the table.
    public static boolean delete(String tablename)
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        HBaseAdmin admin = new HBaseAdmin(cfg);
        if (admin.tableExists(tablename)) {
            try {
                admin.disableTable(tablename);
                admin.deleteTable(tablename);
            } catch (Exception ex) {
                ex.printStackTrace();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String tablename = "hbase_tb";
        String columnFamily = "cf";
        try {
            HBaseTestCase.create(tablename, columnFamily);
            HBaseTestCase.put(tablename, "row1", columnFamily, "cl1", "data");
            HBaseTestCase.get(tablename, "row1");
            HBaseTestCase.scan(tablename);
            if (HBaseTestCase.delete(tablename)) {
                System.out.println("Delete table: " + tablename + " success!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}