hadoop@hadoopSlave2:/opt/hadoop/testData$ touch hiveOuter.log
hadoop@hadoopSlave2:/opt/hadoop/testData$ vim hiveOuter.log
20140413 jim0 2014-06-06
20140414 jim1 2014-06-06
...
hadoop@hadoopSlave2:/opt/hadoop/testData$ scp -rv hiveOuter.log hadoopMaster:/opt/hadoop/hive/
hadoop@hadoopSlave2:/opt/hadoop/testData$ ssh hadoopMaster
hadoop@hadoopMaster:~$ cd /opt/hadoop/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -moveFromLocal /opt/hadoop/hive/hiveOuter.log /user/hive/
hadoop@hadoopMaster:/opt/hadoop/hive$ hdfs dfs -ls -R /
-rw-r--r-- 2 hadoop supergroup 2591 2014-06-06 10:00 /user/hive/hiveOuter.log
drwxr-xr-x - hadoop supergroup 0 2014-06-06 09:51 /user/hive/warehouse
drwxr-xr-x - hadoop supergroup 0 2014-06-06 09:51 /user/hive/warehouse/test1
...
hive> create EXTERNAL table IF NOT EXISTS userOuter (id int,name string, ed string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' location 'hdfs://hadoopMaster:9000/user/hive/warehouse/test1';
OK
Time taken: 0.309 seconds
hive> show tables;
OK
userouter
Time taken: 0.045 seconds, Fetched: 12 row(s)
hive> LOAD DATA INPATH 'hdfs://hadoopMaster:9000/user/hive/warehouse/hiveOuter.log' INTO TABLE userOuter partition(ptDate='20140606');
Loading data to table default.hiveOuter partition (ptdate=20140606)
Partition default.testouter{ptdate=20140606} stats: [numFiles=3, numRows=0, totalSize=4312, rawDataSize=0]
OK
Time taken: 1.153 seconds
hive> SELECT * FROM userouter;
OK
20140413 jim0 2014-06-06
20140414 jim1 2014-06-06
20140415 jim2 2014-06-06
....
此时hive将该表的数据文件信息保存到metadata数据库中。
hadoop@hadoopMaster:~$ mysql -uroot -proot
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 47
Server version: 5.5.37-0ubuntu0.12.04.1 (Ubuntu)
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hiveDB |
| mysql |
| performance_schema |
| schools |
+--------------------+
5 rows in set (0.00 sec)
mysql> use hiveDB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
可以看到该表的类型为EXTERNAL_TABLE。
mysql> SELECT * FROM TBLS WHERE TBL_NAME = 'userouter';
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
| TBL_ID | CREATE_TIME | DB_ID | LAST_ACCESS_TIME | OWNER | RETENTION | SD_ID | TBL_NAME | TBL_TYPE | VIEW_EXPANDED_TEXT | VIEW_ORIGINAL_TEXT |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
| 25 | 1402020143 | 1 | 0 | hadoop | 0 | 26 | userouter | EXTERNAL_TABLE | NULL | NULL |
+--------+-------------+-------+------------------+--------+-----------+-------+-----------+----------------+--------------------+--------------------+
1 row in set (0.00 sec)
在表SDS中记录了表userouter的数据文件路径为hdfs://hadoopMaster:9000/user/hive/warehouse/test1。
mysql> SELECT * FROM SDS WHERE SD_ID=25;
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION | NUM_BUCKETS | OUTPUT_FORMAT | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
| 25 | 24 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://hadoopMaster:9000/user/hive/warehouse/test1 | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 25 |
+-------+-------+------------------------------------------+---------------+---------------------------+----------------------------------------------------+-------------+------------------------------------------------------------+----------+
1 row in set (0.00 sec)