Let's start with an experiment: using table03 under Hive 0.8.1 to try out the bitmap index (BitmapIndexHandler).
hive> dfs -ls /user/hive/warehouse/table03;
Found 6 items
-rw-r--r-- 1 allen supergroup 67109134 2012-03-12 21:48 /user/hive/warehouse/table03/000000_0
-rw-r--r-- 1 allen supergroup 67108860 2012-03-12 21:48 /user/hive/warehouse/table03/000001_0
-rw-r--r-- 1 allen supergroup 67108860 2012-03-12 21:48 /user/hive/warehouse/table03/000002_0
-rw-r--r-- 1 allen supergroup 67108860 2012-03-12 21:48 /user/hive/warehouse/table03/000003_0
-rw-r--r-- 1 allen supergroup 67108860 2012-03-12 21:49 /user/hive/warehouse/table03/000004_0
-rw-r--r-- 1 allen supergroup 21344316 2012-03-12 21:49 /user/hive/warehouse/table03/000005_0
hive> create index bitmap_index on table table03(id)
> as 'org.apache.hadoop.hive.ql.index.bitmap.BitmapIndexHandler'
> with deferred rebuild;
OK
Time taken: 0.715 seconds
hive> alter index bitmap_index on table03 rebuild;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201203141051_0004, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201203141051_0004
Kill Command = /home/allen/Hadoop/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:9001 -kill job_201203141051_0004
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
.
.
.
2012-03-14 13:49:33,749 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201203141051_0004
Loading data to table default.default__table03_bitmap_index__
Deleted hdfs://localhost:9000/user/hive/warehouse/default__table03_bitmap_index__
Table default.default__table03_bitmap_index__ stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 95701985, raw_data_size: 0]
MapReduce Jobs Launched:
Job 0: Map: 2 Reduce: 1 HDFS Read: 356889161 HDFS Write: 95701985 SUCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 283.695 seconds
hive>
Next, let's see what changed on HDFS:
hive> dfs -ls /user/hive/warehouse/;
Found 5 items
drwxr-xr-x - allen supergroup 0 2012-03-12 17:26 /user/hive/warehouse/default__table02_compact_index__
drwxr-xr-x - allen supergroup 0 2012-03-14 13:49 /user/hive/warehouse/default__table03_bitmap_index__
drwxr-xr-x - allen supergroup 0 2012-03-04 22:22 /user/hive/warehouse/table01
drwxr-xr-x - allen supergroup 0 2012-03-04 22:33 /user/hive/warehouse/table02
drwxr-xr-x - allen supergroup 0 2012-03-12 21:49 /user/hive/warehouse/table03
hive> dfs -du /user/hive/warehouse/;
Found 5 items
74701985 hdfs://localhost:9000/user/hive/warehouse/default__table02_compact_index__
95701985 hdfs://localhost:9000/user/hive/warehouse/default__table03_bitmap_index__
356888890 hdfs://localhost:9000/user/hive/warehouse/table01
356888890 hdfs://localhost:9000/user/hive/warehouse/table02
356888890 hdfs://localhost:9000/user/hive/warehouse/table03
hive> dfs -ls /user/hive/warehouse/default__table03_bitmap_index__
> ;
Found 1 items
-rw-r--r-- 1 allen supergroup 95701985 2012-03-14 13:47 /user/hive/warehouse/default__table03_bitmap_index__/000000_0
hive> exit;
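The byte counts in the listings above can be cross-checked with a little arithmetic. The sketch below only reuses numbers printed in the transcript; the variable and function names are made up for illustration. It sums the six table03 files and relates each index directory to the size of its base table. Note that the compact index belongs to table02, which merely happens to have the same total size, so the comparison is only a rough indication.

```python
# Sizes in bytes, copied from the `dfs -ls` and `dfs -du` output above.
table03_files = [67109134, 67108860, 67108860, 67108860, 67108860, 21344316]
table_bytes = sum(table03_files)

bitmap_index_bytes = 95701985   # default__table03_bitmap_index__
compact_index_bytes = 74701985  # default__table02_compact_index__ (built on table02)


def ratio(index_bytes, base_bytes):
    """Return the index size as a fraction of the base table size."""
    return index_bytes / base_bytes


print(table_bytes)  # matches the 356888890 reported by `dfs -du`
print(f"bitmap index:  {ratio(bitmap_index_bytes, table_bytes):.1%}")
print(f"compact index: {ratio(compact_index_bytes, table_bytes):.1%}")
```

On these numbers the bitmap index takes roughly 27% of the table's size and the compact index roughly 21%.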
allen@allen-laptop:~/Desktop/hive-0.8.1$ hadoop fs -cat /user/hive/warehouse/default__table03_bitmap_index__/000000_0|head
12/03/14 14:22:45 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
0hdfs://localhost:9000/user/hive/warehouse/table03/000000_00124858993459210
1hdfs://localhost:9000/user/hive/warehouse/table03/000000_0352124858993459210
2hdfs://localhost:9000/user/hive/warehouse/table03/000000_0704124858993459210
3hdfs://localhost:9000/user/hive/warehouse/table03/000000_01056124858993459210
4hdfs://localhost:9000/user/hive/warehouse/table03/000000_01408124858993459210
5hdfs://localhost:9000/user/hive/warehouse/table03/000000_01760124858993459210
6hdfs://localhost:9000/user/hive/warehouse/table03/000000_02112124858993459210
7hdfs://localhost:9000/user/hive/warehouse/table03/000000_02464124858993459210
8hdfs://localhost:9000/user/hive/warehouse/table03/000000_02816124858993459210
9hdfs://localhost:9000/user/hive/warehouse/table03/000000_03168124858993459210
cat: Unable to write to output stream.
allen@allen-laptop:~/Desktop/hive-0.8.1$ hadoop fs -text /user/hive/warehouse/default__table03_bitmap_index__/000000_0|head
12/03/14 14:23:10 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
0hdfs://localhost:9000/user/hive/warehouse/table03/000000_00124858993459210
1hdfs://localhost:9000/user/hive/warehouse/table03/000000_0352124858993459210
2hdfs://localhost:9000/user/hive/warehouse/table03/000000_0704124858993459210
3hdfs://localhost:9000/user/hive/warehouse/table03/000000_01056124858993459210
4hdfs://localhost:9000/user/hive/warehouse/table03/000000_01408124858993459210
5hdfs://localhost:9000/user/hive/warehouse/table03/000000_01760124858993459210
6hdfs://localhost:9000/user/hive/warehouse/table03/000000_02112124858993459210
7hdfs://localhost:9000/user/hive/warehouse/table03/000000_02464124858993459210
8hdfs://localhost:9000/user/hive/warehouse/table03/000000_02816124858993459210
9hdfs://localhost:9000/user/hive/warehouse/table03/000000_03168124858993459210
text: Unable to write to output stream.
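The rows above look run-together because the terminal does not display the delimiters. Here is a minimal sketch of how such a row could be split apart, assuming the index table uses Hive's default text delimiters (`\x01` between fields, `\x02` between array items) and that its columns are the indexed id, the source file (`_bucketname`), the row offset (`_offset`), and the bitmap words (`_bitmaps`). The split of the trailing digits into `1, 2, 4, 8589934592, 1, 0` is inferred from the visible output, not read from the raw bytes, and the function names are hypothetical.

```python
FIELD_SEP = "\x01"  # Hive's default field delimiter (Ctrl-A), assumed here
ITEM_SEP = "\x02"   # Hive's default collection-item delimiter (Ctrl-B), assumed here


def parse_bitmap_index_row(raw):
    """Split one raw index row into its four assumed columns."""
    key, bucketname, offset, bitmaps = raw.rstrip("\n").split(FIELD_SEP)
    return {
        "id": key,
        "bucketname": bucketname,
        "offset": int(offset),
        "bitmaps": [int(word) for word in bitmaps.split(ITEM_SEP)],
    }


def as_shown_in_terminal(raw):
    """Drop the non-printing delimiters, as the terminal effectively does."""
    return raw.replace(FIELD_SEP, "").replace(ITEM_SEP, "")


# Hypothetical reconstruction of the second row printed above.
raw = FIELD_SEP.join([
    "1",
    "hdfs://localhost:9000/user/hive/warehouse/table03/000000_0",
    "352",
    ITEM_SEP.join(["1", "2", "4", "8589934592", "1", "0"]),
])

row = parse_bitmap_index_row(raw)
print(as_shown_in_terminal(raw))
print(row["id"], row["offset"], row["bitmaps"])
```

Read this way, each row maps one id value to the file it lives in, the offset within that file, and an array of bitmap words. To see the real delimiters, piping the `hadoop fs -text` output through `cat -v` or `od -c` would show the control characters.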
For comparison, here is the content of the compact index:
allen@allen-laptop:~/Desktop/hive-0.8.1$ hadoop fs -text /user/hive/warehouse/default__table02_compact_index__/000000_0|head
12/03/14 14:23:41 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
0hdfs://localhost:9000/user/hive/warehouse/table02/000000_00
1hdfs://