I've recently been working on code that bulk-generates HFiles for HBase, and I tried quite a few approaches along the way. Here they are:
Method 1: build KeyValue objects directly. The code looked roughly like this:

KeyValue kv = new KeyValue(Bytes.toBytes(row.toString()),
        Bytes.toBytes("info"), Bytes.toBytes(tableField), Bytes.toBytes(value));

But it kept failing with:
Added a key not lexically larger than previous

I found some suggested fixes online, but they only treated the symptom, not the root cause; no luck.
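The usual cause of this error is that the HFile writer requires cells to arrive in strictly increasing key order (bytes compared as unsigned values), so any out-of-order or duplicate key from the reducer triggers it. As a rough pure-Java illustration of the ordering involved (these are stand-in classes, not HBase's actual API; HBase's own Bytes.compareTo and KeyValue comparator do the real work):

```java
import java.util.TreeMap;

public class LexOrderSketch {
    // Unsigned lexicographic comparison, the same ordering HBase applies
    // to row keys: each byte is compared as a value in 0..255.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length; // on a shared prefix, shorter sorts first
    }

    // Buffer rows in a map ordered by this comparator so they can be
    // emitted in strictly increasing order before being written out.
    static TreeMap<byte[], byte[]> sortedBuffer() {
        return new TreeMap<>(LexOrderSketch::compareBytes);
    }
}
```

Note that this is byte order, not numeric order: "row10" sorts before "row2", which is a common source of keys that look sorted but aren't.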
Method 2: build Put objects. The code looked roughly like this:

Put put = new Put(Bytes.toBytes(row));
put.add(Bytes.toBytes("info"), Bytes.toBytes(tableField), Bytes.toBytes(value));

But it kept failing with:
12/05/29 09:35:00 INFO mapred.JobClient: Task Id : attempt_201205181722_0988_r_000000_0, Status : FAILED
org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:160)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1209)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:511)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:502)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:172)
    at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:175)

Again I found some suggested fixes online; again they only treated the symptom, and again, no luck.
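This ZooKeeperConnectionException typically shows up when each task (or worse, each record) builds a fresh HBaseConfiguration, so every HTable opens its own ZooKeeper connection until the server's connection limit is exhausted. The advice in the message, "reusing HBaseConfiguration", amounts to caching one shared instance. A minimal pure-Java sketch of that caching pattern (the Connection class here is a stand-in for the expensive resource, not an HBase API):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionReuseSketch {
    // Counter showing how many "connections" were actually opened,
    // so the caching effect is observable.
    static final AtomicInteger opened = new AtomicInteger();

    // Stand-in for an expensive resource such as a ZooKeeper session.
    static class Connection {
        Connection() { opened.incrementAndGet(); }
    }

    // One lazily created connection per logical cluster key. Task code
    // fetches from this cache instead of constructing a new
    // configuration (and hence a new connection) for every record.
    static final ConcurrentHashMap<String, Connection> cache = new ConcurrentHashMap<>();

    static Connection get(String clusterKey) {
        return cache.computeIfAbsent(clusterKey, k -> new Connection());
    }
}
```

With this shape, a reducer processing millions of records still opens a single connection per JVM rather than one per record.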
Method 3: follow HBase's bulk-import class, src/org/apache/hadoop/hbase/mapreduce/ImportTsv.java. The key generated code looks roughly like this:

KeyValue kv = new KeyValue(lineBytes,
        parsed.getRowKeyOffset(), parsed.getRowKeyLength(),
        parser.getFamily(i), 0, parser.getFamily(i).length,
        parser.getQualifier(i), 0, parser.getQualifier(i).length,
        ts, KeyValue.Type.Put,
        lineBytes, parsed.getColumnOffset(i), parsed.getColumnLength(i));

This took a whole morning of tinkering, and because it had to tie into our business logic, a lot of code had to change. But it works!
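One notable mechanical detail of the ImportTsv-style constructor above is that each field is addressed as an (offset, length) window into the original line's byte[], rather than being copied into its own array. A small pure-Java sketch of that offset/length parsing idea (simplified for illustration; not the actual ImportTsv parser):

```java
import java.util.ArrayList;
import java.util.List;

public class TsvOffsetsSketch {
    // Returns {offset, length} pairs for each separator-delimited column
    // inside the raw line, mirroring the idea behind ImportTsv's parser:
    // the KeyValue is later built over the same byte[] using these
    // offsets, so no per-field copies are made.
    static int[][] parse(byte[] line, byte sep) {
        List<int[]> cols = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= line.length; i++) {
            if (i == line.length || line[i] == sep) {
                cols.add(new int[]{start, i - start});
                start = i + 1;
            }
        }
        return cols.toArray(new int[0][]);
    }
}
```

For a tab-separated line like "r1\tv1", this yields column 0 at offset 0 with length 2 (the row key) and column 1 at offset 3 with length 2 (the value), which map directly onto the offset/length arguments of the KeyValue constructor.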
This write-up is brief, but the process was quite painful!
Feedback and criticism are welcome!
Hoping to make progress together with everyone who has cloud experience.