http://blog.youkuaiyun.com/yiboo/article/details/7284111
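In an earlier post I described balancing data load in HBase by pre-creating regions, using the example from the official HBase documentation.
Nobody tells you how to actually use that example, though,
so I tried it out myself.
The main question is how to divide the row key range between the start key and the end key. My row keys are MD5 values, so I split the range between two MD5-style hex boundaries into 300 parts.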
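To make the key format concrete: an MD5 value written as hex is 32 lowercase characters, and such values are spread roughly uniformly over that range, which is why evenly spaced split points are a reasonable fit. A minimal sketch of producing such a key, assuming the row key is the MD5 of some record identifier (the class name `Md5RowKey`, the method `md5Hex`, and the sample input are illustrative and not from the original code):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: the post does not show how its MD5 row keys are built.
class Md5RowKey {
    /** Returns the MD5 of id as a 32-character lowercase hex string. */
    static String md5Hex(String id) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(id.getBytes(StandardCharsets.UTF_8));
            // new BigInteger(1, digest) treats the 16 bytes as an unsigned number;
            // %032x keeps leading zeros so every key has the same width.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("hello")); // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```

Because every key has the same width, comparing keys as bytes gives the same order as comparing them as numbers, which is what makes numeric split points meaningful.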
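import java.io.IOException;
import java.math.BigInteger;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.TableExistsException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

// The class name is a placeholder; the original post shows only the methods.
public class PreSplitTableExample {
    private static final Log LOG = LogFactory.getLog(PreSplitTableExample.class);

    public static void main(String[] args) {
        // Picks up hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        try {
            HBaseAdmin admin = new HBaseAdmin(conf);
            // Note: some HBase versions reject a table without a column family;
            // add one with HTableDescriptor.addFamily if yours does.
            HTableDescriptor tableDesc = new HTableDescriptor("test");
            // Both bounds are parsed as hexadecimal numbers (18 and 20 digits here);
            // the range between them is cut into 300 regions.
            byte[][] splits = getHexSplits("100000000000000000", "ffffffffffffffffffff", 300);
            createTable(admin, tableDesc, splits);
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * By default HBase creates a new table with a single region. During a bulk import
     * all clients therefore write to that one region until it grows large enough to split.
     * An effective way to speed up bulk imports is to pre-create empty regions.
     */
    public static boolean createTable(HBaseAdmin admin, HTableDescriptor table,
            byte[][] splits) throws IOException {
        try {
            admin.createTable(table, splits);
            return true;
        } catch (TableExistsException e) {
            // The table already exists, so nothing was created.
            LOG.info("table " + table.getNameAsString() + " already exists");
            return false;
        }
    }

    /**
     * Computes numRegions - 1 evenly spaced split keys between startKey and endKey,
     * both given as hexadecimal strings.
     */
    public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        BigInteger lowestKey = new BigInteger(startKey, 16);
        BigInteger highestKey = new BigInteger(endKey, 16);
        BigInteger range = highestKey.subtract(lowestKey);
        BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
        // Pad every key to the width of endKey. A fixed "%016x" leaves keys longer than
        // 16 digits at varying widths, and those no longer sort in numeric order as bytes.
        String format = "%0" + endKey.length() + "x";
        lowestKey = lowestKey.add(regionIncrement);
        for (int i = 0; i < numRegions - 1; i++) {
            BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
            splits[i] = Bytes.toBytes(String.format(format, key));
        }
        return splits;
    }
}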
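The split computation can be tried without a cluster. Below is a minimal standalone sketch, not the code from the HBase example: it assumes split keys should be zero-padded to the width of the end key so that they sort as bytes in numeric order, and it computes each boundary directly as `low + range * i / n` instead of adding a rounded increment repeatedly. The class name `FixedWidthHexSplits` and the method `hexSplits` are made up for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Illustrative variant of the split computation; depends only on the JDK.
class FixedWidthHexSplits {
    /** Returns numRegions - 1 split keys, each zero-padded to endKey's width. */
    static byte[][] hexSplits(String startKey, String endKey, int numRegions) {
        if (numRegions < 2) {
            throw new IllegalArgumentException("need at least 2 regions to have a split key");
        }
        int width = endKey.length();
        BigInteger low = new BigInteger(startKey, 16);
        BigInteger range = new BigInteger(endKey, 16).subtract(low);
        BigInteger n = BigInteger.valueOf(numRegions);
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary i of n, rounded down.
            BigInteger key = low.add(range.multiply(BigInteger.valueOf(i)).divide(n));
            String hex = String.format("%0" + width + "x", key);
            splits[i - 1] = hex.getBytes(StandardCharsets.US_ASCII);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions over an 8-digit key space give three split keys.
        for (byte[] split : hexSplits("00000000", "ffffffff", 4)) {
            System.out.println(new String(split, StandardCharsets.US_ASCII));
        }
        // prints 3fffffff, 7fffffff, bfffffff (one per line)
    }
}
```

For MD5 row keys, the same call with 32-digit bounds would yield 32-character split keys, though shorter split keys also work as long as they all have the same width.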
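After running this, the master status page at http://host:60010/master-status shows numberOfOnlineRegions=101 for each region server (the table's 300 regions are spread over the three servers, about 100 each). After creating two tables the value is 201, as in the listing below.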
Region Servers
ServerName | Start time | Load
---|---|---
dn1,60020,1329986704703 | Thu Feb 23 16:45:04 CST 2012 | requestsPerSecond=0, numberOfOnlineRegions=201, usedHeapMB=62, maxHeapMB=995
dn2,60020,1329986707893 | Thu Feb 23 16:45:07 CST 2012 | requestsPerSecond=0, numberOfOnlineRegions=201, usedHeapMB=51, maxHeapMB=995
namenode,60020,1329986696234 | Thu Feb 23 16:44:56 CST 2012 | requestsPerSecond=0, numberOfOnlineRegions=202, usedHeapMB=40, maxHeapMB=995
Total: | servers: 3 | requestsPerSecond=0, numberOfOnlineRegions=604
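The master page shows how regions are distributed over servers, not how rows will be distributed over regions. For the latter, a rough offline check is to hash some sample identifiers and count how many keys fall between each pair of split keys. The sketch below assumes ASCII hex keys (so Java string order matches HBase's byte order) and uses made-up sample ids; `RegionBalanceCheck`, `regionIndex`, and `countPerRegion` are illustrative names, not HBase APIs.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Offline sketch: no HBase classes involved.
class RegionBalanceCheck {
    /**
     * Index of the region that would hold rowKey, given sorted split keys.
     * Region i covers [splits[i-1], splits[i]); a key equal to a split key
     * belongs to the region that starts at that split key.
     */
    static int regionIndex(String[] sortedSplits, String rowKey) {
        int pos = Arrays.binarySearch(sortedSplits, rowKey);
        // Found at pos: pos + 1 split keys are <= rowKey.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    static String md5Hex(String id) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(id.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Hashes `samples` made-up ids ("row-0", "row-1", ...) and counts keys per region. */
    static int[] countPerRegion(String[] sortedSplits, int samples) {
        int[] counts = new int[sortedSplits.length + 1];
        for (int i = 0; i < samples; i++) {
            counts[regionIndex(sortedSplits, md5Hex("row-" + i))]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] splits = {"4", "8", "c"}; // three split keys, four regions
        System.out.println(Arrays.toString(countPerRegion(splits, 10000)));
    }
}
```

If the counts come out roughly equal, the chosen split points match the key distribution; a strongly skewed result suggests the start and end bounds do not cover the range the keys actually occupy.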