HBase: Storing Multiple Records from One Time Period in a Single Rowkey

Latest recommended article published 2025-06-15 20:34:42

hfhaozai12

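This article explains how OpenTSDB handles data, covering how it reorganizes data into a columnar layout and how it stores time-series data efficiently. Each log event is first written as its own row, and the data is then regrouped by time range so that it can be stored compactly. Each event becomes a column stored under a time offset relative to the start of the range, and HBase's capabilities make this kind of advanced processing and storage technique possible.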

Case Study - Log Data and Timeseries Data on Steroids

This is effectively the OpenTSDB approach: OpenTSDB rewrites data and packs rows into columns for certain time periods. For a detailed explanation, see http://opentsdb.net/schema.html and Lessons Learned from OpenTSDB from HBaseCon2012.

But this is how the general concept works: data is ingested, for example, in this manner…

[hostname][log-event][timestamp1]
[hostname][log-event][timestamp2]
[hostname][log-event][timestamp3]

with separate rowkeys for each detailed event, but is re-written like this…

[hostname][log-event][timerange]

and each of the above events is converted into a column stored with a time offset relative to the beginning of the timerange (e.g., every 5 minutes). This is an advanced processing technique, but HBase makes it possible.
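
The rewrite step described above can be sketched in a few lines of Python. This is a minimal in-memory illustration only, not OpenTSDB's actual code or byte layout: the function names (`detailed_rowkey`, `pack_rows`, `encode_qualifier`), the tuple-based rowkeys, the 300-second bucket, and the 2-byte qualifier encoding are all assumptions made for the example, and a plain dict stands in for the HBase table.

```python
import struct
from collections import defaultdict

# Assumed bucket width: 5 minutes, matching the "every 5 minutes" example.
BUCKET_SECONDS = 300


def detailed_rowkey(hostname, log_event, timestamp):
    """Rowkey used at ingest time: [hostname][log-event][timestamp]."""
    return (hostname, log_event, timestamp)


def pack_rows(detailed_rows, bucket_seconds=BUCKET_SECONDS):
    """Rewrite one-row-per-event data into one row per timerange.

    detailed_rows maps (hostname, log_event, timestamp) -> value.
    The result maps (hostname, log_event, bucket_start) -> {offset: value},
    where offset is the number of seconds since bucket_start.
    """
    packed = defaultdict(dict)
    for (hostname, log_event, timestamp), value in detailed_rows.items():
        # Round the timestamp down to the start of its timerange.
        bucket_start = timestamp - (timestamp % bucket_seconds)
        # The event becomes a column keyed by its offset within the range.
        offset = timestamp - bucket_start
        packed[(hostname, log_event, bucket_start)][offset] = value
    return dict(packed)


def encode_qualifier(offset):
    """Hypothetical column qualifier: the offset as 2 big-endian bytes."""
    return struct.pack(">H", offset)


if __name__ == "__main__":
    rows = {
        detailed_rowkey("host1", "login", 1200): b"a",
        detailed_rowkey("host1", "login", 1260): b"b",
        detailed_rowkey("host1", "login", 1499): b"c",
        detailed_rowkey("host1", "login", 1500): b"d",
    }
    for rowkey, columns in sorted(pack_rows(rows).items()):
        print(rowkey, columns)
```

With the sample data, the three events at 1200, 1260, and 1499 land in the row for the timerange starting at 1200 (offsets 0, 60, and 299), while the event at 1500 starts a new row. Against a real table, the packed row would be written with one column per offset and the original detailed rows deleted afterwards.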