Cassandra的timestamp类型

最新推荐文章于 2020-03-19 22:44:23 发布

sin_core

最新推荐文章于 2020-03-19 22:44:23 发布

阅读量9.1k

点赞数

本文链接：https://blog.youkuaiyun.com/wxs_core/article/details/45815885

版权

Cassandra中存储时间推荐使用10位整型timestamp，虽然不够直观，但更节省空间。例如，存储1431838118比存储'2015-05-17 20:45:00'更为高效。INT_MAX值为2147483647，对应2038-01-19 11:14:07，对于当前需求来说已经足够。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

cqlsh:testspace> CREATE TABLE test_timestamp(id int, date timestamp, PRIMARY KEY(id,date));
cqlsh:testspace> INSERT INTO test_timestamp (id, date ) VALUES ( 1,'2015-05-18 09:00');
cqlsh:testspace> INSERT INTO test_timestamp (id, date ) VALUES ( 4,'2015-05-17 22:48:38');
cqlsh:testspace> INSERT INTO test_timestamp (id, date ) VALUES ( 3,'2015-05-17 21:48:38');
cqlsh:testspace> INSERT INTO test_timestamp (id, date ) VALUES ( 2,'2015-05-17 20:48:38');
cqlsh:testspace> SELECT * FROM test_timestamp 
             ... ;


 id | date
----+--------------------------
  1 | 2015-05-18 09:00:00+0800
  2 | 2015-05-17 20:48:38+0800
  4 | 2015-05-17 22:48:38+0800
  3 | 2015-05-17 21:48:38+0800


(4 rows)

查询语句不限定PRIMARY KEY的话需要用ALLOW FILTERING

cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17';
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17' allow filtering;


 id | date
----+--------------------------
  1 | 2015-05-18 09:00:00+0800
  2 | 2015-05-17 20:48:38+0800
  4 | 2015-05-17 22:48:38+0800
  3 | 2015-05-17 21:48:38+0800


(4 rows)

timestamp的字符串表达比较灵活

cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-18' allow filtering;


 id | date
----+--------------------------
  1 | 2015-05-18 09:00:00+0800


(1 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17 20:00' and date<'2015-05-17 21:00' allow filtering;


 id | date
----+--------------------------
  2 | 2015-05-17 20:48:38+0800


(1 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17';
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17 20:00' and date<'2015-05-17 21:00' allow filtering;


 id | date
----+--------------------------
  2 | 2015-05-17 20:48:38+0800


(1 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-18' allow filtering;


 id | date
----+--------------------------
  1 | 2015-05-18 09:00:00+0800


(1 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17';
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17' allow filtering;


 id | date
----+--------------------------
  1 | 2015-05-18 09:00:00+0800
  2 | 2015-05-17 20:48:38+0800
  4 | 2015-05-17 22:48:38+0800
  3 | 2015-05-17 21:48:38+0800


(4 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE date>'2015-05-17 20:45:00' and date<'2015-05-17 20:50:00' allow filtering;


 id | date
----+--------------------------
  2 | 2015-05-17 20:48:38+0800


(1 rows)


cqlsh:testspace> SELECT * FROM test_timestamp WHERE id=2 and date>'2015-05-17 20:45:00' and date<'2015-05-17 20:50:00';


 id | date
----+--------------------------
  2 | 2015-05-17 20:48:38+0800


(1 rows)

值得注意的一个问题,使用cassandra python driver

#~/usr/bin/python
import time
from cassandra.cluster import Cluster
cluster  = Cluster()
session = cluster.connect()

session.execute('USE testspace')

cql  = "SELECT * FROM test_timestamp WHERE date > '2015-05-17 20:45:00+0800' AND date < '2015-05-17 20:50:00+0800' ALLOW FILTERING"
print cql 

res = session.execute(cql)
print res 
for idx, date in res:
    print idx, date
    print idx, int(time.mktime(date.timetuple()))
    print type(idx), type(date)

output

SELECT * FROM test_timestamp WHERE date > '2015-05-17 20:45:00+0800' AND date < '2015-05-17 20:50:00+0800' ALLOW FILTERING
[Row(id=2, date=datetime.datetime(2015, 5, 17, 12, 48, 38))]
2 2015-05-17 12:48:38
2 1431838118
<type 'int'> <type 'datetime.datetime'>

可见，cassandra中timestamp类型是带时区的，比如上面的例子中是东8区。但是python查询的结果就有问题了，应该是标准时区的，因此date time的数值从20:48变为了12:48，慢了8个小时。关于datetime与不同时区python中肯定有相关函数处理的。

在cassandra中存储时间还是用int的10位timestamp方便，缺点是不够直观。但存储1431838118肯定比存储datetime格式的'2015-05-17 20:45:00'要节约空间吧。

INT_MAX 2147483647 2038-01-19 11:14:07, 可见用int来存储timestamp足够了，2038年还早。