创建一下一个table:
CREATE TABLE ruling_stewards (
steward_name text,
king text,
reign_start int,
event text,
PRIMARY KEY (steward_name, king, reign_start)
);
当用
SELECT * FROM ruling_stewards
WHERE king = 'none'
AND reign_start >= 1500
AND reign_start < 3000 LIMIT 10 ALLOW FILTERING;
时,会报错误:
Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING.
要求用allow filtering, 为啥呢 ?官方的解释是:
ALLOW FILTERING will probably become less strict as we collect more statistics on our data. For example, if we knew that 90% of entries have no king we would know that finding 10 such entries should be relatively inexpensive.
加了ALLOW FILTERING 后,cassandra会根据已经创建好的统计表,查询那些NODE 上最有可能有king这个cluster, 然后到最有可能的node上去找满足条件的entries。 如果没有加ALLOW FILTERING , 那就是盲找,这个查询代价非常高。