Cassandra lookup by list of primary keys in Java

License: CC 4.0 BY-SA

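When querying Cassandra for multiple primary keys, looking the keys up one at a time results in multiple network requests. Cassandra does support a batched lookup through an IN clause, but this is not recommended for performance reasons: the coordinator node has to send a request to a replica for each key and wait for all of the results. Concurrent asynchronous requests can be faster, because they are sent directly to the replicas and the results can be processed as they arrive, which reduces network latency and the pressure on the coordinator node.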

Question

I am implementing a feature which requires looking up Cassandra by a list of primary keys.

Below is some example data, where id is the primary key:

mytable

id | column1
---+--------
1  | 423
2  | 542
3  | 678
4  | 45534
5  | 435634
6  | 2435
7  | 678
8  | 4564
9  | 546

Most of my queries are lookups by a single id, but for some special cases I would like to get the data for a list of ids.

The way I am currently doing it is as follows:

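// Existing single-key lookup (implementation not shown in the question)
public Object fetchFromCassandraForId(int id);

int[] ids = {1, 3, 5, 7, 9};

// Requires java.util.List and java.util.ArrayList
List<Object> results = new ArrayList<>();

for (int id : ids) {
    // One blocking round trip to Cassandra per id
    results.add(fetchFromCassandraForId(id));
}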

This results in multiple network calls to Cassandra. Is it possible to batch this somehow? In other words, does Cassandra support a fast lookup by a list of ids, such as:

select column1 from mytable where id in (1, 3, 5, 7, 9);

Any help or pointers would be appreciated.

Answer 1:

If the id is the full primary key, then Cassandra supports this, although it is not recommended from a performance point of view. An IN query is processed as follows:

the request is sent to the coordinator node

the coordinator node finds a replica for each id and sends an individual request to each of them

the coordinator waits for the results from every node, collects them into a result set, and sends it back

As a result:

all of your sub-queries have to wait for the slowest of the replicas

you have an additional network hop from the coordinator to the replica

you put more pressure on the coordinator node, as it needs to keep the results in memory

If you instead issue many parallel, asynchronous requests from the application, one for each id value, then you:

avoid the additional hop: if you are using prepared statements with token-aware load balancing, each query is sent directly to a replica

can start processing results as you receive them, without waiting for everything

So sending parallel asynchronous requests could be faster than sending one request with IN...
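The parallel approach described above can be sketched in plain Java with CompletableFuture. This is only a minimal illustration, not the answer author's code. The class ParallelLookup, the in-memory MYTABLE map and the fetchAsync method are hypothetical stand-ins; in a real application fetchAsync would presumably wrap the Cassandra driver's asynchronous execute call for a prepared "select column1 from mytable where id = ?" statement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class ParallelLookup {

    // Hypothetical in-memory copy of mytable (id -> column1), used here
    // only so that the sketch runs without a Cassandra cluster.
    static final Map<Integer, Integer> MYTABLE = Map.of(
            1, 423, 2, 542, 3, 678, 4, 45534, 5, 435634,
            6, 2435, 7, 678, 8, 4564, 9, 546);

    // Hypothetical asynchronous single-key lookup. Assumption: a real
    // implementation would call the driver's async API instead.
    static CompletableFuture<Integer> fetchAsync(int id) {
        return CompletableFuture.supplyAsync(() -> MYTABLE.get(id));
    }

    // Starts all lookups first, then collects the results in input order.
    static List<Integer> fetchAll(int[] ids) {
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (int id : ids) {
            pending.add(fetchAsync(id)); // requests are now in flight concurrently
        }
        List<Integer> results = new ArrayList<>();
        for (CompletableFuture<Integer> future : pending) {
            results.add(future.join()); // blocks only until this lookup is done
        }
        return results;
    }

    public static void main(String[] args) {
        int[] ids = {1, 3, 5, 7, 9};
        System.out.println(fetchAll(ids)); // prints [423, 678, 435634, 678, 546]
    }
}
```

All requests are started before any result is awaited, so the total wait is roughly that of the slowest single lookup rather than the sum of all of them. To process each row as soon as it arrives instead of collecting a list, a callback could be attached to each future with thenAccept.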

Source: https://stackoverflow.com/questions/62643342/cassandra-lookup-by-list-of-primary-keys-in-java