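Overview
Requirement: a private-deployment customer needs to bulk-export chat search results, up to 170,000 records in a single export, while the current implementation can export at most 10,000 records at a time.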
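Approach
- The 10,000-record export limit comes from the query path: results are fetched from an ES cluster, whose maximum result window defaults to 10,000 (index.max_result_window: 10000).
- Raise the default limit: index.max_result_window: 200000
(Drawback: this affects overall ES performance. Because this private-deployment customer runs on a dedicated server and the requirement is not long-term, we changed it anyway.)
- The response then exceeded the buffer size and raised an error, so we used reflection to modify the ES configuration factory.
- That still could not get past the limit, so we switched to an ES cursor (scroll) query.
- The backend query then ran longer than the nginx timeout, so we shortened the query time and made the export asynchronous.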
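The cursor step in the list above can be sketched roughly as follows. This is a minimal, hypothetical sketch, not the code used in this project: `CursorSource`, `Page`, `drain`, and the in-memory stand-in are illustrative names, and the Elasticsearch calls are hidden behind an interface so the loop runs without a cluster. With RestHighLevelClient, `open` would presumably wrap `search` on a request that has a scroll keep-alive set, `next` would wrap `scroll` with a `SearchScrollRequest`, and `close` would wrap `clearScroll`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch of draining a cursor (scroll) query batch by batch. All names are hypothetical. */
class ScrollExportSketch {

    /** One batch of results plus the cursor id needed to request the next batch. */
    static final class Page<T> {
        final String cursorId;
        final List<T> hits;

        Page(String cursorId, List<T> hits) {
            this.cursorId = cursorId;
            this.hits = hits;
        }
    }

    /** Assumed abstraction over the Elasticsearch calls (not a real ES interface). */
    interface CursorSource<T> {
        Page<T> open(int batchSize);    // first batch; would wrap the initial search
        Page<T> next(String cursorId);  // following batch; would wrap the scroll call
        void close(String cursorId);    // release the cursor; would wrap clearScroll
    }

    /** Reads batches until an empty one arrives, handing each batch to the sink. */
    static <T> int drain(CursorSource<T> source, int batchSize, Consumer<List<T>> sink) {
        Page<T> page = source.open(batchSize);
        String cursorId = page.cursorId;
        int total = 0;
        try {
            while (!page.hits.isEmpty()) {
                sink.accept(page.hits);
                total += page.hits.size();
                page = source.next(cursorId);
                cursorId = page.cursorId;
            }
        } finally {
            // Always release the cursor, even if the sink throws.
            source.close(cursorId);
        }
        return total;
    }

    /** In-memory stand-in for a cluster, used only to exercise the loop. */
    static CursorSource<Integer> inMemory(int totalDocs) {
        return new CursorSource<Integer>() {
            private int offset = 0;
            private int size = 0;

            public Page<Integer> open(int batchSize) {
                size = batchSize;
                return next("cursor-0");
            }

            public Page<Integer> next(String cursorId) {
                List<Integer> hits = new ArrayList<>();
                for (int i = 0; i < size && offset < totalDocs; i++) {
                    hits.add(offset++);
                }
                return new Page<>(cursorId, hits);
            }

            public void close(String cursorId) {
                // Nothing to release for the in-memory stand-in.
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> exported = new ArrayList<>();
        int total = drain(inMemory(170_000), 5_000, exported::addAll);
        System.out.println("exported " + total + " rows");
    }
}
```

Because each batch is handed to the sink as it arrives, a real export could write every batch straight to the output file, so no single response has to carry all 170,000 records.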
Implementation
- Original query (the ES bottleneck: with the default configuration, at most 10,000 records are returned)
// Note: the corpid parameter is unused here; the tenant id is read from param.
public List<ChatSearch> search(String corpid, ChatSearchParam param) throws IOException {
    String corpId = param.getCorpId();
    List<ChatSearch> chatList = new ArrayList<>();
    // Look up the ES client for this tenant.
    RestHighLevelClient esClient = esConfig.getESClient(corpId);
    SearchRequest request = searchByEs(param);
    SearchResponse response = esClient.search(request, RequestOptions.DEFAULT);
    // Deserialize each hit's source JSON into a ChatSearch object.
    for (SearchHit hit : response.getHits().getHits()) {
        chatList.add(JSONObject.parseObject(hit.getSourceAsString(), ChatSearch.class));
    }
    fill(corpId, chatList);
    return chatList;
}
private SearchRequest searchByEs(ChatSearchParam param) {
    String corpId = param.getCorpId();
    // Mandatory conditions: tenant, chat type, text messages, and a phrase match on "data".
    BoolQueryBuilder condition = QueryBuilders.boolQuery()
            .must(QueryBuilders.termQuery("corpid", corpId))
            .must(QueryBuilders.termQuery("chattype", 1))
            .must(QueryBuilders.termQuery("msgtype", "text"))
            .must(QueryBuilders.matchPhraseQuery("data", param.getContent()));
    // Optional time range on msgtime.
    if (!StringUtils.isBlank(param.getEndTime())) {
        condition.must(QueryBuilders.rangeQuery("msgtime").lte(param.getEndTime()));
    }
    if (!StringUtils.isBlank(param.getStartTime())) {
        condition.must(QueryBuilders.rangeQuery("msgtime").gte(param.getStartTime()));
    }
    // Optional exact-match filters.
    if (param.getFrom() != null) {
        condition.must(QueryBuilders.termQuery("is_sales_send", param.getFrom()));
    }
    if (param.getGroup() != null) {
        condition.must(QueryBuilders.termQuery("isgroup", param.getGroup()));
    }
    // Optional: the given ids must appear in either userid or talklist.
    if (!CollectionUtils.isEmpty(param.getSalesIds())) {
        condition.must(QueryBuilders.boolQuery()
                .should(QueryBuilders.termsQuery("userid", param.getSalesIds()))
                .should(QueryBuilders.termsQuery("talklist", param.getSalesIds()))
        );
    }
    SearchSourceBuilder requestBuilder = new SearchSourceBuilder()
            .query(condition)
            .sort("msgtime", param.getOrder())
            .size(param.getPageSize().intValue