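Search Ranking for Takeout Free-Meal Deals (霸王餐): Deep Tuning of Elasticsearch Function Score and Cache Warm-Up
Background: why free-meal search ranking needs optimizing
The takeout free-meal business handles millions of search requests every day, and the ranking directly affects click-through and conversion rates. Plain BM25 scoring, based on text relevance alone, ignores business features (distance, rating, stock, merchant weight, and so on), so high-commission merchants fail to get exposure and users bounce more often. With Elasticsearch Function Score, a business score can be layered on top of the relevance score, giving personalized, real-time ranking. This article presents the tuning scheme that came out of three rounds of 7-day A/B experiments, together with the matching cache warm-up logic. The combination brought the P95 latency of long-tail queries down from 420 ms to 85 ms, with CTR +6.2% and GMV +4.8%.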

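Function Score query skeleton: a four-step breakdown
- Main query: multi_match over dish name, shop name, and category terms
- Filtering: offline and out-of-stock merchants are excluded by the bool filter; merchants beyond 5km receive no distance score
- Business functions: distance, rating, and commission rate, each with its own weight
- Combination mode: score_mode sum adds up the weighted function values, and boost_mode multiply multiplies that sum with the relevance score, so relevance stays dominant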
GET bwc/_search
{
  "_source": false,
  "size": 20,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": [
            {"multi_match": {
              "query": "{{keyword}}",
              "fields": ["dish^3", "shop_name^2", "category"],
              "type": "best_fields",
              "boost": 1.0
            }}
          ],
          "filter": [
            {"term": {"online": true}},
            {"range": {"stock": {"gt": 0}}}
          ]
        }
      },
      "functions": [
        {
          "filter": {"geo_distance": {"distance": "5km", "location": "{{latlon}}"}},
          "gauss": {
            "location": {"origin": "{{latlon}}", "scale": "2km", "offset": "0km", "decay": 0.6}
          },
          "weight": 1.5
        },
        {
          "field_value_factor": {
            "field": "shop_score",
            "modifier": "ln1p",
            "missing": 4.0
          },
          "weight": 1.2
        },
        {
          "field_value_factor": {
            "field": "commission_rate",
            "modifier": "none",
            "missing": 0.05
          },
          "weight": 0.8
        }
      ],
      "score_mode": "sum",
      "boost_mode": "multiply"
    }
  }
}
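To make the arithmetic behind this query easier to check offline, here is a minimal sketch that recomputes the score in plain Java. It follows the documented formulas for the gauss decay and the ln1p modifier, with the weights copied from the query above. The class name FunctionScoreMath and its method names are hypothetical, and the handling of missing field values is left out. It is an illustration of the formula, not the Elasticsearch implementation.

```java
/** Offline recomputation of the function_score arithmetic (a sketch, not Elasticsearch code). */
final class FunctionScoreMath {

    /** Gaussian decay: 1.0 inside the offset, exactly `decay` at distance offset + scale. */
    static double gauss(double distanceKm, double scaleKm, double offsetKm, double decay) {
        double d = Math.max(0.0, distanceKm - offsetKm);
        double sigmaSquared = -(scaleKm * scaleKm) / (2.0 * Math.log(decay));
        return Math.exp(-(d * d) / (2.0 * sigmaSquared));
    }

    /** field_value_factor with modifier ln1p and factor 1: ln(1 + value). */
    static double ln1p(double value) {
        return Math.log1p(value);
    }

    /**
     * score_mode = sum adds the weighted function values;
     * boost_mode = multiply multiplies that sum with the relevance score of the main query.
     */
    static double finalScore(double queryScore, double distanceKm,
                             double shopScore, double commissionRate) {
        double sum = 0.0;
        // The gauss function sits behind a 5km geo_distance filter,
        // so it only contributes for merchants inside that radius.
        if (distanceKm <= 5.0) {
            sum += 1.5 * gauss(distanceKm, 2.0, 0.0, 0.6);
        }
        sum += 1.2 * ln1p(shopScore);   // modifier ln1p, weight 1.2
        sum += 0.8 * commissionRate;    // modifier none, weight 0.8
        return queryScore * sum;
    }
}
```

For example, FunctionScoreMath.finalScore(3.0, 1.0, 4.8, 0.10) gives about 10.5. The rating term contributes roughly 2.1 to the sum, the distance term roughly 1.3, and the commission term 0.08, which shows how far the three weights are from being on a common scale.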
Java-side assembly: a template wrapped in the juwatech.cn.search package
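package juwatech.cn.search.builder;

import org.elasticsearch.common.geo.GeoPoint;
import org.elasticsearch.common.lucene.search.function.CombineFunction;
import org.elasticsearch.common.lucene.search.function.FieldValueFactorFunction;
import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

public class FunctionScoreBuilder {
    // Index name, for callers that pass IndexCoordinates.of(INDEX) to the search call.
    public static final String INDEX = "bwc";
    private static final double DISTANCE_SCALE = 2.0;   // km
    private static final double DISTANCE_LIMIT = 5.0;   // km, same radius as the JSON query

    public static NativeSearchQueryBuilder build(String keyword, double lat, double lon) {
        // 1. Main query: text relevance plus hard filters (online, in stock)
        var boolQuery = QueryBuilders.boolQuery()
            .must(QueryBuilders.multiMatchQuery(keyword)
                .field("dish", 3.0f)
                .field("shop_name", 2.0f)
                .field("category", 1.0f)
                .type("best_fields"))
            .filter(QueryBuilders.termQuery("online", true))
            .filter(QueryBuilders.rangeQuery("stock").gt(0));

        // 2. Business functions
        // Gaussian distance decay: scale 2km, no offset, decay 0.6
        ScoreFunctionBuilder<?> locationFunc = ScoreFunctionBuilders
            .gaussDecayFunction("location",
                new GeoPoint(lat, lon),
                DISTANCE_SCALE + "km",
                null,
                0.6)
            .setWeight(1.5f);
        // Shop rating, smoothed with ln1p; a missing rating counts as 4.0
        ScoreFunctionBuilder<?> scoreFunc = ScoreFunctionBuilders
            .fieldValueFactorFunction("shop_score")
            .modifier(FieldValueFactorFunction.Modifier.LN1P)
            .missing(4.0)
            .setWeight(1.2f);
        // Commission rate, used as-is; a missing rate counts as 0.05
        ScoreFunctionBuilder<?> commissionFunc = ScoreFunctionBuilders
            .fieldValueFactorFunction("commission_rate")
            .modifier(FieldValueFactorFunction.Modifier.NONE)
            .missing(0.05)
            .setWeight(0.8f);

        // 3. Assembly: the distance function only applies within 5km, as in the JSON query
        FunctionScoreQueryBuilder fsqb = QueryBuilders.functionScoreQuery(boolQuery,
                new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                    new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                        QueryBuilders.geoDistanceQuery("location")
                            .point(lat, lon)
                            .distance(DISTANCE_LIMIT, DistanceUnit.KILOMETERS),
                        locationFunc),
                    new FunctionScoreQueryBuilder.FilterFunctionBuilder(scoreFunc),
                    new FunctionScoreQueryBuilder.FilterFunctionBuilder(commissionFunc)
                })
            .scoreMode(FunctionScoreQuery.ScoreMode.SUM)
            .boostMode(CombineFunction.MULTIPLY);

        // The index is not set on the query: with Spring Data Elasticsearch 4.x it comes from
        // the @Document annotation of the entity or from an explicit IndexCoordinates argument.
        return new NativeSearchQueryBuilder()
            .withQuery(fsqb)
            .withPageable(PageRequest.of(0, 20));
    }
}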
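Parameter tuning in depth: how the experiment group gained 6.2% CTR
- Distance decay shape: the experiment group shrank scale from 5km to 2km and raised decay from 0.5 to 0.6. The curve becomes steeper, so nearby merchants score higher relative to distant ones; CTR +1.9%.
- Rating modifier: the earlier setting, none, made the gap between a 4.8 and a 4.9 rating too large. Switching to ln1p smooths the head of the distribution, and exposure of high-rated merchants rose by 7%.
- Commission down-weighting: for the platform, a high-commission merchant means high gross margin. Giving this function a weight of 0.8 rather than 1.0 protects revenue without over-intervening in the ranking; GMV +1.4%.
- boost_mode combination: multiply amplifies relevance differences more than sum does. In the experiment group, exposure of the lowest relevance tier dropped by 38%, which cuts wasted UV.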
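The first two tuning points can be checked numerically. The sketch below, under the assumption that the gauss decay reduces to decay^((distance/scale)^2) when the offset is 0, compares how strongly each setting separates a near merchant from a far one, and how ln1p compresses the gap between two ratings. TuningComparison and its method names are hypothetical helpers, not part of the original code.

```java
/** Compares old and new parameter choices with plain arithmetic (illustrative sketch). */
final class TuningComparison {

    /** Gauss decay with offset 0, written as decay^((d/scale)^2). */
    static double gauss(double distanceKm, double scaleKm, double decay) {
        double ratio = distanceKm / scaleKm;
        return Math.pow(decay, ratio * ratio);
    }

    /** How many times higher the near merchant's distance score is than the far merchant's. */
    static double nearFarRatio(double scaleKm, double decay, double nearKm, double farKm) {
        return gauss(nearKm, scaleKm, decay) / gauss(farKm, scaleKm, decay);
    }

    /** Ratio of the rating factors of two shops, with or without the ln1p modifier. */
    static double ratingRatio(double higher, double lower, boolean useLn1p) {
        return useLn1p
            ? Math.log1p(higher) / Math.log1p(lower)
            : higher / lower;
    }
}
```

Comparing nearFarRatio(5.0, 0.5, 0.5, 3.0) with nearFarRatio(2.0, 0.6, 0.5, 3.0) shows that the new setting separates a merchant at 0.5km from one at 3km much more strongly than the old one. ratingRatio(4.9, 4.8, true) is closer to 1 than ratingRatio(4.9, 4.8, false), which is the smoothing effect described above.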
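Cache warm-up: bringing long-tail query P95 down to 85 ms
The first query against a cold index has to read segment data from disk because nothing is in the filesystem cache yet, so long-tail queries took more than 400 ms at peak. We use a two-level cache, a local LRU plus Redis hot keys, warmed by a scheduled job every day at 4:00, in the following steps:
- Pull yesterday's top 200,000 queries and group them by city
- Asynchronously call FunctionScoreBuilder to assemble the query; the results skip _source and keep only _id and _score
- Write to Redis with city:query as the key and List<ScoredId> as the value, TTL 6 h
- The application layer reads Redis first; on a miss it falls back to ES and writes the top 20 hits back to the cache, so that the next 99 requests hit the cache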
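The first step, picking the top queries and grouping them by city, could look roughly like the sketch below. The QueryStat shape (city, query, count) is an assumption about what the search log provides, and WarmupPlanner is a hypothetical name; the real job may read from a different source.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Selects the most frequent queries and groups them by city (illustrative sketch). */
final class WarmupPlanner {

    /** One aggregated row of yesterday's search log; the fields are assumed. */
    static final class QueryStat {
        final String city;
        final String query;
        final long count;

        QueryStat(String city, String query, long count) {
            this.city = city;
            this.query = query;
            this.count = count;
        }
    }

    /** Takes the global top `limit` rows by count, then groups the queries by city. */
    static Map<String, List<String>> topQueriesByCity(List<QueryStat> stats, int limit) {
        return stats.stream()
            .sorted(Comparator.comparingLong((QueryStat s) -> s.count).reversed())
            .limit(limit)
            .collect(Collectors.groupingBy(
                s -> s.city,
                LinkedHashMap::new,   // keeps cities in order of their hottest query
                Collectors.mapping(s -> s.query, Collectors.toList())));
    }
}
```

Each entry of the resulting map can then be handed to the warmer as one city with its list of queries.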
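package juwatech.cn.search.cache;

import java.time.Duration;
import java.util.List;
import java.util.stream.Collectors;
import javax.annotation.Resource;
import juwatech.cn.search.builder.FunctionScoreBuilder;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

// BwcDoc and ScoredId are project classes, assumed to live in this package.
@Component
public class SearchCacheWarmer {
    @Resource
    private RedisTemplate<String, List<ScoredId>> redisTemplate;
    @Resource
    private ElasticsearchRestTemplate esTemplate;
    private static final int WARM_TOP = 20;

    // cityLat/cityLon: a representative point of the city (for example its center), used as
    // the distance origin. An origin of (0, 0) would be far away from every merchant.
    public void warm(String city, double cityLat, double cityLon, List<String> queries) {
        queries.parallelStream().forEach(q -> {
            var query = FunctionScoreBuilder.build(q, cityLat, cityLon).build();
            SearchHits<BwcDoc> hits = esTemplate.search(query, BwcDoc.class);
            // Keep only id and score of the top hits
            List<ScoredId> list = hits.getSearchHits().stream()
                .map(h -> new ScoredId(h.getId(), h.getScore()))
                .limit(WARM_TOP)
                .collect(Collectors.toList());
            redisTemplate.opsForValue().set(key(city, q), list, Duration.ofHours(6));
        });
    }

    private String key(String city, String query) {
        return "bwc:search:" + city + ":" + query;
    }
}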
The two-level cache reaches a 96% hit rate, the long-tail P95 latency falls from 420 ms to 85 ms, and 18 ES nodes are saved.
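The read path of the two-level cache (local LRU first, then Redis, then ES with write-back) can be sketched in plain Java as follows. This is a simplified illustration with several assumptions: a Map stands in for Redis, a Function stands in for the Elasticsearch query, TTL handling is omitted, and the name TwoLevelCache is hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Read-through cache: local LRU, then a remote store, then the loader (illustrative sketch). */
final class TwoLevelCache<V> {
    private final Map<String, V> local;         // level 1: in-process LRU
    private final Map<String, V> remote;        // level 2: stand-in for Redis
    private final Function<String, V> loader;   // stand-in for the ES query

    TwoLevelCache(int localCapacity, Map<String, V> remote, Function<String, V> loader) {
        // An access-ordered LinkedHashMap evicts the least recently used entry.
        this.local = Collections.synchronizedMap(
            new LinkedHashMap<String, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                    return size() > localCapacity;
                }
            });
        this.remote = remote;
        this.loader = loader;
    }

    V get(String key) {
        V value = local.get(key);
        if (value != null) {
            return value;                 // level 1 hit
        }
        value = remote.get(key);
        if (value == null) {
            value = loader.apply(key);    // miss on both levels: go back to the source
            if (value != null) {
                remote.put(key, value);   // write back so later requests hit level 2
            }
        }
        if (value != null) {
            local.put(key, value);        // promote to level 1
        }
        return value;
    }
}
```

In a real deployment the remote map would be replaced by the RedisTemplate calls with the 6 h TTL, and the loader by the function_score search.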
Gray release and rollback: dynamic delivery of Function Score parameters
The parameters are stored in Apollo, in JSON format:
{
  "distance_weight": 1.5,
  "score_weight": 1.2,
  "commission_weight": 0.8,
  "decay": 0.6,
  "scale": "2km"
}
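At application startup the parameters are loaded into a FunctionScoreWeights bean, and changes are picked up within about 30 seconds. If metrics drop, one click rolls back to the previous version, with no restart at any point.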
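A minimal sketch of how such a hot-swappable parameter holder might work is shown below. The field list mirrors the JSON above, but the class layout, the WeightsHolder name, and the single-step rollback are assumptions. The wiring to the Apollo change listener is deliberately left out.

```java
/** Immutable snapshot of the tunable parameters (fields mirror the JSON config). */
final class FunctionScoreWeights {
    final double distanceWeight;
    final double scoreWeight;
    final double commissionWeight;
    final double decay;
    final String scale;

    FunctionScoreWeights(double distanceWeight, double scoreWeight,
                         double commissionWeight, double decay, String scale) {
        this.distanceWeight = distanceWeight;
        this.scoreWeight = scoreWeight;
        this.commissionWeight = commissionWeight;
        this.decay = decay;
        this.scale = scale;
    }
}

/** Holds the active parameters and the previous version for rollback (hypothetical helper). */
final class WeightsHolder {
    private volatile FunctionScoreWeights current;
    private volatile FunctionScoreWeights previous;

    WeightsHolder(FunctionScoreWeights initial) {
        this.current = initial;
    }

    /** Read by every search request; never blocks. */
    FunctionScoreWeights get() {
        return current;
    }

    /** Called when the config center reports a change. */
    synchronized void update(FunctionScoreWeights next) {
        // Elasticsearch rejects a decay outside the open interval (0, 1), so check early.
        if (next.decay <= 0.0 || next.decay >= 1.0) {
            throw new IllegalArgumentException("decay must be between 0 and 1 (exclusive)");
        }
        previous = current;
        current = next;
    }

    /** Restores the version that was active before the last update, if there is one. */
    synchronized boolean rollback() {
        if (previous == null) {
            return false;
        }
        current = previous;
        previous = null;
        return true;
    }
}
```

Because each snapshot is immutable and swapped as a whole, a request never sees a mix of old and new values, for example a new decay combined with an old scale.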
Copyright of this article belongs to the 吃喝不愁 app developer team; please credit the source when reposting!




