Elasticsearch allows you to configure a scoring algorithm, or similarity, per field. The similarity
setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF.
Similarities are mostly useful for text fields, but can also apply to other field types.
Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.
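As a rough illustration of tuning a built-in similarity, the Python sketch below builds an index settings body that registers a custom BM25 variant. It assumes the settings layout and the k1 and b parameters described in the similarity module; the helper custom_bm25_settings and the name my_bm25 are hypothetical, and the range checks are only a sanity check added for this sketch.

```python
import json


def custom_bm25_settings(name, k1=1.2, b=0.75):
    """Build an index settings body registering a tuned BM25 similarity.

    Assumption: custom similarities live under index.similarity.<name>,
    and BM25 accepts the k1 and b parameters. `name` is a label you choose.
    """
    # Sanity checks for this sketch only: k1 should not be negative,
    # and b is a length-normalization factor between 0 and 1.
    if k1 < 0:
        raise ValueError("k1 must be non-negative")
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must be between 0 and 1")
    return {
        "settings": {
            "index": {
                "similarity": {
                    name: {"type": "BM25", "k1": k1, "b": b}
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the body that would be sent when creating the index.
    print(json.dumps(custom_bm25_settings("my_bm25", k1=1.5, b=0.6), indent=2))
```

A field could then refer to the custom similarity by the name it was registered under, in the same way it would refer to a built-in one.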
The only similarities that can be used out of the box, without any further configuration, are:
- BM25: The Okapi BM25 algorithm, used by default in Elasticsearch and Lucene. See Pluggable Similarity Algorithms for more information.
- classic: The TF/IDF algorithm, which used to be the default in Elasticsearch and Lucene. See Lucene’s Practical Scoring Function for more information.
The similarity can be set at the field level when a field is first created, as follows:
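A minimal Python sketch of what such field definitions might look like, built as plain dictionaries. The field names default_field and classic_field and the text_field helper are hypothetical, and where the properties object sits inside the mapping body depends on the Elasticsearch version in use.

```python
import json

# The two names the text lists as usable without further configuration.
BUILT_IN_SIMILARITIES = ("BM25", "classic")


def text_field(similarity=None):
    """Return a text field definition, optionally with a similarity.

    Hypothetical helper: it only accepts the built-in names above.
    A custom similarity registered in the index settings would be
    referenced by its own name instead.
    """
    field = {"type": "text"}
    if similarity is not None:
        if similarity not in BUILT_IN_SIMILARITIES:
            raise ValueError(f"unknown built-in similarity: {similarity!r}")
        field["similarity"] = similarity
    return field


# One field keeps the default similarity, the other opts into TF/IDF.
properties = {
    "default_field": text_field(),
    "classic_field": text_field("classic"),
}

print(json.dumps({"properties": properties}, indent=2))
```

Leaving the similarity out of a field definition, as default_field does here, means the field uses the default algorithm.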