Elasticsearch Query Statements

1. Elasticsearch basic concepts

GitHub repository for the complete search client: https://github.com/cweeyii/elasticsearch-parent
Basic Elasticsearch concepts: https://es.xiaoleilu.com/010_Intro/05_What_is_it.html
Cluster-mode installation: http://blog.youkuaiyun.com/cweeyii/article/details/71055884

2. Key concepts

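  • Search type (searchType)
    When you only need the number of documents that match a condition, you can set the search type to count, which returns just the hit count (the equivalent of MySQL's select count(1) from table where valid=0).
    PS: this search type is now deprecated. Setting from=0 and size=0 on the search request does the same job with the same efficiency.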
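# Build the search request:
SearchRequestBuilder builder = client.prepareSearch(indexName).setQuery(searchCondition.getQueryBuilder()).setFrom(0).setSize(0);
# Execute it and read the hit count:
SearchResponse searchResponse = builder.execute().actionGet();
SearchHits hits = searchResponse.getHits();
long total = hits.getTotalHits();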
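  • Default fields
    An index built by ES contains several metadata fields, each starting with an underscore, for example _type, _id, _index, and _source.
    These fields are very useful. For example, you can store a user record's primary key in _id, which lets you update the ES record by primary key, fetch a record by id, or filter a query down to records with given ids (see IdsQueryBuilder). In addition, if no custom routing is set, the shard that holds a record is determined from its _id.
    _index: index name; _type: index type; _id: document id; _source: the field Elasticsearch uses to store the document body as JSON.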
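The routing rule mentioned above can be sketched roughly as follows. Elasticsearch documents the rule as shard = hash(routing) % number_of_primary_shards, with routing defaulting to _id. The class name, the method names, and the FNV-1a hash used here are illustrative assumptions; Elasticsearch uses its own internal hash function, so the shard numbers printed by this sketch will not match a real cluster.

```java
import java.nio.charset.StandardCharsets;

/** Toy illustration of routing a document to a shard (hypothetical helper, not ES code). */
final class ShardRoutingSketch {

    // Stand-in hash (32-bit FNV-1a). Assumption: any stable hash shows the idea.
    static int hash(String routing) {
        int h = 0x811c9dc5;
        for (byte b : routing.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    // shard = hash(routing) % number_of_primary_shards; routing falls back to the _id.
    static int shardFor(String id, String routing, int numberOfPrimaryShards) {
        String key = (routing != null) ? routing : id;
        return Math.floorMod(hash(key), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        // Without custom routing, the _id alone decides the shard.
        System.out.println(shardFor("1001", null, 5));
        // With custom routing, documents that share the routing value share a shard.
        System.out.println(shardFor("1001", "beijing", 5));
        System.out.println(shardFor("2002", "beijing", 5));
    }
}
```

Because the shard is derived from the key and the shard count, the same _id always lands on the same shard, which is what makes get-by-id and update-by-id cheap.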
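  • Dynamic mapping
    When Elasticsearch encounters an unknown field, it uses dynamic mapping to determine the field's data type and automatically adds the field to the type mapping. For example, you do not have to create the mapping yourself: ES maps each field according to the type of the value you index, so a string field is analyzed and stored. This is very handy when you add a field to an indexed object: there is no need to delete or modify the mapping, because re-indexing the data once brings the new field into the index.
    Sometimes this behaviour is not what you want, though. The mapping below, for example, cannot be produced by dynamic mapping: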
{
   "settings": {
      "index": {
         "number_of_replicas": "1",
         "number_of_shards": "5"
      }
   },
   "mappings": {
      "enterprise_basic_info": {
         "_all": {
            "enabled": false
         },
         "properties": {
            "id": {
               "type": "long" },       
            "enterpriseName": {
               "type": "string",
               "analyzer": "ik_max_word" },
            "address": {
               "type": "string",
               "analyzer": "ik_max_word" },
            "latitude": {
               "type": "double" },
            "longitude": {
               "type": "double" },
            "phone": {
               "type": "string",
               "index": "not_analyzed" },
            "businessCategory": {
               "type": "string",
               "index": "not_analyzed" },
            "cityName": {
               "type": "string",
               "index": "not_analyzed" },
            "districtName": {
               "type": "string",
               "index": "not_analyzed" },
            "valid": {
               "type": "long" },
            "location": {
               "type": "geo_point",
               "geohash_prefix": true,
               "geohash_precision": 12 }
         }
      }
   }
}

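Here the city and district names are strings, but they should not be analyzed. When building an index it is therefore still recommended to configure the mapping yourself.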
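To make the limitation concrete, here is a minimal sketch of the kind of type inference dynamic mapping performs on JSON values. It is a simplification under stated assumptions: the class and method names are hypothetical, date detection and arrays are left out, and the type names follow the "string"-era mappings used in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of dynamic mapping: guess a field type from the value alone (hypothetical helper). */
final class DynamicMappingSketch {

    static String inferType(Object value) {
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer || value instanceof Long) return "long";
        if (value instanceof Float || value instanceof Double) return "double";
        if (value instanceof Map) return "object";
        // Every string looks the same from the outside, so it becomes an analyzed string.
        return "string";
    }

    static Map<String, String> inferMapping(Map<String, Object> document) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : document.entrySet()) {
            mapping.put(field.getKey(), inferType(field.getValue()));
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", 1L);
        doc.put("latitude", 39.9);
        doc.put("cityName", "Beijing");
        System.out.println(inferMapping(doc));
    }
}
```

The value alone carries no hint that cityName should be not_analyzed, that enterpriseName needs the ik_max_word analyzer, or that location is a geo_point, which is why those choices have to be written into an explicit mapping.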

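  • Index aliases
    An index alias works a bit like a pointer: it stores no data and creates no new index; it simply points to an existing index. Aliases are commonly used to switch indices quickly. For example, if the alias my_index initially points to my_index1, you can repoint it to my_index2 without restarting anything or changing code.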
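The pointer analogy can be sketched in a few lines. This is only a toy model with hypothetical class and method names, not the Elasticsearch alias API: the alias table maps a name to a concrete index, and clients always resolve the name before searching.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy alias table: an alias holds no data, it only names the index to use (hypothetical helper). */
final class AliasSketch {
    private final ConcurrentMap<String, String> aliasToIndex = new ConcurrentHashMap<>();

    // Point (or repoint) an alias at a concrete index.
    void point(String alias, String index) {
        aliasToIndex.put(alias, index);
    }

    // Resolve a name: aliases map to their target, plain index names map to themselves.
    String resolve(String name) {
        return aliasToIndex.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        AliasSketch aliases = new AliasSketch();
        aliases.point("my_index", "my_index1");
        System.out.println(aliases.resolve("my_index"));
        aliases.point("my_index", "my_index2"); // switch without touching client code
        System.out.println(aliases.resolve("my_index"));
    }
}
```

Client code keeps using the name my_index throughout; only the table entry changes when the index is switched.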
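  • Differences between QueryBuilder and FilterBuilder
    During a search, a FilterBuilder acts as a filter: it first narrows all records down by the filter conditions, and the QueryBuilder query then runs on the filtered result. A FilterBuilder therefore also selects the records that satisfy given conditions, and its result is cached, so the next request with the same filter does not have to recompute it. In addition, unlike a QueryBuilder, a FilterBuilder does not compute document relevance, so it is faster. [Explanation from the official site]
    PS: in my experiment I did not see a meaningful speedup. Possibly QueryBuilder has been optimized, or my index is too small (10,000 records) to show a difference:
    See ElasticSearchConditionTest in the GitHub code referenced later: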
int times = 10000;

// Filter version: the conditions are passed as a filter.
SearchCondition filterCondition = new SearchCondition();
filterCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());

long beginTime1 = System.currentTimeMillis();
for (int i = 0; i < times; i++) {
    List<EnterpriseBasicInfoDTO> basicInfoDTOList = enterpriseSearchHandle.getListByCondition(filterCondition, null);
}
long beginTime2 = System.currentTimeMillis();
LOGGER.info("Ran filter {} times in {} s", times, (beginTime2 - beginTime1) / 1000);

// Query version: the same conditions are passed as a query.
SearchCondition queryCondition = new SearchCondition();
queryCondition.setQueryBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());
for (int i = 0; i < times; i++) {
    List<EnterpriseBasicInfoDTO> basicInfoDTOList = enterpriseSearchHandle.getListByCondition(queryCondition, null);
}
long beginTime3 = System.currentTimeMillis();
LOGGER.info("Ran query {} times in {} s", times, (beginTime3 - beginTime2) / 1000);

Result: over 10,000 runs the filter version is only a few seconds faster (27 s versus 34 s), which does not feel like a big difference.

18:10:51.718  INFO (ElasticSearchConditionTest.java:41) - Ran filter 10000 times in 27 s
18:11:26.416  INFO (ElasticSearchConditionTest.java:49) - Ran query 10000 times in 34 s
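The caching behaviour described for filters can be pictured with a small sketch. This is a toy model under stated assumptions, not how Elasticsearch implements its filter cache: the class name, the string cache key, and the evaluation counter are all hypothetical. The idea is that a filter yields a yes/no bit per document, so the bitset can be stored and reused the next time the same filter appears.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy filter cache: one bit per document, stored under a key that identifies the filter. */
final class FilterCacheSketch<D> {
    private final List<D> docs;
    private final Map<String, BitSet> cache = new HashMap<>();
    private int evaluations = 0;

    FilterCacheSketch(List<D> docs) {
        this.docs = docs;
    }

    // Return the cached bitset if this filter was seen before, otherwise compute and store it.
    BitSet filter(String cacheKey, Predicate<D> predicate) {
        BitSet cached = cache.get(cacheKey);
        if (cached != null) {
            return cached;
        }
        BitSet bits = new BitSet(docs.size());
        for (int i = 0; i < docs.size(); i++) {
            if (predicate.test(docs.get(i))) {
                bits.set(i); // no relevance score, only match or no match
            }
        }
        evaluations++;
        cache.put(cacheKey, bits);
        return bits;
    }

    // Number of times a filter actually had to scan the documents.
    int evaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        FilterCacheSketch<Integer> validFlags = new FilterCacheSketch<>(Arrays.asList(0, 1, 1, 0, 1));
        System.out.println(validFlags.filter("valid=1", v -> v == 1).cardinality());
        validFlags.filter("valid=1", v -> v == 1); // served from the cache
        System.out.println(validFlags.evaluations());
    }
}
```

With only 10,000 documents the scan being skipped is already cheap, which is consistent with the small gap measured above.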
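  • Fast distance-range lookup with GeoHash
    GeoHash is an algorithm for quickly finding other records within a neighbourhood (for example, merchants within 500 m). The main idea is to divide the Earth's surface into a grid of cells (32 per character in the standard base-32 encoding), label each cell with a different character, and subdivide every cell recursively in the same way. The longer the geohash you configure, the smaller the area each cell covers. To find the points within a range, you only need to fetch the cell that contains the centre point plus its neighbouring cells, and then run the expensive exact computation on that pre-filtered data.
    [Figure: precision for each geohash length]
    The figure above shows the precision for each geohash length. For example, an 11-character geohash cell is roughly 15 cm across, so it can locate neighbouring points at that resolution.
    To enable geohash on coordinates you need to add a dedicated field. For example, I have a set of POIs with latitude and longitude, and to support geohash range lookup I add a location field to the mapping.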
# In the mapping:
            "latitude": {
               "type": "double"
            },
            "longitude": {
               "type": "double"
            },
            "location": {
               "type": "geo_point",
               "geohash_prefix":true,
               "geohash_precision":12
            }
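# In the object type being indexed, the following is all you need (fastJson serialization decides whether a location field exists purely from the get and set methods, so you do not need to declare the field itself; the concrete implementation is in the GitHub link below):
public String getLocation() {
    return latitude + "," + longitude;
}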
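For readers who want to see the recursive subdivision in code, here is a minimal sketch of the standard geohash encoding: longitude and latitude ranges are halved alternately, each step yields one bit, and every five bits become one base-32 character. The class and method names are illustrative; Elasticsearch ships its own implementation, which this sketch does not call.

```java
/** Minimal geohash encoder (standard base-32 alphabet); a sketch, not the ES implementation. */
final class GeoHashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int length) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder hash = new StringBuilder(length);
        boolean useLon = true; // bits alternate, starting with longitude
        int bitCount = 0;
        int value = 0;
        while (hash.length() < length) {
            if (useLon) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else            { value = value << 1;       lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else            { value = value << 1;       latHi = mid; }
            }
            useLon = !useLon;
            if (++bitCount == 5) { // five bits select one of 32 characters
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // A longer hash of the same point always extends the shorter one.
        System.out.println(encode(57.64911, 10.40744, 4));
        System.out.println(encode(57.64911, 10.40744, 6));
    }
}
```

Since a longer hash only refines a shorter one, points that share a prefix lie in the same cell at that precision. That is the property a prefix-indexed location field relies on for the coarse pre-filtering step.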
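  • Concrete ElasticSearch operations
    term: exact match, no analysis. TermQueryBuilder
    queryString: analyzed match. QueryStringQueryBuilder. Whether QueryStringQueryBuilder.Operator is AND or OR decides whether a document must contain all of the analyzed terms or just one of them.
    prefix: exact prefix match. If the indexed field is analyzed, a document matches when one of its analyzed terms starts with the given prefix; if the field is not analyzed, the whole field value is checked for the prefix. PrefixQueryBuilder
    range: range match (greater than, less than, between ... and). A document matches when the field value lies within the range. RangeQueryBuilder
    notInId or idIn: select or exclude documents by id. IdsQueryBuilder
    fuzzy: fuzzy match based on the edit distance between strings. FuzzyQueryBuilder
    wildcard: match strings against a wildcard pattern. WildcardQueryBuilder
    geoDistance: distance-range match on geo coordinates, using one of several distance calculations. GeoDistanceQueryBuilder
    geoHash: approximate range match based on geohash codes. GeohashCellQuery.Builder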
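Two of the operations above rest on small, well-known computations, sketched here in plain Java. The edit distance is the plain Levenshtein distance, and the great-circle distance uses the haversine formula with an assumed mean Earth radius of 6,371 km. The class and method names are hypothetical, and Elasticsearch's own fuzzy and geo-distance implementations may differ in detail (for example in how transpositions are counted or which distance formula is selected).

```java
/** Plain-Java sketches of the computations behind fuzzy and geoDistance matching. */
final class MatchSketches {

    // Levenshtein distance: minimum number of single-character inserts, deletes, substitutions.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Haversine great-circle distance in meters (assumed Earth radius: 6,371,000 m).
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double earthRadius = 6_371_000.0;
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * earthRadius * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        System.out.println(editDistance("kitten", "sitting")); // prints 3
        System.out.println(haversineMeters(0, 0, 0, 1));       // one degree of longitude on the equator
    }
}
```

A fuzzy match then amounts to accepting terms whose distance to the query term stays under a threshold, and a geoDistance match to accepting points whose distance to the centre stays under the requested radius.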
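  • The general-purpose elasticsearch package I developed
    GitHub: https://github.com/cweeyii/elasticsearch-parent
    client package: wraps the construction of Query operations and encapsulates the search operations; very convenient to use.
    Key classes:
    Condition builder class for queries and filters: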
package com.cweeyii.operation;

import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.*;
import org.springframework.util.CollectionUtils;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Created by wenyi on 17/5/9.
 * Email:caowenyi@meituan.com
 */
public class OperationBuilderFactory {

    public static Builder builder() {
        return new Builder();
    }

    public static class Builder {
        private Map<OperationType, List<QueryBuilder>> queryBuilderMap = new ConcurrentHashMap<>();

        private Builder(){}

        public Builder term(String field, Object value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new TermQueryBuilder(field, value));
            return this;
        }

        public Builder queryString(String field, String value, OperationType operationType, QueryStringQueryBuilder.Operator operator) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new QueryStringQueryBuilder(value).field(field).defaultOperator(operator));
            return this;
        }

        public Builder queryString(String field, String value, OperationType operationType) {
            return queryString(field, value, operationType, QueryStringQueryBuilder.Operator.OR);
        }

        public Builder prefix(String field, String prefix, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new PrefixQueryBuilder(field, prefix));
            return this;
        }

        public Builder range(String field, Object from, Object to, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new RangeQueryBuilder(field).from(from).to(to));
            return this;
        }

        public Builder notInId(List<String> ids, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new IdsQueryBuilder().ids(ids));
            return this;
        }

        public Builder fuzzy(String field, Object value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new FuzzyQueryBuilder(field, value));
            return this;
        }

        public Builder wildcard(String field, String value, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new WildcardQueryBuilder(field, value));
            return this;
        }

        public Builder geoDistance(String field, double lat, double lon, double distance, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new GeoDistanceQueryBuilder(field).point(lat, lon).distance(distance, DistanceUnit.METERS));
            return this;
        }

        public Builder geoHash(String field, double lat, double lon, int precisionLevel, OperationType operationType) {
            List<QueryBuilder> queryBuilders = getQueryBuilders(operationType);
            queryBuilders.add(new GeohashCellQuery.Builder(field).point(lat, lon).precision(precisionLevel).neighbors(true));
            return this;
        }

        public QueryBuilder builder() {
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            List<QueryBuilder> mustBuilders = getQueryBuilders(OperationType.MUST);
            if (!CollectionUtils.isEmpty(mustBuilders)) {
                for (QueryBuilder queryBuilder : mustBuilders) {
                    boolQueryBuilder.must(queryBuilder);
                }
            }
            List<QueryBuilder> mustNotBuilders = getQueryBuilders(OperationType.MUST_NOT);
            if (!CollectionUtils.isEmpty(mustNotBuilders)) {
                for (QueryBuilder queryBuilder : mustNotBuilders) {
                    boolQueryBuilder.mustNot(queryBuilder);
                }
            }
            List<QueryBuilder> shouldBuilders = getQueryBuilders(OperationType.SHOULD);
            if (!CollectionUtils.isEmpty(shouldBuilders)) {
                for (QueryBuilder queryBuilder : shouldBuilders) {
                    boolQueryBuilder.should(queryBuilder);
                }
            }
            return boolQueryBuilder;
        }

        public List<QueryBuilder> getQueryBuilders(OperationType operationType) {
            // Atomically create the list for this operation type on first use.
            return queryBuilderMap.computeIfAbsent(operationType, key -> new ArrayList<>());
        }
    }


}
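The builder above groups clauses into MUST, MUST_NOT and SHOULD and hands them to a bool query. As a rough mental model of how such a bool query decides whether a document matches, here is a toy evaluator over plain predicates. It is a sketch under stated assumptions: the class is hypothetical, scoring is ignored, and the should rule shown (at least one should clause is required only when there are no must clauses) is the default behaviour as I understand it, without minimum_should_match tuning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy bool-query matcher over predicates; matching only, no relevance scoring. */
final class BoolSketch<D> {
    private final List<Predicate<D>> must = new ArrayList<>();
    private final List<Predicate<D>> mustNot = new ArrayList<>();
    private final List<Predicate<D>> should = new ArrayList<>();

    BoolSketch<D> must(Predicate<D> clause) { must.add(clause); return this; }
    BoolSketch<D> mustNot(Predicate<D> clause) { mustNot.add(clause); return this; }
    BoolSketch<D> should(Predicate<D> clause) { should.add(clause); return this; }

    boolean matches(D doc) {
        for (Predicate<D> clause : must) {
            if (!clause.test(doc)) return false;   // every must clause has to match
        }
        for (Predicate<D> clause : mustNot) {
            if (clause.test(doc)) return false;    // no must_not clause may match
        }
        if (must.isEmpty() && !should.isEmpty()) { // should becomes mandatory without must
            for (Predicate<D> clause : should) {
                if (clause.test(doc)) return true;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        BoolSketch<String> query = new BoolSketch<String>()
                .must(address -> address.contains("Road"))
                .mustNot(address -> address.startsWith("Closed"));
        System.out.println(query.matches("12 Yunzhong Road"));
        System.out.println(query.matches("Closed: 3 Yunzhong Road"));
    }
}
```

When must clauses are present, should clauses in a real bool query only influence the score, which this matcher does not model.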

Search condition wrapper class: encapsulates the search conditions, sorting, and aggregations

public class SearchCondition {
    private QueryBuilder queryBuilder = null;
    private QueryBuilder filterBuilder = null;
    private List<SortBuilder> orders = new ArrayList<>();
    private List<AbstractAggregationBuilder> aggregationBuilders = new ArrayList<>();

    private SearchType searchType;
    private int limit = 20;
    private int offset = 0;
    private int total = 0;

    public List<AbstractAggregationBuilder> getAggregationBuilders() {
        return aggregationBuilders;
    }

    public void setAggregationBuilders(List<AbstractAggregationBuilder> aggregationBuilders) {
        this.aggregationBuilders = aggregationBuilders;
    }

    public List<SortBuilder> getOrders() {
        return orders;
    }

    public void setOrders(List<SortBuilder> orders) {
        this.orders = orders;
    }

    public int getTotal() {
        return total;
    }

    public SearchCondition setTotal(int total) {
        this.total = total;
        return this;
    }

    public SearchCondition orderBy(String field, double lat, double lon, SortOrder order, GeoDistance geoDistance) {
        if (!StringUtils.isEmpty(field)) {
            orders.add(new GeoDistanceSortBuilder(field).order(order).point(lat, lon).geoDistance(geoDistance));
        }
        return this;
    }

    public SearchCondition orderBy(String field, double lat, double lon) {
        return orderBy(field, lat, lon, SortOrder.ASC, GeoDistance.DEFAULT);
    }

    public SearchCondition orderBy(String field, SortOrder order) {
        if (!StringUtils.isEmpty(field)) {
            orders.add(new FieldSortBuilder(field).order(order));
        }
        return this;
    }

    public SearchCondition orderBy(String field) {
        return orderBy(field, SortOrder.ASC);
    }

    public QueryBuilder getQueryBuilder() {
        if (queryBuilder == null) {
            return QueryBuilders.matchAllQuery();
        }
        return queryBuilder;
    }

    public SearchCondition setQueryBuilder(QueryBuilder queryBuilder) {
        this.queryBuilder = queryBuilder;
        return this;
    }

    public QueryBuilder getFilterBuilder() {
        return filterBuilder;
    }

    public SearchCondition setFilterBuilder(QueryBuilder filterBuilder) {
        this.filterBuilder = filterBuilder;
        return this;
    }

    public SearchCondition setAggregation(String field, double lat, double lon, Pair<Double, Double>... rangePoints) {
        if (!StringUtils.isEmpty(field)) {
            GeoDistanceBuilder geoDistanceBuilder = new GeoDistanceBuilder(field).point(new GeoPoint(lat, lon)).unit(DistanceUnit.METERS);
            for (Pair<Double, Double> rangePoint : rangePoints) {
                geoDistanceBuilder.addRange(rangePoint.getFirst(), rangePoint.getSecond());
            }
            aggregationBuilders.add(geoDistanceBuilder);
        }
        return this;
    }

    public SearchType getSearchType() {
        return searchType;
    }

    public SearchCondition setSearchType(SearchType searchType) {
        this.searchType = searchType;
        return this;
    }

    public int getLimit() {
        return limit;
    }

    public SearchCondition setLimit(int limit) {
        this.limit = limit;
        return this;
    }

    public int getOffset() {
        return offset;
    }

    public SearchCondition setOffset(int offset) {
        this.offset = offset;
        return this;
    }
}

Usage:

SearchCondition searchCondition = new SearchCondition();
// Filter on address (analyzed match) and valid = 1.
searchCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .queryString("address", "云中", OperationType.MUST)
        .term("valid", 1, OperationType.MUST).builder());
// Geohash filter around (lat, lon), sorted by distance and then by id.
// Note: setFilterBuilder replaces any filter set earlier on the same condition.
searchCondition.setFilterBuilder(OperationBuilderFactory.builder()
        .geoHash("location", lat, lon, 5, OperationType.MUST).builder())
        .orderBy("location", lat, lon, SortOrder.ASC, GeoDistance.ARC)
        .orderBy("id", SortOrder.ASC)
        .setOffset(0)
        .setLimit(100);