Getting Started with Solr: Reading Notes on the Official 6.0 Documentation (Part 4)

Part 2: Documents, Fields, and Schema Design
Defining Fields
Defining a field is straightforward.
Example
<field name="price" type="float" default="0.0" indexed="true" stored="true"/>
This example sets a default value. Any property not set on the field is inherited from its field type.
Field Properties
name, type, default
Optional Field Type Override Properties
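Properties set explicitly on a field override the ones declared on its field type, as well as the implicit defaults.
A field accepts the same properties as a field type.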
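The override order described above (field, then field type, then implicit defaults) can be sketched in a few lines of Python. This is only an illustration of the precedence rule, not Solr's implementation: the `effective_properties` helper, the `ILLUSTRATIVE_DEFAULTS` table, and the sample field type are all hypothetical.

```python
# Hypothetical defaults for illustration only; the real implicit defaults
# depend on the field type class and the schema version.
ILLUSTRATIVE_DEFAULTS = {"indexed": True, "stored": True, "multiValued": False}


def effective_properties(field, field_type, defaults=ILLUSTRATIVE_DEFAULTS):
    """Merge properties so that field > field type > defaults."""
    merged = dict(defaults)
    # Field type properties override the defaults.
    merged.update({k: v for k, v in field_type.items() if k not in ("name", "class")})
    # Properties set on the field itself override everything else.
    merged.update({k: v for k, v in field.items() if k not in ("name", "type")})
    return merged


# A made-up field type that turns "stored" off, and a field that turns it back on.
float_type = {"name": "float", "class": "solr.TrieFloatField", "stored": False}
price = {"name": "price", "type": "float", "stored": True, "default": "0.0"}

props = effective_properties(price, float_type)
print(props)
```

Here `stored` ends up true because the field sets it explicitly, while `indexed` falls through to the assumed default.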

Related Topics
SchemaXML-Fields
Field Options by Use Case


Copying Fields

<copyField source="cat" dest="text" maxChars="30000" />

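maxChars limits the number of characters copied.
The typical use case is a default search that spans several fields: copy them all into a single field and declare that destination field as multivalued:
multiValued="true"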

Wildcards can also be used in the source:
<copyField source="*_t" dest="text" maxChars="25000" />
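To make the effect of the two rules above concrete, here is a rough Python simulation of what copyField does to an incoming document. It is a sketch under assumptions: `apply_copy_fields` is a hypothetical helper, glob matching is approximated with `fnmatchcase`, and the real copy happens inside Solr at index time, before analysis.

```python
from fnmatch import fnmatchcase


def apply_copy_fields(doc, rules):
    """Return a new document with copyField-style rules applied.

    Each rule is (source_glob, dest, max_chars); max_chars may be None.
    Only the original fields are used as sources, so copies are not chained.
    """
    copied = {}
    for source_glob, dest, max_chars in rules:
        for name, value in doc.items():
            if not fnmatchcase(name, source_glob):
                continue
            values = value if isinstance(value, list) else [value]
            for v in values:
                text = str(v)
                if max_chars is not None:
                    text = text[:max_chars]  # truncate, like maxChars
                copied.setdefault(dest, []).append(text)

    result = dict(doc)
    for dest, values in copied.items():
        existing = result.get(dest, [])
        if not isinstance(existing, list):
            existing = [existing]
        # The destination collects several values, hence multiValued="true".
        result[dest] = existing + values
    return result


doc = {"id": "1", "cat": "electronics", "title_t": "abcdefgh"}
rules = [("cat", "text", 30000), ("*_t", "text", 5)]
result = apply_copy_fields(doc, rules)
print(result["text"])
```

The destination ends up holding one value per copied source, which is why it needs to be multivalued.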

Dynamic Fields
A dynamic field is defined just like a regular field, except that its name contains the wildcard *.
<dynamicField name="*_i" type="int" indexed="true" stored="true"/>
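Field names are matched against explicitly defined fields first. Dynamic fields are tried only if no explicit field matches.
The official documentation recommends defining a few basic dynamic fields.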
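The lookup order can be sketched as follows. This is a simplified model, not Solr's code: `resolve_field` is a hypothetical helper, and when several dynamic patterns match it just takes the first one in the list, which is an assumption made for brevity.

```python
from fnmatch import fnmatchcase


def resolve_field(name, explicit_fields, dynamic_rules):
    """Return the field type for name, or None if nothing matches.

    explicit_fields: dict of field name -> type name
    dynamic_rules:   list of (glob pattern, type name)
    """
    # 1. Explicitly defined fields always win.
    if name in explicit_fields:
        return explicit_fields[name]
    # 2. Fall back to dynamic field patterns (first match, by assumption).
    for pattern, type_name in dynamic_rules:
        if fnmatchcase(name, pattern):
            return type_name
    return None


explicit = {"price": "float", "views_i": "long"}
dynamic = [("*_i", "int"), ("*_s", "string")]

print(resolve_field("price", explicit, dynamic))
print(resolve_field("count_i", explicit, dynamic))
print(resolve_field("views_i", explicit, dynamic))
```

Note that `views_i` resolves to the explicit definition even though it also matches `*_i`.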
Related Topics
SchemaXML-Dynamic Field


Other Schema Elements

Unique Key
<uniqueKey>id</uniqueKey>
The uniqueKey field cannot be populated by a copyField.
It cannot be multivalued.

Default Search Field & Query Operator

 <defaultSearchField>text</defaultSearchField>
 <solrQueryParser defaultOperator="OR"/>   (AND|OR) 

These elements are still supported but may be dropped in a future release. Request parameter defaults can be used instead:
the df parameter and the q.op parameter.

Similarity

This appears to be Solr's core relevance scoring mechanism (worth a closer look later).

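There must be one global similarity class. The default is BM25Similarity.
A similarity can also be declared per field type, either as a concrete class or through a factory. Factory reference: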
lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/schema/SimilarityFactory.html

<similarity class="solr.SchemaSimilarityFactory">
<str name="defaultSimFromFieldType">text_dfr</str>
</similarity>
<fieldType name="text_dfr" class="solr.TextField">
<analyzer ... />
<similarity class="solr.DFRSimilarityFactory">
<str name="basicModel">I(F)</str>
<str name="afterEffect">B</str>
<str name="normalization">H3</str>
<float name="mu">900</float>
</similarity>
</fieldType>
<fieldType name="text_ib" class="solr.TextField">
<analyzer ... />
<similarity class="solr.IBSimilarityFactory">
<str name="distribution">SPL</str>
<str name="lambda">DF</str>
<str name="normalization">H2</str>
</similarity>
</fieldType>
<fieldType name="text_other" class="solr.TextField">
<analyzer ... />
</fieldType>

Schema API
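The Schema API provides read and write access to the schema. Further enhancements are planned.
After the schema is changed, existing data must be reindexed for the change to take effect.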

When the schema is modified through the API, the updated configuration is reloaded automatically.
Base path: http://<host>:<port>/solr/<collection_name>

Contents:
API Entry Points
Modify the Schema
  Add a New Field
  Delete a Field
  Replace a Field
  Add a Dynamic Field Rule
  Delete a Dynamic Field Rule
  Replace a Dynamic Field Rule
  Add a New Field Type
  Delete a Field Type
  Replace a Field Type
  Add a New Copy Field Rule
  Delete a Copy Field Rule
  Multiple Commands in a Single POST
  Schema Changes among Replicas
Retrieve Schema Information
  Retrieve the Entire Schema
  List Fields
  List Dynamic Fields
  List Field Types
  List Copy Fields
  Show Schema Name
  Show the Schema Version
  List UniqueKey
  Show Global Similarity
  Get the Default Query Operator
Manage Resource Data

Examples:
API Entry Points
/schema: retrieve the schema, or modify the schema to add, remove, or replace fields, dynamic fields, copy fields, or field types
/schema/fields: retrieve information about all defined fields or a specific named field
/schema/dynamicfields: retrieve information about all dynamic field rules or a specific named dynamic rule
/schema/fieldtypes: retrieve information about all field types or a specific field type
/schema/copyfields: retrieve information about copy fields
/schema/name: retrieve the schema name
/schema/version: retrieve the schema version
/schema/uniquekey: retrieve the defined uniqueKey
/schema/similarity: retrieve the global similarity definition
/schema/solrqueryparser/defaultoperator: retrieve the default operator


Modify the Schema

POST   /collection/schema

add-field: add a new field with parameters you provide.
delete-field: delete a field.
replace-field: replace an existing field with one that is differently configured.
add-dynamic-field: add a new dynamic field rule with parameters you provide.
delete-dynamic-field: delete a dynamic field rule.
replace-dynamic-field: replace an existing dynamic field rule with one that is differently configured.
add-field-type: add a new field type with parameters you provide.
delete-field-type: delete a field type.
replace-field-type: replace an existing field type with one that is differently configured.
add-copy-field: add a new copy field rule.
delete-copy-field: delete a copy field rule.


Examples:
Add a New Field
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{
"name":"sell-by",
"type":"tdate",
"stored":true }
}' http://localhost:8983/solr/gettingstarted/schema


Delete a Field
curl -X POST -H 'Content-type:application/json' --data-binary '{
"delete-field" : { "name":"sell-by" }
}' http://localhost:8983/solr/gettingstarted/schema


Replace a Field

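A field cannot be partially modified: the full definition must be supplied (equivalent to deleting and re-creating the field).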
curl -X POST -H 'Content-type:application/json' --data-binary '{
"replace-field":{
"name":"sell-by",
"type":"date",
"stored":false }
}' http://localhost:8983/solr/gettingstarted/schema


Add a Dynamic Field Rule
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-dynamic-field":{
"name":"*_s",
"type":"string",
"stored":true }
}' http://localhost:8983/solr/gettingstarted/schema

Delete a Dynamic Field Rule
curl -X POST -H 'Content-type:application/json' --data-binary '{
"delete-dynamic-field":{ "name":"*_s" }
}' http://localhost:8983/solr/gettingstarted/schema


Replace a Dynamic Field Rule
curl -X POST -H 'Content-type:application/json' --data-binary '{
"replace-dynamic-field":{
"name":"*_s",
"type":"text_general",
"stored":false }
}' http://localhost:8983/solr/gettingstarted/schema


Add a New Field Type
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type" : {
"name":"myNewTxtField",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer" : {
"charFilters":[{
"class":"solr.PatternReplaceCharFilterFactory",
"replacement":"$1$1",
"pattern":"([a-zA-Z])\\\\1+" }],
"tokenizer":{ 
"class":"solr.WhitespaceTokenizerFactory" },
"filters":[{
"class":"solr.WordDelimiterFilterFactory",
"preserveOriginal":"0" }]}}
}' http://localhost:8983/solr/gettingstarted/schema

Index-time and query-time analyzers can also be defined separately:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type":{
"name":"myNewTextField",
"class":"solr.TextField",
"indexAnalyzer":{
"tokenizer":{
"class":"solr.PathHierarchyTokenizerFactory", 
"delimiter":"/" }},
"queryAnalyzer":{
"tokenizer":{ 
"class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema


Delete a Field Type
curl -X POST -H 'Content-type:application/json' --data-binary '{
"delete-field-type":{ "name":"myNewTxtField" }
}' http://localhost:8983/solr/gettingstarted/schema


Replace a Field Type

curl -X POST -H 'Content-type:application/json' --data-binary '{
"replace-field-type":{
"name":"myNewTxtField",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer":{
"tokenizer":{ 
"class":"solr.StandardTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema


Add a New Copy Field Rule
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-copy-field":{
"source":"shelf",
"dest":[ "location", "catchall" ]}
}' http://localhost:8983/solr/gettingstarted/schema


Delete a Copy Field Rule
curl -X POST -H 'Content-type:application/json' --data-binary '{
"delete-copy-field":{ "source":"shelf", "dest":"location" }
}' http://localhost:8983/solr/gettingstarted/schema


Multiple Commands in a Single POST
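Commands in one POST are transactional: either all succeed or all fail.
Three forms: different commands in sequence, the same command repeated, and repeated commands as an array.
Create a new field type and then a field that uses that type: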

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type":{
"name":"myNewTxtField",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer":{
"charFilters":[{
"class":"solr.PatternReplaceCharFilterFactory",
"replacement":"$1$1",
"pattern":"([a-zA-Z])\\\\1+" }],
"tokenizer":{ 
"class":"solr.WhitespaceTokenizerFactory" },
"filters":[{
"class":"solr.WordDelimiterFilterFactory",
"preserveOriginal":"0" }]}},
"add-field" : {
"name":"sell-by",
"type":"myNewTxtField",
"stored":true }
}' http://localhost:8983/solr/gettingstarted/schema


The same command can be repeated, as in this example:

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{
"name":"shelf",
"type":"myNewTxtField",
"stored":true },
"add-field":{
"name":"location",
"type":"myNewTxtField",
"stored":true },
"add-copy-field":{
"source":"shelf",
"dest":[ "location", "catchall" ]}
}' http://localhost:8983/solr/gettingstarted/schema


Repeated commands can also be sent as an array:

curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":[
{ "name":"shelf",
"type":"myNewTxtField",
"stored":true },
{ "name":"location",
"type":"myNewTxtField",
"stored":true }]
}' http://localhost:8983/solr/gettingstarted/schema
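When the payload is built programmatically, the array form is the practical one, because a Python dict (like most JSON object models) cannot hold the same key twice. A minimal sketch, with a hypothetical `batch_payload` helper:

```python
import json


def batch_payload(commands):
    """Group (command, body) pairs into one JSON object.

    Repeated commands are collected into an array. Note that grouping
    changes the relative order of interleaved commands, so this sketch
    assumes the commands do not depend on being interleaved.
    """
    grouped = {}
    for command, body in commands:
        grouped.setdefault(command, []).append(body)
    return json.dumps(
        {cmd: bodies[0] if len(bodies) == 1 else bodies
         for cmd, bodies in grouped.items()}
    )


payload = batch_payload([
    ("add-field", {"name": "shelf", "type": "myNewTxtField", "stored": True}),
    ("add-field", {"name": "location", "type": "myNewTxtField", "stored": True}),
    ("add-copy-field", {"source": "shelf", "dest": ["location", "catchall"]}),
])
print(payload)
```

The resulting string can be used as the --data-binary body of the curl calls shown above.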


Schema Changes among Replicas

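If updateTimeoutSecs is not set, an update issued against one core is only persisted to ZooKeeper before success is returned. The other replicas apply the change asynchronously afterwards, so there is no way to tell whether every replica was updated successfully.
It is best to set updateTimeoutSecs: if the replicas have not all applied the change within the timeout, an error is returned.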
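A small sketch of how the timeout might be attached to the update URL, assuming updateTimeoutSecs is passed as a query parameter on the POST to /schema (the notes above do not show the exact form). The `schema_update_url` helper and the value 15 are illustrative only.

```python
from urllib.parse import urlencode


def schema_update_url(base_url, update_timeout_secs=None):
    """Build the /schema URL, optionally with an updateTimeoutSecs parameter."""
    url = base_url.rstrip("/") + "/schema"
    if update_timeout_secs is not None:
        # Assumption: the timeout is given in seconds as a query parameter.
        url += "?" + urlencode({"updateTimeoutSecs": update_timeout_secs})
    return url


url = schema_update_url("http://localhost:8983/solr/gettingstarted", 15)
print(url)
```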



Retrieve Schema Information

Retrieve the Entire Schema
GET /collection/schema

Path Parameters
collection The collection (or core) name.

Query Parameters
wt (string, optional, default: json) Defines the format of the response. The options are json, xml or schema.xml. If not specified, JSON will be returned by default.

Output (omitted)

List Fields

GET /collection/schema/fields
GET /collection/schema/fields/fieldname
Path Parameters

collection The collection (or core) name.
fieldname The specific fieldname (if limiting request to a single field)

Query Parameters
wt
fl
includeDynamic
showDefaults (this one is extremely handy)
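As an illustration of how the path and query parameters above combine, here is a small URL builder. The `fields_url` helper is hypothetical, and the base URL reuses the collection name from the earlier curl examples. Booleans are lowercased because Python would otherwise render them as "True"/"False".

```python
from urllib.parse import quote, urlencode


def fields_url(base_url, fieldname=None, **params):
    """Build a GET URL for /schema/fields, optionally for a single field."""
    url = base_url.rstrip("/") + "/schema/fields"
    if fieldname is not None:
        url += "/" + quote(fieldname, safe="")
    # Render Python booleans as lowercase true/false.
    query = {k: str(v).lower() if isinstance(v, bool) else v
             for k, v in params.items()}
    if query:
        url += "?" + urlencode(query)
    return url


one_field = fields_url("http://localhost:8983/solr/gettingstarted",
                       "price", showDefaults=True, wt="json")
all_fields = fields_url("http://localhost:8983/solr/gettingstarted",
                        includeDynamic=True)
print(one_field)
print(all_fields)
```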

List Dynamic Fields

GET /collection/schema/dynamicfields
GET /collection/schema/dynamicfields/name
Path Parameters
Same as above.
Query Parameters
wt
showDefaults

List Field Types

GET /collection/schema/fieldtypes
GET /collection/schema/fieldtypes/name

Request parameters are the same as above.

List Copy Fields

GET /collection/schema/copyfields
Path Parameters
collection The collection (or core) name.
Query Parameters
wt
source.fl
dest.fl

Show Schema Name

GET /collection/schema/name
What is the schema version information actually useful for?
List UniqueKey
GET /collection/schema/uniquekey

Show Global Similarity
GET /collection/schema/similarity

Get the Default Query Operator

GET /collection/schema/solrqueryparser/defaultoperator

Manage Resource Data

Plugins can use the API to manage resource data associated with Solr's schema (schema.xml).
See the Managed Resources section for more information and examples.


