solr_day03: Basic Solr usage

冰蓝心灵

License: CC 4.0 BY-SA

Category: Full-text search

Article link: https://blog.youkuaiyun.com/qq_35537301/article/details/82659927

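This article walks through using SolrJ to operate a Solr server, covering the basic index operations (add, update, delete, query). It then explains how to configure schema.xml, set up a Chinese analyzer, and import data from a database.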

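1. SolrJ

1.1 What SolrJ is

SolrJ is the Java client for accessing a Solr service. It provides request methods for indexing and searching.

1.2 Steps

Create an HttpSolrServer object and use it to connect to the Solr server.
Create a SolrInputDocument object and add fields to it.
Add the SolrInputDocument to the index through the HttpSolrServer object, then commit.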

package com.itheima.test;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.common.SolrInputDocument;
import org.junit.Test;

/**
 * @ClassName: TestSolrj
 * @Description: Demonstrates operating a Solr server with SolrJ
 * @date 2018-09-12
 */
public class TestSolrj {

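	/**
	 * @MethodName: testCreateOrUpdate
	 * @Description: Add or update a document. Adding a document whose id
	 *               already exists replaces the existing one.
	 * @throws Exception
	 */
	@Test
	public void testCreateOrUpdate() throws Exception {

		// 1. Create the HttpSolrServer connection to the Solr server
		HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8080/solr/");

		// 2. Create the document object: SolrInputDocument
		SolrInputDocument doc = new SolrInputDocument();

		// 3. Add field data to the document
		doc.addField("id", "c001");
		doc.addField("name", "瓜子");

		// 4. Add the document to the index
		solrServer.add(doc);

		// 5. Commit so the change becomes visible to searches
		solrServer.commit();
	}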

	/**
	 * @MethodName: testDeleteById
	 * @Description: Delete a document by id
	 * @throws Exception
	 */
	@Test
	public void testDeleteById() throws Exception {
		// 1. Create the HttpSolrServer connection to the Solr server
		HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8080/solr/");

		// 2. Delete by id
		solrServer.deleteById("c001");

		// 3. Commit
		solrServer.commit();
	}

	/**
	 * @MethodName: testDeleteAll
	 * @Description: Delete all documents
	 * @throws Exception
	 */
	@Test
	public void testDeleteAll() throws Exception {
		// 1. Create the HttpSolrServer connection to the Solr server
		HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8080/solr/");

		// 2. Delete everything: the query *:* matches all documents
		solrServer.deleteByQuery("*:*");

		// 3. Commit
		solrServer.commit();
	}

	/**
	 * @MethodName: testQuery
	 * @Description: Query
	 * @throws Exception
	 */
	@Test
	public void testQuery() throws Exception {
		// 1. Create the HttpSolrServer connection to the Solr server
		HttpSolrServer solrServer = new HttpSolrServer("http://localhost:8080/solr/");

		// 2. Create the query object that holds the search conditions
		SolrQuery query = new SolrQuery();
		query.setQuery("*:*");

		// 3. Run the query and get the response
		QueryResponse queryResponse = solrServer.query(query);

		// 4. Get the document list from the response.
		// size() counts the documents returned in this page of results;
		// getNumFound() is the total number of matches in the index.
		SolrDocumentList documentList = queryResponse.getResults();
		System.err.println("====documents returned (size)====" + documentList.size());
		System.err.println("====total matches (numFound)====" + documentList.getNumFound());

		// 5. Iterate over the documents
		for (SolrDocument doc : documentList) {
			// 6. Print the document fields
			System.err.println("====id=====" + doc.get("id"));
			System.err.println("====name=====" + doc.get("name"));
		}
	}

}
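
For orientation, the query in testQuery corresponds to a plain HTTP request against Solr's select handler. The sketch below builds such a URL by hand using only the JDK. It is an illustration, not how SolrJ is implemented. The class SelectUrlSketch and its method names are hypothetical, the core name collection1 is taken from the paths used elsewhere in this article, and the start, rows and wt values are assumptions for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: builds a /select URL by hand.
class SelectUrlSketch {

	// URL-encode one parameter name or value as UTF-8.
	static String encode(String value) {
		try {
			return URLEncoder.encode(value, "UTF-8");
		} catch (UnsupportedEncodingException e) {
			// UTF-8 is always available on the JVM, so this should not happen.
			throw new IllegalStateException(e);
		}
	}

	// Assemble <baseUrl>/<core>/select?q=...&start=...&rows=...&wt=json
	static String buildSelectUrl(String baseUrl, String core, String q, int start, int rows) {
		Map<String, String> params = new LinkedHashMap<>();
		params.put("q", q);                          // the query string, e.g. *:*
		params.put("start", String.valueOf(start));  // offset of the first result
		params.put("rows", String.valueOf(rows));    // page size
		params.put("wt", "json");                    // response format

		StringBuilder url = new StringBuilder(baseUrl);
		if (!baseUrl.endsWith("/")) {
			url.append('/');
		}
		url.append(core).append("/select");
		char separator = '?';
		for (Map.Entry<String, String> param : params.entrySet()) {
			url.append(separator)
				.append(encode(param.getKey()))
				.append('=')
				.append(encode(param.getValue()));
			separator = '&';
		}
		return url.toString();
	}

	public static void main(String[] args) {
		System.out.println(buildSelectUrl("http://localhost:8080/solr/", "collection1", "*:*", 0, 10));
	}
}
```

The start and rows parameters control paging, which is why size() in testQuery can be smaller than getNumFound(): size() only counts the documents in the returned page.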

2. Using schema.xml

2.1 Purpose

It defines the fields that ship with Solr.
It is where developers define their own custom fields.

2.2 Location

solrTomcat\solrHome\collection1\conf

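2.3 Contents

field

name: the field name
type: the field type
indexed: whether the field is indexed (searchable)
stored: whether the original value is stored (and can be returned in results)
required: whether the field is mandatory
multiValued: whether the field may hold multiple values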

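dynamicField

name: the field name pattern. The * is a wildcard: with a pattern such as *_s, any field name ending in _s is accepted.
type: the field type
indexed: whether the field is indexed
stored: whether the value is stored
multiValued: whether the field may hold multiple values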
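
The wildcard rule can be sketched in a few lines of Java. This is a simplified illustration, not Solr's own matching code. DynamicFieldSketch and matches are hypothetical names, the field names such as title_s are made up, and the sketch assumes the * appears only at the start or the end of the pattern, which is what Solr's dynamic fields allow.

```java
// Hypothetical illustration of dynamicField name matching; not Solr's code.
class DynamicFieldSketch {

	// Returns true if fieldName fits a pattern such as "*_s" or "attr_*".
	static boolean matches(String pattern, String fieldName) {
		if (pattern.startsWith("*")) {
			// Leading wildcard: the field name must end with the rest of the pattern.
			return fieldName.endsWith(pattern.substring(1));
		}
		if (pattern.endsWith("*")) {
			// Trailing wildcard: the field name must start with the pattern prefix.
			return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
		}
		// No wildcard: only an exact name matches.
		return pattern.equals(fieldName);
	}

	public static void main(String[] args) {
		System.out.println(matches("*_s", "title_s"));      // true
		System.out.println(matches("*_s", "title_i"));      // false
		System.out.println(matches("attr_*", "attr_color")); // true
	}
}
```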

uniqueKey

The unique key: the id field uniquely identifies each document.
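
The unique key is also the reason the same add call serves as both "create" and "update": adding a document whose id already exists replaces the earlier one. A minimal in-memory sketch of that behavior follows. UniqueKeySketch is a hypothetical class that stands in for the index, and the name values are made-up sample data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an index keyed by the uniqueKey field "id".
class UniqueKeySketch {

	private final Map<String, Map<String, String>> docsById = new LinkedHashMap<>();

	// Adds a document; a document with the same id is overwritten.
	void add(Map<String, String> doc) {
		String id = doc.get("id");
		if (id == null) {
			throw new IllegalArgumentException("document is missing the uniqueKey field: id");
		}
		docsById.put(id, new LinkedHashMap<>(doc));
	}

	int size() {
		return docsById.size();
	}

	String nameOf(String id) {
		Map<String, String> doc = docsById.get(id);
		return doc == null ? null : doc.get("name");
	}

	// Convenience factory for a two-field document.
	static Map<String, String> doc(String id, String name) {
		Map<String, String> doc = new LinkedHashMap<>();
		doc.put("id", id);
		doc.put("name", name);
		return doc;
	}

	public static void main(String[] args) {
		UniqueKeySketch index = new UniqueKeySketch();
		index.add(doc("c001", "sunflower seeds"));
		index.add(doc("c001", "peanuts")); // same id: replaces the first document
		System.out.println(index.size() + " document(s), name=" + index.nameOf("c001"));
	}
}
```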

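copyField

source: the source field
dest: the destination field
A copyField copies the data from the source field into the destination field, so a query only needs to search the destination field.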

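fieldType

name: the name of the field type
class: the Solr class that implements the field type
analyzer: the analyzer. The most important part of a fieldType definition is the analyzer applied to data of this type at index time and at query time, which covers tokenization and filtering.
type: index or query. index applies when building the index, query applies when searching it.
tokenizer: the tokenizer to use
filter: a filter to apply to the tokens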

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
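
To make the order of the index-time chain concrete, here is a rough emulation in plain Java: tokenize, drop stop words ignoring case, then lower-case. It is only an approximation under stated assumptions. The real StandardTokenizer follows Unicode text segmentation rules and is far more elaborate than the regular-expression split used here, AnalysisChainSketch is a hypothetical class, and the stop words are made up (the actual list comes from stopwords.txt).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical emulation of tokenizer -> stop filter -> lower-case filter.
class AnalysisChainSketch {

	// Made-up stop words; in Solr they are read from stopwords.txt.
	static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

	static List<String> analyze(String text) {
		List<String> tokens = new ArrayList<>();
		// 1. Tokenizer: split on runs of characters that are neither letters nor digits.
		for (String token : text.split("[^\\p{L}\\p{N}]+")) {
			if (token.isEmpty()) {
				continue; // split() can yield an empty leading token
			}
			// 2. Stop filter with ignoreCase="true": compare in lower case.
			if (STOP_WORDS.contains(token.toLowerCase(Locale.ROOT))) {
				continue;
			}
			// 3. Lower-case filter.
			tokens.add(token.toLowerCase(Locale.ROOT));
		}
		return tokens;
	}

	public static void main(String[] args) {
		System.out.println(analyze("The Price of Solr")); // prints [price, solr]
	}
}
```

The query-time chain in the configuration above has the same shape, with a synonym filter inserted before the lower-case step.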

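3. Configuring a Chinese analyzer

3.1 Add the Chinese analyzer JAR

Copy the IKAnalyzer core JAR into \solrTomcat\apache-tomcat-7.0.52\webapps\solr\WEB-INF\lib.
Create a classes folder under \solrTomcat\apache-tomcat-7.0.52\webapps\solr\WEB-INF.

3.2 Add the analyzer's core configuration file and two dictionaries

Copy the IKAnalyzer core configuration file and its two dictionaries into \solrTomcat\apache-tomcat-7.0.52\webapps\solr\WEB-INF\classes.

3.3 Configure the Chinese analyzer in the Solr core

Edit solrTomcat\solrHome\collection1\conf\schema.xml to define a custom field type that uses the Chinese analyzer, plus a custom field of that type: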

<!-- IKAnalyzer-->
<fieldType name="text_ik" class="solr.TextField">
    <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
</fieldType>

<!--IKAnalyzer Field-->
<field name="content_ik" type="text_ik" indexed="true" stored="true" />

3.4 Start Tomcat

4. Configuring business fields

Add the following to solrHome\collection1\conf\schema.xml:

	<!--product-->
	<field name="product_name" type="text_ik" indexed="true" stored="true"/>
	<field name="product_price"  type="float" indexed="true" stored="true"/>
	<field name="product_description" type="text_ik" indexed="true" stored="false" />
	<field name="product_picture" type="string" indexed="false" stored="true" />
	<field name="product_catalog_name" type="string" indexed="true" stored="true" />
	<field name="product_keywords" type="text_ik" indexed="true" stored="false" multiValued="true"/>

	<copyField source="product_name" dest="product_keywords"/>
	<copyField source="product_description" dest="product_keywords"/>
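
The effect of the two copyField rules above can be sketched in plain Java. This is a simplified model of what ends up in each field at index time, not Solr's implementation. CopyFieldSketch and applyCopyFields are hypothetical names, and the product values are made-up sample data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of copyField: source values are appended to the destination field.
class CopyFieldSketch {

	// {source, dest} pairs mirroring the two copyField rules in schema.xml.
	static final String[][] COPY_RULES = {
		{"product_name", "product_keywords"},
		{"product_description", "product_keywords"},
	};

	// Returns the field values as they would be indexed, including copied ones.
	static Map<String, List<String>> applyCopyFields(Map<String, String> doc) {
		Map<String, List<String>> indexed = new LinkedHashMap<>();
		// Every submitted field keeps its own value.
		for (Map.Entry<String, String> field : doc.entrySet()) {
			indexed.computeIfAbsent(field.getKey(), k -> new ArrayList<>()).add(field.getValue());
		}
		// Each rule appends the source value to the destination field.
		for (String[] rule : COPY_RULES) {
			String value = doc.get(rule[0]);
			if (value != null) {
				indexed.computeIfAbsent(rule[1], k -> new ArrayList<>()).add(value);
			}
		}
		return indexed;
	}

	public static void main(String[] args) {
		Map<String, String> doc = new LinkedHashMap<>();
		doc.put("id", "1");
		doc.put("product_name", "Solr mug");
		doc.put("product_description", "A ceramic mug");
		System.out.println(applyCopyFields(doc).get("product_keywords"));
	}
}
```

Because product_keywords receives values from two source fields, it has to be declared multiValued="true". Searching product_keywords alone then covers both the name and the description.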

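5. Importing data from a database

Use the DataImportHandler plugin.

5.1 Add the JARs

Copy the required JARs into the solrTomcat\solrHome\collection1\lib folder. These are typically the DataImportHandler JARs from the Solr distribution plus the JDBC driver for your database (MySQL in this example).

5.2 Configure solrconfig.xml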

solrTomcat\solrHome\collection1\conf\solrconfig.xml

	<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
		<lst name="defaults">
			<str name="config">data-config.xml</str>
		</lst>
	</requestHandler>

5.3 Create data-config.xml

solrTomcat\solrHome\collection1\conf\data-config.xml

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
	<dataSource type="JdbcDataSource"
		driver="com.mysql.jdbc.Driver"
		url="jdbc:mysql://localhost:3306/solr"
		user="root"
		password="root"/>
	<document>
		<entity name="product" query="SELECT pid,name,catalog_name,price,description,picture FROM products">
			<field column="pid" name="id"/>
			<field column="name" name="product_name"/>
			<field column="catalog_name" name="product_catalog_name"/>
			<field column="price" name="product_price"/>
			<field column="description" name="product_description"/>
			<field column="picture" name="product_picture"/>
		</entity>
	</document>
</dataConfig>
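
Each field element maps a database column to a Solr field, and a typo in either attribute is easy to miss. As a sketch of how the mapping could be checked before running an import, the following JDK-only code parses a data-config document and lists the column-to-field pairs. DataConfigSketch and columnToField are hypothetical names, and the embedded XML is a shortened copy of the configuration above without the dataSource element.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: extracts the column -> Solr field mapping from data-config XML.
class DataConfigSketch {

	// Shortened copy of the entity definition from data-config.xml.
	static final String SAMPLE =
		"<dataConfig><document>"
		+ "<entity name=\"product\" query=\"SELECT pid,name,catalog_name,price,description,picture FROM products\">"
		+ "<field column=\"pid\" name=\"id\"/>"
		+ "<field column=\"name\" name=\"product_name\"/>"
		+ "<field column=\"catalog_name\" name=\"product_catalog_name\"/>"
		+ "<field column=\"price\" name=\"product_price\"/>"
		+ "<field column=\"description\" name=\"product_description\"/>"
		+ "<field column=\"picture\" name=\"product_picture\"/>"
		+ "</entity></document></dataConfig>";

	// Parses the XML and returns the mapping in document order.
	static Map<String, String> columnToField(String xml) {
		try {
			NodeList fields = DocumentBuilderFactory.newInstance()
				.newDocumentBuilder()
				.parse(new InputSource(new StringReader(xml)))
				.getElementsByTagName("field");
			Map<String, String> mapping = new LinkedHashMap<>();
			for (int i = 0; i < fields.getLength(); i++) {
				Element field = (Element) fields.item(i);
				mapping.put(field.getAttribute("column"), field.getAttribute("name"));
			}
			return mapping;
		} catch (Exception e) {
			throw new IllegalStateException("could not parse data-config XML", e);
		}
	}

	public static void main(String[] args) {
		// Print one "column -> field" line per mapping.
		for (Map.Entry<String, String> entry : columnToField(SAMPLE).entrySet()) {
			System.out.println(entry.getKey() + " -> " + entry.getValue());
		}
	}
}
```

Every name on the right-hand side should exist as a field in schema.xml. Here that means id plus the product_* fields configured in section 4.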