检索:
/**
 * Searches the "name" field of the Lucene index at D://index for queryString,
 * prints the top 100 hits, and writes the matching IDs to a timestamped XML
 * file under D://result.
 *
 * Fix over the original: the IndexSearcher and IndexReader are now closed in
 * a finally block — the original leaked both on every call.
 *
 * @param queryString raw user query text, tokenized by IK's query parser
 */
public static void query(String queryString){
    IndexReader reader = null;
    IndexSearcher is = null;
    try {
        List<String> list = new ArrayList<String>();
        // IKQueryParser builds an IK-tokenized query against the "name" field.
        Query query = IKQueryParser.parse("name", queryString);
        System.out.println(query.toString());
        // true = open the index read-only.
        reader = IndexReader.open(FSDirectory.open(new File("D://index")), true);
        is = new IndexSearcher(reader);
        // IKSimilarity adjusts scoring for IK's overlapping-token output.
        is.setSimilarity(new IKSimilarity());
        TopDocs topDocs = is.search(query, 100);
        ScoreDoc[] docs = topDocs.scoreDocs;
        for (int i = 0; i < docs.length; i++) {
            Document doc = is.doc(docs[i].doc);
            list.add(doc.get("ID"));
            System.out.println(doc.get("name") + " ID:" + doc.get("ID"));
        }
        String pathString = "D://result";
        File file = new File(pathString);
        if (!file.exists()) {
            file.mkdirs(); // mkdirs also creates missing parent directories
        }
        // Unique-ish file name: epoch millis plus the class-level counter `num`.
        pathString = pathString + "//" + new Date().getTime() + num + ".xml";
        XMLProcesser xmlProcesser = new XMLProcesser(pathString);
        xmlProcesser.setList(list);
        xmlProcesser.generateXML();
    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Always release the searcher and reader (the original never closed them,
        // leaking file handles on every search).
        try {
            if (is != null) { is.close(); }
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            if (reader != null) { reader.close(); }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
索引:
public static void index(){ File indexPathFile = new File("D://index"); boolean initFlag = true; Analyzer analyzer = new IKAnalyzer(); if(!indexPathFile.exists()){ indexPathFile.mkdir(); }else{ if(indexPathFile.list().length > 0){ initFlag = true; } } try { IndexWriter writer = new IndexWriter(FSDirectory.open(indexPathFile), analyzer , initFlag, new IndexWriter.MaxFieldLength(1000000)); indexBuilder(writer); writer.optimize(); writer.close(); } catch (CorruptIndexException e) { e.printStackTrace(); } catch (LockObtainFailedException e) { e.printStackTrace(); } catch (IOException e) { e.printStackTrace(); } catch (SQLException e) { e.printStackTrace(); } } private static void indexBuilder(IndexWriter writer) throws CorruptIndexException, LockObtainFailedException, IOException, SQLException{ List<String> list = getInitResult(); for(String string : list){ Document document = new Document(); String id = string.split(";")[0]; String name = string.split(";")[1]; Field field = new Field("name",name,Field.Store.YES,Field.Index.ANALYZED); document.add(field); field = new Field("ID",id ,Field.Store.YES,Field.Index.NO); document.add(field); writer.addDocument(document); } } private static List<String> getInitResult() throws SQLException, IOException{ String last_idString = 0 + ""; File file = new File("D://result"); if(file.exists() && file.list().length > 0){ for(String nameString : file.list()){ if(nameString.contains("last_id")){ last_idString = nameString.split("-")[1]; new File(nameString).delete(); } } } SQLManage manage = new SQLManage(); String sql = "select count(*) as num from product where PRODUCTID > " + last_idString; ResultSet rsResultSet = manage.query(sql); rsResultSet.next(); int count = Integer.parseInt(rsResultSet.getString("num")); int n = count / PAGESIZE; if((count % PAGESIZE) > 0){ n++; } List<String> list = new ArrayList<String>(); for(int i = 0;i < n; i++){ sql = "select * from product where PRODUCTID > " + last_idString + " order by PRODUCTID ASC 
limit " + i*PAGESIZE + "," + PAGESIZE; rsResultSet = manage.query(sql); int k = 0; while(rsResultSet.next()){ list.add(rsResultSet.getString("PRODUCTID") + ";" + rsResultSet.getString("NAME")); k++; } } if(!file.exists()){ file.mkdir(); } if(list.size() > 0){ File file2 = new File("D://result//last_id-" + list.get(list.size() - 1).split(";")[0] + "-.txt"); file2.createNewFile(); } return list; } 这里索引的是数据库里的数据。其实,方法都是一样,先建立document,然后,往这个document添加field。然后,再将document写进段。 我这里使用了IKAnalyzer分词器。 只要导入IKAnalyzer包,并将其提供的xml文件放到src根目录下就可以。web应用上,将xml放进classes文件夹里。 Lucene具体原理看下面两个网站: http://forfuture1978.javaeye.com/blog/546808 http://www.fkdoc.com/html/program/java/2010/0607/4124.html