1. Add the following to WEB-INF/classes/custom-fields.xml:
<entry key="field1.name">title</entry>
<entry key="field1.indexed">yes</entry>
<entry key="field1.stored">yes</entry>
<entry key="field1.tokenized">no</entry>
<entry key="field1.boost">1.0</entry>
<entry key="field1.multi">false</entry>
<entry key="field2.name">content</entry>
<entry key="field2.indexed">yes</entry>
<entry key="field2.stored">yes</entry>
<entry key="field2.tokenized">no</entry>
<entry key="field2.boost">1.0</entry>
<entry key="field2.multi">false</entry>
2. Edit plugin/query-custom/plugin.xml:
<extension id="org.apache.nutch.searcher.custom"
name="Nutch Custom Field Query Filter"
point="org.apache.nutch.searcher.QueryFilter">
<implementation id="CustomQueryFilter"
class="org.apache.nutch.searcher.custom.CustomFieldQueryFilter">
<parameter name="fields" value="lang,content,title" />
</implementation>
</extension>
3. Add the query-custom plugin to nutch-default.xml:
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-(basic|anchor)|query-(basic|site|url|custom)|response-(json|xml)|summary-lucene|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
<description>
</description>
</property>
After that, you can use content:XXX or title:XXX to search only the content or title field.
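For reference, the entries from step 1 need an enclosing file around them. The sketch below shows what a complete custom-fields.xml might look like. It assumes the file follows the standard Java properties XML format (the one read by java.util.Properties.loadFromXML), which the `<entry key="...">` syntax suggests but the steps above do not confirm. The `<comment>` text is illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: custom-fields.xml is a Java properties XML file. -->
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <!-- Optional free-text comment; hypothetical wording. -->
  <comment>Custom fields for the query-custom plugin</comment>

  <!-- field1: the title field, as defined in step 1 -->
  <entry key="field1.name">title</entry>
  <entry key="field1.indexed">yes</entry>
  <entry key="field1.stored">yes</entry>
  <entry key="field1.tokenized">no</entry>
  <entry key="field1.boost">1.0</entry>
  <entry key="field1.multi">false</entry>

  <!-- field2: the content field, as defined in step 1 -->
  <entry key="field2.name">content</entry>
  <entry key="field2.indexed">yes</entry>
  <entry key="field2.stored">yes</entry>
  <entry key="field2.tokenized">no</entry>
  <entry key="field2.boost">1.0</entry>
  <entry key="field2.multi">false</entry>
</properties>
```

If the file already exists in your deployment, keep its existing header and add only the `<entry>` lines inside the root element.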