Build and deploy workbench

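This article describes how to build the Carrot2 Document Clustering Workbench from source code and offers solutions to problems encountered while integrating a Chinese tokenizer.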


To build Carrot2 Document Clustering Workbench from source code:

  1. Download the Eclipse Target Platform from http://download.carrot2.org/eclipse and extract it to a local folder.

  2. Copy local.properties.example from the Carrot2 checkout folder to local.properties in the same folder. In local.properties, edit the target.platform property to point to the Eclipse Target Platform you downloaded.

    Important

    The folder pointed to by target.platform must have the eclipse/ folder inside.

    You may also change the configs property to match the platform you want to build Carrot2 Document Clustering Workbench for or rely on auto-detection.

  3. Run:

    ant workbench

    to build Carrot2 Document Clustering Workbench binaries.

  4. Go to the tmp/workbench/tmp/carrot2-workbench folder in the Carrot2 checkout directory and run the Carrot2 Document Clustering Workbench.
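
The target.platform requirement from step 2 can be checked before starting the build. The following is a minimal sketch in Python and is not part of Carrot2: the function names read_properties and target_platform_ok are hypothetical, and the parser only understands plain key=value lines (no continuations or escapes).

```python
import os
import sys


def read_properties(path):
    """Read simple key=value lines; comments and blank lines are skipped."""
    props = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith(("#", "!")) or "=" not in line:
                continue
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def target_platform_ok(properties_path):
    """True if target.platform is set and that folder contains eclipse/."""
    target = read_properties(properties_path).get("target.platform")
    return bool(target) and os.path.isdir(os.path.join(target, "eclipse"))


if __name__ == "__main__":
    # Assumption: the script is run from the Carrot2 checkout folder,
    # or the path to local.properties is given as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "local.properties"
    if os.path.isfile(path):
        print("target.platform OK:", target_platform_ok(path))
    else:
        print("not found:", path)
```

If the check reports False, the folder named by target.platform is either missing or does not have the eclipse/ folder inside.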

 

Running ant jar succeeds without changes. To run ant workbench with a custom Chinese tokenizer (Jcseg), the following files have to be modified.

1. In workbench/org.carrot2.workbench.core.feature/feature.xml, add the Jcseg plugin that holds the jars for our custom Chinese tokenizer:

   <plugin
         id="org.lionsoul.jcseg"
         download-size="0"
         install-size="0"
         version="0.0.0"
         unpack="false"/>
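
To confirm that the edit took effect, the plugin ids declared in a feature.xml can be listed with a few lines of Python. This is only a sketch: feature_plugin_ids is a hypothetical helper, and SAMPLE is a stripped-down stand-in for the real feature.xml, with an invented feature id.

```python
import xml.etree.ElementTree as ET


def feature_plugin_ids(feature_xml_text):
    """Return the id attribute of every <plugin> element in a feature.xml."""
    root = ET.fromstring(feature_xml_text)
    return [plugin.get("id") for plugin in root.iter("plugin")]


# Stand-in document; the real feature.xml lists many more plugins.
SAMPLE = """<feature id="example.feature" version="0.0.0">
   <plugin
         id="org.lionsoul.jcseg"
         download-size="0"
         install-size="0"
         version="0.0.0"
         unpack="false"/>
</feature>"""

if __name__ == "__main__":
    print("org.lionsoul.jcseg" in feature_plugin_ids(SAMPLE))
```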

 

2. In workbench/org.carrot2.workbench.feature/carrot2.Workbench.launch, add org.lionsoul.jcseg@default:default to the selected_workspace_plugins value:

 

<stringAttribute key="selected_workspace_plugins" value="org.lionsoul.jcseg@default:default,com.carrotsearch.hppc@default:default,

 

3. For ant eclipse, add a version property and the jcseg-core dependency to etc/maven/poms/pom.xml:

<org.lionsoul.jcseg.version>1.9.6</org.lionsoul.jcseg.version>


    <dependency>
      <groupId>org.lionsoul.jcseg</groupId>
      <artifactId>jcseg-core</artifactId>
      <version>${org.lionsoul.jcseg.version}</version>
    </dependency>
 

4. In core/carrot2-util-text/META-INF/MANIFEST.MF, add the Jcseg bundle to Require-Bundle:

 

Bundle-SymbolicName: org.carrot2.text
Bundle-Version: 0.0.0.QUALIFIER
Require-Bundle:
 org.carrot2.core;bundle-version="0.0.0";visibility:=reexport,
 org.carrot2.util.matrix;bundle-version="0.0.0";visibility:=reexport,
 org.apache.lucene.v2;bundle-version="2.9.0";visibility:=reexport,
 org.lionsoul.jcseg;bundle-version="1.9.6";visibility:=reexport,
 morfologik;bundle-version="1.1.2";resolution:=optional;visibility:=reexport,
 com.carrotsearch.hppc;bundle-version="0.3.0";visibility:=reexport

 

5. Create lib/org.lionsoul.jcseg/META-INF/MANIFEST.MF for the Jcseg bundle:

Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Jcseg Tokenizer
Bundle-SymbolicName: org.lionsoul.jcseg
Bundle-Version: 1.9.6
Bundle-ClassPath: jcseg-core-1.9.6.jar
Bundle-Vendor: jcseg Inc.
Bundle-RequiredExecutionEnvironment: JavaSE-1.7
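
In OSGi, each Require-Bundle entry refers to another bundle by its Bundle-SymbolicName, so the name required in step 4 has to match the one declared in this manifest (org.lionsoul.jcseg). The sketch below lists the required names of a manifest so they can be compared. It is a simplified illustration, not a full manifest parser, and parse_manifest, split_outside_quotes, required_bundles and the org.example.* names are all hypothetical.

```python
def parse_manifest(text):
    """Parse MANIFEST.MF text into a dict, joining continuation lines.

    In the JAR manifest format, a line that starts with a single space
    continues the value of the previous header.
    """
    headers = {}
    key = None
    for line in text.splitlines():
        if line.startswith(" ") and key is not None:
            headers[key] += line[1:]
        elif ":" in line:
            key, value = line.split(":", 1)
            headers[key] = value.strip()
    return headers


def split_outside_quotes(value, sep=","):
    """Split on sep, ignoring separators inside double quotes
    (version ranges such as "[1.0,2.0)" contain commas)."""
    parts, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def required_bundles(headers):
    """Return the symbolic names listed in the Require-Bundle header."""
    entries = split_outside_quotes(headers.get("Require-Bundle", ""))
    return [entry.split(";", 1)[0].strip() for entry in entries]


# Illustrative consumer manifest, not the real Carrot2 file.
CONSUMER = "\n".join([
    "Bundle-SymbolicName: org.example.consumer",
    "Require-Bundle:",
    ' org.example.a;bundle-version="[1.0,2.0)",',
    ' org.lionsoul.jcseg;bundle-version="1.9.6";visibility:=reexport',
])

if __name__ == "__main__":
    print(required_bundles(parse_manifest(CONSUMER)))
```

Any name in the resulting list that no installed bundle declares as its Bundle-SymbolicName cannot be resolved at build or run time.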
 
After these changes, the build succeeds:

    ant workbench

Starting the workbench then fails:

    cd tmp/workbench/build/tmp/carrot2-workbench-3.11.0-SNAPSHOT
    ./carrot2-workbench

Error:

    org.eclipse.swt.SWTError: No more handles [Unknown Mozilla path (MOZILLA_FIVE_HOME not set)

Solution: install the WebKitGTK library:

    sudo apt-get install libwebkitgtk-1.0-0
 -----
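However, the custom tokenizer still does not work. Listing the contents of the built plugin shows that jcseg-core-1.9.6.jar is missing from it:

    jar tf carrot2/tmp/workbench/build/tmp/carrot2-workbench-3.11.0-SNAPSHOT/plugins/org.lionsoul.jcseg_1.9.6.jar
    META-INF/
    META-INF/MANIFEST.MF
    jcseg.LICEN

Reason: in lib/org.lionsoul.jcseg/build.properties there was a whitespace character after the backslash on the jcseg.LICENSE,\ line. In a .properties file a backslash continues the line only when it is the last character, so the bin.includes value ended at that line and the jar on the next line was never included.

Solution: remove the trailing whitespace so that lib/org.lionsoul.jcseg/build.properties reads:

    bin.includes = META-INF/,\
                   jcseg.LICENSE,\
                   jcseg-core-1.9.6.jar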
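
Trailing whitespace after a continuation backslash is invisible in most editors, so it is worth scanning for. The following Python sketch reports such lines; broken_continuations is a hypothetical helper, and the sample text mirrors the build.properties content discussed above.

```python
def broken_continuations(text):
    """Return 1-based numbers of lines where a continuation backslash
    is followed by trailing spaces or tabs."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        if stripped == line:
            continue  # no trailing whitespace on this line
        # An odd number of backslashes means the last one is unescaped.
        backslashes = len(stripped) - len(stripped.rstrip("\\"))
        if backslashes % 2 == 1:
            bad.append(number)
    return bad


GOOD = "\n".join([
    "bin.includes = META-INF/,\\",
    "               jcseg.LICENSE,\\",
    "               jcseg-core-1.9.6.jar",
])

# Same content, with one space after the backslash on the second line.
BAD = GOOD.replace("jcseg.LICENSE,\\", "jcseg.LICENSE,\\ ")

if __name__ == "__main__":
    print(broken_continuations(GOOD))
    print(broken_continuations(BAD))
```

Running the same function over the text of every build.properties file in the checkout would catch this class of mistake before a long ant workbench run.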

 

Running ant workbench can also fail with the following errors:

    [java] BUILD FAILED
    [java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:35: The following error occurred while executing this line:
    [java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:69: org.eclipse.core.runtime.CoreException: x86_64 is not a valid configuration.

    [java] BUILD FAILED
    [java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:35: The following error occurred while executing this line:
    [java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:69: Unable to find element : /home/zhaohj/hadoop/src/carrot2/tmp/workbench/build/features/org.carrot2.workbench.feature/Workbench.product.

    [eclipse.generateFeature] Incorrect directory entry: /home/zhaohj/hadoop/src/carrot2/.tools/rcp/3.7.1/eclipse.

----

The folder carrot2/workbench/org.carrot2.workbench.target contains a README with the following text:

Target platform definition. Download matching target's binary bundle from:

http://download.carrot2.org/eclipse

and unpack it to this folder.

The same folder contains the target definition org.carrot2.workbench.eclipse-3.6.2.target:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?pde version="3.5"?>

<target name="org.carrot2.workbench.eclipse-3.6.2">
<locations>
<location path="${workspace_loc:org.carrot2.workbench.target}/3.6.2/eclipse" type="Directory"/>
</locations>
</target>

 

 

 


 

The Jcseg lexicon goes in this directory:

/carrot2/tmp/workbench/build/tmp/carrot2-workbench-3.10.0-SNAPSHOT/configuration/org.eclipse.osgi/bundles/85/1/.cp

 

 

 

 

References

http://issues.carrot2.org/browse/CARROT-272

http://doc.carrot2.org/#section.workbench
