Build and deploy workbench

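This article describes how to build the Carrot2 Document Clustering Workbench from source code and gives solutions to problems encountered while integrating a Chinese tokenizer.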

To build Carrot2 Document Clustering Workbench from source code:

  1. Download the Eclipse Target Platform from http://download.carrot2.org/eclipse and extract it to a local folder.

  2. Copy local.properties.example from the Carrot2 checkout folder to local.properties in the same folder. In local.properties, edit the target.platform property to point to the Eclipse Target Platform you have downloaded.

    Important

    The folder pointed to by target.platform must have the eclipse/ folder inside.

    You may also change the configs property to match the platform you want to build Carrot2 Document Clustering Workbench for or rely on auto-detection.

  3. Run:

    ant workbench

    to build Carrot2 Document Clustering Workbench binaries.

  4. Go to the tmp/workbench/tmp/carrot2-workbench folder in the Carrot2 checkout dir and run Carrot2 Document Clustering Workbench.
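
Steps 2 and 3 can be scripted. The following is only a sketch, not part of the Carrot2 build: configure_target_platform is a hypothetical helper name, and it assumes that local.properties.example uses plain key=value lines.

```shell
# Hypothetical helper (not part of Carrot2): create local.properties from the
# example file and point target.platform at an extracted Eclipse Target Platform.
configure_target_platform() {
  checkout_dir="$1"   # Carrot2 checkout folder
  platform_dir="$2"   # folder that must contain eclipse/

  # The build requires an eclipse/ folder inside target.platform.
  if [ ! -d "$platform_dir/eclipse" ]; then
    echo "error: no eclipse/ folder inside $platform_dir" >&2
    return 1
  fi

  # Copy every line except an active target.platform entry, then append ours.
  grep -v '^[[:space:]]*target\.platform[[:space:]]*=' \
    "$checkout_dir/local.properties.example" \
    > "$checkout_dir/local.properties" || true
  printf 'target.platform=%s\n' "$platform_dir" >> "$checkout_dir/local.properties"
}

# Usage (paths are placeholders):
#   configure_target_platform /path/to/carrot2 /path/to/target-platform
#   cd /path/to/carrot2 && ant workbench
```

The helper refuses to write anything when eclipse/ is missing, so the mistake shows up before the long ant run starts.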

 

Running ant jar succeeds without any changes. For ant workbench to include the custom Chinese tokenizer (jcseg), the following files have to be modified.

1. workbench/org.carrot2.workbench.core.feature/feature.xml

Add the plugin that carries the jcseg jars for the custom Chinese tokenizer:

   <plugin
         id="org.lionsoul.jcseg"
         download-size="0"
         install-size="0"
         version="0.0.0"
         unpack="false"/>

 

2. workbench/org.carrot2.workbench.feature/carrot2.Workbench.launch

Add org.lionsoul.jcseg@default:default to the selected_workspace_plugins attribute:

 

<stringAttribute key="selected_workspace_plugins" value="org.lionsoul.jcseg@default:default,com.carrotsearch.hppc@default:default,

 

3. etc/maven/poms/pom.xml (used by ant eclipse)

Add the jcseg version property and the dependency:
<org.lionsoul.jcseg.version>1.9.6</org.lionsoul.jcseg.version>


    <dependency>
      <groupId>org.lionsoul.jcseg</groupId>
      <artifactId>jcseg-core</artifactId>
      <version>${org.lionsoul.jcseg.version}</version>
    </dependency>
 

4. core/carrot2-util-text/META-INF/MANIFEST.MF

Add the jcseg bundle to Require-Bundle:

 

Bundle-SymbolicName: org.carrot2.text
Bundle-Version: 0.0.0.QUALIFIER
Require-Bundle:
 org.carrot2.core;bundle-version="0.0.0";visibility:=reexport,
 org.carrot2.util.matrix;bundle-version="0.0.0";visibility:=reexport,
 org.apache.lucene.v2;bundle-version="2.9.0";visibility:=reexport,
 org.lionsoul.jcseg;bundle-version="1.9.6";visibility:=reexport,
 morfologik;bundle-version="1.1.2";resolution:=optional;visibility:=reexport,
 com.carrotsearch.hppc;bundle-version="0.3.0";visibility:=reexport

 

5. lib/org.lionsoul.jcseg/META-INF/MANIFEST.MF

Create a manifest that wraps the jcseg jar as an OSGi bundle:

Manifest-Version: 1.0
Bundle-ManifestVersion: 2
Bundle-Name: Jcseg Tokenizer
Bundle-SymbolicName: org.lionsoul.jcseg
Bundle-Version: 1.9.6
Bundle-ClassPath: jcseg-core-1.9.6.jar
Bundle-Vendor: jcseg Inc.
Bundle-RequiredExecutionEnvironment: JavaSE-1.7
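
Before building, it can be worth checking that the jar named in Bundle-ClassPath really sits in the bundle folder. This is a minimal sketch: check_bundle_classpath is a hypothetical helper, and it assumes the Bundle-ClassPath header fits on a single line.

```shell
# Hypothetical helper: verify that every jar named on the Bundle-ClassPath
# line of a bundle's MANIFEST.MF exists in the bundle folder.
# Assumes the header fits on one line (no manifest line continuations).
check_bundle_classpath() {
  bundle_dir="$1"
  manifest="$bundle_dir/META-INF/MANIFEST.MF"
  status=0
  # Take the header value, drop CRs, and split the comma-separated entries.
  entries=$(sed -n 's/^Bundle-ClassPath:[[:space:]]*//p' "$manifest" \
    | tr -d '\r' | tr ',' '\n')
  for entry in $entries; do
    # "." means the bundle root itself, not a file.
    [ "$entry" = "." ] && continue
    if [ ! -f "$bundle_dir/$entry" ]; then
      echo "missing: $entry"
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_bundle_classpath lib/org.lionsoul.jcseg
```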
 
After these changes the build succeeds:

#ant workbench

Starting the workbench, however, fails:

#cd tmp/workbench/build/tmp/carrot2-workbench-3.11.0-SNAPSHOT
#./carrot2-workbench

Error:
org.eclipse.swt.SWTError: No more handles [Unknown Mozilla path (MOZILLA_FIVE_HOME not set)]

Solution: install the WebKitGTK library:

sudo apt-get install libwebkitgtk-1.0-0
 -----
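The workbench now starts, but the custom tokenizer does not work. Listing the contents of the built plugin shows why:

#jar tf carrot2/tmp/workbench/build/tmp/carrot2-workbench-3.11.0-SNAPSHOT/plugins/org.lionsoul.jcseg_1.9.6.jar
META-INF/
META-INF/MANIFEST.MF
jcseg.LICEN

There is no jcseg-core-1.9.6.jar in this plugin.

Reason: in lib/org.lionsoul.jcseg/build.properties there was a whitespace character after the backslash on this line:

jcseg.LICENSE,\ 

A backslash followed by whitespace no longer continues the line, so the jar listed on the next line was never part of bin.includes.

Solution: remove the trailing whitespace so that lib/org.lionsoul.jcseg/build.properties reads:

bin.includes = META-INF/,\
               jcseg.LICENSE,\
               jcseg-core-1.9.6.jar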
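
Both parts of this diagnosis can be automated. This is a rough sketch with two hypothetical helper names: one finds continuation backslashes followed by whitespace in a .properties file, the other checks a jar listing for an entry.

```shell
# Print "line-number:content" for every line where whitespace follows a
# trailing backslash, which breaks the continuation in a .properties file.
# Exit status 0 means at least one broken line was found.
# Note: a file with CRLF line endings would match on every continuation line.
find_broken_continuations() {
  grep -nE '\\[[:space:]]+$' "$1"
}

# Read a jar listing on stdin (for example from `jar tf plugin.jar`) and
# succeed only if the exact entry name is present.
jar_listing_has() {
  grep -Fxq -- "$1"
}

# Usage:
#   find_broken_continuations lib/org.lionsoul.jcseg/build.properties
#   jar tf plugins/org.lionsoul.jcseg_1.9.6.jar | jar_listing_has jcseg-core-1.9.6.jar
```

Running the first check before ant workbench is cheaper than finding out from a plugin that is missing its jar.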

 

#ant workbench

The build can also fail with errors like these:

[java] BUILD FAILED
[java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:35: The following error occurred while executing this line:
[java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:69: org.eclipse.core.runtime.CoreException: x86_64 is not a valid configuration.

[java] BUILD FAILED
[java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:35: The following error occurred while executing this line:
[java] /home/zhaohj/soft/eclipse/plugins/org.eclipse.pde.build_3.8.100.v20130514-1028/scripts/productBuild/productBuild.xml:69: Unable to find element : /home/zhaohj/hadoop/src/carrot2/tmp/workbench/build/features/org.carrot2.workbench.feature/Workbench.product.

 

[eclipse.generateFeature] Incorrect directory entry: /home/zhaohj/hadoop/src/carrot2/.tools/rcp/3.7.1/eclipse.
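
The message "x86_64 is not a valid configuration" suggests that the configs property did not hold a complete platform entry. This is a guess rather than a confirmed diagnosis. Assuming configs follows the PDE Build convention of os,ws,arch triplets joined by &, a quick sanity check could look like the sketch below (is_valid_configs is a hypothetical helper):

```shell
# Hypothetical helper: succeed only if every '&'-separated entry of a configs
# value is an os,ws,arch triplet. That Carrot2's configs property follows this
# PDE Build convention is an assumption.
is_valid_configs() {
  printf '%s\n' "$1" \
    | tr -d ' \t' \
    | tr '&' '\n' \
    | awk -F',' '
        # Each entry needs exactly three non-empty fields.
        NF != 3 || $1 == "" || $2 == "" || $3 == "" { bad = 1 }
        END { exit (bad ? 1 : 0) }
      '
}

# Usage:
#   is_valid_configs 'linux,gtk,x86_64' && echo "configs looks well-formed"
```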

----

The README in carrot2/workbench/org.carrot2.workbench.target says:

Target platform definition. Download matching target's binary bundle from:

http://download.carrot2.org/eclipse

and unpack it to this folder.

The target definition file org.carrot2.workbench.eclipse-3.6.2.target in the same folder:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?pde version="3.5"?>

<target name="org.carrot2.workbench.eclipse-3.6.2">
<locations>
<location path="${workspace_loc:org.carrot2.workbench.target}/3.6.2/eclipse" type="Directory"/>
</locations>
</target>

 

 

 


 

The jcseg lexicon files are placed in this directory:

/carrot2/tmp/workbench/build/tmp/carrot2-workbench-3.10.0-SNAPSHOT/configuration/org.eclipse.osgi/bundles/85/1/.cp

 

 

 

 

References

http://issues.carrot2.org/browse/CARROT-272

http://doc.carrot2.org/#section.workbench
