Generating syntax trees with Stanford Parser from the command line

Calling Java from Python to run Stanford Parser for data preprocessing


License: CC 4.0 BY-SA

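Summary: because the requirements changed, the data preprocessing had to be written as a complete module, so this post calls a Java program from Python in order to use Stanford Parser. It shows an example run from the command line, then walks through the code that invokes Stanford Parser through a shell command during data preprocessing, including compiling, copying files, and modifying the .py file.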

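I previously described how to generate parse trees through the graphical interface. The requirements have since changed: the existing data preprocessing steps now need to be packaged as one complete module, so I call the Java program from Python in order to use Stanford Parser.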

This one is run from the command line (it is simply the example bundled with the official distribution):

Now for the actual code.

In data preprocessing, the following function calls Stanford Parser through a shell command:

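import os


def dependency_parse(filepath, cp='', tokenize=True):
    print('\nDependency parsing ' + filepath)
    # Directory part of filepath (absolute only if filepath itself is absolute)
    dirpath = os.path.dirname(filepath)
    # File name without its directory and without its extension
    filepre = os.path.splitext(os.path.basename(filepath))[0]
    # Output files are written next to the input file
    tokpath = os.path.join(dirpath, filepre + '.toks')
    parentpath = os.path.join(dirpath, filepre + '.parents')
    relpath = os.path.join(dirpath, filepre + '.rels')
    tokenize_flag = '-tokenize - ' if tokenize else ''
    # The input file is fed to the Java program on stdin via shell redirection.
    # Paths are not quoted here, so they must not contain spaces.
    cmd = ('java -cp %s DependencyParse  -tokpath %s -parentpath %s -relpath %s %s < %s'
           % (cp, tokpath, parentpath, relpath, tokenize_flag, filepath))
    os.system(cmd)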
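
The function above builds one shell string, so a path containing spaces would break the command, and a failing Java process goes unnoticed. As a rough sketch only (not the original code), the same call could be made with `subprocess.run` and an argument list, passing the input file on stdin instead of using `<`. The helper names `build_parse_command` and `dependency_parse_subprocess` are made up for illustration; the `DependencyParse` class and its flags are taken from the function above.

```python
import os
import subprocess


def build_parse_command(filepath, cp, tokenize=True):
    """Return (argv, outputs) for the DependencyParse call shown above.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    dirpath = os.path.dirname(filepath)
    filepre = os.path.splitext(os.path.basename(filepath))[0]
    # Same three output files as the original function, next to the input.
    outputs = {
        name: os.path.join(dirpath, filepre + ext)
        for name, ext in (('tokpath', '.toks'),
                          ('parentpath', '.parents'),
                          ('relpath', '.rels'))
    }
    argv = ['java', '-cp', cp, 'DependencyParse']
    for name, path in outputs.items():
        argv += ['-' + name, path]
    if tokenize:
        argv += ['-tokenize', '-']
    return argv, outputs


def dependency_parse_subprocess(filepath, cp, tokenize=True):
    """Run the parser without a shell; raises CalledProcessError on failure."""
    argv, outputs = build_parse_command(filepath, cp, tokenize)
    # The opened file replaces the shell's "< filepath" redirection.
    with open(filepath, 'rb') as source:
        subprocess.run(argv, stdin=source, check=True)
    return outputs
```

Calling `dependency_parse_subprocess` needs Java on the PATH and a `cp` value that covers the compiled `DependencyParse` class plus the Stanford Parser jars; the exact classpath depends on your own layout and is not specified here.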