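Data Acquisition, Format Conversion, and XML in Natural Language Processing
1. Querying and Processing Data
Let's start with a few sample lexical entries and their definitions: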
"sleep","sli:p","v.i","a condition of body and mind ..."
"walk","wo:k","v.intr","progress by lifting and setting down each foot ..."
"wake","weik","intrans","cease to sleep"
The following Python code expresses a query for the words that are used in definitions but have no entry of their own:
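import csv

# Each row is (lexeme, pronunciation, part of speech, definition)
with open('dict.csv', newline='') as f:
    pairs = [(lexeme, defn) for (lexeme, _, _, defn) in csv.reader(f)]
lexemes, defns = zip(*pairs)
# Collect every whitespace-separated word used in any definition
defn_words = set(w for defn in defns for w in defn.split())
# Keep only the definition words that are not themselves lexemes
print(sorted(defn_words.difference(lexemes)))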
Running the code above produces the following output:
['...', 'a', 'and', 'body', 'by', 'cease', 'condition', 'down', 'each',
'foot', 'lifting', 'mind', 'of
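To try the query without creating dict.csv on disk, here is a minimal sketch that feeds the same three sample rows to csv.reader from an in-memory buffer. The names SAMPLE and undefined_words are hypothetical and introduced only for illustration; the logic is the same set difference used above.

```python
import csv
import io

# The three sample rows from above, held in memory instead of dict.csv.
SAMPLE = '''"sleep","sli:p","v.i","a condition of body and mind ..."
"walk","wo:k","v.intr","progress by lifting and setting down each foot ..."
"wake","weik","intrans","cease to sleep"
'''


def undefined_words(rows):
    """Return the sorted definition words that have no lexicon entry.

    Hypothetical helper: assumes each row has four fields, with the
    lexeme first and the definition last.
    """
    rows = list(rows)
    lexemes = {row[0] for row in rows}
    defn_words = {w for row in rows for w in row[3].split()}
    return sorted(defn_words - lexemes)


result = undefined_words(csv.reader(io.StringIO(SAMPLE)))
print(result)
```

Because "sleep" is itself a lexeme, it is dropped from the result even though it appears in the definition of "wake". Splitting on whitespace is deliberately naive, which is why the literal "..." token survives as a "word".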



