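gAnswer converts a natural-language question into a query graph, then matches that graph against the RDF graph in the graph database to generate SPARQL queries. Before those SPARQL queries are run against gStore, they still need fixing, aggregation, and some post-processing. This article focuses on that stage.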
// step 0: Node (entity & type & literal) Recognition
// step 1: question parsing (dependency tree, sentence type)
// step 2: build query graph (structure construction, relation extraction, top-k join)
// step 3: some fix (such as "one-node" or "ask-one-triple") and aggregation
t = System.currentTimeMillis();
AddtionalFix step3 = new AddtionalFix();
step3.process(qlog);
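In the previous articles on gAnswer we walked through the first three steps of the algorithm (steps 0 to 2 in the comments above) and met the modules for dependency parsing, node extraction, relation extraction, query graph generation, and subgraph matching. The code above is the entry point of the fourth step, fix and aggregation (step 3 in the comments). The comment names two examples, the "one-node" query and "ask-one-triple"; the method behind each is analyzed later.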
public HashMap<String, String> pattern2category = new HashMap<String, String>();
public AddtionalFix()
{
    // Some category mappings for DBpedia, try automatic linking methods later. | base form
    pattern2category.put("gangster_from_the_prohibition_era", "Prohibition-era_gangsters");
    pattern2category.put("seven_wonder_of_the_ancient_world", "Seven_Wonders_of_the_Ancient_World");
    pattern2category.put("three_ship_use_by_columbus", "Christopher_Columbus");
    pattern2category.put("13_british_colony", "Thirteen_Colonies");
}
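First, the AddtionalFix class creates a hash map named pattern2category, which maps query patterns to categories. The constructor fills it with a handful of hand-written DBpedia category mappings whose keys are phrases in base form, joined by underscores.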
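To make the mapping concrete, here is a minimal sketch of how a table like this could be consulted. The class name CategoryLookupSketch and the helpers toKey and lookupCategory are hypothetical and not part of gAnswer; the normalization (lowercasing and replacing whitespace with underscores) is an assumption inferred only from the shape of the keys above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class CategoryLookupSketch {
    // Same idea as pattern2category: base-form phrase -> DBpedia category name.
    // Only two of the original entries are copied here to keep the sketch short.
    private static final Map<String, String> PATTERN_TO_CATEGORY = new HashMap<>();
    static {
        PATTERN_TO_CATEGORY.put("seven_wonder_of_the_ancient_world", "Seven_Wonders_of_the_Ancient_World");
        PATTERN_TO_CATEGORY.put("13_british_colony", "Thirteen_Colonies");
    }

    // Hypothetical helper (assumption): turn a base-form phrase into the key format used above.
    public static String toKey(String baseFormPhrase) {
        return baseFormPhrase.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
    }

    // Returns the mapped category, or null when the phrase is not in the table.
    public static String lookupCategory(String baseFormPhrase) {
        return PATTERN_TO_CATEGORY.get(toKey(baseFormPhrase));
    }

    public static void main(String[] args) {
        System.out.println(lookupCategory("13 British colony")); // prints Thirteen_Colonies
        System.out.println(lookupCategory("unknown phrase"));    // prints null
    }
}
```

Because the table is hand-written, any phrase that is not listed simply has no category, which is presumably why the original comment mentions trying automatic linking methods later.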
public void process(QueryLogger qlog)
{
    fixCategory(qlog);
    oneTriple(qlog);
    oneNode(qlog);
    //aggregation
    AggregationRecognition ar = new AggregationRecognition();
    ar.recognize(qlog);
    //query type
    decideQueryType(qlog);
}
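The main method process takes a QueryLogger object, qlog, as its parameter. It first calls three methods in order: fixCategory, oneTriple, and oneNode, which together carry out the fix. It then calls ar.recognize(qlog) to perform aggregation recognition, and finally calls decideQueryType(qlog) to determine the query type.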
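The control flow described here can be sketched in isolation. The following is a toy illustration only: FixPipelineSketch and its Log class are hypothetical stand-ins for AddtionalFix and QueryLogger, and each stage merely records its own name instead of rewriting any SPARQL. It shows the one property the text relies on, namely that every stage works on the same shared object in a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

public class FixPipelineSketch {
    // Hypothetical stand-in for QueryLogger: it only records which stages ran.
    public static class Log {
        public final List<String> stages = new ArrayList<>();
    }

    // Placeholder stages: the real methods inspect and modify the ranked SPARQL queries.
    static void fixCategory(Log log)          { log.stages.add("fixCategory"); }
    static void oneTriple(Log log)            { log.stages.add("oneTriple"); }
    static void oneNode(Log log)              { log.stages.add("oneNode"); }
    static void recognizeAggregation(Log log) { log.stages.add("recognizeAggregation"); }
    static void decideQueryType(Log log)      { log.stages.add("decideQueryType"); }

    // Mirrors the order in process: three fixes, then aggregation, then query type.
    public static Log process(Log log) {
        fixCategory(log);
        oneTriple(log);
        oneNode(log);
        recognizeAggregation(log);
        decideQueryType(log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(process(new Log()).stages);
    }
}
```

Since all stages mutate one shared object, later stages see whatever the earlier fixes changed, so the ordering is part of the behavior rather than a stylistic choice.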
public void fixCategory(QueryLogger qlog)
{
    if(qlog == null || qlog.semanticUnitList == null)
        return;
    String var = null, category = null;
    for(SemanticUnit su: qlog.semanticUnitList)
    {
        if(su.centerWord.mayCategory)
        {
            var = "?"+su.centerWord.originalForm;
            category = su.centerWord.category;
        }
    }
    if(category != null && var != null)
        for(Sparql spq: qlog.rankedSparqls)
        {
            boolean occured = false;
            for(Triple tri: spq.tripleList)
            {
                if(tri.subject.equals(var))
                {
                    occured = true;
                    break;
                }
            }
&n