#gStore-weekly | gAnswer Source Code Analysis: Post-processing

Last modified 2023-11-20 16:42:07

License: CC 4.0 BY-SA

Article tags:

#knowledge graph #graph database #artificial intelligence #database

First published 2023-11-20 16:41:43

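gAnswer converts a natural-language question into a query graph, then matches that graph against the RDF graph in the graph database to generate the SPARQL statements used for querying. Before those SPARQL statements are run against gStore, they still need to be fixed and checked for aggregation, along with some other post-processing. That stage is the focus of this article.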

// step 0: Node (entity & type & literal) Recognition 
// step 1: question parsing (dependency tree, sentence type)
// step 2: build query graph (structure construction, relation extraction, top-k join)

// step 3: some fix (such as "one-node" or "ask-one-triple") and aggregation
t = System.currentTimeMillis();
AddtionalFix step3 = new AddtionalFix();
step3.process(qlog);

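In the previous gAnswer articles we walked through the first three steps of the algorithm and met the modules for dependency parsing, node extraction, relation extraction, and then query graph generation and subgraph matching. The code above is the entry point of the fourth step, fix and aggregation. Its comment gives two examples, the "one-node" (single-node) query and "ask-one-triple"; the methods behind both are analyzed later.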

public HashMap<String, String> pattern2category = new HashMap<String, String>();

public AddtionalFix()
{
    // Some category mappings for DBpedia, try automatic linking methods later. | base form
    pattern2category.put("gangster_from_the_prohibition_era", "Prohibition-era_gangsters");
    pattern2category.put("seven_wonder_of_the_ancient_world", "Seven_Wonders_of_the_Ancient_World");
    pattern2category.put("three_ship_use_by_columbus", "Christopher_Columbus");
    pattern2category.put("13_british_colony", "Thirteen_Colonies");
}

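First, the AddtionalFix class creates a hash map named pattern2category, which maps query patterns to categories. The constructor fills it with four hand-written mappings for DBpedia, keyed by the base form of the phrase; the code comment notes that automatic linking methods are to be tried later.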
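To make the role of this map concrete, here is a minimal, standalone sketch of a lookup against the same four entries. The class name `CategoryLookupSketch` and the helpers `toKey` and `lookup` are hypothetical and do not exist in gAnswer. How gAnswer actually derives the key from the question is not shown in this part of the source; the sketch simply assumes the phrase is already in base form and joins its words with underscores.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, standalone sketch; not part of the gAnswer code base.
class CategoryLookupSketch {
    // Same four entries as the AddtionalFix constructor.
    static final Map<String, String> PATTERN_TO_CATEGORY = new HashMap<>();
    static {
        PATTERN_TO_CATEGORY.put("gangster_from_the_prohibition_era", "Prohibition-era_gangsters");
        PATTERN_TO_CATEGORY.put("seven_wonder_of_the_ancient_world", "Seven_Wonders_of_the_Ancient_World");
        PATTERN_TO_CATEGORY.put("three_ship_use_by_columbus", "Christopher_Columbus");
        PATTERN_TO_CATEGORY.put("13_british_colony", "Thirteen_Colonies");
    }

    // Assumption: the phrase is already lemmatized (base form).
    // We only lower-case it and join the words with underscores.
    static String toKey(String baseFormPhrase) {
        return baseFormPhrase.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
    }

    // Returns the mapped category, or null when the phrase has no entry.
    static String lookup(String baseFormPhrase) {
        return PATTERN_TO_CATEGORY.get(toKey(baseFormPhrase));
    }

    public static void main(String[] args) {
        System.out.println(lookup("13 british colony")); // Thirteen_Colonies
        System.out.println(lookup("tallest building"));  // null
    }
}
```

Because the table is hand-written, any phrase outside these four entries gets no category, which is presumably why the original comment mentions automatic linking as future work.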

public void process(QueryLogger qlog)
{
    fixCategory(qlog);
    oneTriple(qlog);
    oneNode(qlog);
    
    //aggregation
    AggregationRecognition ar = new AggregationRecognition();
    ar.recognize(qlog);

    //query type
    decideQueryType(qlog);
}

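The main method, process, takes a QueryLogger object, qlog, as its parameter. It calls three methods in order: fixCategory, oneTriple and oneNode. These three carry out the fix. It then calls ar.recognize(qlog) to recognize aggregation, and finally decideQueryType(qlog) to determine the query type.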
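The call order can be traced with a small stand-in. This is only a sketch: `ProcessOrderSketch` and its stub methods are hypothetical, and each stub merely records its own name instead of doing any real work. The one thing carried over from the source is the sequence, with every step operating on the same shared state (here a plain list standing in for qlog).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AddtionalFix.process; stubs only record call order.
class ProcessOrderSketch {
    // Stands in for the shared QueryLogger that every step receives.
    final List<String> trace = new ArrayList<>();

    void fixCategory()          { trace.add("fixCategory"); }
    void oneTriple()            { trace.add("oneTriple"); }
    void oneNode()              { trace.add("oneNode"); }
    void recognizeAggregation() { trace.add("aggregation"); }
    void decideQueryType()      { trace.add("decideQueryType"); }

    // Mirrors the order used in process(): three fixes, aggregation, query type.
    List<String> process() {
        fixCategory();
        oneTriple();
        oneNode();
        recognizeAggregation();
        decideQueryType();
        return trace;
    }

    public static void main(String[] args) {
        // prints [fixCategory, oneTriple, oneNode, aggregation, decideQueryType]
        System.out.println(new ProcessOrderSketch().process());
    }
}
```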

public void fixCategory(QueryLogger qlog)
{
    if(qlog == null || qlog.semanticUnitList == null)
       return;
    
    String var = null, category = null;
    for(SemanticUnit su: qlog.semanticUnitList)
    {
       if(su.centerWord.mayCategory)
       {
          var = "?"+su.centerWord.originalForm;
          category = su.centerWord.category;
       }
    }
    
    if(category != null && var != null)
       for(Sparql spq: qlog.rankedSparqls)
       {
          boolean occured = false;
          for(Triple tri: spq.tripleList)
          {
             if(tri.subject.equals(var))
             {
                occured = true;
                break;
             }
          }
   &n