JanusGraph Usage Example

License: CC 4.0 BY-SA

Original link: http://www.janusgraph.cn/

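This article shows how to use JanusGraph to store and query a relationship graph of figures from Greek mythology. It covers the key steps: configuring a storage backend, loading graph data, creating indexes, and running complex graph traversals with Gremlin.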

Example

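This section uses the Graph of the Gods to show how to use JanusGraph. The graph follows the Property Graph Model and describes the Greek gods and the places where they live. The examples use the Gremlin query language; see Gremlin Query Language for details.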

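Marker	Meaning
Bold key	A graph index key
Bold key with a star	A graph index key whose value must be unique
Underlined key	A vertex-centric index key
Hollow-head edge	A unique edge (no duplicates)
Tail-crossed edge	A unidirectional edge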

Loading the Graph of the Gods into JanusGraph

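Loading the graph into JanusGraph relies on two classes, JanusGraphFactory and GraphOfTheGodsFactory. First, call the static open method of JanusGraphFactory with the path of a configuration file; it returns a graph instance (the configuration used in this example is covered in the configuration chapter). Then pass that graph instance to the load method of GraphOfTheGodsFactory, which loads the graph into JanusGraph.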

Loading the Graph of the Gods into JanusGraph with an index backend

The conf/janusgraph-berkeleyje-es.properties configuration file used below sets up JanusGraph with BerkeleyDB as the storage backend and Elasticsearch as the index backend.

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-berkeleyje-es.properties')
==>standardjanusgraph[berkeleyje:../db/berkeley]
gremlin> GraphOfTheGodsFactory.load(graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[berkeleyje:../db/berkeley], standard]

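Together, JanusGraphFactory.open() and GraphOfTheGodsFactory.load() do the following:

1. Create the set of global and vertex-centric indexes for the graph.
2. Load all vertices and their properties.
3. Load all edges and their properties.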

For more detail on GraphOfTheGodsFactory, see its source code.

The conf/janusgraph-cql-es.properties configuration file makes JanusGraph use Cassandra as the storage backend, and conf/janusgraph-hbase-es.properties makes it use HBase.

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cql-es.properties')
==>standardjanusgraph[cql:[127.0.0.1]]
gremlin> GraphOfTheGodsFactory.load(graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cql:[127.0.0.1]], standard]

Loading the Graph of the Gods into JanusGraph without an index backend

The configuration files without an index backend are conf/janusgraph-cql.properties, conf/janusgraph-berkeleyje.properties, conf/janusgraph-hbase.properties, and conf/janusgraph-inmemory.properties; pick the one that matches your storage backend. Note that in this case you call GraphOfTheGodsFactory.loadWithoutMixedIndex() instead of the GraphOfTheGodsFactory.load() method used earlier.

gremlin> graph = JanusGraphFactory.open('conf/janusgraph-cql.properties')
==>standardjanusgraph[cql:[127.0.0.1]]
gremlin> GraphOfTheGodsFactory.loadWithoutMixedIndex(graph, true)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[standardjanusgraph[cql:[127.0.0.1]], standard]

Global graph indexes

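A graph traversal usually starts by looking up its entry point, a vertex or an edge, by property value, and then follows the path described by the Gremlin query. A global index speeds up the lookup of that starting vertex or edge.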

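Suppose a global index has been built on the name property of vertices. We can then look up the vertex whose name is saturn and read its properties: its age is 10000 (its vertex label is titan). From there, a Gremlin traversal answers the question "who is Saturn's grandchild?". The answer is Hercules: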

gremlin> saturn = g.V().has('name', 'saturn').next()
==>v[256]
gremlin> g.V(saturn).valueMap()
==>[name:[saturn], age:[10000]]
gremlin> g.V(saturn).in('father').in('father').values('name')
==>hercules
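
To make the idea of an index-backed entry point concrete, here is a minimal Python sketch. It is not JanusGraph code: the toy vertices, the `name_index` dictionary, and the helper names are all assumptions made for illustration. It only shows why a lookup through an index avoids scanning every vertex, followed by the same two-hop walk along `father` edges.

```python
from collections import defaultdict

# Hypothetical toy data: three vertices in the spirit of the Graph of the Gods.
vertices = {
    1: {"label": "titan", "name": "saturn", "age": 10000},
    2: {"label": "god", "name": "jupiter", "age": 5000},
    3: {"label": "demigod", "name": "hercules", "age": 30},
}
# 'father' edges point from child to father: (child_id, father_id).
father_edges = [(2, 1), (3, 2)]

# The "global index": property value -> vertex ids, built once up front.
name_index = defaultdict(list)
for vid, props in vertices.items():
    name_index[props["name"]].append(vid)


def lookup_by_name(name):
    """Index lookup: one dictionary access instead of a scan over all vertices."""
    return name_index.get(name, [])


def children_of(vid):
    """Rough equivalent of in('father'): follow 'father' edges backwards."""
    return [child for child, father in father_edges if father == vid]


saturn = lookup_by_name("saturn")[0]
grandchildren = [g for c in children_of(saturn) for g in children_of(c)]
grandchild_names = [vertices[g]["name"] for g in grandchildren]
print(grandchild_names)
```

In a real deployment the index lives in the storage or index backend rather than in a Python dictionary; the sketch only mirrors the access pattern.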

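Likewise, a global index has been built on the place property of edges. With a geospatial predicate we can find the events that took place within 50 kilometers of Athens, and then find the gods involved in those events: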

gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
==>e[a9x-co8-9hx-39s][16424-battled->4240]
==>e[9vp-co8-9hx-9ns][16424-battled->12520]
gremlin> g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50))).as('source').inV().as('god2').select('source').outV().as('god1').select('god1', 'god2').by('name')
==>[god1:hercules, god2:hydra]
==>[god1:hercules, god2:nemean]
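
The circle predicate can be pictured as a great-circle distance check. The following Python sketch is an illustration only: it uses the haversine formula, which is not necessarily how JanusGraph or its index backend evaluates `geoWithin`, and the battle coordinates are assumed values chosen for the example rather than data taken from this article.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_circle(point, center, radius_km):
    """True if point lies inside the circle (center, radius_km)."""
    return haversine_km(point[0], point[1], center[0], center[1]) <= radius_km


ATHENS = (37.97, 23.72)  # same center as Geoshape.circle(37.97, 23.72, 50)

# Assumed battle locations, for illustration only.
battles = {
    "nemean": (38.1, 23.7),
    "hydra": (37.7, 23.9),
    "cerberus": (39.0, 22.0),
}

nearby = sorted(name for name, place in battles.items() if within_circle(place, ATHENS, 50))
print(nearby)
```

With these assumed coordinates, the two battles near Athens pass the check and the third does not, which matches the shape of the query result above.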

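When you query vertices or edges through g.V or g.E, JanusGraph automatically uses the available global indexes to speed up the lookup; that is the main purpose of a global index. Global indexes are not the only kind of index in JanusGraph, however. There is a second kind, the vertex-centric index, which is covered later in this article.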

Graph traversal examples

The same result, that Saturn's grandchild is Hercules, can also be obtained with a loop by using the repeat step:

gremlin> hercules = g.V(saturn).repeat(__.in('father')).times(2).next()
==>v[1536]
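
As a rough mental model of `repeat(...).times(2)`, the Python sketch below applies one traversal step to a frontier of vertices a fixed number of times. The `repeat` helper and the toy `father_of` mapping are hypothetical names introduced here; real Gremlin traversers carry more state (paths, bulk counts) than this sketch does.

```python
def repeat(start, step, times):
    """Apply `step` to every vertex in the frontier, `times` times in a row."""
    frontier = list(start)
    for _ in range(times):
        frontier = [nxt for v in frontier for nxt in step(v)]
    return frontier


# Hypothetical toy data: child -> father.
father_of = {"jupiter": "saturn", "hercules": "jupiter"}


def in_father(vertex):
    """Rough equivalent of in('father'): everyone whose father is `vertex`."""
    return [child for child, father in father_of.items() if father == vertex]


result = repeat(["saturn"], in_father, 2)
print(result)
```

Calling `repeat` with `times=0` returns the starting frontier unchanged, which is a convenient sanity check for the helper.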

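Next we use a traversal to show that Hercules is a demigod. Starting from the Hercules vertex, we traverse to his father and mother and read their labels, which are god and human respectively. That gives the result we are looking for: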

gremlin> g.V(hercules).out('father', 'mother')
==>v[1024]
==>v[1792]
gremlin> g.V(hercules).out('father', 'mother').values('name')
==>jupiter
==>alcmene
gremlin> g.V(hercules).out('father', 'mother').label()
==>god
==>human
gremlin> hercules.label()
==>demigod

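The Property Graph Model is very expressive. Besides the family relationships queried above, it can represent many kinds of things and relationships, and traversing different edge types yields richer results. For example, the battled edges lead to the battles Hercules took part in: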

gremlin> g.V(hercules).out('battled')
==>v[2304]
==>v[2560]
==>v[2816]
gremlin> g.V(hercules).out('battled').valueMap()
==>[name:[nemean]]
==>[name:[hydra]]
==>[name:[cerberus]]
gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name')
==>cerberus
==>hydra

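In the Graph of the Gods, a vertex-centric index has already been created on the time property of battled edges, so filtering those edges by time is much faster. Without such an index, all battled edges of the vertex would have to be retrieved and then filtered, which becomes very inefficient as the number of edges grows. The traversal below over Hercules's battles uses the vertex-centric index; toString() prints its steps: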

gremlin> g.V(hercules).outE('battled').has('time', gt(1)).inV().values('name').toString()
==>[GraphStep([v[24744]],vertex), VertexStep(OUT,[battled],edge), HasStep([time.gt(1)]), EdgeVertexStep(IN), PropertiesStep([name],value)]
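
One way to picture a vertex-centric index is a per-vertex edge list kept sorted by the indexed property, so that a range condition becomes a binary search instead of a full scan. The Python sketch below illustrates that idea only; it does not describe JanusGraph's actual storage layout, and the battle times are assumed values chosen to be consistent with the `has('time', gt(1))` result shown earlier.

```python
import bisect

# Hypothetical edge list for one vertex (hercules): (time, target), sorted by time.
battled = sorted([(1, "nemean"), (2, "hydra"), (12, "cerberus")])
times = [t for t, _ in battled]  # sort keys extracted once, as an index would store them


def battles_after(threshold):
    """Indexed lookup: binary search for the first edge with time > threshold."""
    start = bisect.bisect_right(times, threshold)
    return [name for _, name in battled[start:]]


def battles_after_scan(threshold):
    """Unindexed baseline: inspect every edge of the vertex."""
    return [name for t, name in battled if t > threshold]


print(battles_after(1))
```

Both functions return the same edges; the difference is that the indexed version touches only the matching suffix of the list, which matters for vertices with very many incident edges.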

More complex graph traversal examples

gremlin> pluto = g.V().has('name', 'pluto').next()
==>v[2048]
gremlin> // who are pluto's cohabitants?
gremlin> g.V(pluto).out('lives').in('lives').values('name')
==>pluto
==>cerberus
gremlin> // exclude pluto himself from the cohabitants
gremlin> g.V(pluto).out('lives').in('lives').where(is(neq(pluto))).values('name')
==>cerberus
gremlin> g.V(pluto).as('x').out('lives').in('lives').where(neq('x')).values('name')
==>cerberus
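
The cohabitant query is a two-hop walk (out along `lives`, back in along `lives`) followed by a filter that drops the starting vertex. A minimal Python sketch of the same pattern, using a hypothetical `lives` mapping as toy data:

```python
# Hypothetical toy data: who lives where.
lives = {"pluto": "tartarus", "cerberus": "tartarus", "jupiter": "sky", "neptune": "sea"}


def residents_of(place):
    """Rough equivalent of in('lives') from a location vertex."""
    return [who for who, where in lives.items() if where == place]


def cohabitants(start):
    """out('lives').in('lives') with the starting vertex filtered out."""
    home = lives[start]
    return [who for who in residents_of(home) if who != start]


with_self = residents_of(lives["pluto"])
without_self = cohabitants("pluto")
print(with_self, without_self)
```

The comparison `who != start` plays the role of `where(neq('x'))`: the start vertex is remembered and excluded after the walk returns to it.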

gremlin> // where do pluto's brothers live?
gremlin> g.V(pluto).out('brother').out('lives').values('name')
==>sky
==>sea
gremlin> // which brother lives in which place?
gremlin> g.V(pluto).out('brother').as('god').out('lives').as('place').select('god', 'place')
==>[god:v[1024], place:v[512]]
==>[god:v[1280], place:v[768]]
gremlin> // what are the names of pluto's brothers and of the places they live in?
gremlin> g.V(pluto).out('brother').as('god').out('lives').as('place').select('god', 'place').by('name')
==>[god:jupiter, place:sky]
==>[god:neptune, place:sea]

gremlin> g.V(pluto).outE('lives').values('reason')
==>no fear of death
gremlin> g.E().has('reason', textContains('loves'))
==>e[6xs-sg-m51-e8][1024-lives->512]
==>e[70g-zk-m51-lc][1280-lives->768]
gremlin> g.E().has('reason', textContains('loves')).as('source').values('reason').as('reason').select('source').outV().values('name').as('god').select('source').inV().values('name').as('thing').select('god', 'reason', 'thing')
==>[god:neptune, reason:loves waves, thing:sea]
==>[god:jupiter, reason:loves fresh breezes, thing:sky]
