Indexing for Better Performance
JanusGraph supports two different kinds of indexing to speed up query processing: graph indexes and vertex-centric indexes. Most graph queries start the traversal from a list of vertices or edges that are identified by their properties. Graph indexes make these global retrieval operations efficient on large graphs. Vertex-centric indexes speed up the actual traversal through the graph, in particular when traversing through vertices with many incident edges.
Graph Index
Graph indexes are global index structures over the entire graph which allow efficient retrieval of vertices or edges by their properties for sufficiently selective conditions. For instance, consider the following queries:
g.V().has('name', 'hercules')
g.E().has('reason', textContains('loves'))
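The first query asks for all vertices with the name hercules. The second asks for all edges whose reason property contains the word loves. Without a graph index, answering these queries would require a full scan over all vertices or edges in the graph to find those that match the given condition, which is very inefficient and infeasible for large graphs.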
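To make the cost difference concrete, here is a minimal sketch in plain Java. It is a toy model only, not JanusGraph code and not a description of how JanusGraph stores data: vertices are reduced to a list of names (the list position stands in for a vertex id), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: illustrates full scan vs. index lookup, not JanusGraph internals.
class NameLookupSketch {

    // Without an index: visit every vertex and test the condition.
    static List<Integer> fullScan(List<String> namesById, String name) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < namesById.size(); id++) {
            if (namesById.get(id).equals(name)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Built once up front: map each name to the ids of the vertices carrying it.
    static Map<String, List<Integer>> buildIndex(List<String> namesById) {
        Map<String, List<Integer>> index = new HashMap<>();
        for (int id = 0; id < namesById.size(); id++) {
            index.computeIfAbsent(namesById.get(id), k -> new ArrayList<>()).add(id);
        }
        return index;
    }

    // With an index: a single hash lookup instead of a pass over all vertices.
    static List<Integer> indexedLookup(Map<String, List<Integer>> index, String name) {
        return index.getOrDefault(name, List.of());
    }

    public static void main(String[] args) {
        List<String> namesById = List.of("hercules", "jupiter", "hercules");
        Map<String, List<Integer>> index = buildIndex(namesById);
        System.out.println(fullScan(namesById, "hercules"));
        System.out.println(indexedLookup(index, "hercules"));
    }
}
```

Both calls return the same vertices. The scan touches every entry on every query, while the lookup pays the cost of building the map once and then answers each equality query directly.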
JanusGraph distinguishes between two types of graph indexes: composite and mixed indexes. Composite indexes are very fast and efficient but limited to equality lookups for a particular, previously-defined combination of property keys. Mixed indexes can be used for lookups on any combination of indexed keys and support multiple condition predicates in addition to equality depending on the backing index store.
Both types of indexes are created through the JanusGraph management system and the index builder returned by JanusGraphManagement.buildIndex(String, Class), where the first argument defines the name of the index and the second argument specifies the type of element to be indexed (e.g. Vertex.class). The name of a graph index must be unique. Graph indexes built against newly defined property keys, i.e. property keys that are defined in the same management transaction as the index, are immediately available. The same applies to graph indexes that are constrained to a label that is created in the same management transaction as the index. Graph indexes built against property keys that are already in use, without being constrained to a newly created label, require the execution of a reindex procedure to ensure that the index contains all previously added elements. Until the reindex procedure has completed, the index will not be available. Defining graph indexes in the same transaction as the initial schema is encouraged.
Note
In the absence of an index, JanusGraph will default to a full graph scan in order to retrieve the desired list of vertices. While this produces the correct result set, the graph scan can be very inefficient and lead to poor overall system performance in a production environment. Enable the force-index configuration option in production deployments of JanusGraph to prohibit graph scans.
Composite Index
Composite indexes retrieve vertices or edges by one or a (fixed) composition of multiple keys. Consider the following composite index definitions.
graph.tx().rollback() //Never create new indexes while a transaction is active
mgmt = graph.openManagement()
name = mgmt.getPropertyKey('name')
age = mgmt.getPropertyKey('age')
mgmt.buildIndex('byNameComposite', Vertex.class).addKey(name).buildCompositeIndex()
mgmt.buildIndex('byNameAndAgeComposite', Vertex.class).addKey(name).addKey(age).buildCompositeIndex()
mgmt.commit()
//Wait for the index to become available
ManagementSystem.awaitGraphIndexStatus(graph, 'byNameComposite').call()
ManagementSystem.awaitGraphIndexStatus(graph, 'byNameAndAgeComposite').call()
//Reindex the existing data
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex("byNameComposite"), SchemaAction.REINDEX).get()
mgmt.updateIndex(mgmt.getGraphIndex("byNameAndAgeComposite"), SchemaAction.REINDEX).get()
mgmt.commit()
First, two property keys name and age are already defined. Next, a simple composite index on just the name property key is built. JanusGraph will use this index to answer the following query.
g.V().has('name', 'hercules')
The second composite graph index includes both keys. JanusGraph will use this index to answer the following query.
g.V().has('age', 30).has('name', 'hercules')
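One way to read the two index definitions against these queries is as a set-containment check: a composite index is a candidate only when every key it was built over appears among the query's equality conditions. The sketch below models just that check in plain Java. It is a toy under stated assumptions, not JanusGraph's query optimizer: the class and method names are hypothetical, only the two index names come from the definitions above, and it reports candidate indexes without modelling which one JanusGraph actually picks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: the real index selection happens inside JanusGraph.
class CompositeIndexMatchSketch {

    // Index name -> the fixed set of property keys the index was built over.
    static final Map<String, Set<String>> INDEXES = new LinkedHashMap<>();
    static {
        INDEXES.put("byNameComposite", Set.of("name"));
        INDEXES.put("byNameAndAgeComposite", Set.of("name", "age"));
    }

    // An index is a candidate only if all of its keys have an equality condition in the query.
    static List<String> candidateIndexes(Set<String> equalityKeys) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : INDEXES.entrySet()) {
            if (equalityKeys.containsAll(entry.getValue())) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // has('name', 'hercules')
        System.out.println(candidateIndexes(Set.of("name")));
        // has('age', 30).has('name', 'hercules')
        System.out.println(candidateIndexes(Set.of("age", "name")));
        // A query constrained on age alone matches neither index.
        System.out.println(candidateIndexes(Set.of("age")));
    }
}
```

The order of the has() steps does not matter in this model, since the conditions are treated as a set of keys.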
Note that all keys of a composite graph index must be found in the query’s equality conditions for this index to be used. For example, the following query cannot be answered with either of the indexes because it only contains a constraint on age but not name.