Building a Knowledge Graph of Provinces, Cities, Districts, and Townships/Subdistricts with NebulaGraph (Part 2)

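The previous post covered building the knowledge graph. In real use, however, some townships and subdistricts turned out to be missing: a VID must be globally unique within the graph space, so records that ended up with the same VID overwrote one another. Importing the nationwide dataset in bulk was also very slow. To address both problems, this post reworks the schema and the import configuration.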
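The overwrite problem can be illustrated with a small sketch. It assumes, purely for illustration, that the place name was used as the VID; the earlier VID scheme is not shown in this post, and the ids and rows below are hypothetical. A dictionary stands in for the graph store, since a later insert with the same VID replaces the earlier one.

```python
# Hypothetical sample rows: (unique id, place name). Both the ids and the
# "name as VID" scenario are assumptions made for illustration only.
rows = [
    ("id-0001", "城关镇"),
    ("id-0002", "城关镇"),   # same name, but a different township
    ("id-0003", "澄江街道"),
]


def load_vertices(rows, vid_of):
    """Mimic insert semantics: a later write with the same VID replaces the earlier one."""
    store = {}
    for row in rows:
        store[vid_of(row)] = row[1]
    return store


by_name = load_vertices(rows, vid_of=lambda r: r[1])  # name used as VID
by_id = load_vertices(rows, vid_of=lambda r: r[0])    # unique id used as VID

print(len(rows), len(by_name), len(by_id))  # 3 2 3
```

With the name as the key, one of the two same-named townships silently disappears; with a unique id per record, all three survive. This is why the CSV files in this post carry a dedicated id column.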

1. nGQL for creating the schema and indexes

# Create Space 
CREATE SPACE `GovGraph` (partition_num = 10, replica_factor = 1, charset = utf8, collate = utf8_bin, vid_type = FIXED_STRING(32)) comment = '行政区划知识图谱';
:sleep 20;
USE `GovGraph`;

# Create Tag: 
CREATE TAG `City` ( `name` string NULL) ttl_duration = 0, ttl_col = "";
CREATE TAG `District` ( `name` string NULL) ttl_duration = 0, ttl_col = "";
CREATE TAG `Province` ( `name` string NULL) ttl_duration = 0, ttl_col = "";
CREATE TAG `Street` ( `name` string NULL) ttl_duration = 0, ttl_col = "";

# Create Edge: 
CREATE EDGE `hasPart` ( `relationship_type` string NULL) ttl_duration = 0, ttl_col = "";
CREATE EDGE `partOf` ( `relationship_type` string NULL) ttl_duration = 0, ttl_col = "";
:sleep 20;

# Create Index: 
CREATE TAG INDEX `city_name_index` ON `City` ( `name`(32)) comment "城市名称索引";
CREATE TAG INDEX `district_name_index` ON `District` ( `name`(32)) comment "区域名称索引";
CREATE TAG INDEX `province_name_index` ON `Province` ( `name`(32)) comment "省份名称索引";
CREATE TAG INDEX `street_name_index` ON `Street` ( `name`(32)) comment "乡镇街道名称索引";
CREATE EDGE INDEX `partof_index` ON `partOf` ( `relationship_type`(16)) comment "行政隶属";
CREATE EDGE INDEX `haspart_index` ON `hasPart` ( `relationship_type`(16)) comment "行政隶属";

2. Data preparation

Vertex files such as city.csv use the following format (columns: VID, name):

66306018d4364e363870dec0,吐鲁番市
66306018d4364e363870dec2,中卫市
66306018d4364e363870dec3,石嘴山市
66306018d4364e363870dec6,海北藏族自治州
66306018d4364e363870dec9,张掖市
66306018d4364e363870deca,天水市
66306018d4364e363870decc,铜川市

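Edge files such as city2prov.csv use the following format (columns: source VID, destination VID, source name, destination name, relationship type):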

66306018d4364e363870deb5,663059aad4364e4bd87c578d,昆玉市,新疆维吾尔自治区,行政隶属
66306018d4364e363870deb7,663059aad4364e4bd87c578d,北屯市,新疆维吾尔自治区,行政隶属
66306018d4364e363870deb4,663059aad4364e4bd87c578d,胡杨河市,新疆维吾尔自治区,行政隶属
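Before importing, it can help to check the files against the constraints described so far. The following is a minimal sketch, not part of the original workflow: it checks that VIDs are unique and fit within FIXED_STRING(32), and that every edge row points at vertices that exist. The function names are hypothetical, the sample rows are taken from the edge file above, and in-memory strings stand in for the real files.

```python
import csv
import io

MAX_VID_LEN = 32  # matches vid_type = FIXED_STRING(32) in the space definition


def read_vertex_ids(fileobj):
    """Return the set of VIDs in a vertex CSV; raise on duplicate or over-long ids."""
    seen = set()
    for line_no, row in enumerate(csv.reader(fileobj), start=1):
        if not row:
            continue  # skip blank lines
        vid = row[0]
        if len(vid.encode("utf-8")) > MAX_VID_LEN:
            raise ValueError(f"line {line_no}: VID {vid!r} exceeds {MAX_VID_LEN} bytes")
        if vid in seen:
            raise ValueError(f"line {line_no}: duplicate VID {vid!r}")
        seen.add(vid)
    return seen


def dangling_edges(fileobj, src_ids, dst_ids):
    """Return edge rows whose source or destination VID is missing from the vertex sets."""
    return [row for row in csv.reader(fileobj)
            if row and (row[0] not in src_ids or row[1] not in dst_ids)]


# In-memory stand-ins for city.csv, a province vertex file, and city2prov.csv.
city_csv = io.StringIO(
    "66306018d4364e363870deb5,昆玉市\n"
    "66306018d4364e363870deb7,北屯市\n"
)
province_csv = io.StringIO("663059aad4364e4bd87c578d,新疆维吾尔自治区\n")
edge_csv = io.StringIO(
    "66306018d4364e363870deb5,663059aad4364e4bd87c578d,昆玉市,新疆维吾尔自治区,行政隶属\n"
    "66306018d4364e363870deb4,663059aad4364e4bd87c578d,胡杨河市,新疆维吾尔自治区,行政隶属\n"
)

cities = read_vertex_ids(city_csv)
provinces = read_vertex_ids(province_csv)
bad = dangling_edges(edge_csv, cities, provinces)
print(bad)  # the 胡杨河市 row, because its city id is not in the sample city file
```

For real files, replace the `io.StringIO` objects with `open(path, newline="", encoding="utf-8")`.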

3. Import scripts

nebula-importer configuration for vertices, using city.csv as the example:

client:
  version: v3
  address: "127.0.0.1:9669"
  user: root
  password: ****
  concurrencyPerAddress: 10
  reconnectInitialInterval: 1s
  retry: 3
  retryInitialInterval: 1s

manager:
  spaceName: GovGraph
  batch: 128
  readerConcurrency: 50
  importerConcurrency: 512
  statsInterval: 10s
log:
  level: INFO
  console: true
  files:
   - logs/nebula-importer.log

sources:
  - path: ./city.csv
    failDataPath: ./err/city.csv
    csv:
      delimiter: ","
      withHeader: false
      withLabel: false
    tags:
    - name: City
      id:
        type: "STRING"
        index: 0
      props:
        - name: "name"
          type: "STRING"
          index: 1

nebula-importer configuration for edges (a separate file), using city2prov.csv as the example:

client:
  version: v3
  address: "127.0.0.1:9669"
  user: root
  password: ****
  concurrencyPerAddress: 10
  reconnectInitialInterval: 1s
  retry: 3
  retryInitialInterval: 1s

manager:
  spaceName: GovGraph
  batch: 128
  readerConcurrency: 50
  importerConcurrency: 512
  statsInterval: 10s
log:
  level: INFO
  console: true
  files:
   - logs/nebula-importer.log

sources:
  - path: ./city2prov.csv
    failDataPath: ./err/error.csv
    csv:
      delimiter: ","
      withHeader: false
      withLabel: false
    edges:
    - name: partOf
      src:
        id:
          type: "STRING"
          index: 0
      dst:
        id:
          type: "STRING"
          index: 1
      props:
        - name: "relationship_type"
          type: "STRING"
          index: 4
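The vertex configuration earlier in this section covers only city.csv, while the schema defines four tags. Since `sources` is a YAML list, one way to avoid hand-copying the block is to render one entry per vertex file. The sketch below does that with plain string templates. Only city.csv is named in this post; the other file names are hypothetical, and whether the importer version in use accepts several entries in one file should be checked before relying on it.

```python
# Template for one `sources` entry, copied from the city.csv configuration.
VERTEX_SOURCE = """\
  - path: ./{file}
    failDataPath: ./err/{file}
    csv:
      delimiter: ","
      withHeader: false
      withLabel: false
    tags:
    - name: {tag}
      id:
        type: "STRING"
        index: 0
      props:
        - name: "name"
          type: "STRING"
          index: 1
"""

# Tag -> CSV file. Only city.csv appears in the article; the other
# file names are assumptions for illustration.
VERTEX_FILES = {
    "Province": "province.csv",
    "City": "city.csv",
    "District": "district.csv",
    "Street": "street.csv",
}


def render_sources(files):
    """Render the `sources:` section with one entry per (tag, file) pair."""
    entries = (VERTEX_SOURCE.format(tag=tag, file=name) for tag, name in files.items())
    return "sources:\n" + "".join(entries)


sources_yaml = render_sources(VERTEX_FILES)
print(sources_yaml)
```

The rendered text can be appended to the shared `client`, `manager`, and `log` sections to form a complete configuration file.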

4. Query demo

LOOKUP ON Street WHERE Street.name == '澄江街道' YIELD id(VERTEX) AS vid |
GO FROM $-.vid OVER hasPart REVERSELY YIELD properties($$).name AS parent_name;

This query looks up the district that 澄江街道 (Chengjiang Subdistrict) belongs to.
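To reuse the same lookup for other street names, the statement can be built from a parameter. The helper below is a hypothetical sketch that only produces the nGQL text; it does not connect to NebulaGraph. The backslash escaping of quotes is an assumption about nGQL string literals, not a complete sanitizer.

```python
def parent_query(street_name):
    """Build the LOOKUP | GO statement from the demo for a given street name."""
    # Minimal escaping of backslashes and single quotes (assumed, not exhaustive).
    escaped = street_name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"LOOKUP ON Street WHERE Street.name == '{escaped}' "
        "YIELD id(VERTEX) AS vid | "
        "GO FROM $-.vid OVER hasPart REVERSELY "
        "YIELD properties($$).name AS parent_name;"
    )


query = parent_query("澄江街道")
print(query)
```

The resulting string can be pasted into the console or passed to whichever client is used to run queries.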
