A Minimal Debezium Setup for Monitoring PostgreSQL
The official Debezium documentation is fairly involved, so this post distills it into a minimal working configuration:
1. Prerequisites: ZooKeeper, Kafka, and PostgreSQL (newer Kafka releases can run without ZooKeeper via KRaft mode, but that needs extra configuration). A docker-compose file is given below; adjust it to your environment.
version: '3'
services:
  pgsql:
    image: postgres
    container_name: pgsql
    restart: always
    privileged: true
    ports:
      - "5432:5432"
    environment:
      TZ: Asia/Shanghai
      POSTGRES_DB: your_database
      POSTGRES_PASSWORD: 'password'
      PGDATA: /var/lib/postgresql/data/pgdata
    volumes:
      # Prepare a standard postgresql.conf in advance with the settings below;
      # tune the last two values to your needs (a quick way to verify them is
      # shown right after this file):
      # wal_level = logical, max_replication_slots = 4, max_wal_senders = 4
      - ./postgresql.conf:/etc/postgresql/postgresql.conf
    # Point the server at the mounted file; without this the official image
    # keeps using the default config generated inside PGDATA
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
  zookeeper:
    image: zookeeper
    container_name: zookeeper
    ports:
      - 2181:2181
    environment:
      TZ: Asia/Shanghai
  kafka:
    image: apache/kafka
    container_name: kafka
    restart: always
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_CREATE_TOPICS: "test"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Tell external clients how to reach this broker; the advertised
      # address is published through ZooKeeper
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://<host-ip>:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
  kafkaconnect:
    image: debezium/connect
    container_name: kafkaconnect
    ports:
      - "8083:8083"
    environment:
      GROUP_ID: "1"
      CONFIG_STORAGE_TOPIC: "my_connect_configs"
      OFFSET_STORAGE_TOPIC: "my_connect_offsets"
      STATUS_STORAGE_TOPIC: "my_connect_statuses"
      # Not in the official docs; without it the container defaults to
      # 127.0.0.1 and never reaches Kafka (in a Docker setup)
      BOOTSTRAP_SERVERS: "<host-ip>:9092"   # Kafka's address, i.e. the Docker host
    depends_on:
      - kafka
      - pgsql
      - zookeeper
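After docker compose up -d, it is worth confirming that the logical-decoding settings were actually picked up (a quick check; the user and database names come from the compose file above):

docker exec -it pgsql psql -U postgres -d your_database -c "SHOW wal_level;"

This should print logical; if it still shows replica, the mounted postgresql.conf is not being read.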
2. Create a logical replication slot in PostgreSQL using the pgoutput plugin
Run the following in PostgreSQL to create a logical replication slot named my_slot:
SELECT * FROM pg_create_logical_replication_slot('my_slot', 'pgoutput');
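To verify the slot exists, you can list the server's replication slots (pg_replication_slots is a standard system view):

docker exec -it pgsql psql -U postgres -d your_database -c "SELECT slot_name, plugin, active FROM pg_replication_slots;"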
3. Enter the debezium/connect container and run the command below to register the connector with Kafka Connect. A reference configuration follows; an online JSON formatter makes it easier to inspect and edit. For the meaning of the individual parameters, see the article linked below and the MySQL connector section of the official getting-started tutorial.
Note: topic.prefix is the common prefix for every topic the connector publishes to Kafka.
slot.name is the name of the logical replication slot; it should match the slot you created in step 2 (the sample config uses electrimeterslot, so either create the slot under that name or adjust one side to match).
publication.name is the name of the PostgreSQL publication, which the connector creates automatically.
(If the connector stops running, the usual causes are: 1. it cannot reach Kafka; 2. the publication already exists and has to be dropped manually, as shown below.)
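A minimal sketch of that manual cleanup, assuming the publication name from the sample config below:

docker exec -it pgsql psql -U postgres -d your_database -c "DROP PUBLICATION IF EXISTS electrimeterpublic;"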
https://blog.youkuaiyun.com/foshansuperinter/article/details/110856979
https://debezium.io/documentation/reference/3.0/tutorial.html
curl -i -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
  "name": "inventory-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "tasks.max": "1",
    "database.hostname": "172.20.160.1",
    "database.port": "5432",
    "database.dbname": "aiot",
    "database.user": "postgres",
    "database.password": "123456",
    "schema.history.internal.kafka.bootstrap.servers": "<kafka-ip>:9092",
    "schema.history.internal.kafka.topic": "schemahistory.inventory",
    "plugin.name": "pgoutput",
    "publication.autocreate.mode": "all_tables",
    "database.server.name": "electricmeter",
    "publication.name": "electrimeterpublic",
    "slot.name": "electrimeterslot",
    "topic.prefix": "datatrans"
  }
}'
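Once the connector is registered, the standard Kafka Connect REST endpoints can confirm it is running:

curl -s localhost:8083/connectors
curl -s localhost:8083/connectors/inventory-connector/status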
4. If everything is working, connect to Kafka (the Kafka King tool on GitHub works well for personal use) and you should see topics named after the pattern topic.prefix + schema + table name; every operation on a watched table also shows up in the connect container's logs.
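If you prefer the command line over a GUI, the console consumer bundled with Kafka can tail a topic directly (a sketch; datatrans.public.your_table is a hypothetical topic following the prefix.schema.table pattern, and /opt/kafka is the install path used by the apache/kafka image):

docker exec -it kafka /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic datatrans.public.your_table --from-beginning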
5. Other notes
1. This example is a simple single-node deployment; for distributed or clustered setups, study the topic further.
2. Both the curl config and the replication slot above capture every table in the database; consult the official documentation if you only need specific tables.
3. A table captured through the logical slot must have a primary key before you can UPDATE or DELETE its rows; otherwise PostgreSQL rejects the operation because it cannot identify the old row for replication (a workaround is sketched below).
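For a table that cannot get a primary key, one workaround is to change its replica identity so PostgreSQL logs the full old row (a sketch; public.your_table is a hypothetical table, and FULL is noticeably heavier than a key-based identity):

docker exec -it pgsql psql -U postgres -d your_database -c "ALTER TABLE public.your_table REPLICA IDENTITY FULL;"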