CDC (Change Data Capture) with Debezium-Embedded

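This article explains how to configure MySQL master-slave replication and how to use Debezium-Embedded to capture MySQL change data inside a Java application. Sample code shows how to create and use a DebeziumEngine: simple change notifications, creating the engine with the Connect class, and processing CDC events in batches. The approach suits scenarios that do not need Kafka and makes it easy to receive and handle database change events directly.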

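0. Preface

  • CDC (Change Data Capture) records the inserts, updates, and deletes made to the data in a database. Debezium, from Red Hat, is one such product.
  • For listening to database changes, the tool most commonly used in China is Alibaba's Canal, which unfortunately supports only MySQL at the moment. Debezium supports a much wider range of databases: MySQL, PostgreSQL, MongoDB, Oracle, SQL Server, Db2, Cassandra, and Vitess.
  • The awkward part of using Debezium is that it normally relies on Apache Kafka for fault tolerance and reliability. Many applications do not need those guarantees, or do not want to connect to Kafka at all, so Debezium provides debezium-embedded. With it, a Java application can connect to the database and receive CDC events directly, with no additional infrastructure.
  • debezium-embedded is lighter and more convenient to use, and Alibaba's Flink-CDC already builds on it (see example). Unfortunately there is relatively little Chinese documentation on Debezium, and examples of the embedded mode are even harder to find (the official English documentation has its pitfalls). This article provides such an example for MySQL.
  • Official documentation: Debezium Engine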

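1. Configure MySQL master-slave replication

  • Stop the MySQL service
  • Edit the configuration file and add the following under [mysqld]
    # On Windows this is usually my.ini in the MySQL directory
    # On Linux this is usually my.cnf under /etc/mysql
    
    [mysqld]
    # Added settings; server-id can be any value that is unique among the servers in the replication setup
    server-id = 12345
    log-bin = mysql-bin
    # Must be ROW
    binlog_format = ROW
    # Must be FULL; this variable is available since MySQL 5.6
    binlog_row_image  = FULL
    expire_logs_days  = 10
    
  • Start the MySQL service
  • Log in to MySQL as root and check the binlog-related variables
    SHOW VARIABLES LIKE '%binlog%';
    
  • Log in to MySQL as root and create the user that will be used for replication
    -- Create the user that will hold the replication privileges
    CREATE USER 'cdc_user' IDENTIFIED BY 'cdc_password';
    -- Grant the replication-related privileges
    GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'cdc_user';
    
  • Log in to MySQL as the new user cdc_user and check the replication status
    SHOW MASTER STATUS;
    SHOW SLAVE STATUS;
    SHOW BINARY LOGS;
    
  • It is also worth creating a test database and table (with a primary key) for debugging, for example the one from Flink-CDC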
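
The binlog requirements listed above can also be checked in code before the engine is started. The following is a minimal sketch in plain Java, assuming the variable values have already been read into a map (for example from the output of SHOW VARIABLES). The class name BinlogCheck and its method are hypothetical helpers, not part of Debezium or MySQL. Note that log_bin does not match the pattern '%binlog%', so it has to be queried separately with SHOW VARIABLES LIKE 'log_bin'.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BinlogCheck {

    // Expected values, taken from the configuration fragment in this section.
    private static final Map<String, String> REQUIRED = new LinkedHashMap<>();
    static {
        REQUIRED.put("log_bin", "ON");
        REQUIRED.put("binlog_format", "ROW");
        REQUIRED.put("binlog_row_image", "FULL");
    }

    /** Returns one message per variable that is missing or has an unexpected value. */
    public static List<String> problems(Map<String, String> variables) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : REQUIRED.entrySet()) {
            String actual = variables.get(e.getKey());
            if (actual == null || !actual.equalsIgnoreCase(e.getValue())) {
                result.add(e.getKey() + " should be " + e.getValue() + " but is " + actual);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical values, as they might be read from SHOW VARIABLES.
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("log_bin", "ON");
        vars.put("binlog_format", "MIXED");
        vars.put("binlog_row_image", "FULL");
        problems(vars).forEach(System.out::println);
    }
}
```

An empty list means the three settings match what this section configures; anything else is worth fixing before starting the engine.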

2. Debezium-Embedded code development

2.1 Maven dependencies

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
        <version.debezium>1.4.2.Final</version.debezium>
    </properties>

    <dependencies>
        <dependency>
            <groupId>io.debezium</groupId>
            <artifactId>debezium-api</artifactId>
            <version>${version.debezium}</version>
        </dependency>
        <dependency>
            <groupId>io.debezium</groupId>
            <artifactId>debezium-embedded</artifactId>
            <version>${version.debezium}</version>
        </dependency>
        <dependency>
            <groupId>io.debezium</groupId>
            <artifactId>debezium-connector-mysql</artifactId>
            <version>${version.debezium}</version>
        </dependency>
    </dependencies>

2.2 Code: a simple example

import io.debezium.connector.mysql.MySqlConnector;
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;
import io.debezium.relational.history.FileDatabaseHistory;
import org.apache.kafka.connect.storage.FileOffsetBackingStore;

import java.util.Properties;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Demo01 {

    public static void main(String[] args) {
        // 0. Prerequisite: the database is configured and a user with replication privileges exists

        // 1. Build the configuration
        Properties props = genProps();

        // 2. Build the DebeziumEngine
        // Events are delivered in JSON format
        DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(record -> {
                    // The record carries the operation type (insert, delete, update) and the row data
                    // The key is the primary key
                    System.out.println("record.key() = " + record.key());
                    System.out.println("record.value() = " + record.value());
                })
                .using((success, message, error) -> {
                    // Adding this completion callback is strongly recommended; it makes errors visible
                    if (!success) {
                        // Called when the engine stops because of a failure
                        System.out.println("----------error------");
                        System.out.println(message);
                        if (error != null) {
                            error.printStackTrace();
                        }
                    }
                }).build();

        // 3. Run the engine
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(engine);
    }

    private static Properties genProps() {
        // Configuration
        Properties props = new Properties();
        // To capture changes from another database, add its connector to Maven (for example
        // debezium-connector-postgres) and change connector.class here
        props.setProperty("connector.class", MySqlConnector.class.getCanonicalName());
        props.setProperty("database.server.name", "my_server_01"); // logical name, can be changed freely
        props.setProperty("database.hostname", "localhost"); // host or IP
        props.setProperty("database.port", String.valueOf(3306)); // port
        props.setProperty("database.user", "cdc_user"); // user
        props.setProperty("database.password", "cdc_password"); // password
        props.setProperty("database.serverTimezone", "UTC"); // time zone
        // The next two settings select the database and table; use only one of them:
        // 1. database.whitelist: database name only (events for every table in the database)
        // 2. table.whitelist: database and table name (events for that single table)
        // (Debezium 1.3 and later also accept database.include.list / table.include.list)
//        props.setProperty("database.whitelist", "db_inventory_cdc");
        props.setProperty("table.whitelist", "db_inventory_cdc.tb_products_cdc"); // database.table

        props.setProperty("name", "engine");
        props.setProperty("key.converter.schemas.enable", "false");
        props.setProperty("value.converter.schemas.enable", "false");
        props.setProperty("include.schema.changes", "false");
        props.setProperty("tombstones.on.delete", "false");
        props.setProperty("database.history", FileDatabaseHistory.class.getCanonicalName());
        props.setProperty("database.history.store.only.monitored.tables.ddl", "true");
        props.setProperty("database.history.file.filename", "./storage/dbhistory.dat");
        props.setProperty("database.history.instance.name", UUID.randomUUID().toString());
        props.setProperty("database.history.skip.unparseable.ddl", "true");
        // Offset persistence
        props.setProperty("offset.storage", FileOffsetBackingStore.class.getCanonicalName());
        props.setProperty("offset.storage.file.filename", "./tmp/offsets.dat");
        props.setProperty("offset.flush.interval.ms", "6000");
        return props;
    }

}
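
Demo01 hands the engine to an executor and never stops it. DebeziumEngine is both a Runnable and a Closeable, so one way to shut it down cleanly is sketched below using only the JDK. The class name EngineRunner, the 10-second wait, and the FakeEngine stand-in (included so the sketch runs without a database) are assumptions for illustration, not part of Debezium.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class EngineRunner {

    /** Starts the engine on its own thread and returns an action that stops it. */
    public static <E extends Runnable & Closeable> Runnable start(E engine) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(engine);
        return () -> {
            try {
                engine.close(); // ask the engine to stop
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                executor.shutdown(); // accept no further tasks
            }
            try {
                // Give the engine thread some time to finish (10 s is an arbitrary choice)
                if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                    executor.shutdownNow();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                executor.shutdownNow();
            }
        };
    }

    /** Stand-in for a real engine: runs until close() is called. */
    public static class FakeEngine implements Runnable, Closeable {
        private final CountDownLatch stop = new CountDownLatch(1);
        public volatile boolean finished = false;

        @Override
        public void run() {
            try {
                stop.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            finished = true;
        }

        @Override
        public void close() {
            stop.countDown();
        }
    }

    public static void main(String[] args) {
        Runnable stop = start(new FakeEngine());
        // Stop the engine when the JVM exits (Ctrl+C, SIGTERM, ...)
        Runtime.getRuntime().addShutdownHook(new Thread(stop));
    }
}
```

In Demo01, the last two lines of main could then become `Runnable stop = EngineRunner.start(engine);` followed by registering `stop` as a shutdown hook.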

2.3 Code: creating the engine with Connect.class

import io.debezium.embedded.Connect;
import org.apache.kafka.connect.source.SourceRecord;

……

DebeziumEngine<ChangeEvent<SourceRecord, SourceRecord>> engine = DebeziumEngine.create(Connect.class)
        .using(props)
        .notifying(record -> {
            System.out.println("record.key = " + record.key());
            System.out.println("record.value = " + record.value());
            System.out.println("record.value.sourcePartition = " + record.value().sourcePartition());
            System.out.println("record.value.sourceOffset = " + record.value().sourceOffset());
            System.out.println("record.value.key = " + record.value().key());
            System.out.println("record.value.value = " + record.value().value());
            System.out.println("record.destination = " + record.destination());
        })
        .using((success, message, error) -> {
            // Code omitted
        }).build();

2.4 Code: processing CDC events in batches

DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
        .using(props)
        .notifying((records, committer) -> {
            // records holds the events of the current batch
            for (ChangeEvent<String, String> record : records) {
                System.out.println("record.key() = " + record.key());
                System.out.println("record.value() = " + record.value());
                // Mark each record as processed so that its offset can be committed
                committer.markProcessed(record);
            }
            // Once the whole batch has been handled, mark it as finished
            committer.markBatchFinished();
        })
        .using((success, message, error) -> {
            // Code omitted
        }).build();

2.5 Start the program

  • Sample console output
record.key() = {"id":1}
record.value() = {"before":null,"after":{"id":1,"name":"xiaowang","age":34,"address":"beijing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":2}
record.value() = {"before":null,"after":{"id":2,"name":"lilei","age":12,"address":"chongqing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":3}
record.value() = {"before":null,"after":{"id":3,"name":"meimei","age":19,"address":"chengdu"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":4}
record.value() = {"before":null,"after":{"id":4,"name":"chunfang","age":18,"address":"chongqing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":6}
record.value() = {"before":null,"after":{"id":6,"name":"lili","age":15,"address":"chongqing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":7}
record.value() = {"before":null,"after":{"id":7,"name":"xiaoming","age":21,"address":"beijing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":8}
record.value() = {"before":null,"after":{"id":8,"name":"wangwu","age":24,"address":"hunan"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":9}
record.value() = {"before":null,"after":{"id":9,"name":"tom","age":12,"address":"beijing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":10}
record.value() = {"before":null,"after":{"id":10,"name":"whitee","age":55,"address":"hebei"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":11}
record.value() = {"before":null,"after":{"id":11,"name":"meimei","age":31,"address":"hebei"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":12}
record.value() = {"before":null,"after":{"id":12,"name":"hehe","age":13,"address":"nanjing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":13}
record.value() = {"before":null,"after":{"id":13,"name":"liyang","age":16,"address":"beijing"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":14}
record.value() = {"before":null,"after":{"id":14,"name":"huahua","age":21,"address":"shandong"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"true","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
record.key() = {"id":15}
record.value() = {"before":null,"after":{"id":15,"name":"zhangsan","age":25,"address":"hubei"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":0,"snapshot":"last","db":"y1","table":"tb_test_from","server_id":0,"gtid":null,"file":"binlog.000030","pos":115750444,"row":0,"thread":null,"query":null},"op":"r","ts_ms":1679554873026,"transaction":null}
// (one log line containing an IP address was removed here)
record.key() = {"id":14}
record.value() = {"before":{"id":14,"name":"huahua","age":21,"address":"shandong"},"after":{"id":14,"name":"huahua","age":10,"address":"shandong"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":1679554893000,"snapshot":"false","db":"y1","table":"tb_test_from","server_id":1,"gtid":null,"file":"binlog.000030","pos":115750677,"row":0,"thread":26773,"query":null},"op":"u","ts_ms":1679554893186,"transaction":null}
record.key() = {"id":16}
record.value() = {"before":null,"after":{"id":16,"name":"alice","age":18,"address":"dalian"},"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":1679554929000,"snapshot":"false","db":"y1","table":"tb_test_from","server_id":1,"gtid":null,"file":"binlog.000030","pos":115752466,"row":0,"thread":26773,"query":null},"op":"c","ts_ms":1679554929396,"transaction":null}
record.key() = {"id":6}
record.value() = {"before":{"id":6,"name":"lili","age":15,"address":"chongqing"},"after":null,"source":{"version":"1.4.2.Final","connector":"mysql","name":"my_server_01","ts_ms":1679554936000,"snapshot":"false","db":"y1","table":"tb_test_from","server_id":1,"gtid":null,"file":"binlog.000030","pos":115752780,"row":0,"thread":26773,"query":null},"op":"d","ts_ms":1679554936366,"transaction":null}
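
In the sample output, the "op" field tells the events apart: "r" for rows read during the initial snapshot, "c" for an insert, "u" for an update, and "d" for a delete. The sketch below pulls that field out of the JSON string and maps it to a label. It uses a regular expression only to stay dependency-free and takes the last "op" pair in the string, on the assumption that the envelope field follows the row data as in the output above. A JSON parser would be the more robust choice in real code. The class name OpCodes and its methods are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OpCodes {

    // Matches an "op":"x" pair as it appears in the sample output.
    private static final Pattern OP = Pattern.compile("\"op\"\\s*:\\s*\"([a-z])\"");

    /** Returns the operation code of a change event value, or null if none is found. */
    public static String extractOp(String json) {
        if (json == null) {
            return null;
        }
        Matcher m = OP.matcher(json);
        String op = null;
        while (m.find()) {
            op = m.group(1); // keep the last match: the envelope field comes after the row data
        }
        return op;
    }

    /** Maps a Debezium operation code to a readable label. */
    public static String describe(String op) {
        if (op == null) {
            return "unknown";
        }
        switch (op) {
            case "c": return "insert";
            case "u": return "update";
            case "d": return "delete";
            case "r": return "snapshot read";
            default:  return "unknown";
        }
    }
}
```

Inside the notifying callback of the simple example this could be used as `OpCodes.describe(OpCodes.extractOp(record.value()))`.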