While using Flink CDC to sync data from MySQL to Doris, I ran into this error: Caused by: com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.errors.ConnectException: Data row is smaller than a column index, internal schema representation is probably out of sync with real database schema
The MySQL source was configured with scan.startup.mode=EARLIEST_OFFSET.
Looking at the error more closely, it contains this line: com.ververica.cdc.connectors.mysql.debezium.task.context.exception.SchemaOutOfSyncException: Internal schema representation is probably out of sync with real database schema. The reason could be that the table schema was changed after the starting binlog offset, which is not supported when startup mode is set to EARLIEST_OFFSET
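In other words: with scan.startup.mode=EARLIEST_OFFSET, a table whose schema changed after the starting binlog offset is not supported.
So the cause is most likely that the table schema was modified at some point after the earliest offset in the binlog. Older binlog rows then have fewer columns than the schema the connector expects, which matches the "Data row is smaller than a column index" message.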
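My fix: in the source configuration, set 'scan.startup.mode' = 'timestamp' together with 'scan.startup.timestamp-millis' = '1667232000000', so that consumption starts from a timestamp instead of the earliest offset. This should only help if the chosen timestamp is later than the most recent schema change.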
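The value of 'scan.startup.timestamp-millis' is an epoch time in milliseconds. The post does not say how 1667232000000 was chosen; as a minimal sketch, assuming it was meant as midnight on 2022-11-01 in UTC+8, the conversion could look like this. The helper name to_epoch_millis is hypothetical and is not part of Flink CDC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds (hypothetical helper)."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


# Assumption: the start point is 2022-11-01 00:00:00 in UTC+8.
utc_plus_8 = timezone(timedelta(hours=8))
start_millis = to_epoch_millis(datetime(2022, 11, 1, 0, 0, 0, tzinfo=utc_plus_8))

# Print the two options in the same quoting style used in this post.
print("'scan.startup.mode' = 'timestamp'")
print(f"'scan.startup.timestamp-millis' = '{start_millis}'")
```

Pick a moment that is after the last schema change on the table and still covered by the binlog files retained on the MySQL server.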
The full error is as follows:
2024-01-25 09:00:07,079 WARN org.apache.flink.runtime.taskmanager.Task [] - Source: MySQL Source -> Sink: Writer -> Sink: Committer (1/1)#619 (eb387b512a8060e9924a8d5f8c9d49c7_cbc357ccb763df2852fee8c4fc7d55f2_0_619) switched from RUNNING to FAILED with failure cause: java.lang.RuntimeException: One or more fetchers have encountered exception
at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:225)
at org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:169)
at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:130)
at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:385)
at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
at org.apache.flink.streaming