Problem:
The job reads data from a socket stream and writes it to Kafka:
// Imports assumed for this snippet (not shown in the original post)
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.kafka.clients.producer.ProducerConfig;

Configuration conf = new Configuration();
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(conf);
env.setParallelism(4);
// EXACTLY_ONCE delivery requires checkpointing to be enabled
env.enableCheckpointing(2000, CheckpointingMode.EXACTLY_ONCE);

DataStreamSource<String> source = env.socketTextStream("172.18.26.53", 7777);

KafkaSink<String> sink = KafkaSink.<String>builder()
        .setBootstrapServers("172.18.26.218:9092")
        .setRecordSerializer(
                KafkaRecordSerializationSchema.<String>builder()
                        .setTopic("test-topic")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
        .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
        // a transactional id prefix is mandatory for EXACTLY_ONCE
        .setTransactionalIdPrefix("cz-")
        // the transaction timeout must not exceed the broker's transaction.max.timeout.ms
        .setProperty(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, 1 * 60 * 1000 + "")
        .build();

source.sinkTo(sink);
env.execute();
Submit command:
bin/flink run-application -t yarn-application -Dyarn.provided.lib.dirs="hdfs://nameservice1/flink-dist" -c com.hex.cz.CZDemo examples/FlinkTutorial-1.17-1.0-SNAPSHOT.jar
After submitting to the YARN cluster, the job fails.
Cause:
Inspecting the Flink job log with yarn logs -applicationId application_1706238034141_6936 shows:
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:439) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:289) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:316) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:301) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.FlinkKafkaInternalProducer.<init>(FlinkKafkaInternalProducer.java:55) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.KafkaWriter.getOrCreateTransactionalProducer(KafkaWriter.java:326) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.TransactionAborter.abortTransactionOfSubtask(TransactionAborter.java:104) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.TransactionAborter.abortTransactionsWithPrefix(TransactionAborter.java:82) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.TransactionAborter.abortLingeringTransactions(TransactionAborter.java:66) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.KafkaWriter.abortLingeringTransactions(KafkaWriter.java:289) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.KafkaWriter.<init>(KafkaWriter.java:176) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:111) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:57) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterStateHandler.createWriter(StatefulSinkWriterStateHandler.java:117) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.initializeState(SinkWriterOperator.java:146) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:274) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:734) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:709) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:675) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:921) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745) ~[flink-dist-1.17.0.jar:1.17.0]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) ~[flink-dist-1.17.0.jar:1.17.0]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: org.apache.kafka.common.KafkaException: class org.apache.kafka.common.serialization.ByteArraySerializer is not an instance of org.apache.kafka.common.serialization.Serializer
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:403) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:434) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.common.config.AbstractConfig.getConfiguredInstance(AbstractConfig.java:419) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
at org.apache.kafka.clients.producer.KafkaProducer.<init>(KafkaProducer.java:365) ~[FlinkTutorial-1.17-1.0-SNAPSHOT.jar:?]
... 26 more
Solution:
This is a classloading conflict around ByteArraySerializer: the user jar bundles its own copy of kafka-clients while Kafka classes are also present on the cluster classpath, so the class and the Serializer interface can end up loaded by different classloaders and the instanceof check fails with "ByteArraySerializer is not an instance of Serializer". Fix: in conf/flink-conf.yaml, change the classloader.resolve-order option from its default child-first to parent-first.
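A minimal sketch of the change, assuming the standard Flink 1.17 layout where the option lives in conf/flink-conf.yaml of the Flink distribution used to submit the job:

# conf/flink-conf.yaml
# Default is child-first; parent-first lets the classes already on the
# cluster classpath win over the copies bundled in the user jar, so only
# one ByteArraySerializer / Serializer pair is ever loaded.
classloader.resolve-order: parent-first

The same option should also be settable per job instead of cluster-wide, e.g. by adding -Dclassloader.resolve-order=parent-first to the bin/flink run-application command above; treat that as an assumption to verify in your environment.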