(2) Flink: Reading Data from Kafka and Writing It to MySQL in Batches

This article walks through installing and configuring Kafka and ZooKeeper in a local environment, then using Flink to process the stream and write the data read from Kafka into MySQL. It covers the full workflow: environment setup, code, and a run-through test.

1. First, install Kafka and ZooKeeper

(1) ZooKeeper

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz

tar -zxvf zookeeper-3.4.10.tar.gz

cp   -R  zookeeper-3.4.10  /usr/local

cp /usr/local/zookeeper-3.4.10/conf/zoo_sample.cfg  /usr/local/zookeeper-3.4.10/conf/zoo.cfg

/usr/local/zookeeper-3.4.10/bin/zkServer.sh start

/usr/local/zookeeper-3.4.10/bin/zkServer.sh status

(2) Kafka

wget https://archive.apache.org/dist/kafka/0.9.0.0/kafka_2.10-0.9.0.0.tgz

tar -zxvf kafka_2.10-0.9.0.0.tgz

cp -R kafka_2.10-0.9.0.0 /usr/local

vi /usr/local/kafka_2.10-0.9.0.0/config/server.properties

// Allow remote access

  • Uncomment line 31: listeners=PLAINTEXT://:9092
  • Uncomment line 36 and change the advertised.listeners value to PLAINTEXT://host_ip:9092 (my server IP is 192.168.0.118); the resulting lines are shown below.
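For reference, after both edits the relevant lines in server.properties look like this (substitute your own host IP for 192.168.0.118):

listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://192.168.0.118:9092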

Start Kafka:

/usr/local/kafka_2.10-0.9.0.0/bin/kafka-server-start.sh   -daemon /usr/local/kafka_2.10-0.9.0.0/config/server.properties

Create a topic:

/usr/local/kafka_2.10-0.9.0.0/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Describe the topic:

/usr/local/kafka_2.10-0.9.0.0/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test

Start a console producer and a console consumer:

/usr/local/kafka_2.10-0.9.0.0/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
/usr/local/kafka_2.10-0.9.0.0/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic test

2. Write the code

(1) pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.whl.demo</groupId>
    <artifactId>flink-demo-1</artifactId>
    <version>1.0-SNAPSHOT</version>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.5.5</version>
                <configuration>
                    <archive>
                        <manifest>
                            <!-- Entry point for the jar-with-dependencies build; assumes Main3 lives in package com.whl.demo, matching the imports shown in section (5) -->
                            <mainClass>com.whl.demo.Main3</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <!-- The executions block below binds the assembly goal to "mvn package"; without it you would have to run "mvn assembly:single" manually -->
                <executions>
                    <execution>
                        <id>make-assemble</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>

        </plugins>
    </build>

    <dependencies>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_2.11</artifactId>
            <version>1.4.0</version>
        </dependency>


        <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-java -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.11</artifactId>
            <version>1.4.0</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-scala_2.11</artifactId>
            <version>1.4.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-java -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>1.4.0</version>
        </dependency>


        <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka-0.9 -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka-0.9_2.11</artifactId>
            <version>1.4.0</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.41</version>
        </dependency>

        <dependency>
            <groupId>org.apache.kafka</groupId>
            <artifactId>kafka_2.10</artifactId>
            <version>0.9.0.0</version>
            <!-- This exclusion is required; otherwise conflicting slf4j-log4j12 jars cause the job to fail to run -->
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.34</version>
        </dependency>
    </dependencies>

</project>

(2) entity.Student

public class Student {
    public int id;
    public String name;
    public String password;
    public int age;

    public Student() {
    }

    public Student(int id, String name, String password, int age) {
        this.id = id;
        this.name = name;
        this.password = password;
        this.age = age;
    }

    @Override
    public String toString() {
        return "Student{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", password='" + password + '\'' +
                ", age=" + age +
                '}';
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}
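The sink in the next section inserts into a Student table in the test database (see the JDBC URL there). A minimal table definition such as the following should work; the column types are assumptions inferred from the POJO fields above, not taken from the original article:

CREATE TABLE Student (
    id       INT PRIMARY KEY,
    name     VARCHAR(50),
    password VARCHAR(50),
    age      INT
);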

(3) sink.SinkToMySQL

import com.alibaba.fastjson.JSON;
import com.whl.demo.entity.Student;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SinkToMySQL extends RichSinkFunction<Student> {
    PreparedStatement ps;
    private Connection connection;

    /**
     * Called once per parallel task before any records arrive: open the JDBC
     * connection and prepare the insert statement for reuse in invoke().
     *
     * @param parameters Flink configuration
     * @throws Exception if the connection or statement cannot be created
     */
    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        connection = getConnection();
        String sql = "insert into Student(id, name, password, age) values(?, ?, ?, ?);";
        ps = this.connection.prepareStatement(sql);
    }

    @Override
    public void close() throws Exception {
        super.close();
        // Close the statement first, then the connection that owns it
        if (ps != null) {
            ps.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    /**
     * Called for every record in the stream: bind the Student fields and
     * execute a single-row insert.
     *
     * @param value the Student parsed from the Kafka message
     * @param context runtime context for the sink
     * @throws Exception if the insert fails
     */
    @Override
    public void invoke(Student value, Context context) throws Exception {
        System.out.println(" value:  "  + JSON.toJSONString(value));
        ps.setInt(1, value.getId());
        ps.setString(2, value.getName());
        ps.setString(3, value.getPassword());
        ps.setInt(4, value.getAge());
        ps.executeUpdate();
    }

    private static Connection getConnection() {
        Connection con = null;
        try {
            // Plain JDBC connection; URL, user, and password as in the original setup
            Class.forName("com.mysql.jdbc.Driver");
            con = DriverManager.getConnection("jdbc:mysql://192.168.0.49:3306/test", "root", "123456");
        } catch (Exception e) {
            System.out.println("-----------mysql get connection has exception , msg = " + e.getMessage());
        }
        return con;
    }
}
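The section title mentions writing in batches, but the sink above executes one insert per record. A minimal batched variant is sketched below under the same assumptions (same Student table and connection settings); the BATCH_SIZE threshold, the manual commit, and the flush-on-close logic are my additions, not from the original article:

import com.whl.demo.entity.Student;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchSinkToMySQL extends RichSinkFunction<Student> {
    private static final int BATCH_SIZE = 100; // assumed flush threshold
    private transient Connection connection;
    private transient PreparedStatement ps;
    private int buffered = 0;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        Class.forName("com.mysql.jdbc.Driver");
        connection = DriverManager.getConnection("jdbc:mysql://192.168.0.49:3306/test", "root", "123456");
        connection.setAutoCommit(false); // commit once per batch instead of per row
        ps = connection.prepareStatement("insert into Student(id, name, password, age) values(?, ?, ?, ?)");
    }

    @Override
    public void invoke(Student value, Context context) throws Exception {
        ps.setInt(1, value.getId());
        ps.setString(2, value.getName());
        ps.setString(3, value.getPassword());
        ps.setInt(4, value.getAge());
        ps.addBatch();                  // buffer the row instead of writing immediately
        if (++buffered >= BATCH_SIZE) { // flush once the batch is full
            ps.executeBatch();
            connection.commit();
            buffered = 0;
        }
    }

    @Override
    public void close() throws Exception {
        if (ps != null) {
            if (buffered > 0) {         // flush any rows still buffered
                ps.executeBatch();
                connection.commit();
            }
            ps.close();
        }
        if (connection != null) {
            connection.close();
        }
        super.close();
    }
}

Note the trade-off: rows sit in the buffer until the batch fills, so a slow topic can delay writes; a time-based flush would address that.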

(4) Utils.KafkaUtils2

import com.alibaba.fastjson.JSON;
import com.whl.demo.entity.Student;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class KafkaUtils2 {
    public static final String broker_list = "192.168.0.118:9092";
    public static final String topic = "student";

    public static void writeToKafka() throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", broker_list);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        for (int i = 1; i <= 100; i++) {
            Student student = new Student(i, "zhisheng" + i, "password" + i, 18 + i);
            ProducerRecord<String, String> record = new ProducerRecord<>(topic, null, null, JSON.toJSONString(student));
            producer.send(record);
            System.out.println("send data: " + JSON.toJSONString(student));
        }
        producer.flush();
    }

    public static void main(String[] args) throws InterruptedException {
        writeToKafka();
    }
}

(5) Main3

import com.alibaba.fastjson.JSON;
import com.whl.demo.entity.Student;
import com.whl.demo.sink.SinkToMySQL;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;

import java.util.Properties;
public class Main3 {

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.put("bootstrap.servers", "192.168.0.118:9092");
        props.put("zookeeper.connect", "192.168.0.118:2181");
        props.put("group.id", "metric-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "latest");

        SingleOutputStreamOperator<Student> student = env.addSource(new FlinkKafkaConsumer09<>(
                "student",
                new SimpleStringSchema(),
                props)).setParallelism(1)
                .map(string -> JSON.parseObject(string, Student.class));

        student.addSink(new SinkToMySQL());

        env.execute("Flink add sink");
    }
}

(6) First run Main3, then run KafkaUtils2, and you will see rows being inserted into MySQL. A quick check is shown below.
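To verify (a check added here for illustration, not from the original article), query the table from a MySQL client:

select count(*) from Student;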

 

Flink is a distributed stream-processing framework for processing and analyzing real-time data streams. Kafka is a distributed streaming platform that collects, stores, and processes large-scale data streams in real time. Reading data from Kafka with Flink and writing it to MySQL involves the following steps:

1. Configure the Kafka consumer: set the consumer properties such as bootstrap.servers (the Kafka address), group.id (the consumer group), and the topic to read.
2. Create the Flink execution environment: this defines the job's run mode and related configuration.
3. Create the Kafka data source: use Flink's Kafka consumer connector, passing the consumer properties and the topic, to pull data from Kafka.
4. Define the transformation logic: apply Flink operators such as map, filter, or reduce to the Kafka records as needed.
5. Create the MySQL sink: configure the database connection information (URL, username, password) and build a MySQL sink.
6. Write the data to MySQL: hand the transformed records to the MySQL sink, specifying the target table and the field mapping.
7. Wire up and run the job: connect the Kafka source to the MySQL sink, set the job's parallelism, and execute the Flink job.

With these steps, data in Kafka is read, transformed, and written into MySQL, completing the Kafka-to-MySQL pipeline. Note that the Kafka consumer and the MySQL connection must be configured correctly and be reachable for reads and writes to succeed. For large-scale streams you also need to consider distributed deployment, fault tolerance, and high availability to keep the system stable and performant.
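On the fault-tolerance and parallelism points, here is a minimal sketch of what enabling checkpointing and setting parallelism would look like in the Main3 driver from section (5); the 5-second interval and the parallelism of 2 are arbitrary example values, not from the original article:

        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 5 seconds so Kafka offsets and operator state can be
        // restored after a failure (the interval is an example value)
        env.enableCheckpointing(5000);

        // Job-wide default parallelism; per-operator setParallelism() overrides it
        env.setParallelism(2);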