A collection of references on Debezium, Flink, Hudi data warehouses, and the lakehouse architecture

1. The road to building the NetEase Cloud Music data warehouse:

https://mp.weixin.qq.com/s/FIKCe6oV8NproiKYzis_6w

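2. StreamSets is a company founded in 2014 by Girish Pancha, former chief product officer at Informatica, and Arvind Prabhakar, former head of a development team at Cloudera; it is headquartered in San Francisco. The StreamSets product is a big-data ETL tool that supports structured as well as semi-structured and unstructured data sources, with a drag-and-drop visual interface for designing data flows. StreamSets has three products: StreamSets Data Collector (core product, open source): the big-data ETL tool; StreamSets Data Collector Edge (open source): a component installed on IoT and similar devices, with a small memory and CPU footprint; StreamSets Control Hub (paid): pipelines built in Data Collector can be placed in Control Hub for management, which provides scheduled runs, administration, and pipeline topologies.
The rest of the introduction therefore focuses on StreamSets Data Collector, the core open-source product.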

https://blog.youkuaiyun.com/qq_39657909/article/details/107685907

3. Real-time data lake: streaming writes from Flink CDC into Hudi

https://mp.weixin.qq.com/s/JkCbvfJhdz9gT-Tw1pUBIA

4. Debezium-Flink-Hudi: real-time streaming CDC

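Debezium is a CDC (change data capture) tool that is easy to deploy and use. It efficiently extracts RDBMS data into a messaging system, where different downstream applications can consume it. Flink's ability to connect directly to both Debezium and Hudi greatly simplifies real-time data ingestion in data lake scenarios.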

https://mp.weixin.qq.com/s?__biz=MzIyMzQ0NjA0MQ==&mid=2247486157&idx=2&sn=eeb1c5f3bbeb32c99933b32152db49c9&chksm=e81f5fbbdf68d6adf91c2638cfc439353221c6a9b2730b6ed0ad69ec727c0912dfbc4b5f1cc3&scene=21#wechat_redirect
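
To make the CDC flow described above concrete, here is a minimal sketch in plain Java of what "consuming change events" means downstream. It assumes only the shape of Debezium's change event envelope (an `op` code plus `before` and `after` row images) and replays events into an in-memory map that stands in for an upsert-capable table such as Hudi. The class `CdcUpsertSketch`, its `ChangeEvent` type, and the `row` helper are hypothetical names invented for illustration; no Flink or Hudi API is used, and a real pipeline would decode these events from the messaging system rather than construct them by hand.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: replays Debezium-style change events into a keyed table. */
class CdcUpsertSketch {

    /** Simplified change event; field names mirror Debezium's envelope (op, before, after). */
    static final class ChangeEvent {
        final String op;                  // "c" = create, "u" = update, "d" = delete, "r" = snapshot read
        final Map<String, Object> before; // row image before the change (null for inserts)
        final Map<String, Object> after;  // row image after the change (null for deletes)

        ChangeEvent(String op, Map<String, Object> before, Map<String, Object> after) {
            this.op = op;
            this.before = before;
            this.after = after;
        }
    }

    /** Illustrative helper that builds a two-column row (id, name). */
    static Map<String, Object> row(int id, String name) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("id", id);
        r.put("name", name);
        return r;
    }

    /** Applies events in order, keyed by keyField, and returns the resulting table state. */
    static Map<Object, Map<String, Object>> apply(List<ChangeEvent> events, String keyField) {
        Map<Object, Map<String, Object>> table = new LinkedHashMap<>();
        for (ChangeEvent e : events) {
            switch (e.op) {
                case "c":
                case "r":
                case "u":
                    // Inserts, snapshot reads, and updates all become an upsert of the "after" image.
                    table.put(e.after.get(keyField), e.after);
                    break;
                case "d":
                    // Deletes carry the key only in the "before" image.
                    table.remove(e.before.get(keyField));
                    break;
                default:
                    throw new IllegalArgumentException("unknown op: " + e.op);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<ChangeEvent> events = List.of(
            new ChangeEvent("c", null, row(1, "a")),
            new ChangeEvent("c", null, row(2, "b")),
            new ChangeEvent("u", row(1, "a"), row(1, "a2")),
            new ChangeEvent("d", row(2, "b"), null));
        // After replay, only the updated row with id 1 remains.
        System.out.println(apply(events, "id"));
    }
}
```

The design point is that every non-delete event collapses to "write the latest `after` image under the primary key", which is why an upsert-capable sink pairs naturally with a CDC source.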

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.flink.streaming.connectors.kafka.table;

import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.serialization.SerializationSchema;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase;
imp