
Big Data
慢熟的孩子
A child whose thinking matures slowly
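Advanced Spark Streaming
Contents: Stateful operators: updateStateByKey; Hands-on: writing the cumulative word counts seen so far to MySQL; Window-based statistics; Hands-on: blacklist filtering; Hands-on: integrating Spark Streaming with Spark SQL.
The updateStateByKey operator. Requirement: count how many times each word has appeared so far (the state from earlier batches must be retained). import org.apache.spark.SparkConf import o...
Original · 2019-10-25 11:07:35 · 196 views · 0 comments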
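The excerpt above is cut off before the post's own code, so what follows is only a minimal sketch of the idea, not the author's implementation. It uses plain Scala with no Spark dependency. `updateFunction` has the same shape as the function that `DStream.updateStateByKey` expects (this batch's new values for a key, plus the previous state, give the new state). `UpdateStateSketch` and `applyBatch` are hypothetical names that stand in for what Spark does on each batch.

```scala
object UpdateStateSketch {

  // Same shape as the function handed to DStream.updateStateByKey:
  // (values seen for this key in the current batch, previous state) => new state.
  def updateFunction(newValues: Seq[Int], previous: Option[Int]): Option[Int] =
    Some(newValues.sum + previous.getOrElse(0))

  // Hypothetical in-memory stand-in for one streaming batch:
  // map each word to 1, group by word, then merge with the previous state.
  def applyBatch(state: Map[String, Int], batch: Seq[String]): Map[String, Int] = {
    val grouped: Map[String, Seq[Int]] =
      batch.groupBy(identity).map { case (word, occurrences) => word -> occurrences.map(_ => 1) }

    // Keys already in the state are revisited even when the batch has no new
    // values for them, which mirrors how updateStateByKey treats existing keys.
    val keys = state.keySet ++ grouped.keySet
    keys.flatMap { key =>
      updateFunction(grouped.getOrElse(key, Seq.empty), state.get(key)).map(count => key -> count)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Two made-up batches of words, processed in arrival order.
    val batches = Seq(Seq("a", "b", "a"), Seq("b", "c"))
    val finalState = batches.foldLeft(Map.empty[String, Int])(applyBatch)
    println(finalState)
  }
}
```

In a real Spark Streaming job the same `updateFunction` would be passed to `updateStateByKey` on a DStream of `(word, 1)` pairs, and a checkpoint directory has to be set on the `StreamingContext` for stateful operators to work.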
Spark Streaming Processing Examples
Contents: Processing socket data with Spark Streaming; Processing HDFS file data with Spark Streaming.
Processing socket data with Spark Streaming. Code: import org.apache.spark.SparkConf import org.apache.spark.streaming.{Seconds, StreamingContext} /** * Spark Stre...
Original · 2019-10-25 08:08:32 · 156 views · 0 comments
Spark Streaming Core
Contents: Core concepts; Transformations; Output Operations.
Core concepts. StreamingContext: To initialize a Spark Streaming program, a StreamingContext object has to be created which is the main entry point of all Spark Strea...
Original · 2019-10-25 00:29:43 · 153 views · 0 comments
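Getting Started with Spark Streaming
Contents: Overview; Use cases; Integration with the Spark ecosystem; History; Starting from a word-count example; How it works.
Overview. Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams....
Original · 2019-10-24 18:36:12 · 278 views · 0 comments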
Integrating Flume and Kafka: A Combined Example
Next come the individual Flume configuration files.
avro-memory-kafka
avro-memory-kafka.sources = avro-source
avro-memory-kafka.sinks = kafka-sink
avro-memory-kafka.channels = memory-channel
avro-memory-kafka.sources.avro-source....
Original · 2019-10-24 12:26:15 · 412 views · 0 comments
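Learning Kafka
Kafka architecture:
producer: makes the steamed buns (the cook)
consumer: eats the steamed buns (the customer)
broker: the basket
topic: a label attached to each bun; buns labeled topica are for customer 1, buns labeled topicb are for customer 2
Kafka is run as a cluster on one or more servers that can span multiple datace...
Original · 2019-10-20 14:18:35 · 235 views · 0 comments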