Kafka topic: Rides
Messages:
{"rideId":18024,"isStart":false,"taxiId":2013006624,"eventTime":"2013-01-01T00:53:00Z","lon":-73.98275,"lat":40.747662,"psgCnt":3}
{"rideId":18052,"isStart":false,"taxiId":2013005498,"eventTime":"2013-01-01T00:53:00Z","lon":-74.008278,"lat":40.709415,"psgCnt":1}
{"rideId":18087,"isStart":false,"taxiId":2013005514,"eventTime":"2013-01-01T00:53:00Z","lon":-74.012589,"lat":40.702877,"psgCnt":5}
{"rideId":18130,"isStart":false,"taxiId":2013004151,"eventTime":"2013-01-01T00:53:00Z","lon":-73.992706,"lat":40.731556,"psgCnt":4}
{"rideId":18136,"isStart":false,"taxiId":2013006161,"eventTime":"2013-01-01T00:53:00Z","lon":-73.986481,"lat":40.745056,"psgCnt":1}
{"rideId":18154,"isStart":false,"taxiId":2013005861,"eventTime":"2013-01-01T00:53:00Z","lon":-73.955597,"lat":40.771858,"psgCnt":5}
{"rideId":18156,"isStart":false,"taxiId":2013005865,"eventTime":"2013-01-01T00:53:00Z","lon":-74.008041,"lat":40.708866,"psgCnt":1}
{"rideId":18516,"isStart":false,"taxiId":2013001713,"eventTime":"2013-01-01T00:53:00Z","lon":-73.938354,"lat":40.813011,"psgCnt":1}
{"rideId":18528,"isStart":false,"taxiId":2013002830,"eventTime":"2013-01-01T00:53:00Z","lon":-74.015312,"lat":40.708263,"psgCnt":1}
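To make the message layout concrete, here is a minimal Python sketch that parses one of the sample messages with the standard library and converts eventTime into a timezone-aware datetime. It is not part of the original setup; the helper name parse_ride is hypothetical and is only meant to show which types each field carries.

```python
import json
from datetime import datetime, timezone

# One of the sample messages from the Rides topic, copied verbatim.
SAMPLE = (
    '{"rideId":18024,"isStart":false,"taxiId":2013006624,'
    '"eventTime":"2013-01-01T00:53:00Z","lon":-73.98275,'
    '"lat":40.747662,"psgCnt":3}'
)


def parse_ride(line):
    """Parse one Rides message into a dict (hypothetical helper)."""
    ride = json.loads(line)
    # eventTime is an ISO-8601 string in UTC; the trailing 'Z' is matched literally.
    ts = datetime.strptime(ride["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
    ride["eventTime"] = ts.replace(tzinfo=timezone.utc)
    return ride


ride = parse_ride(SAMPLE)
print(ride["rideId"], ride["isStart"], ride["eventTime"].isoformat())
```

Note that eventTime arrives as a string rather than a number, which matters when declaring the field's type in the format schema.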
DDL statement
-- Kafka source table reading JSON messages from the Rides topic
CREATE TABLE consume_Rides (
  `rideId` BIGINT,
  `taxiId` BIGINT,
  `isStart` BOOLEAN,
  `lon` FLOAT,
  `lat` FLOAT,
  `eventTime` TIMESTAMP,
  `psgCnt` INT
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',
  'connector.topic' = 'Rides',
  'connector.properties.zookeeper.connect' = 'zookeeper:2181',
  'connector.properties.bootstrap.servers' = 'kafka:9092',
  'connector.properties.group.id' = 'testGroup',
  'connector.startup-mode' = 'earliest-offset',
  'update-mode' = 'append',
  'format.type' = 'json',
  'format.json-schema' = '{
    "type": "object",
    "properties": {
      "rideId": {"type": "integer"},
      "taxiId": {"type": "integer"},
      "isStart": {"type": "boolean"},
      "lon": {"type": "number"},
      "lat": {"type": "number"},
      "eventTime": {"type": "string", "format": "date-time"},
      "psgCnt": {"type": "integer"}
    }
  }'
)
YAML configuration file
tables:
  - name: Rides
    type: source
    update-mode: append
    schema:
    - name: rideId
      type: LONG
    - name: taxiId
      type: LONG
    - name: isStart
      type: BOOLEAN
    - name: lon
      type: FLOAT
    - name: lat
      type: FLOAT
    - name: rideTime
      type: TIMESTAMP
      rowtime:
        timestamps:
          type: "from-field"
          from: "eventTime"
        watermarks:
          type: "periodic-bounded"
          delay: "60000"
    - name: psgCnt
      type: INT
    connector:
      property-version: 1
      type: kafka
      version: universal
      topic: Rides
      startup-mode: earliest-offset
      properties:
      - key: zookeeper.connect
        value: zookeeper:2181
      - key: bootstrap.servers
        value: kafka:9092
      - key: group.id
        value: testGroup
    format:
      property-version: 1
      type: json
      schema: "ROW(rideId LONG, isStart BOOLEAN, eventTime TIMESTAMP, lon FLOAT, lat FLOAT, psgCnt INT, taxiId LONG)"
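The rowtime settings in the YAML take the event-time attribute rideTime from the eventTime field and use a "periodic-bounded" watermark with a delay of 60000 ms. The idea behind a bounded-delay watermark can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Flink's implementation: the class name BoundedWatermark and the sample timestamps are hypothetical, and the watermark is modelled as the largest timestamp seen so far minus the delay.

```python
DELAY_MS = 60_000  # same value as delay: "60000" in the YAML configuration


class BoundedWatermark:
    """Simplified model of a bounded-delay watermark (hypothetical class)."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.max_ts = None  # largest event timestamp observed so far

    def observe(self, event_ts_ms):
        # Out-of-order events never move the maximum backwards.
        if self.max_ts is None or event_ts_ms > self.max_ts:
            self.max_ts = event_ts_ms

    def current(self):
        # No watermark exists until at least one event has been seen.
        if self.max_ts is None:
            return None
        return self.max_ts - self.delay_ms


wm = BoundedWatermark(DELAY_MS)
for ts in (1_000_000, 1_030_000, 1_010_000):  # made-up timestamps in ms
    wm.observe(ts)
print(wm.current())
```

With this model, events may arrive up to one minute out of order before the watermark passes them.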

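This article describes a real-time data processing setup built on Kafka and Flink, in which Flink consumes messages from the Rides topic to analyze taxi ride data in real time. The stream consists of records that each carry the ride ID, a flag indicating whether the event marks the start of a ride, the taxi ID, longitude, latitude, event time, and passenger count.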