Flink Stream SQL
Overview
Blog: https://flink.apache.org/news/2016/05/24/stream-sql.html
The blog points out the problems with the current Table API: the batch and stream APIs each support a different set of queries.
However, the original Table API had a few limitations. First of all, it could not stand alone. Table API queries had to be always embedded into a DataSet or DataStream program. Queries against batch Tables did not support outer joins, sorting, and many scalar functions which are commonly used in SQL queries. Queries against streaming tables only supported filters, union, and projections and no aggregations or joins. Also, the translation process did not leverage query optimization techniques except for the physical optimization that is applied to all DataSet programs.
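The goal is not to build yet another one of the many SQL-on-Hadoop implementations.
Flink continues to use Calcite, applying different rule sets depending on the source (streaming or static data).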
Expressing window aggregations and joins in stream SQL relies on Calcite's streaming extensions to standard SQL: https://calcite.apache.org/docs/stream.html
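To make the semantics of a tumbling-window aggregation concrete, here is a minimal sketch in plain Python. It is not Flink or Calcite code. The helper names `tumble_end` and `tumbling_avg` are hypothetical, and windows are assumed to be aligned to the Unix epoch. It also processes a finite list in one pass, whereas a real streaming engine would emit each window's result incrementally once the window closes.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def tumble_end(ts, size):
    """Return the exclusive end of the fixed-size window that contains ts."""
    index = (ts - EPOCH) // size      # timedelta // timedelta yields an int
    return EPOCH + (index + 1) * size


def tumbling_avg(events, size):
    """Average values per (window end, key).

    events: iterable of (timestamp, key, value) tuples.
    Returns a dict mapping (window_end, key) to the average value.
    """
    acc = defaultdict(lambda: [0.0, 0])   # (window_end, key) -> [sum, count]
    for ts, key, value in events:
        slot = acc[(tumble_end(ts, size), key)]
        slot[0] += value
        slot[1] += 1
    return {k: total / count for k, (total, count) in acc.items()}
```

Each event belongs to exactly one window, which is what distinguishes a tumbling window from a sliding (hopping) one, where windows overlap.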
Below is an example of a tumbling window (Calcite syntax):
SELECT STREAM
TUMBLE_END(time, INTERVAL '1' DAY) AS day,
location AS room,
AVG((tempF - 32) * 0.556) AS avgTempC