1. Overview
Official site: https://hudi.apache.org
Gitee mirror: https://gitee.com/apache/Hudi
1.1 Architecture
1.2 Features
- Upserts and deletes with fast, pluggable indexing.
- Incremental queries and record-level change streams.
- Transactions, rollbacks, and concurrency control.
- SQL reads and writes from Spark, Presto, Trino, Hive, and more.
- Automatic file sizing, data clustering, compaction, and cleaning.
- Streaming ingestion, with built-in CDC sources and tools.
- Built-in metadata tracking for scalable storage access.
- Backwards-compatible schema evolution and enforcement.
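
To make the first two features more concrete, here is a minimal sketch of what upsert and incremental-query semantics mean. It is a toy in-memory model, not Hudi's API: the class `ToyTable`, its methods, and the integer commit counter are all hypothetical names introduced only for illustration, and nothing here reflects how Hudi stores or indexes data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy model of upsert and incremental-query semantics (NOT Hudi's API)."""

    # record key -> (commit time of last write, payload); hypothetical layout
    rows: dict = field(default_factory=dict)
    commit_time: int = 0

    def upsert(self, records):
        """Insert new keys and overwrite existing keys in one commit."""
        self.commit_time += 1
        for key, payload in records.items():
            self.rows[key] = (self.commit_time, payload)
        return self.commit_time

    def snapshot(self):
        """Return the latest value of every record."""
        return {key: payload for key, (_, payload) in self.rows.items()}

    def incremental(self, since):
        """Return only records written after commit `since`."""
        return {
            key: payload
            for key, (written_at, payload) in self.rows.items()
            if written_at > since
        }


table = ToyTable()
c1 = table.upsert({"u1": {"city": "Beijing"}, "u2": {"city": "Shanghai"}})
c2 = table.upsert({"u2": {"city": "Shenzhen"}, "u3": {"city": "Hangzhou"}})

latest = table.snapshot()          # u2 now holds the updated value
changed = table.incremental(c1)    # only u2 and u3 changed after commit c1
```

The second `upsert` updates `u2` in place and inserts `u3`, so a snapshot read sees three records while an incremental read since `c1` sees only the two that changed. A downstream job that reads incrementally can therefore process just the changes instead of rescanning the whole table.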