Spark Datasource Configs
- Read options
Config | Required | Default | Description |
---|---|---|---|
as.of.instant | N | N/A | Added in 0.9.0; the instant to time-travel to. Two value formats are accepted: yyyyMMddHHmmss and yyyy-MM-dd HH:mm:ss. If unspecified, the latest snapshot is queried |
hoodie.file.index.enable | N | true | |
hoodie.schema.on.read.enable | N | false | |
hoodie.datasource.streaming.startOffset | N | earliest | |
hoodie.datasource.write.precombine.field | N | ts | |
hoodie.datasource.read.begin.instanttime | Y | N/A | |
hoodie.datasource.read.end.instanttime | Y | N/A | |
hoodie.datasource.read.paths | Y | N/A | |
hoodie.datasource.merge.type | N | payload_combine | |
hoodie.datasource.query.incremental.format | N | latest_state | |
hoodie.datasource.query.type | N | snapshot | |
hoodie.datasource.read.extract.partition.values.from.path | N | false | |
hoodie.datasource.read.file.index.listing.mode | N | lazy | |
hoodie.datasource.read.file.index.listing.partition-path-prefix.analysis.enabled | N | true | |
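As a minimal sketch of how the read options above are used, the keys can be collected into plain dictionaries and passed to a Hudi read. The table path and instant values here are hypothetical placeholders:

```python
# Time-travel read: query the table as of a specific instant.
# The timestamp below is a placeholder in yyyyMMddHHmmss format.
time_travel_opts = {
    "as.of.instant": "20230101123045",
}

# Incremental read: pull only commits in (begin, end].
incremental_opts = {
    "hoodie.datasource.query.type": "incremental",
    "hoodie.datasource.read.begin.instanttime": "20230101000000",
    "hoodie.datasource.read.end.instanttime": "20230102000000",
}

# Typical usage (requires a SparkSession with the Hudi bundle on the classpath):
#   spark.read.format("hudi").options(**incremental_opts).load("/path/to/table")
```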
- Write options
Config | Required | Default | Description |
---|---|---|---|
hoodie.datasource.hive_sync.mode | Y | N/A | |
hoodie.datasource.write.partitionpath.field | Y | N/A | |
hoodie.datasource.write.precombine.field | N | ts | |
hoodie.datasource.write.recordkey.field | Y | N/A | |
hoodie.datasource.write.table.type | N | COPY_ON_WRITE | |
hoodie.datasource.write.insert.drop.duplicates | N | false | If set to true, all duplicate records are filtered out on insert |
hoodie.sql.insert.mode | N | upsert | |
hoodie.sql.bulk.insert.enable | N | false | |
hoodie.datasource.write.table.name | Y | N/A | |
hoodie.datasource.write.operation | N | upsert | |
hoodie.datasource.write.payload.class | N | org.apache.hudi.common.model.OverwriteWithLatestAvroPayload | Payload class to use. Defaults to OverwriteWithLatestAvroPayload for Spark and org.apache.hudi.common.model.EventTimeAvroPayload for Flink |
hoodie.datasource.write.partitionpath.urlencode | N | false | |
hoodie.datasource.hive_sync.partition_fields | N | N/A | |
hoodie.datasource.hive_sync.auto_create_database | N | true | Automatically create the database if it does not exist |
hoodie.datasource.hive_sync.database | N | default | |
hoodie.datasource.hive_sync.table | N | unknown | |
hoodie.datasource.hive_sync.username | N | hive | |
hoodie.datasource.hive_sync.password | N | hive | |
hoodie.datasource.hive_sync.enable | N | false | |
hoodie.datasource.hive_sync.ignore_exceptions | N | false | |
hoodie.datasource.hive_sync.use_jdbc | N | true | |
hoodie.datasource.hive_sync.jdbcurl | N | jdbc:hive2://localhost:10000 | HiveServer2 JDBC URL used for sync |
hoodie.datasource.hive_sync.metastore.uris | N | thrift://localhost:9083 | Hive metastore URI |
hoodie.datasource.hive_sync.base_file_format | N | PARQUET | |
hoodie.datasource.hive_sync.support_timestamp | N | false | |
hoodie.datasource.meta.sync.enable | N | false | |
hoodie.clustering.inline | N | false | |
hoodie.datasource.write.partitions.to.delete | Y | N/A | Comma-separated list of partitions to delete; supports the * wildcard |
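The core write options above can likewise be sketched as an option map. The table name, record key, partition field, and metastore URI below are hypothetical and should be replaced with your own values:

```python
# Minimal upsert configuration for a hypothetical table "my_table"
# partitioned by "dt", keyed by "uuid", deduplicated on "ts".
write_opts = {
    "hoodie.datasource.write.table.name": "my_table",
    "hoodie.datasource.write.recordkey.field": "uuid",
    "hoodie.datasource.write.partitionpath.field": "dt",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.operation": "upsert",
    "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
    # Optional Hive sync: register the table in the Hive metastore.
    "hoodie.datasource.hive_sync.enable": "true",
    "hoodie.datasource.hive_sync.mode": "hms",
    "hoodie.datasource.hive_sync.metastore.uris": "thrift://localhost:9083",
}

# Typical usage:
#   df.write.format("hudi").options(**write_opts).mode("append").save(base_path)
```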
- Pre-commit validator options
Config | Required | Default | Description |
---|---|---|---|
hoodie.precommit.validators | N | | |
hoodie.precommit.validators.equality.sql.queries | N | | |
hoodie.precommit.validators.inequality.sql.queries | N | | |
hoodie.precommit.validators.single.value.sql.queries | N | | |
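As a sketch of wiring up a pre-commit validator: the options take a validator class and SQL queries over the staged data. The fully-qualified class name and the query-plus-expected-result syntax below are best-effort assumptions and should be checked against your Hudi version's docs:

```python
# Hypothetical example: fail the commit if any record has a null key.
# The "#0" suffix is assumed to separate the query from its expected result.
validator_opts = {
    "hoodie.precommit.validators":
        "org.apache.hudi.client.validator.SqlQuerySingleResultPreCommitValidator",
    "hoodie.precommit.validators.single.value.sql.queries":
        "select count(*) from <TABLE_NAME> where uuid is null#0",
}

# These options are merged into the regular write options:
#   df.write.format("hudi").options(**write_opts).options(**validator_opts)...
```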