Commit 856aea1

[Fix][Hive] Writing parquet files supports the optional timestamp int96 (#8509)
1 parent df14201 commit 856aea1

3 files changed, +39 -28 lines changed

docs/en/connector-v2/sink/Hive.md

+19 -14
@@ -31,20 +31,21 @@ By default, we use 2PC commit to ensure `exactly-once`
 
 ## Options
 
-| name | type | required | default value |
-|-------------------------------|---------|----------|----------------|
-| table_name | string | yes | - |
-| metastore_uri | string | yes | - |
-| compress_codec | string | no | none |
-| hdfs_site_path | string | no | - |
-| hive_site_path | string | no | - |
-| hive.hadoop.conf | Map | no | - |
-| hive.hadoop.conf-path | string | no | - |
-| krb5_path | string | no | /etc/krb5.conf |
-| kerberos_principal | string | no | - |
-| kerberos_keytab_path | string | no | - |
-| abort_drop_partition_metadata | boolean | no | true |
-| common-options | | no | - |
+| name | type | required | default value |
+|---------------------------------------|---------|----------|----------------|
+| table_name | string | yes | - |
+| metastore_uri | string | yes | - |
+| compress_codec | string | no | none |
+| hdfs_site_path | string | no | - |
+| hive_site_path | string | no | - |
+| hive.hadoop.conf | Map | no | - |
+| hive.hadoop.conf-path | string | no | - |
+| krb5_path | string | no | /etc/krb5.conf |
+| kerberos_principal | string | no | - |
+| kerberos_keytab_path | string | no | - |
+| abort_drop_partition_metadata | boolean | no | true |
+| parquet_avro_write_timestamp_as_int96 | boolean | no | false |
+| common-options | | no | - |
 
 ### table_name [string]

@@ -88,6 +89,10 @@ The keytab path of kerberos
 
 Flag to decide whether to drop partition metadata from Hive Metastore during an abort operation. Note: this only affects the metadata in the metastore, the data in the partition will always be deleted(data generated during the synchronization process).
 
+### parquet_avro_write_timestamp_as_int96 [boolean]
+
+Support writing Parquet INT96 from a timestamp, only valid for parquet files.
+
 ### common options
 
 Sink plugin common parameters, please refer to [Sink Common Options](../sink-common-options.md) for details
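
For background on what the new `parquet_avro_write_timestamp_as_int96` option changes, the sketch below illustrates the INT96 timestamp layout that Hive, Impala, and Spark conventionally read from Parquet files: 12 bytes holding the nanoseconds within the day (8 bytes) followed by the Julian day number (4 bytes), both little-endian. This is an illustration only, not SeaTunnel's code: the class `Int96TimestampSketch` and its methods are hypothetical names, the timestamp is assumed to be already normalized to the writer's reference zone (usually UTC), and this commit does not show how the connector itself performs the conversion.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper, not part of SeaTunnel or Parquet.
final class Int96TimestampSketch {

    // Julian day number of 1970-01-01 (the Unix epoch date).
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;

    private Int96TimestampSketch() {}

    /** Encodes a timestamp as 12 bytes: nanos of day (8 bytes), then Julian day (4 bytes). */
    public static byte[] toInt96(LocalDateTime ts) {
        long nanosOfDay = ts.toLocalTime().toNanoOfDay();
        int julianDay = (int) (ts.toLocalDate().toEpochDay() + JULIAN_DAY_OF_EPOCH);
        return ByteBuffer.allocate(12)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(nanosOfDay)
                .putInt(julianDay)
                .array();
    }

    /** Decodes the 12-byte layout produced by toInt96 back into a timestamp. */
    public static LocalDateTime fromInt96(byte[] int96) {
        if (int96.length != 12) {
            throw new IllegalArgumentException("INT96 value must be exactly 12 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buf.getLong();
        int julianDay = buf.getInt();
        LocalDate date = LocalDate.ofEpochDay(julianDay - JULIAN_DAY_OF_EPOCH);
        return LocalDateTime.of(date, LocalTime.ofNanoOfDay(nanosOfDay));
    }
}
```

Under these assumptions, encoding `1970-01-01T00:00` yields eight zero bytes followed by the Julian day 2440588 in little-endian order (`8C 3D 25 00`). INT96 is deprecated in the Parquet format in favor of INT64 timestamps, which is consistent with the option being opt-in with a default of `false`.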

docs/zh/connector-v2/sink/Hive.md

+19 -14
@@ -31,20 +31,21 @@
 
 ## Options
 
-| name | type | required | default value |
-|-------------------------------|---------|------|---------|
-| table_name | string | yes | - |
-| metastore_uri | string | yes | - |
-| compress_codec | string | no | none |
-| hdfs_site_path | string | no | - |
-| hive_site_path | string | no | - |
-| hive.hadoop.conf | Map | no | - |
-| hive.hadoop.conf-path | string | no | - |
-| krb5_path | string | no | /etc/krb5.conf |
-| kerberos_principal | string | no | - |
-| kerberos_keytab_path | string | no | - |
-| abort_drop_partition_metadata | boolean | no | true |
-| common-options | | no | - |
+| name | type | required | default value |
+|---------------------------------------|---------|----|----------------|
+| table_name | string | yes | - |
+| metastore_uri | string | yes | - |
+| compress_codec | string | no | none |
+| hdfs_site_path | string | no | - |
+| hive_site_path | string | no | - |
+| hive.hadoop.conf | Map | no | - |
+| hive.hadoop.conf-path | string | no | - |
+| krb5_path | string | no | /etc/krb5.conf |
+| kerberos_principal | string | no | - |
+| kerberos_keytab_path | string | no | - |
+| abort_drop_partition_metadata | boolean | no | true |
+| parquet_avro_write_timestamp_as_int96 | boolean | no | false |
+| common-options | | no | - |
 
 ### table_name [string]

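@@ -88,6 +89,10 @@ The keytab file path for Kerberos
 
 Flag that decides whether to drop partition metadata from the Hive Metastore during an abort operation. Note: this only affects the metadata in the metastore; the data in the partition (data generated during the synchronization process) is always deleted.
 
+### parquet_avro_write_timestamp_as_int96 [boolean]
+
+Supports writing timestamps as Parquet INT96; only valid for Parquet files.
+
 ### Common options
 
 Common parameters of the Sink plugin; see [Sink Common Options](../sink-common-options.md) for details.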

seatunnel-connectors-v2/connector-hive/src/main/java/org/apache/seatunnel/connectors/seatunnel/hive/sink/HiveSinkFactory.java

+1
@@ -50,6 +50,7 @@ public OptionRule optionRule() {
                 .optional(BaseSinkConfig.REMOTE_USER)
                 .optional(HiveConfig.HADOOP_CONF)
                 .optional(HiveConfig.HADOOP_CONF_PATH)
+                .optional(BaseSinkConfig.PARQUET_AVRO_WRITE_TIMESTAMP_AS_INT96)
                 .build();
     }
