Description
To get a list of Hive partitions, dask-sql runs the command SHOW PARTITIONS:
dask-sql/dask_sql/input_utils/hive.py
Lines 265 to 277 in ece7ec7

    def _parse_hive_partition_description(
        self,
        cursor: Union["sqlalchemy.engine.base.Connection", "hive.Cursor"],
        schema: str,
        table_name: str,
    ):
        """
        Extract all partition informaton for a given table
        """
        cursor.execute(f"USE {schema}")
        result = self._fetch_all_results(cursor, f"SHOW PARTITIONS {table_name}")
        return [row[0] for row in result]

For tables with multiple partition keys, each returned entry is a single string in which the key=value pairs are separated by /. For example, if the table has two partition keys, date and region, one of the returned entries might look like this:
date=20210101/region=south
This string is passed unmodified to DESCRIBE FORMATTED {table_name} PARTITION ({partition}) to get additional information about each partition. When the table has multiple partition keys, however, this command expects the key=value pairs to be separated by commas (,), not slashes (/).
dask-sql/dask_sql/input_utils/hive.py
Lines 200 to 203 in ece7ec7

    if partition:
        result = self._fetch_all_results(
            cursor, f"DESCRIBE FORMATTED {table_name} PARTITION ({partition})"
        )

Because of this, dask-sql throws an error when I try to read a table with multiple Hive partition keys.
Issue #179 might also be related to the example above.
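One possible direction for a fix, as a minimal sketch only: convert each entry returned by SHOW PARTITIONS from the / separated form into a , separated form before it is interpolated into the DESCRIBE FORMATTED statement. The helper name `partition_path_to_spec`, the `quote_values` option, and the table name `my_table` are hypothetical and not part of dask-sql. Whether the values also need to be quoted depends on the partition column types, so quoting is left as an opt-in assumption here.

```python
def partition_path_to_spec(partition: str, quote_values: bool = False) -> str:
    """Turn 'date=20210101/region=south' into 'date=20210101, region=south'.

    Hypothetical helper, not part of dask-sql. With quote_values=True
    (an assumption, useful for string-typed partition columns) each value
    is wrapped in single quotes.
    """
    pairs = []
    for item in partition.split("/"):
        # Split only on the first "=" so values containing "=" stay intact.
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"Unexpected partition component: {item!r}")
        if quote_values:
            value = "'" + value.replace("'", "\\'") + "'"
        pairs.append(f"{key}={value}")
    return ", ".join(pairs)


# Example: build the statement for one entry of the SHOW PARTITIONS result.
spec = partition_path_to_spec("date=20210101/region=south")
query = f"DESCRIBE FORMATTED my_table PARTITION ({spec})"
```

With a single partition key the string contains no /, so it passes through unchanged, which would keep the currently working case intact.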