Impala 4.1 Release Notes
- Documentation on Iceberg support has been created, and Apache Impala has been added to the list of query engines that support Apache Iceberg.
Apache Iceberg support includes:
- Reading/writing Iceberg V1 tables:
  - Write support is only available for Iceberg tables with Parquet data files.
  - V2 tables are also readable if they don’t contain delete delta files.
  - Avro and mixed file format tables are not yet supported.
- Support for all partition transforms with unified syntax with Hive
- Partition evolution
- Schema evolution
- Time travel is available for Iceberg tables through the FOR SYSTEM_TIME AS OF and FOR SYSTEM_VERSION AS OF clauses. FOR SYSTEM_TIME AS OF conforms to the SQL:2011 standard (IMPALA-10840).
- Hive-compatible UTF-8 support in string functions, enabled by setting the query option UTF8_MODE=true (IMPALA-2019).
- Complex types enhancements:
- Support ALTER TABLE UNSET TBLPROPERTIES/SERDEPROPERTIES (IMPALA-5569).
- Support reading/writing Parquet Bloom filters for the most common types (IMPALA-10640, IMPALA-10642).
- Several improvements in scanning ORC tables. See the epic IMPALA-9040 for details (log in to see all Jira issues linked to it).
- Reduce HashTable size by packing its buckets efficiently (IMPALA-7635).
- Avoid materialization of columns for filtered out rows in Parquet tables (IMPALA-9873).
- Improve TimestampValue to String casting (IMPALA-10984).
- ACID lock timeouts are now configurable (IMPALA-11153).
- Implement adaptive 3-way quicksort in the sorter, which improves quicksort performance when the input contains a large number of duplicates (IMPALA-10961).
- Fine-grained table refreshing in catalogd at the partition level for transactional tables (IMPALA-10923).
- Improve metadata consistency and self-event detection in catalogd (IMPALA-10925).
- Skip file metadata reloading when processing AlterPartition events in the EventProcessor in catalogd (IMPALA-11050).
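
To illustrate why the 3-way quicksort item above (IMPALA-10961) helps with duplicate-heavy input, here is a minimal Python sketch of 3-way partitioning in general. It is not Impala's C++ sorter and does not model the adaptive part, that is, how the sorter decides when to use the 3-way variant. The function name `quicksort_3way` and the random pivot choice are assumptions made for this illustration only.

```python
import random


def quicksort_3way(items, lo=0, hi=None):
    """Sort items[lo..hi] in place using 3-way partitioning and return the list.

    Illustrative sketch only; not taken from the Impala code base.
    """
    if hi is None:
        hi = len(items) - 1
    while lo < hi:
        # Assumption: a random pivot keeps the sketch simple.
        pivot = items[random.randint(lo, hi)]
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # Now: items[lo..lt-1] < pivot, items[lt..gt] == pivot, items[gt+1..hi] > pivot.
        # The middle block is already in its final position and is never revisited.
        if lt - lo < hi - gt:
            quicksort_3way(items, lo, lt - 1)  # recurse into the smaller side
            lo = gt + 1                        # keep looping on the larger side
        else:
            quicksort_3way(items, gt + 1, hi)
            hi = lt - 1
    return items


if __name__ == "__main__":
    data = [3, 1, 3, 3, 2, 3, 1, 3]
    print(quicksort_3way(data))
```

Because every element equal to the pivot ends up in the middle block after a single pass, a run consisting mostly of one repeated value is finished in very few partitioning rounds, whereas a plain 2-way partition would keep re-partitioning those duplicates.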
See change log