New Features in Apache Impala

This release of Impala contains the following changes and enhancements from previous releases.

New Features in Impala 3.2

The following sections describe the noteworthy improvements made in Impala 3.2.

For the full list of issues closed in this release, see the changelog for Impala 3.2.

Multi-cluster Support

  • Remote File Handle Cache

    Impala can now cache remote HDFS file handles when the cache_remote_file_handles impalad flag is set to true. This feature does not apply to non-HDFS tables, such as Kudu or HBase tables, or to tables that store their data on cloud services, such as S3 or ADLS. See Scalability Considerations for file handle caching in Impala.

Enhancements in Resource Management and Admission Control

  • An Admission Debug page is available in the Impala Daemon (impalad) web UI at /admission and provides the following information about Impala resource pools:
    • Pool configuration
    • Relevant pool stats
    • Queued queries, in the order in which they were queued (local to this coordinator)
    • Running queries (local to this coordinator)
    • Histogram of the distribution of peak memory usage by admitted queries
  • A new query option, NUM_ROWS_PRODUCED_LIMIT, was added to limit the number of rows returned from queries.

    Impala will cancel a query if the query produces more rows than the limit specified by this query option. The limit applies only when the results are returned to a client, e.g. for a SELECT query, but not an INSERT query. This query option is a guardrail against users accidentally submitting queries that return a large number of rows.
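    The guardrail semantics described above can be modeled in a few lines of Python. This is only a conceptual sketch, not Impala's implementation: the names enforce_row_limit and RowLimitExceeded are hypothetical, and treating a limit of 0 as "no limit" is an assumption made for the illustration.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


class RowLimitExceeded(Exception):
    """Hypothetical error standing in for a cancelled query."""


def enforce_row_limit(rows: Iterable[T], limit: int) -> Iterator[T]:
    """Yield rows until more than `limit` rows have been produced.

    A limit of 0 is treated as "no limit" (an assumption of this sketch).
    """
    produced = 0
    for row in rows:
        produced += 1
        # Producing exactly `limit` rows is allowed; one more row fails.
        if limit > 0 and produced > limit:
            raise RowLimitExceeded(
                f"query produced more than {limit} rows"
            )
        yield row
```

    In this model, a result set of exactly `limit` rows is returned in full, and the failure occurs only when one more row is produced.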

Metadata Performance Improvements

  • Automatic Metadata Sync using Hive Metastore Notification Events

    When enabled, the catalogd polls Hive Metastore (HMS) notification events at a configurable interval and syncs with HMS. You can use the new web UI pages of the catalogd to check the state of the automatic invalidate event processor.

    Note: This is a preview feature in Impala 3.2.

Compatibility and Usability Enhancements

  • Impala can now read the TIMESTAMP_MILLIS and TIMESTAMP_MICROS Parquet types. See Using Parquet File Format for Impala Tables for the Parquet support in Impala.
  • Impala can now read the complex types in ORC such as ARRAY, STRUCT, and MAP. See Using ORC File Format for Impala Tables for the ORC support in Impala.
  • The LEVENSHTEIN string function is supported.

    The function returns the Levenshtein distance between two input strings, that is, the minimum number of single-character edits required to transform one string into the other.

  • The IF NOT EXISTS clause is supported in the ALTER TABLE statement.
  • The new DEFAULT_FILE_FORMAT query option allows you to set the default table file format. This removes the need for the STORED AS <format> clause. Set this option if you prefer a value that is not TEXT. The supported formats are:
    • TEXT
    • RC_FILE
    • SEQUENCE_FILE
    • AVRO
    • PARQUET
    • KUDU
    • ORC
  • The extended or verbose EXPLAIN output includes the following new information for queries:
    • The text of the analyzed query that may have been rewritten to include various optimizations and implicit casts.
    • The implicit casts and literals shown with the actual types.
  • CPU resource utilization (user, system, iowait) metrics were added to the Impala profile output.
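To make the Levenshtein distance mentioned above concrete, the following Python sketch computes it with the standard dynamic-programming approach. It illustrates the metric only; it is not Impala's implementation, and the function name levenshtein is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in `b` so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the empty prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))

    for i, ch_a in enumerate(a, start=1):
        # Distance from the first i characters of `a` to the empty string.
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,          # delete ch_a
                    current[j - 1] + 1,       # insert ch_b
                    previous[j - 1] + substitution_cost,
                )
            )
        previous = current

    return previous[-1]
```

For example, levenshtein("kitten", "sitting") evaluates to 3: two substitutions and one insertion.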

Security Enhancement

New Features in Impala 3.1

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 3.1.

New Features in Impala 3.0

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 3.0.

New Features in Impala 2.12

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 2.12.

New Features in Impala 2.11

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 2.11.

New Features in Impala 2.10

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 2.10.

New Features in Impala 2.9

For the full list of issues closed in this release, including the issues marked as "new features" or "improvements", see the changelog for Impala 2.9.

The following are some of the most significant new features in this release:

New Features in Impala 2.8

New Features in Impala 2.7

New Features in Impala 2.6

New Features in Impala 2.5

New Features in Impala 2.4

New Features in Impala 2.3

The following are the major new features in Impala 2.3.x. This major release contains improvements to SQL syntax (particularly new support for complex types), performance, manageability, and security.

In Impala 2.3.2, the bug fix for IMPALA-2598 removes the restriction on using both Kerberos and SSL for internal communication between Impala components.

New Features in Impala 2.2

The following are the major new features in Impala 2.2. This release contains improvements to performance, manageability, security, and SQL syntax.

New Features in Impala 2.1

This release contains the following enhancements to query performance and system scalability:

New Features in Impala 2.0

The following are the major new features in Impala 2.0. This major release contains improvements to performance, scalability, security, and SQL syntax.

New Features in Impala 1.4

The following are the major new features in Impala 1.4:

New Features in Impala 1.3.2

No new features. This point release is exclusively a bug fix release for the IMPALA-1019 issue related to HDFS caching.

New Features in Impala 1.3.1

This point release is primarily a vehicle to deliver bug fixes. Any new features are minor changes resulting from fixes for performance, reliability, or usability issues.

New Features in Impala 1.3

New Features in Impala 1.2.4

Note: Impala 1.2.4 is primarily a bug fix release for Impala 1.2.3, plus some performance enhancements for the catalog server to minimize startup and DDL wait times for Impala deployments with large numbers of databases, tables, and partitions.

New Features in Impala 1.2.3

Impala 1.2.3 contains exactly the same feature set as Impala 1.2.2. Its only difference is one additional fix for compatibility with Parquet files generated outside of Impala by components such as Hive, Pig, or MapReduce. If you are upgrading from Impala 1.2.1 or earlier, see New Features in Impala 1.2.2 for the latest added features.

New Features in Impala 1.2.2

Impala 1.2.2 includes new features for performance, security, and flexibility. The major enhancements over 1.2.1 are performance related, primarily for join queries.

New user-visible features include:

Because Impala 1.2.2 builds on a number of features introduced in 1.2.1, if you are upgrading from an older 1.1.x release straight to 1.2.2, also review New Features in Impala 1.2.1 to see features such as the SHOW TABLE STATS and SHOW COLUMN STATS statements, and user-defined functions (UDFs).

New Features in Impala 1.2.1

Note: The Impala 1.2.1 feature set is a superset of features in the Impala 1.2.0 beta, with the exception of resource management, which relies on resource management infrastructure in the underlying Hadoop distribution.

Impala 1.2.1 includes new features for security, performance, and flexibility.

New user-visible features include:

New Features in Impala 1.2.0 (Beta)

The Impala 1.2.0 beta includes new features for security, performance, and flexibility.

New user-visible features include:

New Features in Impala 1.1.1

Impala 1.1.1 includes new features for security and stability.

New user-visible features include:

New Features in Impala 1.1

Impala 1.1 includes new features for security, performance, and usability.

New user-visible features include:

New Features in Impala 1.0.1

New user-visible features include:

New Features in Impala 1.0

This version has multiple performance improvements and adds the following functionality:

New Features in Version 0.7 of the Impala Beta Release

This version has multiple performance improvements and adds the following functionality:

New Features in Version 0.6 of the Impala Beta Release

New Features in Version 0.5 of the Impala Beta Release

New Features in Version 0.4 of the Impala Beta Release

New Features in Version 0.3 of the Impala Beta Release

New Features in Version 0.2 of the Impala Beta Release