Impala 4.0 Change Log
New Feature
- [IMPALA-452] - Add support for string concatenation operator using || construct
- [IMPALA-3766] - Optionally compress spilled data before writing it to disk
- [IMPALA-6434] - Add support to decode RLE_DICTIONARY encoded pages
- [IMPALA-7712] - Impala reads from and writes to GCS
- [IMPALA-7911] - Extend constant propagation for views to include other operators
- [IMPALA-8180] - Change Kudu timestamp writer to round towards minus infinity
- [IMPALA-8242] - Support Iceberg on S3
- [IMPALA-8636] - Implement INSERT for insert-only ACID tables
- [IMPALA-9099] - Allow setting mt_dop manually for queries with joins
- [IMPALA-9234] - Support Ranger row filtering policies
- [IMPALA-9629] - Extend bootstrap_system.sh to support CentOS 8
- [IMPALA-9631] - Import HLL functionality from DataSketches
- [IMPALA-9633] - Implement ds_hll_union() builtin function
- [IMPALA-9793] - Improved Impala quickstart
- [IMPALA-9882] - Import KLL functionality from DataSketches
- [IMPALA-9990] - Kudu table ownership
- [IMPALA-10017] - Implement ds_kll_union() function
- [IMPALA-10018] - Implement ds_kll_rank() function
- [IMPALA-10019] - Implement ds_kll_pmf() function
- [IMPALA-10020] - Implement ds_kll_cdf() function
- [IMPALA-10108] - Implement ds_kll_stringify function
- [IMPALA-10113] - Add feature flag for incremental metadata update
- [IMPALA-10168] - Expose JSON catalog objects in catalogd's debug page
- [IMPALA-10234] - impala-shell: add support for cookie-based authentication
- [IMPALA-10279] - Import CPC functionality from DataSketches
- [IMPALA-10282] - Implement ds_cpc_sketch() and ds_cpc_estimate() functions
- [IMPALA-10317] - Add query option that limits join #rows at runtime
- [IMPALA-10387] - Implement missing overloads of mask functions used in Ranger default masking policies
- [IMPALA-10435] - Extend "compute incremental stats" syntax to support a list of columns
- [IMPALA-10437] - Support SAML 2 browser profile authentication
- [IMPALA-10440] - Import Theta functionality from DataSketches
- [IMPALA-10463] - Implement ds_theta_sketch() and ds_theta_estimate() functions
- [IMPALA-10467] - Implement ds_theta_union() function
- [IMPALA-10483] - Support column-masking/row-filtering policy expressions that contain subqueries
- [IMPALA-10520] - Implement ds_theta_intersect() builtin function
- [IMPALA-10558] - Implement ds_theta_exclude() function
- [IMPALA-10580] - Implement ds_theta_union_f() function
- [IMPALA-10581] - Implement ds_theta_intersect_f() function
- [IMPALA-10631] - Upgrade DataSketches to version 3.0.0
- [IMPALA-10632] - Update the Theta sketch serialization interface
Improvement
- [IMPALA-756] - Improve error message / fallback behavior for impala-shell queries involving tabs
- [IMPALA-2205] - Make cancellation tests check whether all fragments finish
- [IMPALA-2536] - Make ColumnType constructor explicit to prevent certain bugs
- [IMPALA-2563] - Support LDAP search bind operations
- [IMPALA-2658] - Extend the NDV function to accept a precision
- [IMPALA-2783] - Push down filters on rank similar to limit
- [IMPALA-3127] - Decouple partitions from tables
- [IMPALA-3335] - Allow single-node optimization with joins.
- [IMPALA-3816] - Codegen perf-critical loops in Sorter
- [IMPALA-3926] - Reconsider use of LD_LIBRARY_PATH for toolchain libraries
- [IMPALA-4065] - Inline comparator calls into TopN::InsertBatch()
- [IMPALA-4080] - Share codegen work between fragment instances
- [IMPALA-4805] - Avoid hash exchanges before analytic functions in more situations.
- [IMPALA-5444] - Asynchronous code generation
- [IMPALA-6110] - LDAP authentication improvement using multiple LDAP searches instead of only ldap_sasl_bind_s
- [IMPALA-6360] - Don't show full query statement on Impala webUI by default
- [IMPALA-6506] - Codegen in ORC scanner
- [IMPALA-6628] - Use unqualified table references in .test files run from test_queries.py
- [IMPALA-6663] - Expose current DDL metrics (grouped by type) in the Catalog web UI
- [IMPALA-6870] - SummaryStatsCounter should be included in averaged profile
- [IMPALA-7020] - Order by expressions in Analytical functions are not materialized causing slowdown
- [IMPALA-7686] - Allow RANGE() clause before HASH() clause for PARTITION BY
- [IMPALA-7825] - Upgrade Thrift version to 0.11.0
- [IMPALA-7993] - Fix build and scripts to be more useful to developers
- [IMPALA-8013] - Switch from boost:: to std:: locks
- [IMPALA-8125] - Limit number of files generated by insert
- [IMPALA-8301] - Eliminate need for SYNC_DDL in local catalog mode
- [IMPALA-8304] - Generate JUnitXML symptom for compilation/CMake failures
- [IMPALA-8306] - Debug WebUI's Sessions page verbiage clarification
- [IMPALA-8670] - Restructure Impala Maven projects
- [IMPALA-8690] - Better eviction algorithm for data cache
- [IMPALA-8834] - Investigate enabling safe version of OPTIMIZE_PARTITION_KEY_SCANS by default
- [IMPALA-8870] - Bump guava version when building against Hive 3
- [IMPALA-8980] - Remove functional*.alltypesinsert from EE tests
- [IMPALA-9000] - Fix all the TODO-MT comments
- [IMPALA-9046] - Profile counter that indicates if a process or JVM pause occurred
- [IMPALA-9107] - Reduce time spent downloading maven artifacts for precommit tests
- [IMPALA-9156] - Share broadcast join builds between fragments
- [IMPALA-9160] - Remove references to RangerAuthorizationConfig due to changes in Ranger
- [IMPALA-9176] - Make access to null-aware partition from PartitionedHashJoinNode read-only
- [IMPALA-9180] - Remove legacy ImpalaInternalService
- [IMPALA-9191] - Provide a way to build Impala with only one of Sentry / Ranger
- [IMPALA-9218] - Support using development version of Hive with Impala
- [IMPALA-9226] - Improve string allocations of the ORC scanner
- [IMPALA-9228] - ORC scanner could be vectorized
- [IMPALA-9294] - Support DATE for min-max runtime filters
- [IMPALA-9317] - Improve number of instances estimate for scans in planner
- [IMPALA-9318] - Guard rail for mt_dop value
- [IMPALA-9331] - Generate JUnitXML symptom to detect failed dataload due to schema mismatch
- [IMPALA-9362] - Update sqlparse used by impala-shell from version 0.1.19 to latest
- [IMPALA-9422] - Improve join builder profiles
- [IMPALA-9435] - Usability enhancements for the data cache access trace
- [IMPALA-9472] - Keep metrics about the performance of the IO device used for the data cache
- [IMPALA-9473] - Add counts of the number of hits, misses, and cache entries
- [IMPALA-9483] - Add logs for debugging builtin functions that randomly throw unknown exceptions
- [IMPALA-9489] - Setup impala-shell.sh env separately, and use thrift-0.11.0 by default
- [IMPALA-9501] - Upgrade sqlparse to a version that supports python 3.0
- [IMPALA-9530] - Allow limiting memory consumed by preaggregation
- [IMPALA-9531] - Drop support for "dateless timestamps"
- [IMPALA-9537] - Add LDAP auth to the webui
- [IMPALA-9546] - Update ranger-admin-site.xml.template after RANGER-2688
- [IMPALA-9574] - Support ubuntu 18.04 as docker base image
- [IMPALA-9586] - Update query option docs to account for interactions with mt_dop
- [IMPALA-9609] - Minimize Frontend activity in executor only Impalas
- [IMPALA-9643] - Local runtime filters can go missing when mt_dop > 1
- [IMPALA-9646] - Clean up README
- [IMPALA-9679] - Remove some unnecessary jars from docker images
- [IMPALA-9683] - Distcc server bootstrap should support Ubuntu 18.04
- [IMPALA-9690] - Bump minimum x86-64 CPU requirements
- [IMPALA-9691] - Support Kudu Timestamp and Date Bloom Filter
- [IMPALA-9699] - Skip '-1' values when aggregating num_null incremental statistics
- [IMPALA-9716] - Add jitter to the exponential backoff in status reporting
- [IMPALA-9727] - Explain output for Hbase Scans isn't formatted correctly
- [IMPALA-9732] - Improve exceptions of unsupported HdfsTableSink formats
- [IMPALA-9754] - buffer_pool_limit error message is confusing
- [IMPALA-9766] - TestParquet.test_bytes_read_per_column is flaky after IMPALA-6984
- [IMPALA-9770] - Remove Sentry references in documentation
- [IMPALA-9777] - Reduce the diskspace requirements of loading the text version of tpcds.store_sales
- [IMPALA-9778] - Refactor HdfsPartition to be immutable
- [IMPALA-9789] - Disable ineffective bloom filters for Kudu scan
- [IMPALA-9791] - Support validWriteIdList in getPartialCatalogObject
- [IMPALA-9818] - Add fetch size as option to impala shell
- [IMPALA-9843] - Add ability to run schematool against HMS in minicluster
- [IMPALA-9853] - Push rank() predicates into sort
- [IMPALA-9861] - Enable nodiscard for gcc
- [IMPALA-9864] - Produce minidump when TestValidateMetrics.test_metrics_are_zero() fails
- [IMPALA-9885] - Add debug action to simulate slow planning
- [IMPALA-9903] - Queries on a Kudu table call openTable multiple times
- [IMPALA-9913] - Use table id to detect uniqueness of table for drop table event
- [IMPALA-9921] - Parse errors in ToSqlUtils.hiveNeedsQuotes should not be printed to impalad.ERROR
- [IMPALA-9946] - Use table id when comparing the transactional state of the table
- [IMPALA-9956] - Inlining functions in Sorter::Partition() gives a significant speedup.
- [IMPALA-9959] - Implement ds_kll_sketch() and ds_kll_quantile() functions
- [IMPALA-9962] - Implement ds_kll_quantiles() function
- [IMPALA-9963] - Implement ds_kll_n() function
- [IMPALA-9983] - Push limit from a top level sort onto analytic sort when applicable
- [IMPALA-9989] - Improve admission control pool stats logging
- [IMPALA-9997] - Update to a newer version of LZ4
- [IMPALA-9998] - Investigate updating zstd version
- [IMPALA-10007] - Impala development environment does not support Ubuntu 20.04
- [IMPALA-10027] - Use anonymous user when user is not specified
- [IMPALA-10028] - Additional optimizations of Impala docker container sizes
- [IMPALA-10052] - Expose daemon health on /healthz endpoint for catalogd and statestored as well
- [IMPALA-10064] - Support constant propagation for range predicates
- [IMPALA-10075] - Reuse existing instances of unchanged partitions in REFRESH
- [IMPALA-10076] - Reduce logs about partition level catalog updates
- [IMPALA-10099] - Push down DISTINCT aggregation for EXCEPT/INTERSECT
- [IMPALA-10110] - Separate option to control fpp for bloom filter sizing
- [IMPALA-10112] - Consider skipping FpRateTooHigh() check for bloom filters
- [IMPALA-10117] - Skip calls to FsPermissionCache for blob stores
- [IMPALA-10121] - bin/jenkins/finalize.sh should generate JUnitXML for TSAN failures
- [IMPALA-10147] - Avoid getting a file handle for data cache hits
- [IMPALA-10161] - User LDAP search bind support
- [IMPALA-10164] - Support HadoopCatalog for Iceberg table
- [IMPALA-10165] - Support all partition transforms for Iceberg in create table
- [IMPALA-10172] - Support Hive metastore managed locations for databases
- [IMPALA-10178] - Run-time profile shall report skews
- [IMPALA-10198] - Unify Java components into a single maven project
- [IMPALA-10202] - Enable file handle cache for ABFS files
- [IMPALA-10205] - Avoid MD5 hash for data file path of IcebergTable
- [IMPALA-10206] - Avoid MD5 Digest Authorization for debug Web Server in FIPS mode
- [IMPALA-10207] - Replace MD5 hash for lineage graph
- [IMPALA-10210] - Avoid authentication for connection from a trusted domain over http
- [IMPALA-10218] - Remove dependency on the CDH_BUILD_NUMBER and associated maven repository
- [IMPALA-10225] - Bump Impyla version
- [IMPALA-10226] - Change buildall.sh -notests to invoke a single Make target
- [IMPALA-10237] - Support BUCKET and TRUNCATE partition transforms as built-in functions
- [IMPALA-10266] - Replace instanceof *FileSystem with FS scheme checks
- [IMPALA-10274] - Move impala-python initialization into the CMake build
- [IMPALA-10287] - Distribution strategy is sub-optimal for certain queries
- [IMPALA-10300] - Investigate the need for checking the privilege on server when creating a Kudu table with property of kudu.master_addresses
- [IMPALA-10305] - Sync kudu security code changes for FIPS
- [IMPALA-10313] - When inverting joins stats should be recomputed
- [IMPALA-10314] - Planning time for simple SELECT with LIMIT could be improved
- [IMPALA-10323] - use snprintf instead of lexical_cast when casting int to string, to improve multi-thread performance
- [IMPALA-10332] - Add file formats to HdfsScanNode's thrift representation and codegen for those
- [IMPALA-10343] - control_service_queue_mem_limit default is too low for large clusters
- [IMPALA-10351] - Enable mt_dop for DML
- [IMPALA-10360] - Allow a simple limit to be treated as a sampling hint where applicable
- [IMPALA-10373] - Run impala docker containers as a regular linux user with uid/gid 1000
- [IMPALA-10374] - Limit page iteration at BufferedTupleStream::DebugString()
- [IMPALA-10389] - Container for impala-profile-tool
- [IMPALA-10390] - impala-profile-tool JSON output
- [IMPALA-10406] - Query with analytic function doesn't need to materialize the predicate pushed down to kudu
- [IMPALA-10412] - ConvertToCNFRule can be applied to view table
- [IMPALA-10427] - Remove SkipIfS3.eventually_consistent pytest marker
- [IMPALA-10445] - The ability to adjust NDV's precision with query option
- [IMPALA-10454] - Bump --ssl_minimum_version to tls1.2
- [IMPALA-10455] - Reorder Maven repositories to have cleaner mirror semantics
- [IMPALA-10488] - Add a C++ JWT library to the native toolchain
- [IMPALA-10492] - Lower default MAX_CNF_EXPRS query option
- [IMPALA-10501] - Hit DCHECK in parquet-column-readers.cc: def_levels_.CacheRemaining() <= num_buffered_values_
- [IMPALA-10504] - Add tracing for remote block reads
- [IMPALA-10509] - Add tool to visualize Impala query plan from text profile
- [IMPALA-10516] - Upgrade jackson databind to 2.10.5.1 and slf4j to 1.7.30
- [IMPALA-10519] - Allow setting of num_reactors for KuduClient
- [IMPALA-10544] - Use Centos 7.4 or above for native toolchain
- [IMPALA-10604] - Allow setting KuduClient's verbose logging level directly
- [IMPALA-10605] - Deflake test_refresh_native
- [IMPALA-10606] - Simplify impala-python virtualenv requirements files
- [IMPALA-10608] - Update the virtualenv's kudu-python version to the latest
- [IMPALA-10647] - Improve always-true min/max filter handling in coordinator
- [IMPALA-10652] - False positives when calculating incremental statistics
- [IMPALA-10662] - Remove HS2 vs Beeswax differences in EE tests
- [IMPALA-10677] - Set selectivity of "!="
- [IMPALA-10678] - Support custom SASL protocol name in Kudu client
- [IMPALA-10682] - impala-shell is slow with hs2-http
- [IMPALA-10700] - Introduce an option to skip deleting column statistics on truncate
Bug
- [IMPALA-2794] - Exchange Inactive time in the averaged query profile is always zero
- [IMPALA-4238] - custom_cluster/test_client_ssl.py TestClientSsl.test_ssl AssertionError: SIGINT was not caught by shell within 30s
- [IMPALA-4364] - REFRESH does not pick up ALTER TABLE...PARTITION...SET LOCATION changes
- [IMPALA-5308] - SHOW TABLE STATS for Kudu tables is confusing
- [IMPALA-5534] - Fix and re-enable run-process-failure-tests.sh
- [IMPALA-5746] - Remote fragments continue to hold onto memory after stopping the coordinator daemon
- [IMPALA-6147] - Thrift profile includes counters not directly shown in text profile
- [IMPALA-6267] - MT Scanners do not check runtime filters per-file before processing each split
- [IMPALA-6412] - Memory issues with processing of incoming global runtime filters on coordinator
- [IMPALA-6671] - Metadata operations that modify a table blocks topic updates for other unrelated operations
- [IMPALA-7138] - Fix detection and handling of Device Mapper volumes
- [IMPALA-7779] - Parquet Scanner can write binary data into profile
- [IMPALA-7782] - discrepancy in results with a subquery containing an agg that produces an empty set
- [IMPALA-7833] - Audit and fix other string builtins for long string handling
- [IMPALA-7844] - Analysis code incorrectly attempts to support ordinals in HAVING clause
- [IMPALA-7876] - COMPUTE STATS TABLESAMPLE is not updating number of estimated rows
- [IMPALA-8050] - IS [NOT] NULL gives wrong selectivity when null count is missing
- [IMPALA-8165] - Planner does not push through predicates when there is a disjunction
- [IMPALA-8202] - TestAdmissionControllerStress.test_mem_limit teardown() fails with "Invalid or unknown query handle"
- [IMPALA-8204] - Buildall.sh should check the right Impala-lzo branch
- [IMPALA-8205] - Illegal statistics for numFalse and numTrue
- [IMPALA-8406] - Failed REFRESH can partially modify table without bumping version number
- [IMPALA-8533] - Impala daemon crash on sort
- [IMPALA-8539] - ASAN heap-use-after-free failure during TestGracefulShutdown
- [IMPALA-8547] - get_json_object fails to get value for numeric key
- [IMPALA-8577] - Crash during OpenSSLSocket.read
- [IMPALA-8721] - Wrong result when Impala reads a Hive written parquet TimeStamp column
- [IMPALA-8737] - Patch gperftools to fix O(n) scaling in PageHeap::AllocLarge()
- [IMPALA-8751] - Kudu tables cannot be found after created
- [IMPALA-8830] - Coordinator-only queries get queued when there are no executor groups
- [IMPALA-8857] - test_kudu_col_not_null_changed may fail because client reads older timestamp
- [IMPALA-8908] - Bad error message when failing to connect to HTTPS endpoint with shell
- [IMPALA-8926] - TestResultSpooling::_test_full_queue is flaky
- [IMPALA-8990] - TestAdmissionController.test_set_request_pool seems flaky
- [IMPALA-9050] - test_scanners.TestScanRangeLengths.test_scan_ranges is flaky for kudu
- [IMPALA-9097] - Some backend tests fail if the Hive Metastore is not running
- [IMPALA-9115] - "Exec at coord is" log spam
- [IMPALA-9120] - Refreshing an ABFS table with a deleted directory fails
- [IMPALA-9145] - Compute stats fails with invalid timezone
- [IMPALA-9183] - TPC-DS query 13 - customer_address predicates not propagated to scan
- [IMPALA-9192] - When built with USE_CDP_HIVE=true, Impala should use CDP Avro, Parquet, etc
- [IMPALA-9232] - Potential overflow in SerializeThriftMsg
- [IMPALA-9341] - A grantee gains the delegation privilege after a revoke statement
- [IMPALA-9350] - Ranger audits for column masking not produced
- [IMPALA-9351] - AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path
- [IMPALA-9355] - TestExchangeMemUsage.test_exchange_mem_usage_scaling doesn't hit the memory limit
- [IMPALA-9357] - Fix race condition in alter_database event
- [IMPALA-9398] - When pressing Ctrl+C the content of the shell history gets doubled
- [IMPALA-9415] - DCHECK in ClientRequestState::FetchRowsInternal when using GCC7 with the new ABI
- [IMPALA-9420] - test_scanners.TestOrc.test_type conversions fails after first run
- [IMPALA-9438] - error You need to implement atomic operations for this architecture
- [IMPALA-9513] - query_test.test_kudu.TestKuduOperations.test_column_storage_attributes fails on exhaustive tests
- [IMPALA-9534] - Kudu show create table tests fail due to case difference for external.table.purge
- [IMPALA-9535] - Test for conversion from non-ACID to ACID fail on newer Hive
- [IMPALA-9536] - UdfExecutorTest.HiveStringsTest fails when using newer Hive
- [IMPALA-9539] - Enable the conjunctive normal form rewrites by default
- [IMPALA-9547] - shell.test_shell_commandline.TestImpalaShell.test_socket_opening fails with "Interrupted system call"
- [IMPALA-9548] - UdfExecutorTest failures after HIVE-22893
- [IMPALA-9549] - Impalad startup fails to wait for catalogd to startup when using local catalog
- [IMPALA-9550] - TestResultSpoolingFetchSize.test_fetch is flaky
- [IMPALA-9560] - Changing version from 3.4.0-SNAPSHOT to 3.4.0-RELEASE breaks TestStatsExtrapolation
- [IMPALA-9566] - Sentry service should not be started after IMPALA-8870
- [IMPALA-9571] - Impala fails to start up due to exception from boost::filesystem::remove_all()
- [IMPALA-9572] - Impalad crashes when processing decimal value
- [IMPALA-9596] - TestNestedTypesNoMtDop.test_tpch_mem_limit_single_node failed
- [IMPALA-9602] - Local catalog cache treats db and table names as case-sensitive
- [IMPALA-9606] - ABFS reads should use hdfsPreadFully
- [IMPALA-9607] - Syntax error query_test.test_kudu.TestKuduOperations.test_column_storage_attributes fails on exhaustive tests
- [IMPALA-9608] - Multiple query tests failure due to org.apache.hadoop.hive.ql.exec.tez.TezTask execution error
- [IMPALA-9611] - Hang in HandoffToProbesAndWait() for multithreaded join build
- [IMPALA-9612] - Runtime filter waits longer than it should
- [IMPALA-9618] - Usability issues with dev env setup.
- [IMPALA-9620] - Predicates in the SELECT and GROUP-BY cause failure with CNF rewrite enabled
- [IMPALA-9641] - Query hang when containing alias names as empty backticks
- [IMPALA-9649] - Exclude shiro-crypto-core and shiro-core jars from maven download
- [IMPALA-9650] - RuntimeFilterTest appears to be flaky
- [IMPALA-9652] - CTAS doesn't respect transactional properties
- [IMPALA-9653] - Impala shouldn't create/remove staging directory during transactional INSERTs
- [IMPALA-9661] - Avoid introducing unused columns in table masking view
- [IMPALA-9663] - Insert overwrites should not throw NPE.
- [IMPALA-9664] - Support Hive replication for ACID tables
- [IMPALA-9665] - Database not found errors in query_test.test_insert (TestInsertQueries)
- [IMPALA-9667] - TestImpalaShellInteractive failing as session not correctly closed
- [IMPALA-9669] - loaded views are still returned as tables for GET_TABLES in LocalCatalog mode
- [IMPALA-9673] - Tests expecting results to be in test-warehouse/managed but find test-warehouse
- [IMPALA-9677] - FE Analysis tests using fake S3 bucket fail with AnalysisException
- [IMPALA-9678] - Dockerised tests with USE_CDP_HIVE=true crash with orc-metadata-utils.cc DCHECK
- [IMPALA-9680] - Compressed inserts failing
- [IMPALA-9681] - LdapImpalaShellTest.testShellLdapAuth failed
- [IMPALA-9685] - Full-ACID support breaks in LocalCatalog mode
- [IMPALA-9686] - Toolchain Python missing readline support
- [IMPALA-9687] - Plans for Kudu can contain hosts > num of Impala nodes
- [IMPALA-9693] - Predicate in the ORDER BY clause causes failure with cnf rewrite enabled
- [IMPALA-9694] - IllegalStateException when inlineView has AggregationNode and different alias on the same column
- [IMPALA-9701] - data race detected in ConcurrentReaders test in TSAN build
- [IMPALA-9702] - TestDdlStatements::test_alter_table() and TestMixedPartitions::test_incompatible_avro_partition_in_non_avro_table() consistently fail on S3
- [IMPALA-9707] - Parquet stat filtering issue when min/max values are cast to NULL
- [IMPALA-9708] - Remove Sentry support
- [IMPALA-9709] - Remove Impala-lzo support
- [IMPALA-9712] - Hit OOM on TPC-H Q19
- [IMPALA-9714] - SimpleLogger does not respect limits when there are high frequency appends
- [IMPALA-9721] - Fix python 3 compatibility regression in impala-shell
- [IMPALA-9722] - Consolidate unused total_width and the way avg_width is computed in PerColumnStats
- [IMPALA-9725] - LEFT ANTI JOIN produces wrong result when PHJ build spills
- [IMPALA-9729] - TestImpalaShell.test_summary fails with Could not execute command: summary
- [IMPALA-9731] - Remove USE_CDP_HIVE=false and associated code
- [IMPALA-9735] - Shell tests on Centos 7 failing in get_python_version_for_shell_env
- [IMPALA-9736] - "MT_DOP not supported for plans with base table joins or table sinks" error is out of date
- [IMPALA-9737] - DCHECK in buffer-pool.cc - min_bytes_to_write <= dirty_unpinned_pages_.bytes()
- [IMPALA-9743] - IndexOutOfBoundsException in Analyze test when touching partitions of functional.alltypes
- [IMPALA-9745] - SELECT from view fails with "AnalysisException: No matching function with signature: to_timestamp(TIMESTAMP, STRING)" after expression rewrite.
- [IMPALA-9749] - ASAN builds should not run FE Tests
- [IMPALA-9751] - TestHS2.test_get_exec_summary is flaky
- [IMPALA-9753] - Possible bug in TRUNCATE of ACID tables on S3
- [IMPALA-9755] - Flaky test: test_global_exchange_counters
- [IMPALA-9756] - Queries are not guaranteed to be cancelled before unregistration
- [IMPALA-9760] - Use different locations for native toolchain packages built with different compilers
- [IMPALA-9761] - Fix GCC 7 compilation issue: Ambiguous else warning with gtest macros
- [IMPALA-9762] - Fix GCC7 compilation issue: shift-count-overflow in tuple-row-compare.cc
- [IMPALA-9763] - Impala queries occasionally report errors
- [IMPALA-9767] - ASAN crash during coordinator runtime filter updates
- [IMPALA-9775] - Failure in TestAcid.test_acid_heartbeats
- [IMPALA-9776] - Fix test failure in add_test_dimensions
- [IMPALA-9781] - Fix GCC 7 runtime issue: Unaligned loads and stores for int128_t types
- [IMPALA-9782] - KuduPartitionExpr is not thread-safe
- [IMPALA-9787] - Catalog spins on one core when memory-based invalidation is enabled
- [IMPALA-9790] - Dockerized daemons should set --hostname to the resolved IP
- [IMPALA-9794] - OutOfMemoryError when loading tpcds text data via Hive
- [IMPALA-9798] - TestScratchDir.test_multiple_dirs fails to start impalad
- [IMPALA-9799] - Flakiness in TestFetchFirst due to wrong results of get_num_in_flight_queries
- [IMPALA-9800] - BE test parquet-plain-test crashes in ubsan test
- [IMPALA-9801] - E2E tests crashed in DecimalUtil::DecodeFromFixedLenByteArray
- [IMPALA-9802] - TestCompressedFormats.test_compressed_formats fails in HDFS copy
- [IMPALA-9804] - Fix up LD_LIBRARY_PATH for bin/impala-shell.sh
- [IMPALA-9809] - A query with multi-aggregation functions on particular dataset crashes impala daemon
- [IMPALA-9814] - Analytic planner can under-parallelise with mt_dop
- [IMPALA-9815] - Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build
- [IMPALA-9820] - Pull in DataSketches HLL MurmurHash fix
- [IMPALA-9830] - TestMtDopScanNode.test_mt_dop_scan_node fails BytesRead > 0 assert
- [IMPALA-9831] - TestScannersFuzzing::test_fuzz_alltypes() hits DCHECK in parquet-page-reader.cc
- [IMPALA-9834] - test_query_retries.TestQueryRetries is flaky on erasure coding configurations
- [IMPALA-9835] - Log spam about kudu_scan_token containing non-UTF-8 values
- [IMPALA-9837] - Switch native-toolchain to use GCC 7.5.0
- [IMPALA-9838] - Switch to GCC 7.5.0
- [IMPALA-9840] - ThreadSanitizer: data race internal-queue.h in InternalQueueBase::Enqueue
- [IMPALA-9842] - TestValidateMetrics.test_metrics_are_zero fails with num-fragments-in-flight not reaching zero
- [IMPALA-9845] - Ant had a new release, so bootstrap_system.sh can't find the old one on Centos
- [IMPALA-9851] - Query status can be unbounded in size
- [IMPALA-9858] - Wrong partition hit/request metrics in profile of LocalCatalog
- [IMPALA-9862] - Impala fails to start up due to ClassNotFoundException: SolrException
- [IMPALA-9866] - Query Plan in Debug UI Constantly Refreshes After Completion
- [IMPALA-9871] - Toolchain bootstrap download fails on SLES12 sp5
- [IMPALA-9878] - Use-after-free in tmp-file-mgr-test.cc
- [IMPALA-9886] - Maven exclusion for Kafka should also exclude version kafka_2.12
- [IMPALA-9887] - ASAN builds timeout frequently
- [IMPALA-9889] - test_runtime_filters flaky on Kudu table format
- [IMPALA-9894] - TmpFile incorrectly uses default hdfs queue
- [IMPALA-9907] - NullPointerException in ParallelFileMetadataLoader's load() method
- [IMPALA-9911] - IS [NOT] NULL predicate selectivity estimate is wrong if #nulls is 0
- [IMPALA-9918] - HdfsOrcScanner crash on resolving columns
- [IMPALA-9923] - Data loading of TPC-DS ORC fails with "Fail to get checksum"
- [IMPALA-9929] - Unsupported subquery in select list throws IllegalStateException instead of AnalysisException
- [IMPALA-9940] - Kudu util build is missing dependency on generated protobuf
- [IMPALA-9941] - ExprTest.CastExprs fails when running with ASAN
- [IMPALA-9949] - Subqueries in select can result in rows not being returned
- [IMPALA-9952] - Invalid offset index in Parquet file
- [IMPALA-9953] - Shell does not return all rows if a fetch times out in FINISHED state
- [IMPALA-9955] - Internal error for a query with large rows and spilling
- [IMPALA-9957] - Impalad crashes when serializing large rows in aggregation spilling
- [IMPALA-9961] - Invalid memory access in SimpleDataFormatTokenizer
- [IMPALA-9964] - CatalogServiceCatalog.setFileMetadataFromFS() doesn't fill insert/delete file descriptors
- [IMPALA-9966] - Add missing breaks in SetQueryOption
- [IMPALA-9980] - Remove jersey* jars from exclusions.
- [IMPALA-9985] - CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")
- [IMPALA-10005] - Impala can't read Snappy compressed text files on S3 or ABFS
- [IMPALA-10006] - Better handling of non-writable /opt/impala/logs in containers
- [IMPALA-10012] - ds_hll_sketch() results ascii codec decoding error
- [IMPALA-10024] - CatalogServiceCatalog.isBlacklistedDb should do a case-insensitive comparison
- [IMPALA-10036] - Admission control incorrectly rejecting query based on coordinator limit
- [IMPALA-10037] - BytesRead check in TestMtDopScanNode.test_mt_dop_scan_node is flaky
- [IMPALA-10039] - Expr-test crash in ExprTest.LiteralExprs during core run
- [IMPALA-10043] - Keep all the logs when using EE_TEST_SHARDS > 1
- [IMPALA-10044] - bin/bootstrap_toolchain.py error handling can delete the toolchain directory
- [IMPALA-10047] - Performance regression on short queries due to IMPALA-6984 fix
- [IMPALA-10050] - DCHECK was hit possibly while executing TestFailpoints::test_failpoints
- [IMPALA-10051] - impala-shell exits with ValueError with WITH clauses
- [IMPALA-10054] - test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
- [IMPALA-10055] - DCHECK was hit while executing e2e test TestQueries::test_subquery
- [IMPALA-10058] - Kudu queries hit error "Unable to deserialize scan token"
- [IMPALA-10062] - TestCompressedNonText.test_insensitivity_to_extension can fail due to wrong filename
- [IMPALA-10070] - TestImpalaShellInteractive.test_cancellation_mid_command fails on Ubuntu 18.04
- [IMPALA-10072] - Data load failures in ubuntu-16.04-from-scratch
- [IMPALA-10077] - test_concurrent_invalidate_metadata timed out
- [IMPALA-10080] - Skip loading HDFS cache pools for non-HDFS file systems
- [IMPALA-10087] - IMPALA-6050 causes alluxio not to be supported
- [IMPALA-10092] - Some tests in custom_cluster/test_kudu.py do not run even though they are not explicitly disabled.
- [IMPALA-10094] - TestResetMetadata.test_refresh_updated_partitions fails due to connection error
- [IMPALA-10096] - May throw exception after expr is rewritten, if the group by ordinal reference is still a numeric literal
- [IMPALA-10109] - fetch fails in TestQueryRetries.test_retries_from_cancellation_pool
- [IMPALA-10115] - Impala should check file schema as well to check full ACIDv2 files
- [IMPALA-10119] - TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt
- [IMPALA-10124] - admission-controller-test fails with no such file or directory error
- [IMPALA-10127] - LIRS enforcement of tombstone limit has pathological performance scenarios
- [IMPALA-10129] - Data race in MemTracker::GetTopNQueriesAndUpdatePoolStats
- [IMPALA-10140] - Throw CatalogException for query "create database if not exists" with sync_ddl as true
- [IMPALA-10143] - TestAcid.test_full_acid_original_files
- [IMPALA-10145] - UnicodeDecodeError in Thrift 0.11.0 generated files
- [IMPALA-10154] - Data race on coord_backend_id
- [IMPALA-10155] - Apparent data race in GetTopNQueriesAndUpdatePoolStats
- [IMPALA-10156] - test_unmatched_schema flaky with wrong results
- [IMPALA-10157] - IllegalStateException when using grouping() or grouping_id() with no GROUP BY clause
- [IMPALA-10158] - test_iceberg_query and test_iceberg_profile fail after IMPALA-9741
- [IMPALA-10167] - Docs Typo with DEFAULT_TRANSACTIONAL_TYPE
- [IMPALA-10177] - run-hive-jdbc.sh throws ClassNotFoundException exception.
- [IMPALA-10179] - After inverting a join's inputs the join's parallelism does not get reset
- [IMPALA-10182] - Rows with NULLs filtered out with duplicate columns in subquery select inside UNION ALL
- [IMPALA-10183] - Hit promise DCHECK while looping result spooling tests
- [IMPALA-10192] - IllegalStateException in processing column masking audit events
- [IMPALA-10193] - Limit the memory usage of the whole mini-cluster
- [IMPALA-10216] - BufferPoolTest.WriteErrorBlacklistCompression is flaky on UBSAN builds
- [IMPALA-10220] - Min value of RpcNetworkTime can be negative
- [IMPALA-10229] - Analytic limit pushdown optimization can be applied incorrectly based on predicates present
- [IMPALA-10230] - column stats num_nulls less than -1
- [IMPALA-10233] - Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder
- [IMPALA-10235] - Averaged timer profile counters can be negative for trivial queries
- [IMPALA-10243] - ConcurrentModificationException during parallel INSERTs
- [IMPALA-10245] - Test fails in TestKuduReadTokenSplit.test_kudu_scanner
- [IMPALA-10248] - TestKuduOperations.test_column_storage_attributes on exhaustive tests
- [IMPALA-10252] - Query returns fewer rows with run-time filtering on integer column in a subquery against functional_parquet schema
- [IMPALA-10255] - query_test.test_insert.TestInsertQueries.test_insert fails in exhaustive builds
- [IMPALA-10256] - TestDisableFeatures.test_disable_incremental_metadata_updates fails
- [IMPALA-10257] - Hit DCHECK in HdfsParquetScanner::CheckPageFiltering in a CORE S3 build
- [IMPALA-10258] - TestQueryRetries.test_original_query_cancel is flaky
- [IMPALA-10259] - Hit DCHECK in TestImpalaShell.test_completed_query_errors_2
- [IMPALA-10261] - impala-minimal-hive-exec should include org/apache/hive/com/google/**
- [IMPALA-10267] - Impala crashes in HdfsScanner::WriteTemplateTuples() with negative num_tuples
- [IMPALA-10276] - Release build sees SIGSEGV when updating the total time counter
- [IMPALA-10277] - TestDebugActions.test_catalogd_debug_actions hits assert on s3 test run
- [IMPALA-10278] - impalad_executor Docker container fails to find JniUtil
- [IMPALA-10283] - IllegalStateException in applying incremental partition updates
- [IMPALA-10286] - metadata.test_catalogd_debug_actions test failed due to assertion
- [IMPALA-10294] - Improvement to test_skew_reporting_in_runtime_profile
- [IMPALA-10299] - Impala-shell hangs in printing partial UTF-8 characters
- [IMPALA-10302] - test_scanners_fuzz.py does not log the random seed
- [IMPALA-10303] - Fix warnings from impala-shell with --quiet
- [IMPALA-10304] - Pytest logging is not using the expected INFO log level
- [IMPALA-10312] - test_shell_interactive.TestImpalaShellInteractive AssertionError: alter query should be closed
- [IMPALA-10318] - default_transactional_type shouldn't affect Iceberg tables
- [IMPALA-10325] - Parquet scan should use min/max statistics to skip pages based on equi-join predicate
- [IMPALA-10333] - shell.test_shell_commandline.TestImpalaShell.test_utf8_decoding_error_handling failing
- [IMPALA-10336] - test_type_conversions_hive3 fails because incorrect error is returned to client
- [IMPALA-10337] - DCHECK hit at SpillableRowBatchQueue when row size exceed max reservation
- [IMPALA-10340] - Cannot set up KDC from scratch
- [IMPALA-10345] - Impala hits DCHECK in parquet-column-stats.inline.h, resulting in Impala daemon breakdown
- [IMPALA-10358] - Correct Iceberg type mappings
- [IMPALA-10362] - FE test testShellLdapAuth() seems to be flaky
- [IMPALA-10363] - test_mixed_catalog_ddls_with_invalidate_metadata failed after reaching timeout (120 seconds)
- [IMPALA-10364] - Set the real location for external Iceberg tables stored in HadoopCatalog
- [IMPALA-10366] - TestRuntimeProfile.test_runtime_profile_aggregated failed on master core run
- [IMPALA-10367] - Impala-shell internal error - UnboundLocalError, local variable 'retry_msg' referenced before assign
- [IMPALA-10379] - NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/Quotation
- [IMPALA-10382] - Predicate with coalesce on both sides of LOJ isn't NULL filtering
- [IMPALA-10383] - Data race in AdmissionController::WaitOnQueued
- [IMPALA-10384] - Make partition names consistent between BE and FE
- [IMPALA-10385] - bootstrap_system.sh fails when installing snappy-devel on CentOS 8.3
- [IMPALA-10386] - Don't allow PARTITION BY SPEC for non-Iceberg tables
- [IMPALA-10391] - LIRS cache edge case where there is exactly one unprotected element
- [IMPALA-10393] - Iceberg field id-based column resolution fails in ASAN builds
- [IMPALA-10397] - TestAutoScaling.test_single_workload failed in exhaustive release build
- [IMPALA-10398] - Altering an Iceberg table might throw NullPointerException
- [IMPALA-10413] - Impala crashing when retrying failed query
- [IMPALA-10414] - Retrying failed query may cause memory leak
- [IMPALA-10416] - Testfile can't deal with non-ascii results
- [IMPALA-10419] - pytest hits UnicodeDecodeError in reporting assert failures for UTF-8 results
- [IMPALA-10422] - EXPLAIN statements leak ACID transactions and locks
- [IMPALA-10424] - Fix race on not_admitted_reason in AdmissionController
- [IMPALA-10426] - Impala crashes when it tries to write invalid timestamp value with INT64 Parquet timestamp type
- [IMPALA-10434] - impala-shell crash in parsing multiline queries that contain UTF-8 characters
- [IMPALA-10441] - query_test.test_scanners.TestParquet.test_bytes_read_per_column fails on S3 and EC builds
- [IMPALA-10450] - Catalogd crashes in serializing thrift debug string
- [IMPALA-10460] - Impala should write normalized paths in Iceberg manifests
- [IMPALA-10462] - TestIcebergTable::test_create_iceberg_tables fails with ClassNotFoundException on newer Hive/Iceberg
- [IMPALA-10473] - Order by a constant should not be ignored in row_number()
- [IMPALA-10482] - Select-star query on unrelative collection column of transactional table hits IllegalStateException
- [IMPALA-10493] - Using JOIN ON syntax to join two full ACID collections produces wrong results
- [IMPALA-10497] - test_no_fd_caching_on_cached_data failing
- [IMPALA-10523] - Impala-shell crash in printing error messages that contain UTF-8 characters
- [IMPALA-10526] - BufferPoolTest.Multi8RandomSpillToRemoteMix failed in tsan build
- [IMPALA-10527] - DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build
- [IMPALA-10528] - DiskIoMgrTest.WriteToRemoteDiffPagesSuccess failed in asan build
- [IMPALA-10529] - Hit DCHECK in DiskIoMgr::AssignQueue in core-s3 build
- [IMPALA-10530] - DiskIoMgrTest.WriteToRemoteEvictLocal failed in asan build
- [IMPALA-10531] - TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build
- [IMPALA-10532] - TestOverlapMinMaxFilters.test_overlap_min_max_filters seems flaky
- [IMPALA-10533] - TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only seems flaky
- [IMPALA-10534] - TestScratchDir.test_scratch_dirs_remote_spill_with_options seems flaky
- [IMPALA-10547] - TPC-DS "reason" table missing from Kudu schema
- [IMPALA-10554] - Block modifications when row-filter/column-mask is enabled for the user
- [IMPALA-10555] - Hit DCHECK in TmpFileGroup::RecoverWriteError
- [IMPALA-10559] - TestScratchLimit seems flaky
- [IMPALA-10564] - No error returned when inserting an overflowed value into a decimal column
- [IMPALA-10565] - Validate max_spilled_result_spooling_mem against scratch_limit
- [IMPALA-10571] - ImpalaJdbcClient might silently choose a different driver than the one specified
- [IMPALA-10579] - Deadloop in table metadata loading when using an invalid RemoteIterator
- [IMPALA-10582] - Webpage of Catalogd operations doesn't sum up operations correctly
- [IMPALA-10583] - Result spooling hang with unbounded spooling mem limit.
- [IMPALA-10584] - Investigate intermittent crash in TestScratchLimit::test_with_unlimited_scratch_limit
- [IMPALA-10592] - Exhaustive tests timeout after 20 hours
- [IMPALA-10597] - Enable setting 'iceberg.file_format'
- [IMPALA-10600] - Provide fewer details in logs
- [IMPALA-10607] - TestDecimalOverflowExprs::test_ctas_exprs failed in S3 build
- [IMPALA-10609] - NullPointerException in loading tables introduced by ranger masking policies
- [IMPALA-10611] - test_wide_row fails with 'Failed to allocate row batch'
- [IMPALA-10618] - Kudu in the minicluster fails to start in Ubuntu 20.04 container during Docker-based tests
- [IMPALA-10624] - TestIcebergTable::test_alter_iceberg_tables failed by stale file format
- [IMPALA-10629] - bin/load-data.py does not respect compression codec for parquet
- [IMPALA-10644] - RangerAuthorizationFactory cannot be instantiated after latest GBN bump up
- [IMPALA-10646] - Toolchain bootstrap download fails on Red Hat platforms
- [IMPALA-10658] - LOAD DATA INPATH silently fails between HDFS and Azure ABFS
- [IMPALA-10691] - Impala crashes sporadically when multiple CAST(FORMAT)
- [IMPALA-10692] - Inserting to ACID tables is broken in local_catalog mode with HMS event polling
- [IMPALA-10704] - test_retry_query_result_cacheing_failed and test_retry_query_set_query_in_flight_failed are flaky
- [IMPALA-10728] - Impala should check access privileges inside masking expressions
- [IMPALA-10744] - Send INSERT events even when Impala's event processing is not enabled
- [IMPALA-10755] - Wrong results for a query with predicate on an analytic function
- [IMPALA-10765] - IllegalStateException when inserting empty results to unpartitioned table with event processor enabled
Task
- [IMPALA-1270] - Consider adding distinct aggregation to subqueries as perf optimization
- [IMPALA-3695] - Remove KUDU_IS_SUPPORTED
- [IMPALA-3741] - Push bloom filters to Kudu scanners
- [IMPALA-5960] - Add TPC-DS reason table to data load and enable q85 and q93
- [IMPALA-6861] - Fix OpenSSL initialization
- [IMPALA-9236] - Ported native-toolchain to work on aarch64
- [IMPALA-9478] - Runtime profiles should indicate if custom UDFs are being used
- [IMPALA-9642] - Set USE_CDP_HIVE to true by default
- [IMPALA-9647] - Exclude or update fluent-hc jar
- [IMPALA-9697] - Support priority based scratch directory selection
- [IMPALA-9829] - Add write metrics for Spilling
- [IMPALA-9856] - Enable result spooling by default
- [IMPALA-9987] - Improve logging around HTTP connections
- [IMPALA-9988] - Integrate ldap filters and proxy users
- [IMPALA-10010] - Allow unauthenticated access to some webui endpoints
- [IMPALA-10034] - Include all tpc-ds queries in tpcds testdata workload
- [IMPALA-10053] - Remove uses of MonoTime::GetDeltaSince()
- [IMPALA-10060] - Postgres JDBC driver should be upgraded to 42.2.14
- [IMPALA-10074] - Set impala-shell's default protocol to hs2
- [IMPALA-10095] - Include query plan tests for all of TPC-DS
- [IMPALA-10103] - Jquery upgrade to 3.5.1
- [IMPALA-10378] - Retire support for Debian 8
- [IMPALA-10421] - Documented JOIN_ROWS_PRODUCED_LIMIT
- [IMPALA-10459] - Remove workaround codes for MAPREDUCE-6441
Sub-task
- [IMPALA-2515] - Impala rejects Parquet schemas where decimal fixed_len_byte_array columns have unnecessary padding bytes
- [IMPALA-3380] - Add TCP timeouts to all RPCs that don't block
- [IMPALA-4192] - Pull all expressions in a fragment into QueryState
- [IMPALA-4973] - Convert UnionStmt class into SetOperationStmt
- [IMPALA-4974] - Add INTERSECT and EXCEPT support to SetOperationStmt
- [IMPALA-6101] - DataStreamMgr::Cancel() should take a query ID instead of a finst ID
- [IMPALA-6692] - When partition exchange is followed by sort each sort node becomes a synchronization point across the cluster
- [IMPALA-6788] - Abort ExecFInstance() RPC loop early after query failure
- [IMPALA-6984] - Coordinator should cancel backends when returning EOS
- [IMPALA-7097] - Print EC info in the query plan and profile
- [IMPALA-7501] - Slim down metastore Partition objects in LocalCatalog cache
- [IMPALA-7533] - Optimize fetch-from-catalog by caching partitions across table versions
- [IMPALA-7538] - Support HDFS caching with LocalCatalog
- [IMPALA-8291] - 'DESCRIBE EXTENDED ..' does not display constraint information
- [IMPALA-8632] - Add support for self-event detection for insert events
- [IMPALA-8769] - Impala Doc: Change the shell default to HS2
- [IMPALA-8954] - Support uncorrelated subqueries in the select list
- [IMPALA-9199] - Add support for single query retries on cluster membership changes
- [IMPALA-9213] - Client logs should indicate if a query has been retried
- [IMPALA-9224] - Blacklist nodes with faulty disks
- [IMPALA-9225] - Retryable queries should spool all results before returning any to the client
- [IMPALA-9229] - Link failed and retried runtime profiles
- [IMPALA-9366] - Remove embedded pointer references in handcrafted codegen code
- [IMPALA-9373] - Trial run of IWYU on codebase
- [IMPALA-9374] - Possible data race in TupleDescriptor::GetLlvmStruct
- [IMPALA-9380] - Serialize query profile asynchronously
- [IMPALA-9382] - Prototype denser runtime profile implementation
- [IMPALA-9401] - Add initial IWYU mappings
- [IMPALA-9426] - Download Python dependencies even skipping bootstrap toolchain
- [IMPALA-9428] - Add arm64 atomic ops
- [IMPALA-9484] - Milestone 1: properly scan files that have full ACID schema
- [IMPALA-9485] - Enable file handle cache for EC files
- [IMPALA-9502] - Avoid copying TExecRequest when retrying queries
- [IMPALA-9512] - Milestone 2: Validate each row against the valid write id list
- [IMPALA-9515] - Milestone 3: Reading “original files”
- [IMPALA-9538] - Bump up linux-syscall-support.h to newest version
- [IMPALA-9543] - Reduce duplicate code in thrift CMakeLists.txt
- [IMPALA-9544] - Replace Intel's SSE instructions with ARM's NEON instructions
- [IMPALA-9545] - Decide cacheline size of aarch64
- [IMPALA-9561] - Change hadoop-ozone-filesystem dependency to hadoop-ozone-filesystem-lib-current
- [IMPALA-9565] - Remove unused included file mm_malloc.h on ARM
- [IMPALA-9568] - Template tuples are initialized multiple times
- [IMPALA-9569] - Query progress bar freezes when a query is retried
- [IMPALA-9585] - Update docs about mt_dop for IMPALA-9099
- [IMPALA-9590] - Resolve error when building tsan and ubsan on arm64
- [IMPALA-9604] - Add tpch_nested tests for column masking
- [IMPALA-9626] - Use Python 2.7 from toolchain
- [IMPALA-9630] - Keep blocking queue cache line aligned on aarch64
- [IMPALA-9636] - Retried queries that blacklist nodes should ensure they don't run on the blacklisted node
- [IMPALA-9645] - Port LLVM codegen to adapt aarch64
- [IMPALA-9655] - Dynamic intra-node load balancing for HDFS scans
- [IMPALA-9668] - OSError: Cannot call rmtree on a symbolic link when creating python virtualenv
- [IMPALA-9676] - Add aarch64 compile options for clang
- [IMPALA-9692] - Model QuerySchedule as a protobuf
- [IMPALA-9711] - Incrementally compute averaged profile
- [IMPALA-9718] - Remove pkg_resources.py from Impala/shell
- [IMPALA-9719] - Upgrade sasl-0.1.1 to 0.2.1 in Impala/shell/ext-py
- [IMPALA-9720] - Upgrade bitarray from 0.9.0 to 1.2.1 in Impala/shell/ext-py
- [IMPALA-9730] - TSAN data race in RuntimeFilterBank::CancelLocked()
- [IMPALA-9739] - TSAN data races during impalad shutdown
- [IMPALA-9740] - TSAN data race in hdfs-bulk-ops
- [IMPALA-9741] - Support querying Iceberg tables in Impala
- [IMPALA-9744] - Treat corrupt table stats as missing to avoid bad plans
- [IMPALA-9752] - Move instance profile operations to executors
- [IMPALA-9784] - Support uncorrelated scalar subqueries in HAVING
- [IMPALA-9812] - Remove --unlock_mt_dop and --mt_dop_auto_fallback
- [IMPALA-9844] - Ozone support for load data inpath
- [IMPALA-9847] - JSON profiles are mostly space characters
- [IMPALA-9849] - Set halt_on_error=1 for TSAN builds
- [IMPALA-9854] - TSAN data race in QueryDriver::CreateRetriedClientRequestState
- [IMPALA-9855] - TSAN lock-order-inversion warning in QueryDriver::RetryQueryFromThread
- [IMPALA-9859] - Milestone 4: Read updated tables
- [IMPALA-9865] - Utility to pretty-print thrift profiles at various levels
- [IMPALA-9870] - summary and profile command in impala-shell should show both original and retried info
- [IMPALA-9897] - Parser support for ROLLUP, CUBE and GROUPING SETS
- [IMPALA-9898] - Plan generation and execution for ROLLUP, CUBE and GROUPING SETS
- [IMPALA-9904] - Fix bad cipher test failed case on aarch64
- [IMPALA-9906] - Fix thread-pool-test failed case on aarch64
- [IMPALA-9917] - Support grouping() and grouping_id() functions
- [IMPALA-9924] - Add support for single IN in disjunction
- [IMPALA-9925] - cast(pow(2, 31) as int) return 2147483647 on aarch64
- [IMPALA-9926] - base64decode % will not return error when in newer OS
- [IMPALA-9930] - Introduce new admission control rpc service
- [IMPALA-9936] - Only send invalidations in DDL responses to LocalCatalog coordinators
- [IMPALA-9943] - Add INTERSECT and EXCEPT with DISTINCT Qualifier
- [IMPALA-9954] - RpcRecvrTime can be negative
- [IMPALA-9972] - Use defined referential constraints for join cardinality calculations
- [IMPALA-9975] - Introduce new admission control daemon
- [IMPALA-9979] - Backend partitioned top-n operator
- [IMPALA-9995] - Fix test_alloc_fail failed case on aarch64
- [IMPALA-10016] - Split jars for Impala executor and coordinator Docker images
- [IMPALA-10029] - Strip debug symbols from libkudu_client and libstdc++ binaries
- [IMPALA-10030] - Remove unneeded jars from fe/pom.xml
- [IMPALA-10061] - Fix bugs of IMPALA-9645
- [IMPALA-10065] - Hit DCHECK when retrying a query in FINISHED state
- [IMPALA-10073] - Create shaded dependency for S3A and aws-java-sdk-bundle
- [IMPALA-10116] - Builtin cast function's selectivity is different from that of explicit cast
- [IMPALA-10132] - Implement ds_hll_estimate_bounds()
- [IMPALA-10133] - Implement ds_hll_stringify()
- [IMPALA-10134] - Implement ds_hll_union_f()
- [IMPALA-10144] - Add a statement of platforms that Impala runs on
- [IMPALA-10151] - Upgrade Iceberg to a version that is compatible with Hive3
- [IMPALA-10152] - Add support for Iceberg HiveCatalog
- [IMPALA-10170] - Data race on Webserver::UrlHandler::is_on_nav_bar_
- [IMPALA-10189] - Avoid unnecessarily loading metadata for drop stats DDL
- [IMPALA-10215] - Implement INSERT INTO for non-partitioned Iceberg tables (Parquet)
- [IMPALA-10219] - Add a query option to simulate catalogd HDFS listing delays
- [IMPALA-10223] - Implement INSERT OVERWRITE for Iceberg tables
- [IMPALA-10288] - Functionality to display table history for Iceberg tables
- [IMPALA-10295] - Fix analytic limit pushdown when no predicates are present
- [IMPALA-10296] - Fix analytic limit pushdown when predicates are present
- [IMPALA-10380] - INSERT INTO identity-partitioned Iceberg tables
- [IMPALA-10404] - Update docs to reflect RLE_DICTIONARY support
- [IMPALA-10432] - INSERT INTO Iceberg tables with partition transforms
- [IMPALA-10456] - Implement TRUNCATE for Iceberg tables
- [IMPALA-10469] - Support pushing quickstart images to Apache repo
- [IMPALA-10470] - Update wiki and README with info about Impala quickstart
- [IMPALA-10510] - Change code to help with third party extensions
- [IMPALA-10512] - ALTER TABLE ADD PARTITION should bump the write id for ACID tables
- [IMPALA-10515] - Planner refactoring to support external FE
- [IMPALA-10518] - Add server interface to retrieve executor membership
- [IMPALA-10522] - Support external use of frontend libraries
- [IMPALA-10524] - Change HdfsPartition to allow third party extensions
- [IMPALA-10525] - Add param to BuiltinsDb to defer initialization
- [IMPALA-10535] - Add interface to ImpalaServer for execution of externally compiled statements
- [IMPALA-10546] - Add ImpalaServer interface to Retrieve BackendConfig from impalad
- [IMPALA-10549] - Register transactions from external frontends
- [IMPALA-10551] - Add result sink support
- [IMPALA-10552] - Support external frontends supplying timeline for profile
- [IMPALA-10553] - Support for CTAS/Insert for external frontends
- [IMPALA-10577] - Add retrying of AdmitQuery
- [IMPALA-10590] - Ensure admissiond stays in sync with coordinators
- [IMPALA-10591] - Fix issues with failed ReleaseQueryBackends rpc
- [IMPALA-10594] - Handle failed coordinators in admissiond
- [IMPALA-10613] - Expose table and partition metadata over HMS API
- [IMPALA-10619] - Enable external FE to leverage standardization of first_value/last_value functions
Test
- [IMPALA-9004] - TestCompressedFormats is broken for text files
- [IMPALA-9757] - Test failures with HiveServer2Error: Invalid session id
- [IMPALA-9780] - All FE tests should explicitly set/unset the test flag
- [IMPALA-9902] - Add rewrite of TPC-DS q38
- [IMPALA-10369] - Dump server stacktraces when test_concurrent_ddls.py times out
- [IMPALA-10452] - CREATE Iceberg tables with old PARTITIONED BY syntax
- [IMPALA-10598] - test_cache_reload_validation is flaky
Wish
Documentation
- [IMPALA-9541] - Document for dynamic log level changes
- [IMPALA-9817] - Document flags of Fetch-on-demand metadata coordinators
- [IMPALA-10388] - Document the limitation on mask functions
- [IMPALA-10538] - Document the newly added scale argument of ndv function