Impala 2.11 Change Log
New Feature
- [IMPALA-1767] - Boolean type does not include ISO-SQL IS TRUE, IS UNKNOWN, or IS FALSE predicates
- [IMPALA-4252] - Add RuntimeFilters for "in list" and/or min/max at KuduScanNode
- [IMPALA-5317] - add DATE_TRUNC() function
Improvement
- [IMPALA-743] - Impala should use -l instead of -r with kinit
- [IMPALA-2181] - Add a flag for hidden query options
- [IMPALA-2250] - Make multiple COUNT(DISTINCT) message state workarounds
- [IMPALA-2281] - Use a better hash function than FNV for exchanges
- [IMPALA-2758] - Remove BufferedTupleStream::GetRows()
- [IMPALA-3437] - Consider changing arithmetic conversions to produce decimal in more cases
- [IMPALA-3548] - Prune runtime filters based on RUNTIME_FILTER_MODE in the frontend
- [IMPALA-3804] - Re-enable per-scan filtering for sequence-based scanners
- [IMPALA-3877] - Support unpatched LLVM
- [IMPALA-4177] - Add batch dictionary/RLE decoding in Parquet
- [IMPALA-4236] - Codegen CopyRows() for select nodes
- [IMPALA-4506] - Make "tip of the day" message respect --quiet option
- [IMPALA-4704] - ImpalaD should not open 21000 and 21050 Ports till Catalog is Received
- [IMPALA-4826] - Impala should ignore the root schema's repetition in Parquet
- [IMPALA-4847] - Simplify the code for file/block metadata loading by manually calling listLocatedStatus() for each partition.
- [IMPALA-4985] - Evaluate parquet::Statistics to skip data in nested types
- [IMPALA-5129] - Use Kudu's Kinit code to avoid expensive fork
- [IMPALA-5211] - Simplify remaining constant conditionals
- [IMPALA-5394] - Set socket timeouts while opening TSaslTransport
- [IMPALA-5425] - Add test for validating input when setting query options.
- [IMPALA-5541] - Enforce sane maximum for BATCH_SIZE
- [IMPALA-5625] - stress test: collect profiles for timed out or errored queries
- [IMPALA-5736] - Add impala-shell argument to set default query options
- [IMPALA-5789] - Prune all partitions if an always-false runtime filter is received
- [IMPALA-5844] - Fix management of FunctionContext "local" allocations.
- [IMPALA-5849] - Don't disable TLS configuration at compile-time even with OpenSSL 1.0.0
- [IMPALA-5860] - Upgrade LLVM to 3.9.0 or later
- [IMPALA-5895] - Simplify and document memory management of RuntimeProfile Counters
- [IMPALA-5932] - Improve the performance of transitive closure computation in value transfer graph
- [IMPALA-5965] - Avoid per-value switch on NeedsConversionInline() when decoding dictionary-encoded strings and timestamps
- [IMPALA-5988] - Improve MemPool::TryAllocate() efficiency for small strings
- [IMPALA-6016] - Confusing logging in TableLoadingMgr.loadNextTable()
- [IMPALA-6054] - Parquet dictionary pages should be freed on dictionary construction
- [IMPALA-6067] - S3: Impala should be able to use IAM roles to access s3 storage
- [IMPALA-6076] - Add deprecation warning for BIT_PACKED encoding.
- [IMPALA-6080] - Clean up descriptor table handling in coordinator
- [IMPALA-6084] - Avoid "using namespace llvm" in C++ source files
- [IMPALA-6121] - Remove RequestContext cache in DiskIoMgr
- [IMPALA-6128] - Spill-to-disk Encryption(AES-CFB + SHA256) can be a performance bottleneck while IO is getting faster
- [IMPALA-6151] - Add NumBackends and NumFragments counters to profile
- [IMPALA-6210] - Add query id to lineage graph logging
Bug
- [IMPALA-467] - Builds intermittently fail due to problems getting the volume id metadata
- [IMPALA-1144] - Query cancellation throws error, reports wrong query text
- [IMPALA-1291] - Parquet read fails if io buffer size is less than the footer size
- [IMPALA-1422] - Allow constant exprs as the left-hand side of an IN subquery
- [IMPALA-1474] - Add metric for running queries
- [IMPALA-1575] - Cancelled queries do not yield resources until close
- [IMPALA-2234] - Exceeding mem limit may result in "Cancelled" failure
- [IMPALA-2235] - Impala shell automatic reconnect does not appear to maintain "use <db>"
- [IMPALA-2294] - Impalad unable to kinit after several days
- [IMPALA-2615] - annotate Status with [[nodiscard]]
- [IMPALA-2810] - Error message when moving a partitioned table from one database to another
- [IMPALA-3360] - Unroll loops / replace types in filter logic in PHJ::ProcessBuildBatch()
- [IMPALA-3516] - Some things are still written to /tmp instead of IMPALA_HOME/logs
- [IMPALA-3613] - Statestore should not update reconnected subscribers repeatedly
- [IMPALA-3897] - Codegen null-aware constant in PHJ::ProcessBuildBatch()
- [IMPALA-4591] - Kudu client error memory should be bounded
- [IMPALA-4620] - The eval cost of exprs should always be set in analyze(), even if the eval cost is unknown (-1).
- [IMPALA-4682] - IllegalStateException when ordering by aggregate function
- [IMPALA-4863] - Incorrect accounting of file count and compression type when Runtime filters are applied on partition and non-partition columns
- [IMPALA-4918] - Support getting column comments via HS2
- [IMPALA-4951] - Impala does not show database if the user only has column-level access
- [IMPALA-4964] - Decimal modulo operator is overflowing
- [IMPALA-4987] - test_rows_availability.py is flaky
- [IMPALA-5018] - DECIMAL V2: Error on decimal divide by 0
- [IMPALA-5019] - DECIMAL V2 add/sub result type
- [IMPALA-5146] - from_unixtime() given an out-of-range unix time produces inconsistent results
- [IMPALA-5199] - Impala may hang on empty row batch exchange
- [IMPALA-5210] - Nested types : Scans spend 30% of CPU in impala::RuntimeProfile::Counter::Add and 8% in apic_timer_interrupt
- [IMPALA-5250] - Non-deterministic error reporting for compressed corrupt Parquet files
- [IMPALA-5311] - Select count(*) queries show incorrect compression in profile
- [IMPALA-5341] - File size filter in planner tests also filters row-size
- [IMPALA-5416] - Chaining source command in impala-shell with a SQL query runs the query twice and crashes if followed by another source
- [IMPALA-5429] - Use a thread pool to load block metadata in parallel
- [IMPALA-5448] - Invalid number of files reported in Parquet scan node
- [IMPALA-5491] - Improve error message when loading metadata for table where HDFS files have missing blocks
- [IMPALA-5597] - IllegalStateException in RuntimeFilterGenerator.computeTargetExpr() with left join
- [IMPALA-5599] - Clean up non-TIMESTAMP usages of TimestampValue
- [IMPALA-5617] - Stress test not finding tpch_nested queries
- [IMPALA-5624] - ProcessStateInfo::ReadProcFileDescriptorInfo() should not fork a process
- [IMPALA-5664] - Unix time to timestamp conversions may crash impala (boost exception)
- [IMPALA-5668] - Subsecond Unix times around the first supported TIMESTAMP may be wrong
- [IMPALA-5750] - Handle uncaught exceptions in thread creation
- [IMPALA-5812] - Query hits NullPointerException in FE
- [IMPALA-5816] - ssl-related custom cluster tests failing during setup on exhaustive RHEL7
- [IMPALA-5836] - Eclipse frontend debugging setup requires manual steps like creating a launcher
- [IMPALA-5846] - Kudu libraries are written to be/src/.., not be/build/...
- [IMPALA-5853] - GetResultSetMetadata() invalid query id error message is confusing
- [IMPALA-5854] - Update external hadoop ecosystem
- [IMPALA-5856] - Queries with full outer and left join miss result rows
- [IMPALA-5863] - Include-what-you-use for Kudu client
- [IMPALA-5867] - Out-of-range yy month format can crash Impala
- [IMPALA-5870] - Partial sort profile counters don't make sense for partial sort
- [IMPALA-5871] - KuduPartitionExpr incorrectly handles its child types
- [IMPALA-5873] - Building Impala on legacy platforms is broken due to sync_file_range() not defined
- [IMPALA-5885] - Parquet scanner does not free local allocations in filter contexts
- [IMPALA-5888] - Parquet scanner does not free local allocations for min/max and dictionary filters
- [IMPALA-5890] - Segmentation fault in ScannerContext::Stream::GetBytesInternal(long, unsigned char**, bool, long*)
- [IMPALA-5891] - PeriodicCounterUpdater should not rely on static initialisation and destruction order
- [IMPALA-5892] - Fault injection at DescriptorTbl::Create() can lead to query hang
- [IMPALA-5911] - Grouping aggregations with having conjuncts and Serialize()/Finalize() functions uses excessive expr memory
- [IMPALA-5912] - Impala gets SIGABRT while running expression tests
- [IMPALA-5920] - Remove admission control dependency on YARN resourcemanager
- [IMPALA-5923] - We're printing a binary ID in ChildQuery::Cancel()
- [IMPALA-5926] - Avoid printing expensive stack when closing a session
- [IMPALA-5927] - enable_distcc broken for ZSH
- [IMPALA-5936] - Difference between the % Operator and Mod function with large decimal values
- [IMPALA-5940] - Log-spew and performance hit from Status objects that generate stack traces unnecessarily
- [IMPALA-5941] - create-test-configuration.sh does not properly create Hive Metastore schema
- [IMPALA-5949] - test_exchange_small_delay failure: Expected exception: Sender timed out waiting for receiver fragment instance
- [IMPALA-5951] - test_catalogd_timeout failure: Expected exception: Error creating Kudu table
- [IMPALA-5954] - Prefer StatsSetupConst.DO_NOT_UPDATE_STATS over STATS_GENERATED_VIA_STATS_TASK
- [IMPALA-5955] - Use the totalSize Hive table property instead of rawDataSize
- [IMPALA-5957] - DCHECK attempts to print non-string as a c-style string
- [IMPALA-5964] - common/yarn-extras/README.txt doesn't pass Apache RAT check
- [IMPALA-5966] - PlannerTest result files are written to the wrong location
- [IMPALA-5983] - Dateless timestamps (e.g. "10:00:00") can cause crash during timezone conversion
- [IMPALA-5986] - Impala test suite harness fails to reset some options in SQL sessions
- [IMPALA-5987] - LZ4 Codec silently produces bogus compressed data for large inputs
- [IMPALA-5994] - Failure in star expansion on struct fields
- [IMPALA-5999] - Multiple failures in TestUdfExecution and test_spilling
- [IMPALA-6001] - Integration job failed in TestDdlStatements.test_functions_ddl - one extra function in actual output
- [IMPALA-6009] - FE compilation fails: ColumnLineageGraph.java:[593,11] no suitable method found for putString(java.lang.String)
- [IMPALA-6012] - HIVE-12730 breaks Impala compilation
- [IMPALA-6021] - FE fails to compile due to incompatible Guava Hasher API
- [IMPALA-6023] - impalad failed to start in test_dcheck_writes_minidump
- [IMPALA-6030] - Don't start coordinator specific thread pools if a node isn't a coordinator node
- [IMPALA-6039] - BitReader::GetAligned() doesn't zero out trailing bytes
- [IMPALA-6040] - test_multi_compression_types uses hive in incompatible environments
- [IMPALA-6049] - Fix for IMPALA-6023 breaks test_breakpad on localFs builds
- [IMPALA-6053] - IllegalStateException when storageIds don't match hosts
- [IMPALA-6055] - Impala doesn't work with Hadoop 2.8 and newer
- [IMPALA-6060] - Crash in JniUtfCharGuard::create()
- [IMPALA-6061] - Impala needs to handle deprecation of s3n in hadoop 3.0
- [IMPALA-6068] - Dataload does not populate functional_*.complextypes_fileformat correctly
- [IMPALA-6069] - Incorrect handling of NaN with join and codegen
- [IMPALA-6081] - TestRuntimeFilters fails due to runtime profile missing portions
- [IMPALA-6092] - Flaky test: query_test/test_udfs.py still happening
- [IMPALA-6093] - TestHashJoinTimer failed on local filesystem and ASAN builds
- [IMPALA-6099] - DCHECK in runtime filters: "Tried to increment unknown counters group"
- [IMPALA-6100] - test_exchange_delays flaky under ASAN
- [IMPALA-6106] - test_tpcds_q53 extremely flaky because of decimal_v2 not being reset
- [IMPALA-6109] - Hbase in minicluster appears to be flaky
- [IMPALA-6114] - Incorrect type deduction causing analysis exception to be thrown
- [IMPALA-6118] - Assertion failure in mem-tracker when releasing runtime filter memory
- [IMPALA-6123] - test_inline_view_limit fails in exhaustive tests
- [IMPALA-6124] - test_last_ddl_time_update fails on S3
- [IMPALA-6126] - ASAN detects heap-use-after-free in thrift-server-test
- [IMPALA-6127] - Failure in TestRuntimeFilters.test_wait_time on ASAN
- [IMPALA-6132] - ASAN test fails when trying to move/copy string created by kudu::EnvPosix::GetExecutablePath into InitAuth()
- [IMPALA-6136] - Duration in /queries page shows a negative value
- [IMPALA-6137] - ASAN heap-use-after-free in HdfsTextScanner::CheckForSplitDelimiter()
- [IMPALA-6144] - Coordinator threads that publish RuntimeFilters continue to run after query failure/cancellation
- [IMPALA-6163] - LLVM link error in test_ir_functions
- [IMPALA-6164] - test_always_false_filter failure on ASAN
- [IMPALA-6170] - Failure in llvm-codegen-test: Failed to get file info /test-warehouse/test-udfs.ll
- [IMPALA-6171] - Failure in test_admission_controller "assert metric_deltas['timed-out'] == 0"
- [IMPALA-6173] - SHOW CREATE TABLE broken for unpartitioned Kudu tables
- [IMPALA-6183] - Converting Decimal to Double loses precision
- [IMPALA-6184] - Check failed: !initialized_ || closed_
- [IMPALA-6187] - Scan with conjuncts but no materialized slots crashes Impalad
- [IMPALA-6188] - test_top_n_reclaim is flaky
- [IMPALA-6198] - Error starting cluster for a custom cluster test on a release build
- [IMPALA-6201] - TestRuntimeFilters.test_basic_filters fails on ASAN
- [IMPALA-6206] - Data loading fails if tests are not built
- [IMPALA-6213] - The partitioning compatibility check is wrong in consecutive outer join cases
- [IMPALA-6217] - parquet-column-readers.cc:417] Check failed: def_levels_.CacheHasNext()
- [IMPALA-6220] - Build broken due to ‘EVP_aes_256_ctr’ not declared in openssl-util.cc
- [IMPALA-6225] - IMPALA-5599 broke client software due to precision change in date-time string
- [IMPALA-6232] - Short circuit reads disabled when using Impala HDFS file handle cache
- [IMPALA-6239] - Remote data load breaks with "LOAD DATA LOCAL INPATH": Invalid path
- [IMPALA-6241] - Timeout in TestAdmissionControllerStress.test_mem_limit under ASAN
- [IMPALA-6242] - Flaky test: TimerCounterTest.CountersTestOneThread
- [IMPALA-6255] - Add disk names to disk-io-mgr threads
- [IMPALA-6262] - Crash Impalad [ DataSink::Create fail which cause profile nullptr ]
- [IMPALA-6265] - TestImpalaShell.test_query_cancellation_during_fetch breaks in ASAN builds
- [IMPALA-6273] - test_subquery_in_constant_lhs failing on exhaustive runs in hbase
- [IMPALA-6278] - Set up release notes for Impala 2.11
- [IMPALA-6280] - Invalid plan for sorted INSERT with an outer join and null checking
- [IMPALA-6281] - thrift-server-test and rpc-mgr-test failing ASAN builds
- [IMPALA-6284] - Exhaustive release build failing
- [IMPALA-6285] - Avoid printing the stack as part of DoTransmitDataRpc as it leads to burning lots of kernel CPU
- [IMPALA-6286] - Wrong results with outer join and RUNTIME_FILTER_MODE=GLOBAL
- [IMPALA-6291] - Various crashes and incorrect results on CPUs with AVX512
- [IMPALA-6292] - Decimal v2 subtraction hits a DCHECK
- [IMPALA-6298] - failure in test_profile_fragment_instances
- [IMPALA-6308] - RPC timeout message printing invalid destination name
- [IMPALA-6332] - Impala webserver should return HTTP error code for missing query profiles
Sub-task
- [IMPALA-2494] - Impala Unable to scan a Decimal column stored as Bytes
- [IMPALA-4655] - Add Kerberos minicluster test framework
- [IMPALA-4670] - Add RpcMgr to interface between Impala and KRPC library
- [IMPALA-4671] - Replace kudu::ServicePool with one that uses Impala threads
- [IMPALA-4786] - Refactor CreateImpalaServer() to allow it to be used in tests.
- [IMPALA-4856] - Port datastream portions of ImpalaInternalService to KRPC
- [IMPALA-4872] - Remove per-RPC DNS lookup
- [IMPALA-5053] - Enable KRPC Kerberos support in Impala
- [IMPALA-5174] - Suppress kudu flags that aren't relevant to Impala
- [IMPALA-5307] - Consider always copying-out Disk I/O buffers instead of attaching to RowBatches
- [IMPALA-5417] - Consider limiting I/O buffer queue size to 2 buffers
- [IMPALA-5493] - Add Protobuf headers to Impala-lzo
- [IMPALA-5538] - Use explicit catalog versions for deleted objects in catalog updates
- [IMPALA-5596] - Data load failed with "Failed to find any Kerberos tgt" on secure cluster
- [IMPALA-5902] - Add ThreadSanitizer build
- [IMPALA-5905] - Add ThreadSanitizer to https://jenkins.impala.io/job/all-build-options/
- [IMPALA-5976] - Remove equivalence classes
- [IMPALA-6002] - Impala should install an LLVM diagnostic handler
- [IMPALA-6058] - Address log spew originating from InboundCall::Respond()
- [IMPALA-6134] - Update code base to use impala::ConditionVariable
- [IMPALA-6172] - KRPC w/ TLS doesn't work on remote clusters after rebase
- [IMPALA-6219] - Use AES-GCM for spill-to-disk encryption when CLMUL instruction is present and performant
- [IMPALA-6238] - Add source and destination hosts to TErrorCode::DATASTREAM_SENDER_TIMEOUT
- [IMPALA-6250] - Document IS <boolean>
- [IMPALA-6251] - Document DATE_TRUNC() function
- [IMPALA-6252] - Document impala-shell argument --query_options
- [IMPALA-6253] - Document upper limit for BATCH_SIZE query option
- [IMPALA-6339] - Document changes to SET output and new SET ALL syntax
Task
- [IMPALA-4082] - Replace getRegionsInRange() in fe/src/main/java/com/cloudera/impala/catalog/HBaseTable.java with call to HBase
- [IMPALA-5653] - Remove "unlimited" process mem_limit option
- [IMPALA-5915] - Make all errors coming out of the IoMgr identifiable
- [IMPALA-6203] - Take "incubating" out of documentation
Test
- [IMPALA-5827] - Add test coverage for failure to repartition in hash join