Impala 2.9 Change Log
New Feature
- [IMPALA-3586] - Passthrough mode for Union ALL operator
- [IMPALA-3807] - Introduce support for dedicated Impalad coordinator(s)
- [IMPALA-3909] - Parquet file writer should populate the min/max statistics per block per column to be used by the reader
- [IMPALA-4166] - Introduce SORT BY clause in CREATE TABLE statement
- [IMPALA-4403] - Implement SHOW RANGE PARTITIONS for Kudu tables
- [IMPALA-4431] - Add a way to control the number of audit event log files
- [IMPALA-4460] - Investigate and support Kudu basic authentication
- [IMPALA-4616] - Specify more options when adding new kudu columns
- [IMPALA-4728] - materialize expressions for window sorts vs lazy expression evaluation
- [IMPALA-4729] - REPLACE() function
- [IMPALA-4734] - HdfsParquetTableWriter should populate sorting_columns in row groups with any ordering columns
- [IMPALA-4740] - Add option to use hdfsPread() instead of hdfsRead() for HDFS hedged reads
- [IMPALA-4810] - DECIMAL datatype changes for Impala 2.9
- [IMPALA-4815] - Populate min/max statistics in Parquet files for Decimal types
- [IMPALA-4817] - Populate min/max statistics in Parquet files for String values
- [IMPALA-4819] - Populate min/max statistics in Parquet files for Timestamp values
- [IMPALA-4883] - Implement Codegen for the Union operator
- [IMPALA-5030] - add NVL2()
- [IMPALA-5137] - Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
- [IMPALA-5259] - Add REFRESH FUNCTIONS <db> statement
- [IMPALA-5333] - Add support for Impala to work with ADLS
- [IMPALA-5381] - Add query option to control join strategy when tables have no stats
Improvement
- [IMPALA-681] - Investigate memory usage in TestWideRow
- [IMPALA-1430] - Codegen all aggregate functions, including UDAs
- [IMPALA-1861] - Conditional functions with constant arguments should be simplified during analysis
- [IMPALA-2079] - Don't fail when write to scratch dir results in error.
- [IMPALA-3175] - Cache the Kudu client between queries
- [IMPALA-3654] - Range based pruning for in-predicate
- [IMPALA-3742] - INSERTs into Kudu tables should partition and sort
- [IMPALA-3955] - Remove Scheduler class and rename SimpleScheduler to Scheduler
- [IMPALA-3989] - Display skew warning for poorly formatted Parquet files
- [IMPALA-4118] - Implement disk spill encryption and integrity checking for new buffer pool
- [IMPALA-4141] - Error when creating a partition that already exists in HMS
- [IMPALA-4282] - Allow Impala to create columns whose type has more than 4000 characters
- [IMPALA-4293] - Query profile should include error log
- [IMPALA-4341] - Include metadata loading time in planner timeline
- [IMPALA-4547] - Fix ExecEnv singleton issues in tests
- [IMPALA-4611] - Checking perms on S3 files is a very expensive no-op
- [IMPALA-4617] - Remove duplication of isConstant() and IsConstant() in frontend and backend
- [IMPALA-4624] - Add dictionary filtering to Parquet scanner
- [IMPALA-4635] - Reduce bootstrap time for Python virtualenv
- [IMPALA-4636] - Add support for SLES12 for Impala/Kudu integration
- [IMPALA-4648] - Remove build_thirdparty.sh
- [IMPALA-4649] - Add a mechanism to pass flags into make
- [IMPALA-4653] - Revamp impala-config.sh to avoid annoying "sticky config variables" problem
- [IMPALA-4673] - Use --local_library_dir for timestamp db scratch space
- [IMPALA-4676] - Remove vestigial references to getBlockStorageLocations() API
- [IMPALA-4711] - Document is_null semantics in UDF interface
- [IMPALA-4735] - py.test needs to be upgraded in the Impala python environment
- [IMPALA-4750] - Future proof use of pytest in Impala environment
- [IMPALA-4752] - ObjectPool should not do extra dynamic memory allocation
- [IMPALA-4762] - RECOVER PARTITIONS should send new partitions in small batches to HMS
- [IMPALA-4764] - add hedged read metrics
- [IMPALA-4774] - Change log level without restart
- [IMPALA-4787] - Optimize APPX_MEDIAN() mem usage in case of many grouping keys
- [IMPALA-4822] - Allow dynamic log4j configuration for Catalog and Impalads
- [IMPALA-4825] - Change Timestamp representation for binary compatibility with Kudu.
- [IMPALA-4846] - Upgrade snappy to 1.1.4
- [IMPALA-4859] - Push down IS NULL / IS NOT NULL to Kudu
- [IMPALA-4873] - run_test_case() should find/replace arbitrary strings in /testdata/workloads/*/queries/*.test
- [IMPALA-4880] - Clarify synchronization policy for 'done_' in KuduScanNode
- [IMPALA-4891] - [Kudu] Pushdown IS NULL and IS NOT NULL predicates
- [IMPALA-4941] - Bump Zookeeper version to 3.4.10 to address ZOOKEEPER-2044
- [IMPALA-4943] - Improve metadata load speed for "alter table add partition".
- [IMPALA-4966] - Add flatbuffers 1.6.0 to toolchain
- [IMPALA-4988] - Add query option to control filtering based on parquet::Statistics
- [IMPALA-5003] - Add 'constant propagation' for Views with a partition filter
- [IMPALA-5015] - Run parquet_stats_test.py with mt_dop != 0
- [IMPALA-5034] - Update breakpad to a newer version
- [IMPALA-5110] - dump_breakpad_symbols.py should support Debian packages
- [IMPALA-5120] - Consider defaulting to partitioned join when no stats are available.
- [IMPALA-5127] - Impala shell history size is fixed and very small
- [IMPALA-5140] - clean up markdown, make slight improvements to doc building howto
- [IMPALA-5156] - Drop VLOG level passed into Kudu client
- [IMPALA-5162] - support kerberized+ssl TPC-H nested data loading
- [IMPALA-5163] - support running concurrent_select.py against Kerberized+SSL Impala
- [IMPALA-5169] - Parallelise read I/O of BufferPool::Pin()
- [IMPALA-5181] - Make it possible to get Python package metadata from an HTML web page in pip_download.py
- [IMPALA-5187] - Bump breakpad version to include the fix for Breakpad #681, re-enable the strict check that was disabled in IMPALA-3794
- [IMPALA-5192] - Avoid hard coding pointer to the tuple pool into generated IR of Tuple::CodegenMaterializeExprs()
- [IMPALA-5214] - Distcc scripts should not require toolchain at /opt/Impala-Toolchain
- [IMPALA-5220] - Clean up TCMalloc memory maintenance logic
- [IMPALA-5229] - Try using TCMalloc + Huge Pages for buffers
- [IMPALA-5238] - Support transferring reservation between ReservationTrackers
- [IMPALA-5301] - minicluster kudu needs mem limits set
- [IMPALA-5304] - Parquet scanner transfers decompression buffers when not needed
- [IMPALA-5347] - Parquet scanner has a lot of small CPU inefficiencies
Bug
- [IMPALA-391] - Expr-test does not actually test the codegen path
- [IMPALA-397] - ORDER BY rand() does not work.
- [IMPALA-1427] - Improve "unknown disk id" warning messages
- [IMPALA-1464] - Bug in explain plan: Plan nodes in unpartitioned fragments should have hosts=1.
- [IMPALA-1670] - Support multiple partition specs in ALTER TABLE ADD PARTITION
- [IMPALA-1972] - Queries that take a long time to plan can cause webserver to block other queries
- [IMPALA-2328] - Parquet scan should use min/max statistics to skip blocks based on predicate
- [IMPALA-2518] - DROP DATABASE CASCADE does not remove cache directives of tables
- [IMPALA-2716] - Hive/Impala incompatibility for timestamp data in Parquet
- [IMPALA-2800] - impalad process blocked while releasing memory after a big query
- [IMPALA-3079] - Fix Sequence file writer (crashes or produces invalid files)
- [IMPALA-3524] - Spilling joins unnecessarily process spilled partitions with 0 probe rows
- [IMPALA-3641] - DROP / CREATE sequence on same table failed with "table already exists"
- [IMPALA-3785] - "Invalid query handle" error should report which query handle is invalid.
- [IMPALA-3794] - test_breakpad.py is flaky
- [IMPALA-3932] - virtualenv does not build binary python packages with toolchain
- [IMPALA-4033] - ALTER TABLE ADD PARTITION treats string-partition key values as case insensitive.
- [IMPALA-4036] - show create table outputs invalid sql for partitioned tables with comments
- [IMPALA-4055] - Investigate and fix to_date() slowness
- [IMPALA-4088] - HDFS data nodes pick HTTP server ports at random, sometimes stealing HBase master's port
- [IMPALA-4164] - Codegen does not generate target-specific machine code for cross-compiled functions
- [IMPALA-4263] - Wrong results due to missing hash exchange believed to be redundant.
- [IMPALA-4449] - Revisit locking scheme in CatalogOpEx.alterTable()
- [IMPALA-4499] - Address Kudu query profile issues
- [IMPALA-4546] - Incorporate Russian/Moscow timezone changes into the tz db
- [IMPALA-4548] - BlockingJoinNode::Close() should wait for completion of async build thread
- [IMPALA-4549] - Validation of timestamp year is inconsistent about whether upper bound is 9999 or 10000
- [IMPALA-4593] - kudu-python is built with the system C++ compiler, which may not be ABI-compatible with the toolchain C++ compiler
- [IMPALA-4615] - test_avro_schema_resolution.py fails with wrong results
- [IMPALA-4640] - parquet-reader always prints "Rows: 0"
- [IMPALA-4647] - Cannot do full data load with ninja
- [IMPALA-4659] - TestScannersFuzzing should set a mem_limit
- [IMPALA-4675] - Mixed or uppercase columns are not resolved in parquet when using PARQUET_FALLBACK_SCHEMA_RESOLUTION=NAME
- [IMPALA-4684] - check-hbase-nodes.py: Build failing on RHEL7 when trying to start HBase
- [IMPALA-4689] - Expiration computes last active timestamp wrong
- [IMPALA-4701] - ccache does not understand that distcc.sh could be clang or gcc
- [IMPALA-4702] - Webserver command line option 'webserver_private_key_file' erroneously refers to 'ssl_server_certificate' instead of 'webserver_certificate_file'
- [IMPALA-4705] - Impala may miss materialization of indirectly referenced functions
- [IMPALA-4707] - Heap use-after-free in QueryExecMgr
- [IMPALA-4710] - There is an error in control audit log file size number
- [IMPALA-4716] - Expr rewrite causes IllegalStateException
- [IMPALA-4721] - Test names which are prefix of other tests make it impossible to select them using impala-py.test -k
- [IMPALA-4722] - test_scratch_disk.py fails sporadically when asserting logfile content
- [IMPALA-4725] - Wrong field resolution of nested Parquet fields
- [IMPALA-4731] - Sorter crashes Impalad instance
- [IMPALA-4733] - HBase/Zookeeper continues to be flaky when starting the minicluster on RHEL7
- [IMPALA-4738] - stddev_samp() returns 0 when it should return NULL
- [IMPALA-4742] - run-tests.py not compatible with python 2.6
- [IMPALA-4745] - TestScratchLimit fails on S3
- [IMPALA-4748] - tmp-file-mgr.h:263] Check failed: !write_in_flight_
- [IMPALA-4749] - Hit DCHECK in sorter for spilling query with scratch limit
- [IMPALA-4751] - For unknown query IDs, /query_profile_encoded?query_id=123 starts with an empty line
- [IMPALA-4757] - Macros in testutil/gtest-util.h evaluate their arguments twice
- [IMPALA-4765] - Catalog loading threads can be wasted waiting for a large table to load
- [IMPALA-4767] - Table stats are removed after any ALTER TABLE in Impala
- [IMPALA-4768] - Improve logging of table loading for supportability.
- [IMPALA-4775] - discrepancy_searcher.py logging overwriting itself
- [IMPALA-4779] - Conditional functions isfalse(), istrue(), isnotfalse() and isnottrue() don't work with codegen
- [IMPALA-4780] - Wrong result with next_day() when codegen is enabled.
- [IMPALA-4788] - Partition recovery is very slow as it uses an ArrayList to check if a partition already exists
- [IMPALA-4789] - Slow metadata loading with many partitions that have inconsistent HDFS path qualification
- [IMPALA-4792] - NDV estimates for case expressions with limited number of output values could be improved
- [IMPALA-4801] - Heap use-after-free in expr-test
- [IMPALA-4808] - Crash in old hash join node for full outer join
- [IMPALA-4818] - TestCancellationSerial.test_cancel_insert is meta-flaky
- [IMPALA-4820] - TmpFileMgr can write unencrypted data to disk even when encryption is on
- [IMPALA-4828] - Altering Kudu table schema outside of Impala may result in crash on read
- [IMPALA-4839] - Kudu-related tests failing on remote cluster because localhost / loopback is hard-coded in the test framework
- [IMPALA-4840] - Fix REFRESH perf issues.
- [IMPALA-4842] - BufferedBlockMgrTest.WriteError occasionally fails with error
- [IMPALA-4849] - Case expression with constant condition generates IllegalStateException
- [IMPALA-4853] - test_kudu_dml_reporting in test_shell_commandline.py should not run on distros where Kudu is not supported
- [IMPALA-4854] - COMPUTE INCREMENTAL STATS should ignore missing stats on complex columns
- [IMPALA-4858] - Provide better explanation for obscure Memory limit exceeded failures
- [IMPALA-4868] - TestRequestPoolService.testUpdatingConfigs fails: "checkModifiedConfigResults:245 expected:<root.queueC> but was:<null>"
- [IMPALA-4876] - Remove _test suffix from test names that had been introduced to make names prefix-free
- [IMPALA-4878] - FunctionContext::GetIntermediateType() is not implemented
- [IMPALA-4879] - FunctionContext::GetArgType() returns wrong type in UDA Merge() and Finalize()
- [IMPALA-4887] - Broken local filesystem TestHdfsParquetTableStatsWriter
- [IMPALA-4890] - stress crash: Coordinator race between TearDown() and GetNext() (crash dereferencing executor_)
- [IMPALA-4893] - Sequence scanner increments RuntimeProfile rows read counter per row wasting 30% CPU
- [IMPALA-4895] - Memory limit exceeded in TestTPCHJoinQueries.test_outer_joins on local filesystem and non-partitioned-aggs-and-joins
- [IMPALA-4897] - AnalysisException: specified cache pool does not exist
- [IMPALA-4899] - Parquet table writer leaks dictionaries
- [IMPALA-4902] - Concurrent DDL may fail with a ConcurrentModificationException
- [IMPALA-4904] - test_ddl_stress isn't runnable through buildall.sh entry point
- [IMPALA-4907] - Unable to open scanner: Timed out errors when running COMPUTE STATS on Kudu-related tables
- [IMPALA-4913] - Toolchain broken on centos6/ubuntu12 after Kudu added boost
- [IMPALA-4914] - TestSpillStress makes flawed assumptions about running concurrently
- [IMPALA-4915] - Unbounded DECIMAL casts from floating point to decimal trigger undefined behavior
- [IMPALA-4916] - Missing, redundant or non-evaluable predicates due to buggy equivalence classes.
- [IMPALA-4920] - pytest metadata for custom cluster tests being put in wrong path
- [IMPALA-4923] - Operators running on top of selective Hdfs scan nodes spend a lot of time calling impala::MemPool::FreeAll on empty batches
- [IMPALA-4936] - Cast from double to decimal doesn't always handle overflow correctly
- [IMPALA-4937] - Remove unused kudu scanner keep alive variable
- [IMPALA-4946] - Rare hang in buffer-pool-test
- [IMPALA-4955] - Insert overwrite into partitioned table started failing with IllegalStateException: null
- [IMPALA-4962] - Max Size column incorrectly has NULLs in column stats via HS2 interface
- [IMPALA-4977] - Evaluate IN predicates against parquet::Statistics
- [IMPALA-4980] - Drop partition against table with custom partition paths is failing with Error making 'dropPartition' RPC to Hive Metastore
- [IMPALA-4981] - COMPUTE STATS with MT_DOP=1 and tight memory limit produces spilling error
- [IMPALA-4982] - Add a test for statistics based filtering of row groups for root-level scalar columns of parquet files with nested types
- [IMPALA-4983] - Regression in exchange operators introduced by LZ4 1.7.5 upgrade
- [IMPALA-4995] - crash when limit clause > MAX_INT
- [IMPALA-4997] - crash when using sortby hint on a very large table
- [IMPALA-4998] - Table.toThrift() called without holding the table lock: test_view_compatibility_b0595633.test_hive org.apache.impala.catalog
- [IMPALA-4999] - Impala.tests.custom_cluster.test_spilling.TestSpillStress.test_spill_stress failed intermittently
- [IMPALA-5005] - Don't allow server to send SASL COMPLETE message out of order
- [IMPALA-5008] - AddressSanitizer: heap-buffer-overflow in ParquetPlainEncoder
- [IMPALA-5021] - COMPUTE STATS hang while RowsRead of one SCAN fragment winds down
- [IMPALA-5025] - Upgrade binutils to 2.26.1
- [IMPALA-5027] - udf headers are no longer buildable outside of Impala source tree
- [IMPALA-5028] - Exception in catalog web UI when trying to display loaded table
- [IMPALA-5038] - File size mismatch in PlannerTest.testPredicatePropagation
- [IMPALA-5039] - test_mt_dop.py fails on local filesystem build
- [IMPALA-5041] - Allow AuthManager::Init() to be called more than once
- [IMPALA-5042] - Loading metadata for partitioned tables is slow due to usage of an ArrayList, potential 4x speedup
- [IMPALA-5044] - backports.tempfile not supported in python 2.6
- [IMPALA-5055] - Jenkins test run hit DCHECK in parquet-column-readers.cc
- [IMPALA-5072] - test_recover_many_partitions fails on S3
- [IMPALA-5074] - query_test.test_aggregation.TestAggregationQueries.test_aggregation fails on SLES12 SP2
- [IMPALA-5075] - query_test.test_queries.TestQueriesTextTables.test_strict_mode fails on SLES12 SP2
- [IMPALA-5076] - query_test.test_exprs.TestExprs.test_exprs fails on SLES12 SP2
- [IMPALA-5077] - Add NUMA info and the current CPU to CpuInfo
- [IMPALA-5079] - Flaky tests: Kudu EE tests need longer HS2 connection timeouts
- [IMPALA-5080] - test_java_udfs: OutOfMemoryError: PermGen space
- [IMPALA-5088] - heap-buffer-overflow in impala_udf::StringVal::CopyFrom
- [IMPALA-5111] - IllegalArgumentException when using explicit "NOT NULL" on pk column
- [IMPALA-5115] - Occasional crash in HdfsTableSink while using mod(cast(rand(7) * 1000000000 as int),2) as partition column
- [IMPALA-5123] - ASAN failure: heap-use-after-free in timezone_db.cc:683
- [IMPALA-5125] - Check failed: tuple_desc_map_.back() != __null
- [IMPALA-5143] - Crash while running/cancelling concurrent queries QueryExecState::ExecQueryOrDmlRequest query-exec-state.cc:469
- [IMPALA-5144] - Remove sortby() query hint
- [IMPALA-5145] - CTAS failing when creating from a view with error "Unsupported type 'null_type'"
- [IMPALA-5150] - Uneven load distribution of work across NUMA nodes
- [IMPALA-5154] - catalogd hangs trying to load an unpartitioned Kudu table
- [IMPALA-5157] - Remove "SORTBY()" hint from new features in 2.8.0
- [IMPALA-5164] - BenchmarkTest.Basic test is flaky
- [IMPALA-5171] - fix broken RAT build
- [IMPALA-5172] - crash in tcmalloc::CentralFreeList::FetchFromOneSpans
- [IMPALA-5173] - Crash when NestedLoopJoin has HashJoin feeding directly into its right side
- [IMPALA-5177] - Error making alter_table rpc, job failure
- [IMPALA-5180] - Non-deterministic exprs without slot refs cause HDFS query failure
- [IMPALA-5182] - Explicitly close connection to impalad on error from shell
- [IMPALA-5183] - buffered-block-mgr-test: Writes did not complete after 500ms
- [IMPALA-5186] - stress test caused crash in HdfsParquetScanner::Close()
- [IMPALA-5188] - DCHECK in UnionNode::GetNextPassThrough with GROUP BY, AVG
- [IMPALA-5189] - python env fails to install pytest-xdist
- [IMPALA-5193] - Impala reads gzip compressed text as binary when skip.header.line.count > 0
- [IMPALA-5197] - Parquet scan may incorrectly report "Corrupt Parquet file" in the logs
- [IMPALA-5198] - Error messages are sometimes dropped before reaching client
- [IMPALA-5207] - enable_distcc doesn't reset IMPALA_DISTCC_ENABLED
- [IMPALA-5208] - Forked breakpad process blocks indefinitely for WaitForContinueSignal and fails new Impalad process at startup
- [IMPALA-5217] - KuduTableSink checks null constraints incorrectly
- [IMPALA-5222] - Bits::Log2Ceiling eating 1% of CPU when running targeted-perf.
- [IMPALA-5224] - Remove repository.codehaus.org from Maven poms
- [IMPALA-5230] - Impala does not start under ASAN
- [IMPALA-5231] - S3 build fails because memory estimates changes
- [IMPALA-5232] - Parquet reader error message prints memory address instead of value
- [IMPALA-5235] - Query throws a NullPointerException on starting impala cluster with logging_level=3
- [IMPALA-5244] - data_errors/test_data_errors.py:56: in test_hdfs_file_open_fail on local filesystem build
- [IMPALA-5245] - buffer-allocator-test failed in ASAN build
- [IMPALA-5246] - Queries failing with "Process: memory limit exceeded" during ASAN builds
- [IMPALA-5247] - test_kudu_col_null_changed flaky
- [IMPALA-5251] - DecimalAvgFinalize() gets the wrong arg type
- [IMPALA-5252] - Java UDF returning string can lead to crash under memory pressure.
- [IMPALA-5257] - TestTableWriters.test_seq_writer_hive_compatibility fails in local file system build
- [IMPALA-5258] - Need to reenable building Impala-lzo in release mode
- [IMPALA-5261] - Heap use-after-free in HdfsSequenceTableWriter::ConsumeRow()
- [IMPALA-5262] - test_sort.py::test_analytic_order_by_random fails with assert
- [IMPALA-5267] - test_seq_writer_hive_compatibility hits error running statement on Hive
- [IMPALA-5268] - After canceling query on secure cluster coordinator node doesn't accept new connections
- [IMPALA-5273] - StringCompare is very slow
- [IMPALA-5287] - Add a test for skip.header.line.count on compressed files
- [IMPALA-5291] - statestore-test failed during exhaustive testing of ASF RELEASE build
- [IMPALA-5294] - Kudu INSERT partitioning fails with constants
- [IMPALA-5295] - "Process: memory limit exceeded" in shell tests during asf-master-core-asan build
- [IMPALA-5297] - free-pool-test may be OOM killed on jenkins.impala.io runs
- [IMPALA-5302] - tcmalloc contention limits CPU utilization on machines with >40 logical processors
- [IMPALA-5305] - query_test/test_observability.py failing on s3, localFS and Isilon after recent changes to test data
- [IMPALA-5318] - Impala does not always generate fully qualified table names in audit events
- [IMPALA-5319] - data_errors/test_data_errors.py::TestHdfsScanNodeErrors failing on asf-master-exhaustive
- [IMPALA-5322] - Potential crash in Frontend & Catalog JNI startup
- [IMPALA-5324] - Fix version check in EvalDictionaryFilters
- [IMPALA-5330] - Impala tests never use or set a secondary FS, so TestMultipleFilesystems is always skipped
- [IMPALA-5331] - Use new libHDFS API to address "Unknown Error 255"
- [IMPALA-5338] - Fix Kudu timestamp default values
- [IMPALA-5339] - IMPALA-4166 breaks queries on tables with sort.column that do an expr rewrite
- [IMPALA-5340] - Query profile and debug webpage can disagree about 'Query State'
- [IMPALA-5342] - GetTables() Thrift call does not fill up the table comments field
- [IMPALA-5343] - Sort by Column(s) added as part of inserting into Kudu table is incorrect
- [IMPALA-5349] - BufferedBlockMgrTest.NoDirsAllocationError failed to write earlier than expected
- [IMPALA-5354] - nocluster/noshuffle doesn't work for DML into Kudu tables
- [IMPALA-5357] - Reading Kudu timestamp causes severe kernel spinning due to locking in impala::TimestampValue::UnixTimeToPtime-> __tz_convert
- [IMPALA-5358] - Off-by-one error in testTableSample
- [IMPALA-5375] - Builds on CentOS 6.4 failing with broken python dependencies
- [IMPALA-5378] - Disk IO manager needs to understand ADLS
- [IMPALA-5379] - parquet_dictionary_filtering query option is not tested
- [IMPALA-5383] - Fix PARQUET_FILE_SIZE option for ADLS
- [IMPALA-5387] - Excessive logging to INFO and ERROR files when reading S3 data
- [IMPALA-5388] - wrong results under stress with secure cluster
- [IMPALA-5391] - Cannot compile UDFs with older GCC versions
- [IMPALA-5402] - Changing the log level on the Catalog doesn't work as expected
- [IMPALA-5411] - Excessive logging while queries are loading metadata from ImpalaServer::GetRuntimeProfileStr
- [IMPALA-5413] - test_seq_writer_hive_compatibility fails on a real cluster because test user lacks write access
- [IMPALA-5419] - PhjBuilder::Partition::InsertBatch() continues to make progress even after query cancellation
- [IMPALA-5426] - Metastore fails to start up
- [IMPALA-5479] - Propagate the argument 'type' for RawValue::Compare()
Task
- [IMPALA-2923] - Integration job should run full data load + exhaustive.
- [IMPALA-3398] - Move Impala documentation development to ASF
- [IMPALA-3557] - Add workaround to create BIGINT stored as Kudu's UNIXTIME_MICROS
- [IMPALA-4686] - parquet-reader doesn't know about INT96 columns
- [IMPALA-4803] - Write release notes for 2.8 and 2.9
- [IMPALA-4829] - Change default Kudu read behavior for "RYW"
- [IMPALA-5002] - Toolchain build flags should be associated with builds
- [IMPALA-5033] - update external hadoop ecosystem versions
- [IMPALA-5328] - [DOCS] Document Parquet enhancements
- [IMPALA-5329] - [DOCS] Document Kudu enhancements
Sub-task
- [IMPALA-2020] - Rounding should be done instead of truncating when casting DECIMAL to DECIMAL, FLOAT/DOUBLE to DECIMAL, DECIMAL to INT
- [IMPALA-2550] - Switch to per-query exec rpc
- [IMPALA-3202] - Add spilling support to new buffer pool
- [IMPALA-3203] - Implement scalable buffer recycling in buffer pool
- [IMPALA-3224] - Move Impala JIRA to ASF
- [IMPALA-3401] - Remove Cloudera Manager-related content from doc source
- [IMPALA-3402] - Remove CDH version number dependencies from doc source
- [IMPALA-3403] - Rework Impala installation instructions to be generic
- [IMPALA-3405] - Rework Impala upgrade instructions to be generic
- [IMPALA-3406] - Rework Impala FAQs to be generic
- [IMPALA-3410] - Rework Impala security info to be generic
- [IMPALA-3411] - Rework Impala data management / governance info to be generic
- [IMPALA-4014] - Introduce query-wide execution state.
- [IMPALA-4029] - Reduce memory requirements for storing THdfsFileDesc
- [IMPALA-4041] - Limit catalog and admission control updates to coordinator nodes only
- [IMPALA-4114] - Port relevant BufferedBlockMgr unit tests for BufferPool
- [IMPALA-4181] - Host rendered documentation on ASF resources
- [IMPALA-4251] - Define gerrit process for documentation updates and reviews
- [IMPALA-4351] - query generator random profile options for INSERT
- [IMPALA-4353] - random query generation for INSERTs
- [IMPALA-4355] - qgen: rework query execution to handle CRUD queries
- [IMPALA-4359] - qgen: add UPSERT support
- [IMPALA-4370] - DECIMAL divide result type (Impala TPC-DS query 11 result lost one row)
- [IMPALA-4643] - Consolidate links that point outside the Impala doc bundle
- [IMPALA-4650] - Add Protobuf 2.6.1 to toolchain and as a build dependency
- [IMPALA-4651] - Add libev 4.2.0 to toolchain and as a build dependency
- [IMPALA-4652] - Add crcutil to toolchain
- [IMPALA-4678] - Set up query mem tracker in QueryState
- [IMPALA-4758] - Upgrade gutil to recent Kudu version
- [IMPALA-4809] - Add codegen GetConstant() for query options
- [IMPALA-4811] - Add strict mode tests for DECIMAL overflow of precision/scale in text file parsing
- [IMPALA-4813] - DECIMAL div/mod/multiply rounding
- [IMPALA-4821] - DECIMAL AVG() result type
- [IMPALA-4831] - Clients can violate BufferPool invariants by calling ReservationTracker methods directly.
- [IMPALA-4877] - Incorrect precedence of unary minus and plus
- [IMPALA-4884] - Add JVM heap and non-heap usage in memory metrics and UI
- [IMPALA-4885] - Add JVM thread stacktraces and synchronization info in web UI
- [IMPALA-4926] - Upgrade LZ4 to recent version
- [IMPALA-4984] - [DOCS] Remove Cloudera copyright information from codeblocks
- [IMPALA-4996] - Single-threaded KuduScanNode
- [IMPALA-5006] - [DOCS] Remove Cloudera-specific chunks of content tagged audience=hidden from security guide
- [IMPALA-5057] - Upgrade glog and gflags to most recent releases
- [IMPALA-5106] - KRPC DCHECK hit when closing DataStreamRecvr
- [IMPALA-5113] - Buffer pool unpinned invariant does not take into account multiply-pinned bytes
- [IMPALA-5124] - Fix BufferPool handling of scratch read errors
- [IMPALA-5130] - MemTracker::EnableReservationReporting() is not thread-safe
- [IMPALA-5147] - Add the ability to exclude coordinators from query execution
- [IMPALA-5166] - Clean up BufferPool profile counters
- [IMPALA-5174] - Suppress kudu flags that aren't relevant to Impala
- [IMPALA-5184] - Get Impala working against Hive2 APIs
- [IMPALA-5228] - test_coordinators custom cluster test fails after rebase
- [IMPALA-5309] - Implement TABLESAMPLE for HDFS tables
- [IMPALA-5326] - [DOCS] Document REPLACE() function
- [IMPALA-5359] - Document SORT BY syntax for CREATE TABLE and ALTER TABLE
- [IMPALA-5370] - Document REFRESH FUNCTIONS syntax
- [IMPALA-5371] - Document TIMESTAMP support for Kudu tables
- [IMPALA-5372] - Document new ALTER TABLE ADD COLUMNS options for Kudu tables
- [IMPALA-5374] - Document perf improvement from IS [NOT] NULL pushdown to Kudu
- [IMPALA-5382] - Document ADLS support for Impala
Project
- [IMPALA-5253] - Use appropriate transport for the StatestoreSubscriber