Apache Impala Guide
Introducing Apache Impala
Concepts and Architecture
Components
Developing Applications
Role in the Hadoop Ecosystem
Deployment Planning
Requirements
Designing Schemas
Installing Impala
Managing Impala
Post-Installation Configuration for Impala
Upgrading Impala
Starting Impala
Modifying Impala Startup Options
Tutorials
Administration
Setting Timeouts
Load-Balancing Proxy for HA
Managing Disk Space
Impala Security
Security Guidelines for Impala
Securing Impala Data and Log Files
Installation Considerations for Impala Security
Securing the Hive Metastore Database
Securing the Impala Web User Interface
Configuring TLS/SSL for Impala
Impala Authorization
Impala Authentication
Enabling Kerberos Authentication for Impala
Enabling LDAP Authentication for Impala
Using Multiple Authentication Methods with Impala
Configuring Impala Delegation for Clients
Auditing
Viewing Lineage Info
SQL Reference
Comments
Data Types
ARRAY Complex Type (Impala 2.3 or higher only)
BIGINT
BOOLEAN
CHAR
DATE
DECIMAL
DOUBLE
FLOAT
INT
MAP Complex Type (Impala 2.3 or higher only)
REAL
SMALLINT
STRING
STRUCT Complex Type (Impala 2.3 or higher only)
TIMESTAMP
Customizing Time Zones
TINYINT
VARCHAR
Complex Types (Impala 2.3 or higher only)
Querying Arrays
Zipping Unnest on Arrays from Views
Literals
SQL Operators
Schema Objects and Object Names
Aliases
Databases
Functions
Identifiers
Tables
Views
Transactions
SQL Statements
DDL Statements
DML Statements
ALTER DATABASE
ALTER TABLE
ALTER VIEW
COMMENT
COMPUTE STATS
CREATE DATABASE
CREATE FUNCTION
CREATE ROLE
CREATE TABLE
CREATE VIEW
DELETE
DESCRIBE
DROP DATABASE
DROP FUNCTION
DROP ROLE
DROP STATS
DROP TABLE
DROP VIEW
EXPLAIN
GRANT
INSERT
INVALIDATE METADATA
LOAD DATA
REFRESH
REFRESH AUTHORIZATION
REFRESH FUNCTIONS
REVOKE
SELECT
Joins
ORDER BY Clause
GROUP BY Clause
HAVING Clause
LIMIT Clause
OFFSET Clause
UNION Clause
Subqueries
TABLESAMPLE Clause
WITH Clause
DISTINCT Operator
SET
ABORT_ON_ERROR
ALLOW_ERASURE_CODED_FILES
ALLOW_UNSUPPORTED_FORMATS
APPX_COUNT_DISTINCT
BATCH_SIZE
BROADCAST_BYTES_LIMIT
BUFFER_POOL_LIMIT
COMPRESSION_CODEC
COMPUTE_STATS_MIN_SAMPLE_SIZE
DEBUG_ACTION
DECIMAL_V2
DEFAULT_FILE_FORMAT
DEFAULT_HINTS_INSERT_STATEMENT
DEFAULT_JOIN_DISTRIBUTION_MODE
DEFAULT_SPILLABLE_BUFFER_SIZE
DEFAULT_TRANSACTIONAL_TYPE
DELETE_STATS_IN_TRUNCATE
DISABLE_CODEGEN
DISABLE_CODEGEN_ROWS_THRESHOLD
DISABLE_HBASE_NUM_ROWS_ESTIMATE
DISABLE_ROW_RUNTIME_FILTERING
DISABLE_STREAMING_PREAGGREGATIONS
DISABLE_UNSAFE_SPILLS
ENABLE_EXPR_REWRITES
EXEC_SINGLE_NODE_ROWS_THRESHOLD
EXEC_TIME_LIMIT_S
EXPAND_COMPLEX_TYPES
EXPLAIN_LEVEL
MAX_NUM_RUNTIME_FILTERS
FETCH_ROWS_TIMEOUT_MS
JOIN_ROWS_PRODUCED_LIMIT
HBASE_CACHE_BLOCKS
HBASE_CACHING
IDLE_SESSION_TIMEOUT
KUDU_READ_MODE
LIVE_PROGRESS
LIVE_SUMMARY
MAX_ERRORS
MAX_MEM_ESTIMATE_FOR_ADMISSION
MAX_RESULT_SPOOLING_MEM
MAX_ROW_SIZE
MAX_SCAN_RANGE_LENGTH
MAX_SPILLED_RESULT_SPOOLING_MEM
MEM_LIMIT
MIN_SPILLABLE_BUFFER_SIZE
MT_DOP
NUM_NODES
NUM_ROWS_PRODUCED_LIMIT
NUM_SCANNER_THREADS
OPTIMIZE_PARTITION_KEY_SCANS
PARQUET_COMPRESSION_CODEC
PARQUET_ANNOTATE_STRINGS_UTF8
PARQUET_ARRAY_RESOLUTION
PARQUET_DICTIONARY_FILTERING
PARQUET_FALLBACK_SCHEMA_RESOLUTION
PARQUET_FILE_SIZE
PARQUET_OBJECT_STORE_SPLIT_SIZE
PARQUET_PAGE_ROW_COUNT_LIMIT
PARQUET_READ_STATISTICS
PARQUET_READ_PAGE_INDEX
PARQUET_WRITE_PAGE_INDEX
PREFETCH_MODE
QUERY_TIMEOUT_S
REFRESH_UPDATED_HMS_PARTITIONS
REPLICA_PREFERENCE
REQUEST_POOL
RESOURCE_TRACE_RATIO
RETRY_FAILED_QUERIES
ENABLED_RUNTIME_FILTER_TYPES
RUNTIME_BLOOM_FILTER_SIZE
RUNTIME_FILTER_MAX_SIZE
RUNTIME_FILTER_MIN_SIZE
RUNTIME_FILTER_MODE
RUNTIME_FILTER_WAIT_TIME_MS
S3_SKIP_INSERT_STAGING
SCAN_BYTES_LIMIT
SCHEDULE_RANDOM_REPLICA
SCRATCH_LIMIT
SHUFFLE_DISTINCT_EXPRS
SPOOL_QUERY_RESULTS
SUPPORT_START_OVER
SYNC_DDL
THREAD_RESERVATION_AGGREGATE_LIMIT
THREAD_RESERVATION_LIMIT
TIMEZONE
TOPN_BYTES_LIMIT
USE_NULL_SLOTS_CACHE
UTF8_MODE
SHOW
SHUTDOWN
TRUNCATE TABLE
UPDATE
UPSERT
USE
VALUES
Optimizer Hints
Built-In Functions
Mathematical Functions
Bit Functions
Type Conversion Functions
Date and Time Functions
Conditional Functions
String Functions
Miscellaneous Functions
Aggregate Functions
APPX_MEDIAN
AVG
COUNT
GROUP_CONCAT
MAX
MIN
NDV
STDDEV, STDDEV_SAMP, STDDEV_POP
SUM
VARIANCE, VARIANCE_SAMP, VARIANCE_POP, VAR_SAMP, VAR_POP
Analytic Functions
User-Defined Functions (UDFs)
SQL Differences Between Impala and Hive
Porting SQL
UTF-8 Support
Performance Tuning
Performance Best Practices
Join Performance
Table and Column Statistics
Benchmarking
Controlling Resource Usage
Runtime Filtering
HDFS Caching
HDFS Block Skew
Data Cache for Remote Reads
Testing Impala Performance
EXPLAIN Plans and Query Profiles
Scalability Considerations
Scaling Limits and Guidelines
Dedicated Coordinators Optimization
Metadata Management
Resource Management
Admission Control and Query Queuing
Configuring Admission Control
Partitioning
File Formats
Text Data Files
Parquet Data Files
ORC Data Files
Avro Data Files
Hudi Data Files
RCFile Data Files
SequenceFile Data Files
Using Impala to Query Kudu Tables
HBase Tables
Iceberg Tables
S3 Tables
ADLS Tables
Isilon Storage
Ozone Storage
Logging
Client Access
The Impala Shell
Configuration Options
Connecting to impalad
Running Commands and SQL Statements
Command Reference
Configuring Impala to Work with ODBC
Configuring Impala to Work with JDBC
Spooling Impala Query Results
Fault Tolerance
Impala Transparent Query Retries
Impala Node Blacklisting
Troubleshooting Impala
Web User Interface
Breakpad Minidumps
Ports Used by Impala
Impala Reserved Words
Impala Frequently Asked Questions
Impala Release Notes
New Features in Apache Impala
Incompatible Changes and Limitations in Apache Impala
Known Issues and Workarounds in Impala
Fixed Issues in Apache Impala