SET Statement
Specifies values for query options that control the runtime behavior of other statements within the same session.
In Impala 2.5 and higher, SET
also defines user-specified substitution variables for
the impala-shell interpreter. This feature uses the SET
command
built into impala-shell instead of the SQL SET
statement.
Therefore the substitution mechanism only works with queries processed by impala-shell,
not with queries submitted through JDBC or ODBC.
In Impala 2.11 and higher, the output of the SET
statement changes in some important ways:
-
The options are divided into groups:
Regular Query Options
,Advanced Query Options
,Development Query Options
, andDeprecated Query Options
. -
The advanced options are intended for use in specific kinds of performance tuning and debugging scenarios. The development options are related to internal development of Impala or features that are not yet finalized; these options might be changed or removed without notice. The deprecated options are related to features that are removed or changed so that the options no longer have any purpose; these options might be removed in future versions.
-
By default, only the first two groups (regular and advanced) are displayed by the
SET
command. Use the syntaxSET ALL
to see all groups of options. -
impala-shell options and user-specified variables are always displayed at the end of the list of query options, after all appropriate option groups.
-
When the
SET
command is run through the JDBC or ODBC interfaces, the result set has a new third column,level
, indicating which group each option belongs to. The same distinction ofSET
returning the regular and advanced options, andSET ALL
returning all option groups, applies to JDBC and ODBC also.
Syntax:
SET [query_option=option_value]
SET ALL
SET
and SET ALL
with no arguments return a
result set consisting of all the applicable query options and their current values.
The query option name and any string argument values are case-insensitive.
Each query option has a specific allowed notation for its arguments. Boolean options can be enabled and
disabled by assigning values of either true
and false
, or
1
and 0
. Some numeric options accept a final character signifying the unit,
such as 2g
for 2 gigabytes or 100m
for 100 megabytes. See
Query Options for the SET Statement for the details of each query option.
Setting query options during impala-shell invocation:
In Impala 2.11 and higher, you can use one or more command-line options
of the form --query_option=option=value
when running the impala-shell command. The corresponding query option settings
take effect for that impala-shell session.
User-specified substitution variables:
In Impala 2.5 and higher, you can specify your own names and string substitution values
within the impala-shell interpreter. Once a substitution variable is set up,
its value is inserted into any SQL statement in that same impala-shell session
that contains the notation ${var:varname}
.
Using SET
in an interactive impala-shell session overrides
any value for that same variable passed in through the --var=varname=value
command-line option.
For example, to set up some default parameters for report queries, but then override those default within an impala-shell session, you might issue commands and statements such as the following:
-- Initial setup for this example.
create table staging_table (s string);
insert into staging_table values ('foo'), ('bar'), ('bletch');
create table production_table (s string);
insert into production_table values ('North America'), ('EMEA'), ('Asia');
quit;
-- Start impala-shell with user-specified substitution variables,
-- run a query, then override the variables with SET and run the query again.
$ impala-shell --var=table_name=staging_table --var=cutoff=2
... banner message ...
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from staging_table order by s limit 2
+--------+
| s |
+--------+
| bar |
| bletch |
+--------+
Fetched 2 row(s) in 1.06s
[localhost:21000] > set var:table_name=production_table;
Variable TABLE_NAME set to production_table
[localhost:21000] > set var:cutoff=3;
Variable CUTOFF set to 3
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from production_table order by s limit 3
+---------------+
| s |
+---------------+
| Asia |
| EMEA |
| North America |
+---------------+
The following example shows how SET ALL
with no parameters displays
all user-specified substitution variables, and how UNSET
removes
the substitution variable entirely:
[localhost:21000] > set all;
Query options (defaults shown in []):
ABORT_ON_ERROR: [0]
COMPRESSION_CODEC: []
DISABLE_CODEGEN: [0]
...
Advanced Query Options:
APPX_COUNT_DISTINCT: [0]
BUFFER_POOL_LIMIT: []
DEFAULT_JOIN_DISTRIBUTION_MODE: [0]
...
Development Query Options:
BATCH_SIZE: [0]
DEBUG_ACTION: []
DECIMAL_V2: [0]
...
Deprecated Query Options:
ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
ALLOW_UNSUPPORTED_FORMATS: [0]
DEFAULT_ORDER_BY_LIMIT: [-1]
...
Shell Options
LIVE_PROGRESS: False
LIVE_SUMMARY: False
Variables:
CUTOFF: 3
TABLE_NAME: staging_table
[localhost:21000] > unset var:cutoff;
Unsetting variable CUTOFF
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Error: Unknown variable CUTOFF
See Running Commands and SQL Statements in impala-shell for more examples of using the
--var
, SET
, and ${var:varname}
substitution technique in impala-shell.
Usage notes:
MEM_LIMIT
is probably the most commonly used query option. You can specify a high value to
allow a resource-intensive query to complete. For testing how queries would work on memory-constrained
systems, you might specify an artificially low value.
Complex type considerations:
Examples:
The following example sets some numeric and some Boolean query options to control usage of memory, disk space, and timeout periods, then runs a query whose success could depend on the options in effect:
set mem_limit=64g;
set DISABLE_UNSAFE_SPILLS=true;
set parquet_file_size=400m;
set RESERVATION_REQUEST_TIMEOUT=900000;
insert overwrite parquet_table select c1, c2, count(c3) from text_table group by c1, c2, c3;
Added in: Impala 2.0.0
SET
has always been available as an impala-shell command. Promoting it to
a SQL statement lets you use this feature in client applications through the JDBC and ODBC APIs.
Cancellation: Cannot be cancelled.
HDFS permissions: This statement does not touch any HDFS files or directories, therefore no HDFS permissions are required.
Related information:
See Query Options for the SET Statement for the query options you can adjust using this statement.