CONNECT command.
'hs2' or 'hs2-http' protocol (--protocol option).
You can connect to any Impala daemon (impalad), and that daemon coordinates the execution of all queries sent to it.
For simplicity during development, you might always connect to the same host, perhaps running impala-shell on the same host as impalad and specifying the hostname as localhost.
In a production environment, you might enable load balancing, in which you connect to a specific host/port combination but queries are forwarded to arbitrary hosts. This technique spreads the overhead of acting as the coordinator node among all the Impala daemons in the cluster. See Using Impala through a Proxy for High Availability for details.
To connect to an Impala daemon during shell startup:
Use the -i option to the impala-shell interpreter to specify the connection information for that instance of impalad:
# When you are connecting to an impalad running on the same machine.
# The prompt will reflect the current hostname.
$ impala-shell
# When you are connecting to an impalad running on a remote machine, and impalad is listening
# on a non-default port over the HTTP HiveServer2 protocol.
$ impala-shell -i some.other.hostname:port_number --protocol='hs2-http'
# When you are connecting to an impalad running on a remote machine, and impalad is listening
# on a non-default port.
$ impala-shell -i some.other.hostname:port_number
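The startup variants above differ only in whether a port and a protocol are supplied, so they can be folded into one small wrapper. The following is a minimal sketch in POSIX sh, not part of impala-shell itself: the function name impala_connect and the IMPALA_SHELL override variable are hypothetical names chosen for illustration. It uses only the -i and --protocol options shown above.

```shell
#!/bin/sh
# Hypothetical helper; impala_connect and IMPALA_SHELL are illustrative
# names and are not provided by impala-shell.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# Usage: impala_connect HOST [PORT] [PROTOCOL]
impala_connect() {
    host="${1:-}"
    port="${2:-}"
    protocol="${3:-}"

    if [ -z "$host" ]; then
        echo "usage: impala_connect HOST [PORT] [PROTOCOL]" >&2
        return 2
    fi

    # Append ":port" only when a port was given; otherwise impala-shell
    # falls back to its default port.
    target="$host"
    if [ -n "$port" ]; then
        target="$host:$port"
    fi

    # Pass --protocol only when the caller asked for one.
    if [ -n "$protocol" ]; then
        "$IMPALA_SHELL" -i "$target" --protocol="$protocol"
    else
        "$IMPALA_SHELL" -i "$target"
    fi
}
```

With this in place, impala_connect some.other.hostname port_number hs2-http would correspond to the HTTP HiveServer2 example above, and impala_connect some.other.hostname port_number to the plain remote connection.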
To connect to an Impala daemon in the impala-shell session:
$ impala-shell
Use the connect command to connect to an Impala instance. Enter a command of the form:
[Not connected] > connect impalad-host
To start impala-shell in a specific database:
You can use all the same connection options as in previous examples. For simplicity, these examples assume that you are logged into one of the Impala daemons.
Use the -d option to the impala-shell interpreter to connect and immediately switch to the specified database, without the need for a USE statement or fully qualified names:
# Subsequent queries with unqualified names operate on
# tables, views, and so on inside the database named 'staging'.
$ impala-shell -i localhost -d staging
# It is common during development, ETL, benchmarking, and so on
# to have different databases containing the same table names
# but with different contents or layouts.
$ impala-shell -i localhost -d parquet_snappy_compression
$ impala-shell -i localhost -d parquet_gzip_compression
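When several databases hold the same table names, as in the examples above, the same statement can be run against each one by varying only the -d value. The sketch below assumes a POSIX sh environment and combines -d with the -q option used in the non-interactive examples of this section. The function name run_in_each_db and the IMPALA_SHELL override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; run_in_each_db and IMPALA_SHELL are illustrative
# names and are not provided by impala-shell.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# Usage: run_in_each_db QUERY DATABASE [DATABASE...]
run_in_each_db() {
    query="$1"
    shift

    for db in "$@"; do
        # Label each run so the results can be told apart.
        echo "== $db =="
        # Unqualified table names in the query resolve inside $db.
        # Stop at the first database for which the command fails.
        "$IMPALA_SHELL" -i localhost -d "$db" -q "$query" || return 1
    done
}
```

For example, run_in_each_db 'show tables' parquet_snappy_compression parquet_gzip_compression would list the tables in both databases in turn.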
To run one or several statements in non-interactive mode:
You can use all the same connection options as in previous examples. For simplicity, these examples assume that you are logged into one of the Impala daemons.
Use the -q option to run a single statement, or the -f option to run a sequence of statements from a file. The impala-shell command returns immediately, without going into the interactive interpreter.
# A utility command that you might run while developing shell scripts
# to manipulate HDFS files.
$ impala-shell -i localhost -d database_of_interest -q 'show tables'
# A sequence of CREATE TABLE, CREATE VIEW, and similar DDL statements
# can go into a file to make the setup process repeatable.
$ impala-shell -i localhost -d database_of_interest -f recreate_tables.sql
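Because the -q and -f forms return to the calling shell, they can be chained in a script. The following is a minimal sketch, assuming POSIX sh and assuming that the exit status of the impala-shell command is a usable signal of failure. The function name rebuild_and_list and the IMPALA_SHELL override variable are hypothetical. It runs a file of statements with -f, then lists the resulting tables with -q.

```shell
#!/bin/sh
# Hypothetical helper; rebuild_and_list and IMPALA_SHELL are illustrative
# names and are not provided by impala-shell.
IMPALA_SHELL="${IMPALA_SHELL:-impala-shell}"

# Usage: rebuild_and_list DATABASE SQL_FILE
rebuild_and_list() {
    db="$1"
    sql_file="$2"

    # Fail early if the statement file is missing or unreadable.
    if [ ! -r "$sql_file" ]; then
        echo "cannot read $sql_file" >&2
        return 2
    fi

    # Run every statement in the file inside the chosen database.
    if ! "$IMPALA_SHELL" -i localhost -d "$db" -f "$sql_file"; then
        echo "statements in $sql_file failed" >&2
        return 1
    fi

    # Show what the file created.
    "$IMPALA_SHELL" -i localhost -d "$db" -q 'show tables'
}
```

A call such as rebuild_and_list database_of_interest recreate_tables.sql would mirror the two commands shown above as a single repeatable step.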