Starting Impala

To activate Impala if it is installed but not yet started:

  1. Set any necessary configuration options for the Impala services. See Modifying Impala Startup Options for details.
  2. Start one instance of the Impala statestore. The statestore helps Impala to distribute work efficiently, and to continue running in the event of availability problems for other Impala nodes. If the statestore becomes unavailable, Impala continues to function.
  3. Start one instance of the Impala catalog service.
  4. Start the main Impala service on one or more DataNodes, ideally on all DataNodes to maximize local processing and avoid network traffic due to remote reads.

Once Impala is running, you can conduct interactive experiments using the instructions in Impala Tutorials and try Using the Impala Shell (impala-shell Command).

Starting Impala from the Command Line

To start the Impala state store and Impala from the command line or a script, you can either use the service command or you can start the daemons directly through the impalad, statestored, and catalogd executables.

Start the Impala statestore and then start impalad instances. You can modify the values the service initialization scripts use when starting the statestore and Impala by editing /etc/default/impala.

Start the statestore service using a command similar to the following:

$ sudo service impala-state-store start

Start the catalog service using a command similar to the following:

$ sudo service impala-catalog start

Start the Impala service on each DataNode using a command similar to the following:

$ sudo service impala-server start
Note:

In Impala 2.5 and higher, Impala UDFs and UDAs written in C++ are persisted in the metastore database. Java UDFs are also persisted, if they were created with the new CREATE FUNCTION syntax for Java UDFs, where the Java function argument and return types are omitted. Java-based UDFs created with the old CREATE FUNCTION syntax do not persist across restarts because they are held in the memory of the catalogd daemon. Until you re-create such Java UDFs using the new CREATE FUNCTION syntax, you must reload those Java-based UDFs by running the original CREATE FUNCTION statements again each time you restart the catalogd daemon. Prior to Impala 2.5 the requirement to reload functions after a restart applied to both C++ and Java functions.

If any of the services fail to start, review: