Installing Impala

Impala is an open-source analytic database for Apache Hadoop that responds quickly to queries.

Follow these steps to set up Impala on a cluster by building from source:

What Is Included in an Impala Installation

Impala is made up of a set of components that can be installed on multiple nodes throughout your cluster. For performance, the key installation step is to install the impalad daemon, which does most of the query processing work, on every DataNode in the cluster.
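
As a rough illustration of the placement rule above, the following Python sketch lists the DataNodes that do not yet run impalad. The inventory dictionary, the host names, and the service labels "datanode" and "namenode" are hypothetical and exist only for this example; in practice you would take this information from whatever cluster management tooling you use.

```python
def datanodes_missing_impalad(inventory):
    """Return the sorted host names that run a DataNode but not impalad.

    `inventory` is a hypothetical mapping of host name -> set of service
    names running on that host. It is not an Impala or Hadoop API.
    """
    return sorted(
        host
        for host, services in inventory.items()
        if "datanode" in services and "impalad" not in services
    )


if __name__ == "__main__":
    # Made-up example cluster: two DataNodes and one master-only host.
    example_inventory = {
        "node1.example.com": {"datanode", "impalad"},
        "node2.example.com": {"datanode"},
        "master.example.com": {"namenode"},
    }
    for host in datanodes_missing_impalad(example_inventory):
        print(f"impalad is not installed on DataNode {host}")
```

Hosts that do not run a DataNode are ignored, because the guidance above concerns only the nodes that store the data being queried.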

Impala primarily consists of these executables, which should be available after you build from source:

Before you start working with Impala, ensure that all necessary prerequisites are met. See Impala Requirements for details.