Configuring Impala for High Availability
The Impala StateStore checks the health of all Impala daemons in a cluster and continuously relays its findings to each of those daemons. The Catalog stores metadata for databases, tables, partitions, resource usage information, configuration settings, and other objects managed by Impala. If the StateStore and Catalog daemons each run as a single instance in an Impala cluster, each one is a single point of failure. Although Impala coordinators and executors continue to execute queries when the StateStore node is down, they no longer receive state updates, which degrades admission control and cluster membership updates. To mitigate this, you can deploy a pair of StateStore instances and a pair of Catalog instances in an Impala cluster so that the cluster can survive the failure of a StateStore or Catalog instance.
Prerequisite: