This topic describes various knobs you can use to control how Impala manages its metadata in order to improve performance and scalability.
In previous versions of Impala, every coordinator kept a replica of the entire catalogd cache, consuming a large amount of memory on each coordinator with no option to evict. Metadata always propagated through the statestored and suffered from head-of-line blocking: for example, one user loading a big table could block another user loading a small table.
With this new feature, the coordinators pull metadata as needed from catalogd and cache it locally. The cached metadata is evicted automatically under memory pressure.
The granularity of on-demand metadata fetches is now at the partition level between the coordinator and catalogd. Common use cases such as adding or dropping partitions do not trigger unnecessary serialization/deserialization of large metadata.
This feature is disabled by default.
When use_local_catalog is enabled (set to True) on the impalad coordinators, the following flags configure various parameters as described below. Changing the default values of these flags is not recommended.
To keep the size of metadata bounded, catalogd periodically scans all the tables and invalidates those not recently used. There are two types of configurations for catalogd and impalad.
Catalogd invalidates tables that have not been used within the specified time period (in seconds).
Under memory pressure in catalogd, Impala invalidates 10% of the least recently used tables.
Automatic invalidation of metadata provides more stability with lower chances of running out of memory, but the feature could potentially cause performance issues and may require tuning.
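The two invalidation policies described above can be sketched as follows. This is a minimal illustration, not Impala's implementation: the function names, the in-memory dictionary of last-access timestamps, and the rounding rule for the 10% cut are all assumptions made for the example.

```python
import time


def tables_past_timeout(last_used, timeout_s, now=None):
    """Time-based policy: names of tables not used within `timeout_s` seconds.

    `last_used` maps a table name to its last-access time in epoch seconds
    (a hypothetical structure used only for this sketch).
    """
    now = time.time() if now is None else now
    return sorted(name for name, ts in last_used.items() if now - ts > timeout_s)


def least_recently_used(last_used, percent=10):
    """Memory-pressure policy: the least recently used `percent` of tables.

    Rounding up, so that at least one table is chosen whenever any exist,
    is an assumption of this sketch.
    """
    count = (len(last_used) * percent + 99) // 100  # integer ceiling
    oldest_first = sorted(last_used, key=lambda name: last_used[name])
    return oldest_first[:count]
```

Either function only selects candidates; what invalidation costs afterwards (reloading a table on its next use) is the tuning trade-off mentioned above.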
When tools such as Hive and Spark are used to process the raw data ingested into Hive tables, new HMS metadata (databases, tables, partitions) and filesystem metadata (new files in existing partitions or tables) are generated. In previous versions of Impala, users had to manually issue INVALIDATE or REFRESH commands to pick up this new information.
When automatic invalidate/refresh of metadata is enabled, catalogd polls Hive Metastore (HMS) notification events at a configurable interval and processes the following changes:
Invalidates the tables when it receives the ALTER TABLE event.
Refreshes the partition when it receives ALTER, ADD, or DROP partition events.
Adds the tables or databases when it receives CREATE TABLE or CREATE DATABASE events.
Removes the tables from catalogd when it receives DROP TABLE or DROP DATABASE events.
Refreshes the table and partitions when it receives INSERT events. If the table is not loaded at the time of processing the INSERT event, the event processor does not need to refresh the table and skips it.
Updates catalogd when it receives ALTER DATABASE events. The following changes are supported. This event does not invalidate the tables in the database.
Changing the default location of the database does not move the existing tables of that database to the new location. Only tables created afterwards use the new default location, and only when the CREATE TABLE statement does not specify a location.
This feature is controlled by the --hms_event_polling_interval_s flag. Start catalogd with the --hms_event_polling_interval_s flag set to a positive integer to enable the feature and set the polling frequency in seconds. A value of less than 5 seconds is recommended.
The following use cases are not supported:
When data is added to or removed from a table by changing files directly on the filesystem, bypassing HMS, HMS does not generate the INSERT event, and the event processor will not invalidate the corresponding table or refresh the corresponding partition. In such cases, use the LOAD DATA command to load the data so that the event processor can act on the events generated by the LOAD command.
A Spark API call that saves data to a specified location does not generate events in HMS, for example:
Seq((1, 2)).toDF("i", "j").write.save("/user/hive/warehouse/spark_etl.db/customers/date=01012019")
This feature is turned off by default, with the --hms_event_polling_interval_s flag set to 0.
To use the HMS event based metadata sync:
Add the following entries to the hive-site.xml of the Hive Metastore service.
<property>
<name>hive.metastore.transactional.event.listeners</name>
<value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
</property>
<property>
<name>hive.metastore.dml.events</name>
<value>true</value>
</property>
Save hive-site.xml.
Set the hive.metastore.dml.events configuration key to true in the HiveServer2 service's hive-site.xml. This configuration key needs to be set to true in both Hive services, HiveServer2 and Hive Metastore.
Set the hive.metastore.dml.events configuration key to true in the hive-site.xml used by the Spark applications (typically, /etc/hive/conf/hive-site.xml) so that INSERT events are generated when a Spark application inserts data into existing tables and partitions.
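To confirm that a given hive-site.xml carries the two settings listed above, a small check along these lines may help. It is a sketch only: the function name is hypothetical, and it assumes the file uses the usual layout of property elements with name and value children. The listener key may hold a comma-separated list, so the check tests membership rather than equality.

```python
import xml.etree.ElementTree as ET

# Settings named in the steps above.
REQUIRED = {
    "hive.metastore.transactional.event.listeners":
        "org.apache.hive.hcatalog.listener.DbNotificationListener",
    "hive.metastore.dml.events": "true",
}


def missing_hms_event_settings(hive_site_xml):
    """Return the required settings that are absent or set differently.

    `hive_site_xml` is the text of a hive-site.xml file. The result maps
    each unsatisfied key to the value it is expected to contain.
    """
    root = ET.fromstring(hive_site_xml)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        found[name] = [item.strip() for item in value.split(",")]
    return {
        key: expected
        for key, expected in REQUIRED.items()
        if expected not in found.get(key, [])
    }
```

An empty result means both keys are present. Run the check against each copy of hive-site.xml mentioned above (Hive Metastore, HiveServer2, and the one used by Spark), keeping in mind that the steps require the listener entry only for the Hive Metastore service.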
When the --hms_event_polling_interval_s flag is set to a non-zero value for your catalogd, event-based automatic invalidation is enabled for all databases and tables. For fine-grained control over which tables or databases are synced using events, use the impala.disableHmsSync property to disable event processing at the table or database level.
When you add the impala.disableHmsSync key to DBPROPERTIES or TBLPROPERTIES, the HMS event based sync is turned on or off. The value of the impala.disableHmsSync property determines whether event processing is disabled for a particular table or database.
If 'impala.disableHmsSync'='true', the events for that table or database are ignored and not synced with HMS.
If 'impala.disableHmsSync'='false' or if impala.disableHmsSync is not set, the automatic sync with HMS is enabled if the --hms_event_polling_interval_s global flag is set to non-zero.
To control the sync for a database, set the impala.disableHmsSync database property in Hive, because Impala currently does not support setting database properties:
CREATE DATABASE <name> WITH DBPROPERTIES ('impala.disableHmsSync'='true');
To control the sync for a new table:
CREATE TABLE <name> (<columns>) TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
To control the sync for an existing table:
ALTER TABLE <name> SET TBLPROPERTIES ('impala.disableHmsSync'='true' | 'false');
When both table and database level properties are set, the table level property takes precedence. If the table level property is not set, then the database level property is used to evaluate if the event needs to be processed or not.
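The precedence rule above can be written out as a short decision function. This is a sketch of the documented behavior, not Impala's code: the function name and the dictionary representation of properties are hypothetical, and treating the property value case-insensitively and a negative interval as disabled are assumptions.

```python
DISABLE_KEY = "impala.disableHmsSync"


def hms_sync_enabled(polling_interval_s, db_props=None, tbl_props=None):
    """Decide whether events for a table are processed.

    Order of evaluation, following the text above:
      1. The global polling interval must enable the feature.
      2. A table-level property, if set, decides.
      3. Otherwise a database-level property, if set, decides.
      4. Otherwise sync is enabled.
    """
    if polling_interval_s <= 0:  # 0 disables; negative handling is assumed
        return False
    for props in (tbl_props or {}, db_props or {}):
        if DISABLE_KEY in props:
            # Case-insensitive comparison is an assumption of this sketch.
            return props[DISABLE_KEY].strip().lower() != "true"
    return True
```

For example, a table with 'impala.disableHmsSync'='false' inside a database with 'impala.disableHmsSync'='true' is still synced, because the table-level value is consulted first.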
If the property is changed from true (meaning events are skipped) to false (meaning events are not skipped), you need to issue a manual INVALIDATE METADATA command to reset the event processor, because it does not know how many events have been skipped in the past and cannot know whether the object in the event is the latest. In such a case, the status of the event processor changes to NEEDS_INVALIDATE.
You can use the web UI of the catalogd to check the state of the automatic invalidate event processor.
The /events page provides a detailed view of the metrics of the event processor, including the min, max, mean, and median of the duration and rate metrics for all the counters listed on the /metrics#events page.
The /metrics#events page provides the following metrics about the HMS event processor.
Name | Description |
---|---|
events-processor.avg-events-fetch-duration | Average duration to fetch a batch of events and process it. |
events-processor.avg-events-process-duration | Average time taken to process a batch of events received from the Metastore. |
events-processor.events-received | Total number of the Metastore events received. |
events-processor.events-received-15min-rate | Exponentially weighted moving average (EWMA) of the number of events received in the last 15 min. This rate of events can be used to determine if there are spikes in event processor activity during certain hours of the day. |
events-processor.events-received-1min-rate | Exponentially weighted moving average (EWMA) of the number of events received in the last 1 min. This rate of events can be used to determine if there are spikes in event processor activity during certain hours of the day. |
events-processor.events-received-5min-rate | Exponentially weighted moving average (EWMA) of the number of events received in the last 5 min. This rate of events can be used to determine if there are spikes in event processor activity during certain hours of the day. |
events-processor.events-skipped | Total number of the Metastore events skipped. Events can be skipped based on certain flags at the table and database level. You can use this metric when making decisions about event processing. |
events-processor.status | Metastore event processor status, which shows whether events are being received. Possible states include NEEDS_INVALIDATE. |
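
For readers unfamiliar with the EWMA rates in the table, the idea can be sketched generically as below. This is not taken from Impala's source: the class name, the 5-second tick, and the formula for the smoothing factor are assumptions chosen to illustrate how a 1, 5, or 15 minute window changes how quickly the rate reacts to a spike.

```python
import math


class EwmaRate:
    """Exponentially weighted moving average of an event rate (events/second)."""

    def __init__(self, window_s, tick_s=5.0):
        self.tick_s = tick_s
        # Larger windows give a smaller alpha, so the rate reacts more slowly.
        self.alpha = 1.0 - math.exp(-tick_s / window_s)
        self.rate = None

    def tick(self, events_in_tick):
        """Fold the number of events seen during one tick into the average."""
        instant = events_in_tick / self.tick_s
        if self.rate is None:
            self.rate = instant  # seed with the first observation
        else:
            self.rate += self.alpha * (instant - self.rate)
        return self.rate
```

Feeding the same burst into a 1-minute and a 15-minute instance shows why the shorter window is the better spike detector: it moves much further toward the burst rate on each tick.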