Impala
Impala is the open source, native analytic database for Apache Hadoop.
impala::AdmissionController Class Reference

#include <admission-controller.h>

Classes

struct  PoolMetrics
 
struct  QueueNode
 

Public Member Functions

 AdmissionController (RequestPoolService *request_pool_service, MetricGroup *metrics, const std::string &backend_id)
 
 ~AdmissionController ()
 
Status AdmitQuery (QuerySchedule *schedule)
 
Status ReleaseQuery (QuerySchedule *schedule)
 
Status Init (StatestoreSubscriber *subscriber)
 Registers with the subscription manager.
 

Private Types

typedef boost::unordered_map< std::string, TPoolStats > PoolStatsMap
 Map of pool names to pool statistics.
 
typedef boost::unordered_set< std::string > PoolSet
 A set of pool names.
 
typedef boost::unordered_map< std::string, PoolStatsMap > PerBackendPoolStatsMap
 
typedef InternalQueue< QueueNode > RequestQueue
 
typedef boost::unordered_map< std::string, RequestQueue > RequestQueueMap
 Map of pool names to request queues.
 
typedef boost::unordered_map< std::string, PoolMetrics > PoolMetricsMap
 Map of pool names to pool metrics.
 
typedef boost::unordered_map< std::string, TPoolConfigResult > PoolConfigMap
 

Private Member Functions

void UpdatePoolStats (const StatestoreSubscriber::TopicDeltaMap &incoming_topic_deltas, std::vector< TTopicDelta > *subscriber_topic_updates)
 Statestore subscriber callback that updates the pool stats state.
 
void HandleTopicUpdates (const std::vector< TTopicItem > &topic_updates)
 
void HandleTopicDeletions (const std::vector< std::string > &topic_deletions)
 
void UpdateClusterAggregates (const std::string &pool_name)
 
void UpdateLocalMemUsage (const std::string &pool_name)
 
void AddPoolUpdates (std::vector< TTopicDelta > *subscriber_topic_updates)
 
void DequeueLoop ()
 Dequeues and admits queued queries when notified by dequeue_cv_.
 
Status CanAdmitRequest (const std::string &pool, const int64_t max_requests, const int64_t mem_limit, const QuerySchedule &schedule, bool admit_from_queue)
 
Status RejectRequest (const std::string &pool, const int64_t max_requests, const int64_t mem_limit, const int64_t max_queued, const QuerySchedule &schedule)
 
PoolMetrics * GetPoolMetrics (const std::string &pool_name)
 

Private Attributes

RequestPoolService * request_pool_service_
 
MetricGroup * metrics_
 Metrics subsystem access.
 
boost::scoped_ptr< Thread > dequeue_thread_
 Thread dequeuing and admitting queries.
 
const std::string backend_id_
 Unique id for this impalad, used to construct topic keys.
 
ThriftSerializer thrift_serializer_
 Serializes/deserializes TPoolStats when sending and receiving topic updates.
 
boost::mutex admission_ctrl_lock_
 
PoolStatsMap local_pool_stats_
 
PoolSet pools_for_updates_
 
PerBackendPoolStatsMap per_backend_pool_stats_map_
 
PoolStatsMap cluster_pool_stats_
 
RequestQueueMap request_queue_map_
 
PoolMetricsMap pool_metrics_map_
 
PoolConfigMap pool_config_cache_
 
boost::condition_variable dequeue_cv_
 
bool done_
 If true, tear down the dequeuing thread. This only happens in unit tests.
 

Static Private Attributes

static const std::string IMPALA_REQUEST_QUEUE_TOPIC
 

Detailed Description

The AdmissionController makes local admission decisions based on cluster state disseminated by the statestore. Requests are submitted for execution to a given pool via AdmitQuery(). A request is either admitted immediately, queued, or rejected. This decision is based on per-pool estimates of the total number of concurrently executing queries across the entire cluster, their memory usage, and the total number of queued requests across the entire cluster. When the number of concurrently executing queries goes above a configurable per-pool threshold, or the memory usage of those queries goes above a per-pool memory limit, requests are queued. When the total number of queued requests in a particular pool goes above a configurable threshold, incoming requests to that pool are rejected.

TODO: When we resolve users->pools, explain the model and configuration story. (There is one hard-coded pool right now, configurable via gflags.)

The pool statistics are updated by the statestore using the IMPALA_REQUEST_QUEUE_TOPIC topic. Every <impalad, pool> pair is sent as a topic update when pool statistics change, and the topic updates from other impalads are used to re-compute the total per-pool stats.

When there are queued requests, the number of executing queries drops below the configured maximum, and the memory usage of those queries is below the memory limit, a number of queued requests will be admitted according to the following formula:

N = (#local_pool_queued / #global_pool_queued) * (pool_limit - #global_pool_running)

If there is a memory limit specified but no limit on the number of running queries, all of the queued requests are dequeued and admitted, because we don't attempt to estimate request memory usage.

Because the pool statistics are only updated on statestore heartbeats and all decisions are made locally, the total pool statistics are estimates. As a result, more requests may be admitted or queued than the configured thresholds allow, so the thresholds are really soft limits.

Because the memory usage (tracked by the per-pool mem trackers) may not reflect the peak memory usage of a query for some time, it would be possible to over-admit requests if they are submitted much faster than running queries start using memory. To avoid this, we use the request's memory estimate from planning (even though we know these estimates may be off) before admitting a request, and we also keep track of the sum of the memory estimates for all running queries, per pool. The local, per-pool mem_usage is set to the maximum of the estimate and the actual per-pool current consumption. We then need to update the local and cluster mem_usage stats when admitting (in AdmitQuery() and DequeueLoop()) as well as in ReleaseQuery(). When requests are submitted very quickly and the memory estimates from planning are significantly off, this strategy can still result in over- or under-subscription, and this is not completely avoidable unless we can produce better estimates.

TODO: We can reduce the effect of very high estimates by using a weighted combination of the estimate and the actual consumption as a function of time.
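
The dequeue formula above can be sketched roughly as follows. This is an illustration only, not the code in admission-controller.cc: the function name NumToDequeue, its parameters, the integer rounding, and the clamping to zero and to the local queue length are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper illustrating
//   N = (#local_pool_queued / #global_pool_queued) * (pool_limit - #global_pool_running)
// Names, rounding, and clamping are assumptions, not taken from Impala.
int64_t NumToDequeue(int64_t local_queued, int64_t global_queued,
                     int64_t pool_limit, int64_t global_running) {
  if (local_queued <= 0 || global_queued <= 0) return 0;
  // Free slots in the pool according to the (estimated) cluster-wide stats.
  const int64_t free_slots = std::max<int64_t>(0, pool_limit - global_running);
  // Multiply before dividing so the local share is not truncated to zero early.
  const int64_t share = (local_queued * free_slots) / global_queued;
  // Never dequeue more requests than are queued locally.
  return std::min(share, local_queued);
}
```

For example, with 5 of 20 queued requests held locally, a pool limit of 10, and 2 queries running cluster-wide, this sketch lets the local impalad admit 2 requests. Each impalad takes a share of the free slots proportional to its share of the queue.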

Definition at line 81 of file admission-controller.h.

Member Typedef Documentation

typedef boost::unordered_map<std::string, PoolStatsMap> impala::AdmissionController::PerBackendPoolStatsMap
private

Mimics the statestore topic, i.e. stores a local copy of the logical data structure that the statestore broadcasts. The local stats are not stored in this map because we need to be able to clear the stats for all remote backends when a full topic update is received. By storing the local pool stats in local_pool_stats_, we can simply clear() the map. Pool names -> full topic keys (i.e. "<topic>!<backend_id>") -> pool stats

Definition at line 211 of file admission-controller.h.
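
The reason for keeping the local stats outside this map can be illustrated with a small sketch. It uses std::unordered_map instead of boost::unordered_map, a stand-in PoolStats struct instead of the Thrift-generated TPoolStats, and hypothetical helper names; none of these are taken from the Impala sources.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for TPoolStats; the fields are assumptions for illustration.
struct PoolStats {
  int64_t num_running;
  int64_t num_queued;
};

// Topic key -> stats for one pool, and pool name -> that map.
using PoolStatsMap = std::unordered_map<std::string, PoolStats>;
using PerBackendPoolStatsMap = std::unordered_map<std::string, PoolStatsMap>;

// Hypothetical helper: records the stats that one remote backend reported.
void RecordRemoteStats(PerBackendPoolStatsMap* remote, const std::string& pool,
                       const std::string& topic_key, const PoolStats& stats) {
  (*remote)[pool][topic_key] = stats;
}

// Hypothetical helper: a full topic update replaces everything known about
// remote backends. Because local stats live in a separate map, clear() is safe.
void HandleFullTopicUpdate(PerBackendPoolStatsMap* remote) {
  remote->clear();
}
```

If the local stats were stored in the same map, a full update would have to walk every pool and skip the local backend's entries instead of calling clear() once.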

typedef boost::unordered_map<std::string, TPoolConfigResult> impala::AdmissionController::PoolConfigMap
private

Map of pool names to the most recent pool configs returned by request_pool_service_. Stored so that the dequeue thread does not need to access the configs via the request pool service again (which involves a JNI call and error checking).

Definition at line 238 of file admission-controller.h.

typedef boost::unordered_map<std::string, PoolMetrics> impala::AdmissionController::PoolMetricsMap
private

Map of pool names to pool metrics.

Definition at line 232 of file admission-controller.h.

typedef boost::unordered_set<std::string> impala::AdmissionController::PoolSet
private

A set of pool names.

Definition at line 199 of file admission-controller.h.

typedef boost::unordered_map<std::string, TPoolStats> impala::AdmissionController::PoolStatsMap
private

Map of pool names to pool statistics.

Definition at line 192 of file admission-controller.h.

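typedef InternalQueue<QueueNode> impala::AdmissionController::RequestQueue
private

Queue for the queries waiting to be admitted for execution. Once the maximum number of concurrently executing queries has been reached, incoming queries are queued and admitted first come, first served (FCFS).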

Definition at line 225 of file admission-controller.h.

typedef boost::unordered_map<std::string, RequestQueue> impala::AdmissionController::RequestQueueMap
private

Map of pool names to request queues.

Definition at line 228 of file admission-controller.h.

Constructor & Destructor Documentation

impala::AdmissionController::AdmissionController ( RequestPoolService *  request_pool_service,
MetricGroup *  metrics,
const std::string &  backend_id 
)

Definition at line 176 of file admission-controller.cc.

References dequeue_thread_, and DequeueLoop().

impala::AdmissionController::~AdmissionController ( )

Definition at line 187 of file admission-controller.cc.

References admission_ctrl_lock_, dequeue_cv_, dequeue_thread_, and done_.

Member Function Documentation

void impala::AdmissionController::AddPoolUpdates ( std::vector< TTopicDelta > *  subscriber_topic_updates)
private

Status impala::AdmissionController::AdmitQuery ( QuerySchedule *  schedule)

Submits the request for admission. Returns immediately if rejected, but otherwise blocks until the request is admitted. When this method returns, schedule->is_admitted() is true if and only if the request was admitted. For all calls to AdmitQuery(), ReleaseQuery() should also be called after the query completes to ensure that the pool statistics are updated.

Definition at line 265 of file admission-controller.cc.

References impala::RuntimeProfile::AddInfoString(), admission_ctrl_lock_, CanAdmitRequest(), impala::AdmissionController::PoolMetrics::cluster_mem_estimate, cluster_pool_stats_, impala::InternalQueue< T >::Contains(), impala::DebugPoolStats(), impala::InternalQueue< T >::Enqueue(), impala::Promise< T >::Get(), impala::QuerySchedule::GetClusterMemoryEstimate(), impala::Status::GetDetail(), impala::RequestPoolService::GetPoolConfig(), GetPoolMetrics(), impala::AdmissionController::QueueNode::is_admitted, impala::Promise< T >::IsSet(), impala::AdmissionController::PoolMetrics::local_admitted, impala::AdmissionController::PoolMetrics::local_mem_estimate, local_pool_stats_, impala::AdmissionController::PoolMetrics::local_queued, impala::AdmissionController::PoolMetrics::local_rejected, impala::AdmissionController::PoolMetrics::local_time_in_queue_ms, impala::AdmissionController::PoolMetrics::local_timed_out, impala::RuntimeProfile::EventSequence::MarkEvent(), impala::MonotonicMillis(), impala::Status::OK, impala::Status::ok(), pool_config_cache_, pools_for_updates_, impala::PrettyPrinter::Print(), impala::PROFILE_INFO_KEY_ADMISSION_RESULT, impala::PROFILE_INFO_VAL_ADMIT_IMMEDIATELY, impala::PROFILE_INFO_VAL_ADMIT_QUEUED, impala::PROFILE_INFO_VAL_REJECTED, impala::PROFILE_INFO_VAL_TIME_OUT, impala::QUERY_EVENT_COMPLETED_ADMISSION, impala::QUERY_EVENT_SUBMIT_FOR_ADMISSION, impala::QuerySchedule::query_events(), impala::QuerySchedule::query_id(), RejectRequest(), impala::InternalQueue< T >::Remove(), impala::QuerySchedule::request_pool(), request_pool_service_, request_queue_map_, RETURN_IF_ERROR, impala::Promise< T >::Set(), impala::QuerySchedule::set_is_admitted(), impala::STATUS_TIME_OUT, impala::QuerySchedule::summary_profile(), VLOG_QUERY, and VLOG_RPC.
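
One way a caller might guarantee the AdmitQuery()/ReleaseQuery() pairing is a scope guard. The sketch below is hypothetical: FakeAdmissionController and the simplified QuerySchedule are stand-ins written for this example (the real AdmitQuery() returns a Status and may block), and ScopedAdmission is not part of Impala.

```cpp
// Minimal stand-ins so the sketch compiles on its own; NOT the Impala types.
struct QuerySchedule {
  bool admitted = false;
  bool is_admitted() const { return admitted; }
};

struct FakeAdmissionController {
  int running = 0;
  int max_running = 1;
  // Returns true for "OK"; the real method returns impala::Status.
  bool AdmitQuery(QuerySchedule* schedule) {
    if (running >= max_running) return false;  // rejected immediately
    ++running;
    schedule->admitted = true;
    return true;
  }
  void ReleaseQuery(QuerySchedule* schedule) {
    if (schedule->admitted) --running;
    schedule->admitted = false;
  }
};

// Hypothetical scope guard: every AdmitQuery() call is followed by exactly
// one ReleaseQuery() call, whether or not the request was admitted.
class ScopedAdmission {
 public:
  ScopedAdmission(FakeAdmissionController* ac, QuerySchedule* schedule)
      : ac_(ac), schedule_(schedule), ok_(ac->AdmitQuery(schedule)) {}
  ~ScopedAdmission() { ac_->ReleaseQuery(schedule_); }
  ScopedAdmission(const ScopedAdmission&) = delete;
  ScopedAdmission& operator=(const ScopedAdmission&) = delete;
  bool admitted() const { return ok_ && schedule_->is_admitted(); }

 private:
  FakeAdmissionController* ac_;
  QuerySchedule* schedule_;
  bool ok_;
};
```

Releasing in the destructor keeps the pool statistics consistent even when query execution exits early on an error path.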

Status impala::AdmissionController::CanAdmitRequest ( const std::string &  pool,
const int64_t  max_requests,
const int64_t  mem_limit,
const QuerySchedule &  schedule,
bool  admit_from_queue 
)
private

Returns OK if the request can be admitted, i.e. admitting would not go over the limits for this pool. Otherwise, the error message specifies the reason the request cannot be admitted immediately.

Definition at line 211 of file admission-controller.cc.

References cluster_pool_stats_, impala::Status::Expected(), impala::QuerySchedule::GetClusterMemoryEstimate(), impala::Status::OK, impala::PrettyPrinter::Print(), impala::QUEUED_MEM_LIMIT, impala::QUEUED_NUM_RUNNING, and impala::QUEUED_QUEUE_NOT_EMPTY.

Referenced by AdmitQuery(), and DequeueLoop().
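
The kind of check CanAdmitRequest() performs might look roughly like the sketch below. The three failure reasons mirror the referenced QUEUED_NUM_RUNNING, QUEUED_MEM_LIMIT, and QUEUED_QUEUE_NOT_EMPTY messages, but the order of the checks, the exact comparisons, the convention that a negative limit means "no limit", and all names here are assumptions for illustration.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the cluster-wide pool stats; fields are assumptions.
struct PoolStats {
  int64_t num_running;
  int64_t num_queued;
  int64_t mem_usage;
};

// Hypothetical check: returns an empty string if the request can be admitted,
// otherwise a human-readable reason. Negative limits are treated as unlimited.
std::string CanAdmit(const PoolStats& cluster, int64_t max_requests,
                     int64_t mem_limit, int64_t mem_estimate,
                     bool admit_from_queue) {
  if (max_requests >= 0 && cluster.num_running >= max_requests) {
    return "number of running queries is at the pool limit";
  }
  if (mem_limit >= 0 && cluster.mem_usage + mem_estimate > mem_limit) {
    return "pool memory usage would exceed the memory limit";
  }
  // New requests must not jump ahead of requests that are already queued.
  if (!admit_from_queue && cluster.num_queued > 0) {
    return "queue is not empty";
  }
  return "";
}
```

The admit_from_queue flag lets the dequeue path skip the queue-not-empty check, since the request being examined is itself at the head of the queue.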

void impala::AdmissionController::HandleTopicDeletions ( const std::vector< std::string > &  topic_deletions)
private

Removes stats from the per_backend_pool_stats_map_ from topic deletions. Called by UpdatePoolStats(). Must hold admission_ctrl_lock_.

Definition at line 494 of file admission-controller.cc.

References impala::DebugPoolStats(), impala::ParsePoolTopicKey(), per_backend_pool_stats_map_, and VLOG_ROW.

Referenced by UpdatePoolStats().

void impala::AdmissionController::HandleTopicUpdates ( const std::vector< TTopicItem > &  topic_updates)
private

Updates the per_backend_pool_stats_map_ with topic_updates. Called by UpdatePoolStats(). Must hold admission_ctrl_lock_.

Definition at line 459 of file admission-controller.cc.

References backend_id_, impala::DebugPoolStats(), impala::DeserializeThriftMsg(), local_pool_stats_, impala::Status::ok(), impala::ParsePoolTopicKey(), per_backend_pool_stats_map_, VLOG_QUERY, and VLOG_ROW.

Referenced by UpdatePoolStats().

Status impala::AdmissionController::Init ( StatestoreSubscriber *  subscriber)

Registers with the subscription manager.

Definition at line 201 of file admission-controller.cc.

References impala::Status::AddDetail(), impala::StatestoreSubscriber::AddTopic(), IMPALA_REQUEST_QUEUE_TOPIC, and UpdatePoolStats().

Status impala::AdmissionController::RejectRequest ( const std::string &  pool,
const int64_t  max_requests,
const int64_t  mem_limit,
const int64_t  max_queued,
const QuerySchedule &  schedule 
)
private

Status impala::AdmissionController::ReleaseQuery ( QuerySchedule *  schedule)

void impala::AdmissionController::UpdateClusterAggregates ( const std::string &  pool_name)
private

void impala::AdmissionController::UpdateLocalMemUsage ( const std::string &  pool_name)
private

Updates the memory usage of the local pool stats based on the most recent mem tracker consumption. Called by UpdatePoolStats() before sending local pool updates. Must hold admission_ctrl_lock_.

Definition at line 554 of file admission-controller.cc.

References impala::MemTracker::consumption(), GetPoolMetrics(), impala::MemTracker::GetRequestPoolMemTracker(), impala::AdmissionController::PoolMetrics::local_mem_usage, local_pool_stats_, pools_for_updates_, and tracker.

Referenced by UpdatePoolStats().
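
The class description states that the local mem_usage is the maximum of the summed planning estimates and the actual tracked consumption. A minimal sketch of such an update follows. LocalPoolState and its needs_update flag (a stand-in for membership in pools_for_updates_) are hypothetical, and marking the pool only when the value changes is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical per-pool state; not the Impala TPoolStats layout.
struct LocalPoolState {
  int64_t mem_estimate;  // sum of planning estimates of running queries
  int64_t mem_usage;     // value reported to the statestore
  bool needs_update;     // stand-in for membership in pools_for_updates_
};

// Sets mem_usage to max(estimate, tracked consumption) and flags the pool
// for a topic update only when the reported value actually changes.
void UpdateLocalMemUsage(LocalPoolState* pool, int64_t tracked_consumption) {
  const int64_t new_usage = std::max(pool->mem_estimate, tracked_consumption);
  if (new_usage != pool->mem_usage) {
    pool->mem_usage = new_usage;
    pool->needs_update = true;
  }
}
```

Taking the maximum means a freshly admitted query counts against the pool by its estimate until its real consumption catches up or overtakes it.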

void impala::AdmissionController::UpdatePoolStats ( const StatestoreSubscriber::TopicDeltaMap &  incoming_topic_deltas,
std::vector< TTopicDelta > *  subscriber_topic_updates 
)
private

Member Data Documentation

boost::mutex impala::AdmissionController::admission_ctrl_lock_
private

Protects all access to all variables below. Coordinates access to the results of the promise QueueNode::is_admitted, but the lock is not required to wait on the promise.

Definition at line 189 of file admission-controller.h.

Referenced by AdmitQuery(), DequeueLoop(), ReleaseQuery(), UpdatePoolStats(), and ~AdmissionController().

const std::string impala::AdmissionController::backend_id_
private

Unique id for this impalad, used to construct topic keys.

Definition at line 181 of file admission-controller.h.

Referenced by AddPoolUpdates(), HandleTopicUpdates(), and UpdateClusterAggregates().

PoolStatsMap impala::AdmissionController::cluster_pool_stats_
private

The (estimated) total pool statistics for the entire cluster. Includes the current local stats in local_pool_stats_. Updated when (a) IMPALA_REQUEST_QUEUE_TOPIC updates are received by aggregating the stats in per_backend_pool_stats_map_ and (b) when local stats change (i.e. AdmitQuery(), ReleaseQuery(), and when dequeuing in DequeueLoop()). Pool names -> estimated total pool stats

Definition at line 220 of file admission-controller.h.

Referenced by AdmitQuery(), CanAdmitRequest(), DequeueLoop(), RejectRequest(), ReleaseQuery(), and UpdateClusterAggregates().
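
The aggregation described above, remote per-backend stats plus the current local stats, can be sketched as a simple sum. The PoolStats struct and the AggregatePool() helper are stand-ins for illustration and use std::unordered_map rather than the boost containers; the real UpdateClusterAggregates() may differ.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for TPoolStats; fields are assumptions.
struct PoolStats {
  int64_t num_running;
  int64_t num_queued;
  int64_t mem_usage;
};

// Topic key -> stats reported by each remote backend for one pool.
using PoolStatsMap = std::unordered_map<std::string, PoolStats>;

// Hypothetical aggregation: estimated cluster total = local + all remotes.
PoolStats AggregatePool(const PoolStatsMap& remote_backends,
                        const PoolStats& local) {
  PoolStats total = local;
  for (const auto& entry : remote_backends) {
    total.num_running += entry.second.num_running;
    total.num_queued += entry.second.num_queued;
    total.mem_usage += entry.second.mem_usage;
  }
  return total;
}
```

Because the remote entries are only as fresh as the last statestore update, the result is an estimate, which is why the pool limits behave as soft limits.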

boost::condition_variable impala::AdmissionController::dequeue_cv_
private

Notifies the dequeuing thread that pool stats have changed and it may be possible to dequeue and admit queries.

Definition at line 243 of file admission-controller.h.

Referenced by DequeueLoop(), ReleaseQuery(), UpdatePoolStats(), and ~AdmissionController().
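
The interplay of admission_ctrl_lock_, dequeue_cv_, and done_ follows a common condition-variable pattern, sketched below with the C++ standard library instead of Boost. The Dequeuer class, its pending_ and admitted_ counters, and the method names are hypothetical; the sketch shows only the wait/notify/teardown shape, not Impala's admission logic.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical miniature of a dequeue thread woken by a condition variable.
class Dequeuer {
 public:
  Dequeuer()
      : done_(false), pending_(0), admitted_(0),
        thread_(&Dequeuer::Loop, this) {}

  // Mirrors the teardown idea: set done_ under the lock, notify, then join.
  ~Dequeuer() {
    {
      std::lock_guard<std::mutex> l(lock_);
      done_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  // Called when pool stats change and queued work may now be admissible.
  void NotifyStatsChanged(int newly_admissible) {
    {
      std::lock_guard<std::mutex> l(lock_);
      pending_ += newly_admissible;
    }
    cv_.notify_one();
  }

  int admitted() {
    std::lock_guard<std::mutex> l(lock_);
    return admitted_;
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> l(lock_);
    while (true) {
      // The predicate guards against spurious wakeups and missed notifies.
      cv_.wait(l, [this] { return done_ || pending_ > 0; });
      if (done_) return;
      admitted_ += pending_;  // "admit" everything that became admissible
      pending_ = 0;
    }
  }

  std::mutex lock_;
  std::condition_variable cv_;
  bool done_;
  int pending_;
  int admitted_;
  std::thread thread_;  // declared last so the other members exist first
};
```

Declaring the thread member last ensures the mutex, condition variable, and flags are constructed before the loop starts running.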

boost::scoped_ptr<Thread> impala::AdmissionController::dequeue_thread_
private

Thread dequeuing and admitting queries.

Definition at line 178 of file admission-controller.h.

Referenced by AdmissionController(), and ~AdmissionController().

bool impala::AdmissionController::done_
private

If true, tear down the dequeuing thread. This only happens in unit tests.

Definition at line 246 of file admission-controller.h.

Referenced by DequeueLoop(), and ~AdmissionController().

const string impala::AdmissionController::IMPALA_REQUEST_QUEUE_TOPIC
static private

Definition at line 105 of file admission-controller.h.

Referenced by AddPoolUpdates(), Init(), and UpdatePoolStats().

PoolStatsMap impala::AdmissionController::local_pool_stats_
private

The local pool statistics. Updated when requests are executed, queued, and completed.

Definition at line 196 of file admission-controller.h.

Referenced by AddPoolUpdates(), AdmitQuery(), DequeueLoop(), HandleTopicUpdates(), ReleaseQuery(), UpdateClusterAggregates(), UpdateLocalMemUsage(), and UpdatePoolStats().

MetricGroup* impala::AdmissionController::metrics_
private

Metrics subsystem access.

Definition at line 175 of file admission-controller.h.

Referenced by GetPoolMetrics().

PerBackendPoolStatsMap impala::AdmissionController::per_backend_pool_stats_map_
private

PoolConfigMap impala::AdmissionController::pool_config_cache_
private

Definition at line 239 of file admission-controller.h.

Referenced by AdmitQuery(), and DequeueLoop().

PoolMetricsMap impala::AdmissionController::pool_metrics_map_
private

Definition at line 233 of file admission-controller.h.

Referenced by GetPoolMetrics().

PoolSet impala::AdmissionController::pools_for_updates_
private

The set of local pools that have changed between topic updates that need to be sent to the statestore.

Definition at line 203 of file admission-controller.h.

Referenced by AddPoolUpdates(), AdmitQuery(), DequeueLoop(), ReleaseQuery(), and UpdateLocalMemUsage().

RequestPoolService* impala::AdmissionController::request_pool_service_
private

Used for user-to-pool resolution and looking up pool configurations. Not owned by the AdmissionController.

Definition at line 172 of file admission-controller.h.

Referenced by AdmitQuery().

RequestQueueMap impala::AdmissionController::request_queue_map_
private

Definition at line 229 of file admission-controller.h.

Referenced by AdmitQuery(), and DequeueLoop().

ThriftSerializer impala::AdmissionController::thrift_serializer_
private

Serializes/deserializes TPoolStats when sending and receiving topic updates.

Definition at line 184 of file admission-controller.h.

Referenced by AddPoolUpdates().


The documentation for this class was generated from the following files: