Impala
Impala is the open source, native analytic database for Apache Hadoop.
impala::CgroupsMgr Class Reference

#include <cgroups-mgr.h>


Public Member Functions

 CgroupsMgr (MetricGroup *metrics)
 
Status Init (const std::string &cgroups_hierarchy_path, const std::string &staging_cgroup)
 
std::string UniqueIdToCgroup (const std::string &unique_id) const
 
int32_t VirtualCoresToCpuShares (int16_t v_cpu_cores)
 
Status RegisterFragment (const TUniqueId &fragment_instance_id, const std::string &cgroup, bool *is_first)
 
Status UnregisterFragment (const TUniqueId &fragment_instance_id, const std::string &cgroup)
 
Status CreateCgroup (const std::string &cgroup, bool if_not_exists) const
 
Status DropCgroup (const std::string &cgroup, bool if_exists) const
 
Status SetCpuShares (const std::string &cgroup, int32_t num_shares)
 
Status AssignThreadToCgroup (const Thread &thread, const std::string &cgroup) const
 
Status RelocateThreads (const std::string &src_cgroup, const std::string &dst_cgroup) const
 

Private Member Functions

Status GetCgroupPaths (const std::string &cgroup, std::string *cgroup_path, std::string *tasks_path) const
 

Private Attributes

IntCounter * active_cgroups_metric_
 Number of currently active Impala-managed cgroups.
 
std::string cgroups_hierarchy_path_
 Root of the CPU cgroup hierarchy. Created cgroups are placed directly under it.
 
std::string staging_cgroup_
 
boost::mutex active_cgroups_lock_
 Protects active_cgroups_.
 
boost::unordered_map< std::string, int32_t > active_cgroups_
 

Detailed Description

Control Groups, or 'cgroups', are a Linux-specific mechanism for arbitrating resources amongst threads. CGroups are organised in a forest of 'hierarchies', each of which is mounted at a path in the filesystem. Each hierarchy contains one or more cgroups, arranged hierarchically. Each hierarchy has one or more 'subsystems' attached. Each subsystem represents a resource to manage, so for example there is a CPU subsystem and a MEMORY subsystem. There are rules about when subsystems may be attached to more than one hierarchy, which are out of scope of this description.

Each thread running on a kernel with cgroups enabled belongs to exactly one cgroup in every hierarchy at once. Impala is only concerned with a single hierarchy that assigns CPU resources in the first instance. Threads are assigned to cgroups by writing their thread ID to a file in the special cgroup filesystem.

For more information:
access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/
www.kernel.org/doc/Documentation/cgroups/cgroups.txt

Manages the lifecycle of Impala-internal cgroups as well as the assignment of execution threads into cgroups. To execute queries, Impala requests resources from Yarn via the Llama. Yarn returns granted resources via the Llama in the form of "RM resource ids" that conventionally correspond to a CGroup that the Yarn NM creates. Instead of directly using the NM-provided CGroups, Impala creates and manages its own CGroups for the following reasons:

  1. In typical CM/Yarn setups, Impala would not have permissions to write to the tasks file of NM-provided CGroups. It is arguably not even desirable (e.g., for security reasons) for an external process to be able to manipulate the permissions of NM-generated CGroups, either directly or indirectly.
  2. Yarn-granted CGroups are created asynchronously (the AM calls to create the CGroups are non-blocking). From Impala's perspective that means that once Impala receives notice from the Llama that resources have been granted, it cannot assume that the corresponding containers have been created (although the Yarn NMs eventually will). While each of Impala's plan fragments could wait for the CGroups to be created, it seems unnecessarily complicated and slow to do so.
  3. Impala will probably want to manage its own CGroups eventually, e.g., for optimistic query scheduling.

In summary, the typical CGroups-related flow of an Impala query is as follows:

  1. Impala receives granted resources from Llama and sends out plan fragments
  2. On each node executing such a fragment, convert the Yarn resource id into a CGroup that Impala should create and assign the query's threads to.
  3. Register the fragment(s) and the CGroup for the query with the node-local CGroup manager. The registration creates the CGroup and maintains a count of all fragments using that CGroup.
  4. Execute the fragments, assigning threads into the Impala-managed CGroup.
  5. Complete the fragments by unregistering them with the CGroup from the node-local CGroups manager. When the last fragment for a CGroup is unregistered, all threads from that CGroup are relocated into a special staging CGroup, so that the now unused CGroup can safely be deleted (otherwise, we'd have to wait for the OS to drain all entries from the CGroup's tasks file)

Definition at line 79 of file cgroups-mgr.h.
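
The register/unregister bookkeeping in steps 3 and 5 above can be sketched as a mutex-protected reference count per cgroup. This is a minimal illustration only, not the code in cgroups-mgr.cc: the class name CgroupRefCounts and its methods are hypothetical, it uses std::mutex and std::unordered_map rather than the boost types listed above, and it never touches the cgroup filesystem.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the bookkeeping half of a cgroups manager.
// It tracks reference counts only; creating, relocating and dropping
// cgroups is left to the caller.
class CgroupRefCounts {
 public:
  // Returns true if this is the first fragment to use `cgroup`, i.e. the
  // point at which a real manager would create the cgroup.
  bool Register(const std::string& cgroup) {
    std::lock_guard<std::mutex> l(lock_);
    return ++counts_[cgroup] == 1;
  }

  // Returns true if the last fragment using `cgroup` has just finished, i.e.
  // the point at which threads would be relocated and the cgroup dropped.
  // Unregistering an unknown cgroup is treated as a no-op in this sketch.
  bool Unregister(const std::string& cgroup) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = counts_.find(cgroup);
    if (it == counts_.end()) return false;
    if (--it->second > 0) return false;
    counts_.erase(it);
    return true;
  }

  // Number of cgroups that currently have at least one registered fragment.
  std::size_t NumActive() const {
    std::lock_guard<std::mutex> l(lock_);
    return counts_.size();
  }

 private:
  mutable std::mutex lock_;
  std::unordered_map<std::string, int32_t> counts_;
};
```

With two fragments sharing one cgroup, only the first Register() and the second Unregister() return true, which is where the creation and the relocate-then-drop work would hang off.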

Constructor & Destructor Documentation

impala::CgroupsMgr::CgroupsMgr ( MetricGroup *  metrics)

Definition at line 40 of file cgroups-mgr.cc.

References impala::MetricGroup::AddCounter().

Member Function Documentation

Status impala::CgroupsMgr::AssignThreadToCgroup ( const Thread &  thread,
const std::string &  cgroup 
) const

Assigns a given thread to a cgroup, by writing its thread id to <cgroups_hierarchy_path_>/<cgroup>/tasks. If there is no file at that location, returns an error. Otherwise no attempt is made to check that the target belongs to a cgroup hierarchy due to the cost of reading and parsing cgroup information from the filesystem.

Definition at line 142 of file cgroups-mgr.cc.

References impala::Status::OK, RETURN_IF_ERROR, impala::Thread::tid(), and VLOG_ROW.

Referenced by impala::BlockingJoinNode::Open().
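
As a rough illustration of the behaviour described for AssignThreadToCgroup, the sketch below appends a thread id to the tasks file under a given hierarchy root and fails if that file is missing. The function name AssignTidToCgroup, its bool return value (in place of Status) and the explicit root parameter are assumptions made for the example; the actual implementation in cgroups-mgr.cc may differ.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: writes `tid` to <root>/<cgroup>/tasks.
// Returns false if the tasks file does not exist or the write fails.
// No check is made that `root` really is a mounted cgroup hierarchy.
bool AssignTidToCgroup(const std::string& root, const std::string& cgroup,
                       long tid) {
  const fs::path tasks_path = fs::path(root) / cgroup / "tasks";
  std::error_code ec;
  if (!fs::exists(tasks_path, ec)) return false;

  // Open without truncating; on a real cgroup filesystem the kernel
  // interprets each write as a request to move that thread.
  std::ofstream out(tasks_path, std::ios::app);
  if (!out.is_open()) return false;
  out << tid << '\n';
  out.flush();
  return out.good();
}
```

Passing the hierarchy root as a parameter lets the sketch be exercised against an ordinary temporary directory without root privileges.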

Status impala::CgroupsMgr::CreateCgroup ( const std::string &  cgroup,
bool  if_not_exists 
) const

Creates a cgroup at <cgroups_hierarchy_path_>/<cgroup>. Returns a non-OK status if the cgroup creation failed, e.g., because of insufficient privileges. If if_not_exists is true then no error is returned if the cgroup already exists.

Definition at line 64 of file cgroups-mgr.cc.

References impala::Status::OK.

Status impala::CgroupsMgr::DropCgroup ( const std::string &  cgroup,
bool  if_exists 
) const

Drops the cgroup at <cgroups_hierarchy_path_>/<cgroup>. Returns a non-OK status if the cgroup deletion failed, e.g., because of insufficient privileges. If if_exists is true then no error is returned if the cgroup does not exist.

Definition at line 83 of file cgroups-mgr.cc.

References impala::Status::OK.
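
The if_not_exists and if_exists semantics of CreateCgroup and DropCgroup can be illustrated with plain directory operations, since a cgroup appears as a directory under the hierarchy root. This is a sketch under assumptions: the names CreateCgroupDir and DropCgroupDir, the bool return values and the explicit root parameter are hypothetical and not taken from cgroups-mgr.cc.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: creates <root>/<cgroup>.
// An already existing directory counts as success only if `if_not_exists`.
bool CreateCgroupDir(const std::string& root, const std::string& cgroup,
                     bool if_not_exists) {
  std::error_code ec;
  const bool created = fs::create_directory(fs::path(root) / cgroup, ec);
  if (ec) return false;  // e.g. insufficient privileges or missing root
  return created || if_not_exists;
}

// Hypothetical helper: removes <root>/<cgroup>.
// A missing directory counts as success only if `if_exists`.
bool DropCgroupDir(const std::string& root, const std::string& cgroup,
                   bool if_exists) {
  std::error_code ec;
  const bool removed = fs::remove(fs::path(root) / cgroup, ec);
  if (ec) return false;  // e.g. insufficient privileges
  return removed || if_exists;
}
```

Both helpers use the std::error_code overloads so that failures surface as a return value rather than an exception, loosely mirroring the Status-based interface documented here.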

Status impala::CgroupsMgr::GetCgroupPaths ( const std::string &  cgroup,
std::string *  cgroup_path,
std::string *  tasks_path 
) const
private

Checks that cgroups_hierarchy_path_ and the given cgroup under it exist. Returns an error if either of them does not exist. Returns the absolute cgroup path and the absolute path to its tasks file.

Definition at line 120 of file cgroups-mgr.cc.

References impala::Status::OK.

Status impala::CgroupsMgr::Init ( const std::string &  cgroups_hierarchy_path,
const std::string &  staging_cgroup 
)

Sets the cgroups mgr's corresponding members and creates the staging cgroup under <cgroups_hierarchy_path>/<staging_cgroup>. Returns a non-OK status if creation of the staging cgroup failed, e.g., because of insufficient privileges.

Definition at line 45 of file cgroups-mgr.cc.

References impala::Status::OK, and RETURN_IF_ERROR.

Status impala::CgroupsMgr::RegisterFragment ( const TUniqueId &  fragment_instance_id,
const std::string &  cgroup,
bool *  is_first 
)

Informs the cgroups mgr that a plan fragment intends to use the given cgroup. If this is the first fragment requesting use of cgroup, then the cgroup will be created and *is_first will be set to true (otherwise to false). In any case the reference count active_cgroups_[cgroup] is incremented. Returns a non-OK status if there was an error creating the cgroup.

Definition at line 198 of file cgroups-mgr.cc.

References impala::Status::OK, impala::PrintId(), and RETURN_IF_ERROR.

Referenced by impala::PlanFragmentExecutor::Prepare().

Status impala::CgroupsMgr::RelocateThreads ( const std::string &  src_cgroup,
const std::string &  dst_cgroup 
) const

Reads the <cgroups_hierarchy_path_>/<src_cgroup>/tasks file and writes all the contained thread ids to <cgroups_hierarchy_path_>/<dst_cgroup>/tasks. Assumes that the destination cgroup has already been created. Returns a non-OK status if there was an error reading src_cgroup and/or writing dst_cgroup.

Definition at line 161 of file cgroups-mgr.cc.

References impala::Status::OK, RETURN_IF_ERROR, and VLOG_ROW.
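
The relocation described for RelocateThreads amounts to copying thread ids from one tasks file to another, one id per write. The sketch below shows that loop under assumptions: RelocateTids, its bool return value and the explicit root parameter are hypothetical, and on an ordinary filesystem the source file is not drained the way a real cgroup tasks file would be.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: copies every thread id listed in <root>/<src>/tasks
// into <root>/<dst>/tasks. The destination cgroup must already exist.
// Returns false if either file cannot be used or a write fails.
bool RelocateTids(const std::string& root, const std::string& src,
                  const std::string& dst) {
  const fs::path src_tasks = fs::path(root) / src / "tasks";
  const fs::path dst_tasks = fs::path(root) / dst / "tasks";

  std::ifstream in(src_tasks);
  if (!in.is_open()) return false;

  std::error_code ec;
  if (!fs::exists(dst_tasks, ec)) return false;
  std::ofstream out(dst_tasks, std::ios::app);
  if (!out.is_open()) return false;

  long tid = 0;
  while (in >> tid) {
    out << tid << '\n';
    out.flush();  // issue a separate write for each thread id
    if (!out.good()) return false;
  }
  // Reaching end-of-file means every entry was parsed and forwarded.
  return in.eof();
}
```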

Status impala::CgroupsMgr::SetCpuShares ( const std::string &  cgroup,
int32_t  num_shares 
)

Sets the number of CPU shares for the given cgroup by writing num_shares into the cgroup's cpu.shares file. Returns a non-OK status if there was an error writing to the file, e.g., because of insufficient privileges.

Definition at line 101 of file cgroups-mgr.cc.

References impala::Status::OK, and RETURN_IF_ERROR.

Referenced by impala::QueryResourceMgr::AcquireVcoreResources(), and impala::PlanFragmentExecutor::Prepare().

std::string impala::CgroupsMgr::UniqueIdToCgroup ( const std::string &  unique_id) const

Returns the cgroup Impala should create and use for enforcing granted resources identified by the given unique ID (which usually corresponds to a query ID). Returns an empty string if unique_id is empty.

Definition at line 54 of file cgroups-mgr.cc.

References impala::IMPALA_CGROUP_SUFFIX.

Referenced by impala::QueryResourceMgr::AcquireVcoreResources(), and impala::PlanFragmentExecutor::Prepare().
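
One plausible reading of UniqueIdToCgroup, given that it references impala::IMPALA_CGROUP_SUFFIX, is that the cgroup name is the unique ID with a fixed suffix appended. The sketch below shows that reading only; the concatenation itself, the function name UniqueIdToCgroupName and the suffix value "_impala" are assumptions, since neither the body nor the constant's value appears in this reference.

```cpp
#include <string>

// Hypothetical suffix; the real value of impala::IMPALA_CGROUP_SUFFIX
// is not shown in this reference.
const std::string kCgroupSuffix = "_impala";

// Maps a unique ID (usually a query ID) to a cgroup name.
// An empty ID maps to an empty cgroup name, as documented above.
std::string UniqueIdToCgroupName(const std::string& unique_id) {
  if (unique_id.empty()) return "";
  return unique_id + kCgroupSuffix;
}
```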

Status impala::CgroupsMgr::UnregisterFragment ( const TUniqueId &  fragment_instance_id,
const std::string &  cgroup 
)

Informs the cgroups mgr that a plan fragment using the given cgroup is complete. Decrements the corresponding reference count active_cgroups_[cgroup]. If the reference count reaches zero, this function relocates all thread ids from the cgroup to the staging_cgroup_ and drops the cgroup (a cgroup with active thread ids cannot be dropped, so the thread ids are relocated first). Returns a non-OK status if there was an error relocating the thread ids or dropping the cgroup.

Definition at line 215 of file cgroups-mgr.cc.

References impala::Status::OK, impala::PrintId(), and RETURN_IF_ERROR.

Referenced by impala::PlanFragmentExecutor::Close().

int32_t impala::CgroupsMgr::VirtualCoresToCpuShares ( int16_t  v_cpu_cores)

Returns the cgroup CPU shares corresponding to the given number of virtual cores. Returns -1 if v_cpu_cores is <= 0 (which is invalid).

Definition at line 59 of file cgroups-mgr.cc.

References impala::CPU_DEFAULT_WEIGHT.

Referenced by impala::QueryResourceMgr::AcquireVcoreResources(), and impala::PlanFragmentExecutor::Prepare().
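
The conversion in VirtualCoresToCpuShares references impala::CPU_DEFAULT_WEIGHT, which suggests a fixed weight per virtual core. The sketch below assumes a simple multiplication and a weight of 1024 (the Linux default for cpu.shares); both are assumptions, as the function body and the constant's value are not shown in this reference, and the name VirtualCoresToShares is hypothetical.

```cpp
#include <cstdint>

// Assumed weight per virtual core. 1024 is the Linux default cpu.shares
// value; the actual impala::CPU_DEFAULT_WEIGHT may differ.
constexpr int32_t kCpuDefaultWeight = 1024;

// Returns the CPU shares for the given number of virtual cores,
// or -1 if the input is not positive (which is invalid).
int32_t VirtualCoresToShares(int16_t v_cpu_cores) {
  if (v_cpu_cores <= 0) return -1;
  // Widen before multiplying so the product is computed as int32_t.
  return static_cast<int32_t>(v_cpu_cores) * kCpuDefaultWeight;
}
```

Under these assumptions, a grant of 2 virtual cores would translate to 2048 shares.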

Member Data Documentation

boost::unordered_map<std::string, int32_t> impala::CgroupsMgr::active_cgroups_
private

Process-wide map from cgroup to number of fragments using the cgroup. A cgroup can be safely dropped once the number of fragments in the cgroup, according to this map, reaches zero.

Definition at line 167 of file cgroups-mgr.h.

boost::mutex impala::CgroupsMgr::active_cgroups_lock_
private

Protects active_cgroups_.

Definition at line 162 of file cgroups-mgr.h.

IntCounter* impala::CgroupsMgr::active_cgroups_metric_
private

Number of currently active Impala-managed cgroups.

Definition at line 152 of file cgroups-mgr.h.

std::string impala::CgroupsMgr::cgroups_hierarchy_path_
private

Root of the CPU cgroup hierarchy. Created cgroups are placed directly under it.

Definition at line 155 of file cgroups-mgr.h.

std::string impala::CgroupsMgr::staging_cgroup_
private

Cgroup into which threads from completed queries are relocated, so that the query's cgroup can be dropped.

Definition at line 159 of file cgroups-mgr.h.


The documentation for this class was generated from the following files: