Impala
Impala is the open source, native analytic database for Apache Hadoop.
com.cloudera.impala.util.HdfsCachingUtil Class Reference

Static Public Member Functions

static long submitCacheTblDirective (org.apache.hadoop.hive.metastore.api.Table table, String poolName, short replication) throws ImpalaRuntimeException
 
static long submitCachePartitionDirective (HdfsPartition part, String poolName, short replication) throws ImpalaRuntimeException
 
static long submitCachePartitionDirective (org.apache.hadoop.hive.metastore.api.Partition part, String poolName, short replication) throws ImpalaRuntimeException
 
static void uncacheTbl (org.apache.hadoop.hive.metastore.api.Table table) throws ImpalaRuntimeException
 
static void uncachePartition (HdfsPartition part) throws ImpalaException
 
static void uncachePartition (org.apache.hadoop.hive.metastore.api.Partition part) throws ImpalaException
 
static Long getCacheDirectiveId (Map< String, String > params)
 
static String getCachePool (long directiveId) throws ImpalaRuntimeException
 
static Short getCacheReplication (long directiveId) throws ImpalaRuntimeException
 
static Short getCachedCacheReplication (Map< String, String > params)
 
static void waitForDirective (long directiveId) throws ImpalaRuntimeException
 
static long modifyCacheDirective (Long id, org.apache.hadoop.hive.metastore.api.Table table, String poolName, short replication) throws ImpalaRuntimeException
 
static long modifyCacheDirective (Long id, HdfsPartition part, String poolName, short replication) throws ImpalaRuntimeException
 
static boolean isSamePool (String poolName, Long directiveId) throws ImpalaRuntimeException
 
static short getReplicationOrDefault (THdfsCachingOp op)
 
static boolean isUpdateOp (THdfsCachingOp op, Map< String, String > params) throws ImpalaRuntimeException
 
static void validateCachePool (THdfsCachingOp op, Long directiveId, TableName table, HdfsPartition partition) throws ImpalaRuntimeException
 
static void validateCachePool (THdfsCachingOp op, Long directiveId, TableName table) throws ImpalaRuntimeException
 
static boolean validateCacheParams (Map< String, String > params)
 

Static Package Functions

 [static initializer]
 

Static Private Member Functions

static long submitDirective (Path path, String poolName, short replication) throws ImpalaRuntimeException
 
static void modifyCacheDirective (Long id, Path path, String poolName, short replication) throws ImpalaRuntimeException
 
static void removeDirective (long directiveId) throws ImpalaRuntimeException
 
static CacheDirectiveEntry getDirective (long directiveId) throws ImpalaRuntimeException
 

Static Private Attributes

static final Logger LOG = Logger.getLogger(HdfsCachingUtil.class)
 
static final String CACHE_DIR_ID_PROP_NAME = "cache_directive_id"
 
static final String CACHE_DIR_REPLICATION_PROP_NAME = "cache_replication"
 
static final int MAX_UNCHANGED_CACHING_REFRESH_INTERVALS = 5
 
static final DistributedFileSystem dfs
 

Detailed Description

Utility class for submitting and dropping HDFS cache requests.

Definition at line 42 of file HdfsCachingUtil.java.

Member Function Documentation

com.cloudera.impala.util.HdfsCachingUtil.[static initializer] ( )
inline static package

static Short com.cloudera.impala.util.HdfsCachingUtil.getCachedCacheReplication ( Map< String, String > params)
inline static

Returns the cache replication value from the parameters map. We assume that only cached table parameters are used and the property is always present.

Definition at line 199 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME.

static Long com.cloudera.impala.util.HdfsCachingUtil.getCacheDirectiveId ( Map< String, String >  params)
inline static

Returns the cache directive ID from the given table/partition parameter map. Returns null if the CACHE_DIR_ID_PROP_NAME key was not set or if there was an error parsing the associated ID.
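The lookup contract described above can be illustrated with a minimal, self-contained sketch. This is not the Impala source: the class name CacheDirectiveIdSketch and the null check on the map are assumptions made for illustration; only the property key and the "null on missing key or parse error" behavior come from this reference.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the documented behavior, not the Impala implementation.
final class CacheDirectiveIdSketch {
  // Same key as CACHE_DIR_ID_PROP_NAME in this class reference.
  static final String CACHE_DIR_ID_PROP_NAME = "cache_directive_id";

  // Returns the directive ID stored in params, or null if the key is
  // missing or its value cannot be parsed as a long.
  static Long getCacheDirectiveId(Map<String, String> params) {
    if (params == null) return null;  // assumption: treat a null map as "not cached"
    String idStr = params.get(CACHE_DIR_ID_PROP_NAME);
    if (idStr == null) return null;
    try {
      return Long.parseLong(idStr);
    } catch (NumberFormatException e) {
      // A malformed ID is reported the same way as a missing one.
      return null;
    }
  }
}
```

Callers need to null-check the result before unboxing it, since both "not cached" and "unparseable ID" are reported as null.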

Definition at line 164 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME.

Referenced by com.cloudera.impala.util.HdfsCachingUtil.uncachePartition(), com.cloudera.impala.util.HdfsCachingUtil.uncacheTbl(), and com.cloudera.impala.util.HdfsCachingUtil.validateCacheParams().

static String com.cloudera.impala.util.HdfsCachingUtil.getCachePool ( long  directiveId) throws ImpalaRuntimeException
inline static

Given a cache directive ID, returns the pool the directive is cached in. Returns null if no outstanding cache directives match this ID.

Definition at line 179 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.getDirective().

Referenced by com.cloudera.impala.util.HdfsCachingUtil.isSamePool().

static Short com.cloudera.impala.util.HdfsCachingUtil.getCacheReplication ( long  directiveId) throws ImpalaRuntimeException
inline static

Given a cache directive ID, returns the replication factor for the directive. Returns null if no outstanding cache directives match this ID.

Definition at line 189 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.getDirective().

static CacheDirectiveEntry com.cloudera.impala.util.HdfsCachingUtil.getDirective ( long directiveId) throws ImpalaRuntimeException
inline static private

static short com.cloudera.impala.util.HdfsCachingUtil.getReplicationOrDefault ( THdfsCachingOp op)
inline static

Helper method for frequent lookup of replication factor in the thrift caching structure.

Definition at line 400 of file HdfsCachingUtil.java.

static boolean com.cloudera.impala.util.HdfsCachingUtil.isSamePool ( String poolName, Long directiveId ) throws ImpalaRuntimeException
inline static

Checks whether poolName matches the pool of the cache directive identified by directiveId.

Definition at line 391 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.getCachePool().

static boolean com.cloudera.impala.util.HdfsCachingUtil.isUpdateOp ( THdfsCachingOp op, Map< String, String > params ) throws ImpalaRuntimeException
inline static

Returns a boolean indicating if the given thrift caching operation would perform an update on an already existing cache directive.

Definition at line 409 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.getDirective().

Referenced by com.cloudera.impala.service.CatalogOpExecutor.alterPartitionSetCached(), and com.cloudera.impala.service.CatalogOpExecutor.alterTableSetCached().

static long com.cloudera.impala.util.HdfsCachingUtil.modifyCacheDirective ( Long id, org.apache.hadoop.hive.metastore.api.Table table, String poolName, short replication ) throws ImpalaRuntimeException
inline static

Updates the cache directive for a table and updates the metastore parameters. Returns the cache directive ID.

Definition at line 300 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME.

static long com.cloudera.impala.util.HdfsCachingUtil.modifyCacheDirective ( Long id, HdfsPartition part, String poolName, short replication ) throws ImpalaRuntimeException
inline static

Updates the cache directive for a partition and updates the metastore parameters. Returns the cache directive ID.

Definition at line 315 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME.

static void com.cloudera.impala.util.HdfsCachingUtil.modifyCacheDirective ( Long id, Path path, String poolName, short replication ) throws ImpalaRuntimeException
inline static private

Updates an existing cache directive to avoid having the same entry multiple times.

Definition at line 329 of file HdfsCachingUtil.java.

References path().

static void com.cloudera.impala.util.HdfsCachingUtil.removeDirective ( long  directiveId) throws ImpalaRuntimeException
inline static private

Removes the given cache directive if it exists, uncaching the data. If the cache request does not exist in HDFS no error is returned. Throws an ImpalaRuntimeException if there was any problem removing the directive.

Definition at line 354 of file HdfsCachingUtil.java.

static long com.cloudera.impala.util.HdfsCachingUtil.submitCachePartitionDirective ( HdfsPartition part, String poolName, short replication ) throws ImpalaRuntimeException
inline static

Caches the location of the given partition and updates the partition's properties with the submitted cache directive ID. The caller is responsible for not caching the same partition twice, as HDFS will create a second cache directive even if it is similar to an already existing one.

Returns the ID of the submitted cache directive and throws if there is an error submitting the directive.

Definition at line 93 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME.

static long com.cloudera.impala.util.HdfsCachingUtil.submitCachePartitionDirective ( org.apache.hadoop.hive.metastore.api.Partition part, String poolName, short replication ) throws ImpalaRuntimeException
inline static

static long com.cloudera.impala.util.HdfsCachingUtil.submitCacheTblDirective ( org.apache.hadoop.hive.metastore.api.Table table, String poolName, short replication ) throws ImpalaRuntimeException
inline static

Caches the location of the given Hive Metastore Table and updates the table's properties with the submitted cache directive ID. The caller is responsible for not caching the same table twice, as HDFS will create a second cache directive even if it is similar to an already existing one.

Returns the ID of the submitted cache directive and throws if there is an error submitting.
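The caller-side responsibility noted above (not caching the same table twice) can be sketched as a guard on the table's parameter map. Everything here is a hypothetical stand-in: FakeCacheManager models only the documented property that every submission yields a new directive, and submitIfNotCached is an illustrative helper, not a method of HdfsCachingUtil.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; no real HDFS or Hive Metastore calls are made.
final class SubmitOnceSketch {
  // Same key as CACHE_DIR_ID_PROP_NAME in this class reference.
  static final String CACHE_DIR_ID_PROP_NAME = "cache_directive_id";

  // Hypothetical in-memory stand-in for HDFS: every submission gets a
  // fresh ID, even when path and pool match an existing directive.
  static final class FakeCacheManager {
    private long nextId = 1;
    final Map<Long, String> directives = new HashMap<>();

    long submit(String path, String pool) {
      long id = nextId++;
      directives.put(id, pool + ":" + path);
      return id;
    }
  }

  // Submits a directive only if params carries no directive ID yet,
  // then records the new ID in params. Returns the ID in effect.
  static long submitIfNotCached(FakeCacheManager hdfs, Map<String, String> params,
                                String path, String pool) {
    String existing = params.get(CACHE_DIR_ID_PROP_NAME);
    if (existing != null) {
      return Long.parseLong(existing);  // already cached: do not submit again
    }
    long id = hdfs.submit(path, pool);
    params.put(CACHE_DIR_ID_PROP_NAME, Long.toString(id));
    return id;
  }
}
```

Calling submit twice directly on the stand-in leaves two directives behind, while routing both calls through the guard leaves one.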

Definition at line 74 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME.

static long com.cloudera.impala.util.HdfsCachingUtil.submitDirective ( Path path, String poolName, short replication ) throws ImpalaRuntimeException
inline static private

Submits a new caching directive for the specified cache pool name, path and replication. Returns the directive ID if the submission succeeds, or throws an ImpalaRuntimeException if it fails.

Definition at line 279 of file HdfsCachingUtil.java.

References path().

static void com.cloudera.impala.util.HdfsCachingUtil.uncachePartition ( HdfsPartition  part) throws ImpalaException
inline static

Removes the cache directive associated with the partition from HDFS, uncaching all data. Also updates the partition's metadata to remove the cache directive ID. No-op if the partition is not cached.

Definition at line 136 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.getCacheDirectiveId().

static void com.cloudera.impala.util.HdfsCachingUtil.uncachePartition ( org.apache.hadoop.hive.metastore.api.Partition part) throws ImpalaException
inline static

static void com.cloudera.impala.util.HdfsCachingUtil.uncacheTbl ( org.apache.hadoop.hive.metastore.api.Table table) throws ImpalaRuntimeException
inline static

Removes the cache directive associated with the table from HDFS, uncaching all data. Also updates the table's metadata. No-op if the table is not cached.

Definition at line 120 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME, and com.cloudera.impala.util.HdfsCachingUtil.getCacheDirectiveId().

static boolean com.cloudera.impala.util.HdfsCachingUtil.validateCacheParams ( Map< String, String >  params)
inline static

Checks whether the parameter map contains a cache directive ID and validates that ID against the NameNode to make sure it exists, returning true if it does. If the cache directive ID does not exist, the value is removed from the parameter map, a log message is issued, and false is returned. Because the value is not written back to the Hive Metastore from this method, the result is only valid until the next metadata fetch. Finally, the cache replication factor in the parameters is updated with the value read from HDFS.
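The validation flow described above can be sketched without a NameNode by passing the directive lookup in as a function. This is an illustration under stated assumptions, not the Impala code: the lookupReplication parameter is a hypothetical stand-in for the HDFS query (it returns null when the directive does not exist), and returning false for a map with no directive ID is an assumption the description does not spell out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Illustrative sketch of the documented flow; not the Impala implementation.
final class ValidateCacheParamsSketch {
  // Same keys as the constants listed in this class reference.
  static final String CACHE_DIR_ID_PROP_NAME = "cache_directive_id";
  static final String CACHE_DIR_REPLICATION_PROP_NAME = "cache_replication";

  // lookupReplication is a hypothetical stand-in for asking HDFS about a
  // directive: it returns the replication factor, or null if the ID is unknown.
  static boolean validateCacheParams(Map<String, String> params,
                                     LongFunction<Short> lookupReplication) {
    String idStr = params.get(CACHE_DIR_ID_PROP_NAME);
    if (idStr == null) return false;  // assumption: nothing to validate
    long id;
    try {
      id = Long.parseLong(idStr);
    } catch (NumberFormatException e) {
      return false;  // assumption: an unparseable ID counts as invalid
    }
    Short replication = lookupReplication.apply(id);
    if (replication == null) {
      // Stale ID: drop it from the in-memory map and log. Nothing is
      // written back to the metastore here.
      params.remove(CACHE_DIR_ID_PROP_NAME);
      System.err.println("Cache directive " + id + " not found; removing stale ID");
      return false;
    }
    // Refresh the replication factor with the value reported by the lookup.
    params.put(CACHE_DIR_REPLICATION_PROP_NAME, String.valueOf(replication));
    return true;
  }
}
```

Because the stale ID is only removed from the in-memory map, a later metadata fetch would bring it back, which is why the description limits the result's validity to the next fetch.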

Definition at line 470 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_ID_PROP_NAME, com.cloudera.impala.util.HdfsCachingUtil.CACHE_DIR_REPLICATION_PROP_NAME, com.cloudera.impala.util.HdfsCachingUtil.getCacheDirectiveId(), and com.cloudera.impala.util.HdfsCachingUtil.getDirective().

static void com.cloudera.impala.util.HdfsCachingUtil.validateCachePool ( THdfsCachingOp op, Long directiveId, TableName table, HdfsPartition partition ) throws ImpalaRuntimeException
inline static

Validates the properties of the chosen cache pool. Throws on error.

Definition at line 434 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.getDirective().

Referenced by com.cloudera.impala.util.HdfsCachingUtil.validateCachePool().

static void com.cloudera.impala.util.HdfsCachingUtil.validateCachePool ( THdfsCachingOp op, Long directiveId, TableName table ) throws ImpalaRuntimeException
inline static

Validates the properties of the chosen cache pool. Throws on error.

Definition at line 456 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.validateCachePool().

static void com.cloudera.impala.util.HdfsCachingUtil.waitForDirective ( long  directiveId) throws ImpalaRuntimeException
inline static

Waits for a cache directive to either complete or stop making progress. Progress is checked by polling the HDFS caching stats every DFS_NAMENODE_PATH_BASED_CACHE_REFRESH_INTERVAL_MS. The function verifies that the request's "currentBytesCached" is increasing toward "bytesNeeded". It returns when "currentBytesCached" == "bytesNeeded" or when no progress is made for MAX_UNCHANGED_CACHING_REFRESH_INTERVALS refresh intervals.
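The polling behavior described above can be sketched as a loop with a "no progress" counter. This is a hedged illustration, not the Impala source: the Stats class and the poll supplier are hypothetical stand-ins for the HDFS caching stats, the refresh interval is passed in rather than read from configuration, and the sketch returns a boolean (true when fully cached) purely so the outcome is observable, whereas the documented method returns void.

```java
import java.util.function.Supplier;

// Illustrative sketch of the documented polling loop; not the Impala implementation.
final class WaitForDirectiveSketch {
  // Same value as the constant listed in this class reference.
  static final int MAX_UNCHANGED_CACHING_REFRESH_INTERVALS = 5;

  // Hypothetical snapshot of a directive's caching stats.
  static final class Stats {
    final long bytesNeeded;
    final long currentBytesCached;

    Stats(long bytesNeeded, long currentBytesCached) {
      this.bytesNeeded = bytesNeeded;
      this.currentBytesCached = currentBytesCached;
    }
  }

  // Polls until the directive is fully cached or until the cached byte count
  // has not grown for MAX_UNCHANGED_CACHING_REFRESH_INTERVALS polls in a row.
  static boolean waitForDirective(Supplier<Stats> poll, long refreshIntervalMs) {
    Stats stats = poll.get();
    if (stats.currentBytesCached >= stats.bytesNeeded) return true;
    long lastCached = stats.currentBytesCached;
    int unchangedIntervals = 0;
    while (unchangedIntervals < MAX_UNCHANGED_CACHING_REFRESH_INTERVALS) {
      try {
        Thread.sleep(refreshIntervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve the interrupt and give up
        return false;
      }
      stats = poll.get();
      if (stats.currentBytesCached >= stats.bytesNeeded) return true;
      if (stats.currentBytesCached > lastCached) {
        lastCached = stats.currentBytesCached;  // progress: reset the counter
        unchangedIntervals = 0;
      } else {
        unchangedIntervals++;
      }
    }
    return false;  // stopped making progress
  }
}
```

Resetting the counter on any growth means a slow but steadily progressing directive is waited on indefinitely, while a stalled one is abandoned after a bounded number of polls.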

Definition at line 220 of file HdfsCachingUtil.java.

References com.cloudera.impala.util.HdfsCachingUtil.getDirective(), and com.cloudera.impala.util.HdfsCachingUtil.MAX_UNCHANGED_CACHING_REFRESH_INTERVALS.

Member Data Documentation

final DistributedFileSystem com.cloudera.impala.util.HdfsCachingUtil.dfs
static private

Definition at line 55 of file HdfsCachingUtil.java.

final Logger com.cloudera.impala.util.HdfsCachingUtil.LOG = Logger.getLogger(HdfsCachingUtil.class)
static private

Definition at line 43 of file HdfsCachingUtil.java.

final int com.cloudera.impala.util.HdfsCachingUtil.MAX_UNCHANGED_CACHING_REFRESH_INTERVALS = 5
static private

The documentation for this class was generated from the following file: