Impala
Impala is the open source, native analytic database for Apache Hadoop.
com.cloudera.impala.catalog.Catalog Class Reference (abstract)

Public Member Functions

 Catalog (boolean initMetastoreClientPool)
 
Db getBuiltinsDb ()
 
Db addDb (Db db)
 
Db getDb (String dbName)
 
Db removeDb (String dbName)
 
List< String > getDbNames (String dbPattern)
 
Table getTable (String dbName, String tableName) throws CatalogException
 
Table removeTable (TTableName tableName)
 
List< String > getTableNames (String dbName, String tablePattern) throws DatabaseNotFoundException
 
boolean containsTable (String dbName, String tableName)
 
boolean addDataSource (DataSource dataSource)
 
DataSource removeDataSource (String dataSourceName)
 
DataSource getDataSource (String dataSourceName)
 
List< DataSource > getDataSources ()
 
List< String > getDataSourceNames (String pattern)
 
List< DataSource > getDataSources (String pattern)
 
boolean addFunction (Function fn)
 
Function getFunction (Function desc, Function.CompareMode mode)
 
Function removeFunction (Function desc)
 
boolean containsFunction (FunctionName name)
 
boolean addHdfsCachePool (HdfsCachePool cachePool)
 
HdfsCachePool getHdfsCachePool (String poolName)
 
void close ()
 
MetaStoreClient getMetaStoreClient ()
 
HdfsPartition getHdfsPartition (String dbName, String tableName, List< TPartitionKeyValue > partitionSpec) throws CatalogException
 
boolean containsHdfsPartition (String dbName, String tableName, List< TPartitionKeyValue > partitionSpec) throws CatalogException
 
TCatalogObject getTCatalogObject (TCatalogObject objectDesc) throws CatalogException
 

Static Public Member Functions

static Function getBuiltin (Function desc, Function.CompareMode mode)
 

Static Public Attributes

static final long INITIAL_CATALOG_VERSION = 0L
 
static final String DEFAULT_DB = "default"
 
static final String BUILTINS_DB = "_impala_builtins"
 

Protected Attributes

final MetaStoreClientPool metaStoreClientPool_ = new MetaStoreClientPool(0)
 
AuthorizationPolicy authPolicy_ = new AuthorizationPolicy()
 
AtomicReference< ConcurrentHashMap< String, Db > > dbCache_
 
final CatalogObjectCache< DataSource > dataSources_
 
final CatalogObjectCache< HdfsCachePool > hdfsCachePools_
 

Private Member Functions

List< String > filterStringsByPattern (Iterable< String > candidates, String matchPattern)
 

Static Private Attributes

static final Logger LOG = Logger.getLogger(Catalog.class)
 
static final int META_STORE_CLIENT_POOL_SIZE = 5
 
static Db builtinsDb_
 

Detailed Description

Thread safe interface for reading and updating metadata stored in the Hive MetaStore. This class provides a storage API for caching CatalogObjects: databases, tables, and functions, along with their relevant metadata. Although this class is thread safe, it does not guarantee consistency with the MetaStore. Keep in mind that external (potentially conflicting) concurrent metastore updates may occur at any time.

The CatalogObject storage hierarchy is: Catalog -> Db -> Table, and Catalog -> Db -> Function. Each level has its own synchronization, so the cache of Dbs is synchronized and each Db has a cache of tables which is synchronized independently.

The catalog is populated with the impala builtins on startup. Builtins and user functions are treated identically by the catalog. The builtins go in a specific database that the user cannot modify. Builtins are populated on startup in initBuiltins().
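
The per-level synchronization described above can be pictured as nested concurrent maps. The following is a minimal sketch under that assumption; CatalogHierarchySketch and DbSketch are hypothetical stand-ins, table metadata is reduced to a String, and nothing here is taken from Catalog.java.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-ins for Catalog and Db; not the Impala classes.
final class CatalogHierarchySketch {
    // One database: owns its own, independently synchronized table cache.
    static final class DbSketch {
        final String name;
        final ConcurrentHashMap<String, String> tables = new ConcurrentHashMap<>();

        DbSketch(String name) {
            this.name = name;
        }
    }

    // Top level: the cache of databases has its own synchronization.
    final ConcurrentHashMap<String, DbSketch> dbs = new ConcurrentHashMap<>();

    // Returns true only if both the database and the table are cached.
    boolean containsTable(String dbName, String tableName) {
        DbSketch db = dbs.get(dbName);
        return db != null && db.tables.containsKey(tableName);
    }

    public static void main(String[] args) {
        CatalogHierarchySketch catalog = new CatalogHierarchySketch();
        DbSketch db = new DbSketch("default");
        db.tables.put("t1", "metadata for t1");
        catalog.dbs.put(db.name, db);
        System.out.println(catalog.containsTable("default", "t1"));
        System.out.println(catalog.containsTable("default", "missing"));
        System.out.println(catalog.containsTable("nodb", "t1"));
    }
}
```

Because each map is synchronized on its own, a lookup in one database does not block updates to the table cache of another.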

Definition at line 53 of file Catalog.java.

Constructor & Destructor Documentation

com.cloudera.impala.catalog.Catalog.Catalog ( boolean  initMetastoreClientPool)
inline

Creates a new instance of a Catalog. If initMetastoreClientPool is true, will also add META_STORE_CLIENT_POOL_SIZE clients to metaStoreClientPool_.

Definition at line 91 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.addDb(), com.cloudera.impala.catalog.Catalog.BUILTINS_DB, com.cloudera.impala.catalog.Catalog.builtinsDb_, com.cloudera.impala.catalog.Catalog.dataSources_, and com.cloudera.impala.catalog.Catalog.META_STORE_CLIENT_POOL_SIZE.

Member Function Documentation

boolean com.cloudera.impala.catalog.Catalog.addDataSource ( DataSource  dataSource)
inline

Adds a data source to the in-memory map of data sources. It is not persisted to the metastore.

Returns
true if this item was added or false if the existing value was preserved.
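
The return contract above reads like add-if-absent semantics. A minimal sketch of that behavior, assuming a ConcurrentHashMap as the backing store; DataSourceCacheSketch is a hypothetical name, data sources are reduced to Strings, and this is not the CatalogObjectCache implementation.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory cache; not Impala's CatalogObjectCache.
final class DataSourceCacheSketch {
    private final ConcurrentHashMap<String, String> dataSources = new ConcurrentHashMap<>();

    // Returns true if the entry was added, false if an existing entry was preserved.
    boolean add(String name, String dataSource) {
        return dataSources.putIfAbsent(name, dataSource) == null;
    }

    // Returns the cached entry, or null if there is none.
    String get(String name) {
        return dataSources.get(name);
    }

    // Returns the removed entry, or null if nothing was cached under this name.
    String remove(String name) {
        return dataSources.remove(name);
    }

    public static void main(String[] args) {
        DataSourceCacheSketch cache = new DataSourceCacheSketch();
        System.out.println(cache.add("src", "first"));
        System.out.println(cache.add("src", "second"));
        System.out.println(cache.get("src"));
        System.out.println(cache.remove("src"));
        System.out.println(cache.remove("src"));
    }
}
```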

Definition at line 197 of file Catalog.java.

Db com.cloudera.impala.catalog.Catalog.addDb ( Db  db)
inline

Adds a new database to the catalog, replacing any existing database with the same name. Returns the previous database with this name, or null if there was no previous database.
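
A minimal sketch of the replace-and-return-previous contract, combined with the case-insensitive lookup that getDb() documents. Lower-casing the key is an assumption about how case-insensitivity could be achieved; DbCacheSketch is a hypothetical name and Db objects are reduced to Strings.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical database cache; not the Impala implementation.
final class DbCacheSketch {
    private final ConcurrentHashMap<String, String> dbCache = new ConcurrentHashMap<>();

    // Assumption: names are normalized to lower case before use as keys.
    private static String key(String dbName) {
        return dbName.toLowerCase(Locale.ROOT);
    }

    // Replaces any existing entry; returns the previous entry or null.
    String addDb(String dbName, String db) {
        return dbCache.put(key(dbName), db);
    }

    // Case-insensitive lookup; returns null if no matching database is cached.
    String getDb(String dbName) {
        return dbCache.get(key(dbName));
    }

    // Returns the removed entry, or null if no database was removed.
    String removeDb(String dbName) {
        return dbCache.remove(key(dbName));
    }

    public static void main(String[] args) {
        DbCacheSketch cache = new DbCacheSketch();
        System.out.println(cache.addDb("Sales", "v1"));
        System.out.println(cache.addDb("sales", "v2"));
        System.out.println(cache.getDb("SALES"));
        System.out.println(cache.removeDb("sales"));
    }
}
```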

Definition at line 107 of file Catalog.java.

References com.cloudera.impala.catalog.Db.getName().

Referenced by com.cloudera.impala.catalog.Catalog.Catalog().

boolean com.cloudera.impala.catalog.Catalog.addFunction ( Function  fn)
inline

Adds a function to the catalog. Returns true if the function was successfully added, or false if the function already exists. TODO: allow adding a function to a global scope. We probably want this to resolve after the local scope; e.g., given fn() and db.fn(), if the current database is 'db', fn() would resolve first to db.fn().

Definition at line 259 of file Catalog.java.

References com.cloudera.impala.catalog.Function.dbName(), and com.cloudera.impala.catalog.Catalog.getDb().

boolean com.cloudera.impala.catalog.Catalog.addHdfsCachePool ( HdfsCachePool  cachePool)
inline

Adds a new HdfsCachePool to the catalog.

Definition at line 304 of file Catalog.java.

void com.cloudera.impala.catalog.Catalog.close ( )
inline

Release the Hive Meta Store Client resources. Can be called multiple times (additional calls will be no-ops).
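
One common way to make repeated close() calls no-ops is a compare-and-set guard. The sketch below illustrates that idea only; it is an assumption, not a description of how Catalog.close() or the client pool is implemented, and IdempotentCloseSketch is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource holder with an idempotent close().
final class IdempotentCloseSketch {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    // Counts how many times the underlying resources were actually released.
    private final AtomicInteger releaseCount = new AtomicInteger();

    // Only the first call releases resources; later calls do nothing.
    void close() {
        if (closed.compareAndSet(false, true)) {
            releaseCount.incrementAndGet();
        }
    }

    int releaseCount() {
        return releaseCount.get();
    }

    public static void main(String[] args) {
        IdempotentCloseSketch resource = new IdempotentCloseSketch();
        resource.close();
        resource.close();
        System.out.println(resource.releaseCount());
    }
}
```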

Definition at line 320 of file Catalog.java.

boolean com.cloudera.impala.catalog.Catalog.containsFunction ( FunctionName  name)
inline

Returns true if there is a function with this function name. Parameters are ignored.

Definition at line 295 of file Catalog.java.

References com.cloudera.impala.analysis.FunctionName.getDb(), and com.cloudera.impala.catalog.Catalog.getDb().

boolean com.cloudera.impala.catalog.Catalog.containsHdfsPartition ( String  dbName,
String  tableName,
List< TPartitionKeyValue >  partitionSpec 
) throws CatalogException
inline

Returns true if the table contains the given partition spec, otherwise false. This may trigger a metadata load if the table metadata is not yet cached.

Exceptions
DatabaseNotFoundException- If the database does not exist.
TableNotFoundException- If the table does not exist.
TableLoadingException- If there is an error loading the table metadata.
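
Since this method references getHdfsPartition(), one plausible shape is a lookup that turns a "partition not found" error into false while letting other catalog errors propagate. The sketch below assumes that shape; all class names are hypothetical stand-ins, and partition specs are reduced to Strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; none of these are the Impala classes.
final class PartitionLookupSketch {
    static class CatalogExceptionSketch extends Exception {
        CatalogExceptionSketch(String message) {
            super(message);
        }
    }

    static final class PartitionNotFoundSketch extends CatalogExceptionSketch {
        PartitionNotFoundSketch(String message) {
            super(message);
        }
    }

    private final Map<String, String> partitions = new HashMap<>();

    void addPartition(String spec, String partition) {
        partitions.put(spec, partition);
    }

    // Throws if the partition does not exist, mirroring the documented exception list.
    String getPartition(String spec) throws CatalogExceptionSketch {
        String partition = partitions.get(spec);
        if (partition == null) {
            throw new PartitionNotFoundSketch("Partition not found: " + spec);
        }
        return partition;
    }

    // "Not found" becomes false; any other catalog error still propagates.
    boolean containsPartition(String spec) throws CatalogExceptionSketch {
        try {
            getPartition(spec);
            return true;
        } catch (PartitionNotFoundSketch e) {
            return false;
        }
    }

    public static void main(String[] args) throws CatalogExceptionSketch {
        PartitionLookupSketch table = new PartitionLookupSketch();
        table.addPartition("year=2014", "partition metadata");
        System.out.println(table.containsPartition("year=2014"));
        System.out.println(table.containsPartition("year=1999"));
    }
}
```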

Definition at line 384 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.getHdfsPartition().

Referenced by com.cloudera.impala.service.CatalogOpExecutor.alterTableAddPartition(), and com.cloudera.impala.service.CatalogOpExecutor.alterTableDropPartition().

boolean com.cloudera.impala.catalog.Catalog.containsTable ( String  dbName,
String  tableName 
)
inline

Returns true if the table and the database exist in the Impala Catalog. Returns false if either the table or the database does not exist.

Definition at line 187 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.getDb().

Referenced by com.cloudera.impala.service.CatalogOpExecutor.createTable(), com.cloudera.impala.service.CatalogOpExecutor.createTableLike(), and com.cloudera.impala.service.CatalogOpExecutor.createView().

List<String> com.cloudera.impala.catalog.Catalog.filterStringsByPattern ( Iterable< String >  candidates,
String  matchPattern 
)
inline, private

Implement Hive's pattern-matching semantics for SHOW statements. The only metacharacters are '*' which matches any string of characters, and '|' which denotes choice. Doing the work here saves loading tables or databases from the metastore (which Hive would do if we passed the call through to the metastore client).

If matchPattern is null, all strings are considered to match. If it is the empty string, no strings match.
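
The matching rules above can be sketched by translating the pattern into a regular expression. This is an illustrative sketch, not the PatternMatcher implementation: PatternFilterSketch and its method names are hypothetical, and matching here is case-sensitive, which the text above does not specify either way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; this is not Impala's PatternMatcher.
final class PatternFilterSketch {
    // Translates a SHOW-style pattern into a regex: '|' separates alternatives,
    // '*' matches any run of characters, everything else is taken literally.
    static Pattern toRegex(String matchPattern) {
        List<String> alternatives = new ArrayList<>();
        for (String alternative : matchPattern.split("\\|", -1)) {
            List<String> literals = new ArrayList<>();
            for (String literal : alternative.split("\\*", -1)) {
                literals.add(Pattern.quote(literal));
            }
            alternatives.add("(?:" + String.join(".*", literals) + ")");
        }
        return Pattern.compile(String.join("|", alternatives));
    }

    // A null pattern matches every candidate; an empty pattern matches none.
    static List<String> filter(Iterable<String> candidates, String matchPattern) {
        List<String> matches = new ArrayList<>();
        if (matchPattern != null && matchPattern.isEmpty()) {
            return matches;
        }
        Pattern regex = (matchPattern == null) ? null : toRegex(matchPattern);
        for (String candidate : candidates) {
            if (regex == null || regex.matcher(candidate).matches()) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tpch", "tpcds", "default");
        System.out.println(filter(names, "tp*"));
        System.out.println(filter(names, "tpch|default"));
        System.out.println(filter(names, null));
        System.out.println(filter(names, ""));
    }
}
```

Quoting each literal piece keeps regex metacharacters in a name (such as '.') from being interpreted, so only '*' and '|' carry special meaning.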

Definition at line 338 of file Catalog.java.

References com.cloudera.impala.util.PatternMatcher.matches().

Referenced by com.cloudera.impala.catalog.Catalog.getDataSourceNames(), com.cloudera.impala.catalog.Catalog.getDataSources(), com.cloudera.impala.catalog.Catalog.getDbNames(), and com.cloudera.impala.catalog.Catalog.getTableNames().

static Function com.cloudera.impala.catalog.Catalog.getBuiltin ( Function  desc,
Function.CompareMode  mode 
)
inline, static

Definition at line 276 of file Catalog.java.

Db com.cloudera.impala.catalog.Catalog.getBuiltinsDb ( )
inline

Definition at line 100 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.builtinsDb_.

DataSource com.cloudera.impala.catalog.Catalog.getDataSource ( String  dataSourceName)
inline

List<String> com.cloudera.impala.catalog.Catalog.getDataSourceNames ( String  pattern)
inline

Returns a list of data source names that match pattern. See filterStringsByPattern for details of the pattern match semantics.

pattern may be null (and thus matches everything).

Definition at line 231 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.dataSources_, and com.cloudera.impala.catalog.Catalog.filterStringsByPattern().

List<DataSource> com.cloudera.impala.catalog.Catalog.getDataSources ( )
inline

Gets a list of all data sources.

Definition at line 221 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.CatalogServiceCatalog.getCatalogObjects().

List<DataSource> com.cloudera.impala.catalog.Catalog.getDataSources ( String  pattern)
inline

Returns a list of data sources that match pattern. See filterStringsByPattern for details of the pattern match semantics.

pattern may be null (and thus matches everything).

Definition at line 241 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.dataSources_, com.cloudera.impala.catalog.Catalog.filterStringsByPattern(), and impala.name.

Db com.cloudera.impala.catalog.Catalog.getDb ( String  dbName)
inline

Gets the Db object from the Catalog using a case-insensitive lookup on the name. Returns null if no matching database is found.

Definition at line 115 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.ImpaladCatalog.addDb(), com.cloudera.impala.catalog.Catalog.addFunction(), com.cloudera.impala.catalog.ImpaladCatalog.addFunction(), com.cloudera.impala.catalog.CatalogServiceCatalog.addFunction(), com.cloudera.impala.catalog.CatalogServiceCatalog.addPartition(), com.cloudera.impala.catalog.ImpaladCatalog.addTable(), com.cloudera.impala.catalog.CatalogServiceCatalog.addTable(), com.cloudera.impala.catalog.Catalog.containsFunction(), com.cloudera.impala.catalog.Catalog.containsTable(), com.cloudera.impala.service.CatalogOpExecutor.createDatabase(), com.cloudera.impala.catalog.TableLoadingMgr.LoadRequest.get(), com.cloudera.impala.catalog.CatalogServiceCatalog.getCatalogObjects(), com.cloudera.impala.catalog.Catalog.getFunction(), com.cloudera.impala.catalog.CatalogServiceCatalog.getFunctions(), com.cloudera.impala.testutil.ImpaladTestCatalog.getTable(), com.cloudera.impala.catalog.Catalog.getTable(), com.cloudera.impala.catalog.Catalog.getTableNames(), com.cloudera.impala.catalog.Catalog.getTCatalogObject(), com.cloudera.impala.testutil.ImpaladTestCatalog.ImpaladTestCatalog(), com.cloudera.impala.catalog.CatalogServiceCatalog.invalidateTable(), com.cloudera.impala.catalog.ImpaladCatalog.removeDb(), com.cloudera.impala.catalog.Catalog.removeFunction(), com.cloudera.impala.catalog.ImpaladCatalog.removeFunction(), com.cloudera.impala.catalog.Catalog.removeTable(), com.cloudera.impala.catalog.ImpaladCatalog.removeTable(), com.cloudera.impala.catalog.CatalogServiceCatalog.removeTable(), com.cloudera.impala.catalog.CatalogServiceCatalog.renameTable(), com.cloudera.impala.catalog.CatalogServiceCatalog.replaceTableIfUnchanged(), com.cloudera.impala.catalog.CatalogTest.testStats(), com.cloudera.impala.analysis.AuthorizationTest.TestTPCHCleanup(), and com.cloudera.impala.catalog.CatalogServiceCatalog.updateLastDdlTime().

List<String> com.cloudera.impala.catalog.Catalog.getDbNames ( String  dbPattern)
inline

Returns a list of databases that match dbPattern. See filterStringsByPattern for details of the pattern match semantics.

dbPattern may be null (and thus matches everything).

Definition at line 136 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.dbCache_, and com.cloudera.impala.catalog.Catalog.filterStringsByPattern().

Referenced by com.cloudera.impala.catalog.CatalogServiceCatalog.getCatalogObjects(), com.cloudera.impala.testutil.ImpaladTestCatalog.ImpaladTestCatalog(), and com.cloudera.impala.testutil.BlockIdGenerator.main().

Function com.cloudera.impala.catalog.Catalog.getFunction ( Function  desc,
Function.CompareMode  mode 
)
inline

Returns the function that best matches 'desc' that is registered with the catalog using 'mode' to check for matching. If desc matches multiple functions in the catalog, it will return the function with the strictest matching mode.

Definition at line 270 of file Catalog.java.

References com.cloudera.impala.catalog.Function.dbName(), and com.cloudera.impala.catalog.Catalog.getDb().

Referenced by com.cloudera.impala.catalog.Catalog.getTCatalogObject().

HdfsCachePool com.cloudera.impala.catalog.Catalog.getHdfsCachePool ( String  poolName)
inline

Gets an HdfsCachePool given a cache pool name. Returns null if the cache pool does not exist.

Definition at line 312 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.Catalog.getTCatalogObject().

HdfsPartition com.cloudera.impala.catalog.Catalog.getHdfsPartition ( String  dbName,
String  tableName,
List< TPartitionKeyValue >  partitionSpec 
) throws CatalogException
inline

Returns the HdfsPartition object for the given dbName/tableName and partition spec. This will trigger a metadata load if the table metadata is not yet cached.

Exceptions
DatabaseNotFoundException- If the database does not exist.
TableNotFoundException- If the table does not exist.
PartitionNotFoundException- If the partition does not exist.
TableLoadingException- If there is an error loading the table metadata.

Definition at line 361 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.getTable().

Referenced by com.cloudera.impala.catalog.Catalog.containsHdfsPartition().

MetaStoreClient com.cloudera.impala.catalog.Catalog.getMetaStoreClient ( )
inline

Returns a managed meta store client from the client connection pool.

Definition at line 326 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.ImpaladCatalog.getTablePath(), and com.cloudera.impala.catalog.CatalogServiceCatalog.invalidateTable().

Table com.cloudera.impala.catalog.Catalog.getTable ( String  dbName,
String  tableName 
) throws CatalogException
inline

List<String> com.cloudera.impala.catalog.Catalog.getTableNames ( String  dbName,
String  tablePattern 
) throws DatabaseNotFoundException
inline

Returns a list of tables in the supplied database that match tablePattern. See filterStringsByPattern for details of the pattern match semantics.

dbName must not be null, but tablePattern may be null (and thus matches everything).

Table names are returned unqualified.

Definition at line 173 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.filterStringsByPattern(), and com.cloudera.impala.catalog.Catalog.getDb().

TCatalogObject com.cloudera.impala.catalog.Catalog.getTCatalogObject ( TCatalogObject  objectDesc) throws CatalogException
inline

Gets the thrift representation of a catalog object, given the "object description". The object description is just a TCatalogObject with only the catalog object type and object name set. If the object is not found, a CatalogException is thrown.

Definition at line 399 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.getDataSource(), com.cloudera.impala.catalog.Catalog.getDb(), com.cloudera.impala.catalog.Catalog.getFunction(), com.cloudera.impala.catalog.Catalog.getHdfsCachePool(), com.cloudera.impala.catalog.Role.getName(), com.cloudera.impala.catalog.Role.getPrivileges(), com.cloudera.impala.catalog.Catalog.getTable(), and pool.

DataSource com.cloudera.impala.catalog.Catalog.removeDataSource ( String  dataSourceName)
inline

Removes a data source from the in-memory map of data sources.

Returns
the item that was removed if it existed in the cache, null otherwise.

Definition at line 205 of file Catalog.java.

Db com.cloudera.impala.catalog.Catalog.removeDb ( String  dbName)
inline

Removes a database from the metadata cache. Returns the value removed, or null if no database was removed as part of this operation. Used by DROP DATABASE statements.

Definition at line 126 of file Catalog.java.

Function com.cloudera.impala.catalog.Catalog.removeFunction ( Function  desc)
inline

Removes a function from the catalog. Increments the catalog version and returns the Function object that was removed if the function existed, otherwise returns null.

Definition at line 285 of file Catalog.java.

References com.cloudera.impala.catalog.Function.dbName(), and com.cloudera.impala.catalog.Catalog.getDb().

Table com.cloudera.impala.catalog.Catalog.removeTable ( TTableName  tableName)
inline

Removes a table from the catalog and returns the table that was removed, or null if the table/database does not exist.

Definition at line 157 of file Catalog.java.

References com.cloudera.impala.catalog.Catalog.getDb().

Member Data Documentation

Db com.cloudera.impala.catalog.Catalog.builtinsDb_
static, private

AtomicReference<ConcurrentHashMap<String, Db> > com.cloudera.impala.catalog.Catalog.dbCache_
protected
Initial value:
=
new AtomicReference<ConcurrentHashMap<String, Db>>(
new ConcurrentHashMap<String, Db>())

Definition at line 72 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.Catalog.getDbNames(), and com.cloudera.impala.catalog.CatalogServiceCatalog.reset().
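
Holding the map inside an AtomicReference makes it possible to replace the whole cache in a single step. Whether CatalogServiceCatalog.reset() works this way is an assumption; the sketch below only illustrates the pattern, with DbCacheSwapSketch as a hypothetical name and Db objects reduced to Strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of swapping an entire cache atomically.
final class DbCacheSwapSketch {
    private final AtomicReference<ConcurrentHashMap<String, String>> dbCache =
        new AtomicReference<>(new ConcurrentHashMap<String, String>());

    void put(String name, String db) {
        dbCache.get().put(name, db);
    }

    String get(String name) {
        return dbCache.get().get(name);
    }

    // Readers see either the old map or the new one, never a half-built cache.
    void reset(ConcurrentHashMap<String, String> freshCache) {
        dbCache.set(freshCache);
    }

    public static void main(String[] args) {
        DbCacheSwapSketch catalog = new DbCacheSwapSketch();
        catalog.put("old_db", "stale metadata");

        // Build the replacement off to the side, then publish it in one step.
        ConcurrentHashMap<String, String> fresh = new ConcurrentHashMap<>();
        fresh.put("new_db", "fresh metadata");
        catalog.reset(fresh);

        System.out.println(catalog.get("old_db"));
        System.out.println(catalog.get("new_db"));
    }
}
```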

final CatalogObjectCache<HdfsCachePool> com.cloudera.impala.catalog.Catalog.hdfsCachePools_
protected
Initial value:
=
new CatalogObjectCache<HdfsCachePool>(false)

Definition at line 84 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.CatalogServiceCatalog.getCatalogObjects(), and com.cloudera.impala.catalog.CatalogServiceCatalog.CachePoolReader.run().

final Logger com.cloudera.impala.catalog.Catalog.LOG = Logger.getLogger(Catalog.class)
static, private

Definition at line 54 of file Catalog.java.

final int com.cloudera.impala.catalog.Catalog.META_STORE_CLIENT_POOL_SIZE = 5
static, private

Definition at line 59 of file Catalog.java.

Referenced by com.cloudera.impala.catalog.Catalog.Catalog().

final MetaStoreClientPool com.cloudera.impala.catalog.Catalog.metaStoreClientPool_ = new MetaStoreClientPool(0)
protected

Definition at line 63 of file Catalog.java.


The documentation for this class was generated from the following file: