Impala
Impala is the open source, native analytic database for Apache Hadoop.
|
#include <lib-cache.h>
Classes | |
struct | LibCacheEntry |
Public Types | |
enum | LibType { TYPE_SO, TYPE_IR, TYPE_JAR } |
Public Member Functions | |
~LibCache () | |
Calls dlclose on all cached handles. | |
Status | GetLocalLibPath (const std::string &hdfs_lib_file, LibType type, std::string *local_path) |
Status | CheckSymbolExists (const std::string &hdfs_lib_file, LibType type, const std::string &symbol, bool quiet=false) |
Status | GetSoFunctionPtr (const std::string &hdfs_lib_file, const std::string &symbol, void **fn_ptr, LibCacheEntry **entry, bool quiet=false) |
If 'quiet' is true, returned error statuses will not be logged. | |
void | SetNeedsRefresh (const std::string &hdfs_lib_file) |
void | DecrementUseCount (LibCacheEntry *entry) |
See comment in GetSoFunctionPtr(). | |
void | RemoveEntry (const std::string &hdfs_lib_file) |
Removes the cache entry for 'hdfs_lib_file'. | |
void | DropCache () |
Removes all cached entries. | |
Static Public Member Functions | |
static LibCache * | instance () |
static Status | Init () |
Initializes the libcache. Must be called before any other APIs. | |
Private Types | |
typedef boost::unordered_map < std::string, LibCacheEntry * > | LibMap |
Private Member Functions | |
LibCache () | |
LibCache (LibCache const &l) | |
LibCache & | operator= (LibCache const &l) |
Status | InitInternal () |
Status | GetCacheEntry (const std::string &hdfs_lib_file, LibType type, boost::unique_lock< boost::mutex > *entry_lock, LibCacheEntry **entry) |
Status | GetCacheEntryInternal (const std::string &hdfs_lib_file, LibType type, boost::unique_lock< boost::mutex > *entry_lock, LibCacheEntry **entry) |
std::string | MakeLocalPath (const std::string &hdfs_path, const std::string &local_dir) |
void | RemoveEntryInternal (const std::string &hdfs_lib_file, const LibMap::iterator &entry_iterator) |
Private Attributes | |
void * | current_process_handle_ |
dlopen() handle for the current process (i.e. impalad). | |
AtomicInt< int64_t > | num_libs_copied_ |
boost::mutex | lock_ |
LibMap | lib_cache_ |
Static Private Attributes | |
static boost::scoped_ptr < LibCache > | instance_ |
Singleton instance. Instantiated in Init(). | |
Process-wide cache of dynamically linked libraries loaded from HDFS. These libraries can be shared objects, LLVM modules, or JARs. When we load a shared object, we dlopen() it and keep it in our process. For modules, we store the symbols in the module to service symbol lookups. We can't cache the module itself, since it (i.e. the external module) is consumed when it is linked with the query codegen module.
Locking strategy: We don't want to hold one big lock across all operations, since one of the operations is copying a file from HDFS; with a single lock, that copy would prevent any UDFs from running on the system. Instead, we have a global lock that is taken when doing the cache lookup but is not held during any blocking calls. During the blocking calls, we take the per-lib lock.
Entry lifetime management: We cannot delete the entry while a query is using the library. When the caller requests a pointer into the library, they are given the entry handle and must decrement the ref count when they are done.
TODO:
Definition at line 53 of file lib-cache.h.
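The locking strategy described above can be sketched roughly as follows. This is a minimal illustration, not Impala's implementation: it uses std::mutex instead of Boost, the names SketchCache, SketchEntry, and SlowCopy are hypothetical, and it omits entry removal and error handling. A new entry is locked while the global lock is still held (nothing else can see it yet), and the blocking copy then runs under the per-entry lock only.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a cache entry; not Impala's LibCacheEntry.
struct SketchEntry {
  std::mutex lock;         // Per-entry lock, held across blocking work.
  bool loaded = false;     // Set once the simulated copy has finished.
  std::string local_path;  // Where the simulated copy landed.
};

class SketchCache {
 public:
  // Returns the entry for 'key'. On return, *entry_lock owns the entry's lock.
  SketchEntry* Get(const std::string& key,
                   std::unique_lock<std::mutex>* entry_lock) {
    assert(entry_lock != nullptr);
    SketchEntry* entry = nullptr;
    bool is_new = false;
    {
      // Global lock: held only for the map lookup or insert.
      std::lock_guard<std::mutex> global(lock_);
      std::unique_ptr<SketchEntry>& slot = map_[key];
      if (!slot) {
        slot.reset(new SketchEntry());
        is_new = true;
        // A brand-new entry is uncontended, so taking its lock cannot block.
        *entry_lock = std::unique_lock<std::mutex>(slot->lock);
      }
      entry = slot.get();
    }  // Global lock released before any blocking work.
    if (is_new) {
      entry->local_path = SlowCopy(key);  // Blocking call, entry lock only.
      entry->loaded = true;
      copies_.fetch_add(1);
    } else {
      // Existing entry: wait for any in-progress copy without the global lock.
      *entry_lock = std::unique_lock<std::mutex>(entry->lock);
    }
    return entry;
  }

  int copies() const { return copies_.load(); }

 private:
  // Simulates the slow HDFS-to-local copy (assumption: a real version blocks).
  static std::string SlowCopy(const std::string& key) {
    return "/tmp/sketch/" + key;
  }

  std::mutex lock_;  // Protects map_.
  std::unordered_map<std::string, std::unique_ptr<SketchEntry>> map_;
  std::atomic<int> copies_{0};
};
```

A second Get() for the same key waits on the entry lock rather than on the global lock, so lookups for other libraries are not stalled while one copy is in progress.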
typedef boost::unordered_map< std::string, LibCacheEntry * > LibCache::LibMap |
private |
Maps HDFS library path => cache entry. Entries in the cache need to be explicitly deleted.
Definition at line 128 of file lib-cache.h.
Enumerator | |
---|---|
TYPE_SO | |
TYPE_IR | |
TYPE_JAR |
Definition at line 57 of file lib-cache.h.
LibCache::~LibCache | ( | ) |
Calls dlclose on all cached handles.
Definition at line 95 of file lib-cache.cc.
References current_process_handle_, DropCache(), and impala::DynamicClose().
LibCache::LibCache | ( | ) |
private |
Definition at line 92 of file lib-cache.cc.
Referenced by Init().
|
private |
Status LibCache::CheckSymbolExists | ( | const std::string & | hdfs_lib_file, |
LibType | type, | ||
const std::string & | symbol, | ||
bool | quiet = false |
||
) |
Returns status.ok() if the symbol exists in 'hdfs_lib_file', non-ok otherwise. If 'quiet' is true, the error status for a symbol that is not found in a non-Java library will not be logged.
Definition at line 192 of file lib-cache.cc.
References impala::LibCache::LibCacheEntry::local_path, impala::OK, RETURN_IF_ERROR, impala::LibCache::LibCacheEntry::symbols, and impala::LibCache::LibCacheEntry::type.
Referenced by ResolveSymbolLookup().
void LibCache::DecrementUseCount | ( | LibCacheEntry * | entry | ) |
See comment in GetSoFunctionPtr().
Definition at line 170 of file lib-cache.cc.
References impala::LibCache::LibCacheEntry::lock, impala::LibCache::LibCacheEntry::should_remove, and impala::LibCache::LibCacheEntry::use_count.
Referenced by impala::AggFnEvaluator::Close(), and impala::Expr::Close().
void LibCache::DropCache | ( | ) |
Removes all cached entries.
Definition at line 262 of file lib-cache.cc.
References lock_.
Referenced by impala::ImpalaServer::CatalogUpdateCallback(), and ~LibCache().
Status LibCache::GetCacheEntry | ( | const std::string & | hdfs_lib_file, |
LibType | type, | ||
boost::unique_lock< boost::mutex > * | entry_lock, | ||
LibCacheEntry ** | entry | ||
) |
private |
Returns the cache entry for 'hdfs_lib_file'. If this library has not been copied locally, it will copy it and add a new LibCacheEntry to 'lib_cache_'. Result is returned in *entry. No locks should be taken before calling this. On return, the entry's lock is taken and returned in *entry_lock. If an error is returned, there will be no entry in lib_cache_ and *entry is NULL.
Definition at line 279 of file lib-cache.cc.
References impala::Status::ok().
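Status LibCache::GetCacheEntryInternal | ( | const std::string & | hdfs_lib_file, |
LibType | type, | ||
boost::unique_lock< boost::mutex > * | entry_lock, | ||
LibCacheEntry ** | entry | ||
) |
private |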
Implementation to get the cache entry for 'hdfs_lib_file'. Errors are returned without evicting the cache entry if the status is not OK and *entry is not NULL.
Definition at line 304 of file lib-cache.cc.
References impala::CopyHdfsFile(), impala::DynamicOpen(), impala::GetLastModificationTime(), lock_, impala::OK, impala::Status::ok(), path(), pool, and RETURN_IF_ERROR.
Status LibCache::GetLocalLibPath | ( | const std::string & | hdfs_lib_file, |
LibType | type, | ||
std::string * | local_path | ||
) |
Gets the local file system path for the library at 'hdfs_lib_file'. If this file is not already on the local fs, it copies it and caches the result. Returns an error if 'hdfs_lib_file' cannot be copied to the local fs.
Definition at line 181 of file lib-cache.cc.
References impala::LibCache::LibCacheEntry::local_path, impala::OK, RETURN_IF_ERROR, and impala::LibCache::LibCacheEntry::type.
Referenced by Java_com_cloudera_impala_service_FeSupport_NativeCacheJar(), and ResolveSymbolLookup().
Status LibCache::GetSoFunctionPtr | ( | const std::string & | hdfs_lib_file, |
const std::string & | symbol, | ||
void ** | fn_ptr, | ||
LibCacheEntry ** | entry, | ||
bool | quiet = false |
||
) |
If 'quiet' is true, returned error statuses will not be logged.
Returns a pointer to the function for the given library and symbol. If 'hdfs_lib_file' is empty, the symbol is looked up in the impalad process. Otherwise, 'hdfs_lib_file' should be the HDFS path to a shared library (.so) file. dlopen handles and symbols are cached. Only usable if 'hdfs_lib_file' refers to a shared object. If entry is non-null and *entry is null, *entry will be set to the cached entry. If entry is non-null and *entry is non-null, *entry will be reused (i.e., the use count is not increased). The caller must call DecrementUseCount(*entry) when it is done using fn_ptr; after that call, it is no longer valid to use fn_ptr.
Definition at line 130 of file lib-cache.cc.
References impala::DynamicLookup(), impala::LibCache::LibCacheEntry::lock, impala::OK, RETURN_IF_ERROR, impala::LibCache::LibCacheEntry::shared_object_handle, impala::LibCache::LibCacheEntry::symbol_cache, impala::LibCache::LibCacheEntry::type, and impala::LibCache::LibCacheEntry::use_count.
Referenced by impala::ScalarFnCall::GetFunction(), impala::ScalarFnCall::GetUdf(), and impala::ScalarFnCall::Prepare().
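The handle contract for 'entry' can be sketched as below. This is an illustrative model only: SketchEntry, AcquireSketch, and ReleaseSketch are hypothetical names, and the behavior for a null 'entry' argument (no handle requested, nothing counted) is an assumption, since the description above does not cover it.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the cached entry; only the use count is modeled.
struct SketchEntry {
  std::mutex lock;    // Protects use_count.
  int use_count = 0;  // Number of outstanding handles.
};

// Hands out a handle to 'cached' following the documented rules:
//  - *entry == nullptr: set *entry and increase the use count.
//  - *entry != nullptr: reuse the handle; the use count is not increased.
void AcquireSketch(SketchEntry* cached, SketchEntry** entry) {
  assert(cached != nullptr);
  if (entry == nullptr) return;  // Assumption: caller wants no handle.
  if (*entry != nullptr) {
    assert(*entry == cached);    // Reuse must refer to the same library.
    return;
  }
  std::lock_guard<std::mutex> l(cached->lock);
  ++cached->use_count;
  *entry = cached;
}

// Counterpart of DecrementUseCount(): drops one handle and returns the
// remaining count. Function pointers obtained via the handle must not be
// used after this call.
int ReleaseSketch(SketchEntry* entry) {
  std::lock_guard<std::mutex> l(entry->lock);
  assert(entry->use_count > 0);
  return --entry->use_count;
}
```

Because a non-null *entry is reused without another increment, a caller that resolves several symbols from the same library through one handle needs only a single matching decrement.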
Status LibCache::Init | ( | ) |
static |
Initializes the libcache. Must be called before any other APIs.
Definition at line 100 of file lib-cache.cc.
References instance_, and LibCache().
Referenced by impala::InitCommonRuntime().
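The Init()/instance() pairing follows a common explicit-initialization singleton pattern, sketched below under simplifying assumptions: SketchSingleton is a hypothetical class, std::unique_ptr stands in for boost::scoped_ptr, and a bool stands in for Status.

```cpp
#include <cassert>
#include <memory>

class SketchSingleton {
 public:
  // Creates the singleton. Must be called once, before instance() is used.
  static bool Init() {
    assert(!instance_);  // Initializing twice is treated as a bug here.
    instance_.reset(new SketchSingleton());
    return instance_->InitInternal();
  }

  // Returns the singleton, or nullptr if Init() has not been called.
  static SketchSingleton* instance() { return instance_.get(); }

  bool initialized() const { return initialized_; }

 private:
  // Construction and copying are private so Init() is the only entry point.
  SketchSingleton() = default;
  SketchSingleton(const SketchSingleton&) = delete;
  SketchSingleton& operator=(const SketchSingleton&) = delete;

  // Placeholder for per-process setup that can fail.
  bool InitInternal() {
    initialized_ = true;
    return true;
  }

  bool initialized_ = false;
  static std::unique_ptr<SketchSingleton> instance_;
};

std::unique_ptr<SketchSingleton> SketchSingleton::instance_;
```

Keeping construction separate from a fallible InitInternal() lets startup code report a failure instead of handling it inside a constructor.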
Status LibCache::InitInternal | ( | ) |
private |
Definition at line 106 of file lib-cache.cc.
References current_process_handle_, impala::DynamicOpen(), impala::PathBuilder::GetFullBuildPath(), impala::TestInfo::is_fe_test(), impala::Status::OK, and RETURN_IF_ERROR.
static LibCache* LibCache::instance | ( | ) |
inline static |
Definition at line 63 of file lib-cache.h.
References instance_.
Referenced by impala::ImpalaServer::CatalogUpdateCallback(), impala::AggFnEvaluator::Close(), impala::Expr::Close(), impala::ScalarFnCall::GetFunction(), impala::ScalarFnCall::GetUdf(), impala::CatalogOpExecutor::HandleDropDataSource(), impala::CatalogOpExecutor::HandleDropFunction(), impala::ExternalDataSourceExecutor::Init(), Java_com_cloudera_impala_service_FeSupport_NativeCacheJar(), impala::ScalarFnCall::Prepare(), impala::HiveUdfCall::Prepare(), impala::AggFnEvaluator::Prepare(), and ResolveSymbolLookup().
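std::string LibCache::MakeLocalPath | ( | const std::string & | hdfs_path, |
const std::string & | local_dir | ||
) |
private |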
Utility function for generating a filename unique to this process and 'hdfs_path'. This is to prevent multiple impalad processes or different library files with the same name from clobbering each other. 'hdfs_path' should be the full path (including the filename) of the file we're going to copy to the local FS, and 'local_dir' is the local directory prefix of the returned path.
Definition at line 414 of file lib-cache.cc.
References path().
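One plausible way to build such a filename is to combine the file's stem, the process ID, and a per-process counter. The sketch below illustrates that idea and should not be read as the exact format Impala uses: MakeLocalPathSketch and g_num_copied are hypothetical names, and getpid() makes it POSIX-only.

```cpp
#include <atomic>
#include <cassert>
#include <sstream>
#include <string>
#include <unistd.h>  // getpid(); POSIX only.

// Hypothetical per-process counter of files copied so far.
static std::atomic<long> g_num_copied{0};

// Builds "<local_dir>/<stem>.<pid>.<counter><extension>" from 'hdfs_path'.
std::string MakeLocalPathSketch(const std::string& hdfs_path,
                                const std::string& local_dir) {
  // Keep only the filename component of the HDFS path.
  const std::string::size_type slash = hdfs_path.find_last_of('/');
  const std::string name =
      slash == std::string::npos ? hdfs_path : hdfs_path.substr(slash + 1);
  assert(!name.empty());

  // Split e.g. "libudf.so" into stem "libudf" and extension ".so".
  const std::string::size_type dot = name.find_last_of('.');
  const std::string stem =
      dot == std::string::npos ? name : name.substr(0, dot);
  const std::string ext =
      dot == std::string::npos ? std::string() : name.substr(dot);

  // The pid separates processes; the counter separates files that share a
  // name within one process.
  std::ostringstream dst;
  dst << local_dir << "/" << stem << "." << getpid() << "."
      << g_num_copied.fetch_add(1) << ext;
  return dst.str();
}
```

Two HDFS files that share a filename but live in different directories get different local paths, because the counter advances on every call.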
void LibCache::RemoveEntry | ( | const std::string & | hdfs_lib_file | ) |
Removes the cache entry for 'hdfs_lib_file'.
Definition at line 232 of file lib-cache.cc.
References lock_.
Referenced by impala::ImpalaServer::CatalogUpdateCallback(), impala::CatalogOpExecutor::HandleDropDataSource(), and impala::CatalogOpExecutor::HandleDropFunction().
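void LibCache::RemoveEntryInternal | ( | const std::string & | hdfs_lib_file, |
const LibMap::iterator & | entry_iterator | ||
) |
private |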
Implementation to remove an entry from the cache. lock_ must be held. The entry's lock should not be held.
Definition at line 239 of file lib-cache.cc.
References impala::LibCache::LibCacheEntry::local_path, impala::LibCache::LibCacheEntry::lock, impala::LibCache::LibCacheEntry::should_remove, and impala::LibCache::LibCacheEntry::use_count.
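The fields referenced here (should_remove and use_count) suggest a deferred-deletion scheme: removal detaches the entry from the map right away, but the memory is freed only once no query still uses it. The sketch below illustrates that scheme under assumptions; SketchCache and SketchEntry are hypothetical, and the 'live' counter exists only to make deletion observable.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>

struct SketchEntry {
  static std::atomic<int> live;  // Entries not yet deleted (illustration only).
  std::mutex lock;               // Protects the two fields below.
  int use_count = 0;
  bool should_remove = false;
  SketchEntry() { ++live; }
  ~SketchEntry() { --live; }
};
std::atomic<int> SketchEntry::live{0};

class SketchCache {
 public:
  ~SketchCache() {
    for (auto& kv : map_) delete kv.second;
  }

  // Looks up (or creates) the entry for 'key' and takes one reference.
  SketchEntry* Use(const std::string& key) {
    std::lock_guard<std::mutex> global(lock_);
    SketchEntry*& e = map_[key];
    if (e == nullptr) e = new SketchEntry();
    std::lock_guard<std::mutex> l(e->lock);
    ++e->use_count;
    return e;
  }

  // Detaches the entry from the cache; deletes it only if it is unused.
  void Remove(const std::string& key) {
    std::lock_guard<std::mutex> global(lock_);  // Cache lock first...
    auto it = map_.find(key);
    if (it == map_.end()) return;
    SketchEntry* entry = it->second;
    bool can_delete = false;
    {
      std::lock_guard<std::mutex> l(entry->lock);  // ...then the entry lock.
      map_.erase(it);
      entry->should_remove = true;
      can_delete = entry->use_count == 0;
    }
    if (can_delete) delete entry;  // Entry lock is released before deleting.
  }

  // Drops one reference; the last user of a removed entry deletes it.
  static void Release(SketchEntry* entry) {
    bool can_delete = false;
    {
      std::lock_guard<std::mutex> l(entry->lock);
      assert(entry->use_count > 0);
      --entry->use_count;
      can_delete = entry->use_count == 0 && entry->should_remove;
    }
    if (can_delete) delete entry;
  }

 private:
  std::mutex lock_;  // Protects map_.
  std::unordered_map<std::string, SketchEntry*> map_;
};
```

Whichever of Remove() or the final Release() runs last frees the entry, so a running query never sees its library handle disappear underneath it.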
void LibCache::SetNeedsRefresh | ( | const std::string & | hdfs_lib_file | ) |
Marks the entry for 'hdfs_lib_file' as needing to be refreshed if the file in HDFS is newer than the locally cached copy. The refresh will occur the next time the entry is accessed.
Definition at line 221 of file lib-cache.cc.
References impala::LibCache::LibCacheEntry::check_needs_refresh, impala::LibCache::LibCacheEntry::lock, and lock_.
Referenced by impala::ImpalaServer::CatalogUpdateCallback(), and ResolveSymbolLookup().
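The lazy refresh described above can be sketched as a flag that is set cheaply and consulted on the next access. The names below (SketchEntry, SetNeedsRefreshSketch, AccessSketch, last_mod_time) are hypothetical, locking is left out for brevity, and the remote modification time is supplied by a caller-provided function rather than a real HDFS call.

```cpp
#include <cassert>
#include <functional>

// Hypothetical entry state; locking is omitted to keep the sketch short.
struct SketchEntry {
  bool check_needs_refresh = false;  // Set by SetNeedsRefreshSketch().
  long last_mod_time = 0;            // Remote mtime when the copy was made.
  int reloads = 0;                   // How often the local copy was replaced.
};

// Cheap: only records that the next access should check the remote file.
void SetNeedsRefreshSketch(SketchEntry* e) { e->check_needs_refresh = true; }

// Called on each access. Returns true if the local copy was refreshed.
bool AccessSketch(SketchEntry* e,
                  const std::function<long()>& get_remote_mtime) {
  if (!e->check_needs_refresh) return false;  // Fast path: no remote call.
  e->check_needs_refresh = false;
  const long remote = get_remote_mtime();
  if (remote <= e->last_mod_time) return false;  // Local copy is current.
  e->last_mod_time = remote;  // A real cache would re-copy the file here.
  ++e->reloads;
  return true;
}
```

Deferring the check to the next access keeps the marking step cheap, and the remote timestamp lookup is paid only by entries that are actually used again.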
void* LibCache::current_process_handle_ |
private |
dlopen() handle for the current process (i.e. impalad).
Definition at line 116 of file lib-cache.h.
Referenced by InitInternal(), and ~LibCache().
boost::scoped_ptr< LibCache > LibCache::instance_ |
static private |
Singleton instance. Instantiated in Init().
Definition at line 113 of file lib-cache.h.
Referenced by Init(), and instance().
LibMap LibCache::lib_cache_ |
private |
Definition at line 129 of file lib-cache.h.
boost::mutex LibCache::lock_ |
private |
Protects lib_cache_. For lock ordering, this lock must always be taken before the per entry lock.
Definition at line 124 of file lib-cache.h.
AtomicInt< int64_t > LibCache::num_libs_copied_ |
private |
The number of libs that have been copied from HDFS to the local FS. This is appended to the local FS path to avoid collisions.
Definition at line 120 of file lib-cache.h.