Impala
Impala is the open source, native analytic database for Apache Hadoop.
#include <udf.h>
Classes | |
struct | TypeDesc |
struct | UniqueId |
Public Types | |
enum | ImpalaVersion { v1_2, v1_3 } |
enum | Type { INVALID_TYPE = 0, TYPE_NULL, TYPE_BOOLEAN, TYPE_TINYINT, TYPE_SMALLINT, TYPE_INT, TYPE_BIGINT, TYPE_FLOAT, TYPE_DOUBLE, TYPE_TIMESTAMP, TYPE_STRING, TYPE_FIXED_BUFFER, TYPE_DECIMAL, TYPE_VARCHAR } |
enum | FunctionStateScope { FRAGMENT_LOCAL, THREAD_LOCAL } |
Public Member Functions | |
ImpalaVersion | version () const |
Returns the version of Impala that's currently running. More... | |
const char * | user () const |
UniqueId | query_id () const |
Returns the query_id for the current query. More... | |
void | SetError (const char *error_msg) |
bool | AddWarning (const char *warning_msg) |
bool | has_error () const |
Returns true if there's been an error set. More... | |
const char * | error_msg () const |
Returns the current error message. Returns NULL if there is no error. More... | |
uint8_t * | Allocate (int byte_size) |
uint8_t * | Reallocate (uint8_t *ptr, int byte_size) |
void | Free (uint8_t *buffer) |
Frees a buffer returned from Allocate() or Reallocate() More... | |
void | TrackAllocation (int64_t byte_size) |
void | Free (int64_t byte_size) |
void | SetFunctionState (FunctionStateScope scope, void *ptr) |
void * | GetFunctionState (FunctionStateScope scope) const |
const TypeDesc & | GetReturnType () const |
const TypeDesc & | GetIntermediateType () const |
int | GetNumArgs () const |
const TypeDesc * | GetArgType (int arg_idx) const |
bool | IsArgConstant (int arg_idx) const |
AnyVal * | GetConstantArg (int arg_idx) const |
impala::FunctionContextImpl * | impl () |
TODO: Add mechanism for UDAs to update stats similar to runtime profile counters. More... | |
~FunctionContext () | |
Private Member Functions | |
FunctionContext () | |
FunctionContext (const FunctionContext &other) | |
Disable copy ctor and assignment operator. More... | |
FunctionContext & | operator= (const FunctionContext &other) |
Private Attributes | |
impala::FunctionContextImpl * | impl_ |
Friends | |
class | impala::FunctionContextImpl |
A FunctionContext is passed to every UDF/UDA and is the interface for the UDF to the rest of the system. It contains APIs to examine the system state, report errors and manage memory.
Enumerator | |
---|---|
FRAGMENT_LOCAL |
Indicates that the function state for this FunctionContext's UDF is shared across the plan fragment (a query is divided into multiple plan fragments, each of which is responsible for a part of the query execution). Within the plan fragment, there may be multiple instances of the UDF executing concurrently with multiple FunctionContexts sharing this state, meaning that the state must be thread-safe. The Prepare() function for the UDF may be called with this scope concurrently on a single host if the UDF will be evaluated in multiple plan fragments on that host. In general, read-only state that doesn't need to be recomputed for every UDF call should be fragment-local. TODO: not yet implemented |
THREAD_LOCAL |
Indicates that the function state is local to the execution thread. This state does not need to be thread-safe. However, this state will be initialized (via the Prepare() function) once for every execution thread, so fragment-local state should be used when possible for better performance. In general, inexpensive shared state that is written to by the UDF (e.g. scratch space) should be thread-local. |
FunctionContext::~FunctionContext | ( | ) |
Definition at line 159 of file udf.cc.
References impala::FunctionContextImpl::closed_, impl_, and impala::FunctionContextImpl::pool_.
|
private |
Disable copy ctor and assignment operator.
bool FunctionContext::AddWarning | ( | const char * | warning_msg | ) |
Adds a warning that is returned to the user. This can include things like overflow or other recoverable error conditions. Warnings are capped at a maximum number. Returns true if the warning was added and false if it was ignored due to the cap.
Definition at line 345 of file udf.cc.
References MAX_WARNINGS.
Referenced by impala::FunctionContextImpl::Close(), impala::TimestampFunctions::DateAddSub(), impala::AggregateFunctions::DecimalAvgGetValue(), impala::HiveUdfCall::Evaluate(), impala::TimestampFunctions::FromUtc(), impala::StringFunctions::ParseUrl(), impala::StringFunctions::ParseUrlKey(), impala::StringFunctions::RegexpExtract(), impala::StringFunctions::RegexpReplace(), impala::TimestampFunctions::ReportBadFormat(), TestWarnings(), and impala::TimestampFunctions::ToUtc().
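The capped-warning behaviour described above can be sketched as follows. This is a minimal illustration, not Impala code: `MockContext` is a hypothetical stand-in for FunctionContext, `SafeAdd` is an invented example function, and the cap of 3 is arbitrary rather than the real MAX_WARNINGS value.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for the warning part of FunctionContext.
// kMaxWarnings = 3 is arbitrary here; it is not Impala's MAX_WARNINGS value.
class MockContext {
 public:
  bool AddWarning(const char* warning_msg) {
    if (warnings_.size() >= kMaxWarnings) return false;  // dropped: cap reached
    warnings_.emplace_back(warning_msg);
    return true;
  }
  std::size_t num_warnings() const { return warnings_.size(); }

 private:
  static constexpr std::size_t kMaxWarnings = 3;
  std::vector<std::string> warnings_;
};

// Adds two 32-bit ints. On overflow it records a warning and returns false,
// leaving *out untouched, so the caller can produce a NULL result instead.
bool SafeAdd(MockContext* ctx, int32_t a, int32_t b, int32_t* out) {
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max() ||
      wide < std::numeric_limits<int32_t>::min()) {
    ctx->AddWarning("SafeAdd: 32-bit integer overflow");
    return false;
  }
  *out = static_cast<int32_t>(wide);
  return true;
}
```

A warning suits this case because the condition is recoverable: the query keeps running and only the affected row is degraded.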
uint8_t * FunctionContext::Allocate | ( | int | byte_size | ) |
Allocates memory. All UDFs/UDAs should use this instead of malloc/new where possible. The UDF/UDA is responsible for calling Free() on every buffer returned by Allocate(). If an allocation causes the memory limit to be exceeded, an error is set on this object, causing the query to fail.
Definition at line 262 of file udf.cc.
References VLOG_ROW.
Referenced by impala::AggregateFunctions::AvgInit(), ConstantArgPrepare(), impala_udf::UdaTestHarnessUtil::CopyIntermediate(), CountPrepare(), impala_udf::UdaTestHarnessUtil::CreateIntermediate(), impala::AggregateFunctions::DecimalAvgInit(), impala::UdfBuiltins::ExtractPrepare(), impala::AggregateFunctions::FirstValUpdate(), HllInit(), impala::AggregateFunctions::HllInit(), impala::AggregateFunctions::LastValUpdate(), impala::AggregateFunctions::Max(), MemTestPrepare(), impala::AggregateFunctions::Min(), impala::CaseExpr::Open(), impala::AggregateFunctions::PcInit(), impala::MathFunctions::RandPrepare(), impala::AggregateFunctions::RankInit(), impala::AggregateFunctions::ReservoirSampleInit(), MinState::Set(), impala::AggregateFunctions::StringConcatMerge(), impala::AggregateFunctions::StringConcatUpdate(), impala::UdfBuiltins::TruncPrepare(), ValidateMem(), ValidateOpenPrepare(), and ValidateSharedStatePrepare().
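The Allocate()/Free() pairing can be illustrated with a small sketch. `MockContext` below is a hypothetical stand-in that only mimics the accounting described above; returning a null pointer once the limit is exceeded is an assumption of the sketch, not documented behaviour of the real class. `Duplicate` is an invented helper.

```cpp
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// Hypothetical stand-in for the memory part of FunctionContext.
class MockContext {
 public:
  explicit MockContext(int64_t limit) : limit_(limit) {}

  uint8_t* Allocate(int byte_size) {
    if (used_ + byte_size > limit_) {
      error_ = "memory limit exceeded";
      return nullptr;  // assumption of this sketch
    }
    uint8_t* buf = new uint8_t[byte_size];
    sizes_[buf] = byte_size;
    used_ += byte_size;
    return buf;
  }

  void Free(uint8_t* buffer) {
    auto it = sizes_.find(buffer);
    if (it == sizes_.end()) return;  // unknown or already freed buffer
    used_ -= it->second;
    sizes_.erase(it);
    delete[] buffer;
  }

  bool has_error() const { return !error_.empty(); }
  int64_t bytes_in_use() const { return used_; }

 private:
  int64_t limit_;
  int64_t used_ = 0;
  std::string error_;
  std::map<uint8_t*, int> sizes_;
};

// Copies 'len' bytes into context-owned memory. The caller must Free() it.
uint8_t* Duplicate(MockContext* ctx, const uint8_t* src, int len) {
  uint8_t* copy = ctx->Allocate(len);
  if (copy == nullptr) return nullptr;  // the context already holds the error
  std::memcpy(copy, src, len);
  return copy;
}
```

Routing allocations through the context is what lets the engine attribute memory to the query; a buffer obtained with plain malloc/new would be invisible to that accounting.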
const char * FunctionContext::error_msg | ( | ) | const |
Returns the current error message. Returns NULL if there is no error.
Definition at line 257 of file udf.cc.
Referenced by impala_udf::UdaTestHarnessBase< RESULT, INTERMEDIATE >::CheckContext(), impala::ScalarFnCall::Open(), impala_udf::UdfTestHarness::ValidateError(), ValidateFail(), and ValidateUdf().
void FunctionContext::Free | ( | uint8_t * | buffer | ) |
Frees a buffer returned from Allocate() or Reallocate()
Definition at line 291 of file udf.cc.
References VLOG_ROW.
Referenced by impala::AggregateFunctions::AppxMedianFinalize(), impala::AggregateFunctions::AvgFinalize(), impala::CaseExpr::Close(), impala::FunctionContextImpl::Close(), ConstantArgClose(), CountClose(), impala::AggregateFunctions::DecimalAvgFinalize(), DoubleFreeTest(), impala::UdfBuiltins::ExtractClose(), impala_udf::UdaTestHarnessUtil::FreeIntermediate(), impala::AggregateFunctions::HistogramFinalize(), HllFinalize(), impala::AggregateFunctions::HllFinalize(), HllSerialize(), IncrementNdvFinalize(), impala::AggregateFunctions::LastValRemove(), impala::AggregateFunctions::LastValUpdate(), impala::AggregateFunctions::Max(), MemTestClose(), MemTestFinalize(), MemTestSerialize(), impala::AggregateFunctions::Min(), MinFinalize(), MinSerialize(), impala::AggregateFunctions::PcFinalize(), impala::AggregateFunctions::PcsaFinalize(), impala::AggregateFunctions::RankFinalize(), impala::AggregateFunctions::ReservoirSampleFinalize(), impala::AggregateFunctions::ReservoirSampleSerialize(), MinState::Set(), impala::AggregateFunctions::StringConcatFinalize(), impala::AggregateFunctions::StringValSerializeOrFinalize(), impala::AggregateFunctions::TimestampAvgFinalize(), impala::UdfBuiltins::TruncClose(), ValidateMem(), ValidateOpenClose(), and ValidateSharedStateClose().
const FunctionContext::TypeDesc * FunctionContext::GetArgType | ( | int | arg_idx | ) | const |
Returns the type information for the arg_idx-th argument (0-indexed, not including the FunctionContext* argument). Returns NULL if arg_idx is invalid.
Definition at line 425 of file udf.cc.
References impala::FunctionContextImpl::arg_types_, and impl_.
Referenced by impala::DecimalFunctions::Abs(), impala::DecimalOperators::CastToBooleanVal(), impala::DecimalOperators::CastToStringVal(), impala::DecimalOperators::CastToTimestampVal(), impala::StringFunctions::CharLength(), impala::AggregateFunctions::DecimalAvgAddOrRemove(), impala::UtilityFunctions::FnvHashDecimal(), impala::AggregateFunctions::HllUpdate(), impala::AggregateFunctions::Max(), impala::AggregateFunctions::Min(), impala::AggregateFunctions::OffsetFnInit(), impala::StringFunctions::ParseUrlPrepare(), impala::AggregateFunctions::PcsaUpdate(), impala::AggregateFunctions::PcUpdate(), impala::DecimalFunctions::Precision(), impala::MathFunctions::RandPrepare(), impala::StringFunctions::RegexpPrepare(), impala::DecimalOperators::RoundDecimal(), impala::DecimalFunctions::RoundTo(), impala::DecimalFunctions::Scale(), impala::InPredicate::SetLookupPrepare(), SumSmallDecimalUpdate(), impala::InPredicate::TemplatedIn(), impala::UtilityFunctions::TypeOf(), ValidateArgType(), and VarSum().
AnyVal * FunctionContext::GetConstantArg | ( | int | arg_idx | ) | const |
Returns a pointer to the value of the arg_idx-th input argument (0-indexed, not including the FunctionContext* argument). Returns NULL if the argument is not constant. This function can be used to obtain user-specified constants in a UDF's Prepare() or Close() function, or in a UDA's Init() function.
Definition at line 25 of file udf-ir.cc.
References impala::FunctionContextImpl::constant_args_, and impl_.
Referenced by ConstantArgPrepare(), impala::UdfBuiltins::ExtractPrepare(), impala::LikePredicate::LikePrepare(), impala::AggregateFunctions::OffsetFnInit(), impala::StringFunctions::ParseUrlPrepare(), impala::MathFunctions::RandPrepare(), impala::StringFunctions::RegexpPrepare(), impala::LikePredicate::RegexPrepare(), impala::InPredicate::SetLookupPrepare(), impala::UdfBuiltins::TruncPrepare(), impala::TimestampFunctions::UnixAndFromUnixPrepare(), and ValidateSharedStatePrepare().
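A typical use is to read a constant argument once, up front, instead of on every row. The sketch below assumes simplified `AnyVal`/`IntVal` structs and a hypothetical `MockContext` in which a null entry marks a non-constant argument; none of these are the real Impala definitions, and `ReadConstantInt` is an invented helper.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Simplified value types for this sketch only.
struct AnyVal {
  bool is_null = false;
};
struct IntVal : AnyVal {
  int32_t val;
  explicit IntVal(int32_t v) : val(v) {}
};

// Hypothetical stand-in: a null entry means "argument is not constant".
class MockContext {
 public:
  explicit MockContext(std::vector<AnyVal*> constant_args)
      : constant_args_(std::move(constant_args)) {}

  int GetNumArgs() const { return static_cast<int>(constant_args_.size()); }

  bool IsArgConstant(int arg_idx) const {
    if (arg_idx < 0 || arg_idx >= GetNumArgs()) return false;
    return constant_args_[arg_idx] != nullptr;
  }

  AnyVal* GetConstantArg(int arg_idx) const {
    return IsArgConstant(arg_idx) ? constant_args_[arg_idx] : nullptr;
  }

 private:
  std::vector<AnyVal*> constant_args_;
};

// Reads argument 'arg_idx' as an int if it is a non-NULL constant.
// Returns false when the value is only known per row (or is SQL NULL).
bool ReadConstantInt(const MockContext* ctx, int arg_idx, int32_t* out) {
  if (!ctx->IsArgConstant(arg_idx)) return false;
  const IntVal* v = static_cast<const IntVal*>(ctx->GetConstantArg(arg_idx));
  if (v->is_null) return false;
  *out = v->val;
  return true;
}
```

When the helper returns false, the function would fall back to evaluating the argument for each row.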
void * FunctionContext::GetFunctionState | ( | FunctionStateScope | scope | ) | const |
Definition at line 38 of file udf-ir.cc.
References impala::FunctionContextImpl::closed_, FRAGMENT_LOCAL, impala::FunctionContextImpl::fragment_local_fn_state_, impl_, THREAD_LOCAL, and impala::FunctionContextImpl::thread_local_fn_state_.
Referenced by impala::CaseExpr::Close(), impala::HiveUdfCall::Close(), ConstantArg(), ConstantArgClose(), impala::LikePredicate::ConstantEndsWithFn(), impala::LikePredicate::ConstantEqualsFn(), impala::LikePredicate::ConstantRegexFn(), impala::LikePredicate::ConstantRegexFnPartial(), impala::LikePredicate::ConstantStartsWithFn(), impala::LikePredicate::ConstantSubstringFn(), impala::LikePredicate::ConvertLikePattern(), Count(), CountClose(), impala::HiveUdfCall::Evaluate(), impala::UdfBuiltins::Extract(), impala::UdfBuiltins::ExtractClose(), impala::TimestampFunctions::FromUnix(), impala::LikePredicate::Like(), impala::LikePredicate::LikeClose(), MemTest(), MemTestClose(), impala::StringFunctions::ParseUrl(), impala::StringFunctions::ParseUrlClose(), impala::StringFunctions::ParseUrlKey(), impala::MathFunctions::Rand(), impala::LikePredicate::Regex(), impala::LikePredicate::RegexClose(), impala::LikePredicate::RegexMatch(), impala::StringFunctions::RegexpClose(), impala::StringFunctions::RegexpExtract(), impala::StringFunctions::RegexpReplace(), impala::InPredicate::SetLookupClose(), impala::InPredicate::TemplatedIn(), impala::UdfBuiltins::Trunc(), impala::UdfBuiltins::TruncClose(), impala::TimestampFunctions::Unix(), impala::TimestampFunctions::UnixAndFromUnixClose(), ValidateOpen(), ValidateOpenClose(), ValidateSharedState(), and ValidateSharedStateClose().
const TypeDesc& impala_udf::FunctionContext::GetIntermediateType | ( | ) | const |
Returns the intermediate type for UDAs, i.e., the one returned by update and merge functions. Returns INVALID_TYPE for UDFs.
int FunctionContext::GetNumArgs | ( | ) | const |
Returns the number of arguments to this function (not including the FunctionContext* argument).
Definition at line 30 of file udf-ir.cc.
References impala::FunctionContextImpl::arg_types_, and impl_.
Referenced by impala::AggregateFunctions::OffsetFnInit(), impala::MathFunctions::RandPrepare(), and impala::InPredicate::SetLookupPrepare().
const FunctionContext::TypeDesc & FunctionContext::GetReturnType | ( | ) | const |
Returns the return type information of this function. For UDAs, this is the final return type of the UDA (e.g., the type returned by the finalize function).
Definition at line 34 of file udf-ir.cc.
References impl_, and impala::FunctionContextImpl::return_type_.
Referenced by impala::CastFunctions::CastToChar(), impala::DecimalOperators::CastToDecimalVal(), impala::CastFunctions::CastToStringVal(), impala::AggregateFunctions::DecimalAvgGetValue(), impala::MathFunctions::LeastGreatest(), impala::MathFunctions::Negative(), impala::ConditionalFunctions::NullIfZero(), impala::DecimalOperators::RoundDecimal(), and impala::DecimalFunctions::RoundTo().
bool FunctionContext::has_error | ( | ) | const |
Returns true if there's been an error set.
Definition at line 253 of file udf.cc.
Referenced by impala_udf::UdaTestHarnessBase< RESULT, INTERMEDIATE >::CheckContext(), impala::ScalarFnCall::Open(), impala_udf::UdfTestHarness::Validate(), impala_udf::UdfTestHarness::ValidateError(), ValidateFail(), and ValidateUdf().
|
inline |
TODO: Add mechanism for UDAs to update stats similar to runtime profile counters.
TODO: Do we need to add arbitrary key/value metadata? This would be plumbed through the query, e.g. "select UDA(col, 'sample=true') from tbl". const char* GetMetadata(const char*) const;
TODO: Add mechanism to query for table/column stats.
Returns the underlying opaque implementation object. The UDF/UDA should not use this; it is used internally.
Definition at line 202 of file udf.h.
References impl_.
Referenced by impala::AggFnEvaluator::Add(), impala::CastFunctions::CastToChar(), impala_udf::UdfTestHarness::CloseContext(), impala::UtilityFunctions::CurrentDatabase(), impala::ScalarFnCall::EvaluateChildren(), impala::AggregateFunctions::FirstValUpdate(), impala::AggFnEvaluator::Init(), impala::ScalarFnCall::InterpretEval(), impala::AggregateFunctions::LastValRemove(), impala::TimestampFunctions::Now(), impala::ScalarFnCall::Open(), impala::AggFnEvaluator::Open(), impala::UtilityFunctions::Pid(), impala::AggFnEvaluator::Remove(), impala_udf::UdfTestHarness::SetConstantArgs(), impala::AggregateFunctions::SumDecimalRemove(), impala::AggregateFunctions::SumRemove(), impala::AggregateFunctions::SumUpdate(), and impala::TimestampFunctions::Unix().
bool FunctionContext::IsArgConstant | ( | int | arg_idx | ) | const |
Returns true if the arg_idx-th input argument (0 indexed, not including the FunctionContext* argument) is a constant (e.g. 5, "string", 1 + 1).
Definition at line 20 of file udf-ir.cc.
References impala::FunctionContextImpl::constant_args_, and impl_.
Referenced by ConstantArgPrepare(), impala::UdfBuiltins::ExtractPrepare(), impala::TimestampFunctions::FromUnix(), impala::LikePredicate::LikePrepare(), impala::AggregateFunctions::OffsetFnInit(), impala::StringFunctions::ParseUrl(), impala::StringFunctions::ParseUrlKey(), impala::StringFunctions::ParseUrlPrepare(), impala::MathFunctions::RandPrepare(), impala::LikePredicate::RegexMatch(), impala::StringFunctions::RegexpExtract(), impala::StringFunctions::RegexpPrepare(), impala::LikePredicate::RegexPrepare(), impala::StringFunctions::RegexpReplace(), impala::InPredicate::SetLookupPrepare(), impala::UdfBuiltins::TruncPrepare(), impala::TimestampFunctions::Unix(), and impala::TimestampFunctions::UnixAndFromUnixPrepare().
FunctionContext::UniqueId FunctionContext::query_id | ( | ) | const |
Returns the query_id for the current query.
Definition at line 242 of file udf.cc.
References impala_udf::FunctionContext::UniqueId::hi.
uint8_t * FunctionContext::Reallocate | ( | uint8_t * | ptr, |
int | byte_size | ||
) |
Reallocates 'ptr' to the new byte_size. If the current underlying allocation is big enough, the original ptr is returned. If the allocation needs to grow, a new allocation of at least 'byte_size' bytes is made and the contents of 'ptr' are copied into it. This should be used for buffers that are repeatedly appended to.
Definition at line 276 of file udf.cc.
References VLOG_ROW.
Referenced by impala::AggregateFunctions::LastValUpdate(), impala::AggregateFunctions::StringConcatMerge(), and impala::AggregateFunctions::StringConcatUpdate().
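The append pattern that Reallocate() is meant for might look like the sketch below. `MockContext` is a hypothetical stand-in that follows the description above (same pointer if the allocation is large enough, otherwise copy into a larger one). `ByteBuffer`, `Append` and the doubling growth policy are illustrative choices, not part of the API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>

// Hypothetical stand-in for the Allocate/Reallocate/Free part of FunctionContext.
class MockContext {
 public:
  uint8_t* Allocate(int byte_size) {
    uint8_t* buf = new uint8_t[byte_size];
    caps_[buf] = byte_size;
    return buf;
  }

  uint8_t* Reallocate(uint8_t* ptr, int byte_size) {
    auto it = caps_.find(ptr);
    assert(it != caps_.end());  // ptr must come from this context
    const int old_cap = it->second;
    if (old_cap >= byte_size) return ptr;  // already big enough
    uint8_t* grown = new uint8_t[byte_size];
    std::memcpy(grown, ptr, old_cap);  // preserve the existing contents
    caps_.erase(it);
    delete[] ptr;
    caps_[grown] = byte_size;
    return grown;
  }

  void Free(uint8_t* buffer) {
    auto it = caps_.find(buffer);
    if (it == caps_.end()) return;
    caps_.erase(it);
    delete[] buffer;
  }

 private:
  std::map<uint8_t*, int> caps_;
};

// Illustrative growable buffer backed by context memory.
struct ByteBuffer {
  uint8_t* data = nullptr;
  int len = 0;
  int cap = 0;
};

// Appends n bytes, growing geometrically so repeated appends stay cheap.
void Append(MockContext* ctx, ByteBuffer* buf, const uint8_t* src, int n) {
  if (buf->len + n > buf->cap) {
    const int new_cap = std::max(2 * buf->cap, buf->len + n);
    buf->data = (buf->data == nullptr) ? ctx->Allocate(new_cap)
                                       : ctx->Reallocate(buf->data, new_cap);
    buf->cap = new_cap;
  }
  std::memcpy(buf->data + buf->len, src, n);
  buf->len += n;
}
```

Because the returned pointer may differ from the one passed in, the caller always stores the result of Reallocate() back into the buffer.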
void FunctionContext::SetError | ( | const char * | error_msg | ) |
Sets an error for this UDF. Calling this causes the query to fail.
Definition at line 332 of file udf.cc.
Referenced by impala::FunctionContextImpl::Close(), impala::HiveUdfCall::Evaluate(), impala::UdfBuiltins::Extract(), impala::UdfBuiltins::ExtractPrepare(), impala::LikePredicate::LikePrepare(), impala::StringFunctions::ParseUrlPrepare(), impala::MathFunctions::RandPrepare(), impala::LikePredicate::RegexMatch(), impala::StringFunctions::RegexpPrepare(), impala::LikePredicate::RegexPrepare(), impala::TimestampFunctions::ReportBadFormat(), impala_udf::UdfTestHarness::SetConstantArgs(), impala::AggFnEvaluator::SetDstSlot(), TestError(), impala::UdfBuiltins::ToVector(), impala::UdfBuiltins::Trunc(), impala::UdfBuiltins::TruncPrepare(), ValidateFail(), ValidateMADlibVector(), and VarSum().
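As a rough illustration of when to call SetError(), the sketch below rejects an input that makes the result undefined. `MockContext` is a hypothetical stand-in; keeping only the first error is an assumption of the sketch, and `Percent` is an invented example function.

```cpp
#include <string>

// Hypothetical stand-in for the error part of FunctionContext.
class MockContext {
 public:
  void SetError(const char* error_msg) {
    if (has_error_) return;  // assumption of this sketch: keep the first error
    has_error_ = true;
    error_ = error_msg;
  }
  bool has_error() const { return has_error_; }
  // Mirrors the documented contract: NULL when no error has been set.
  const char* error_msg() const {
    return has_error_ ? error_.c_str() : nullptr;
  }

 private:
  bool has_error_ = false;
  std::string error_;
};

// Integer percentage of 'part' in 'whole'. A zero 'whole' is unrecoverable
// here, so it is reported as an error rather than as a warning.
int Percent(MockContext* ctx, int part, int whole) {
  if (whole == 0) {
    ctx->SetError("Percent: 'whole' must be non-zero");
    return 0;
  }
  return part * 100 / whole;
}
```

Unlike AddWarning(), which lets the query continue, an error set this way is meant to stop it.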
void FunctionContext::SetFunctionState | ( | FunctionStateScope | scope, |
void * | ptr | ||
) |
Methods for maintaining state across UDF/UDA function calls. SetFunctionState() can be used to store a pointer that can then be retrieved via GetFunctionState(). If GetFunctionState() is called when no pointer is set, it will return NULL. SetFunctionState() does not take ownership of 'ptr'; it is up to the UDF/UDA to clean up any function state if necessary.
Definition at line 370 of file udf.cc.
Referenced by ConstantArgClose(), ConstantArgPrepare(), CountClose(), CountPrepare(), impala::UdfBuiltins::ExtractClose(), impala::UdfBuiltins::ExtractPrepare(), impala::LikePredicate::LikePrepare(), MemTestClose(), MemTestPrepare(), impala::CaseExpr::Open(), impala::HiveUdfCall::Open(), impala::StringFunctions::ParseUrlPrepare(), impala::MathFunctions::RandPrepare(), impala::StringFunctions::RegexpPrepare(), impala::LikePredicate::RegexPrepare(), impala::InPredicate::SetLookupPrepare(), impala::UdfBuiltins::TruncClose(), impala::UdfBuiltins::TruncPrepare(), impala::TimestampFunctions::UnixAndFromUnixPrepare(), ValidateOpenClose(), ValidateOpenPrepare(), ValidateSharedStateClose(), and ValidateSharedStatePrepare().
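The ownership rule above suggests a prepare/use/close pattern, sketched here with thread-local state. `MockContext` is a hypothetical stand-in, and `CountPrepare`, `Count` and `CountClose` are illustrative functions written for this sketch (they share names with test functions in the lists above but are not copies of them).

```cpp
#include <cstdint>

// Hypothetical stand-in for the function-state part of FunctionContext.
class MockContext {
 public:
  enum FunctionStateScope { FRAGMENT_LOCAL, THREAD_LOCAL };

  void SetFunctionState(FunctionStateScope scope, void* ptr) {
    state_[scope] = ptr;  // stores the pointer only; no ownership is taken
  }
  void* GetFunctionState(FunctionStateScope scope) const {
    return state_[scope];  // NULL until something has been set
  }

 private:
  void* state_[2] = {nullptr, nullptr};
};

// Creates the per-thread counter. Other scopes are ignored in this sketch.
void CountPrepare(MockContext* ctx, MockContext::FunctionStateScope scope) {
  if (scope != MockContext::THREAD_LOCAL) return;
  ctx->SetFunctionState(scope, new int64_t(0));
}

// Increments and returns the counter, or -1 if Prepare was never run.
int64_t Count(MockContext* ctx) {
  auto* n =
      static_cast<int64_t*>(ctx->GetFunctionState(MockContext::THREAD_LOCAL));
  if (n == nullptr) return -1;
  return ++*n;
}

// Releases the counter; the function, not the context, owns the state.
void CountClose(MockContext* ctx, MockContext::FunctionStateScope scope) {
  if (scope != MockContext::THREAD_LOCAL) return;
  delete static_cast<int64_t*>(ctx->GetFunctionState(scope));
  ctx->SetFunctionState(scope, nullptr);
}
```

Resetting the pointer to NULL in the close step keeps a later GetFunctionState() call from returning a dangling pointer.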
void FunctionContext::TrackAllocation | ( | int64_t | byte_size | ) |
For allocations that cannot use the Allocate() API provided by this object, TrackAllocation()/Free() can be used to just keep count of the byte sizes. For each call to TrackAllocation(), the UDF/UDA must call the corresponding Free().
Definition at line 312 of file udf.cc.
Referenced by DoubleFreeTest(), MemTest(), MemTestMerge(), and MemTestUpdate().
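For memory that comes from elsewhere, such as a standard container, the byte-count bookkeeping could be sketched like this. `MockContext` is a hypothetical stand-in that only keeps a running total, and `SampleState`, `ReserveSamples` and `ReleaseSamples` are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the tracking part of FunctionContext.
class MockContext {
 public:
  void TrackAllocation(int64_t byte_size) { tracked_ += byte_size; }
  void Free(int64_t byte_size) { tracked_ -= byte_size; }
  int64_t tracked_bytes() const { return tracked_; }

 private:
  int64_t tracked_ = 0;
};

// State whose memory is owned by std::vector rather than by Allocate().
struct SampleState {
  std::vector<double> samples;
  int64_t reported_bytes = 0;  // what has been reported to the context so far
};

// Reserves room for n samples and reports any growth in capacity.
void ReserveSamples(MockContext* ctx, SampleState* state, std::size_t n) {
  state->samples.reserve(n);
  const int64_t now =
      static_cast<int64_t>(state->samples.capacity() * sizeof(double));
  if (now > state->reported_bytes) {
    ctx->TrackAllocation(now - state->reported_bytes);
    state->reported_bytes = now;
  }
}

// Releases the vector's memory and balances every tracked byte with Free().
void ReleaseSamples(MockContext* ctx, SampleState* state) {
  ctx->Free(state->reported_bytes);
  state->reported_bytes = 0;
  std::vector<double>().swap(state->samples);
}
```

Keeping the reported total alongside the state makes it straightforward to hand back exactly what was tracked.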
const char * FunctionContext::user | ( | ) | const |
Returns the user that is running the query. Returns NULL if it is not available.
Definition at line 237 of file udf.cc.
Referenced by impala::UtilityFunctions::User().
FunctionContext::ImpalaVersion FunctionContext::version | ( | ) | const |
Returns the version of Impala that's currently running.
Definition at line 233 of file udf.cc.
Referenced by ValidateUdf().
|
private |
Definition at line 214 of file udf.h.
Referenced by impala::FunctionContextImpl::Clone(), GetArgType(), GetConstantArg(), GetFunctionState(), GetNumArgs(), GetReturnType(), impl(), IsArgConstant(), and ~FunctionContext().