Impala
Impalaistheopensource,nativeanalyticdatabaseforApacheHadoop.
|
#include <agg-fn-evaluator.h>
Public Types | |
enum | AggregationOp { COUNT, MIN, MAX, SUM, AVG, NDV, OTHER } |
Public Member Functions | |
Status | Prepare (RuntimeState *state, const RowDescriptor &desc, const SlotDescriptor *intermediate_slot_desc, const SlotDescriptor *output_slot_desc, MemPool *agg_fn_pool, FunctionContext **agg_fn_ctx) |
~AggFnEvaluator () | |
Status | Open (RuntimeState *state, FunctionContext *agg_fn_ctx) |
void | Close (RuntimeState *state) |
const ColumnType & | intermediate_type () const |
bool | is_merge () const |
AggregationOp | agg_op () const |
const std::vector< ExprContext * > & | input_expr_ctxs () const |
bool | is_count_star () const |
bool | is_builtin () const |
bool | SupportsRemove () const |
bool | SupportsSerialize () const |
const std::string & | fn_name () const |
const std::string & | update_symbol () const |
const std::string & | merge_symbol () const |
std::string | DebugString () const |
void | Init (FunctionContext *agg_fn_ctx, Tuple *dst) |
Functions for different phases of the aggregation. More... | |
void | Add (FunctionContext *agg_fn_ctx, TupleRow *src, Tuple *dst) |
void | Remove (FunctionContext *agg_fn_ctx, TupleRow *src, Tuple *dst) |
void | Merge (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst) |
void | Serialize (FunctionContext *agg_fn_ctx, Tuple *dst) |
void | Finalize (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst) |
void | GetValue (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst) |
Static Public Member Functions | |
static Status | Create (ObjectPool *pool, const TExpr &desc, AggFnEvaluator **result) |
static Status | Create (ObjectPool *pool, const TExpr &desc, bool is_analytic_fn, AggFnEvaluator **result) |
static std::string | DebugString (const std::vector< AggFnEvaluator * > &exprs) |
static void | Init (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *dst) |
Helper functions for calling the above functions on many evaluators. More... | |
static void | Add (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, TupleRow *src, Tuple *dst) |
static void | Remove (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, TupleRow *src, Tuple *dst) |
static void | Serialize (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *dst) |
static void | GetValue (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *src, Tuple *dst) |
static void | Finalize (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *src, Tuple *dst) |
Private Member Functions | |
AggFnEvaluator (const TExprNode &desc, bool is_analytic_fn) | |
Use Create() instead. More... | |
void | Update (FunctionContext *agg_fn_ctx, TupleRow *row, Tuple *dst, void *fn) |
void | SerializeOrFinalize (FunctionContext *agg_fn_ctx, Tuple *src, const SlotDescriptor *dst_slot_desc, Tuple *dst, void *fn) |
void | SetDstSlot (FunctionContext *ctx, const impala_udf::AnyVal *src, const SlotDescriptor *dst_slot_desc, Tuple *dst) |
Writes the result in src into dst pointed to by dst_slot_desc. More... | |
Private Attributes | |
const TFunction | fn_ |
const bool | is_merge_ |
Indicates whether to Update() or Merge() More... | |
const bool | is_analytic_fn_ |
Indicates which functions must be loaded. More... | |
const SlotDescriptor * | intermediate_slot_desc_ |
Slot into which Update()/Merge()/Serialize() write their result. Not owned. More... | |
const SlotDescriptor * | output_slot_desc_ |
std::vector< ExprContext * > | input_expr_ctxs_ |
AggregationOp | agg_op_ |
The enum for some of the builtins that still require special cased logic. More... | |
std::vector< impala_udf::AnyVal * > | staging_input_vals_ |
impala_udf::AnyVal * | staging_intermediate_val_ |
impala_udf::AnyVal * | staging_merge_input_val_ |
LibCache::LibCacheEntry * | cache_entry_ |
Cache entry for the library containing the function ptrs. More... | |
void * | init_fn_ |
Function ptrs for the different phases of the aggregate function. More... | |
void * | update_fn_ |
void * | remove_fn_ |
void * | merge_fn_ |
void * | serialize_fn_ |
void * | get_value_fn_ |
void * | finalize_fn_ |
This class evaluates aggregate functions. Aggregate functions can either be builtins or external UDAs. For both of types types, they can either use codegen or not. This class provides an interface that's 1:1 with the UDA interface and serves as glue code between the TupleRow/Tuple signature used by the AggregationNode and the AnyVal signature of the UDA interface. It handles evaluating input slots from TupleRows and aggregating the result to the result tuple. This class is not threadsafe. However, it can be used for multiple interleaved evaluations of the aggregation function by using multiple FunctionContexts.
Definition at line 62 of file agg-fn-evaluator.h.
TODO: The aggregation node has custom codegen paths for a few of the builtins. That logic needs to be removed. For now, add some enums for those builtins.
Enumerator | |
---|---|
COUNT | |
MIN | |
MAX | |
SUM | |
AVG | |
NDV | |
OTHER |
Definition at line 66 of file agg-fn-evaluator.h.
AggFnEvaluator::~AggFnEvaluator | ( | ) |
Definition at line 120 of file agg-fn-evaluator.cc.
References cache_entry_.
|
private |
|
inline |
Updates the intermediate state dst based on adding the input src row. This can be called either to drive the UDA's Update() or Merge() function depending on is_merge_. That is, from the caller, it doesn't mater.
Definition at line 238 of file agg-fn-evaluator.h.
References impala_udf::FunctionContext::impl(), and impala::FunctionContextImpl::IncrementNumUpdates().
Referenced by impala::AnalyticEvalNode::AddRow(), and impala::AggregationNode::UpdateTuple().
|
inlinestatic |
Definition at line 268 of file agg-fn-evaluator.h.
|
inline |
Definition at line 112 of file agg-fn-evaluator.h.
Referenced by impala::AggregationNode::CodegenUpdateSlot(), impala::PartitionedAggregationNode::CodegenUpdateSlot(), impala::AggregationNode::CodegenUpdateTuple(), impala::PartitionedAggregationNode::CodegenUpdateTuple(), impala::AggregationNode::ConstructIntermediateTuple(), and impala::PartitionedAggregationNode::ConstructIntermediateTuple().
void AggFnEvaluator::Close | ( | RuntimeState * | state | ) |
Definition at line 227 of file agg-fn-evaluator.cc.
References cache_entry_, impala::Expr::Close(), impala::LibCache::DecrementUseCount(), input_expr_ctxs_, and impala::LibCache::instance().
|
static |
Creates an AggFnEvaluator object from desc. The object is added to 'pool' and returned in *result. This constructs the input Expr trees for this aggregate function as specified in desc. The result is returned in *result.
Definition at line 64 of file agg-fn-evaluator.cc.
Referenced by impala::AggregationNode::Init(), impala::AnalyticEvalNode::Init(), and impala::PartitionedAggregationNode::Init().
|
static |
Creates an AggFnEvaluator object from desc. If is_analytic_fn, the evaluator is prepared for analytic function evaluation. TODO: Avoid parameter for analytic fns, should this be added to TAggregateExpr?
Definition at line 69 of file agg-fn-evaluator.cc.
References impala::ObjectPool::Add(), impala::Expr::CreateTreeFromThrift(), impala::Status::OK, and RETURN_IF_ERROR.
|
static |
string AggFnEvaluator::DebugString | ( | ) | const |
Definition at line 518 of file agg-fn-evaluator.cc.
References agg_op_, and input_expr_ctxs_.
Referenced by impala::AggregationNode::DebugString(), impala::AnalyticEvalNode::DebugString(), and impala::PartitionedAggregationNode::DebugString().
|
inline |
|
inlinestatic |
Definition at line 296 of file agg-fn-evaluator.h.
|
inline |
Definition at line 118 of file agg-fn-evaluator.h.
Referenced by impala::PartitionedAggregationNode::CodegenUpdateTuple().
|
inline |
Puts the finalized value from Tuple* src in Tuple* dst just as Finalize() does. However, unlike Finalize(), GetValue() does not clean up state in src. GetValue() can be called repeatedly with the same src. Only used internally for analytic fn builtins.
Definition at line 256 of file agg-fn-evaluator.h.
Referenced by impala::AnalyticEvalNode::AddResultTuple().
|
inlinestatic |
Definition at line 289 of file agg-fn-evaluator.h.
void AggFnEvaluator::Init | ( | FunctionContext * | agg_fn_ctx, |
Tuple * | dst | ||
) |
Functions for different phases of the aggregation.
Definition at line 314 of file agg-fn-evaluator.cc.
References impala::StringValue::CharSlotToPtr(), impala::Tuple::GetSlot(), impala_udf::FunctionContext::impl(), init_fn_, intermediate_slot_desc_, intermediate_type(), impala_udf::AnyVal::is_null, impala::Tuple::IsNull(), impala::ColumnType::len, impala_udf::StringVal::len, impala::SlotDescriptor::null_indicator_offset(), impala_udf::StringVal::ptr, impala::FunctionContextImpl::set_num_removes(), impala::FunctionContextImpl::set_num_updates(), SetDstSlot(), staging_intermediate_val_, impala::SlotDescriptor::tuple_offset(), and impala::TYPE_CHAR.
Referenced by impala::AggregationNode::ConstructIntermediateTuple(), impala::PartitionedAggregationNode::ConstructIntermediateTuple(), impala::AnalyticEvalNode::InitNextPartition(), and impala::AnalyticEvalNode::Open().
|
inlinestatic |
Helper functions for calling the above functions on many evaluators.
Definition at line 261 of file agg-fn-evaluator.h.
|
inline |
Definition at line 113 of file agg-fn-evaluator.h.
Referenced by impala::AggregationNode::CodegenUpdateSlot(), and impala::PartitionedAggregationNode::CodegenUpdateSlot().
|
inline |
Definition at line 110 of file agg-fn-evaluator.h.
References impala::ColumnType::type.
Referenced by impala::PartitionedAggregationNode::CodegenUpdateSlot(), Init(), Prepare(), and SerializeOrFinalize().
|
inline |
Definition at line 115 of file agg-fn-evaluator.h.
Referenced by impala::AggregationNode::CodegenUpdateTuple(), and impala::PartitionedAggregationNode::CodegenUpdateTuple().
|
inline |
Definition at line 114 of file agg-fn-evaluator.h.
Referenced by impala::AggregationNode::CodegenUpdateTuple(), and impala::PartitionedAggregationNode::CodegenUpdateTuple().
|
inline |
Definition at line 111 of file agg-fn-evaluator.h.
Referenced by impala::AggregationNode::CodegenUpdateSlot(), and impala::PartitionedAggregationNode::CodegenUpdateSlot().
void AggFnEvaluator::Merge | ( | FunctionContext * | agg_fn_ctx, |
Tuple * | src, | ||
Tuple * | dst | ||
) |
Explicitly does a merge, even if this evalutor is not marked as merging. This is used by the partitioned agg node when it needs to merge spill results. In the non-spilling case, this node would normally not merge.
Definition at line 409 of file agg-fn-evaluator.cc.
References intermediate_slot_desc_, merge_fn_, SetAnyVal(), SetDstSlot(), staging_intermediate_val_, and staging_merge_input_val_.
|
inline |
Definition at line 120 of file agg-fn-evaluator.h.
Referenced by impala::PartitionedAggregationNode::CodegenUpdateSlot().
Status AggFnEvaluator::Open | ( | RuntimeState * | state, |
FunctionContext * | agg_fn_ctx | ||
) |
'agg_fn_ctx' may be cloned after calling Open(). Note that closing all FunctionContexts, including the original one returned by Prepare(), is the responsibility of the caller.
Definition at line 214 of file agg-fn-evaluator.cc.
References impala_udf::FunctionContext::impl(), input_expr_ctxs_, impala::Status::OK, impala::Expr::Open(), RETURN_IF_ERROR, and impala::FunctionContextImpl::SetConstantArgs().
Status AggFnEvaluator::Prepare | ( | RuntimeState * | state, |
const RowDescriptor & | desc, | ||
const SlotDescriptor * | intermediate_slot_desc, | ||
const SlotDescriptor * | output_slot_desc, | ||
MemPool * | agg_fn_pool, | ||
FunctionContext ** | agg_fn_ctx | ||
) |
Initializes the agg expr. 'desc' must be the row descriptor for the input TupleRow. It is used to get the input values in the Update() and Merge() functions. 'intermediate_slot_desc' is the slot into which this evaluator should write the results of Update()/Merge()/Serialize(). 'output_slot_desc' is the slot into which this evaluator should write the results of Finalize() 'agg_fn_ctx' will be initialized for the agg function using 'agg_fn_pool'. Caller is responsible for closing and deleting 'agg_fn_ctx'.
Definition at line 124 of file agg-fn-evaluator.cc.
References cache_entry_, impala::AnyValUtil::ColumnTypeToTypeDesc(), impala::CreateAnyVal(), impala::FunctionContextImpl::CreateContext(), finalize_fn_, fn_, get_value_fn_, init_fn_, input_expr_ctxs_, impala::LibCache::instance(), intermediate_slot_desc_, intermediate_type(), is_analytic_fn_, is_merge_, impala::MemPool::mem_tracker(), merge_fn_, impala::RuntimeState::obj_pool(), impala::obj_pool(), impala::Status::OK, output_slot_desc_, impala::Expr::Prepare(), remove_fn_, RETURN_IF_ERROR, serialize_fn_, staging_input_vals_, staging_intermediate_val_, staging_merge_input_val_, impala::ColumnType::type, impala::SlotDescriptor::type(), and update_fn_.
|
inline |
Updates the intermediate state dst to remove the input src row, i.e. undoes Add(src, dst). Only used internally for analytic fn builtins.
Definition at line 243 of file agg-fn-evaluator.h.
References impala_udf::FunctionContext::impl(), and impala::FunctionContextImpl::IncrementNumRemoves().
Referenced by impala::AnalyticEvalNode::TryAddRemainingResults(), and impala::AnalyticEvalNode::TryRemoveRowsBeforeWindow().
|
inlinestatic |
Definition at line 275 of file agg-fn-evaluator.h.
|
inline |
Definition at line 248 of file agg-fn-evaluator.h.
Referenced by impala::PartitionedAggregationNode::CleanupHashTbl(), impala::AggregationNode::Close(), impala::AggregationNode::FinalizeTuple(), impala::PartitionedAggregationNode::GetOutputTuple(), and impala::PartitionedAggregationNode::Partition::Spill().
|
inlinestatic |
Definition at line 282 of file agg-fn-evaluator.h.
|
private |
Sets up the arguments to call fn. This converts from the agg-expr signature, taking TupleRow to the UDA signature taking AnvVals. Writes the serialize/finalize result to the given destination slot/tuple. The fn can be NULL to indicate the src value should simply be written into the destination.
Definition at line 421 of file agg-fn-evaluator.cc.
References impala::Tuple::GetSlot(), intermediate_slot_desc_, intermediate_type(), impala::Tuple::IsNull(), impala::SlotDescriptor::null_indicator_offset(), impala::AnyValUtil::SetAnyVal(), SetDstSlot(), staging_intermediate_val_, impala::SlotDescriptor::tuple_offset(), impala::ColumnType::type, impala::SlotDescriptor::type(), impala::TYPE_BIGINT, impala::TYPE_BOOLEAN, impala::TYPE_DECIMAL, impala::TYPE_DOUBLE, impala::TYPE_FLOAT, impala::TYPE_INT, impala::TYPE_SMALLINT, impala::TYPE_STRING, impala::TYPE_TIMESTAMP, impala::TYPE_TINYINT, impala::TYPE_VARCHAR, and impala::RawValue::Write().
|
inlineprivate |
Writes the result in src into dst pointed to by dst_slot_desc.
Definition at line 236 of file agg-fn-evaluator.cc.
References impala::StringValue::FromStringVal(), impala::TimestampValue::FromTimestampVal(), impala::ColumnType::GetByteSize(), impala::Tuple::GetSlot(), impala_udf::AnyVal::is_null, impala::SlotDescriptor::null_indicator_offset(), impala_udf::FunctionContext::SetError(), impala::Tuple::SetNotNull(), impala::Tuple::SetNull(), impala::SlotDescriptor::tuple_offset(), impala::ColumnType::type, impala::SlotDescriptor::type(), impala::TYPE_BIGINT, impala::TYPE_BOOLEAN, impala::TYPE_CHAR, impala::TYPE_DECIMAL, impala::TYPE_DOUBLE, impala::TYPE_FLOAT, impala::TYPE_INT, impala::TYPE_NULL, impala::TYPE_SMALLINT, impala::TYPE_STRING, impala::TYPE_TIMESTAMP, impala::TYPE_TINYINT, and impala::TYPE_VARCHAR.
Referenced by Init(), Merge(), SerializeOrFinalize(), and Update().
|
inline |
Definition at line 116 of file agg-fn-evaluator.h.
|
inline |
Definition at line 117 of file agg-fn-evaluator.h.
|
private |
TODO: these functions below are not extensible and we need to use codegen to generate the calls into the UDA functions (like for UDFs). Remove these functions when this is supported. Sets up the arguments to call fn. This converts from the agg-expr signature, taking TupleRow to the UDA signature taking AnvVals by populating the staging AnyVals. fn must be a function that implement's the UDA Update() signature.
Definition at line 339 of file agg-fn-evaluator.cc.
References input_expr_ctxs_, intermediate_slot_desc_, impala::AnyValUtil::SetAnyVal(), SetAnyVal(), SetDstSlot(), staging_input_vals_, and staging_intermediate_val_.
|
inline |
Definition at line 119 of file agg-fn-evaluator.h.
Referenced by impala::PartitionedAggregationNode::CodegenUpdateSlot().
|
private |
The enum for some of the builtins that still require special cased logic.
Definition at line 191 of file agg-fn-evaluator.h.
Referenced by AggFnEvaluator(), and DebugString().
|
private |
Cache entry for the library containing the function ptrs.
Definition at line 202 of file agg-fn-evaluator.h.
Referenced by Close(), Prepare(), and ~AggFnEvaluator().
|
private |
Definition at line 211 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
TODO: implement codegen path. These functions would return IR functions with the same signature as the interpreted ones above. Function* GetIrInitFn(); Function* GetIrAddFn(); Function* GetIrRemoveFn(); Function* GetIrSerializeFn(); Function* GetIrGetValueFn(); Function* GetIrFinalizeFn();
Definition at line 175 of file agg-fn-evaluator.h.
Referenced by AggFnEvaluator(), and Prepare().
|
private |
Definition at line 210 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Function ptrs for the different phases of the aggregate function.
Definition at line 205 of file agg-fn-evaluator.h.
|
private |
Definition at line 188 of file agg-fn-evaluator.h.
Referenced by Close(), DebugString(), Open(), Prepare(), and Update().
|
private |
Slot into which Update()/Merge()/Serialize() write their result. Not owned.
Definition at line 182 of file agg-fn-evaluator.h.
Referenced by Init(), Merge(), Prepare(), SerializeOrFinalize(), and Update().
|
private |
Indicates which functions must be loaded.
Definition at line 179 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Indicates whether to Update() or Merge()
Definition at line 177 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Definition at line 208 of file agg-fn-evaluator.h.
|
private |
Slot into which Finalize() results are written. Not owned. Identical to intermediate_slot_desc_ if this agg fn has the same intermediate and output type.
Definition at line 186 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Definition at line 207 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Definition at line 209 of file agg-fn-evaluator.h.
Referenced by Prepare().
|
private |
Created to a subclass of AnyVal for type(). We use this to convert values from the UDA interface to the Expr interface. These objects are allocated in the runtime state's object pool. TODO: this is awful, remove this when exprs are updated.
Definition at line 197 of file agg-fn-evaluator.h.
|
private |
Definition at line 198 of file agg-fn-evaluator.h.
Referenced by Init(), Merge(), Prepare(), SerializeOrFinalize(), and Update().
|
private |
Definition at line 199 of file agg-fn-evaluator.h.
|
private |
Definition at line 206 of file agg-fn-evaluator.h.
Referenced by Prepare().