Impala
Impala is the open source, native analytic database for Apache Hadoop.
impala::AggFnEvaluator Class Reference

#include <agg-fn-evaluator.h>


Public Types

enum  AggregationOp {
  COUNT, MIN, MAX, SUM,
  AVG, NDV, OTHER
}
 

Public Member Functions

Status Prepare (RuntimeState *state, const RowDescriptor &desc, const SlotDescriptor *intermediate_slot_desc, const SlotDescriptor *output_slot_desc, MemPool *agg_fn_pool, FunctionContext **agg_fn_ctx)
 
 ~AggFnEvaluator ()
 
Status Open (RuntimeState *state, FunctionContext *agg_fn_ctx)
 
void Close (RuntimeState *state)
 
const ColumnType & intermediate_type () const
 
bool is_merge () const
 
AggregationOp agg_op () const
 
const std::vector< ExprContext * > & input_expr_ctxs () const
 
bool is_count_star () const
 
bool is_builtin () const
 
bool SupportsRemove () const
 
bool SupportsSerialize () const
 
const std::string & fn_name () const
 
const std::string & update_symbol () const
 
const std::string & merge_symbol () const
 
std::string DebugString () const
 
void Init (FunctionContext *agg_fn_ctx, Tuple *dst)
 Functions for different phases of the aggregation.
 
void Add (FunctionContext *agg_fn_ctx, TupleRow *src, Tuple *dst)
 
void Remove (FunctionContext *agg_fn_ctx, TupleRow *src, Tuple *dst)
 
void Merge (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst)
 
void Serialize (FunctionContext *agg_fn_ctx, Tuple *dst)
 
void Finalize (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst)
 
void GetValue (FunctionContext *agg_fn_ctx, Tuple *src, Tuple *dst)
 

Static Public Member Functions

static Status Create (ObjectPool *pool, const TExpr &desc, AggFnEvaluator **result)
 
static Status Create (ObjectPool *pool, const TExpr &desc, bool is_analytic_fn, AggFnEvaluator **result)
 
static std::string DebugString (const std::vector< AggFnEvaluator * > &exprs)
 
static void Init (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *dst)
 Helper functions for calling the above functions on many evaluators.
 
static void Add (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, TupleRow *src, Tuple *dst)
 
static void Remove (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, TupleRow *src, Tuple *dst)
 
static void Serialize (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *dst)
 
static void GetValue (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *src, Tuple *dst)
 
static void Finalize (const std::vector< AggFnEvaluator * > &evaluators, const std::vector< FunctionContext * > &fn_ctxs, Tuple *src, Tuple *dst)
 

Private Member Functions

 AggFnEvaluator (const TExprNode &desc, bool is_analytic_fn)
 Use Create() instead.
 
void Update (FunctionContext *agg_fn_ctx, TupleRow *row, Tuple *dst, void *fn)
 
void SerializeOrFinalize (FunctionContext *agg_fn_ctx, Tuple *src, const SlotDescriptor *dst_slot_desc, Tuple *dst, void *fn)
 
void SetDstSlot (FunctionContext *ctx, const impala_udf::AnyVal *src, const SlotDescriptor *dst_slot_desc, Tuple *dst)
 Writes the result in src into dst pointed to by dst_slot_desc.
 

Private Attributes

const TFunction fn_
 
const bool is_merge_
 Indicates whether to Update() or Merge().
 
const bool is_analytic_fn_
 Indicates which functions must be loaded.
 
const SlotDescriptor * intermediate_slot_desc_
 Slot into which Update()/Merge()/Serialize() write their result. Not owned.
 
const SlotDescriptor * output_slot_desc_
 
std::vector< ExprContext * > input_expr_ctxs_
 
AggregationOp agg_op_
 The enum for some of the builtins that still require special cased logic.
 
std::vector< impala_udf::AnyVal * > staging_input_vals_
 
impala_udf::AnyVal * staging_intermediate_val_
 
impala_udf::AnyVal * staging_merge_input_val_
 
LibCache::LibCacheEntry * cache_entry_
 Cache entry for the library containing the function ptrs.
 
void * init_fn_
 Function ptrs for the different phases of the aggregate function.
 
void * update_fn_
 
void * remove_fn_
 
void * merge_fn_
 
void * serialize_fn_
 
void * get_value_fn_
 
void * finalize_fn_
 

Detailed Description

This class evaluates aggregate functions. Aggregate functions can be either builtins or external UDAs, and either kind can use codegen or not. This class provides an interface that's 1:1 with the UDA interface and serves as glue code between the TupleRow/Tuple signature used by the AggregationNode and the AnyVal signature of the UDA interface. It handles evaluating input slots from TupleRows and aggregating the result into the result tuple. This class is not thread-safe. However, it can be used for multiple interleaved evaluations of the aggregation function by using multiple FunctionContexts.

Definition at line 62 of file agg-fn-evaluator.h.
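
As a rough illustration of the aggregation phases that this class mirrors, the following minimal sketch models an AVG-style aggregate with separate Init, Update, Merge and Finalize steps. Everything in it (AvgState, the Avg* functions, AverageInTwoPartitions) is a hypothetical stand-in written for this example; it does not use Impala's actual UDA signatures, Tuple layout or AnyVal types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical intermediate state for an AVG-style aggregate (not an Impala type).
struct AvgState {
  double sum;
  long count;
};

// Init phase: reset the intermediate state.
void AvgInit(AvgState* dst) {
  dst->sum = 0.0;
  dst->count = 0;
}

// Update phase: fold one raw input value into the intermediate state.
void AvgUpdate(double input, AvgState* dst) {
  dst->sum += input;
  ++dst->count;
}

// Merge phase: fold one intermediate state into another.
void AvgMerge(const AvgState& src, AvgState* dst) {
  dst->sum += src.sum;
  dst->count += src.count;
}

// Finalize phase: turn the intermediate state into the output value.
// Returning 0.0 for an empty input is a simplification made for this sketch.
double AvgFinalize(const AvgState& src) {
  return src.count == 0 ? 0.0 : src.sum / src.count;
}

// Aggregates two partitions independently, merges them, then finalizes.
double AverageInTwoPartitions(const std::vector<double>& a,
                              const std::vector<double>& b) {
  AvgState state_a;
  AvgState state_b;
  AvgInit(&state_a);
  AvgInit(&state_b);
  for (double v : a) AvgUpdate(v, &state_a);
  for (double v : b) AvgUpdate(v, &state_b);
  AvgMerge(state_b, &state_a);
  return AvgFinalize(state_a);
}
```

Note that the intermediate type (sum and count) differs from the output type (a single double), which is the situation in which separate intermediate and output slots are needed.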

Member Enumeration Documentation

enum impala::AggFnEvaluator::AggregationOp

TODO: The aggregation node has custom codegen paths for a few of the builtins. That logic needs to be removed. For now, add some enums for those builtins.

Enumerator
COUNT 
MIN 
MAX 
SUM 
AVG 
NDV 
OTHER 

Definition at line 66 of file agg-fn-evaluator.h.

Constructor & Destructor Documentation

AggFnEvaluator::~AggFnEvaluator ( )

Definition at line 120 of file agg-fn-evaluator.cc.

References cache_entry_.

AggFnEvaluator::AggFnEvaluator ( const TExprNode &  desc,
bool  is_analytic_fn 
)
private

Use Create() instead.

Definition at line 85 of file agg-fn-evaluator.cc.

References agg_op_, AVG, COUNT, fn_, MAX, MIN, NDV, OTHER, and SUM.

Member Function Documentation

void impala::AggFnEvaluator::Add ( FunctionContext *  agg_fn_ctx,
TupleRow *  src,
Tuple *  dst 
)
inline

Updates the intermediate state dst based on adding the input src row. This can drive either the UDA's Update() or Merge() function, depending on is_merge_; from the caller's perspective, it doesn't matter which.

Definition at line 238 of file agg-fn-evaluator.h.

References impala_udf::FunctionContext::impl(), and impala::FunctionContextImpl::IncrementNumUpdates().

Referenced by impala::AnalyticEvalNode::AddRow(), and impala::AggregationNode::UpdateTuple().
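
The dispatch described for Add() can be pictured with the small sketch below, assuming a COUNT-like aggregate. ToyCountEvaluator and Accumulate are hypothetical names invented for this example, and plain integers stand in for TupleRow and Tuple; the real function operates on slots and staging AnyVals.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an evaluator of a COUNT-like aggregate.
class ToyCountEvaluator {
 public:
  explicit ToyCountEvaluator(bool is_merge) : is_merge_(is_merge) {}

  // The caller always calls Add(). In update mode 'src' is a raw input row,
  // so the row is counted once. In merge mode 'src' is a partial count
  // produced earlier, so it is added to the running total.
  void Add(int64_t src, int64_t* dst) const {
    if (is_merge_) {
      *dst += src;  // Merge() path
    } else {
      *dst += 1;    // Update() path
    }
  }

 private:
  const bool is_merge_;
};

// Feeds every element of 'srcs' through Add() and returns the final state.
int64_t Accumulate(const ToyCountEvaluator& eval,
                   const std::vector<int64_t>& srcs) {
  int64_t dst = 0;
  for (int64_t src : srcs) eval.Add(src, &dst);
  return dst;
}
```

The calling loop in Accumulate is identical for both modes; only the flag fixed at construction decides which path runs.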

void impala::AggFnEvaluator::Add ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
TupleRow *  src,
Tuple *  dst 
)
inlinestatic

Definition at line 268 of file agg-fn-evaluator.h.

Status AggFnEvaluator::Create ( ObjectPool *  pool,
const TExpr &  desc,
AggFnEvaluator **  result 
)
static

Creates an AggFnEvaluator object from desc. The object is added to 'pool' and returned in *result. This constructs the input Expr trees for this aggregate function as specified in desc.

Definition at line 64 of file agg-fn-evaluator.cc.

Referenced by impala::AggregationNode::Init(), impala::AnalyticEvalNode::Init(), and impala::PartitionedAggregationNode::Init().

Status AggFnEvaluator::Create ( ObjectPool *  pool,
const TExpr &  desc,
bool  is_analytic_fn,
AggFnEvaluator **  result 
)
static

Creates an AggFnEvaluator object from desc. If is_analytic_fn, the evaluator is prepared for analytic function evaluation. TODO: Avoid parameter for analytic fns, should this be added to TAggregateExpr?

Definition at line 69 of file agg-fn-evaluator.cc.

References impala::ObjectPool::Add(), impala::Expr::CreateTreeFromThrift(), impala::Status::OK, and RETURN_IF_ERROR.

static std::string impala::AggFnEvaluator::DebugString ( const std::vector< AggFnEvaluator * > &  exprs)
static

string AggFnEvaluator::DebugString ( ) const

void impala::AggFnEvaluator::Finalize ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
Tuple *  src,
Tuple *  dst 
)
inlinestatic

Definition at line 296 of file agg-fn-evaluator.h.

const std::string& impala::AggFnEvaluator::fn_name ( ) const
inline

void impala::AggFnEvaluator::GetValue ( FunctionContext *  agg_fn_ctx,
Tuple *  src,
Tuple *  dst 
)
inline

Puts the finalized value from Tuple* src in Tuple* dst just as Finalize() does. However, unlike Finalize(), GetValue() does not clean up state in src. GetValue() can be called repeatedly with the same src. Only used internally for analytic fn builtins.

Definition at line 256 of file agg-fn-evaluator.h.

Referenced by impala::AnalyticEvalNode::AddResultTuple().
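
The difference between GetValue() and Finalize() can be sketched as follows, assuming an aggregate whose intermediate state owns a heap-allocated buffer. ToyState, MakeToyState, ToyGetValue and ToyFinalize are hypothetical names for this example only; the real functions read from and write to Tuple slots.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical intermediate state that owns a heap-allocated buffer.
struct ToyState {
  std::unique_ptr<std::vector<int>> values;
};

// Builds a state holding the given values.
ToyState MakeToyState(std::vector<int> values) {
  ToyState state;
  state.values.reset(new std::vector<int>(std::move(values)));
  return state;
}

// GetValue-style call: computes the result and leaves 'src' untouched,
// so it can be called repeatedly on the same state.
int ToyGetValue(const ToyState& src) {
  int sum = 0;
  if (src.values != nullptr) {
    for (int v : *src.values) sum += v;
  }
  return sum;
}

// Finalize-style call: computes the same result, then releases the
// resources held by 'src'. The state must not be reused afterwards.
int ToyFinalize(ToyState* src) {
  int result = ToyGetValue(*src);
  src->values.reset();
  return result;
}
```

An analytic window that emits a result for every row would use the non-destructive call for each row and the destructive one only once the state is no longer needed.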

void impala::AggFnEvaluator::GetValue ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
Tuple *  src,
Tuple *  dst 
)
inlinestatic

Definition at line 289 of file agg-fn-evaluator.h.

void impala::AggFnEvaluator::Init ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
Tuple *  dst 
)
inlinestatic

Helper functions for calling the above functions on many evaluators.

Definition at line 261 of file agg-fn-evaluator.h.
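
A helper of this kind can be pictured as a loop over parallel vectors. The sketch below assumes that evaluator i is paired with context i and writes to its own destination slot i; that pairing, and the names ToyContext, ToyEvaluator and AddAll, are assumptions made for this example rather than details confirmed by this page.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-evaluation context that tracks how often it was used.
struct ToyContext {
  int num_updates = 0;
};

// Hypothetical evaluator computing a weighted sum into its own slot.
struct ToyEvaluator {
  int weight;

  void Add(ToyContext* ctx, int src, long* dst_slot) const {
    ++ctx->num_updates;
    *dst_slot += static_cast<long>(weight) * src;
  }
};

// Calls Add() on every evaluator, pairing evaluators[i] with ctxs[i]
// and with slot i of 'dst'.
void AddAll(const std::vector<ToyEvaluator*>& evaluators,
            const std::vector<ToyContext*>& ctxs,
            int src,
            std::vector<long>* dst) {
  assert(evaluators.size() == ctxs.size());
  assert(evaluators.size() == dst->size());
  for (std::size_t i = 0; i < evaluators.size(); ++i) {
    evaluators[i]->Add(ctxs[i], src, &(*dst)[i]);
  }
}
```

Keeping the mutable per-evaluation state in the context rather than in the evaluator is what would let one evaluator serve several interleaved evaluations, each with its own context.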

const std::vector<ExprContext*>& impala::AggFnEvaluator::input_expr_ctxs ( ) const
inline

const ColumnType& impala::AggFnEvaluator::intermediate_type ( ) const
inline

bool impala::AggFnEvaluator::is_builtin ( ) const
inline

bool impala::AggFnEvaluator::is_count_star ( ) const
inline

bool impala::AggFnEvaluator::is_merge ( ) const
inline

void AggFnEvaluator::Merge ( FunctionContext *  agg_fn_ctx,
Tuple *  src,
Tuple *  dst 
)

Explicitly does a merge, even if this evaluator is not marked as merging. This is used by the partitioned agg node when it needs to merge spill results. In the non-spilling case, this node would normally not merge.

Definition at line 409 of file agg-fn-evaluator.cc.

References intermediate_slot_desc_, merge_fn_, SetAnyVal(), SetDstSlot(), staging_intermediate_val_, and staging_merge_input_val_.

const std::string& impala::AggFnEvaluator::merge_symbol ( ) const
inline

Status AggFnEvaluator::Open ( RuntimeState *  state,
FunctionContext *  agg_fn_ctx 
)

'agg_fn_ctx' may be cloned after calling Open(). Note that closing all FunctionContexts, including the original one returned by Prepare(), is the responsibility of the caller.

Definition at line 214 of file agg-fn-evaluator.cc.

References impala_udf::FunctionContext::impl(), input_expr_ctxs_, impala::Status::OK, impala::Expr::Open(), RETURN_IF_ERROR, and impala::FunctionContextImpl::SetConstantArgs().

Status AggFnEvaluator::Prepare ( RuntimeState *  state,
const RowDescriptor &  desc,
const SlotDescriptor *  intermediate_slot_desc,
const SlotDescriptor *  output_slot_desc,
MemPool *  agg_fn_pool,
FunctionContext **  agg_fn_ctx 
)

Initializes the agg expr. 'desc' must be the row descriptor for the input TupleRow. It is used to get the input values in the Update() and Merge() functions. 'intermediate_slot_desc' is the slot into which this evaluator should write the results of Update()/Merge()/Serialize(). 'output_slot_desc' is the slot into which this evaluator should write the results of Finalize(). 'agg_fn_ctx' will be initialized for the agg function using 'agg_fn_pool'. Caller is responsible for closing and deleting 'agg_fn_ctx'.

Definition at line 124 of file agg-fn-evaluator.cc.

References cache_entry_, impala::AnyValUtil::ColumnTypeToTypeDesc(), impala::CreateAnyVal(), impala::FunctionContextImpl::CreateContext(), finalize_fn_, fn_, get_value_fn_, init_fn_, input_expr_ctxs_, impala::LibCache::instance(), intermediate_slot_desc_, intermediate_type(), is_analytic_fn_, is_merge_, impala::MemPool::mem_tracker(), merge_fn_, impala::RuntimeState::obj_pool(), impala::obj_pool(), impala::Status::OK, output_slot_desc_, impala::Expr::Prepare(), remove_fn_, RETURN_IF_ERROR, serialize_fn_, staging_input_vals_, staging_intermediate_val_, staging_merge_input_val_, impala::ColumnType::type, impala::SlotDescriptor::type(), and update_fn_.

void impala::AggFnEvaluator::Remove ( FunctionContext *  agg_fn_ctx,
TupleRow *  src,
Tuple *  dst 
)
inline

Updates the intermediate state dst to remove the input src row, i.e. undoes Add(src, dst). Only used internally for analytic fn builtins.

Definition at line 243 of file agg-fn-evaluator.h.

References impala_udf::FunctionContext::impl(), and impala::FunctionContextImpl::IncrementNumRemoves().

Referenced by impala::AnalyticEvalNode::TryAddRemainingResults(), and impala::AnalyticEvalNode::TryRemoveRowsBeforeWindow().
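
To illustrate why an operation that undoes Add() is useful for analytic functions, the sketch below maintains a sliding-window sum by adding each new row and removing the row that falls out of the window. ToyAdd, ToyRemove and SlidingSums are hypothetical names for this example, and the window is a simple fixed row count, which is an assumption rather than a description of how AnalyticEvalNode defines its windows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Add-style step: fold one row into the running state.
void ToyAdd(int src, long* dst) { *dst += src; }

// Remove-style step: undo the effect of a previous ToyAdd(src, dst).
void ToyRemove(int src, long* dst) { *dst -= src; }

// Returns, for each row, the sum of that row and the preceding rows within
// a window of 'window' rows, without rescanning the window every time.
std::vector<long> SlidingSums(const std::vector<int>& rows, std::size_t window) {
  assert(window > 0);
  std::vector<long> out;
  long state = 0;
  for (std::size_t i = 0; i < rows.size(); ++i) {
    ToyAdd(rows[i], &state);
    // Once the window is full, drop the row that just left it.
    if (i >= window) ToyRemove(rows[i - window], &state);
    out.push_back(state);
  }
  return out;
}
```

Each row is added once and removed at most once, so the cost per output row stays constant regardless of the window size. This only works for aggregates whose Add() can be inverted, which is presumably what SupportsRemove() reports.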

void impala::AggFnEvaluator::Remove ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
TupleRow *  src,
Tuple *  dst 
)
inlinestatic

Definition at line 275 of file agg-fn-evaluator.h.

void impala::AggFnEvaluator::Serialize ( const std::vector< AggFnEvaluator * > &  evaluators,
const std::vector< FunctionContext * > &  fn_ctxs,
Tuple *  dst 
)
inlinestatic

Definition at line 282 of file agg-fn-evaluator.h.

void AggFnEvaluator::SerializeOrFinalize ( FunctionContext *  agg_fn_ctx,
Tuple *  src,
const SlotDescriptor *  dst_slot_desc,
Tuple *  dst,
void *  fn 
)
private

Sets up the arguments to call fn. This converts from the agg-expr signature, taking TupleRow, to the UDA signature, taking AnyVals. Writes the serialize/finalize result to the given destination slot/tuple. The fn can be NULL to indicate the src value should simply be written into the destination.

Definition at line 421 of file agg-fn-evaluator.cc.

References impala::Tuple::GetSlot(), intermediate_slot_desc_, intermediate_type(), impala::Tuple::IsNull(), impala::SlotDescriptor::null_indicator_offset(), impala::AnyValUtil::SetAnyVal(), SetDstSlot(), staging_intermediate_val_, impala::SlotDescriptor::tuple_offset(), impala::ColumnType::type, impala::SlotDescriptor::type(), impala::TYPE_BIGINT, impala::TYPE_BOOLEAN, impala::TYPE_DECIMAL, impala::TYPE_DOUBLE, impala::TYPE_FLOAT, impala::TYPE_INT, impala::TYPE_SMALLINT, impala::TYPE_STRING, impala::TYPE_TIMESTAMP, impala::TYPE_TINYINT, impala::TYPE_VARCHAR, and impala::RawValue::Write().
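
The NULL-function convention described above can be sketched as follows. ToyFinalizeFn, ToySerializeOrFinalize and HalveForOutput are hypothetical names for this example, and a typed function pointer over plain integers replaces the untyped void* and the Tuple/AnyVal conversion used by the real function.

```cpp
#include <cassert>

// Hypothetical signature of an optional serialize/finalize step.
using ToyFinalizeFn = long (*)(long);

// If 'fn' is null, the intermediate value is passed through unchanged;
// otherwise the value is transformed by 'fn' before being returned.
long ToySerializeOrFinalize(long src, ToyFinalizeFn fn) {
  if (fn == nullptr) return src;
  return fn(src);
}

// Example finalize step used only for illustration.
long HalveForOutput(long intermediate) { return intermediate / 2; }
```

The pass-through case suits aggregates whose intermediate and output values are the same, where no separate finalize step has to be supplied.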

bool impala::AggFnEvaluator::SupportsRemove ( ) const
inline

Definition at line 116 of file agg-fn-evaluator.h.

bool impala::AggFnEvaluator::SupportsSerialize ( ) const
inline

Definition at line 117 of file agg-fn-evaluator.h.

void AggFnEvaluator::Update ( FunctionContext *  agg_fn_ctx,
TupleRow *  row,
Tuple *  dst,
void *  fn 
)
private

TODO: these functions below are not extensible and we need to use codegen to generate the calls into the UDA functions (like for UDFs). Remove these functions when this is supported. Sets up the arguments to call fn. This converts from the agg-expr signature, taking TupleRow, to the UDA signature, taking AnyVals, by populating the staging AnyVals. fn must be a function that implements the UDA Update() signature.

Definition at line 339 of file agg-fn-evaluator.cc.

References input_expr_ctxs_, intermediate_slot_desc_, impala::AnyValUtil::SetAnyVal(), SetAnyVal(), SetDstSlot(), staging_input_vals_, and staging_intermediate_val_.

const std::string& impala::AggFnEvaluator::update_symbol ( ) const
inline

Member Data Documentation

AggregationOp impala::AggFnEvaluator::agg_op_
private

The enum for some of the builtins that still require special cased logic.

Definition at line 191 of file agg-fn-evaluator.h.

Referenced by AggFnEvaluator(), and DebugString().

LibCache::LibCacheEntry* impala::AggFnEvaluator::cache_entry_
private

Cache entry for the library containing the function ptrs.

Definition at line 202 of file agg-fn-evaluator.h.

Referenced by Close(), Prepare(), and ~AggFnEvaluator().

void* impala::AggFnEvaluator::finalize_fn_
private

Definition at line 211 of file agg-fn-evaluator.h.

Referenced by Prepare().

const TFunction impala::AggFnEvaluator::fn_
private

TODO: implement codegen path. These functions would return IR functions with the same signature as the interpreted ones above. Function* GetIrInitFn(); Function* GetIrAddFn(); Function* GetIrRemoveFn(); Function* GetIrSerializeFn(); Function* GetIrGetValueFn(); Function* GetIrFinalizeFn();

Definition at line 175 of file agg-fn-evaluator.h.

Referenced by AggFnEvaluator(), and Prepare().

void* impala::AggFnEvaluator::get_value_fn_
private

Definition at line 210 of file agg-fn-evaluator.h.

Referenced by Prepare().

void* impala::AggFnEvaluator::init_fn_
private

Function ptrs for the different phases of the aggregate function.

Definition at line 205 of file agg-fn-evaluator.h.

Referenced by Init(), and Prepare().

std::vector<ExprContext*> impala::AggFnEvaluator::input_expr_ctxs_
private

Definition at line 188 of file agg-fn-evaluator.h.

Referenced by Close(), DebugString(), Open(), Prepare(), and Update().

const SlotDescriptor* impala::AggFnEvaluator::intermediate_slot_desc_
private

Slot into which Update()/Merge()/Serialize() write their result. Not owned.

Definition at line 182 of file agg-fn-evaluator.h.

Referenced by Init(), Merge(), Prepare(), SerializeOrFinalize(), and Update().

const bool impala::AggFnEvaluator::is_analytic_fn_
private

Indicates which functions must be loaded.

Definition at line 179 of file agg-fn-evaluator.h.

Referenced by Prepare().

const bool impala::AggFnEvaluator::is_merge_
private

Indicates whether to Update() or Merge()

Definition at line 177 of file agg-fn-evaluator.h.

Referenced by Prepare().

void* impala::AggFnEvaluator::merge_fn_
private

Definition at line 208 of file agg-fn-evaluator.h.

Referenced by Merge(), and Prepare().

const SlotDescriptor* impala::AggFnEvaluator::output_slot_desc_
private

Slot into which Finalize() results are written. Not owned. Identical to intermediate_slot_desc_ if this agg fn has the same intermediate and output type.

Definition at line 186 of file agg-fn-evaluator.h.

Referenced by Prepare().

void* impala::AggFnEvaluator::remove_fn_
private

Definition at line 207 of file agg-fn-evaluator.h.

Referenced by Prepare().

void* impala::AggFnEvaluator::serialize_fn_
private

Definition at line 209 of file agg-fn-evaluator.h.

Referenced by Prepare().

std::vector<impala_udf::AnyVal*> impala::AggFnEvaluator::staging_input_vals_
private

Created to a subclass of AnyVal for type(). We use this to convert values from the UDA interface to the Expr interface. These objects are allocated in the runtime state's object pool. TODO: this is awful, remove this when exprs are updated.

Definition at line 197 of file agg-fn-evaluator.h.

Referenced by Prepare(), and Update().

impala_udf::AnyVal* impala::AggFnEvaluator::staging_intermediate_val_
private

Definition at line 198 of file agg-fn-evaluator.h.

Referenced by Init(), Merge(), Prepare(), SerializeOrFinalize(), and Update().

impala_udf::AnyVal* impala::AggFnEvaluator::staging_merge_input_val_
private

Definition at line 199 of file agg-fn-evaluator.h.

Referenced by Merge(), and Prepare().

void* impala::AggFnEvaluator::update_fn_
private

Definition at line 206 of file agg-fn-evaluator.h.

Referenced by Prepare().


The documentation for this class was generated from the following files: