Impala
Impala is the open source, native analytic database for Apache Hadoop.
|
#include <partitioned-hash-join-node.h>
Classes | |
class | Partition |
Public Member Functions | |
PartitionedHashJoinNode (ObjectPool *pool, const TPlanNode &tnode, const DescriptorTbl &descs) | |
virtual Status | Init (const TPlanNode &tnode) |
virtual Status | Prepare (RuntimeState *state) |
virtual Status | GetNext (RuntimeState *state, RowBatch *row_batch, bool *eos) |
Open() implemented in BlockingJoinNode. More... | |
virtual Status | Reset (RuntimeState *state) |
virtual void | Close (RuntimeState *state) |
virtual Status | Open (RuntimeState *state) |
std::string | DebugString () const |
Returns a string representation in DFS order of the plan rooted at this. More... | |
void | CollectNodes (TPlanNodeType::type node_type, std::vector< ExecNode * > *nodes) |
void | CollectScanNodes (std::vector< ExecNode * > *nodes) |
Collect all scan node types. More... | |
const std::vector< ExprContext * > & | conjunct_ctxs () const |
int | id () const |
TPlanNodeType::type | type () const |
const RowDescriptor & | row_desc () const |
int64_t | rows_returned () const |
int64_t | limit () const |
bool | ReachedLimit () |
RuntimeProfile * | runtime_profile () |
MemTracker * | mem_tracker () |
MemTracker * | expr_mem_tracker () |
Static Public Member Functions | |
static Status | CreateTree (ObjectPool *pool, const TPlan &plan, const DescriptorTbl &descs, ExecNode **root) |
static void | SetDebugOptions (int node_id, TExecNodePhase::type phase, TDebugAction::type action, ExecNode *tree) |
Set debug action for node with given id in 'tree'. More... | |
static bool | EvalConjuncts (ExprContext *const *ctxs, int num_ctxs, TupleRow *row) |
static llvm::Function * | CodegenEvalConjuncts (RuntimeState *state, const std::vector< ExprContext * > &conjunct_ctxs, const char *name="EvalConjuncts") |
static int | GetNodeIdFromProfile (RuntimeProfile *p) |
Extract node id from p->name(). More... | |
Static Public Attributes | |
static const char * | LLVM_CLASS_NAME = "class.impala::BlockingJoinNode" |
static const std::string | ROW_THROUGHPUT_COUNTER = "RowsReturnedRate" |
Names of counters shared by all exec nodes. More... | |
Protected Member Functions | |
virtual void | AddToDebugString (int indentation_level, std::stringstream *out) const |
virtual Status | InitGetNext (TupleRow *first_probe_row) |
virtual Status | ConstructBuildSide (RuntimeState *state) |
virtual void | DebugString (int indentation_level, std::stringstream *out) const |
Subclasses should not override, use AddToDebugString() to add to the result. More... | |
std::string | GetLeftChildRowString (TupleRow *row) |
void | CreateOutputRow (TupleRow *out_row, TupleRow *probe_row, TupleRow *build_row) |
ExecNode * | child (int i) |
bool | is_closed () |
virtual bool | IsScanNode () const |
void | InitRuntimeProfile (const std::string &name) |
Status | ExecDebugAction (TExecNodePhase::type phase, RuntimeState *state) |
void | AddRuntimeExecOption (const std::string &option) |
Appends option to 'runtime_exec_options_'. More... | |
virtual Status | QueryMaintenance (RuntimeState *state) |
void | AddExprCtxToFree (ExprContext *ctx) |
void | AddExprCtxsToFree (const std::vector< ExprContext * > &ctxs) |
void | AddExprCtxsToFree (const SortExecExprs &sort_exec_exprs) |
Static Protected Member Functions | |
static Status | CreateNode (ObjectPool *pool, const TPlanNode &tnode, const DescriptorTbl &descs, ExecNode **node) |
Create a single exec node derived from thrift node; place exec node in 'pool'. More... | |
static Status | CreateTreeHelper (ObjectPool *pool, const std::vector< TPlanNode > &tnodes, const DescriptorTbl &descs, ExecNode *parent, int *node_idx, ExecNode **root) |
Private Types | |
enum | State { PARTITIONING_BUILD, PROCESSING_PROBE, PROBING_SPILLED_PARTITION, REPARTITIONING } |
typedef Status(* | ProcessBuildBatchFn )(PartitionedHashJoinNode *, RowBatch *) |
llvm function and signature for codegening build batch. More... | |
typedef int(* | ProcessProbeBatchFn )(PartitionedHashJoinNode *, RowBatch *, HashTableCtx *) |
llvm function and signature for codegening probe batch. More... | |
Private Member Functions | |
bool | AppendRow (BufferedTupleStream *stream, TupleRow *row) |
bool | AppendRowStreamFull (BufferedTupleStream *stream, TupleRow *row) |
Status | SpillPartition (Partition **spilled_partition) |
Status | ProcessBuildInput (RuntimeState *state, int level) |
Status | ProcessBuildBatch (RowBatch *build_batch) |
Reads the rows in build_batch and partitions them in hash_partitions_. More... | |
Status | BuildHashTables (RuntimeState *state) |
template<int const JoinOp> | |
int | ProcessProbeBatch (RowBatch *out_batch, HashTableCtx *ht_ctx) |
int | ProcessProbeBatch (const TJoinOp::type join_op, RowBatch *out_batch, HashTableCtx *ht_ctx) |
Wrapper that calls the templated version of ProcessProbeBatch() based on 'join_op'. More... | |
void | OutputUnmatchedBuild (RowBatch *out_batch) |
Status | PrepareNullAwarePartition () |
Initializes null_aware_partition_ and nulls_build_batch_ to output rows. More... | |
Status | OutputNullAwareProbeRows (RuntimeState *state, RowBatch *out_batch) |
Status | EvaluateNullProbe (BufferedTupleStream *build) |
Status | PrepareNullAwareNullProbe () |
Status | OutputNullAwareNullProbe (RuntimeState *state, RowBatch *out_batch) |
Status | CleanUpHashPartitions (RowBatch *batch) |
Status | ReserveTupleStreamBlocks () |
Status | NextProbeRowBatch (RuntimeState *, RowBatch *out_batch) |
Status | NextSpilledProbeRowBatch (RuntimeState *, RowBatch *out_batch) |
Status | PrepareNextPartition (RuntimeState *) |
int64_t | LargestSpilledPartition () const |
void | ResetForProbe () |
Prepares for probing the next batch. More... | |
bool | AllocateProbeFilters (RuntimeState *state) |
bool | AttachProbeFilters (RuntimeState *state) |
Attach the probe filters to runtime state. More... | |
llvm::Function * | CodegenCreateOutputRow (LlvmCodeGen *codegen) |
Codegen function to create output row. Assumes that the probe row is non-NULL. More... | |
bool | CodegenProcessBuildBatch (RuntimeState *state, llvm::Function *hash_fn, llvm::Function *murmur_hash_fn) |
bool | CodegenProcessProbeBatch (RuntimeState *state, llvm::Function *hash_fn, llvm::Function *murmur_hash_fn) |
std::string | PrintState () const |
Returns the current state of the partition as a string. More... | |
void | UpdateState (State s) |
Updates state_ to 's', logging the transition. More... | |
std::string | NodeDebugString () const |
int | MinRequiredBuffers () const |
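The two ProcessProbeBatch() overloads listed above follow a common pattern: a wrapper takes the join operator as a runtime value and forwards to a version templated on it. The following is a minimal sketch of that pattern only. `ToyJoinOp`, `ProcessRows` and the row-count arguments are hypothetical placeholders, not Impala's types or logic.

```cpp
#include <cassert>

// Hypothetical subset of join operators; not Impala's TJoinOp::type.
enum ToyJoinOp { INNER_JOIN = 0, LEFT_OUTER_JOIN = 1 };

// Templated worker: JoinOp is a compile-time constant, so the compiler can
// fold away branches that depend on it.
template <int const JoinOp>
int ProcessRows(int matched, int unmatched) {
  if (JoinOp == LEFT_OUTER_JOIN) return matched + unmatched;  // keep unmatched probe rows
  return matched;
}

// Wrapper: selects the template instantiation from the runtime value.
int ProcessRows(ToyJoinOp join_op, int matched, int unmatched) {
  switch (join_op) {
    case INNER_JOIN: return ProcessRows<INNER_JOIN>(matched, unmatched);
    case LEFT_OUTER_JOIN: return ProcessRows<LEFT_OUTER_JOIN>(matched, unmatched);
  }
  return -1;  // not reached for valid enum values
}
```

The switch is paid once per batch, while the per-row code inside each instantiation carries no join-type branching.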
Static Private Attributes | |
static const int | PARTITION_FANOUT = 16 |
static const int | NUM_PARTITIONING_BITS = 4 |
Must be log2(PARTITION_FANOUT). More... | |
static const int | MAX_PARTITION_DEPTH = 16 |
static const int | MAX_IN_MEM_BUILD_TABLES = PARTITION_FANOUT |
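The relationship between PARTITION_FANOUT and NUM_PARTITIONING_BITS can be checked at compile time. The sketch below also shows one plausible way to map a hash value to a partition. It assumes a 32-bit hash whose top bits select the partition, and the helper `PartitionIndex` is hypothetical, not part of this class.

```cpp
#include <cassert>
#include <cstdint>

// Values as listed in the attribute table above.
static const int PARTITION_FANOUT = 16;
static const int NUM_PARTITIONING_BITS = 4;

// The fanout must be exactly 2^NUM_PARTITIONING_BITS.
static_assert(PARTITION_FANOUT == (1 << NUM_PARTITIONING_BITS),
              "NUM_PARTITIONING_BITS must be log2(PARTITION_FANOUT)");

// Hypothetical helper (an assumption, not Impala code): use the top
// NUM_PARTITIONING_BITS bits of a 32-bit hash as the partition index.
inline int PartitionIndex(uint32_t hash) {
  return static_cast<int>(hash >> (32 - NUM_PARTITIONING_BITS));
}
```

Every index produced this way falls in the range [0, PARTITION_FANOUT).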
Operator to perform partitioned hash join, spilling to disk as necessary. A spilled partition is one that is not fully pinned. The operator runs in these distinct phases:
Definition at line 57 of file partitioned-hash-join-node.h.
|
private |
llvm function and signature for codegening build batch.
Definition at line 418 of file partitioned-hash-join-node.h.
|
private |
llvm function and signature for codegening probe batch.
Definition at line 427 of file partitioned-hash-join-node.h.
|
private |
Implementation details: Logically, the algorithm runs in three modes.
Definition at line 95 of file partitioned-hash-join-node.h.
PartitionedHashJoinNode::PartitionedHashJoinNode | ( | ObjectPool * | pool, |
const TPlanNode & | tnode, | ||
const DescriptorTbl & | descs | ||
) |
Definition at line 43 of file partitioned-hash-join-node.cc.
References impala::BlockingJoinNode::can_add_probe_filters_, and hash_tbls_.
|
protectedinherited |
|
protectedinherited |
Definition at line 410 of file exec-node.cc.
References impala::ExecNode::AddExprCtxsToFree(), impala::SortExecExprs::lhs_ordering_expr_ctxs(), impala::SortExecExprs::rhs_ordering_expr_ctxs(), and impala::SortExecExprs::sort_tuple_slot_expr_ctxs().
|
inlineprotectedinherited |
Add an ExprContext to have its local allocations freed by QueryMaintenance(). Exprs that are evaluated in the main execution thread should be added. Exprs evaluated in a separate thread are generally not safe to add, since a local allocation may be freed while it's being used. Rather than using this mechanism, threads should call FreeLocalAllocations() on local ExprContexts periodically.
Definition at line 276 of file exec-node.h.
References impala::ExecNode::expr_ctxs_to_free_.
Referenced by impala::AnalyticEvalNode::Prepare().
|
protectedinherited |
Appends option to 'runtime_exec_options_'.
Definition at line 188 of file exec-node.cc.
References impala::RuntimeProfile::AddInfoString(), impala::ExecNode::exec_options_lock_, impala::ExecNode::runtime_exec_options_, and impala::ExecNode::runtime_profile().
Referenced by AttachProbeFilters(), impala::HashJoinNode::ConstructBuildSide(), impala::BlockingJoinNode::Open(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), Prepare(), impala::PartitionedAggregationNode::Prepare(), impala::HdfsScanNode::Prepare(), and impala::HdfsScanNode::StopAndFinalizeCounters().
|
protectedvirtual |
Gives subclasses an opportunity to add debug output to the debug string printed by DebugString().
Reimplemented from impala::BlockingJoinNode.
Definition at line 1285 of file partitioned-hash-join-node.cc.
References build_expr_ctxs_, impala::Expr::DebugString(), and probe_expr_ctxs_.
|
private |
For each 'probe_expr_' in 'ht_ctx' that is a slot ref, allocate a bitmap filter on that slot. Returns false if it should not add probe filters.
Definition at line 382 of file partitioned-hash-join-node.cc.
References build_expr_ctxs_, impala::BlockingJoinNode::can_add_probe_filters_, ht_ctx_, probe_expr_ctxs_, probe_filters_, and impala::RuntimeState::slot_filter_bitmap_size().
Referenced by ConstructBuildSide().
|
inlineprivate |
Append the row to stream. In the common case, the row is just in memory. If we run out of memory, this will spill a partition and try to add the row again. Returns true if the row was added and false otherwise. If false is returned, status_ contains the error (a Status is not returned because this path is very performance sensitive).
Definition at line 31 of file partitioned-hash-join-node.inline.h.
References impala::BufferedTupleStream::AddRow(), AppendRowStreamFull(), and LIKELY.
Referenced by ProcessBuildBatch(), and ProcessProbeBatch().
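The fast-path/slow-path split described for AppendRow() can be sketched with toy types. `ToyStream` and `ToyJoin` are hypothetical stand-ins for BufferedTupleStream and the join node, and "spilling" is modelled as simply growing the stream's capacity. This illustrates the control flow only, not the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tuple stream with a fixed in-memory capacity (hypothetical).
struct ToyStream {
  std::size_t capacity;
  std::vector<int> rows;
  explicit ToyStream(std::size_t cap) : capacity(cap) {}
  bool AddRow(int row) {
    if (rows.size() >= capacity) return false;  // out of memory
    rows.push_back(row);
    return true;
  }
};

struct ToyJoin {
  bool status_ok;            // plays the role of status_
  int spills;                // how often the slow path freed memory
  std::size_t spill_budget;  // rows freed per spill; 0 means spilling fails
  explicit ToyJoin(std::size_t budget)
      : status_ok(true), spills(0), spill_budget(budget) {}

  // Slow path: free memory by "spilling", then retry the append once.
  bool AppendRowStreamFull(ToyStream* stream, int row) {
    if (spill_budget == 0) {
      status_ok = false;  // record the error instead of returning a Status
      return false;
    }
    ++spills;
    stream->capacity += spill_budget;
    return stream->AddRow(row);
  }

  // Fast path: a single branch in the common case where the row fits.
  bool AppendRow(ToyStream* stream, int row) {
    if (stream->AddRow(row)) return true;
    return AppendRowStreamFull(stream, row);
  }
};
```

Returning a plain bool keeps the hot path cheap. The caller inspects the stored status only after a false return.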
|
private |
Slow path for AppendRow(), taken when the stream has failed to append the row. We need to find more memory by spilling.
Definition at line 427 of file partitioned-hash-join-node.cc.
References impala::BufferedTupleStream::AddRow(), impala::Status::ok(), ReserveTupleStreamBlocks(), SpillPartition(), impala::BufferedTupleStream::status(), status_, and using_small_buffers_.
Referenced by AppendRow().
|
private |
Attach the probe filters to runtime state.
Definition at line 400 of file partitioned-hash-join-node.cc.
References impala::RuntimeState::AddBitmapFilter(), impala::ExecNode::AddRuntimeExecOption(), impala::BlockingJoinNode::can_add_probe_filters_, and probe_filters_.
Referenced by ConstructBuildSide().
|
private |
Call at the end of partitioning the build rows (which could be from the build child or from repartitioning an existing partition). After this function returns, all partitions in hash_partitions_ are ready to accept probe rows. This function constructs hash tables for as many partitions as fit in memory (which can be none). For the remaining partitions, this function initializes the probe spilling structures.
Definition at line 1065 of file partitioned-hash-join-node.cc.
References impala::PartitionedHashJoinNode::Partition::build_rows(), impala::PartitionedHashJoinNode::Partition::BuildHashTable(), impala::BlockingJoinNode::can_add_probe_filters_, impala::BufferedTupleStream::Close(), impala::PartitionedHashJoinNode::Partition::Close(), impala::BufferedTupleStream::has_write_block(), hash_partitions_, impala::PartitionedHashJoinNode::Partition::hash_tbl(), hash_tbls_, input_partition_, impala::PartitionedHashJoinNode::Partition::is_closed(), impala::BufferedTupleStream::is_pinned(), impala::PartitionedHashJoinNode::Partition::is_spilled(), impala::BufferedTupleStream::num_rows(), impala::Status::OK, PARTITION_FANOUT, impala::PartitionedHashJoinNode::Partition::probe_rows(), RETURN_IF_ERROR, impala::RuntimeState::slot_filter_bitmap_size(), impala::PartitionedHashJoinNode::Partition::Spill(), SpillPartition(), impala::BufferedTupleStream::SwitchToIoBuffers(), and impala::BufferedTupleStream::using_small_buffers().
Referenced by ProcessBuildInput().
|
inlineprotectedinherited |
Definition at line 241 of file exec-node.h.
References impala::ExecNode::children_.
Referenced by impala::CrossJoinNode::BuildListDebugString(), impala::BlockingJoinNode::BuildSideThread(), impala::HashJoinNode::CodegenCreateOutputRow(), CodegenCreateOutputRow(), impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), ConstructBuildSide(), impala::BlockingJoinNode::GetLeftChildRowString(), impala::SelectNode::GetNext(), impala::UnionNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::PartitionedAggregationNode::Partition::InitStreams(), impala::HashJoinNode::LeftJoinGetNext(), NextProbeRowBatch(), impala::SelectNode::Open(), impala::SortNode::Open(), impala::TopNNode::Open(), impala::BlockingJoinNode::Open(), impala::AggregationNode::Open(), impala::AnalyticEvalNode::Open(), impala::PartitionedAggregationNode::Open(), impala::UnionNode::OpenCurrentChild(), impala::SelectNode::Prepare(), impala::SortNode::Prepare(), impala::UnionNode::Prepare(), impala::TopNNode::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), impala::AnalyticEvalNode::Prepare(), Prepare(), impala::PartitionedAggregationNode::Prepare(), ProcessBuildInput(), impala::AnalyticEvalNode::ProcessChildBatches(), and impala::SortNode::SortInput().
Call at the end of consuming the probe rows. Walks hash_partitions_ and
Definition at line 1228 of file partitioned-hash-join-node.cc.
References impala::PartitionedHashJoinNode::Partition::build_rows(), impala::PartitionedHashJoinNode::Partition::Close(), EvaluateNullProbe(), hash_partitions_, impala::PartitionedHashJoinNode::Partition::hash_tbl(), impala::PartitionedHashJoinNode::Partition::hash_tbl_, hash_tbl_iterator_, ht_ctx_, input_partition_, impala::PartitionedHashJoinNode::Partition::is_closed(), impala::PartitionedHashJoinNode::Partition::is_spilled(), impala::BlockingJoinNode::join_op_, NodeDebugString(), null_probe_rows_, impala::BufferedTupleStream::num_rows(), impala::Status::OK, output_build_partitions_, impala::BlockingJoinNode::probe_batch_pos_, impala::PartitionedHashJoinNode::Partition::probe_rows(), RETURN_IF_ERROR, spilled_partitions_, and impala::BufferedTupleStream::UnpinStream().
Referenced by GetNext().
|
virtual |
Subclasses should close any other structures and then call BlockingJoinNode::Close().
Reimplemented from impala::BlockingJoinNode.
Definition at line 192 of file partitioned-hash-join-node.cc.
References impala::RuntimeState::block_mgr(), block_mgr_client_, build_expr_ctxs_, impala::BufferedBlockMgr::ClearReservations(), impala::BlockingJoinNode::Close(), impala::Expr::Close(), impala::BufferedTupleStream::Close(), impala::PartitionedHashJoinNode::Partition::Close(), hash_partitions_, ht_ctx_, input_partition_, impala::ExecNode::is_closed(), null_aware_partition_, null_probe_rows_, nulls_build_batch_, other_join_conjunct_ctxs_, output_build_partitions_, probe_expr_ctxs_, and spilled_partitions_.
|
private |
Codegen function to create output row. Assumes that the probe row is non-NULL.
Definition at line 1382 of file partitioned-hash-join-node.cc.
References impala::LlvmCodeGen::FnPrototype::AddArgument(), impala::BlockingJoinNode::build_tuple_row_size_, impala::ExecNode::child(), impala::LlvmCodeGen::CodegenMemcpy(), impala::LlvmCodeGen::context(), impala::LlvmCodeGen::FinalizeFunction(), impala::LlvmCodeGen::GetIntConstant(), impala::LlvmCodeGen::GetType(), impala::BlockingJoinNode::join_op_, impala::BlockingJoinNode::LLVM_CLASS_NAME, impala::TupleRow::LLVM_CLASS_NAME, impala::LlvmCodeGen::null_ptr_value(), impala::BlockingJoinNode::probe_tuple_row_size_, impala::LlvmCodeGen::ptr_type(), impala::ExecNode::row_desc(), impala::RowDescriptor::tuple_descriptors(), impala::TYPE_INT, and impala::LlvmCodeGen::void_type().
Referenced by CodegenProcessProbeBatch().
|
staticinherited |
Returns a codegen'd version of EvalConjuncts(), or NULL if the function couldn't be codegen'd. The codegen'd version uses inlined, codegen'd GetBooleanVal() functions.
Definition at line 452 of file exec-node.cc.
References impala::LlvmCodeGen::FnPrototype::AddArgument(), impala::LlvmCodeGen::context(), impala::CodegenAnyVal::CreateCallWrapped(), impala::LlvmCodeGen::false_value(), impala::LlvmCodeGen::FinalizeFunction(), impala::RuntimeState::GetCodegen(), impala::Status::GetDetail(), impala::CodegenAnyVal::GetIsNull(), impala::LlvmCodeGen::GetType(), impala::CodegenAnyVal::GetVal(), impala::TupleRow::LLVM_CLASS_NAME, impala::ExprContext::LLVM_CLASS_NAME, impala::Status::ok(), impala::LlvmCodeGen::true_value(), impala::ExecNode::type(), impala::TYPE_BOOLEAN, impala::TYPE_INT, and VLOG_QUERY.
Referenced by impala::HdfsAvroScanner::CodegenDecodeAvroData(), impala::HashJoinNode::CodegenProcessProbeBatch(), and CodegenProcessProbeBatch().
|
private |
Codegen processing build batches. Identical signature to ProcessBuildBatch. Returns false if codegen was not possible.
Definition at line 1454 of file partitioned-hash-join-node.cc.
References impala::LlvmCodeGen::AddFunctionToJit(), impala::RuntimeState::GetCodegen(), impala::LlvmCodeGen::GetFunction(), ht_ctx_, impala::Status::ok(), impala::LlvmCodeGen::OptimizeFunctionWithExprs(), process_build_batch_fn_, process_build_batch_fn_level0_, and impala::LlvmCodeGen::ReplaceCallSites().
Referenced by Prepare().
|
private |
Codegen processing probe batches. Identical signature to ProcessProbeBatch. Returns false if codegen was not possible.
Definition at line 1498 of file partitioned-hash-join-node.cc.
References impala::LlvmCodeGen::AddFunctionToJit(), impala::LlvmCodeGen::CastPtrToLlvmPtr(), impala::LlvmCodeGen::CloneFunction(), CodegenCreateOutputRow(), impala::ExecNode::CodegenEvalConjuncts(), impala::ExecNode::conjunct_ctxs_, impala::LlvmCodeGen::GetArgument(), impala::RuntimeState::GetCodegen(), impala::LlvmCodeGen::GetFunction(), ht_ctx_, impala::BlockingJoinNode::join_op_, impala::Status::ok(), impala::LlvmCodeGen::OptimizeFunctionWithExprs(), other_join_conjunct_ctxs_, impala::LlvmCodeGen::Print(), process_probe_batch_fn_, process_probe_batch_fn_level0_, and impala::LlvmCodeGen::ReplaceCallSites().
Referenced by Prepare().
|
inherited |
Collect all nodes of given 'node_type' that are part of this subtree, and return in 'nodes'.
Definition at line 359 of file exec-node.cc.
References impala::ExecNode::children_, and impala::ExecNode::type_.
Referenced by impala::ExecNode::CollectScanNodes(), and impala::PlanFragmentExecutor::Prepare().
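A recursive subtree walk of the kind described for CollectNodes() might look like the sketch below. `ToyNode` and `NodeType` are hypothetical stand-ins for ExecNode and TPlanNodeType::type, and the pre-order visiting order is an assumption.

```cpp
#include <cassert>
#include <vector>

enum NodeType { SCAN_NODE, JOIN_NODE };  // hypothetical stand-in for TPlanNodeType::type

struct ToyNode {
  NodeType type;
  std::vector<ToyNode*> children;
  explicit ToyNode(NodeType t) : type(t) {}

  // Appends every node of 'node_type' in this subtree to 'nodes',
  // visiting the node itself before its children.
  void CollectNodes(NodeType node_type, std::vector<ToyNode*>* nodes) {
    if (type == node_type) nodes->push_back(this);
    for (ToyNode* c : children) c->CollectNodes(node_type, nodes);
  }
};
```

For a join node with two scan children, collecting `SCAN_NODE` yields both scans in left-to-right order.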
|
inherited |
Collect all scan node types.
Definition at line 366 of file exec-node.cc.
References impala::ExecNode::CollectNodes().
Referenced by impala::PlanFragmentExecutor::Prepare().
|
inlineinherited |
Definition at line 152 of file exec-node.h.
References impala::ExecNode::conjunct_ctxs_.
Referenced by impala::HdfsScanNode::ComputeSlotMaterializationOrder(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HashJoinNode::GetNext(), OutputUnmatchedBuild(), impala::HashJoinNode::ProcessProbeBatch(), and ProcessProbeBatch().
|
protectedvirtual |
We parallelize building the build-side with Open'ing the left child. If, for example, the left child is another join node, it can start to build its own build-side at the same time.
Implements impala::BlockingJoinNode.
Definition at line 485 of file partitioned-hash-join-node.cc.
References AllocateProbeFilters(), AttachProbeFilters(), build_expr_ctxs_, impala::ExecNode::child(), impala::Status::OK, impala::BlockingJoinNode::Open(), impala::Expr::Open(), other_join_conjunct_ctxs_, probe_expr_ctxs_, ProcessBuildInput(), PROCESSING_PROBE, RETURN_IF_ERROR, and UpdateState().
|
staticprotectedinherited |
Create a single exec node derived from thrift node; place exec node in 'pool'.
Definition at line 260 of file exec-node.cc.
References impala::ObjectPool::Add(), impala::Status::OK, and RETURN_IF_ERROR.
Referenced by impala::ExecNode::CreateTreeHelper().
|
protectedinherited |
Write combined row, consisting of the left child's 'probe_row' and right child's 'build_row' to 'out_row'. This is replaced by codegen.
Definition at line 240 of file blocking-join-node.cc.
References impala::BlockingJoinNode::build_tuple_row_size_, and impala::BlockingJoinNode::probe_tuple_row_size_.
Referenced by EvaluateNullProbe(), impala::HashJoinNode::GetNext(), OutputNullAwareProbeRows(), OutputUnmatchedBuild(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and ProcessProbeBatch().
|
staticinherited |
Creates exec node tree from list of nodes contained in plan via depth-first traversal. All nodes are placed in pool. Returns error if 'plan' is corrupted, otherwise success.
Definition at line 199 of file exec-node.cc.
References impala::ExecNode::CreateTreeHelper(), impala::Status::OK, and impala::Status::ok().
Referenced by impala::PlanFragmentExecutor::Prepare().
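Rebuilding a tree from a flattened, depth-first node list, as CreateTree() and CreateTreeHelper() are described as doing, can be sketched as follows. The sketch assumes each flattened entry records its number of children. `FlatNode`, `TreeNode`, `Pool`, `BuildHelper` and `BuildTree` are hypothetical names, and a null return stands in for the error Status.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical flattened plan entry: nodes appear in pre-order and each one
// records how many children follow it.
struct FlatNode {
  int id;
  int num_children;
};

struct TreeNode {
  int id;
  std::vector<TreeNode*> children;
};

// Owns all nodes, playing the role of the object pool.
typedef std::vector<std::unique_ptr<TreeNode>> Pool;

// Consumes flat[*idx] and, recursively, its children. Returns null if the
// list ends before the subtree is complete.
TreeNode* BuildHelper(const std::vector<FlatNode>& flat, std::size_t* idx, Pool* pool) {
  if (*idx >= flat.size()) return nullptr;
  const FlatNode& f = flat[(*idx)++];
  pool->emplace_back(new TreeNode{f.id, {}});
  TreeNode* node = pool->back().get();
  for (int i = 0; i < f.num_children; ++i) {
    TreeNode* child = BuildHelper(flat, idx, pool);
    if (child == nullptr) return nullptr;
    node->children.push_back(child);
  }
  return node;
}

// Returns the root, or null if the flattened plan is malformed
// (too few entries, or entries left over after the root is complete).
TreeNode* BuildTree(const std::vector<FlatNode>& flat, Pool* pool) {
  std::size_t idx = 0;
  TreeNode* root = BuildHelper(flat, &idx, pool);
  if (root == nullptr || idx != flat.size()) return nullptr;
  return root;
}
```

Because the index is shared across recursive calls, each entry is consumed exactly once, and a mismatch between the declared child counts and the list length is detected as corruption.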
|
staticprotectedinherited |
Definition at line 218 of file exec-node.cc.
References impala::RuntimeProfile::AddChild(), impala::ExecNode::children_, impala::ExecNode::CreateNode(), impala::Status::OK, RETURN_IF_ERROR, and impala::ExecNode::runtime_profile().
Referenced by impala::ExecNode::CreateTree().
|
protectedvirtualinherited |
Subclasses should not override, use AddToDebugString() to add to the result.
Reimplemented from impala::ExecNode.
Definition at line 212 of file blocking-join-node.cc.
References impala::BlockingJoinNode::AddToDebugString(), impala::ExecNode::DebugString(), impala::BlockingJoinNode::eos_, impala::BlockingJoinNode::node_name_, and impala::BlockingJoinNode::probe_batch_pos_.
|
inherited |
Returns a string representation in DFS order of the plan rooted at this.
Definition at line 345 of file exec-node.cc.
Referenced by impala::SortNode::DebugString(), impala::TopNNode::DebugString(), impala::ExchangeNode::DebugString(), impala::AggregationNode::DebugString(), impala::AnalyticEvalNode::DebugString(), impala::PartitionedAggregationNode::DebugString(), impala::BlockingJoinNode::DebugString(), and impala::PlanFragmentExecutor::Prepare().
|
staticinherited |
Evaluate ExprContexts over row. Returns true if all exprs return true. TODO: This doesn't use the vector<Expr*> signature because I haven't figured out how to deal with declaring a templated std::vector type in IR.
Definition at line 393 of file exec-node.cc.
References impala::ExprContext::GetBooleanVal(), impala_udf::AnyVal::is_null, and impala_udf::BooleanVal::val.
Referenced by impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HdfsScanner::EvalConjuncts(), EvalOtherJoinConjuncts(), EvalOtherJoinConjuncts2(), EvaluateNullProbe(), impala::HBaseScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), OutputNullAwareProbeRows(), OutputUnmatchedBuild(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and ProcessProbeBatch().
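The "true only if all exprs return true" rule for EvalConjuncts() can be sketched with a toy three-valued result. `ToyBooleanVal` and `ToyConjunct` are hypothetical stand-ins. Treating a NULL result as a failed conjunct is an assumption suggested by the is_null and val members referenced above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical result type mirroring an is_null / val pair.
struct ToyBooleanVal {
  bool is_null;
  bool val;
};

// Evaluates one predicate against a row (here just an int).
typedef std::function<ToyBooleanVal(int)> ToyConjunct;

// Returns true only if every conjunct evaluates to a non-NULL true.
bool EvalConjuncts(const std::vector<ToyConjunct>& conjuncts, int row) {
  for (const ToyConjunct& c : conjuncts) {
    ToyBooleanVal v = c(row);
    if (v.is_null || !v.val) return false;  // assumption: NULL rejects the row
  }
  return true;  // an empty conjunct list accepts every row
}
```

The loop stops at the first failing conjunct, so cheap or selective predicates placed first save work.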
|
private |
Evaluates all other_join_conjuncts against null_probe_rows_ with all the rows in build. This updates matched_null_probe_, short-circuiting if one of the conjuncts passes (i.e. there is a match). This is used for NAAJ (null-aware anti join), when there are NULL probe rows.
Definition at line 1163 of file partitioned-hash-join-node.cc.
References impala::BlockingJoinNode::CreateOutputRow(), impala::ExecNode::EvalConjuncts(), impala::BufferedTupleStream::GetRows(), matched_null_probe_, null_aware_eval_timer_, null_probe_rows_, NullAwareAntiJoinError(), impala::BufferedTupleStream::num_rows(), impala::Status::OK, other_join_conjunct_ctxs_, RETURN_IF_ERROR, SCOPED_TIMER, and impala::BlockingJoinNode::semi_join_staging_row_.
Referenced by CleanUpHashPartitions(), and OutputNullAwareProbeRows().
|
protectedinherited |
Executes debug_action_ if phase matches debug_phase_. 'phase' must not be INVALID.
Definition at line 378 of file exec-node.cc.
References impala::Status::CANCELLED, impala::ExecNode::debug_action_, impala::ExecNode::debug_phase_, impala::RuntimeState::is_cancelled(), and impala::Status::OK.
Referenced by impala::SelectNode::GetNext(), impala::SortNode::GetNext(), impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::AnalyticEvalNode::GetNext(), GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::ExecNode::Open(), and impala::ExecNode::Prepare().
|
inlineinherited |
Definition at line 163 of file exec-node.h.
References impala::ExecNode::expr_mem_tracker_.
Referenced by impala::PartitionedAggregationNode::Partition::InitStreams(), impala::SortNode::Prepare(), impala::UnionNode::Prepare(), impala::TopNNode::Prepare(), impala::ExchangeNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), Prepare(), impala::AnalyticEvalNode::Prepare(), impala::ExecNode::Prepare(), impala::PartitionedAggregationNode::Prepare(), and impala::HdfsScanNode::Prepare().
|
protectedinherited |
Returns a debug string for the left child's 'row'. They have tuple ptrs that are uninitialized; the left child only populates the tuple ptrs it is responsible for. This function outputs just the row values and leaves the build side values as NULL. This is only used for debugging and outputting the left child rows before doing the join.
Definition at line 222 of file blocking-join-node.cc.
References impala::ExecNode::child(), impala::TupleRow::GetTuple(), impala::PrintTuple(), impala::ExecNode::row_desc(), and impala::RowDescriptor::tuple_descriptors().
Referenced by impala::HashJoinNode::GetNext().
|
virtual |
Open() implemented in BlockingJoinNode.
Implements impala::ExecNode.
Definition at line 735 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AtCapacity(), CleanUpHashPartitions(), impala::RowBatch::CommitRows(), COUNTER_SET, impala::BlockingJoinNode::current_probe_row_, impala::ExecNode::ExecDebugAction(), hash_partitions_, ht_ctx_, input_partition_, impala::BlockingJoinNode::join_op_, NextProbeRowBatch(), NextSpilledProbeRowBatch(), null_aware_partition_, null_probe_output_idx_, nulls_build_batch_, impala::ExecNode::num_rows_returned_, impala::Status::OK, impala::Status::ok(), output_build_partitions_, OutputNullAwareNullProbe(), OutputNullAwareProbeRows(), OutputUnmatchedBuild(), PARTITIONING_BUILD, PrepareNextPartition(), PrepareNullAwarePartition(), impala::BlockingJoinNode::probe_batch_pos_, impala::BlockingJoinNode::probe_timer_, process_probe_batch_fn_, process_probe_batch_fn_level0_, ProcessProbeBatch(), impala::ExecNode::QueryMaintenance(), impala::ExecNode::ReachedLimit(), RETURN_IF_CANCELLED, RETURN_IF_ERROR, impala::ExecNode::rows_returned_counter_, impala::ExecNode::runtime_profile_, SCOPED_TIMER, state_, status_, and UNLIKELY.
Referenced by NextProbeRowBatch(), and ProcessBuildInput().
|
staticinherited |
Extract node id from p->name().
Definition at line 62 of file exec-node.cc.
References impala::RuntimeProfile::metadata().
|
inlineinherited |
Definition at line 154 of file exec-node.h.
References impala::ExecNode::id_.
Referenced by impala::AnalyticEvalNode::AddResultTuple(), impala::AnalyticEvalNode::AddRow(), impala::AnalyticEvalNode::AnalyticEvalNode(), impala::AnalyticEvalNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::HdfsScanner::InitializeWriteTuplesFn(), impala::HdfsAvroScanner::InitNewRange(), impala::AnalyticEvalNode::InitNextPartition(), impala::PartitionedAggregationNode::MoveHashPartitions(), NodeDebugString(), impala::AnalyticEvalNode::Open(), impala::HdfsScanNode::Open(), impala::PlanFragmentExecutor::Prepare(), ProcessBuildInput(), impala::AnalyticEvalNode::ProcessChildBatch(), impala::HdfsScanNode::ScannerThread(), impala::AnalyticEvalNode::TryAddRemainingResults(), impala::AnalyticEvalNode::TryAddResultTupleForCurrRow(), impala::AnalyticEvalNode::TryAddResultTupleForPrevRow(), and impala::AnalyticEvalNode::TryRemoveRowsBeforeWindow().
|
virtual |
Subclasses should call BlockingJoinNode::Init() and then perform any other Init() work, e.g. creating expr trees.
Reimplemented from impala::BlockingJoinNode.
Definition at line 66 of file partitioned-hash-join-node.cc.
References build_expr_ctxs_, impala::ExecNode::conjunct_ctxs_, impala::Expr::CreateExprTree(), impala::Expr::CreateExprTrees(), impala::BlockingJoinNode::Init(), impala::BlockingJoinNode::join_op_, impala::Status::OK, other_join_conjunct_ctxs_, impala::ExecNode::pool_, probe_expr_ctxs_, and RETURN_IF_ERROR.
Referenced by ProcessBuildInput().
Init the build-side state for a new left child row (e.g. hash table iterator or list iterator) given the first row. Used in Open() to prepare for GetNext(). A NULL ptr for first_left_child_row indicates the left child eos.
Implements impala::BlockingJoinNode.
Definition at line 592 of file partitioned-hash-join-node.cc.
References impala::Status::OK, and ResetForProbe().
|
protectedinherited |
Definition at line 371 of file exec-node.cc.
References impala::ExecNode::id_, impala::ExecNode::pool_, and impala::ExecNode::runtime_profile_.
Referenced by impala::ExecNode::ExecNode().
|
inlineprotectedinherited |
Definition at line 242 of file exec-node.h.
References impala::ExecNode::is_closed_.
Referenced by impala::SelectNode::Close(), impala::SortNode::Close(), impala::UnionNode::Close(), impala::TopNNode::Close(), impala::ExchangeNode::Close(), impala::HBaseScanNode::Close(), impala::CrossJoinNode::Close(), impala::HashJoinNode::Close(), impala::AggregationNode::Close(), impala::BlockingJoinNode::Close(), impala::AnalyticEvalNode::Close(), Close(), impala::PartitionedAggregationNode::Close(), impala::HdfsScanNode::Close(), impala::PartitionedAggregationNode::Partition::Close(), impala::PartitionedHashJoinNode::Partition::Close(), ReserveTupleStreamBlocks(), SpillPartition(), impala::PartitionedAggregationNode::SpillPartition(), and impala::PartitionedHashJoinNode::Partition::~Partition().
|
inlineprotectedvirtualinherited |
Reimplemented in impala::ScanNode.
Definition at line 251 of file exec-node.h.
|
private |
Iterates over all the partitions in hash_partitions_ and returns the number of rows of the largest spilled partition (in terms of number of build rows).
Definition at line 723 of file partitioned-hash-join-node.cc.
References impala::PartitionedHashJoinNode::Partition::build_rows(), hash_partitions_, impala::PartitionedHashJoinNode::Partition::is_spilled(), and impala::BufferedTupleStream::num_rows().
Referenced by PrepareNextPartition().
|
inlineinherited |
Definition at line 158 of file exec-node.h.
References impala::ExecNode::limit_.
Referenced by impala::CrossJoinNode::GetNext(), and impala::HashJoinNode::LeftJoinGetNext().
|
inlineinherited |
Definition at line 162 of file exec-node.h.
References impala::ExecNode::mem_tracker_.
Referenced by impala::ExecNode::Close(), impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), impala::HdfsScanNode::EnoughMemoryForScannerThread(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::SortNode::Open(), impala::TopNNode::Open(), impala::AggregationNode::Open(), impala::AnalyticEvalNode::Open(), impala::PartitionedAggregationNode::Open(), impala::HdfsScanNode::Open(), impala::UnionNode::OpenCurrentChild(), impala::SelectNode::Prepare(), impala::HBaseScanNode::Prepare(), impala::TopNNode::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), Prepare(), impala::AnalyticEvalNode::Prepare(), impala::PartitionedAggregationNode::Prepare(), impala::HdfsScanNode::Prepare(), PrepareNextPartition(), ProcessBuildInput(), impala::PartitionedAggregationNode::ProcessStream(), impala::HdfsRCFileScanner::ReadRowGroup(), impala::HdfsAvroScanner::ResolveSchemas(), impala::SortNode::SortInput(), and impala::HdfsScanner::StartNewRowBatch().
|
inline private |
We need two output buffers per partition (one for build and one for probe) and two additional buffers for the input (while repartitioning; for the build and probe sides). For NAAJ, we need 3 additional buffers to maintain the null_aware_partition_.
Definition at line 282 of file partitioned-hash-join-node.h.
References impala::BlockingJoinNode::join_op_, and PARTITION_FANOUT.
Referenced by Prepare().
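The buffer accounting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code at line 282: the function name MinRequiredBuffersSketch, the 'fanout' parameter (standing in for PARTITION_FANOUT) and the 'null_aware' flag (standing in for a check on join_op_) are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for MinRequiredBuffers(). 'fanout' plays the role of
// PARTITION_FANOUT and 'null_aware' the role of a NAAJ check on join_op_.
int MinRequiredBuffersSketch(int fanout, bool null_aware) {
  assert(fanout > 0);
  // Two output buffers per partition: one for build rows, one for probe rows.
  int num_buffers = fanout * 2;
  // Two more for the build and probe inputs while repartitioning.
  num_buffers += 2;
  // NAAJ keeps three extra buffers for null_aware_partition_.
  if (null_aware) num_buffers += 3;
  return num_buffers;
}
```

With an assumed fanout of 16, this gives 34 buffers for a regular join and 37 for NAAJ.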
|
private |
Get the next row batch from the probe (left) side (child(0)). If we are done consuming the input, sets probe_batch_pos_ to -1; otherwise, sets it to 0.
Definition at line 598 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AtCapacity(), impala::ExecNode::child(), COUNTER_ADD, impala::BlockingJoinNode::current_probe_row_, GetNext(), impala::Status::OK, impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::BlockingJoinNode::probe_row_counter_, impala::BlockingJoinNode::probe_side_eos_, ResetForProbe(), and RETURN_IF_ERROR.
Referenced by GetNext().
|
private |
Get the next probe row batch from input_partition_. If we are done consuming the input, sets probe_batch_pos_ to -1; otherwise, sets it to 0.
Definition at line 622 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AtCapacity(), impala::PartitionedHashJoinNode::Partition::Close(), impala::BlockingJoinNode::current_probe_row_, impala::BufferedTupleStream::GetNext(), impala::PartitionedHashJoinNode::Partition::hash_tbl_, hash_tbl_iterator_, ht_ctx_, input_partition_, impala::BlockingJoinNode::join_op_, LIKELY, impala::BufferedTupleStream::num_rows(), impala::Status::OK, output_build_partitions_, impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::PartitionedHashJoinNode::Partition::probe_rows(), ResetForProbe(), RETURN_IF_ERROR, and impala::BufferedTupleStream::rows_returned().
Referenced by GetNext().
|
private |
Definition at line 1310 of file partitioned-hash-join-node.cc.
References impala::BufferedTupleStream::blocks_pinned(), impala::PartitionedHashJoinNode::Partition::build_rows(), hash_partitions_, impala::PartitionedHashJoinNode::Partition::hash_tbl(), impala::ExecNode::id(), impala::PartitionedHashJoinNode::Partition::is_closed(), impala::PartitionedHashJoinNode::Partition::is_spilled(), impala::BlockingJoinNode::join_op_, impala::BufferedTupleStream::num_rows(), PrintState(), impala::PartitionedHashJoinNode::Partition::probe_rows(), impala::HashTable::size(), and spilled_partitions_.
Referenced by CleanUpHashPartitions(), PrepareNextPartition(), ProcessBuildInput(), SpillPartition(), and UpdateState().
|
virtual inherited |
Open prepares the build side structures (subclasses should implement ConstructBuildSide()) and then prepares for GetNext with the first left child row (subclasses should implement InitGetNext()).
Reimplemented from impala::ExecNode.
Definition at line 156 of file blocking-join-node.cc.
References impala::ExecNode::AddRuntimeExecOption(), impala::CgroupsMgr::AssignThreadToCgroup(), impala::BlockingJoinNode::BuildSideThread(), impala::RuntimeState::cgroup(), impala::ExecEnv::cgroups_mgr(), impala::ExecNode::child(), impala::BlockingJoinNode::ConstructBuildSide(), COUNTER_ADD, impala::BlockingJoinNode::current_probe_row_, impala::BlockingJoinNode::eos_, impala::RuntimeState::exec_env(), impala::Promise< T >::Get(), impala::ExecNode::GetNext(), impala::BlockingJoinNode::InitGetNext(), impala::RuntimeState::LogError(), impala::Status::msg(), impala::BlockingJoinNode::node_name_, impala::Status::OK, impala::Status::ok(), impala::ExecNode::Open(), impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::BlockingJoinNode::probe_row_counter_, impala::BlockingJoinNode::probe_side_eos_, impala::ExecNode::QueryMaintenance(), impala::RuntimeState::resource_pool(), RETURN_IF_CANCELLED, RETURN_IF_ERROR, impala::ExecNode::runtime_profile_, SCOPED_TIMER, and impala::ThreadResourceMgr::ResourcePool::TryAcquireThreadToken().
Referenced by impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), and ConstructBuildSide().
|
private |
Outputs NULLs on the probe side, returning rows where matched_null_probe_[i] is false. Used for NAAJ.
Definition at line 927 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AddRow(), impala::RowBatch::AtCapacity(), impala::PartitionedHashJoinNode::Partition::Close(), impala::RowBatch::CommitLastRow(), impala::RowBatch::CopyRow(), impala::BufferedTupleStream::GetNext(), impala::RowBatch::GetRow(), matched_null_probe_, null_aware_partition_, null_probe_output_idx_, null_probe_rows_, nulls_build_batch_, impala::Status::OK, impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, and RETURN_IF_ERROR.
Referenced by GetNext().
|
private |
Continues processing from null_aware_partition_. Called after we have finished processing all build and probe input (including repartitioning them).
Definition at line 998 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AddRow(), impala::RowBatch::AtCapacity(), impala::PartitionedHashJoinNode::Partition::build_rows(), impala::RowBatch::CommitLastRow(), impala::RowBatch::CopyRow(), impala::BlockingJoinNode::CreateOutputRow(), impala::ExecNode::EvalConjuncts(), EvaluateNullProbe(), impala::RowBatch::GetRow(), null_aware_partition_, nulls_build_batch_, impala::Status::OK, other_join_conjunct_ctxs_, PrepareNullAwareNullProbe(), impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::PartitionedHashJoinNode::Partition::probe_rows(), RETURN_IF_ERROR, and impala::BlockingJoinNode::semi_join_staging_row_.
Referenced by GetNext().
|
private |
Sweeps the hash_tbl_ of the partition at the front of flush_build_partitions_, using hash_tbl_iterator_, and outputs any unmatched build rows. If it reaches the end of the hash table, it closes that partition, removes it from flush_build_partitions_, and moves hash_tbl_iterator_ to the beginning of the partition now at the front of flush_build_partitions_.
Definition at line 865 of file partitioned-hash-join-node.cc.
References impala::RowBatch::AddRow(), impala::RowBatch::AtCapacity(), impala::HashTable::Iterator::AtEnd(), impala::RowBatch::capacity(), impala::RowBatch::CommitRows(), impala::ExecNode::conjunct_ctxs(), impala::ExecNode::conjunct_ctxs_, impala::RowBatch::CopyRow(), COUNTER_SET, impala::BlockingJoinNode::CreateOutputRow(), impala::ExecNode::EvalConjuncts(), impala::RowBatch::GetRow(), impala::HashTable::Iterator::GetRow(), hash_tbl_iterator_, ht_ctx_, impala::HashTable::Iterator::IsMatched(), impala::BlockingJoinNode::join_op_, impala::TupleRow::next_row(), impala::HashTable::Iterator::NextUnmatched(), impala::RowBatch::num_rows(), impala::ExecNode::num_rows_returned_, output_build_partitions_, impala::BlockingJoinNode::probe_timer_, impala::ExecNode::rows_returned_counter_, SCOPED_TIMER, and impala::HashTable::Iterator::SetMatched().
Referenced by GetNext().
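The resumable sweep described for this member can be illustrated with a simplified model. Everything here is hypothetical: plain vectors replace hash tables, an index 'pos' replaces hash_tbl_iterator_, and 'capacity' replaces the output batch's AtCapacity() check.

```cpp
#include <cstddef>
#include <list>
#include <vector>

struct BuildRowSketch {
  int id;
  bool matched;
};

struct BuildPartitionSketch {
  std::vector<BuildRowSketch> rows;
};

struct SweepStateSketch {
  std::list<BuildPartitionSketch> partitions;  // front() is being swept
  std::size_t pos = 0;                         // stand-in for hash_tbl_iterator_
};

// Appends unmatched row ids to 'out' until 'out' holds 'capacity' entries or
// every partition has been swept. A later call resumes where this one stopped.
void OutputUnmatchedBuildSketch(SweepStateSketch* s, std::size_t capacity,
                                std::vector<int>* out) {
  while (!s->partitions.empty() && out->size() < capacity) {
    BuildPartitionSketch& p = s->partitions.front();
    while (s->pos < p.rows.size() && out->size() < capacity) {
      const BuildRowSketch& r = p.rows[s->pos++];
      if (!r.matched) out->push_back(r.id);
    }
    if (s->pos == p.rows.size()) {
      // End of this partition: drop it and restart at the next front.
      s->partitions.pop_front();
      s->pos = 0;
    }
  }
}
```

Keeping the position in the state object rather than in a local variable is what lets the sweep stop when the output batch is full and continue on the next call.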
|
virtual |
Subclasses should call BlockingJoinNode::Prepare() and then perform any other Prepare() work, e.g. codegen.
Reimplemented from impala::BlockingJoinNode.
Definition at line 94 of file partitioned-hash-join-node.cc.
References impala::ObjectPool::Add(), ADD_COUNTER, ADD_TIMER, impala::ExecNode::AddExprCtxsToFree(), impala::RuntimeProfile::AddHighWaterMarkCounter(), impala::ExecNode::AddRuntimeExecOption(), impala::RuntimeState::block_mgr(), block_mgr_client_, build_expr_ctxs_, impala::PartitionedHashJoinNode::Partition::build_rows(), impala::ExecNode::child(), impala::RuntimeState::codegen_enabled(), CodegenProcessBuildBatch(), CodegenProcessProbeBatch(), impala::ExecNode::expr_mem_tracker(), impala::RuntimeState::fragment_hash_seed(), impala::RuntimeState::GetCodegen(), ht_ctx_, impala::BufferedTupleStream::Init(), impala::BlockingJoinNode::join_op_, largest_partition_percent_, MAX_PARTITION_DEPTH, max_partition_level_, impala::ExecNode::mem_tracker(), MinRequiredBuffers(), null_aware_eval_timer_, null_aware_partition_, null_probe_rows_, num_build_rows_partitioned_, num_hash_buckets_, num_probe_rows_partitioned_, num_repartitions_, num_spilled_partitions_, impala::RuntimeState::obj_pool(), impala::Status::OK, other_join_conjunct_ctxs_, partition_build_timer_, partitions_created_, impala::ExecNode::pool_, impala::BlockingJoinNode::Prepare(), impala::Expr::Prepare(), probe_expr_ctxs_, impala::PartitionedHashJoinNode::Partition::probe_rows(), impala::BufferedBlockMgr::RegisterClient(), RETURN_IF_ERROR, impala::ExecNode::row_desc(), impala::ExecNode::runtime_profile(), impala::ExecNode::runtime_profile_, runtime_state_, SCOPED_TIMER, and impala::RowDescriptor::tuple_descriptors().
|
private |
Moves onto the next spilled partition and initializes input_partition_. This function processes the entire build side of input_partition_ and when this function returns, we are ready to consume the probe side of input_partition_. If the build side's hash table fits in memory, we will construct input_partition_'s hash table. If it does not, meaning we need to repartition, this function will initialize hash_partitions_.
Definition at line 661 of file partitioned-hash-join-node.cc.
References impala::Status::AddDetail(), impala::PartitionedHashJoinNode::Partition::build_rows(), impala::PartitionedHashJoinNode::Partition::BuildHashTable(), COUNTER_ADD, impala::PartitionedHashJoinNode::Partition::EstimatedInMemSize(), hash_partitions_, impala::PartitionedHashJoinNode::Partition::hash_tbl(), hash_tbls_, ht_ctx_, impala::ExecNode::id_, input_partition_, impala::PartitionedHashJoinNode::Partition::is_spilled(), LargestSpilledPartition(), impala::PartitionedHashJoinNode::Partition::level_, impala::Status::MEM_LIMIT_EXCEEDED, impala::ExecNode::mem_tracker(), NodeDebugString(), num_probe_rows_partitioned_, num_repartitions_, impala::BufferedTupleStream::num_rows(), impala::Status::OK, PARTITION_FANOUT, impala::BufferedTupleStream::PrepareForRead(), impala::PartitionedHashJoinNode::Partition::probe_rows(), PROBING_SPILLED_PARTITION, ProcessBuildInput(), REPARTITIONING, RETURN_IF_ERROR, impala::RuntimeState::SetMemLimitExceeded(), impala::MemTracker::SpareCapacity(), impala::PartitionedHashJoinNode::Partition::Spill(), spilled_partitions_, and UpdateState().
Referenced by GetNext().
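The choice between building input_partition_'s hash table and repartitioning can be sketched as a single comparison. This is an assumption-laden simplification: the types and the function ChooseNextStep are hypothetical, and the real function may attempt the build and fall back on failure rather than decide up front.

```cpp
#include <cstdint>

// Hypothetical mirror of the two outcomes named in the description.
enum class NextStep { PROBE_SPILLED_PARTITION, REPARTITION };

struct SpilledPartitionInfo {
  std::int64_t estimated_in_mem_bytes;  // cf. Partition::EstimatedInMemSize()
};

// Assumed rule: if the build side is estimated to fit in the spare memory
// (cf. MemTracker::SpareCapacity()), build its hash table and probe it;
// otherwise split it again into hash_partitions_.
NextStep ChooseNextStep(const SpilledPartitionInfo& p,
                        std::int64_t spare_capacity_bytes) {
  if (p.estimated_in_mem_bytes <= spare_capacity_bytes) {
    return NextStep::PROBE_SPILLED_PARTITION;
  }
  return NextStep::REPARTITION;
}
```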
|
private |
Prepares to output NULLs on the probe side for NAAJ. Before calling this, matched_null_probe_ should have been fully evaluated.
Definition at line 918 of file partitioned-hash-join-node.cc.
References null_probe_output_idx_, null_probe_rows_, impala::Status::OK, impala::BufferedTupleStream::PrepareForRead(), impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, and RETURN_IF_ERROR.
Referenced by OutputNullAwareProbeRows(), and PrepareNullAwarePartition().
|
private |
Initializes null_aware_partition_ and nulls_build_batch_ to output rows.
Definition at line 969 of file partitioned-hash-join-node.cc.
References impala::PartitionedHashJoinNode::Partition::build_rows(), null_aware_partition_, NullAwareAntiJoinError(), nulls_build_batch_, impala::BufferedTupleStream::num_rows(), impala::Status::OK, impala::BufferedTupleStream::PrepareForRead(), PrepareNullAwareNullProbe(), impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::PartitionedHashJoinNode::Partition::probe_rows(), and RETURN_IF_ERROR.
Referenced by GetNext().
|
private |
Returns the current state of the partition as a string.
Definition at line 1299 of file partitioned-hash-join-node.cc.
References PARTITIONING_BUILD, PROBING_SPILLED_PARTITION, PROCESSING_PROBE, REPARTITIONING, and state_.
Referenced by NodeDebugString().
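A state-to-string helper of this kind is usually a switch over the enum. The sketch below uses the four state names listed in the references; the enum type JoinState, the function name and the exact strings returned are assumptions.

```cpp
#include <string>

// Hypothetical enum carrying the four states referenced by this member.
enum class JoinState {
  PARTITIONING_BUILD,
  PROCESSING_PROBE,
  PROBING_SPILLED_PARTITION,
  REPARTITIONING
};

// Returns a printable name for 's'. The returned strings are illustrative.
std::string PrintStateSketch(JoinState s) {
  switch (s) {
    case JoinState::PARTITIONING_BUILD: return "PARTITIONING_BUILD";
    case JoinState::PROCESSING_PROBE: return "PROCESSING_PROBE";
    case JoinState::PROBING_SPILLED_PARTITION: return "PROBING_SPILLED_PARTITION";
    case JoinState::REPARTITIONING: return "REPARTITIONING";
  }
  return "UNKNOWN";
}
```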
Reads the rows in build_batch and partitions them in hash_partitions_.
Definition at line 254 of file partitioned-hash-join-node-ir.cc.
References impala::BufferedTupleStream::AddRow(), AppendRow(), impala::PartitionedHashJoinNode::Partition::build_rows(), impala::RowBatch::GetRow(), impala::hash, hash_partitions_, ht_ctx_, null_aware_partition_, NUM_PARTITIONING_BITS, impala::RowBatch::num_rows(), impala::Status::OK, impala::BufferedTupleStream::status(), status_, and UNLIKELY.
Referenced by ProcessBuildInput().
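Partitioning rows by hash is commonly done by taking the top bits of each row's hash as the partition index. The sketch below shows that technique only; the constant value 4, the 32-bit hash width and the function names are assumptions, and the real code appends rows to each partition's build stream rather than to a vector.

```cpp
#include <cstdint>
#include <vector>

// Assumed value; stands in for NUM_PARTITIONING_BITS.
constexpr int kNumPartitioningBits = 4;
constexpr int kFanout = 1 << kNumPartitioningBits;

// Uses the top kNumPartitioningBits bits of a 32-bit hash as the index.
inline int PartitionIndex(std::uint32_t hash) {
  return static_cast<int>(hash >> (32 - kNumPartitioningBits));
}

// Distributes row hashes into kFanout buckets (a toy hash_partitions_).
std::vector<std::vector<std::uint32_t>> PartitionByHash(
    const std::vector<std::uint32_t>& row_hashes) {
  std::vector<std::vector<std::uint32_t>> partitions(kFanout);
  for (std::uint32_t h : row_hashes) {
    partitions[PartitionIndex(h)].push_back(h);
  }
  return partitions;
}
```

Taking the high bits for partitioning leaves the low bits free for bucket selection inside each partition's hash table, which is one reason this split is a common design choice.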
|
private |
Partitions the entire build input (either from child(1) or input_partition_) into hash_partitions_. When this call returns, hash_partitions_ is ready to consume the probe input. 'level' is the level new partitions (in hash_partitions_) should be created with.
Definition at line 500 of file partitioned-hash-join-node.cc.
References impala::ObjectPool::Add(), impala::RuntimeState::batch_size(), impala::BufferedTupleStream::blocks_pinned(), impala::BlockingJoinNode::build_row_counter_, impala::PartitionedHashJoinNode::Partition::build_rows(), impala::PartitionedHashJoinNode::Partition::build_rows_, BuildHashTables(), impala::ExecNode::child(), impala::BufferedTupleStream::Close(), COUNTER_ADD, COUNTER_SET, GetNext(), impala::BufferedTupleStream::GetNext(), hash_partitions_, ht_ctx_, impala::ExecNode::id(), impala::ExecNode::id_, Init(), input_partition_, impala::PartitionedHashJoinNode::Partition::is_spilled(), largest_partition_percent_, MAX_PARTITION_DEPTH, max_partition_level_, impala::Status::MEM_LIMIT_EXCEEDED, impala::ExecNode::mem_tracker(), NodeDebugString(), non_empty_build_, num_build_rows_partitioned_, impala::BufferedTupleStream::num_rows(), impala::Status::OK, partition_build_timer_, PARTITION_FANOUT, partitions_created_, impala::ExecNode::pool_, impala::BufferedTupleStream::PrepareForRead(), process_build_batch_fn_, process_build_batch_fn_level0_, ProcessBuildBatch(), impala::ExecNode::QueryMaintenance(), RETURN_IF_CANCELLED, RETURN_IF_ERROR, impala::ExecNode::row_desc(), impala::ExecNode::runtime_profile(), SCOPED_TIMER, impala::Status::SetErrorMsg(), impala::RuntimeState::SetMemLimitExceeded(), and using_small_buffers_.
Referenced by ConstructBuildSide(), and PrepareNextPartition().
|
private |
Processes probe rows from probe_batch_. Returns when either out_batch is full or probe_batch_ is entirely consumed. For RIGHT_ANTI_JOIN, all this function does is mark whether each build row had a match. Returns the number of rows added to out_batch, or -1 on error (in which case status_ will be set).
Definition at line 40 of file partitioned-hash-join-node-ir.cc.
References impala::RowBatch::AddRow(), impala::BufferedTupleStream::AddRow(), AppendRow(), impala::RowBatch::AtCapacity(), impala::HashTable::Iterator::AtEnd(), impala::PartitionedHashJoinNode::Partition::build_rows(), impala::RowBatch::capacity(), impala::ExecNode::conjunct_ctxs(), impala::ExecNode::conjunct_ctxs_, impala::RowBatch::CopyRow(), impala::BlockingJoinNode::CreateOutputRow(), impala::BlockingJoinNode::current_probe_row_, impala::HashTableCtx::EvalAndHashProbe(), impala::ExecNode::EvalConjuncts(), EvalOtherJoinConjuncts(), impala::HashTable::Find(), impala::RowBatch::GetRow(), impala::HashTable::Iterator::GetRow(), impala::hash, hash_partitions_, hash_tbl_iterator_, hash_tbls_, impala::PartitionedHashJoinNode::Partition::is_closed(), impala::PartitionedHashJoinNode::Partition::is_spilled(), impala::HashTable::Iterator::IsMatched(), LIKELY, matched_null_probe_, impala::BlockingJoinNode::matched_probe_, impala::HashTable::Iterator::NextDuplicate(), non_empty_build_, null_aware_partition_, null_probe_rows_, NUM_PARTITIONING_BITS, impala::RowBatch::num_rows(), impala::BufferedTupleStream::num_rows(), other_join_conjunct_ctxs_, impala::BlockingJoinNode::probe_batch_, impala::BlockingJoinNode::probe_batch_pos_, impala::PartitionedHashJoinNode::Partition::probe_rows(), PROCESSING_PROBE, impala::BlockingJoinNode::semi_join_staging_row_, impala::HashTable::Iterator::SetAtEnd(), impala::HashTable::Iterator::SetMatched(), state_, impala::BufferedTupleStream::status(), status_, and UNLIKELY.
Referenced by GetNext().
|
private |
Wrapper that calls the templated version of ProcessProbeBatch() based on 'join_op'.
Definition at line 227 of file partitioned-hash-join-node-ir.cc.
|
protected virtual inherited |
Frees any local allocations made by expr_ctxs_to_free_ and returns the result of state->CheckQueryState(). Nodes should call this periodically, e.g. once per input row batch. This should not be called outside the main execution thread. Nodes may override this to add extra periodic cleanup, e.g. freeing other local allocations. ExecNodes overriding this function should return ExecNode::QueryMaintenance().
Reimplemented in impala::PartitionedAggregationNode, and impala::AnalyticEvalNode.
Definition at line 401 of file exec-node.cc.
References impala::RuntimeState::CheckQueryState(), impala::ExecNode::expr_ctxs_to_free_, and impala::ExprContext::FreeLocalAllocations().
Referenced by impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), impala::SelectNode::GetNext(), impala::SortNode::GetNext(), impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::HBaseScanNode::Open(), impala::SortNode::Open(), impala::TopNNode::Open(), impala::BlockingJoinNode::Open(), impala::AggregationNode::Open(), ProcessBuildInput(), impala::AnalyticEvalNode::QueryMaintenance(), impala::PartitionedAggregationNode::QueryMaintenance(), and impala::SortNode::SortInput().
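The override contract stated above (do the extra cleanup, then return the base class result) can be shown with stand-in classes. StatusSketch, ExecNodeSketch and MyNodeSketch are hypothetical and only model the call pattern, not the real ExecNode or Status types.

```cpp
// Hypothetical minimal status type.
struct StatusSketch {
  bool ok;
};

// Stand-in for ExecNode: counts how often the base maintenance ran.
class ExecNodeSketch {
 public:
  virtual ~ExecNodeSketch() = default;
  virtual StatusSketch QueryMaintenance() {
    ++base_calls_;  // the real base frees local allocations and checks query state
    return StatusSketch{true};
  }
  int base_calls() const { return base_calls_; }

 private:
  int base_calls_ = 0;
};

// An overriding node: performs its own periodic cleanup first, then
// returns the result of the base implementation, as the contract requires.
class MyNodeSketch : public ExecNodeSketch {
 public:
  StatusSketch QueryMaintenance() override {
    ++extra_cleanups_;
    return ExecNodeSketch::QueryMaintenance();
  }
  int extra_cleanups() const { return extra_cleanups_; }

 private:
  int extra_cleanups_ = 0;
};
```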
|
inline inherited |
Definition at line 159 of file exec-node.h.
References impala::ExecNode::limit_, and impala::ExecNode::num_rows_returned_.
Referenced by impala::HdfsParquetScanner::AssembleRows(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HdfsTextScanner::FinishScanRange(), impala::SelectNode::GetNext(), impala::UnionNode::GetNext(), impala::SortNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), GetNext(), impala::AnalyticEvalNode::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::ExchangeNode::GetNextMerging(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::HashJoinNode::LeftJoinGetNext(), impala::HdfsSequenceScanner::ProcessBlockCompressedScanRange(), impala::HdfsTextScanner::ProcessRange(), impala::HdfsAvroScanner::ProcessRange(), impala::HdfsSequenceScanner::ProcessRange(), impala::HdfsRCFileScanner::ProcessRange(), and impala::PlanFragmentExecutor::ReachedLimit().
|
private |
For each partition in hash_partitions_, reserves an IO-sized block on both the build and probe streams.
Definition at line 1200 of file partitioned-hash-join-node.cc.
References impala::Status::AddDetail(), hash_partitions_, impala::ExecNode::is_closed(), impala::Status::MEM_LIMIT_EXCEEDED, impala::Status::OK, RETURN_IF_ERROR, and using_small_buffers_.
Referenced by AppendRowStreamFull().
|
virtual |
Subclasses should reset any state modified in Open() and GetNext() and then call BlockingJoinNode::Reset().
Reimplemented from impala::BlockingJoinNode.
Definition at line 187 of file partitioned-hash-join-node.cc.
|
inline private |
Prepares for probing the next batch.
Definition at line 24 of file partitioned-hash-join-node.inline.h.
References impala::BlockingJoinNode::current_probe_row_, hash_tbl_iterator_, impala::BlockingJoinNode::matched_probe_, impala::BlockingJoinNode::probe_batch_pos_, and impala::HashTable::Iterator::SetAtEnd().
Referenced by InitGetNext(), NextProbeRowBatch(), and NextSpilledProbeRowBatch().
|
inline inherited |
Definition at line 156 of file exec-node.h.
References impala::ExecNode::row_descriptor_.
Referenced by impala::CrossJoinNode::BuildListDebugString(), impala::HashJoinNode::CodegenCreateOutputRow(), CodegenCreateOutputRow(), impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), impala::BlockingJoinNode::GetLeftChildRowString(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::PartitionedAggregationNode::Partition::InitStreams(), impala::TopNNode::Open(), impala::AggregationNode::Open(), impala::AnalyticEvalNode::Open(), impala::PartitionedAggregationNode::Open(), impala::UnionNode::OpenCurrentChild(), impala::SelectNode::Prepare(), impala::SortNode::Prepare(), impala::UnionNode::Prepare(), impala::TopNNode::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), impala::AnalyticEvalNode::Prepare(), Prepare(), impala::ExecNode::Prepare(), impala::PlanFragmentExecutor::Prepare(), impala::PartitionedAggregationNode::Prepare(), impala::HdfsScanNode::Prepare(), ProcessBuildInput(), impala::PartitionedAggregationNode::ProcessStream(), impala::PlanFragmentExecutor::row_desc(), impala::SortNode::SortInput(), and impala::HdfsScanner::StartNewRowBatch().
|
inline inherited |
Definition at line 157 of file exec-node.h.
References impala::ExecNode::num_rows_returned_.
Referenced by impala::CrossJoinNode::GetNext(), impala::AnalyticEvalNode::GetNext(), impala::HashJoinNode::LeftJoinGetNext(), impala::PartitionedAggregationNode::Open(), impala::HdfsSequenceScanner::ProcessDecompressedBlock(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and impala::HdfsTextScanner::WriteFields().
|
inline inherited |
Definition at line 161 of file exec-node.h.
References impala::ExecNode::runtime_profile_.
Referenced by impala::ExecNode::AddRuntimeExecOption(), impala::BlockingJoinNode::BuildSideThread(), impala::ExecNode::CreateTreeHelper(), impala::PartitionedAggregationNode::Partition::InitStreams(), impala::SortNode::Open(), impala::AnalyticEvalNode::Open(), impala::PartitionedAggregationNode::Open(), impala::HdfsScanNode::Open(), impala::HdfsTextScanner::Prepare(), impala::HBaseScanNode::Prepare(), impala::BaseSequenceScanner::Prepare(), impala::ExchangeNode::Prepare(), impala::HdfsParquetScanner::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), impala::AnalyticEvalNode::Prepare(), Prepare(), impala::ExecNode::Prepare(), impala::ScanNode::Prepare(), impala::PlanFragmentExecutor::Prepare(), impala::PartitionedAggregationNode::Prepare(), impala::HdfsScanner::Prepare(), impala::HdfsScanNode::Prepare(), and ProcessBuildInput().
|
static inherited |
Set debug action for node with given id in 'tree'.
Definition at line 332 of file exec-node.cc.
References impala::ExecNode::children_, impala::ExecNode::debug_action_, impala::ExecNode::debug_phase_, and impala::ExecNode::id_.
Referenced by impala::PlanFragmentExecutor::Prepare().
Called when we need to free up memory by spilling a partition. This function walks hash_partitions_ and picks one to spill. *spilled_partition is the partition that was spilled. Returns non-ok status if we couldn't spill a partition.
Definition at line 450 of file partitioned-hash-join-node.cc.
References impala::RuntimeState::block_mgr(), block_mgr_client_, hash_partitions_, hash_tbls_, impala::ExecNode::is_closed(), impala::BufferedBlockMgr::MemLimitTooLowError(), NodeDebugString(), impala::Status::OK, RETURN_IF_ERROR, and runtime_state_.
Referenced by AppendRowStreamFull(), and BuildHashTables().
|
inline inherited |
Definition at line 155 of file exec-node.h.
References impala::ExecNode::type_.
Referenced by impala::ExecNode::CodegenEvalConjuncts(), impala::PartitionedAggregationNode::CodegenUpdateTuple(), and impala::PlanFragmentExecutor::Prepare().
|
private |
Updates state_ to 's', logging the transition.
Definition at line 1294 of file partitioned-hash-join-node.cc.
References NodeDebugString(), and state_.
Referenced by ConstructBuildSide(), and PrepareNextPartition().
|
private |
Client to the buffered block mgr.
Definition at line 306 of file partitioned-hash-join-node.h.
Referenced by Close(), Prepare(), and SpillPartition().
|
private |
Definition at line 293 of file partitioned-hash-join-node.h.
Referenced by AddToDebugString(), AllocateProbeFilters(), Close(), ConstructBuildSide(), Init(), and Prepare().
|
protected inherited |
Definition at line 70 of file blocking-join-node.h.
Referenced by impala::BlockingJoinNode::Close(), impala::HashJoinNode::ConstructBuildSide(), impala::BlockingJoinNode::Prepare(), and impala::BlockingJoinNode::Reset().
|
protected inherited |
Definition at line 102 of file blocking-join-node.h.
Referenced by impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), impala::BlockingJoinNode::Prepare(), and ProcessBuildInput().
|
protected inherited |
Definition at line 100 of file blocking-join-node.h.
Referenced by impala::CrossJoinNode::ConstructBuildSide(), impala::HashJoinNode::ConstructBuildSide(), and impala::BlockingJoinNode::Prepare().
|
protected inherited |
Definition at line 89 of file blocking-join-node.h.
Referenced by impala::HashJoinNode::CodegenCreateOutputRow(), CodegenCreateOutputRow(), impala::BlockingJoinNode::CreateOutputRow(), and impala::BlockingJoinNode::Prepare().
|
protected inherited |
If true, this node can add filters to the probe (left child) node after processing the entire build side.
Definition at line 98 of file blocking-join-node.h.
Referenced by AllocateProbeFilters(), AttachProbeFilters(), BuildHashTables(), impala::HashJoinNode::ConstructBuildSide(), impala::HashJoinNode::HashJoinNode(), and PartitionedHashJoinNode().
|
protected inherited |
Definition at line 214 of file exec-node.h.
Referenced by impala::ExecNode::child(), impala::ExecNode::Close(), impala::ExecNode::CollectNodes(), impala::ExecNode::CreateTreeHelper(), impala::HBaseScanNode::DebugString(), impala::UnionNode::GetNext(), impala::UnionNode::Open(), impala::AggregationNode::Open(), impala::PartitionedAggregationNode::Open(), impala::UnionNode::OpenCurrentChild(), impala::ExecNode::Prepare(), impala::PartitionedAggregationNode::ProcessStream(), impala::ExecNode::Reset(), and impala::ExecNode::SetDebugOptions().
|
protected inherited |
Definition at line 212 of file exec-node.h.
Referenced by impala::ExecNode::Close(), impala::HashJoinNode::CodegenProcessProbeBatch(), CodegenProcessProbeBatch(), impala::ExecNode::conjunct_ctxs(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HBaseScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::TopNNode::Init(), impala::ExecNode::Init(), Init(), impala::ExecNode::Open(), OutputUnmatchedBuild(), impala::ExecNode::Prepare(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and ProcessProbeBatch().
|
protected inherited |
Definition at line 82 of file blocking-join-node.h.
Referenced by impala::HashJoinNode::GetNext(), GetNext(), NextProbeRowBatch(), NextSpilledProbeRowBatch(), impala::BlockingJoinNode::Open(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), ProcessProbeBatch(), and ResetForProbe().
|
protected inherited |
Definition at line 220 of file exec-node.h.
Referenced by impala::ExecNode::ExecDebugAction(), and impala::ExecNode::SetDebugOptions().
|
protected inherited |
debug-only: if debug_action_ is not INVALID, node will perform action in debug_phase_
Definition at line 219 of file exec-node.h.
Referenced by impala::ExecNode::ExecDebugAction(), and impala::ExecNode::SetDebugOptions().
|
protected inherited |
Definition at line 69 of file blocking-join-node.h.
Referenced by impala::BlockingJoinNode::DebugString(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::HashJoinNode::LeftJoinGetNext(), impala::BlockingJoinNode::Open(), and impala::BlockingJoinNode::Reset().
|
protected inherited |
Execution options that are determined at runtime. These are added to the runtime profile at Close(). An example of an option logged here is "Codegen Enabled".
Definition at line 238 of file exec-node.h.
Referenced by impala::ExecNode::AddRuntimeExecOption().
|
protected inherited |
MemTracker that should be used for ExprContexts.
Definition at line 233 of file exec-node.h.
Referenced by impala::ExecNode::expr_mem_tracker(), and impala::ExecNode::Prepare().
|
private |
The current set of partitions that are being built. This is only used in mode 1 and 2 when we need to partition the build and probe inputs. This is not used when processing a single partition.
Definition at line 444 of file partitioned-hash-join-node.h.
Referenced by BuildHashTables(), CleanUpHashPartitions(), Close(), GetNext(), LargestSpilledPartition(), NodeDebugString(), PrepareNextPartition(), ProcessBuildBatch(), ProcessBuildInput(), ProcessProbeBatch(), ReserveTupleStreamBlocks(), and SpillPartition().
|
private |
The iterator that corresponds to the look up of current_probe_row_.
Definition at line 314 of file partitioned-hash-join-node.h.
Referenced by CleanUpHashPartitions(), NextSpilledProbeRowBatch(), OutputUnmatchedBuild(), ProcessProbeBatch(), and ResetForProbe().
|
private |
Cache of the per partition hash table to speed up ProcessProbeBatch. In the case where we need to partition the probe: hash_tbls_[i] = hash_partitions_[i]->hash_tbl(); In the case where we don't need to partition the probe: hash_tbls_[i] = input_partition_->hash_tbl();
Definition at line 451 of file partitioned-hash-join-node.h.
Referenced by BuildHashTables(), PartitionedHashJoinNode(), PrepareNextPartition(), ProcessProbeBatch(), and SpillPartition().
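The two ways of filling the cache described above can be sketched with placeholder types. HashTableSketch, PartitionSketch and both function names are hypothetical; only the two assignment rules come from the description.

```cpp
#include <cstddef>
#include <vector>

struct HashTableSketch {};

struct PartitionSketch {
  HashTableSketch* hash_tbl;
};

// Probe is being partitioned: slot i caches partition i's own hash table,
// i.e. hash_tbls_[i] = hash_partitions_[i]->hash_tbl().
std::vector<HashTableSketch*> CacheForPartitionedProbe(
    const std::vector<PartitionSketch*>& hash_partitions) {
  std::vector<HashTableSketch*> cache;
  cache.reserve(hash_partitions.size());
  for (const PartitionSketch* p : hash_partitions) cache.push_back(p->hash_tbl);
  return cache;
}

// Probe is not partitioned: every slot points at the single input
// partition's table, i.e. hash_tbls_[i] = input_partition_->hash_tbl().
std::vector<HashTableSketch*> CacheForSinglePartition(
    const PartitionSketch& input_partition, std::size_t fanout) {
  return std::vector<HashTableSketch*>(fanout, input_partition.hash_tbl);
}
```

Filling every slot in the single-partition case lets the probe loop index the cache by partition number in both modes without branching on which mode is active.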
|
private |
Used for hash-related functionality, such as evaluating rows and calculating hashes. TODO: If we want to multi-thread then this context should be thread-local and not associated with the node.
Definition at line 311 of file partitioned-hash-join-node.h.
Referenced by AllocateProbeFilters(), CleanUpHashPartitions(), Close(), CodegenProcessBuildBatch(), CodegenProcessProbeBatch(), GetNext(), NextSpilledProbeRowBatch(), OutputUnmatchedBuild(), Prepare(), PrepareNextPartition(), ProcessBuildBatch(), and ProcessBuildInput().
|
protected inherited |
Definition at line 209 of file exec-node.h.
Referenced by impala::PartitionedAggregationNode::CreateHashPartitions(), impala::ExecNode::id(), impala::ExecNode::InitRuntimeProfile(), impala::PartitionedAggregationNode::NextPartition(), impala::ExchangeNode::Prepare(), PrepareNextPartition(), ProcessBuildInput(), and impala::ExecNode::SetDebugOptions().
|
private |
The current input partition to be processed (not in spilled_partitions_). This partition can either serve as the source for a repartitioning step, or if the hash table fits in memory, the source of the probe rows.
Definition at line 456 of file partitioned-hash-join-node.h.
Referenced by BuildHashTables(), CleanUpHashPartitions(), Close(), GetNext(), NextSpilledProbeRowBatch(), PrepareNextPartition(), and ProcessBuildInput().
|
protected inherited |
Definition at line 68 of file blocking-join-node.h.
Referenced by CleanUpHashPartitions(), CodegenCreateOutputRow(), CodegenProcessProbeBatch(), GetNext(), impala::HashJoinNode::HashJoinNode(), Init(), MinRequiredBuffers(), NextSpilledProbeRowBatch(), NodeDebugString(), OutputUnmatchedBuild(), impala::CrossJoinNode::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), Prepare(), and impala::HashJoinNode::ProcessProbeBatch().
|
private |
The largest fraction (of build side) after repartitioning. This is expected to be 1 / PARTITION_FANOUT. A value much larger indicates skew.
Definition at line 340 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
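One way to compute such a skew indicator is to divide the largest partition's row count by the total. The helper below is a hypothetical sketch that returns a fraction in [0, 1]; whether the real counter stores a fraction or a percentage is not stated here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Returns the share of all build rows held by the largest partition.
// With N evenly filled partitions the result is 1 / N; values well above
// that suggest skew. Returns 0 when there are no rows at all.
double LargestPartitionFraction(
    const std::vector<std::int64_t>& build_rows_per_partition) {
  std::int64_t total = 0;
  std::int64_t largest = 0;
  for (std::int64_t n : build_rows_per_partition) {
    total += n;
    largest = std::max(largest, n);
  }
  if (total == 0) return 0.0;
  return static_cast<double>(largest) / static_cast<double>(total);
}
```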
|
protected inherited |
Definition at line 222 of file exec-node.h.
Referenced by impala::SortNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::ExchangeNode::GetNextMerging(), impala::TopNNode::InsertTupleRow(), impala::HdfsScanNode::limit(), impala::ExecNode::limit(), impala::TopNNode::Open(), and impala::ExecNode::ReachedLimit().
|
staticinherited |
Definition at line 64 of file blocking-join-node.h.
Referenced by impala::HashJoinNode::CodegenCreateOutputRow(), and CodegenCreateOutputRow().
|
private |
For each row in null_probe_rows_, true if this row has matched any build row (i.e. the resulting joined row passes other_join_conjuncts). TODO: remove this. We need to be able to put these bits inside the tuple itself.
Definition at line 493 of file partitioned-hash-join-node.h.
Referenced by EvaluateNullProbe(), OutputNullAwareNullProbe(), and ProcessProbeBatch().
|
protectedinherited |
Definition at line 83 of file blocking-join-node.h.
Referenced by impala::HashJoinNode::GetNext(), impala::HashJoinNode::InitGetNext(), impala::HashJoinNode::ProcessProbeBatch(), ProcessProbeBatch(), and ResetForProbe().
|
staticprivate |
Maximum number of build tables that can be in memory at any time. This is in addition to the memory constraints and is used for testing to trigger code paths for small tables. Note: In order to test the spilling paths more easily, set it to PARTITION_FANOUT / 2. TODO: Eventually remove.
Definition at line 137 of file partitioned-hash-join-node.h.
|
staticprivate |
Maximum number of times we will repartition. The maximum build table we can process is: MEM_LIMIT * (PARTITION_FANOUT ^ MAX_PARTITION_DEPTH). With a (low) 1GB limit and 64 fanout, we can support 256TB build tables in the case where there is no skew. In the case where there is skew, repartitioning is unlikely to help (assuming a reasonable hash function). Note that HashTableCtx must define at least this many SEED_PRIMES. TODO: we can revisit and try harder to explicitly detect skew.
Definition at line 130 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
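As a worked check of the size formula above, the sketch below evaluates MEM_LIMIT * (PARTITION_FANOUT ^ MAX_PARTITION_DEPTH), reading `^` as exponentiation. The constants are hypothetical: the 1GB limit and fanout of 64 come from the comment, while the depth of 3 is inferred from the 256TB figure and is not stated in the text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values chosen to reproduce the example in the comment above.
constexpr uint64_t kMemLimitBytes = 1ULL << 30;  // 1GB
constexpr uint64_t kFanout = 64;
constexpr int kDepth = 3;  // Inferred from the 256TB figure, not from the source.

// Returns mem_limit * fanout^depth: the largest build side that can be
// processed if every repartitioning step splits the data evenly (no skew).
constexpr uint64_t MaxBuildBytes(uint64_t mem_limit, uint64_t fanout, int depth) {
  return depth == 0 ? mem_limit
                    : fanout * MaxBuildBytes(mem_limit, fanout, depth - 1);
}

// 2^30 bytes * 64^3 = 2^48 bytes = 256TB.
static_assert(MaxBuildBytes(kMemLimitBytes, kFanout, kDepth) == (256ULL << 40),
              "1GB * 64^3 must equal 256TB");
```

The bound only holds for an even split. With skew, one child partition can stay close to the size of its parent, so additional levels do not shrink it.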
|
private |
Maximum partition level reached (i.e. number of repartitioning steps).
Definition at line 326 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
|
protectedinherited |
Account for peak memory used by this node.
Definition at line 230 of file exec-node.h.
Referenced by impala::ExecNode::mem_tracker(), and impala::ExecNode::Prepare().
|
protectedinherited |
Definition at line 67 of file blocking-join-node.h.
Referenced by impala::BlockingJoinNode::DebugString(), and impala::BlockingJoinNode::Open().
|
private |
If true, the build side has at least one row.
Definition at line 484 of file partitioned-hash-join-node.h.
Referenced by ProcessBuildInput(), and ProcessProbeBatch().
|
private |
Time spent evaluating other_join_conjuncts for NAAJ.
Definition at line 343 of file partitioned-hash-join-node.h.
Referenced by EvaluateNullProbe(), and Prepare().
|
private |
Partition used if null_aware_ is set. This partition is always processed at the end after all build and probe rows are processed. Rows are added to this partition along the way. In this partition's build_rows_, we store all the rows for which build_expr_ctxs_ evaluated over the row returns NULL (i.e. it has a NULL on the eq join slot). In this partition's probe_rows, we store all probe rows that did not have a match in the hash table. At the very end, we then iterate over all the probe rows. For each probe row, we return the rows that did not match any of the build rows. NULL if this join is not null aware or we are done processing this partition.
Definition at line 477 of file partitioned-hash-join-node.h.
Referenced by Close(), GetNext(), OutputNullAwareNullProbe(), OutputNullAwareProbeRows(), Prepare(), PrepareNullAwarePartition(), ProcessBuildBatch(), and ProcessProbeBatch().
|
private |
The current index into null_probe_rows_/matched_null_probe_ that we are outputting.
Definition at line 497 of file partitioned-hash-join-node.h.
Referenced by GetNext(), OutputNullAwareNullProbe(), and PrepareNullAwareNullProbe().
|
private |
For NAAJ, this stream contains all probe rows that had NULL on the hash table conjuncts.
Definition at line 488 of file partitioned-hash-join-node.h.
Referenced by CleanUpHashPartitions(), Close(), EvaluateNullProbe(), OutputNullAwareNullProbe(), Prepare(), PrepareNullAwareNullProbe(), and ProcessProbeBatch().
|
private |
Used while processing null_aware_partition_. It contains all the build tuple rows with a NULL when evaluating the hash table expr.
Definition at line 481 of file partitioned-hash-join-node.h.
Referenced by Close(), GetNext(), OutputNullAwareNullProbe(), OutputNullAwareProbeRows(), and PrepareNullAwarePartition().
|
private |
Number of build/probe rows that have been partitioned.
Definition at line 329 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
|
private |
Total number of hash buckets across all partitions.
Definition at line 320 of file partitioned-hash-join-node.h.
Referenced by Prepare().
|
staticprivate |
Must be log2(PARTITION_FANOUT).
Definition at line 119 of file partitioned-hash-join-node.h.
Referenced by impala::PartitionedHashJoinNode::Partition::BuildHashTableInternal(), ProcessBuildBatch(), and ProcessProbeBatch().
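The relationship between the number of partitioning bits and the fanout can be sketched as follows. The constant values and the choice to take the top bits of a 32-bit hash are assumptions made for illustration; the header defines the real constants and the implementation may select the bits differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values for illustration only.
constexpr int kPartitionFanout = 16;
constexpr int kNumPartitioningBits = 4;
static_assert((1 << kNumPartitioningBits) == kPartitionFanout,
              "the bit count must equal log2(fanout)");

// Maps a 32-bit hash to a partition index in [0, kPartitionFanout) by
// keeping only the top kNumPartitioningBits bits of the hash.
inline uint32_t PartitionIndex(uint32_t hash) {
  uint32_t idx = hash >> (32 - kNumPartitioningBits);
  assert(idx < static_cast<uint32_t>(kPartitionFanout));
  return idx;
}
```

Because the fanout is a power of two, a shift is enough to obtain the index and no modulo is needed.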
|
private |
Definition at line 330 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and PrepareNextPartition().
|
private |
Number of partitions that have been repartitioned.
Definition at line 333 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and PrepareNextPartition().
|
protectedinherited |
Definition at line 223 of file exec-node.h.
Referenced by impala::ExecNode::Close(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::SortNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), GetNext(), impala::AnalyticEvalNode::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::ExchangeNode::GetNextMerging(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::HashJoinNode::LeftJoinGetNext(), OutputUnmatchedBuild(), impala::ExecNode::ReachedLimit(), and impala::ExecNode::rows_returned().
|
private |
Number of partitions that have been spilled.
Definition at line 336 of file partitioned-hash-join-node.h.
Referenced by Prepare().
|
private |
Non-equi-join conjuncts from the JOIN clause.
Definition at line 296 of file partitioned-hash-join-node.h.
Referenced by Close(), CodegenProcessProbeBatch(), ConstructBuildSide(), EvaluateNullProbe(), Init(), OutputNullAwareProbeRows(), Prepare(), and ProcessProbeBatch().
|
private |
In the case of right-outer and full-outer joins, this is the list of partitions whose unmatched build rows we still need to output. We always flush the unmatched rows of the partition at the front of the list.
Definition at line 461 of file partitioned-hash-join-node.h.
Referenced by CleanUpHashPartitions(), Close(), GetNext(), NextSpilledProbeRowBatch(), and OutputUnmatchedBuild().
|
private |
Total time spent partitioning build.
Definition at line 317 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
|
staticprivate |
Number of initial partitions to create. Must be a power of two. TODO: this is set to a lower than actual value for testing.
Definition at line 116 of file partitioned-hash-join-node.h.
Referenced by BuildHashTables(), MinRequiredBuffers(), PrepareNextPartition(), and ProcessBuildInput().
|
private |
Total number of partitions created.
Definition at line 323 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and ProcessBuildInput().
|
protectedinherited |
Definition at line 211 of file exec-node.h.
Referenced by impala::SortNode::Init(), impala::UnionNode::Init(), impala::TopNNode::Init(), impala::ExchangeNode::Init(), impala::HashJoinNode::Init(), impala::AggregationNode::Init(), impala::ExecNode::Init(), Init(), impala::AnalyticEvalNode::Init(), impala::PartitionedAggregationNode::Init(), impala::ExecNode::InitRuntimeProfile(), impala::HdfsScanNode::Open(), Prepare(), and ProcessBuildInput().
|
protectedinherited |
probe_batch_ must be cleared before calling GetNext(). The child node does not initialize all tuple ptrs in the row, only the ones that it is responsible for.
Definition at line 75 of file blocking-join-node.h.
Referenced by impala::BlockingJoinNode::Close(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::HashJoinNode::LeftJoinGetNext(), NextProbeRowBatch(), NextSpilledProbeRowBatch(), impala::BlockingJoinNode::Open(), OutputNullAwareNullProbe(), OutputNullAwareProbeRows(), impala::BlockingJoinNode::Prepare(), PrepareNullAwareNullProbe(), PrepareNullAwarePartition(), ProcessProbeBatch(), impala::BlockingJoinNode::Reset(), and impala::BlockingJoinNode::~BlockingJoinNode().
|
protectedinherited |
TODO: These variables should move to a join control block struct, which is local to each probing thread.
Definition at line 81 of file blocking-join-node.h.
Referenced by CleanUpHashPartitions(), impala::BlockingJoinNode::DebugString(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), GetNext(), impala::HashJoinNode::LeftJoinGetNext(), NextProbeRowBatch(), NextSpilledProbeRowBatch(), impala::BlockingJoinNode::Open(), OutputNullAwareNullProbe(), OutputNullAwareProbeRows(), PrepareNullAwareNullProbe(), PrepareNullAwarePartition(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), ProcessProbeBatch(), and ResetForProbe().
|
private |
Our equi-join predicates "<lhs> = <rhs>" are separated into build_expr_ctxs_ (over child(1)) and probe_expr_ctxs_ (over child(0)).
Definition at line 292 of file partitioned-hash-join-node.h.
Referenced by AddToDebugString(), AllocateProbeFilters(), Close(), ConstructBuildSide(), Init(), and Prepare().
|
private |
Used for concentrating the existence bits from all the partitions, used by the probe-side filter optimization.
Definition at line 465 of file partitioned-hash-join-node.h.
Referenced by AllocateProbeFilters(), and AttachProbeFilters().
|
protectedinherited |
Definition at line 103 of file blocking-join-node.h.
Referenced by impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::HashJoinNode::LeftJoinGetNext(), NextProbeRowBatch(), impala::BlockingJoinNode::Open(), and impala::BlockingJoinNode::Prepare().
|
protectedinherited |
Definition at line 77 of file blocking-join-node.h.
Referenced by impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::HashJoinNode::LeftJoinGetNext(), NextProbeRowBatch(), impala::BlockingJoinNode::Open(), and impala::BlockingJoinNode::Reset().
|
protectedinherited |
Definition at line 101 of file blocking-join-node.h.
Referenced by impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), GetNext(), impala::HashJoinNode::LeftJoinGetNext(), OutputUnmatchedBuild(), and impala::BlockingJoinNode::Prepare().
|
protectedinherited |
Size of the TupleRow (just the Tuple ptrs) from the build (right) and probe (left) sides. Set to zero if the build/probe tuples are not returned, e.g., for semi joins. Cached because it is used in the hot path.
Definition at line 88 of file blocking-join-node.h.
Referenced by impala::HashJoinNode::CodegenCreateOutputRow(), CodegenCreateOutputRow(), impala::BlockingJoinNode::CreateOutputRow(), and impala::BlockingJoinNode::Prepare().
|
private |
Jitted ProcessBuildBatch function pointers. NULL if codegen is disabled. process_build_batch_fn_level0_ uses CRC hashing when available and is used when the partition level is 0; process_build_batch_fn_ uses murmur hash and is used for subsequent levels.
Definition at line 423 of file partitioned-hash-join-node.h.
Referenced by CodegenProcessBuildBatch(), and ProcessBuildInput().
|
private |
Definition at line 424 of file partitioned-hash-join-node.h.
Referenced by CodegenProcessBuildBatch(), and ProcessBuildInput().
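The selection rule for the two jitted pointers can be sketched as below. The function pointer type, the interpreted fallback, and the helper name are hypothetical stand-ins; the real signatures take a batch and belong to the node, and the fallback behavior is an assumption based on the "NULL if codegen is disabled" remark.

```cpp
#include <cassert>

// Hypothetical stand-in for the real ProcessBuildBatch signature.
using ProcessBatchFn = int (*)(int num_rows);

// Hypothetical interpreted (non-jitted) path.
int InterpretedProcessBatch(int num_rows) { return num_rows; }

// Picks the function to run for a partition at 'level'. Level 0 gets the
// level-0 variant and deeper levels get the other one. A null pointer means
// codegen is disabled, so the interpreted version is returned instead.
ProcessBatchFn SelectProcessBatchFn(int level, ProcessBatchFn level0_fn,
                                    ProcessBatchFn other_fn) {
  assert(level >= 0);
  ProcessBatchFn jitted = (level == 0) ? level0_fn : other_fn;
  return jitted != nullptr ? jitted : &InterpretedProcessBatch;
}
```

Keeping two pointers lets level 0 use the faster hash while deeper levels use a different hash function, so rows that collided at one level are redistributed at the next.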
|
private |
Jitted ProcessProbeBatch function pointers. NULL if codegen is disabled. process_probe_batch_fn_level0_ uses CRC hashing when available and is used when the partition level is 0; process_probe_batch_fn_ uses murmur hash and is used for subsequent levels.
Definition at line 433 of file partitioned-hash-join-node.h.
Referenced by CodegenProcessProbeBatch(), and GetNext().
|
private |
Definition at line 434 of file partitioned-hash-join-node.h.
Referenced by CodegenProcessProbeBatch(), and GetNext().
|
protectedinherited |
Definition at line 215 of file exec-node.h.
Referenced by impala::SortNode::Open(), impala::SortNode::Prepare(), impala::TopNNode::Prepare(), impala::ExchangeNode::Prepare(), and impala::ExecNode::row_desc().
|
staticinherited |
Names of counters shared by all exec nodes.
Definition at line 169 of file exec-node.h.
Referenced by impala::ExecNode::Prepare().
|
protectedinherited |
Definition at line 226 of file exec-node.h.
Referenced by impala::ExecNode::Close(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::SortNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::AnalyticEvalNode::GetNext(), GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), impala::ExchangeNode::GetNextMerging(), impala::HashJoinNode::LeftJoinGetNext(), OutputUnmatchedBuild(), and impala::ExecNode::Prepare().
|
protectedinherited |
Definition at line 227 of file exec-node.h.
Referenced by impala::ExecNode::Prepare().
|
protectedinherited |
Definition at line 239 of file exec-node.h.
Referenced by impala::ExecNode::AddRuntimeExecOption().
|
protectedinherited |
Definition at line 225 of file exec-node.h.
Referenced by impala::HBaseScanNode::Close(), impala::SelectNode::GetNext(), impala::SortNode::GetNext(), impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::AnalyticEvalNode::GetNext(), GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::HdfsScanNode::GetNext(), impala::ExecNode::InitRuntimeProfile(), impala::SelectNode::Open(), impala::HBaseScanNode::Open(), impala::UnionNode::Open(), impala::SortNode::Open(), impala::TopNNode::Open(), impala::ExchangeNode::Open(), impala::DataSourceScanNode::Open(), impala::BlockingJoinNode::Open(), impala::AggregationNode::Open(), impala::AnalyticEvalNode::Open(), impala::PartitionedAggregationNode::Open(), impala::SelectNode::Prepare(), impala::SortNode::Prepare(), impala::UnionNode::Prepare(), impala::TopNNode::Prepare(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::Prepare(), impala::AggregationNode::Prepare(), Prepare(), impala::AnalyticEvalNode::Prepare(), impala::ExecNode::Prepare(), impala::ScanNode::Prepare(), impala::PartitionedAggregationNode::Prepare(), impala::HdfsScanNode::Prepare(), impala::ExecNode::runtime_profile(), and impala::HdfsScanNode::StopAndFinalizeCounters().
|
private |
Definition at line 288 of file partitioned-hash-join-node.h.
Referenced by Prepare(), and SpillPartition().
|
protectedinherited |
Row assembled from all lhs and rhs tuples used for evaluating the non-equi-join conjuncts for semi joins. Semi joins only return the lhs or rhs output tuples, so this row is temporarily assembled for evaluating the conjuncts.
Definition at line 94 of file blocking-join-node.h.
Referenced by impala::BlockingJoinNode::Close(), EvaluateNullProbe(), OutputNullAwareProbeRows(), impala::BlockingJoinNode::Prepare(), impala::HashJoinNode::ProcessProbeBatch(), and ProcessProbeBatch().
|
private |
The list of partitions that have been spilled on both sides and still need more processing. These partitions could need repartitioning, in which case more partitions will be added to this list after repartitioning.
Definition at line 439 of file partitioned-hash-join-node.h.
Referenced by CleanUpHashPartitions(), Close(), NodeDebugString(), and PrepareNextPartition().
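The way this list grows during repartitioning can be sketched as a simple worklist. Everything here is a simplified assumption: the struct, the even split into children, and the return convention are hypothetical and only model the description above, not the node's actual control flow.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Hypothetical, simplified stand-in for a spilled partition.
struct SpilledPartition {
  int64_t bytes;
  int level;
};

// Drains 'spilled'. A partition that fits within 'mem_limit' is processed.
// Otherwise it is split evenly into 'fanout' children at level + 1, which
// are appended to the list. Returns the number of partitions processed, or
// -1 if a split would exceed 'max_depth'.
int DrainSpilledPartitions(std::list<SpilledPartition> spilled,
                           int64_t mem_limit, int fanout, int max_depth) {
  assert(mem_limit > 0 && fanout > 1);
  int processed = 0;
  while (!spilled.empty()) {
    SpilledPartition p = spilled.front();
    spilled.pop_front();
    if (p.bytes <= mem_limit) {
      ++processed;
      continue;
    }
    if (p.level + 1 > max_depth) return -1;  // Repartitioning limit reached.
    int64_t child_bytes = (p.bytes + fanout - 1) / fanout;  // Even split, rounded up.
    for (int i = 0; i < fanout; ++i) {
      spilled.push_back({child_bytes, p.level + 1});
    }
  }
  return processed;
}
```

The even split models the no-skew case. With skew, a child could remain larger than the memory limit at every level, and the depth limit is what ends the loop.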
|
private |
State of the algorithm. Used just for debugging.
Definition at line 302 of file partitioned-hash-join-node.h.
Referenced by GetNext(), PrintState(), ProcessProbeBatch(), and UpdateState().
|
private |
Definition at line 303 of file partitioned-hash-join-node.h.
Referenced by AppendRowStreamFull(), GetNext(), ProcessBuildBatch(), and ProcessProbeBatch().
|
protectedinherited |
Definition at line 210 of file exec-node.h.
Referenced by impala::ExecNode::CollectNodes(), and impala::ExecNode::type().
|
private |
If true, the partitions in hash_partitions_ are using small buffers.
Definition at line 299 of file partitioned-hash-join-node.h.
Referenced by AppendRowStreamFull(), ProcessBuildInput(), and ReserveTupleStreamBlocks().