Impala
Impala is the open source, native analytic database for Apache Hadoop.
Public Member Functions
Partition (PartitionedAggregationNode *parent, int level)
Status | InitStreams ()
bool | InitHashTable ()
Initializes the hash table. Returns false on OOM.
void | Close (bool finalize_rows)
Status | Spill (Tuple *tuple=NULL)
bool | is_spilled () const
Public Attributes
PartitionedAggregationNode * | parent
bool | is_closed
If true, this partition is closed and there is nothing left to do.
const int | level
boost::scoped_ptr< HashTable > | hash_tbl
std::vector< impala_udf::FunctionContext * > | agg_fn_ctxs
Clone of parent's agg_fn_ctxs_ and backing MemPool.
boost::scoped_ptr< MemPool > | agg_fn_pool
boost::scoped_ptr< BufferedTupleStream > | aggregated_row_stream
boost::scoped_ptr< BufferedTupleStream > | unaggregated_row_stream
Unaggregated rows that are spilled.
Definition at line 237 of file partitioned-aggregation-node.h.
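impala::PartitionedAggregationNode::Partition::Partition (PartitionedAggregationNode *parent, int level) inline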
Definition at line 238 of file partitioned-aggregation-node.h.
void impala::PartitionedAggregationNode::Partition::Close (bool finalize_rows)
Closes this partition. If finalize_rows is true, this iterates over all rows in aggregated_row_stream and finalizes them (this is only used in the cancellation path).
Definition at line 570 of file partitioned-aggregation-node.cc.
References impala::ExecNode::is_closed().
Referenced by impala::PartitionedAggregationNode::Close(), impala::PartitionedAggregationNode::GetNext(), and impala::PartitionedAggregationNode::MoveHashPartitions().
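The effect of the finalize_rows flag can be illustrated with a minimal sketch. This is not Impala's implementation: RowSketch and ClosablePartitionSketch are hypothetical stand-ins, a plain std::vector replaces aggregated_row_stream, and treating a second Close() call as a no-op is an assumption rather than something the documentation above states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a row held in aggregated_row_stream.
struct RowSketch {
  bool finalized;
  RowSketch() : finalized(false) {}
};

// Hypothetical, simplified partition; not Impala's actual struct.
struct ClosablePartitionSketch {
  bool is_closed;
  std::vector<RowSketch> aggregated_rows;

  ClosablePartitionSketch() : is_closed(false) {}

  // Marks the partition closed. Rows are finalized only when requested.
  void Close(bool finalize_rows) {
    if (is_closed) return;  // Assumption: closing twice is a no-op.
    if (finalize_rows) {
      for (std::size_t i = 0; i < aggregated_rows.size(); ++i) {
        aggregated_rows[i].finalized = true;
      }
    }
    is_closed = true;
  }
};
```

In this sketch, a caller on the cancellation path would pass true so that every aggregated row is finalized before the partition is discarded; other callers would pass false.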
bool impala::PartitionedAggregationNode::Partition::InitHashTable ()
Initializes the hash table. Returns false on OOM.
Definition at line 454 of file partitioned-aggregation-node.cc.
References impala::PartitionedAggregationNode::NUM_PARTITIONING_BITS.
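The "returns false on OOM" contract can be sketched as follows. Everything here is illustrative: BudgetSketch and HashPartitionSketch are hypothetical names, the byte budget stands in for whatever memory tracking Impala actually uses, and the real InitHashTable() takes no arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical memory budget; stands in for Impala's real memory tracking.
struct BudgetSketch {
  std::size_t remaining_bytes;
  explicit BudgetSketch(std::size_t bytes) : remaining_bytes(bytes) {}

  // Reserves n bytes if available; leaves the budget untouched otherwise.
  bool TryConsume(std::size_t n) {
    if (n > remaining_bytes) return false;
    remaining_bytes -= n;
    return true;
  }
};

// Hypothetical, simplified partition; not Impala's actual struct.
struct HashPartitionSketch {
  std::unique_ptr<std::vector<int> > hash_tbl;

  // Returns false (instead of failing hard) when the budget is too small,
  // so the caller can decide how to recover.
  bool InitHashTable(BudgetSketch* budget, std::size_t num_buckets) {
    if (!budget->TryConsume(num_buckets * sizeof(int))) return false;
    hash_tbl.reset(new std::vector<int>(num_buckets, 0));
    return true;
  }
};
```

Returning a bool rather than an error status lets the caller treat out-of-memory as a recoverable condition; how Impala's callers actually react is not stated in this page.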
Status impala::PartitionedAggregationNode::Partition::InitStreams ()
Initializes aggregated_row_stream and unaggregated_row_stream, reserving one buffer for each. The buffers backing these streams are reserved, so this function will not fail with a continuable OOM. If we fail to init these buffers, the mem limit is too low to run this algorithm.
Definition at line 429 of file partitioned-aggregation-node.cc.
References impala::ObjectPool::Add(), agg_fn_ctxs, impala::PartitionedAggregationNode::agg_fn_ctxs_, agg_fn_pool, aggregated_row_stream, impala::RuntimeState::block_mgr(), impala::PartitionedAggregationNode::block_mgr_client_, impala::ExecNode::child(), impala::ExecNode::expr_mem_tracker(), impala::PartitionedAggregationNode::intermediate_row_desc_, level, impala::RuntimeState::obj_pool(), impala::Status::OK, parent, RETURN_IF_ERROR, impala::ExecNode::row_desc(), impala::ExecNode::runtime_profile(), impala::PartitionedAggregationNode::state_, and unaggregated_row_stream.
bool impala::PartitionedAggregationNode::Partition::is_spilled () const inline
Definition at line 261 of file partitioned-aggregation-node.h.
References hash_tbl.
Referenced by impala::PartitionedAggregationNode::LargestSpilledPartition(), impala::PartitionedAggregationNode::MoveHashPartitions(), impala::PartitionedAggregationNode::NextPartition(), and impala::PartitionedAggregationNode::ProcessBatch().
Status impala::PartitionedAggregationNode::Partition::Spill (Tuple *tuple=NULL)
Spills this partition, unpinning streams and cleaning up hash tables as necessary. If tuple is non-NULL, tuple should also be cleaned up (it was added to this partition's aggregated_row_stream but not to the hash table).
Definition at line 467 of file partitioned-aggregation-node.cc.
References impala::BufferedTupleStream::AddRow(), impala::HashTable::Iterator::AtEnd(), COUNTER_ADD, impala::HashTable::Iterator::GetTuple(), impala::HashTable::Iterator::Next(), impala::Status::OK, impala::Status::ok(), RETURN_IF_ERROR, impala::AggFnEvaluator::Serialize(), impala::BufferedTupleStream::status(), and UNLIKELY.
Referenced by impala::PartitionedAggregationNode::SpillPartition().
std::vector<impala_udf::FunctionContext*> impala::PartitionedAggregationNode::Partition::agg_fn_ctxs
Clone of parent's agg_fn_ctxs_ and backing MemPool.
Definition at line 279 of file partitioned-aggregation-node.h.
Referenced by impala::PartitionedAggregationNode::Close(), impala::PartitionedAggregationNode::GetNext(), InitStreams(), and impala::PartitionedAggregationNode::ProcessBatch().
boost::scoped_ptr<MemPool> impala::PartitionedAggregationNode::Partition::agg_fn_pool
Definition at line 280 of file partitioned-aggregation-node.h.
Referenced by InitStreams().
boost::scoped_ptr<BufferedTupleStream> impala::PartitionedAggregationNode::Partition::aggregated_row_stream
Tuple stream used to store aggregated rows. When the partition is not spilled (meaning the hash table is maintained), this stream is pinned and contains the memory referenced by the hash table. When the partition is spilled, aggregated rows are just appended to this stream.
Definition at line 286 of file partitioned-aggregation-node.h.
Referenced by InitStreams(), impala::PartitionedAggregationNode::LargestSpilledPartition(), impala::PartitionedAggregationNode::MoveHashPartitions(), impala::PartitionedAggregationNode::NextPartition(), and impala::PartitionedAggregationNode::ProcessBatch().
boost::scoped_ptr<HashTable> impala::PartitionedAggregationNode::Partition::hash_tbl
Hash table for this partition. Can be NULL if this partition is no longer maintaining a hash table (i.e. is spilled).
Definition at line 276 of file partitioned-aggregation-node.h.
Referenced by is_spilled(), impala::PartitionedAggregationNode::MoveHashPartitions(), impala::PartitionedAggregationNode::NextPartition(), and impala::PartitionedAggregationNode::ProcessBatch().
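Since is_spilled() references only hash_tbl, and hash_tbl is NULL for a spilled partition, the check presumably reduces to a null test. The sketch below illustrates that relationship under stated assumptions: SpillFlagSketch, SketchHashTable, and DropHashTable() are hypothetical names, and std::unique_ptr is used in place of boost::scoped_ptr to keep the example standard-library only.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for impala::HashTable.
typedef std::unordered_map<int, long> SketchHashTable;

// Hypothetical, simplified partition; not Impala's actual struct.
struct SpillFlagSketch {
  std::unique_ptr<SketchHashTable> hash_tbl;

  SpillFlagSketch() : hash_tbl(new SketchHashTable()) {}

  // A partition counts as spilled once it no longer owns a hash table.
  bool is_spilled() const { return hash_tbl.get() == nullptr; }

  // Releases the hash table, mimicking the effect spilling has on the flag.
  void DropHashTable() { hash_tbl.reset(); }
};
```

Deriving the flag from the pointer avoids a separate boolean that could drift out of sync with the hash table's lifetime.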
bool impala::PartitionedAggregationNode::Partition::is_closed
If true, this partition is closed and there is nothing left to do.
Definition at line 266 of file partitioned-aggregation-node.h.
const int impala::PartitionedAggregationNode::Partition::level
How many times rows in this partition have been repartitioned. Partitions created from the input of the node's child are level 0, partitions created by the first repartitioning are level 1, and so on.
Definition at line 271 of file partitioned-aggregation-node.h.
Referenced by InitStreams().
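The level numbering can be illustrated with a small sketch in which each repartitioning step produces child partitions one level deeper than their source. LevelSketch and RepartitionSketch are hypothetical names, and the fanout parameter is an assumption made for the example; the page does not say how many partitions a repartitioning step creates.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified partition that carries only its level.
struct LevelSketch {
  const int level;
  explicit LevelSketch(int l) : level(l) {}
};

// Creates `fanout` child partitions, each one level deeper than `src`.
std::vector<LevelSketch> RepartitionSketch(const LevelSketch& src, int fanout) {
  std::vector<LevelSketch> children;
  children.reserve(fanout);
  for (int i = 0; i < fanout; ++i) {
    children.push_back(LevelSketch(src.level + 1));
  }
  return children;
}
```

A partition built directly from child input would be constructed with level 0; repartitioning it yields level 1 partitions, and repartitioning one of those yields level 2.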
PartitionedAggregationNode* impala::PartitionedAggregationNode::Partition::parent
Definition at line 263 of file partitioned-aggregation-node.h.
Referenced by InitStreams().
boost::scoped_ptr<BufferedTupleStream> impala::PartitionedAggregationNode::Partition::unaggregated_row_stream
Unaggregated rows that are spilled.
Definition at line 289 of file partitioned-aggregation-node.h.
Referenced by InitStreams(), impala::PartitionedAggregationNode::LargestSpilledPartition(), impala::PartitionedAggregationNode::MoveHashPartitions(), and impala::PartitionedAggregationNode::ProcessBatch().