Impala
Impala is the open source, native analytic database for Apache Hadoop.
#include <row-batch.h>
Public Member Functions | |
RowBatch (const RowDescriptor &row_desc, int capacity, MemTracker *tracker) | |
RowBatch (const RowDescriptor &row_desc, const TRowBatch &input_batch, MemTracker *tracker) | |
~RowBatch () | |
int | AddRows (int n) |
int | AddRow () |
void | CommitRows (int n) |
void | CommitLastRow () |
void | set_num_rows (int num_rows) |
bool | AtCapacity () |
bool | AtCapacity (MemPool *tuple_pool) |
int | TotalByteSize () |
TupleRow * | GetRow (int row_idx) |
int | row_byte_size () |
MemPool * | tuple_data_pool () |
int | num_io_buffers () const |
int | num_tuple_streams () const |
void | Reset () |
Resets the row batch, returning all resources it has accumulated. | |
void | AddIoBuffer (DiskIoMgr::BufferDescriptor *buffer) |
Add io buffer to this row batch. | |
void | AddTupleStream (BufferedTupleStream *stream) |
void | MarkNeedToReturn () |
void | TransferResourceOwnership (RowBatch *dest) |
void | CopyRow (TupleRow *src, TupleRow *dest) |
void | CopyRows (int dest, int src, int num_rows) |
void | ClearRow (TupleRow *row) |
void | AcquireState (RowBatch *src) |
int | Serialize (TRowBatch *output_batch) |
int | num_rows () const |
int | capacity () const |
const RowDescriptor & | row_desc () const |
int | MaxTupleBufferSize () |
Computes the maximum size needed to store tuple data for this row batch. | |
Static Public Member Functions | |
static int | GetBatchSize (const TRowBatch &batch) |
Utility function: returns total size of batch. | |
Static Public Attributes | |
static const int | INVALID_ROW_INDEX = -1 |
static const int | AT_CAPACITY_MEM_USAGE = 8 * 1024 * 1024 |
Private Attributes | |
MemTracker * | mem_tracker_ |
bool | has_in_flight_row_ |
All members below need to be handled in RowBatch::AcquireState(). | |
int | num_rows_ |
int | capacity_ |
int | num_tuples_per_row_ |
RowDescriptor | row_desc_ |
Tuple ** | tuple_ptrs_ |
int | tuple_ptrs_size_ |
int64_t | auxiliary_mem_usage_ |
bool | need_to_return_ |
boost::scoped_ptr< MemPool > | tuple_data_pool_ |
Memory pool holding (some of the) data referenced by rows. | |
std::vector < DiskIoMgr::BufferDescriptor * > | io_buffers_ |
std::vector < BufferedTupleStream * > | tuple_streams_ |
Tuple streams currently owned by this row batch. | |
std::string | compression_scratch_ |
A RowBatch encapsulates a batch of rows, each composed of a number of tuples. The maximum number of rows is fixed at the time of construction. The row batch references a few different sources of memory.
Definition at line 66 of file row-batch.h.
impala::RowBatch::RowBatch | ( | const RowDescriptor & | row_desc, |
int | capacity, | ||
MemTracker * | tracker | ||
) |
Create RowBatch for a maximum of 'capacity' rows of tuples specified by 'row_desc'. tracker cannot be NULL.
Definition at line 34 of file row-batch.cc.
References capacity_, mem_tracker_, num_tuples_per_row_, tuple_data_pool_, tuple_ptrs_, and tuple_ptrs_size_.
impala::RowBatch::RowBatch | ( | const RowDescriptor & | row_desc, |
const TRowBatch & | input_batch, | ||
MemTracker * | tracker | ||
) |
Populate a row batch from input_batch by copying input_batch's tuple_data into the row batch's mempool and converting all offsets in the data back into pointers. TODO: figure out how to transfer the data from input_batch to this RowBatch (so that we don't need to make yet another copy)
Definition at line 57 of file row-batch.cc.
References impala::Codec::CreateDecompressor(), impala::Status::GetDetail(), GetRow(), impala::Tuple::GetStringSlot(), impala::TupleRow::GetTuple(), mem_tracker_, num_rows_, offset, impala::Status::ok(), row_desc_, tuple_data_pool_, impala::RowDescriptor::tuple_descriptors(), tuple_ptrs_, and tuple_ptrs_size_.
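The offset-to-pointer conversion described for this constructor can be illustrated with a minimal sketch. It is not Impala's implementation: `SerializedBatch` and `OffsetsToPointers` are hypothetical names, a `std::string` stands in for the mempool, and treating a negative offset as a NULL tuple is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the serialized form: a flat byte buffer plus one
// offset per tuple. A negative offset marks a NULL tuple (an assumption).
struct SerializedBatch {
  std::string tuple_data;
  std::vector<int32_t> tuple_offsets;
};

// Copies the serialized bytes into 'pool' (standing in for the batch's
// mempool) and turns every offset into a pointer into that copy.
inline std::vector<const char*> OffsetsToPointers(const SerializedBatch& in,
                                                  std::string* pool) {
  *pool = in.tuple_data;  // the extra copy that the TODO above refers to
  std::vector<const char*> ptrs;
  ptrs.reserve(in.tuple_offsets.size());
  for (int32_t offset : in.tuple_offsets) {
    if (offset < 0) {
      ptrs.push_back(nullptr);
    } else {
      assert(static_cast<std::size_t>(offset) < pool->size());
      ptrs.push_back(pool->data() + offset);
    }
  }
  return ptrs;
}
```

The returned pointers stay valid only while `pool` is left unmodified, which mirrors the requirement that the batch's own pool must outlive the rows that point into it.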
impala::RowBatch::~RowBatch | ( | ) |
Releases all resources accumulated at this row batch. This includes
Definition at line 137 of file row-batch.cc.
References io_buffers_, tuple_data_pool_, and tuple_streams_.
void impala::RowBatch::AcquireState | ( | RowBatch * | src | ) |
Acquires state from the 'src' row batch into this row batch. This includes all IO buffers and tuple data. This row batch must be empty and have the same row descriptor as the src batch. This is used for scan nodes which produce RowBatches asynchronously. Typically, an ExecNode is handed a row batch to populate (pull model) but ScanNodes have multiple threads which push row batches. TODO: this is wasteful and makes a copy that's unnecessary. Think about cleaning this up. TODO: rename this or unify with TransferResourceOwnership()
Definition at line 271 of file row-batch.cc.
References auxiliary_mem_usage_, impala::DiskIoMgr::BufferDescriptor::buffer_len(), capacity_, impala::RowDescriptor::Equals(), has_in_flight_row_, io_buffers_, mem_tracker_, need_to_return_, num_rows_, num_tuples_per_row_, row_desc_, impala::DiskIoMgr::BufferDescriptor::SetMemTracker(), tuple_data_pool_, tuple_ptrs_, tuple_ptrs_size_, and tuple_streams_.
Referenced by impala::HdfsScanNode::GetNextInternal().
void impala::RowBatch::AddIoBuffer | ( | DiskIoMgr::BufferDescriptor * | buffer | ) |
Add io buffer to this row batch.
Definition at line 211 of file row-batch.cc.
References auxiliary_mem_usage_, impala::DiskIoMgr::BufferDescriptor::buffer_len(), io_buffers_, mem_tracker_, and impala::DiskIoMgr::BufferDescriptor::SetMemTracker().
Referenced by impala::ScannerContext::Stream::ReleaseCompletedResources().
int impala::RowBatch::AddRow | ( | ) |
inline |
Definition at line 100 of file row-batch.h.
References AddRows().
Referenced by impala::SelectNode::CopyRows(), impala::SimpleTupleStreamTest::CreateIntBatch(), impala::RowBatchListTest::CreateRowBatch(), impala::DataStreamTest::CreateRowBatch(), impala::SimpleTupleStreamTest::CreateStringBatch(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HdfsScanner::GetMemory(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::DataSourceScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::SortedRunMerger::GetNext(), impala::Sorter::Run::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::OutputNullAwareProbeRows(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
int impala::RowBatch::AddRows | ( | int | n | ) |
inline |
Add n rows of tuple pointers after the last committed row and return its index. The rows are uninitialized and each tuple of the row must be set. Returns INVALID_ROW_INDEX if the row batch cannot fit n rows. Two consecutive AddRow() calls without a CommitLastRow() between them have the same effect as a single call.
Definition at line 94 of file row-batch.h.
References capacity_, has_in_flight_row_, INVALID_ROW_INDEX, and num_rows_.
Referenced by AddRow(), impala::BufferedTupleStream::GetNextInternal(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
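The add/commit protocol described above can be sketched with a simplified stand-in. `MiniRowBatch` and `FillBatch` are hypothetical names, rows are plain `int` values instead of tuple pointers, and the bookkeeping is inferred from the member descriptions on this page rather than copied from row-batch.h.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in that models only the row bookkeeping.
class MiniRowBatch {
 public:
  static constexpr int INVALID_ROW_INDEX = -1;

  explicit MiniRowBatch(int capacity)
      : capacity_(capacity), rows_(capacity, 0) {}

  // Reserves n uninitialized rows after the last committed row.
  int AddRows(int n) {
    if (num_rows_ + n > capacity_) return INVALID_ROW_INDEX;
    has_in_flight_row_ = true;
    return num_rows_;
  }
  int AddRow() { return AddRows(1); }

  // Makes the n in-flight rows part of the batch.
  void CommitRows(int n) {
    assert(num_rows_ + n <= capacity_);
    num_rows_ += n;
    has_in_flight_row_ = false;
  }
  void CommitLastRow() { CommitRows(1); }

  int* GetRow(int row_idx) { return &rows_[row_idx]; }
  int num_rows() const { return num_rows_; }

 private:
  int capacity_;
  int num_rows_ = 0;
  bool has_in_flight_row_ = false;
  std::vector<int> rows_;
};

// Typical producer loop: add a row, fill it, commit it, stop when full.
inline int FillBatch(MiniRowBatch* batch, const std::vector<int>& values) {
  int written = 0;
  for (int v : values) {
    int idx = batch->AddRow();
    if (idx == MiniRowBatch::INVALID_ROW_INDEX) break;
    *batch->GetRow(idx) = v;
    batch->CommitLastRow();
    ++written;
  }
  return written;
}
```

Because `AddRows()` does not advance `num_rows_`, calling `AddRow()` twice without a commit returns the same index both times, which matches the behaviour documented above.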
void impala::RowBatch::AddTupleStream | ( | BufferedTupleStream * | stream | ) |
Add tuple stream to this row batch. The row batch must call Close() on the stream when freeing resources.
Definition at line 218 of file row-batch.cc.
References auxiliary_mem_usage_, impala::BufferedTupleStream::byte_size(), and tuple_streams_.
Referenced by impala::PartitionedHashJoinNode::Partition::Close().
bool impala::RowBatch::AtCapacity | ( | ) |
inline |
Returns true if the row batch has filled all the rows or has accumulated enough memory.
Definition at line 120 of file row-batch.h.
References AT_CAPACITY_MEM_USAGE, auxiliary_mem_usage_, capacity_, need_to_return_, num_rows_, and num_tuple_streams().
Referenced by AtCapacity(), impala::HdfsScanner::CommitRows(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::SelectNode::GetNext(), impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::DataSourceScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::SortedRunMerger::GetNext(), impala::PartitionedHashJoinNode::GetNext(), impala::Sorter::Run::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::ExchangeNode::GetNextMerging(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::HashJoinNode::LeftJoinGetNext(), impala::PartitionedHashJoinNode::NextProbeRowBatch(), impala::PartitionedHashJoinNode::NextSpilledProbeRowBatch(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::OutputNullAwareProbeRows(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
bool impala::RowBatch::AtCapacity | ( | MemPool * | tuple_pool | ) |
inline |
Returns true if the row batch has filled all the rows or has accumulated enough memory. tuple_pool is an intermediate memory pool containing tuple data that will eventually be attached to this row batch. We need to make sure the tuple pool does not accumulate excessive memory.
Definition at line 129 of file row-batch.h.
References AT_CAPACITY_MEM_USAGE, AtCapacity(), num_rows_, and impala::MemPool::total_allocated_bytes().
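A possible shape for the two capacity checks is sketched below. Only the list of referenced members comes from this page; how they are combined is an assumption, and `BatchState` is a hypothetical flattened view of those members rather than the real class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flattened view of the members the two overloads reference.
struct BatchState {
  static constexpr int AT_CAPACITY_MEM_USAGE = 8 * 1024 * 1024;
  int num_rows = 0;
  int capacity = 0;
  int64_t auxiliary_mem_usage = 0;
  int num_tuple_streams = 0;
  bool need_to_return = false;
};

// Assumed rule: full on rows, on attached memory, on any attached tuple
// stream, or when a streaming component has flagged the batch for return.
inline bool AtCapacity(const BatchState& b) {
  return b.num_rows == b.capacity ||
         b.auxiliary_mem_usage >= BatchState::AT_CAPACITY_MEM_USAGE ||
         b.num_tuple_streams > 0 || b.need_to_return;
}

// Assumed rule for the MemPool overload: an intermediate pool that has grown
// past the limit also ends the batch, provided the batch holds a row.
inline bool AtCapacity(const BatchState& b, int64_t pool_allocated_bytes) {
  return AtCapacity(b) ||
         (b.num_rows > 0 &&
          pool_allocated_bytes > BatchState::AT_CAPACITY_MEM_USAGE);
}
```

The second overload delegates to the first, consistent with the References line above, so callers that fill an intermediate pool can use one check for both limits.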
int impala::RowBatch::capacity | ( | ) | const |
inline |
Definition at line 216 of file row-batch.h.
References capacity_.
Referenced by impala::HdfsScanner::CommitRows(), impala::HdfsScanner::GetMemory(), impala::ExchangeNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::BufferedTupleStream::GetNextInternal(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::HashJoinNode::LeftJoinGetNext(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
void impala::RowBatch::ClearRow | ( | TupleRow * | row | ) |
inline |
Definition at line 187 of file row-batch.h.
References num_tuples_per_row_.
Referenced by impala::ExchangeNode::GetNext().
void impala::RowBatch::CommitLastRow | ( | ) |
inline |
Definition at line 109 of file row-batch.h.
References CommitRows().
Referenced by impala::SelectNode::CopyRows(), impala::SimpleTupleStreamTest::CreateIntBatch(), impala::RowBatchListTest::CreateRowBatch(), impala::DataStreamTest::CreateRowBatch(), impala::SimpleTupleStreamTest::CreateStringBatch(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::DataSourceScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::SortedRunMerger::GetNext(), impala::Sorter::Run::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::OutputNullAwareProbeRows(), and impala::HdfsScanner::WriteEmptyTuples().
void impala::RowBatch::CommitRows | ( | int | n | ) |
inline |
Definition at line 102 of file row-batch.h.
References capacity_, has_in_flight_row_, and num_rows_.
Referenced by CommitLastRow(), impala::HdfsScanner::CommitRows(), impala::PartitionedHashJoinNode::GetNext(), impala::BufferedTupleStream::GetNextInternal(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
void impala::RowBatch::CopyRow | ( | TupleRow * | src, |
TupleRow * | dest | ||
) |
inline |
Definition at line 173 of file row-batch.h.
References num_tuples_per_row_.
Referenced by impala::SelectNode::CopyRows(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::OutputNullAwareProbeRows(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::HashJoinNode::ProcessProbeBatch(), and impala::PartitionedHashJoinNode::ProcessProbeBatch().
void impala::RowBatch::CopyRows | ( | int | dest, |
int | src, | ||
int | num_rows | ||
) |
inline |
Copy 'num_rows' rows from 'src' to 'dest' within the batch. Useful for exec nodes that skip an offset and have copied more rows than necessary.
Definition at line 179 of file row-batch.h.
References capacity_, num_tuples_per_row_, and tuple_ptrs_.
Referenced by impala::SortNode::GetNext(), and impala::ExchangeNode::GetNextMerging().
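Since each row is a run of `num_tuples_per_row_` tuple pointers in `tuple_ptrs_`, moving rows within the batch reduces to moving a block of pointers. The free function below is a hypothetical sketch of that idea, not the member function itself. The `dest <= src` precondition is an assumption based on the skip-an-offset use case, and `std::memmove` is used because the source and destination ranges may overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// 'tuple_ptrs' holds capacity * num_tuples_per_row pointers, row after row.
inline void CopyRows(std::vector<void*>* tuple_ptrs, int num_tuples_per_row,
                     int dest, int src, int num_rows) {
  assert(dest <= src);  // assumed: rows are only shifted towards the front
  const std::size_t capacity = tuple_ptrs->size() / num_tuples_per_row;
  assert(static_cast<std::size_t>(src + num_rows) <= capacity);
  void** base = tuple_ptrs->data();
  // Only the pointers move; the tuple data they refer to is untouched.
  std::memmove(base + dest * num_tuples_per_row,
               base + src * num_tuples_per_row,
               sizeof(void*) * num_rows * num_tuples_per_row);
}
```

A caller that skipped the first two rows of a four-row batch would call `CopyRows(&ptrs, 1, 0, 2, 2)` and then shrink the row count with `set_num_rows()`.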
int impala::RowBatch::GetBatchSize | ( | const TRowBatch & | batch | ) |
static |
Utility function: returns total size of batch.
Definition at line 264 of file row-batch.cc.
Referenced by impala::DataStreamRecvr::SenderQueue::AddBatch(), Serialize(), and impala::DataStreamSender::SerializeBatch().
TupleRow * impala::RowBatch::GetRow | ( | int | row_idx | ) |
inline |
Definition at line 140 of file row-batch.h.
References has_in_flight_row_, num_rows_, num_tuples_per_row_, and tuple_ptrs_.
Referenced by impala::Sorter::Run::AddBatch(), impala::HBaseTableWriter::AppendRowBatch(), impala::HdfsSequenceTableWriter::AppendRowBatch(), impala::HdfsTextTableWriter::AppendRowBatch(), impala::HdfsParquetTableWriter::AppendRowBatch(), impala::HdfsAvroTableWriter::AppendRowBatch(), impala::SelectNode::CopyRows(), impala::SimpleTupleStreamTest::CreateIntBatch(), impala::RowBatchListTest::CreateRowBatch(), impala::DataStreamTest::CreateRowBatch(), impala::SimpleTupleStreamTest::CreateStringBatch(), impala::SortedRunMerger::BatchedRowSupplier::current_row(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HdfsScanner::GetMemory(), impala::HBaseScanNode::GetNext(), impala::TopNNode::GetNext(), impala::ExchangeNode::GetNext(), impala::DataSourceScanNode::GetNext(), impala::HashJoinNode::GetNext(), impala::AggregationNode::GetNext(), impala::SortedRunMerger::GetNext(), impala::Sorter::Run::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::DataStreamTest::GetNextBatch(), impala::BufferedTupleStream::GetNextInternal(), impala::AnalyticEvalNode::GetNextOutputBatch(), impala::PlanFragmentExecutor::OpenInternal(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::OutputNullAwareProbeRows(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::PrintBatch(), impala::PartitionedAggregationNode::ProcessBatch(), impala::PartitionedAggregationNode::ProcessBatchNoGrouping(), impala::HashJoinNode::ProcessBuildBatch(), impala::PartitionedHashJoinNode::ProcessBuildBatch(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), impala::AggregationNode::ProcessRowBatchNoGrouping(), impala::AggregationNode::ProcessRowBatchWithGrouping(), impala::DataStreamTest::ReadStream(), impala::DataStreamTest::ReadStreamMerging(), impala::SimpleTupleStreamTest::ReadValues(), RowBatch(), impala::DataStreamSender::Send(), 
impala::HdfsTableSink::Send(), Serialize(), impala::TEST_F(), impala::SimpleTupleStreamTest::TestIntValuesInterleaved(), impala::SimpleTupleStreamTest::TestValues(), TotalByteSize(), and impala::HdfsScanner::WriteEmptyTuples().
void impala::RowBatch::MarkNeedToReturn | ( | ) |
inline |
Called to indicate this row batch must be returned up the operator tree. This is used to control memory management for streaming rows. TODO: consider using this mechanism instead of AddIoBuffer/AddTupleStream. This is the property we need rather than meticulously passing resources up the operator tree.
Definition at line 167 of file row-batch.h.
References need_to_return_.
Referenced by impala::PartitionedAggregationNode::GetNext(), and impala::BufferedTupleStream::GetNextInternal().
int impala::RowBatch::MaxTupleBufferSize | ( | ) |
Computes the maximum size needed to store tuple data for this row batch.
Definition at line 325 of file row-batch.cc.
References AT_CAPACITY_MEM_USAGE, capacity_, impala::RowDescriptor::GetRowSize(), num_rows(), and row_desc_.
Referenced by impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), and impala::DataSourceScanNode::GetNext().
int impala::RowBatch::num_io_buffers | ( | ) | const |
inline |
Definition at line 149 of file row-batch.h.
References io_buffers_.
Referenced by impala::ExecNode::RowBatchQueue::Cleanup(), impala::CrossJoinNode::ConstructBuildSide(), and impala::HdfsScanNode::GetNextInternal().
int impala::RowBatch::num_rows | ( | ) | const |
inline |
Definition at line 215 of file row-batch.h.
References num_rows_.
Referenced by impala::Sorter::Run::AddBatch(), impala::RowBatchList::AddRowBatch(), impala::HBaseTableWriter::AppendRowBatch(), impala::HdfsSequenceTableWriter::AppendRowBatch(), impala::HdfsTextTableWriter::AppendRowBatch(), impala::HdfsParquetTableWriter::AppendRowBatch(), impala::HdfsAvroTableWriter::AppendRowBatch(), impala::HdfsScanner::CommitRows(), impala::HdfsScanner::GetMemory(), impala::SortNode::GetNext(), impala::CrossJoinNode::GetNext(), impala::PlanFragmentExecutor::GetNext(), impala::BufferedTupleStream::GetNextInternal(), impala::HdfsScanNode::GetNextInternal(), impala::ExchangeNode::GetNextMerging(), impala::HashJoinNode::LeftJoinGetNext(), MaxTupleBufferSize(), impala::SortedRunMerger::BatchedRowSupplier::Next(), impala::PlanFragmentExecutor::OpenInternal(), impala::PartitionedHashJoinNode::OutputUnmatchedBuild(), impala::PrintBatch(), impala::PartitionedAggregationNode::ProcessBatch(), impala::PartitionedAggregationNode::ProcessBatchNoGrouping(), impala::HashJoinNode::ProcessBuildBatch(), impala::PartitionedHashJoinNode::ProcessBuildBatch(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), impala::AggregationNode::ProcessRowBatchNoGrouping(), impala::AggregationNode::ProcessRowBatchWithGrouping(), impala::DataStreamTest::ReadStream(), impala::DataStreamTest::ReadStreamMerging(), impala::SimpleTupleStreamTest::ReadValues(), impala::HBaseTableSink::Send(), impala::DataStreamSender::Send(), impala::HdfsTableSink::Send(), impala::DataStreamSender::SerializeBatch(), set_num_rows(), impala::TEST_F(), impala::SimpleTupleStreamTest::TestIntValuesInterleaved(), impala::SimpleTupleStreamTest::TestValues(), and impala::HdfsScanner::WriteEmptyTuples().
int impala::RowBatch::num_tuple_streams | ( | ) | const |
inline |
void impala::RowBatch::Reset | ( | ) |
Resets the row batch, returning all resources it has accumulated.
Definition at line 224 of file row-batch.cc.
References auxiliary_mem_usage_, has_in_flight_row_, io_buffers_, mem_tracker_, need_to_return_, num_rows_, tuple_data_pool_, tuple_ptrs_, tuple_ptrs_size_, and tuple_streams_.
Referenced by impala::TopNNode::Open(), impala::DataStreamTest::ReadStreamMerging(), impala::SimpleTupleStreamTest::ReadValues(), impala::SortNode::SortInput(), impala::SimpleTupleStreamTest::TestIntValuesInterleaved(), impala::SimpleTupleStreamTest::TestValues(), and TransferResourceOwnership().
int impala::RowBatch::row_byte_size | ( | ) |
inline |
Definition at line 147 of file row-batch.h.
References num_tuples_per_row_.
Referenced by impala::TupleRow::next_row(), impala::HdfsScanner::next_row(), impala::HdfsSequenceScanner::ProcessDecompressedBlock(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and impala::HdfsTextScanner::WriteFields().
const RowDescriptor & impala::RowBatch::row_desc | ( | ) | const |
inline |
Definition at line 218 of file row-batch.h.
References row_desc_.
Referenced by impala::Sorter::Run::AddBatch(), impala::ExchangeNode::GetNext(), impala::BufferedTupleStream::GetNextInternal(), and impala::PrintBatch().
int impala::RowBatch::Serialize | ( | TRowBatch * | output_batch | ) |
Create a serialized version of this row batch in output_batch, attaching all of the data it references to output_batch.tuple_data. output_batch.tuple_data will be snappy-compressed unless the compressed data is larger than the uncompressed data. Use output_batch.is_compressed to determine whether tuple_data is compressed. If an in-flight row is present in this row batch, it is ignored. This function does not Reset(). Returns the uncompressed serialized size (this will be the true size of output_batch if tuple_data is actually uncompressed).
Definition at line 147 of file row-batch.cc.
References compression_scratch_, impala::Codec::CreateCompressor(), impala::Tuple::DeepCopy(), GetBatchSize(), impala::Status::GetDetail(), GetRow(), impala::TupleRow::GetTuple(), LIKELY, num_rows_, num_tuples_per_row_, offset, impala::Status::ok(), row_desc_, TotalByteSize(), impala::RowDescriptor::ToThrift(), impala::RowDescriptor::tuple_descriptors(), and VLOG_ROW.
Referenced by impala::DataStreamSender::SerializeBatch().
void impala::RowBatch::set_num_rows | ( | int | num_rows | ) |
inline |
Set function can be used to reduce the number of rows in the batch. This is only used in the limit case where more rows were added than necessary.
Definition at line 113 of file row-batch.h.
References num_rows(), and num_rows_.
Referenced by impala::SortNode::GetNext(), impala::HdfsScanNode::GetNextInternal(), and impala::ExchangeNode::GetNextMerging().
int impala::RowBatch::TotalByteSize | ( | ) |
The total size of all data represented in this row batch (tuples and referenced string data). This is the size of the row batch after removing all gaps in the auxiliary memory (i.e. the smallest footprint for the row batch).
Definition at line 303 of file row-batch.cc.
References GetRow(), impala::Tuple::GetStringSlot(), impala::TupleRow::GetTuple(), impala::Tuple::IsNull(), impala::StringValue::len, num_rows_, row_desc_, and impala::RowDescriptor::tuple_descriptors().
Referenced by Serialize().
void impala::RowBatch::TransferResourceOwnership | ( | RowBatch * | dest | ) |
Transfer ownership of resources to dest. This includes tuple data in mem pool and io buffers.
Definition at line 243 of file row-batch.cc.
References auxiliary_mem_usage_, impala::DiskIoMgr::BufferDescriptor::buffer_len(), io_buffers_, mem_tracker_, need_to_return_, Reset(), impala::DiskIoMgr::BufferDescriptor::SetMemTracker(), tuple_data_pool_, tuple_ptrs_, and tuple_streams_.
Referenced by impala::SortedRunMerger::BatchedRowSupplier::Next(), and impala::DataStreamRecvr::TransferAllResources().
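The ownership hand-off can be pictured with the sketch below. `ResourceBatch` and `IoBuffer` are hypothetical stand-ins, only IO buffers are modelled (not the mem pool or tuple streams), and the exact order of operations is an assumption guided by the References line above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct IoBuffer {  // hypothetical stand-in for DiskIoMgr::BufferDescriptor
  int64_t len;
};

struct ResourceBatch {
  std::vector<IoBuffer*> io_buffers;
  int64_t auxiliary_mem_usage = 0;
  bool need_to_return = false;
  int num_rows = 0;

  void AddIoBuffer(IoBuffer* buffer) {
    io_buffers.push_back(buffer);
    auxiliary_mem_usage += buffer->len;
  }

  // In this sketch Reset() only forgets the buffers; the real class returns
  // any buffers it still owns.
  void Reset() {
    io_buffers.clear();
    auxiliary_mem_usage = 0;
    need_to_return = false;
    num_rows = 0;
  }

  void TransferResourceOwnership(ResourceBatch* dest) {
    for (IoBuffer* buffer : io_buffers) dest->AddIoBuffer(buffer);
    io_buffers.clear();  // 'dest' owns them now; Reset() must not see them
    if (need_to_return) dest->need_to_return = true;
    Reset();
  }
};
```

Each buffer ends up owned by exactly one batch, in line with the note on `io_buffers_` that buffers are not reference counted.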
MemPool * impala::RowBatch::tuple_data_pool | ( | ) |
inline |
Definition at line 148 of file row-batch.h.
References tuple_data_pool_.
Referenced by impala::HdfsScanner::AttachPool(), impala::SimpleTupleStreamTest::CreateIntBatch(), impala::RowBatchListTest::CreateRowBatch(), impala::DataStreamTest::CreateRowBatch(), impala::SimpleTupleStreamTest::CreateStringBatch(), impala::UnionNode::EvalAndMaterializeExprs(), impala::HdfsScanner::GetMemory(), impala::UnionNode::GetNext(), impala::HBaseScanNode::GetNext(), impala::DataSourceScanNode::GetNext(), impala::AggregationNode::GetNext(), impala::SortedRunMerger::GetNext(), impala::AnalyticEvalNode::GetNext(), impala::PartitionedAggregationNode::GetNext(), impala::ScannerContext::Stream::ReleaseCompletedResources(), and impala::HdfsScanner::StartNewRowBatch().
const int impala::RowBatch::AT_CAPACITY_MEM_USAGE = 8 * 1024 * 1024 |
static |
Max memory that this row batch can accumulate in tuple_data_pool_ before it is considered at capacity.
Definition at line 222 of file row-batch.h.
Referenced by AtCapacity(), impala::Sorter::EstimateMergeMem(), and MaxTupleBufferSize().
int64_t impala::RowBatch::auxiliary_mem_usage_ |
private |
Sum of all auxiliary bytes. This includes IoBuffers and memory from TransferResourceOwnership().
Definition at line 246 of file row-batch.h.
Referenced by AcquireState(), AddIoBuffer(), AddTupleStream(), AtCapacity(), Reset(), and TransferResourceOwnership().
int impala::RowBatch::capacity_ |
private |
Definition at line 234 of file row-batch.h.
Referenced by AcquireState(), AddRows(), AtCapacity(), capacity(), CommitRows(), CopyRows(), MaxTupleBufferSize(), and RowBatch().
std::string impala::RowBatch::compression_scratch_ |
private |
String to write compressed tuple data to in Serialize(). This is a string so we can swap() with the string in the TRowBatch we're serializing to (we don't compress directly into the TRowBatch in case the compressed data is longer than the uncompressed data). Swapping avoids copying data to the TRowBatch and avoids excess memory allocations: since we reuse RowBatchs and TRowBatchs, and assuming all row batches are roughly the same size, all strings will eventually be allocated to the right size.
Definition at line 270 of file row-batch.h.
Referenced by Serialize().
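The compress-into-scratch-then-swap pattern described for this member might look like the sketch below. `OutBatch` and `MaybeCompress` are hypothetical names, the codec is passed in as a callback instead of using Snappy, and treating a same-size result as not worth keeping is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>

struct OutBatch {  // hypothetical stand-in for TRowBatch
  std::string tuple_data;
  bool is_compressed = false;
};

using CompressFn = std::function<void(const std::string&, std::string*)>;

// Compresses into 'scratch' first, then swaps the result into 'out' only if
// it is smaller. 'scratch' plays the role of compression_scratch_.
inline void MaybeCompress(OutBatch* out, std::string* scratch,
                          const CompressFn& compress) {
  compress(out->tuple_data, scratch);
  if (scratch->size() < out->tuple_data.size()) {
    // swap() exchanges the buffers without copying; the scratch string keeps
    // the larger allocation and can be reused for the next batch.
    out->tuple_data.swap(*scratch);
    out->is_compressed = true;
  } else {
    out->is_compressed = false;
  }
}
```

When compression does not help, `tuple_data` is left untouched, which is why the compressed bytes are never written directly into the output batch.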
bool impala::RowBatch::has_in_flight_row_ |
private |
All members below need to be handled in RowBatch::AcquireState()
Definition at line 232 of file row-batch.h.
Referenced by AcquireState(), AddRows(), CommitRows(), GetRow(), and Reset().
const int impala::RowBatch::INVALID_ROW_INDEX = -1 |
static |
Definition at line 87 of file row-batch.h.
Referenced by AddRows(), impala::SelectNode::CopyRows(), impala::UnionNode::EvalAndMaterializeExprs(), impala::CrossJoinNode::ProcessLeftChildBatch(), impala::HashJoinNode::ProcessProbeBatch(), and impala::HdfsScanner::WriteEmptyTuples().
std::vector< DiskIoMgr::BufferDescriptor * > impala::RowBatch::io_buffers_ |
private |
IO buffers currently owned by this row batch. Ownership of IO buffers transfers between row batches. Any IO buffer will be owned by at most one row batch (i.e. they are not ref counted) so most row batches don't own any.
Definition at line 258 of file row-batch.h.
Referenced by AcquireState(), AddIoBuffer(), num_io_buffers(), Reset(), TransferResourceOwnership(), and ~RowBatch().
MemTracker * impala::RowBatch::mem_tracker_ |
private |
Definition at line 228 of file row-batch.h.
Referenced by AcquireState(), AddIoBuffer(), Reset(), RowBatch(), and TransferResourceOwnership().
bool impala::RowBatch::need_to_return_ |
private |
If true, this batch is considered at capacity. This is explicitly set by streaming components that return rows via row batches.
Definition at line 250 of file row-batch.h.
Referenced by AcquireState(), AtCapacity(), MarkNeedToReturn(), Reset(), and TransferResourceOwnership().
int impala::RowBatch::num_rows_ |
private |
Definition at line 233 of file row-batch.h.
Referenced by AcquireState(), AddRows(), AtCapacity(), CommitRows(), GetRow(), num_rows(), Reset(), RowBatch(), Serialize(), set_num_rows(), and TotalByteSize().
int impala::RowBatch::num_tuples_per_row_ |
private |
Definition at line 236 of file row-batch.h.
Referenced by AcquireState(), ClearRow(), CopyRow(), CopyRows(), GetRow(), row_byte_size(), RowBatch(), and Serialize().
RowDescriptor impala::RowBatch::row_desc_ |
private |
Definition at line 237 of file row-batch.h.
Referenced by AcquireState(), MaxTupleBufferSize(), row_desc(), RowBatch(), Serialize(), and TotalByteSize().
boost::scoped_ptr< MemPool > impala::RowBatch::tuple_data_pool_ |
private |
Memory pool holding (some of the) data referenced by rows.
Definition at line 253 of file row-batch.h.
Referenced by AcquireState(), Reset(), RowBatch(), TransferResourceOwnership(), tuple_data_pool(), and ~RowBatch().
Tuple ** impala::RowBatch::tuple_ptrs_ |
private |
array of pointers (w/ capacity_ * num_tuples_per_row_ elements) TODO: replace w/ tr1 array?
Definition at line 241 of file row-batch.h.
Referenced by AcquireState(), CopyRows(), GetRow(), Reset(), RowBatch(), and TransferResourceOwnership().
int impala::RowBatch::tuple_ptrs_size_ |
private |
Definition at line 242 of file row-batch.h.
Referenced by AcquireState(), Reset(), and RowBatch().
std::vector< BufferedTupleStream * > impala::RowBatch::tuple_streams_ |
private |
Tuple streams currently owned by this row batch.
Definition at line 261 of file row-batch.h.
Referenced by AcquireState(), AddTupleStream(), num_tuple_streams(), Reset(), TransferResourceOwnership(), and ~RowBatch().