Impala
Impalaistheopensource,nativeanalyticdatabaseforApacheHadoop.
|
The underlying memory management is done by the BufferedBlockMgr. More...
#include <buffered-tuple-stream.h>
Classes | |
struct | RowIdx |
Public Member Functions | |
BufferedTupleStream (RuntimeState *state, const RowDescriptor &row_desc, BufferedBlockMgr *block_mgr, BufferedBlockMgr::Client *client, bool use_initial_small_buffers=true, bool delete_on_read=false, bool read_write=false) | |
Status | Init (RuntimeProfile *profile=NULL, bool pinned=true) |
Status | SwitchToIoBuffers (bool *got_buffer) |
bool | AddRow (TupleRow *row, uint8_t **dst=NULL) |
uint8_t * | AllocateRow (int size) |
void | GetTupleRow (const RowIdx &idx, TupleRow *row) const |
Status | PrepareForRead (bool *got_buffer=NULL) |
Status | PinStream (bool already_reserved, bool *pinned) |
Status | UnpinStream (bool all=false) |
Status | GetNext (RowBatch *batch, bool *eos, std::vector< RowIdx > *indices=NULL) |
Status | GetRows (boost::scoped_ptr< RowBatch > *batch, bool *got_rows) |
void | Close () |
Must be called once at the end to cleanup all resources. Idempotent. More... | |
Status | status () const |
int64_t | num_rows () const |
Number of rows in the stream. More... | |
int64_t | rows_returned () const |
Number of rows returned via GetNext(). More... | |
int64_t | byte_size () const |
Returns the byte size necessary to store the entire stream in memory. More... | |
int64_t | bytes_in_mem (bool ignore_current) const |
bool | is_pinned () const |
int | blocks_pinned () const |
int | blocks_unpinned () const |
bool | has_read_block () const |
bool | has_write_block () const |
bool | using_small_buffers () const |
std::string | DebugString () const |
template<bool HasNullableTuple> | |
Status | GetNextInternal (RowBatch *batch, bool *eos, vector< RowIdx > *indices) |
Private Member Functions | |
template<bool HasNullableTuple> | |
bool | DeepCopyInternal (TupleRow *row, uint8_t **dst) |
bool | DeepCopy (TupleRow *row, uint8_t **dst) |
Wrapper of the templated DeepCopyInternal() function. More... | |
Status | NewBlockForWrite (int min_size, bool *got_block) |
Status | NextBlockForRead () |
int | ComputeRowSize (TupleRow *row) const |
Returns the byte size of this row when encoded in a block. More... | |
Status | UnpinBlock (BufferedBlockMgr::Block *block) |
Unpins block if it is an io sized block and updates tracking stats. More... | |
template<bool HasNullableTuple> | |
Status | GetNextInternal (RowBatch *batch, bool *eos, std::vector< RowIdx > *indices) |
Templated GetNext implementation. More... | |
int | ComputeNumNullIndicatorBytes (int block_size) const |
Computes the number of bytes needed for null indicators for a block of 'block_size'. More... | |
Private Attributes | |
bool | use_small_buffers_ |
If true, this stream is still using small buffers. More... | |
const bool | delete_on_read_ |
If true, blocks are deleted after they are read. More... | |
const bool | read_write_ |
RuntimeState *const | state_ |
Runtime state instance used to check for cancellation. Not owned. More... | |
const RowDescriptor & | desc_ |
Description of rows stored in the stream. More... | |
const bool | nullable_tuple_ |
Whether any tuple in the rows is nullable. More... | |
int | fixed_tuple_row_size_ |
Sum of the fixed length portion of all the tuples in desc_. More... | |
uint32_t | null_indicators_read_block_ |
uint32_t | null_indicators_write_block_ |
std::vector< std::pair< int, std::vector< SlotDescriptor * > > > | string_slots_ |
Vector of all the strings slots grouped by tuple_idx. More... | |
BufferedBlockMgr * | block_mgr_ |
Block manager and client used to allocate, pin and release blocks. Not owned. More... | |
BufferedBlockMgr::Client * | block_mgr_client_ |
std::list < BufferedBlockMgr::Block * > | blocks_ |
List of blocks in the stream. More... | |
int64_t | total_byte_size_ |
Total size of blocks_, including small blocks. More... | |
std::list < BufferedBlockMgr::Block * > ::iterator | read_block_ |
std::vector< uint8_t * > | block_start_idx_ |
uint8_t * | read_ptr_ |
Current ptr offset in read_block_'s buffer. More... | |
uint32_t | read_tuple_idx_ |
Current idx of the tuple read from the read_block_ buffer. More... | |
uint32_t | write_tuple_idx_ |
Current idx of the tuple written at the write_block_ buffer. More... | |
int64_t | read_bytes_ |
Bytes read in read_block_. More... | |
int64_t | rows_returned_ |
Number of rows returned to the caller from GetNext(). More... | |
int | read_block_idx_ |
The block index of the current read block. More... | |
BufferedBlockMgr::Block * | write_block_ |
The current block for writing. NULL if there is no available block to write to. More... | |
int | num_pinned_ |
int | num_small_blocks_ |
The total number of small blocks in blocks_;. More... | |
bool | closed_ |
Status | status_ |
int64_t | num_rows_ |
Number of rows stored in the stream. More... | |
bool | pinned_ |
RuntimeProfile::Counter * | pin_timer_ |
Counters added by this object to the parent runtime profile. More... | |
RuntimeProfile::Counter * | unpin_timer_ |
RuntimeProfile::Counter * | get_new_block_timer_ |
The underlying memory management is done by the BufferedBlockMgr.
Class that provides an abstraction for a stream of tuple rows. Rows can be added to the stream and returned. Rows are returned in the order they are added.The tuple stream consists of a number of small (less than io sized blocks) before an arbitrary number of io sized blocks. The smaller blocks do not spill and are there to lower the minimum buffering requirements. For example, an operator that needs to maintain 64 streams (1 buffer per partition) would need, by default, 64 * 8MB = 512MB of buffering. A query with 5 of these operators would require 2.56 GB just to run any query, regardless of how much of that is used. This is problematic for small queries. Instead we will start with a fixed number of small buffers and only start using IO sized buffers when those fill up. The small buffers never spill. The stream will not automatically switch from using small buffers to io sized buffers. The BufferedTupleStream is not thread safe from the caller's point of view. It is expected that all the APIs are called from a single thread. Internally, the object is thread safe wrt to the underlying block mgr. Buffer management: The stream is either pinned or unpinned, set via PinStream() and UnpinStream(). Blocks are optionally deleted as they are read, set with the delete_on_read c'tor parameter. Block layout: At the header of each block, starting at position 0, there is a bitstring with null indicators for all the tuples in each row in the block. Then there are the tuple rows. We further optimize the codepaths when we know that no tuple is nullable, indicated by 'nullable_tuple_'. Tuple row layout: Tuples are stored back to back. Each tuple starts with the fixed length portion, directly followed by the var len portion. (Fixed len and var len are interleaved). If any tuple in the row is nullable, then there is a bitstring of null tuple indicators at the header of the block. The order of bits in the null indicators bitstring corresponds to the order of tuples in the block. The NULL tuples are not stored in the body of the block, only as set bits in the null indicators bitsting. The behavior of reads and writes is as follows: Read:
Definition at line 109 of file buffered-tuple-stream.h.
BufferedTupleStream::BufferedTupleStream | ( | RuntimeState * | state, |
const RowDescriptor & | row_desc, | ||
BufferedBlockMgr * | block_mgr, | ||
BufferedBlockMgr::Client * | client, | ||
bool | use_initial_small_buffers = true , |
||
bool | delete_on_read = false , |
||
bool | read_write = false |
||
) |
row_desc: description of rows stored in the stream. This is the desc for rows that are added and the rows being returned. block_mgr: Underlying block mgr that owns the data blocks. delete_on_read: Blocks are deleted after they are read. use_initial_small_buffers: If true, the initial N buffers allocated for the tuple stream use smaller than io sized buffers. read_write: Stream allows interchanging read and write operations. Requires at least two blocks may be pinned.
Definition at line 43 of file buffered-tuple-stream.cc.
References blocks_, impala::TupleDescriptor::byte_size(), desc_, fixed_tuple_row_size_, null_indicators_read_block_, null_indicators_write_block_, read_block_, impala::TupleDescriptor::string_slots(), string_slots_, and impala::RowDescriptor::tuple_descriptors().
Adds a single row to the stream. Returns false if an error occurred. BufferedTupleStream will do a deep copy of the memory in the row. *dst is the ptr to the memory (in the underlying block) that this row was copied to.
Definition at line 25 of file buffered-tuple-stream.inline.h.
References closed_, ComputeRowSize(), DeepCopy(), LIKELY, NewBlockForWrite(), impala::Status::ok(), and status_.
Referenced by impala::PartitionedHashJoinNode::AppendRow(), impala::PartitionedHashJoinNode::AppendRowStreamFull(), impala::PartitionedAggregationNode::ProcessBatch(), impala::PartitionedHashJoinNode::ProcessBuildBatch(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::PartitionedAggregationNode::Partition::Spill().
|
inline |
Allocates space to store a row of size 'size'. Returns NULL if there is not enough memory. The returned memory is guaranteed to fit on one block.
Definition at line 34 of file buffered-tuple-stream.inline.h.
References impala::BufferedBlockMgr::Block::AddRow(), impala::BufferedBlockMgr::Block::Allocate(), impala::BufferedBlockMgr::Block::BytesRemaining(), closed_, impala::BufferedBlockMgr::Block::is_pinned(), NewBlockForWrite(), num_rows_, impala::Status::ok(), status_, UNLIKELY, and write_block_.
Referenced by impala::PartitionedAggregationNode::ConstructIntermediateTuple().
|
inline |
Definition at line 244 of file buffered-tuple-stream.h.
References num_pinned_.
Referenced by impala::PartitionedHashJoinNode::NodeDebugString(), and impala::PartitionedHashJoinNode::ProcessBuildInput().
|
inline |
Definition at line 245 of file buffered-tuple-stream.h.
References blocks_, num_pinned_, and num_small_blocks_.
Referenced by PinStream().
|
inline |
Returns the byte size necessary to store the entire stream in memory.
Definition at line 237 of file buffered-tuple-stream.h.
References total_byte_size_.
Referenced by impala::RowBatch::AddTupleStream().
int64_t BufferedTupleStream::bytes_in_mem | ( | bool | ignore_current | ) | const |
Returns the byte size of the stream that is currently pinned in memory. If ignore_current is true, the write_block_ memory is not included.
Definition at line 156 of file buffered-tuple-stream.cc.
References blocks_, and write_block_.
void BufferedTupleStream::Close | ( | ) |
Must be called once at the end to cleanup all resources. Idempotent.
Definition at line 145 of file buffered-tuple-stream.cc.
References blocks_, closed_, num_pinned_, and NumPinned().
Referenced by impala::PartitionedHashJoinNode::BuildHashTables(), impala::PartitionedHashJoinNode::Close(), impala::PartitionedHashJoinNode::ProcessBuildInput(), and impala::PartitionedAggregationNode::ProcessStream().
|
private |
Computes the number of bytes needed for null indicators for a block of 'block_size'.
Definition at line 415 of file buffered-tuple-stream.cc.
References desc_, fixed_tuple_row_size_, nullable_tuple_, impala::BitUtil::RoundUpNumi64(), and impala::RowDescriptor::tuple_descriptors().
Referenced by NewBlockForWrite(), NextBlockForRead(), and PrepareForRead().
|
private |
Returns the byte size of this row when encoded in a block.
Definition at line 595 of file buffered-tuple-stream.cc.
References fixed_tuple_row_size_, impala::Tuple::GetStringSlot(), impala::TupleRow::GetTuple(), impala::Tuple::IsNull(), impala::StringValue::len, impala::SlotDescriptor::null_indicator_offset(), nullable_tuple_, string_slots_, and impala::SlotDescriptor::tuple_offset().
Referenced by AddRow().
string BufferedTupleStream::DebugString | ( | ) | const |
Definition at line 93 of file buffered-tuple-stream.cc.
References blocks_, closed_, delete_on_read_, num_pinned_, num_rows_, pinned_, read_block_, rows_returned_, and write_block_.
Referenced by DeepCopyInternal(), GetNextInternal(), and NextBlockForRead().
Wrapper of the templated DeepCopyInternal() function.
Definition at line 22 of file buffered-tuple-stream-ir.cc.
References nullable_tuple_.
Referenced by AddRow().
|
private |
Copies 'row' into write_block_. Returns false if there is not enough space in 'write_block_'. *dst is the ptr to the memory (in the underlying write block) where this row was copied to.
Definition at line 32 of file buffered-tuple-stream-ir.cc.
References impala::BufferedBlockMgr::Block::AddRow(), impala::BufferedBlockMgr::Block::Allocate(), impala::BufferedBlockMgr::Block::buffer(), impala::BufferedBlockMgr::Block::BytesRemaining(), impala::BufferedBlockMgr::Block::DebugString(), DebugString(), desc_, fixed_tuple_row_size_, impala::Tuple::GetStringSlot(), impala::TupleRow::GetTuple(), impala::BufferedBlockMgr::Block::is_pinned(), impala::Tuple::IsNull(), impala::StringValue::len, LIKELY, impala::SlotDescriptor::null_indicator_offset(), null_indicators_write_block_, num_rows_, impala::StringValue::ptr, impala::BufferedBlockMgr::Block::ReturnAllocation(), string_slots_, impala::RowDescriptor::tuple_descriptors(), impala::SlotDescriptor::tuple_offset(), UNLIKELY, write_block_, and write_tuple_idx_.
Status BufferedTupleStream::GetNext | ( | RowBatch * | batch, |
bool * | eos, | ||
std::vector< RowIdx > * | indices = NULL |
||
) |
Get the next batch of output rows. Memory is still owned by the BufferedTupleStream and must be copied out by the caller. If 'indices' is non-NULL, that is also populated for each returned row with the index for that row.
Definition at line 447 of file buffered-tuple-stream.cc.
References nullable_tuple_.
Referenced by GetRows(), impala::PartitionedHashJoinNode::NextSpilledProbeRowBatch(), impala::PartitionedHashJoinNode::OutputNullAwareNullProbe(), impala::PartitionedHashJoinNode::ProcessBuildInput(), impala::PartitionedAggregationNode::ProcessStream(), and impala::SimpleTupleStreamTest::ReadValues().
|
private |
Templated GetNext implementation.
Status impala::BufferedTupleStream::GetNextInternal | ( | RowBatch * | batch, |
bool * | eos, | ||
vector< RowIdx > * | indices | ||
) |
Definition at line 457 of file buffered-tuple-stream.cc.
References impala::RowBatch::AddRows(), blocks_, impala::RowBatch::capacity(), closed_, impala::RowBatch::CommitRows(), DebugString(), delete_on_read_, desc_, impala::RowDescriptor::Equals(), fixed_tuple_row_size_, impala::RowBatch::GetRow(), impala::Tuple::GetStringSlot(), impala::TupleRow::GetTuple(), is_pinned(), impala::Tuple::IsNull(), impala::StringValue::len, impala::RowBatch::MarkNeedToReturn(), NextBlockForRead(), impala::SlotDescriptor::null_indicator_offset(), null_indicators_read_block_, impala::RowBatch::num_rows(), num_rows_, impala::Status::OK, pinned_, impala::StringValue::ptr, read_block_, read_block_idx_, read_bytes_, read_ptr_, read_tuple_idx_, RETURN_IF_ERROR, impala::RowBatch::row_desc(), rows_returned_, impala::TupleRow::SetTuple(), string_slots_, impala::RowDescriptor::tuple_descriptors(), impala::SlotDescriptor::tuple_offset(), and UNLIKELY.
Returns all the rows in the stream in batch. This pins the entire stream in the process. *got_rows is false if the stream could not be pinned.
Definition at line 431 of file buffered-tuple-stream.cc.
References block_mgr_, block_mgr_client_, desc_, impala::BufferedBlockMgr::get_tracker(), GetNext(), num_rows(), impala::Status::OK, PinStream(), PrepareForRead(), and RETURN_IF_ERROR.
Referenced by impala::PartitionedHashJoinNode::EvaluateNullProbe().
Populates 'row' with the row at 'idx'. The stream must be pinned. The row must have been allocated with the stream's row desc.
Definition at line 49 of file buffered-tuple-stream.inline.h.
References impala::BufferedTupleStream::RowIdx::block(), block_start_idx_, blocks_, closed_, delete_on_read_, desc_, impala::BufferedTupleStream::RowIdx::idx(), is_pinned(), nullable_tuple_, impala::BufferedTupleStream::RowIdx::offset(), impala::TupleRow::SetTuple(), and impala::RowDescriptor::tuple_descriptors().
Referenced by impala::HashTable::GetRow().
|
inline |
Definition at line 246 of file buffered-tuple-stream.h.
References blocks_, and read_block_.
|
inline |
Definition at line 247 of file buffered-tuple-stream.h.
References write_block_.
Referenced by impala::PartitionedHashJoinNode::BuildHashTables(), and impala::PartitionedAggregationNode::ProcessBatch().
Status BufferedTupleStream::Init | ( | RuntimeProfile * | profile = NULL , |
bool | pinned = true |
||
) |
Initializes the tuple stream object. Must be called once before any of the other APIs. If pinned, the tuple stream starts of pinned, otherwise it is unpinned. If profile is non-NULL, counters are created.
Definition at line 116 of file buffered-tuple-stream.cc.
References ADD_TIMER, block_mgr_, block_mgr_client_, fixed_tuple_row_size_, get_new_block_timer_, INITIAL_BLOCK_SIZES, impala::BufferedBlockMgr::max_block_size(), impala::BufferedBlockMgr::MemLimitTooLowError(), NewBlockForWrite(), impala::Status::OK, pin_timer_, PrepareForRead(), read_write_, RETURN_IF_ERROR, unpin_timer_, UnpinStream(), use_small_buffers_, and write_block_.
Referenced by impala::PartitionedHashJoinNode::Prepare().
|
inline |
Definition at line 243 of file buffered-tuple-stream.h.
References pinned_.
Referenced by impala::PartitionedHashJoinNode::BuildHashTables(), GetNextInternal(), GetTupleRow(), and impala::PartitionedAggregationNode::ProcessBatch().
Gets a new block from the block_mgr_, updating write_block_ and write_tuple_idx_, and setting *got_block. If there are no blocks available, *got_block is set to false and write_block_ is unchanged. min_size is the minimum number of bytes required for this block.
Definition at line 178 of file buffered-tuple-stream.cc.
References impala::BufferedBlockMgr::Block::Allocate(), block_mgr_, block_mgr_client_, block_start_idx_, blocks_, impala::BufferedBlockMgr::Block::buffer(), closed_, ComputeNumNullIndicatorBytes(), get_new_block_timer_, impala::BufferedBlockMgr::GetNewBlock(), INITIAL_BLOCK_SIZES, impala::BufferedBlockMgr::Block::is_max_size(), impala::BufferedBlockMgr::Block::is_pinned(), impala::BufferedBlockMgr::max_block_size(), null_indicators_write_block_, num_pinned_, impala::BufferedBlockMgr::Block::num_rows(), NUM_SMALL_BLOCKS, num_small_blocks_, NumPinned(), impala::Status::OK, pinned_, impala::PrettyPrinter::Print(), read_block_, RETURN_IF_ERROR, SCOPED_TIMER, total_byte_size_, use_small_buffers_, write_block_, and write_tuple_idx_.
Referenced by AddRow(), AllocateRow(), Init(), and SwitchToIoBuffers().
|
private |
Reads the next block from the block_mgr_. This blocks if necessary. Updates read_block_, read_ptr_, read_tuple_idx_ and read_bytes_.
Definition at line 248 of file buffered-tuple-stream.cc.
References block_mgr_, block_mgr_client_, blocks_, closed_, ComputeNumNullIndicatorBytes(), DebugString(), impala::BufferedBlockMgr::DebugString(), impala::BufferedBlockMgr::Block::Delete(), delete_on_read_, impala::BufferedBlockMgr::Block::is_max_size(), null_indicators_read_block_, num_pinned_, NumPinned(), impala::Status::OK, pin_timer_, pinned_, read_block_, read_block_idx_, read_bytes_, read_ptr_, read_tuple_idx_, RETURN_IF_ERROR, SCOPED_TIMER, unpin_timer_, UnpinBlock(), and write_block_.
Referenced by GetNextInternal().
|
inline |
Number of rows in the stream.
Definition at line 231 of file buffered-tuple-stream.h.
References num_rows_.
Referenced by impala::PartitionedHashJoinNode::BuildHashTables(), impala::PartitionedHashJoinNode::CleanUpHashPartitions(), impala::PartitionedHashJoinNode::EvaluateNullProbe(), GetRows(), impala::PartitionedHashJoinNode::LargestSpilledPartition(), impala::PartitionedHashJoinNode::NextSpilledProbeRowBatch(), impala::PartitionedHashJoinNode::NodeDebugString(), impala::PartitionedHashJoinNode::PrepareNextPartition(), impala::PartitionedHashJoinNode::PrepareNullAwarePartition(), impala::PartitionedHashJoinNode::ProcessBuildInput(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::PartitionedAggregationNode::ProcessStream().
Pins all blocks in this stream and switches to pinned mode. If there is not enough memory, *pinned is set to false and the stream is unmodified. If already_reserved is true, the caller has already made a reservation on block_mgr_client_ to pin the stream.
Definition at line 357 of file buffered-tuple-stream.cc.
References block_mgr_, block_mgr_client_, block_start_idx_, blocks_, blocks_unpinned(), closed_, impala::BufferedBlockMgr::DebugString(), delete_on_read_, num_pinned_, NumPinned(), impala::Status::OK, pin_timer_, pinned_, RETURN_IF_ERROR, SCOPED_TIMER, impala::BufferedBlockMgr::TryAcquireTmpReservation(), and VLOG_QUERY.
Referenced by GetRows().
Prepares the stream for reading. If read_write_, this does not need to be called in order to begin reading, otherwise this must be called after the last AddRow() and before GetNext(). If got_buffer is NULL, this function will fail (with a bad status) if no buffer is available. If got_buffer is non-null, this function will not fail on OOM and *got_buffer is true if a buffer was pinned.
Definition at line 314 of file buffered-tuple-stream.cc.
References blocks_, closed_, ComputeNumNullIndicatorBytes(), impala::BufferedBlockMgr::Block::is_pinned(), null_indicators_read_block_, num_pinned_, NumPinned(), impala::Status::OK, pin_timer_, pinned_, read_block_, read_block_idx_, read_bytes_, read_ptr_, read_tuple_idx_, read_write_, RETURN_IF_ERROR, rows_returned_, SCOPED_TIMER, UnpinBlock(), and write_block_.
Referenced by GetRows(), Init(), impala::PartitionedHashJoinNode::PrepareNextPartition(), impala::PartitionedHashJoinNode::PrepareNullAwareNullProbe(), impala::PartitionedHashJoinNode::PrepareNullAwarePartition(), impala::PartitionedHashJoinNode::ProcessBuildInput(), and impala::PartitionedAggregationNode::ProcessStream().
|
inline |
Number of rows returned via GetNext().
Definition at line 234 of file buffered-tuple-stream.h.
References rows_returned_.
Referenced by impala::PartitionedHashJoinNode::NextSpilledProbeRowBatch().
|
inline |
Returns the status of the stream. We don't want to return a more costly Status object on AddRow() which is way that API returns a bool.
Definition at line 228 of file buffered-tuple-stream.h.
References status_.
Referenced by impala::PartitionedHashJoinNode::AppendRowStreamFull(), impala::PartitionedAggregationNode::ProcessBatch(), impala::PartitionedHashJoinNode::ProcessBuildBatch(), impala::PartitionedHashJoinNode::ProcessProbeBatch(), and impala::PartitionedAggregationNode::Partition::Spill().
Must be called for streams using small buffers to switch to IO-sized buffers. TODO: this does not seem like the best mechanism.
Definition at line 136 of file buffered-tuple-stream.cc.
References block_mgr_, impala::BufferedBlockMgr::max_block_size(), NewBlockForWrite(), impala::Status::OK, use_small_buffers_, and write_block_.
Referenced by impala::PartitionedHashJoinNode::BuildHashTables().
|
private |
Unpins block if it is an io sized block and updates tracking stats.
Definition at line 168 of file buffered-tuple-stream.cc.
References blocks_, impala::BufferedBlockMgr::Block::is_max_size(), impala::BufferedBlockMgr::Block::is_pinned(), num_pinned_, NumPinned(), impala::Status::OK, RETURN_IF_ERROR, SCOPED_TIMER, impala::BufferedBlockMgr::Block::Unpin(), and unpin_timer_.
Referenced by NextBlockForRead(), PrepareForRead(), and UnpinStream().
Unpins stream. If all is true, all blocks are unpinned, otherwise all blocks except the write_block_ and read_block_ are unpinned.
Definition at line 396 of file buffered-tuple-stream.cc.
References blocks_, closed_, impala::BufferedBlockMgr::Block::is_pinned(), impala::Status::OK, pinned_, read_block_, read_write_, RETURN_IF_ERROR, SCOPED_TIMER, unpin_timer_, UnpinBlock(), and write_block_.
Referenced by impala::PartitionedHashJoinNode::CleanUpHashPartitions(), and Init().
|
inline |
Definition at line 248 of file buffered-tuple-stream.h.
References use_small_buffers_.
Referenced by impala::PartitionedHashJoinNode::BuildHashTables().
|
private |
Block manager and client used to allocate, pin and release blocks. Not owned.
Definition at line 288 of file buffered-tuple-stream.h.
Referenced by GetRows(), Init(), NewBlockForWrite(), NextBlockForRead(), PinStream(), and SwitchToIoBuffers().
|
private |
Definition at line 289 of file buffered-tuple-stream.h.
Referenced by GetRows(), Init(), NewBlockForWrite(), NextBlockForRead(), and PinStream().
|
private |
For each block in the stream, the buffer of the start of the block. This is only valid when the stream is pinned, giving random access to data in the stream. This is not maintained for delete_on_read_.
Definition at line 304 of file buffered-tuple-stream.h.
Referenced by GetTupleRow(), NewBlockForWrite(), and PinStream().
|
private |
List of blocks in the stream.
Definition at line 292 of file buffered-tuple-stream.h.
Referenced by blocks_unpinned(), BufferedTupleStream(), bytes_in_mem(), Close(), DebugString(), GetNextInternal(), GetTupleRow(), has_read_block(), NewBlockForWrite(), NextBlockForRead(), PinStream(), PrepareForRead(), UnpinBlock(), and UnpinStream().
|
private |
Definition at line 335 of file buffered-tuple-stream.h.
Referenced by AddRow(), AllocateRow(), Close(), DebugString(), GetNextInternal(), GetTupleRow(), NewBlockForWrite(), NextBlockForRead(), PinStream(), PrepareForRead(), and UnpinStream().
|
private |
If true, blocks are deleted after they are read.
Definition at line 257 of file buffered-tuple-stream.h.
Referenced by DebugString(), GetNextInternal(), GetTupleRow(), NextBlockForRead(), and PinStream().
|
private |
Description of rows stored in the stream.
Definition at line 268 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), ComputeNumNullIndicatorBytes(), DeepCopyInternal(), GetNextInternal(), GetRows(), and GetTupleRow().
|
private |
Sum of the fixed length portion of all the tuples in desc_.
Definition at line 274 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), ComputeNumNullIndicatorBytes(), ComputeRowSize(), DeepCopyInternal(), GetNextInternal(), and Init().
|
private |
Definition at line 350 of file buffered-tuple-stream.h.
Referenced by Init(), and NewBlockForWrite().
|
private |
Max size (in bytes) of null indicators bitstring in the current read and write blocks. If 0, it means that there is no need to store null indicators for this RowDesc. We calculate this value based on the block's size and the fixed_tuple_row_size_. When not 0, this value is also an upper bound for the number of (rows * tuples_per_row) in this block.
Definition at line 281 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), GetNextInternal(), NextBlockForRead(), and PrepareForRead().
|
private |
Definition at line 282 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), DeepCopyInternal(), and NewBlockForWrite().
|
private |
Whether any tuple in the rows is nullable.
Definition at line 271 of file buffered-tuple-stream.h.
Referenced by ComputeNumNullIndicatorBytes(), ComputeRowSize(), DeepCopy(), GetNext(), and GetTupleRow().
|
private |
Number of pinned blocks in blocks_, stored to avoid iterating over the list to compute bytes_in_mem and bytes_unpinned. This does not include small blocks.
Definition at line 330 of file buffered-tuple-stream.h.
Referenced by blocks_pinned(), blocks_unpinned(), Close(), DebugString(), NewBlockForWrite(), NextBlockForRead(), PinStream(), PrepareForRead(), and UnpinBlock().
|
private |
Number of rows stored in the stream.
Definition at line 339 of file buffered-tuple-stream.h.
Referenced by AllocateRow(), DebugString(), DeepCopyInternal(), GetNextInternal(), and num_rows().
|
private |
The total number of small blocks in blocks_;.
Definition at line 333 of file buffered-tuple-stream.h.
Referenced by blocks_unpinned(), and NewBlockForWrite().
|
private |
Counters added by this object to the parent runtime profile.
Definition at line 348 of file buffered-tuple-stream.h.
Referenced by Init(), NextBlockForRead(), PinStream(), and PrepareForRead().
|
private |
If true, this stream has been explicitly pinned by the caller. This changes the memory management of the stream. The blocks are not unpinned until the caller calls UnpinAllBlocks(). If false, only the write_block_ and/or read_block_ are pinned (both are if read_write_ is true).
Definition at line 345 of file buffered-tuple-stream.h.
Referenced by DebugString(), GetNextInternal(), is_pinned(), NewBlockForWrite(), NextBlockForRead(), PinStream(), PrepareForRead(), and UnpinStream().
|
private |
Iterator pointing to the current block for read. If read_write_, this is always a valid block, otherwise equal to list.end() until PrepareForRead() is called.
Definition at line 299 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), DebugString(), GetNextInternal(), has_read_block(), NewBlockForWrite(), NextBlockForRead(), PrepareForRead(), and UnpinStream().
|
private |
The block index of the current read block.
Definition at line 322 of file buffered-tuple-stream.h.
Referenced by GetNextInternal(), NextBlockForRead(), and PrepareForRead().
|
private |
Bytes read in read_block_.
Definition at line 316 of file buffered-tuple-stream.h.
Referenced by GetNextInternal(), NextBlockForRead(), and PrepareForRead().
|
private |
Current ptr offset in read_block_'s buffer.
Definition at line 307 of file buffered-tuple-stream.h.
Referenced by GetNextInternal(), NextBlockForRead(), and PrepareForRead().
|
private |
Current idx of the tuple read from the read_block_ buffer.
Definition at line 310 of file buffered-tuple-stream.h.
Referenced by GetNextInternal(), NextBlockForRead(), and PrepareForRead().
|
private |
If true, read and write operations may be interleaved. Otherwise all calls to AddRow() must occur before calling PrepareForRead() and subsequent calls to GetNext().
Definition at line 262 of file buffered-tuple-stream.h.
Referenced by Init(), PrepareForRead(), and UnpinStream().
|
private |
Number of rows returned to the caller from GetNext().
Definition at line 319 of file buffered-tuple-stream.h.
Referenced by DebugString(), GetNextInternal(), PrepareForRead(), and rows_returned().
|
private |
Runtime state instance used to check for cancellation. Not owned.
Definition at line 265 of file buffered-tuple-stream.h.
|
private |
Definition at line 336 of file buffered-tuple-stream.h.
Referenced by AddRow(), AllocateRow(), and status().
|
private |
Vector of all the strings slots grouped by tuple_idx.
Definition at line 285 of file buffered-tuple-stream.h.
Referenced by BufferedTupleStream(), ComputeRowSize(), DeepCopyInternal(), and GetNextInternal().
|
private |
Total size of blocks_, including small blocks.
Definition at line 295 of file buffered-tuple-stream.h.
Referenced by byte_size(), and NewBlockForWrite().
|
private |
Definition at line 349 of file buffered-tuple-stream.h.
Referenced by Init(), NextBlockForRead(), UnpinBlock(), and UnpinStream().
|
private |
If true, this stream is still using small buffers.
Definition at line 254 of file buffered-tuple-stream.h.
Referenced by Init(), NewBlockForWrite(), SwitchToIoBuffers(), and using_small_buffers().
|
private |
The current block for writing. NULL if there is no available block to write to.
Definition at line 325 of file buffered-tuple-stream.h.
Referenced by AllocateRow(), bytes_in_mem(), DebugString(), DeepCopyInternal(), has_write_block(), Init(), NewBlockForWrite(), NextBlockForRead(), PrepareForRead(), SwitchToIoBuffers(), and UnpinStream().
|
private |
Current idx of the tuple written at the write_block_ buffer.
Definition at line 313 of file buffered-tuple-stream.h.
Referenced by DeepCopyInternal(), and NewBlockForWrite().