Impala
Impala is the open source, native analytic database for Apache Hadoop.
impala::KeyNormalizer Class Reference

Provides support for normalizing Impala expr values into a memcmp-able, fixed-length format.

#include <key-normalizer.h>

Public Member Functions

 KeyNormalizer (const std::vector< ExprContext * > &key_exprs_ctxs, int key_len, const std::vector< bool > &is_asc, const std::vector< bool > &nulls_first)
 
bool NormalizeKey (TupleRow *tuple_row, uint8_t *dst, int *key_idx_over_budget=NULL)
 

Static Private Member Functions

static bool WriteNullBit (uint8_t null_bit, uint8_t *value, uint8_t *dst, int *bytes_left)
 Returns true if we went over the max key size while writing the null bit.
 
template<typename ValueType >
static void StoreFinalValue (ValueType value, void *dst, bool is_asc)
 
template<typename IntType >
static void NormalizeInt (void *src, void *dst, bool is_asc)
 
template<typename FloatType , typename ResultType >
static void NormalizeFloat (void *src, void *dst, bool is_asc)
 
static void NormalizeTimestamp (uint8_t *src, uint8_t *dst, bool is_asc)
 
static bool WriteNormalizedKey (const ColumnType &type, bool is_asc, uint8_t *value, uint8_t *dst, int *bytes_left)
 
static bool NormalizeKeyColumn (const ColumnType &type, uint8_t null_bit, bool is_asc, uint8_t *value, uint8_t *dst, int *bytes_left)
 

Private Attributes

std::vector< ExprContext * > key_expr_ctxs_
 
int key_len_
 
std::vector< bool > is_asc_
 
std::vector< bool > nulls_first_
 

Detailed Description

Provides support for normalizing Impala expr values into a memcmp-able, fixed-length format.

To normalize a key, we first write a null byte (0 if nulls_first, 1 otherwise), followed by the normalized form of the key. We invert the bytes of the key (excluding the null byte) if the key should be sorted in descending order. Further, for any multi-byte data types, we ensure that the most significant byte is first by converting to big endian.

In addition to inverting descending keys and converting to big endian, here is how we normalize specific types:

Integers: Invert the sign bit.
Floats: Write out the inverted sign bit, followed by the exponent, followed by the fraction. If the float is negative, though, we need to invert both the exponent and fraction (since a smaller number means a greater actual value when negative). Conveniently, IEEE floating point numbers are already in the correct order.
Timestamps: 32 bits for date: 23 bits for year, 4 bits for month, and 5 bits for day. 64 bits for time of day in nanoseconds. All numbers are assumed unsigned.
Strings: Write one character at a time with a null byte at the end (inverted if sorting descending). Unlike other data types, we may write partial strings. NOTE: This assumes strings do not contain null characters.
Booleans/Nulls: Left as-is.

Finally, we pad any remaining bytes of the key with zeroes.
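
The layout described above can be sketched for a single nullable 32-bit integer key. This is a minimal illustration, not Impala's implementation: NormalizeInt32Key and its parameters are hypothetical names, and it assumes the null byte value is written for NULL values while non-NULL values get its complement, so that the two sort apart.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of Impala): normalizes one nullable int32 key
// into a buffer of exactly key_len bytes. A null pointer stands for SQL NULL.
std::vector<uint8_t> NormalizeInt32Key(const int32_t* value, bool is_asc,
                                       bool nulls_first, size_t key_len) {
  assert(key_len >= 1 + sizeof(int32_t));
  std::vector<uint8_t> key(key_len, 0);  // unused trailing bytes stay zero

  // Assumption: NULL gets 0 when nulls sort first, 1 otherwise; non-NULL
  // values get the complement.
  const uint8_t null_byte = nulls_first ? 0 : 1;
  if (value == nullptr) {
    key[0] = null_byte;
    return key;
  }
  key[0] = null_byte ^ 1;

  uint32_t bits;
  std::memcpy(&bits, value, sizeof(bits));
  bits ^= 0x80000000u;        // invert the sign bit
  if (!is_asc) bits = ~bits;  // invert every bit for descending order

  // Big endian: most significant byte first.
  for (int i = 0; i < 4; ++i) {
    key[1 + i] = static_cast<uint8_t>(bits >> (24 - 8 * i));
  }
  return key;
}
```

Two keys produced this way with the same settings can be ordered with a plain memcmp over key_len bytes.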

Definition at line 52 of file key-normalizer.h.

Constructor & Destructor Documentation

impala::KeyNormalizer::KeyNormalizer ( const std::vector< ExprContext * > &  key_exprs_ctxs,
int  key_len,
const std::vector< bool > &  is_asc,
const std::vector< bool > &  nulls_first 
)
inline

Initializes the normalizer with the key exprs and the length allotted to each normalized key.

Definition at line 56 of file key-normalizer.h.

Member Function Documentation

template<typename FloatType , typename ResultType >
void impala::KeyNormalizer::NormalizeFloat ( void *  src,
void *  dst,
bool  is_asc 
)
inline static private

ResultType should be an integer type of the same size as FloatType, used to examine the bytes of the float.

Definition at line 56 of file key-normalizer.inline.h.
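
As a rough sketch of the float rule from the class description (set the sign bit for non-negative values, invert all bits for negative ones), the following assumes a 32-bit IEEE 754 float. NormalizeFloat32 is a hypothetical stand-in for the FloatType = float, ResultType = uint32_t case, not the actual member function, and NaN ordering is not considered.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Hypothetical stand-in for the float case: writes 4 memcmp-able bytes to dst.
void NormalizeFloat32(float src, uint8_t* dst, bool is_asc) {
  uint32_t bits;
  std::memcpy(&bits, &src, sizeof(bits));  // examine the float's raw bytes

  const uint32_t sign = 0x80000000u;
  if (bits & sign) {
    bits = ~bits;  // negative: clears the sign bit, inverts exponent and fraction
  } else {
    bits |= sign;  // non-negative: inverted sign bit is 1, the rest is unchanged
  }
  if (!is_asc) bits = ~bits;  // descending order inverts the whole value

  // Big endian: most significant byte first.
  for (int i = 0; i < 4; ++i) {
    dst[i] = static_cast<uint8_t>(bits >> (24 - 8 * i));
  }
}
```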

template<typename IntType >
void impala::KeyNormalizer::NormalizeInt ( void *  src,
void *  dst,
bool  is_asc 
)
inline static private

Definition at line 46 of file key-normalizer.inline.h.

bool impala::KeyNormalizer::NormalizeKey ( TupleRow *  tuple_row,
uint8_t *  dst,
int *  key_idx_over_budget = NULL 
)
inline

Normalizes all keys and writes the result into dst. Returns true if we went over the max key size while writing the key. If the return value is true, then key_idx_over_budget will be set to the index of the key expr that went over. TODO: Handle non-nullable columns.

Definition at line 162 of file key-normalizer.inline.h.

References is_asc_, key_expr_ctxs_, key_len_, NormalizeKeyColumn(), nulls_first_, and offset.
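
The over-budget behaviour can be illustrated with a simplified sketch in which every key column is a non-NULL, ascending string and nulls sort first. NormalizeStringKeys, cols and over_idx are hypothetical names chosen for this example. A string that no longer fits is written partially, and a string with no room left for its terminating null byte is treated as over budget (an assumption of this sketch).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical simplification (not the Impala code): all columns are non-NULL
// ascending strings. Returns true if the key ran out of space, in which case
// *over_idx (if provided) is set to the index of the column that went over.
bool NormalizeStringKeys(const std::vector<std::string>& cols, uint8_t* dst,
                         int key_len, int* over_idx = nullptr) {
  int bytes_left = key_len;
  for (size_t i = 0; i < cols.size(); ++i) {
    uint8_t* out = dst + (key_len - bytes_left);
    bool went_over = bytes_left < 1;  // no room even for the null byte
    if (!went_over) {
      *out++ = 1;  // non-NULL marker, assuming nulls sort first
      --bytes_left;
      const int n = std::min<int>(static_cast<int>(cols[i].size()), bytes_left);
      std::memcpy(out, cols[i].data(), n);  // possibly a partial string
      bytes_left -= n;
      if (bytes_left > 0) {
        out[n] = 0;  // terminating null byte
        --bytes_left;
      } else {
        went_over = true;  // no room left for the terminator
      }
    }
    if (went_over) {
      if (over_idx != nullptr) *over_idx = static_cast<int>(i);
      return true;
    }
  }
  // Pad any remaining bytes of the key with zeroes.
  std::memset(dst + (key_len - bytes_left), 0, bytes_left);
  return false;
}
```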

bool impala::KeyNormalizer::NormalizeKeyColumn ( const ColumnType &  type,
uint8_t  null_bit,
bool  is_asc,
uint8_t *  value,
uint8_t *  dst,
int *  bytes_left 
)
inline static private

Normalizes a column by writing a NULL byte and then the normalized value. Updates bytes_left and returns true if we went over the max key size.

Definition at line 155 of file key-normalizer.inline.h.

References WriteNormalizedKey(), and WriteNullBit().

Referenced by NormalizeKey().

void impala::KeyNormalizer::NormalizeTimestamp ( uint8_t *  src,
uint8_t *  dst,
bool  is_asc 
)
inline static private

Definition at line 73 of file key-normalizer.inline.h.

References impala::TimestampValue::date().

Referenced by WriteNormalizedKey().
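
The timestamp layout from the class description (23 bits for year, 4 for month, 5 for day, then 64 bits of nanoseconds) can be sketched as follows. SimpleTimestamp and NormalizeSimpleTimestamp are hypothetical; the real function reads an impala::TimestampValue, whose internals are not shown on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical plain-data timestamp; all fields are treated as unsigned.
struct SimpleTimestamp {
  uint32_t year;          // must fit in 23 bits
  uint32_t month;         // 1..12, fits in 4 bits
  uint32_t day;           // 1..31, fits in 5 bits
  uint64_t nanos_of_day;  // time of day in nanoseconds
};

// Writes 12 bytes to dst: 4 bytes of packed date, then 8 bytes of time.
void NormalizeSimpleTimestamp(const SimpleTimestamp& ts, uint8_t* dst,
                              bool is_asc) {
  assert(ts.year < (1u << 23) && ts.month < (1u << 4) && ts.day < (1u << 5));
  // Year in the high 23 bits, then month, then day in the low 5 bits.
  uint32_t date = (ts.year << 9) | (ts.month << 5) | ts.day;
  uint64_t time = ts.nanos_of_day;
  if (!is_asc) {  // descending order inverts every bit
    date = ~date;
    time = ~time;
  }
  // Big endian: most significant byte first.
  for (int i = 0; i < 4; ++i) {
    dst[i] = static_cast<uint8_t>(date >> (24 - 8 * i));
  }
  for (int i = 0; i < 8; ++i) {
    dst[4 + i] = static_cast<uint8_t>(time >> (56 - 8 * i));
  }
}
```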

template<typename ValueType >
void impala::KeyNormalizer::StoreFinalValue ( ValueType  value,
void *  dst,
bool  is_asc 
)
inline static private

Stores the given value in the memory address given by dst, after converting to big endian and inverting the value if the sort is descending. Copy of 'value' intentional, we don't want to modify original.

Definition at line 39 of file key-normalizer.inline.h.

References impala::BitUtil::ToBigEndian().
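
A portable sketch of the same idea, writing the bytes most significant first instead of calling impala::BitUtil::ToBigEndian() (whose signature is not shown here). StoreFinalValueSketch is a hypothetical name, and the sketch is restricted to unsigned integer types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical sketch: stores value at dst in big-endian byte order, inverting
// every bit first if the sort is descending. value is taken by copy so the
// caller's original is never modified.
template <typename ValueType>
void StoreFinalValueSketch(ValueType value, void* dst, bool is_asc) {
  static_assert(std::is_unsigned<ValueType>::value,
                "sketch only handles unsigned integer types");
  if (!is_asc) value = static_cast<ValueType>(~value);
  uint8_t* out = static_cast<uint8_t*>(dst);
  for (size_t i = 0; i < sizeof(ValueType); ++i) {
    // Shift so that the most significant remaining byte lands in the low 8 bits.
    out[i] = static_cast<uint8_t>(value >> (8 * (sizeof(ValueType) - 1 - i)));
  }
}
```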

bool impala::KeyNormalizer::WriteNormalizedKey ( const ColumnType &  type,
bool  is_asc,
uint8_t *  value,
uint8_t *  dst,
int *  bytes_left 
)
inline static private

bool impala::KeyNormalizer::WriteNullBit ( uint8_t  null_bit,
uint8_t *  value,
uint8_t *  dst,
int *  bytes_left 
)
inline static private

Returns true if we went over the max key size while writing the null bit.

Definition at line 29 of file key-normalizer.inline.h.

Referenced by NormalizeKeyColumn().

Member Data Documentation

std::vector<bool> impala::KeyNormalizer::is_asc_
private

Definition at line 102 of file key-normalizer.h.

Referenced by NormalizeKey().

std::vector<ExprContext*> impala::KeyNormalizer::key_expr_ctxs_
private

Definition at line 100 of file key-normalizer.h.

Referenced by NormalizeKey().

int impala::KeyNormalizer::key_len_
private

Definition at line 101 of file key-normalizer.h.

Referenced by NormalizeKey().

std::vector<bool> impala::KeyNormalizer::nulls_first_
private

Definition at line 103 of file key-normalizer.h.

Referenced by NormalizeKey().


The documentation for this class was generated from the following files: