Impala
Impala is the open source, native analytic database for Apache Hadoop.
impala::StreamingSampler< T, MAX_SAMPLES > Class Template Reference

#include <streaming-sampler.h>

Public Member Functions

 StreamingSampler (int initial_period=500)
 
 StreamingSampler (int period, const std::vector< T > &initial_samples)
 Initialize the sampler with values.
 
void AddSample (T sample, int ms)
 
const T * GetSamples (int *num_samples, int *period, SpinLock **lock=NULL) const
 
void SetSamples (int period, const std::vector< T > &samples)
 Set the underlying data to period/samples.
 
std::string DebugString (const std::string &prefix="") const
 

Private Attributes

SpinLock lock_
 
T samples_ [MAX_SAMPLES]
 
int samples_collected_
 Number of samples collected; always <= MAX_SAMPLES.
 
int period_
 Storage period in ms.
 
T current_sample_sum_
 The sum of input samples that makes up the next stored sample.
 
int current_sample_count_
 The number of input samples that contribute to current_sample_sum_.
 
int current_sample_total_time_
 The total time that current_sample_sum_ represents.
 

Detailed Description

template<typename T, int MAX_SAMPLES>
class impala::StreamingSampler< T, MAX_SAMPLES >

A fixed-size sampler to collect samples over time. AddSample should be called periodically with the sampled value. Samples are stored at the highest resolution possible. When the sample buffer is full, the current samples are collapsed and the collection period is doubled. The input period and the streaming sampler period do not need to match; the streaming sampler averages the input values that fall within one storage period. T is the type of the sample and must be a native numerical type (e.g. int or float).
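The collapse step described above can be sketched as a free function. This is a minimal illustration, not the code in streaming-sampler.h: the name CollapseInHalf, the use of std::array, and the choice to average adjacent pairs are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of Impala): halves the number of stored
// samples by averaging neighbouring pairs and doubles the storage period.
// Assumption: samples 2*i and 2*i+1 are merged into slot i.
template <typename T, std::size_t MAX_SAMPLES>
void CollapseInHalf(std::array<T, MAX_SAMPLES>& samples,
                    int& samples_collected, int& period_ms) {
  const std::size_t half = static_cast<std::size_t>(samples_collected) / 2;
  for (std::size_t i = 0; i < half; ++i) {
    samples[i] = (samples[2 * i] + samples[2 * i + 1]) / 2;
  }
  samples_collected /= 2;  // an odd trailing sample would be dropped here
  period_ms *= 2;          // each remaining slot now covers twice the time
}
```

With four stored values 10, 20, 30, 40 at a 500 ms period, the call leaves two values, 15 and 35, at a 1000 ms period. The buffer keeps covering the whole collection interval, at half the resolution.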

Definition at line 32 of file streaming-sampler.h.

Constructor & Destructor Documentation

template<typename T, int MAX_SAMPLES>
impala::StreamingSampler< T, MAX_SAMPLES >::StreamingSampler ( int  initial_period = 500)
inline

Definition at line 34 of file streaming-sampler.h.

template<typename T, int MAX_SAMPLES>
impala::StreamingSampler< T, MAX_SAMPLES >::StreamingSampler ( int  period,
const std::vector< T > &  initial_samples 
)
inline

Initialize the sampler with values.

Definition at line 43 of file streaming-sampler.h.

Member Function Documentation

template<typename T, int MAX_SAMPLES>
void impala::StreamingSampler< T, MAX_SAMPLES >::AddSample ( T  sample,
int  ms 
)
inline

Add a sample to the sampler. 'ms' is the time elapsed since the last time this was called. The input value is accumulated into current_*. If the total time elapsed in current_sample_total_time_ is higher than the storage period, the value is stored. 'sample' should be interpreted as a representative sample from (now - ms, now]. TODO: we can make this more complex by taking a weighted average of samples accumulated in a period.

When the sample buffer becomes full, the samples are collapsed in half by averaging them and the storage period is doubled.
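The accumulation step can be illustrated with a small stand-alone struct. This is a sketch under stated assumptions, not Impala's implementation: PeriodAccumulator is a hypothetical name, the stored values go into an unbounded std::vector instead of the fixed samples_ array, locking is omitted, and the sketch stores a value as soon as the accumulated time reaches the period (whether the real comparison is strict is not stated here).

```cpp
#include <vector>

// Hypothetical stand-in for the accumulation part of AddSample.
template <typename T>
struct PeriodAccumulator {
  int period_ms;          // storage period
  T sum = T();            // plays the role of current_sample_sum_
  int count = 0;          // plays the role of current_sample_count_
  int total_time_ms = 0;  // plays the role of current_sample_total_time_
  std::vector<T> stored;  // unbounded here; the class uses a fixed array

  explicit PeriodAccumulator(int period) : period_ms(period) {}

  // 'ms' is the time elapsed since the previous call.
  void AddSample(T sample, int ms) {
    sum += sample;
    ++count;
    total_time_ms += ms;
    // Assumption: store once the accumulated time reaches the period.
    if (total_time_ms >= period_ms) {
      stored.push_back(sum / count);  // unweighted average of the inputs
      sum = T();
      count = 0;
      total_time_ms = 0;
    }
  }
};
```

With a 500 ms period, two inputs of 10 and 20 that each cover 250 ms produce a single stored value of 15. The average is unweighted, which is the simplification the TODO above refers to.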

Definition at line 61 of file streaming-sampler.h.

Referenced by impala::RuntimeProfile::TimeSeriesCounter::AddSample(), and impala::TEST().

template<typename T, int MAX_SAMPLES>
std::string impala::StreamingSampler< T, MAX_SAMPLES >::DebugString ( const std::string &  prefix = "") const
inline

Definition at line 111 of file streaming-sampler.h.

template<typename T, int MAX_SAMPLES>
const T* impala::StreamingSampler< T, MAX_SAMPLES >::GetSamples ( int *  num_samples,
int *  period,
SpinLock **  lock = NULL 
) const
inline

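Get the samples collected. Returns a pointer to the stored samples; the number of samples and the period they were collected at are written to num_samples and period. If lock is non-null, the lock is taken before returning and its address is stored in *lock. The caller must then unlock it.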
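The locking contract can be sketched as follows. This is an illustration only: SampleHolder and SumSamples are hypothetical names, std::mutex stands in for SpinLock, and the period out-parameter is left out to keep the example short.

```cpp
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical holder that mimics the "optionally return the held lock" pattern.
class SampleHolder {
 public:
  explicit SampleHolder(std::vector<int> samples) : samples_(std::move(samples)) {}

  const int* GetSamples(int* num_samples, std::mutex** lock = nullptr) const {
    lock_.lock();  // read the count under the lock
    *num_samples = static_cast<int>(samples_.size());
    if (lock != nullptr) {
      *lock = &lock_;  // still held: the caller is now responsible for unlocking
    } else {
      lock_.unlock();
    }
    return samples_.data();
  }

 private:
  mutable std::mutex lock_;  // mutable so a const method can lock it
  std::vector<int> samples_;
};

// Caller side: read the samples while holding the lock, then release it.
int SumSamples(const SampleHolder& holder) {
  int n = 0;
  std::mutex* lock = nullptr;
  const int* data = holder.GetSamples(&n, &lock);
  // Adopt the already-held lock so it is released when 'guard' goes out of scope.
  std::lock_guard<std::mutex> guard(*lock, std::adopt_lock);
  int sum = 0;
  for (int i = 0; i < n; ++i) sum += data[i];
  return sum;
}
```

Passing a lock pointer is useful when the caller reads the returned array while another thread may still be adding samples. Without it, the returned pointer is read with no protection against concurrent updates.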

Definition at line 88 of file streaming-sampler.h.

Referenced by impala::RuntimeProfile::TimeSeriesCounter::ToThrift(), and impala::ValidateSampler().

template<typename T, int MAX_SAMPLES>
void impala::StreamingSampler< T, MAX_SAMPLES >::SetSamples ( int  period,
const std::vector< T > &  samples 
)
inline

Set the underlying data to period/samples.

Definition at line 99 of file streaming-sampler.h.

Member Data Documentation

template<typename T, int MAX_SAMPLES>
int impala::StreamingSampler< T, MAX_SAMPLES >::current_sample_count_
private

The number of input samples that contribute to current_sample_sum_.

Definition at line 141 of file streaming-sampler.h.

Referenced by impala::StreamingSampler< int64_t, 64 >::AddSample(), and impala::StreamingSampler< int64_t, 64 >::SetSamples().

template<typename T, int MAX_SAMPLES>
T impala::StreamingSampler< T, MAX_SAMPLES >::current_sample_sum_
private

The sum of input samples that makes up the next stored sample.

Definition at line 138 of file streaming-sampler.h.

Referenced by impala::StreamingSampler< int64_t, 64 >::AddSample(), and impala::StreamingSampler< int64_t, 64 >::SetSamples().

template<typename T, int MAX_SAMPLES>
int impala::StreamingSampler< T, MAX_SAMPLES >::current_sample_total_time_
private

The total time that current_sample_sum_ represents.

Definition at line 144 of file streaming-sampler.h.

Referenced by impala::StreamingSampler< int64_t, 64 >::AddSample(), and impala::StreamingSampler< int64_t, 64 >::SetSamples().

template<typename T, int MAX_SAMPLES>
T impala::StreamingSampler< T, MAX_SAMPLES >::samples_[MAX_SAMPLES]
private

The documentation for this class was generated from the following file: