Impala
Impalaistheopensource,nativeanalyticdatabaseforApacheHadoop.
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros
com.cloudera.impala.analysis.Analyzer.ValueTransferGraph Class Reference
Collaboration diagram for com.cloudera.impala.analysis.Analyzer.ValueTransferGraph:

Public Member Functions

void computeValueTransfers ()
 
void bulkUpdate (List< Pair< SlotId, SlotId >> mutualValueTransfers)
 
boolean hasValueTransfer (SlotId slotA, SlotId slotB)
 
boolean validate (StringBuilder expected, StringBuilder actual)
 

Private Member Functions

void mapSlots (List< Pair< SlotId, SlotId >> origValueTransfers, List< Pair< Integer, Integer >> coalescedValueTransfers, DisjointSet< Integer > graphPartitions)
 
void partitionValueTransfers (DisjointSet< SlotId > completeSubGraphs, List< Pair< SlotId, SlotId >> valueTransfers)
 

Private Attributes

final DisjointSet< SlotIdcompleteSubGraphs_ = new DisjointSet<SlotId>()
 
int[] coalescedSlots_
 
int nextCoalescedSlotId_ = 0
 
boolean[][] valueTransfer_
 

Detailed Description

Efficiently computes and stores the transitive closure of the value transfer graph of slots. After calling computeValueTransfers(), value transfers between slots can be queried via hasValueTransfer(). Value transfers can be uni-directional due to outer joins, inline views with a limit, or unions.

Definition at line 2320 of file Analyzer.java.

Member Function Documentation

void com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.bulkUpdate ( List< Pair< SlotId, SlotId >>  mutualValueTransfers)
inline

Bulk updates the value transfer graph based on the given list of new mutual value transfers. The first element of each pair must be an existing slot id already in the value transfer graph and the second element must be a new slot id not yet in the value transfer graph. In particular, this requirement means that no new complete subgraphs may be introduced by the new slots.

Definition at line 2456 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.coalescedSlots_, and com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.valueTransfer_.

void com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.computeValueTransfers ( )
inline

Computes all direct and transitive value transfers based on the registered conjuncts of the form <slotref> = <slotref>. The high-level steps are:

  1. Identify complete subgraps based on bi-directional value transfers, and coalesce the slots of each complete subgraph into a single slot.
  2. Map the remaining uni-directional value transfers into the new slot domain.
  3. Identify the connected components of the uni-directional value transfers. This step partitions the value transfers into disjoint sets.
  4. Compute the transitive closure of each partition from (3) in the new slot domain separately. Hopefully, the partitions are small enough to afford the O(N^3) complexity of the brute-force transitive closure computation. The condensed graph is not transformed back into the original slot domain because of the potential performance penalty. Instead, hasValueTransfer() consults coalescedSlots_, valueTransfer_, and completeSubGraphs_ which can together determine any value transfer in the original slot domain in constant time.

Definition at line 2355 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.coalescedSlots_, com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.completeSubGraphs_, com.cloudera.impala.analysis.Analyzer.globalState_, com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.mapSlots(), com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.nextCoalescedSlotId_, com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.partitionValueTransfers(), com.cloudera.impala.analysis.Analyzer.GlobalState.registeredValueTransfers, and com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.valueTransfer_.

boolean com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.hasValueTransfer ( SlotId  slotA,
SlotId  slotB 
)
inline

Returns true if slotA always has the same value as slotB or the tuple containing slotB is NULL.

Definition at line 2480 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.coalescedSlots_, and com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.valueTransfer_.

Referenced by com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.validate().

void com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.mapSlots ( List< Pair< SlotId, SlotId >>  origValueTransfers,
List< Pair< Integer, Integer >>  coalescedValueTransfers,
DisjointSet< Integer >  graphPartitions 
)
inlineprivate

Maps the slots of the given origValueTransfers to the coalescedSlots_. For most queries the uni-directional value transfers only reference a fraction of the original slots, so we assign new slot ids as necessary to make them dense. Returns the new list of value transfers in coalescedValueTransfers. Also adds each new value transfer into the given graphPartitions (via union() on the slots).

Definition at line 2498 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.coalescedSlots_, and com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.nextCoalescedSlotId_.

Referenced by com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.computeValueTransfers().

void com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.partitionValueTransfers ( DisjointSet< SlotId completeSubGraphs,
List< Pair< SlotId, SlotId >>  valueTransfers 
)
inlineprivate

Transforms the registered equality predicates of the form <slotref> = <slotref> into disjoint sets of slots with mutual value transfers (completeSubGraphs), and a list of remaining uni-directional value transfers (valueTransfers). Both completeSubGraphs and valueTransfers use the original slot ids.

For debugging: If completeSubGraphs is null, adds all value transfers including bi-directional ones into valueTransfers.

Definition at line 2529 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.ancestors_, com.cloudera.impala.analysis.Analyzer.GlobalState.conjuncts, com.cloudera.impala.analysis.JoinOperator.FULL_OUTER_JOIN, com.cloudera.impala.analysis.TableRef.getId(), com.cloudera.impala.analysis.TableRef.getJoinOp(), com.cloudera.impala.analysis.Analyzer.getTupleId(), com.cloudera.impala.analysis.Analyzer.globalState_, com.cloudera.impala.analysis.Analyzer.hasLimitOffsetClause_, com.cloudera.impala.analysis.JoinOperator.LEFT_ANTI_JOIN, com.cloudera.impala.analysis.JoinOperator.NULL_AWARE_LEFT_ANTI_JOIN, and com.cloudera.impala.analysis.JoinOperator.RIGHT_ANTI_JOIN.

Referenced by com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.computeValueTransfers(), and com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.validate().

boolean com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.validate ( StringBuilder  expected,
StringBuilder  actual 
)
inline

Debug utility to validate the correctness of the value transfer graph using a brute-force transitive closure algorithm. Returns true if this value transfer graph is identical to one computed with the brute-force method, false otherwise. Writes string representations of the expected and actual value transfer matrices into expected and actual, respectively.

Definition at line 2622 of file Analyzer.java.

References com.cloudera.impala.analysis.Analyzer.globalState_, com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.hasValueTransfer(), com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.partitionValueTransfers(), and com.cloudera.impala.analysis.Analyzer.GlobalState.registeredValueTransfers.

Member Data Documentation

final DisjointSet<SlotId> com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.completeSubGraphs_ = new DisjointSet<SlotId>()
private
int com.cloudera.impala.analysis.Analyzer.ValueTransferGraph.nextCoalescedSlotId_ = 0
private

The documentation for this class was generated from the following file: