Impala
Impala is the open source, native analytic database for Apache Hadoop.
#include <hdfs-table-sink.h>
Public Member Functions
OutputPartition ()
Public Attributes
std::string final_hdfs_file_name_prefix
std::string current_file_name
std::string tmp_hdfs_dir_name
std::string tmp_hdfs_file_name_prefix
std::string partition_name
key1=val1/key2=val2/ etc. Used to identify partitions to the metastore.
hdfsFS hdfs_connection
Connection to hdfs.
hdfsFile tmp_hdfs_file
Hdfs file at tmp_hdfs_file_name.
int64_t num_rows
Records number of rows appended to the current file in this partition.
int32_t num_files
Number of files created in this partition.
boost::scoped_ptr<HdfsTableWriter> writer
Table format specific writer functions.
const HdfsPartitionDescriptor * partition_descriptor
The descriptor for this partition.
Records the temporary and final Hdfs file name, the opened temporary Hdfs file, and the number of appended rows of an output partition.
Definition at line 40 of file hdfs-table-sink.h.
impala::OutputPartition::OutputPartition ()
Definition at line 67 of file hdfs-table-sink.cc.
std::string impala::OutputPartition::current_file_name
File name for the current output file, with the sequence number appended. This is a temporary file that is moved to a permanent location when the insert is committed. Path: <hdfs_base_dir>/<partition_values>/<unique_id_str>.<sequence number>
Definition at line 55 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::ClosePartitionFile(), impala::HdfsTableSink::CreateNewTmpFile(), impala::HdfsTableSink::GetFileBlockSize(), and impala::HdfsTableWriter::Write().
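The naming scheme described for current_file_name can be illustrated with a minimal sketch. MakeCurrentFileName is a hypothetical helper, not a function in hdfs-table-sink.h, and it assumes the sequence number is simply appended after a dot, as the documented path suggests. Where the sequence number comes from and whether the real sink adds anything further to the name are not stated here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of Impala: appends a sequence number to a
// per-partition file name prefix of the form <...>/<unique_id_str>.
std::string MakeCurrentFileName(const std::string& file_name_prefix,
                                int32_t sequence_number) {
  // Documented shape: <prefix>.<sequence number>
  return file_name_prefix + "." + std::to_string(sequence_number);
}
```

Called with the prefix "/base/year=2014/abc" and sequence number 0, this sketch yields "/base/year=2014/abc.0".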
std::string impala::OutputPartition::final_hdfs_file_name_prefix
Here, <unique_id_str> is the unique ID passed to HdfsTableSink in string form; it is typically the ID of the fragment that owns the sink. Full path to the root of the group of files that will be created for this partition. Each file has a sequence number appended, and a table writer may produce multiple files per partition. The root is either partition_descriptor->location (if non-empty, i.e. the partition has a custom location) or table_dir/partition_name/. Path: <root>/<unique_id_str>
Definition at line 49 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::BuildHdfsFileNames(), and impala::HdfsTableSink::CreateNewTmpFile().
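The root selection described for final_hdfs_file_name_prefix can be sketched as follows. Both JoinPath and MakeFinalFileNamePrefix are hypothetical names introduced for illustration, and the slash handling is an assumption; the actual logic lives in HdfsTableSink::BuildHdfsFileNames() and may differ.

```cpp
#include <string>

// Hypothetical helper: joins two path components with exactly one '/'.
std::string JoinPath(const std::string& a, const std::string& b) {
  if (a.empty()) return b;
  if (b.empty()) return a;
  const bool a_slash = a.back() == '/';
  const bool b_slash = b.front() == '/';
  if (a_slash && b_slash) return a + b.substr(1);
  if (!a_slash && !b_slash) return a + "/" + b;
  return a + b;
}

// Hypothetical sketch of the documented rule: use the partition's custom
// location when it is non-empty, otherwise table_dir/partition_name/, and
// then append <unique_id_str>.
std::string MakeFinalFileNamePrefix(const std::string& custom_location,
                                    const std::string& table_dir,
                                    const std::string& partition_name,
                                    const std::string& unique_id_str) {
  const std::string root = custom_location.empty()
                               ? JoinPath(table_dir, partition_name)
                               : custom_location;
  return JoinPath(root, unique_id_str);
}
```

With an empty custom location, table directory "/warehouse/t", partition name "year=2014/month=1/" and ID "abc", the sketch produces "/warehouse/t/year=2014/month=1/abc". With a custom location of "/custom/loc" it produces "/custom/loc/abc".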
hdfsFS impala::OutputPartition::hdfs_connection
Connection to hdfs.
Definition at line 71 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::GetFileBlockSize(), impala::HdfsTableSink::InitOutputPartition(), and impala::HdfsTableWriter::Write().
int32_t impala::OutputPartition::num_files
Number of files created in this partition.
Definition at line 80 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::BuildHdfsFileNames(), and impala::HdfsTableSink::CreateNewTmpFile().
int64_t impala::OutputPartition::num_rows
Records number of rows appended to the current file in this partition.
Definition at line 77 of file hdfs-table-sink.h.
Referenced by impala::HdfsTextTableWriter::AppendRowBatch(), impala::HdfsParquetTableWriter::AppendRowBatch(), impala::HdfsTableSink::CreateNewTmpFile(), and impala::HdfsTableSink::FinalizePartitionFile().
const HdfsPartitionDescriptor* impala::OutputPartition::partition_descriptor
The descriptor for this partition.
Definition at line 86 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::CreateNewTmpFile(), and impala::HdfsTableSink::InitOutputPartition().
std::string impala::OutputPartition::partition_name
key1=val1/key2=val2/ etc. Used to identify partitions to the metastore.
Definition at line 68 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::BuildHdfsFileNames(), impala::HdfsTableSink::FinalizePartitionFile(), impala::HdfsTableSink::GetOutputPartition(), and impala::HdfsTableSink::InitOutputPartition().
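The key1=val1/key2=val2/ form of partition_name can be illustrated with a short sketch. BuildPartitionName and the KeyValue alias are hypothetical and not taken from the Impala sources. The sketch assumes keys and values are already plain strings and applies no escaping or special handling of NULL values, which a real implementation would likely need.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias: one partition key column and its value, both as text.
using KeyValue = std::pair<std::string, std::string>;

// Hypothetical helper: concatenates key=value pairs, each followed by '/',
// into the documented key1=val1/key2=val2/ form.
std::string BuildPartitionName(const std::vector<KeyValue>& partition_keys) {
  std::string name;
  for (const KeyValue& kv : partition_keys) {
    name += kv.first + "=" + kv.second + "/";
  }
  return name;  // Empty when there are no partition keys.
}
```

For the pairs (year, 2014) and (month, 1), the sketch returns "year=2014/month=1/".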
std::string impala::OutputPartition::tmp_hdfs_dir_name
Name of the temporary directory in which files for this partition are staged before the coordinator moves them to their permanent location once the query completes. Path: <base_table_dir>/<staging_dir>/<unique_id>_dir/
Definition at line 60 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::BuildHdfsFileNames(), and impala::HdfsTableSink::GetOutputPartition().
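As an illustration of the documented staging path, the following sketch assembles <base_table_dir>/<staging_dir>/<unique_id>_dir/ from its parts. MakeTmpDirName is a hypothetical helper. The staging directory name is passed in as a parameter because this page does not state it, and the inputs are assumed to carry no trailing slashes.

```cpp
#include <string>

// Hypothetical helper: builds the per-sink staging directory name in the
// documented form <base_table_dir>/<staging_dir>/<unique_id>_dir/.
// Assumes none of the arguments ends with '/'.
std::string MakeTmpDirName(const std::string& base_table_dir,
                           const std::string& staging_dir,
                           const std::string& unique_id) {
  return base_table_dir + "/" + staging_dir + "/" + unique_id + "_dir/";
}
```

For example, MakeTmpDirName("/warehouse/t", "staging", "abc") gives "/warehouse/t/staging/abc_dir/", where "staging" is only a placeholder name.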
hdfsFile impala::OutputPartition::tmp_hdfs_file
Hdfs file at tmp_hdfs_file_name.
Definition at line 74 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::ClosePartitionFile(), impala::HdfsTableSink::CreateNewTmpFile(), impala::HdfsTableSink::FinalizePartitionFile(), and impala::HdfsTableWriter::Write().
std::string impala::OutputPartition::tmp_hdfs_file_name_prefix
Base prefix for temporary files, to save building it every time a temporary file is created. Path: tmp_hdfs_dir_name/partition_name/<unique_id_str>
Definition at line 65 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::BuildHdfsFileNames(), and impala::HdfsTableSink::CreateNewTmpFile().
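The point of caching tmp_hdfs_file_name_prefix is that the directory and partition parts are concatenated once rather than for every new temporary file. A minimal sketch of that concatenation follows. MakeTmpFileNamePrefix is a hypothetical name, and the sketch assumes that tmp_hdfs_dir_name ends with '/' and that partition_name is either empty or ends with '/', as the documented path forms suggest.

```cpp
#include <string>

// Hypothetical helper: builds tmp_hdfs_dir_name/partition_name/<unique_id_str>.
// Assumes tmp_hdfs_dir_name ends with '/' and partition_name is either empty
// or ends with '/', so plain concatenation yields a well-formed path.
std::string MakeTmpFileNamePrefix(const std::string& tmp_hdfs_dir_name,
                                  const std::string& partition_name,
                                  const std::string& unique_id_str) {
  return tmp_hdfs_dir_name + partition_name + unique_id_str;
}
```

Each new temporary file then only needs a sequence number appended to this cached prefix.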
boost::scoped_ptr<HdfsTableWriter> impala::OutputPartition::writer
Table format specific writer functions.
Definition at line 83 of file hdfs-table-sink.h.
Referenced by impala::HdfsTableSink::CreateNewTmpFile(), impala::HdfsTableSink::FinalizePartitionFile(), impala::HdfsTableSink::GetOutputPartition(), impala::HdfsTableSink::InitOutputPartition(), and impala::HdfsTableSink::Send().