Impala Type Conversion Functions
Conversion functions are usually used in combination with other functions, to explicitly pass the expected
data types. Impala has strict rules regarding data types for function parameters. For example, Impala does
not automatically convert a DOUBLE
value to FLOAT
, a
BIGINT
value to INT
, or other conversion where precision could be lost or
overflow could occur. Also, for reporting or dealing with loosely defined schemas in big data contexts,
you might frequently need to convert values to or from the STRING
type.
SHOW FUNCTIONS
output for
database _IMPALA_BUILTINS
contains some function signatures
matching the pattern castto*
, these functions are not intended
for public use and are expected to be hidden in future.
Function reference:
Impala supports the following type conversion functions:
-
cast(expr AS type)
-
Purpose: Converts the value of an expression to any other type.
If the expression value is of a type that cannot be converted to the target type, the result is
NULL
.Usage notes: Use
CAST
when passing a column value or literal to a function that expects a parameter with a different type. Frequently used in SQL operations such asCREATE TABLE AS SELECT
andINSERT ... VALUES
to ensure that values from various sources are of the appropriate type for the destination columns. Where practical, do a one-timeCAST()
operation during the ingestion process to make each column into the appropriate type, rather than using manyCAST()
operations in each query; doing type conversions for each row during each query can be expensive for tables with millions or billions of rows.The way this function deals with time zones when converting to or from
TIMESTAMP
values is affected by the--use_local_tz_for_unix_timestamp_conversions
startup flag for the impalad daemon. See TIMESTAMP Data Type for details about how Impala handles time zone considerations for theTIMESTAMP
data type.Examples:
select concat('Here are the first ',10,' results.'); -- Fails select concat('Here are the first ',cast(10 as string),' results.'); -- Succeeds
The following example starts with a text table where every column has a type of
STRING
, which might be how you ingest data of unknown schema until you can verify the cleanliness of the underly values. Then it usesCAST()
to create a new Parquet table with the same data, but using specific numeric data types for the columns with numeric data. Using numeric types of appropriate sizes can result in substantial space savings on disk and in memory, and performance improvements in queries, over using strings or larger-than-necessary numeric types.create table t1 (name string, x string, y string, z string); create table t2 stored as parquet as select name, cast(x as bigint) x, cast(y as timestamp) y, cast(z as smallint) z from t1; describe t2; +------+----------+---------+ | name | type | comment | +------+----------+---------+ | name | string | | | x | bigint | | | y | smallint | | | z | tinyint | | +------+----------+---------+
Related information:
For details of casts from each kind of data type, see the description of the appropriate type: TINYINT Data Type, SMALLINT Data Type, INT Data Type, BIGINT Data Type, FLOAT Data Type, DOUBLE Data Type, DECIMAL Data Type (Impala 3.0 or higher only), STRING Data Type, CHAR Data Type (Impala 2.0 or higher only), VARCHAR Data Type (Impala 2.0 or higher only), TIMESTAMP Data Type, BOOLEAN Data Type
-
typeof(type value)
-
Purpose: Returns the name of the data type corresponding to an expression. For types with
extra attributes, such as length for
CHAR
andVARCHAR
, or precision and scale forDECIMAL
, includes the full specification of the type.Return type:
string
Usage notes: Typically used in interactive exploration of a schema, or in application code that programmatically generates schema definitions such as
CREATE TABLE
statements. For example, previously, to understand the type of an expression such ascol1 / col2
orconcat(col1, col2, col3)
, you might have created a dummy table with a single row, using syntax such asCREATE TABLE foo AS SELECT 5 / 3.0
, and then doing aDESCRIBE
to see the type of the row. Or you might have done aCREATE TABLE AS SELECT
operation to create a table and copy data into it, only learning the types of the columns by doing aDESCRIBE
afterward. This technique is especially useful for arithmetic expressions involvingDECIMAL
types, because the precision and scale of the result is typically different than that of the operands.Added in: Impala 2.3.0
Examples:
These examples show how to check the type of a simple literal or function value. Notice how adding even tiny integers together changes the data type of the result to avoid overflow, and how the results of arithmetic operations on
DECIMAL
values have specific precision and scale attributes.select typeof(2) +-----------+ | typeof(2) | +-----------+ | TINYINT | +-----------+ select typeof(2+2) +---------------+ | typeof(2 + 2) | +---------------+ | SMALLINT | +---------------+ select typeof('xyz') +---------------+ | typeof('xyz') | +---------------+ | STRING | +---------------+ select typeof(now()) +---------------+ | typeof(now()) | +---------------+ | TIMESTAMP | +---------------+ select typeof(5.3 / 2.1) +-------------------+ | typeof(5.3 / 2.1) | +-------------------+ | DECIMAL(6,4) | +-------------------+ select typeof(5.30001 / 2342.1); +--------------------------+ | typeof(5.30001 / 2342.1) | +--------------------------+ | DECIMAL(13,11) | +--------------------------+ select typeof(typeof(2+2)) +-----------------------+ | typeof(typeof(2 + 2)) | +-----------------------+ | STRING | +-----------------------+
This example shows how even if you do not have a record of the type of a column, for example because the type was changed by
ALTER TABLE
after the originalCREATE TABLE
, you can still find out the type in a more compact form than examining the fullDESCRIBE
output. Remember to useLIMIT 1
in such cases, to avoid an identical result value for every row in the table.create table typeof_example (a int, b tinyint, c smallint, d bigint); /* Empty result set if there is no data in the table. */ select typeof(a) from typeof_example; /* OK, now we have some data but the type of column A is being changed. */ insert into typeof_example values (1, 2, 3, 4); alter table typeof_example change a a bigint; /* We can always find out the current type of that column without doing a full DESCRIBE. */ select typeof(a) from typeof_example limit 1; +-----------+ | typeof(a) | +-----------+ | BIGINT | +-----------+
This example shows how you might programmatically generate a
CREATE TABLE
statement with the appropriate column definitions to hold the result values of arbitrary expressions. Thetypeof()
function lets you construct a detailedCREATE TABLE
statement without actually creating the table, as opposed toCREATE TABLE AS SELECT
operations where you create the destination table but only learn the column data types afterward throughDESCRIBE
.describe typeof_example; +------+----------+---------+ | name | type | comment | +------+----------+---------+ | a | bigint | | | b | tinyint | | | c | smallint | | | d | bigint | | +------+----------+---------+ /* An ETL or business intelligence tool might create variations on a table with different file formats, different sets of columns, and so on. TYPEOF() lets an application introspect the types of the original columns. */ select concat('create table derived_table (a ', typeof(a), ', b ', typeof(b), ', c ', typeof(c), ', d ', typeof(d), ') stored as parquet;') as 'create table statement' from typeof_example limit 1; +-------------------------------------------------------------------------------------------+ | create table statement | +-------------------------------------------------------------------------------------------+ | create table derived_table (a BIGINT, b TINYINT, c SMALLINT, d BIGINT) stored as parquet; | +-------------------------------------------------------------------------------------------+