The DISTINCT
operator in a SELECT
statement filters the
result set to remove duplicates.
-- Returns the unique values from one column.
-- NULL is included in the set of values if any rows have a NULL in this column.
SELECT DISTINCT c_birth_country FROM customer;
-- Returns the unique combinations of values from multiple columns.
SELECT DISTINCT c_salutation, c_last_name FROM customer;
You can use DISTINCT
in combination with an aggregation function,
typically COUNT()
, to find how many different values a column contains.
-- Counts the unique values from one column.
-- NULL is not included as a distinct value in the count.
SELECT COUNT(DISTINCT c_birth_country) FROM customer;
-- Counts the unique combinations of values from multiple columns.
SELECT COUNT(DISTINCT c_salutation, c_last_name) FROM customer;
Zero-length strings: For purposes of clauses such as DISTINCT
and GROUP BY
, Impala considers zero-length strings
(""
), NULL
, and space to all be different values.
In contrast with some database systems that always return DISTINCT
values in sorted order, Impala does not do any ordering of DISTINCT
values. Always include an ORDER BY
clause if you need the values in
alphabetical or numeric sorted order.