Views are lightweight logical constructs that act as aliases for queries. You can specify a view name in a
query (a SELECT
statement or the SELECT
portion of an
INSERT
statement) where you would usually specify a table name.
A view lets you:
-- Take a complicated reporting query, plug it into a CREATE VIEW statement...
create view v1 as select c1, c2, avg(c3) from t1 group by c3 order by c1 desc limit 10;
-- ... and now you can produce the report with 1 line of code.
select * from v1;
create view v2 as select t1.c1, t1.c2, t2.c3 from t1 join t2 on (t1.id = t2.id);
-- This simple query is safer to embed in reporting applications than the longer query above.
-- The view definition can remain stable even if the structure of the underlying tables changes.
select c1, c2, c3 from v2;
create view average_price_by_category as select category, avg(price) as avg_price from products group by category;
create view expensive_categories as select category, avg_price from average_price_by_category order by avg_price desc limit 10000;
create view top_10_expensive_categories as select category, avg_price from expensive_categories limit 10;
This technique lets you build up several more or less granular variations of the same query, and switch
between them when appropriate.
-- The original tables might have cryptic names inherited from a legacy system.
create view action_items as select rrptsk as assignee, treq as due_date, dmisc as notes from vxy_t1_br;
-- You can leave original names for compatibility, build new applications using more intuitive ones.
select assignee, due_date, notes from action_items;
create table slow (x int, s string) stored as textfile;
create view report as select s from slow where x between 20 and 30;
-- Query is kind of slow due to inefficient table definition, but it works.
select * from report;
create table fast (s string) partitioned by (x int) stored as parquet;
-- ...Copy data from SLOW to FAST. Queries against REPORT view continue to work...
-- After changing the view definition, queries will be faster due to partitioning,
-- binary format, and compression in the new table.
alter view report as select s from fast where x between 20 and 30;
select * from report;
The SQL statements that configure views are CREATE VIEW Statement, ALTER VIEW Statement, and DROP VIEW Statement. You can specify view names when querying data (SELECT Statement) and copying data from one table to another (INSERT Statement). The WITH clause creates an inline view, that only exists for the duration of a single query.
[localhost:21000] > create view trivial as select * from customer;
[localhost:21000] > create view some_columns as select c_first_name, c_last_name, c_login from customer;
[localhost:21000] > select * from some_columns limit 5;
Query finished, fetching results ...
+--------------+-------------+---------+
| c_first_name | c_last_name | c_login |
+--------------+-------------+---------+
| Javier | Lewis | |
| Amy | Moses | |
| Latisha | Hamilton | |
| Michael | White | |
| Robert | Moran | |
+--------------+-------------+---------+
[localhost:21000] > create view ordered_results as select * from some_columns order by c_last_name desc, c_first_name desc limit 1000;
[localhost:21000] > select * from ordered_results limit 5;
Query: select * from ordered_results limit 5
Query finished, fetching results ...
+--------------+-------------+---------+
| c_first_name | c_last_name | c_login |
+--------------+-------------+---------+
| Thomas | Zuniga | |
| Sarah | Zuniga | |
| Norma | Zuniga | |
| Lloyd | Zuniga | |
| Lisa | Zuniga | |
+--------------+-------------+---------+
Returned 5 row(s) in 0.48s
The previous example uses descending order for ORDERED_RESULTS
because in the sample TPCD-H
data, there are some rows with empty strings for both C_FIRST_NAME
and
C_LAST_NAME
, making the lowest-ordered names unuseful in a sample query.
create view visitors_by_day as select day, count(distinct visitors) as howmany from web_traffic group by day;
create view top_10_days as select day, howmany from visitors_by_day order by howmany limit 10;
select * from top_10_days;
Usage notes:
DESCRIBE FORMATTED
statement,
which shows the query from the original CREATE VIEW
statement:
[localhost:21000] > create view v1 as select * from t1;
[localhost:21000] > describe formatted v1;
Query finished, fetching results ...
+------------------------------+------------------------------+------------+
| name | type | comment |
+------------------------------+------------------------------+------------+
| # col_name | data_type | comment |
| | NULL | NULL |
| x | int | None |
| y | int | None |
| s | string | None |
| | NULL | NULL |
| # Detailed Table Information | NULL | NULL |
| Database: | views | NULL |
| Owner: | doc_demo | NULL |
| CreateTime: | Mon Jul 08 15:56:27 EDT 2013 | NULL |
| LastAccessTime: | UNKNOWN | NULL |
| Protect Mode: | None | NULL |
| Retention: | 0 | NULL |
| Table Type: | VIRTUAL_VIEW | NULL |
| Table Parameters: | NULL | NULL |
| | transient_lastDdlTime | 1373313387 |
| | NULL | NULL |
| # Storage Information | NULL | NULL |
| SerDe Library: | null | NULL |
| InputFormat: | null | NULL |
| OutputFormat: | null | NULL |
| Compressed: | No | NULL |
| Num Buckets: | 0 | NULL |
| Bucket Columns: | [] | NULL |
| Sort Columns: | [] | NULL |
| | NULL | NULL |
| # View Information | NULL | NULL |
| View Original Text: | SELECT * FROM t1 | NULL |
| View Expanded Text: | SELECT * FROM t1 | NULL |
+------------------------------+------------------------------+------------+
Prior to Impala 1.4.0, it was not possible to use the CREATE TABLE LIKE
view_name
syntax. In Impala 1.4.0 and higher, you can create
a table with the same column definitions as a view using the CREATE TABLE
LIKE
technique. Although CREATE TABLE LIKE
normally inherits
the file format of the original table, a view has no underlying file format, so
CREATE TABLE LIKE view_name
produces a text table by
default. To specify a different file format, include a STORED AS
file_format
clause at the end of the CREATE TABLE
LIKE
statement.
Complex type considerations:
For tables containing complex type columns (ARRAY
,
STRUCT
, or MAP
), you typically use join queries to
refer to the complex values. You can use views to hide the join notation, making such
tables seem like traditional denormalized tables, and making those tables queryable by
business intelligence tools that do not have built-in support for those complex types.
See Accessing Complex Type Data in Flattened Form Using Views for details.
The STRAIGHT_JOIN
hint affects the join order of table references in
the query block containing the hint. It does not affect the join order of nested
queries, such as views, inline views, or WHERE
-clause subqueries. To
use this hint for performance tuning of complex queries, apply the hint to all query
blocks that need a fixed join order.
Restrictions:
You cannot insert into an Impala view. (In some database systems, this operation is allowed and inserts
rows into the base table.) You can use a view name on the right-hand side of an INSERT
statement, in the SELECT
part.
If a view applies to a partitioned table, any partition pruning considers the clauses on
both the original query and any additional WHERE
predicates in the
query that refers to the view. Prior to Impala 1.4, only the WHERE
clauses on the original query from the CREATE VIEW
statement were used
for partition pruning.
ORDER BY
clause without an additional LIMIT
clause
is ignored in any view definition. If you need to sort the entire result set from a
view, use an ORDER BY
clause in the SELECT
statement
that queries the view. You can still make a simple "top 10" report by combining the
ORDER BY
and LIMIT
clauses in the same view
definition:
[localhost:21000] > create table unsorted (x bigint);
[localhost:21000] > insert into unsorted values (1), (9), (3), (7), (5), (8), (4), (6), (2);
[localhost:21000] > create view sorted_view as select x from unsorted order by x;
[localhost:21000] > select x from sorted_view; -- ORDER BY clause in view has no effect.
+---+
| x |
+---+
| 1 |
| 9 |
| 3 |
| 7 |
| 5 |
| 8 |
| 4 |
| 6 |
| 2 |
+---+
[localhost:21000] > select x from sorted_view order by x; -- View query requires ORDER BY at outermost level.
+---+
| x |
+---+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
[localhost:21000] > create view top_3_view as select x from unsorted order by x limit 3;
[localhost:21000] > select x from top_3_view; -- ORDER BY and LIMIT together in view definition are preserved.
+---+
| x |
+---+
| 1 |
| 2 |
| 3 |
+---+
The TABLESAMPLE
clause of the SELECT
statement does
not apply to a table reference derived from a view, a subquery, or anything other than a
real base table. This clause only works for tables backed by HDFS or HDFS-like data
files, therefore it does not apply to Kudu or HBase tables.
Related statements: CREATE VIEW Statement, ALTER VIEW Statement, DROP VIEW Statement