Overview of Impala Databases
In Impala, a database is a logical container for a group of tables. Each database defines a separate namespace. Within a database, you can refer to the tables inside it using their unqualified names. Different databases can contain tables with identical names.
Creating a database is a lightweight operation. There are minimal database-specific properties to configure, such as LOCATION and COMMENT. There is no ALTER DATABASE statement.
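As a minimal sketch of how lightweight the operation is, the following statements create a database with an optional comment and then list the available databases. The database name and comment text are hypothetical, chosen only for illustration.

```sql
-- Hypothetical database name and comment, for illustration only.
CREATE DATABASE IF NOT EXISTS sales_app
  COMMENT 'Tables for the sales reporting application';

-- Confirm that the new database now exists.
SHOW DATABASES;
```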
Typically, you create a separate database for each project or application, to avoid naming conflicts between tables and to make clear which tables are related to each other. The USE statement lets you switch between databases. Unqualified references to tables, views, and functions refer to objects within the current database. You can also refer to objects in other databases by using qualified names of the form db_name.object_name.
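The difference between unqualified and qualified names might look like the following sketch. The databases sales_app and hr_app and the customers tables are hypothetical; the point is only that the same table name can exist in two databases without conflict.

```sql
-- Hypothetical databases and tables, for illustration only.
CREATE DATABASE IF NOT EXISTS sales_app;
CREATE DATABASE IF NOT EXISTS hr_app;

-- The same table name can be used in both databases.
CREATE TABLE IF NOT EXISTS sales_app.customers (id BIGINT, name STRING);
CREATE TABLE IF NOT EXISTS hr_app.customers (id BIGINT, name STRING);

-- Make sales_app the current database.
USE sales_app;

-- Unqualified name: resolves to sales_app.customers.
SELECT COUNT(*) FROM customers;

-- Qualified name: reaches a table in another database without switching.
SELECT COUNT(*) FROM hr_app.customers;
```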
Each database is physically represented by a directory in HDFS. When you do not specify a LOCATION attribute, the directory is located in the Impala data directory with the associated tables managed by Impala. When you do specify a LOCATION attribute, any read and write operations for tables in that database are relative to the specified HDFS directory.
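To illustrate the LOCATION attribute, the sketch below creates a database under an explicit HDFS directory and then inspects it. The database name and the HDFS path are assumptions made for this example; substitute a directory that exists in your cluster and that the Impala service user can read and write.

```sql
-- Hypothetical database name and HDFS path, for illustration only.
CREATE DATABASE IF NOT EXISTS staging_db
  LOCATION '/user/etl/staging_db';

-- Show the database name, its HDFS location, and its comment.
DESCRIBE DATABASE staging_db;
```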
There is a special database, named default, where you begin when you connect to Impala. Tables created in default are physically located one level higher in HDFS than all the user-created databases.
Impala includes another predefined database, _impala_builtins, that serves as the location for the built-in functions. To see the built-in functions, use a statement like the following:
show functions in _impala_builtins;
show functions in _impala_builtins like '*substring*';