Overview of Impala Databases
In Impala, a database is a logical container for a group of tables. Each database defines a separate namespace. Within a database, you can refer to the tables inside it using their unqualified names. Different databases can contain tables with identical names.
Creating a database is a lightweight operation. There are minimal database-specific properties to configure, only LOCATION and COMMENT. There is no ALTER DATABASE statement.
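As a minimal sketch of how little a database definition involves, the following statements create and then remove a database with a comment. The database name and comment text are hypothetical and chosen only for illustration.

```sql
-- Hypothetical database name and comment, for illustration only.
CREATE DATABASE IF NOT EXISTS sales_reporting
  COMMENT 'Tables for the sales reporting project';

-- Removing an empty database is equally lightweight.
DROP DATABASE IF EXISTS sales_reporting;
```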
Typically, you create a separate database for each project or application, to avoid naming conflicts between tables and to make clear which tables are related to each other. The USE statement lets you switch between databases. Unqualified references to tables, views, and functions refer to objects within the current database. You can also refer to objects in other databases by using qualified names of the form dbname.object_name.
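The difference between unqualified and qualified names can be sketched as follows. The databases project_a and project_b and the table events are hypothetical names used only for this example.

```sql
-- Hypothetical databases and tables, for illustration only.
CREATE DATABASE IF NOT EXISTS project_a;
CREATE DATABASE IF NOT EXISTS project_b;

-- Make project_a the current database.
USE project_a;

-- Unqualified name: the table is created as project_a.events.
CREATE TABLE IF NOT EXISTS events (id BIGINT, name STRING);

-- The same table name in a different database does not conflict.
CREATE TABLE IF NOT EXISTS project_b.events (id BIGINT, name STRING);

-- Unqualified reference resolves against the current database.
SELECT COUNT(*) FROM events;

-- Qualified reference reaches a table in another database.
SELECT COUNT(*) FROM project_b.events;
```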
Each database is physically represented by a directory in HDFS. When you do not specify a LOCATION attribute, the directory is located in the Impala data directory with the associated tables managed by Impala. When you do specify a LOCATION attribute, any read and write operations for tables in that database are relative to the specified HDFS directory.
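For illustration, a database with an explicit LOCATION might be created as shown here. The database name and HDFS path are hypothetical, and the sketch assumes the path is one that the Impala service user can read and write. The DESCRIBE DATABASE statement is assumed to be available in the Impala release in use.

```sql
-- Hypothetical database name and HDFS path, for illustration only.
CREATE DATABASE IF NOT EXISTS staging_data
  LOCATION '/user/etl/staging_data';

-- Display the database's HDFS location and comment
-- (assumes a release that supports DESCRIBE DATABASE).
DESCRIBE DATABASE staging_data;
```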
There is a special database, named default, where you begin when you connect to Impala. Tables created in default are physically located one level higher in HDFS than all the user-created databases.
Impala includes another predefined database, _impala_builtins, that serves as the location for the built-in functions. To see the built-in functions, use statements like the following:
show functions in _impala_builtins;
show functions in _impala_builtins like '*substring*';
Related statements:
CREATE DATABASE Statement, DROP DATABASE Statement, USE Statement, SHOW DATABASES