Configuring Impala Delegation for Clients

When users submit Impala queries through a separate client application, such as Hue or a business intelligence tool, typically all requests are treated as coming from the same user. In Impala 1.2 and higher, Impala supports "delegation" where users whose names you specify can delegate the execution of a query to another user. The query runs with the privileges of the delegated user, not the original authenticated user.

In Impala 3.1 and higher, you can also delegate using groups. Instead of listing a large number of delegated users, you can create a group of those users and specify the delegated group name in the impalad startup option. The client sends the delegated user name, and Impala checks whether the delegated user belongs to a delegated group.

The name of the delegated user is passed using the HiveServer2 protocol configuration property impala.doas.user when the client connects to Impala.

Currently, the delegation feature is available only for Impala queries submitted through application interfaces such as Hue and BI tools. For example, Impala cannot issue queries using the privileges of the HDFS user.

  • When delegation is enabled in Impala, Impala clients should take extra care to prevent unauthorized access on behalf of the users who can be delegated to.
  • Impala requires Apache Ranger on the cluster to enable delegation.

The delegation feature is enabled by the startup options for impalad: --authorized_proxy_user_config and --authorized_proxy_group_config.

The syntax for the options is:

  • Entries for different authenticated users are delimited with a semicolon (;).
  • Within an entry, the list of delegated users or groups is delimited with a comma (,) by default.
  • Use the --authorized_proxy_user_config_delimiter startup option to change the user delimiter from the default comma to another character.
  • Use the --authorized_proxy_group_config_delimiter startup option to change the group delimiter from the default comma to another character.
  • A wildcard (*) delegates to any user or any group, e.g. --authorized_proxy_group_config=hue=*. When you specify the option on the command line, use single quotes or escape characters so that the shell does not expand any * characters.
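
The syntax rules above can be sketched as a small parser. This is only an illustration of the documented format, not Impala's own parsing code. The function name parse_proxy_config and the user and group names (alice, bob, etl_tool, analysts, finance) are hypothetical, and tolerating surrounding whitespace or empty segments is a choice of this sketch rather than documented Impala behavior.

```python
def parse_proxy_config(value, delimiter=","):
    """Parse 'auth_user=name1,name2;other_user=*' into {auth_user: {names}}.

    Illustrative only: mirrors the documented syntax, not Impala's parser.
    """
    mapping = {}
    for entry in value.split(";"):  # ';' separates entries
        entry = entry.strip()
        if not entry:
            continue  # sketch-only leniency for a trailing ';'
        authenticated, sep, delegated = entry.partition("=")
        if not sep or not authenticated.strip():
            raise ValueError(f"malformed entry: {entry!r}")
        # The delimiter (comma by default) separates delegated users or groups.
        names = {n.strip() for n in delegated.split(delimiter) if n.strip()}
        mapping.setdefault(authenticated.strip(), set()).update(names)
    return mapping


# Hypothetical value for --authorized_proxy_user_config
user_config = parse_proxy_config("hue=alice,bob;etl_tool=*")

# Hypothetical value for --authorized_proxy_group_config, assuming
# --authorized_proxy_group_config_delimiter was set to '|'
group_config = parse_proxy_config("hue=analysts|finance", delimiter="|")
```

Here user_config maps hue to alice and bob, and etl_tool to the wildcard; group_config maps hue to the analysts and finance groups.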

When you start Impala with the --authorized_proxy_user_config=authenticated_user=delegated_user or --authorized_proxy_group_config=authenticated_user=delegated_group option:

  • Authentication is based on the user on the left hand side (authenticated_user).
  • Authorization is based on the right hand side user(s) or group(s) (delegated_user, delegated_group).
  • When opening a client connection, the client must provide a delegated user name via the HiveServer2 protocol property impala.doas.user or DelegationUID.

    When the client connects over HTTP, the doAs parameter can be specified in the HTTP path, e.g. /?doAs=delegated_user.

  • It is not necessary for authenticated_user to have the permission to access/edit files.
  • It is not necessary for the delegated users to have access to the service via Kerberos.
  • delegated_user and delegated_group must exist in the OS.
  • For group delegation, use the JNI-based Hadoop group mapping providers, such as JniBasedUnixGroupsMappingWithFallback and JniBasedUnixGroupsNetgroupMappingWithFallback.
  • The ShellBasedUnixGroupsNetgroupMapping and ShellBasedUnixGroupsMapping Hadoop group mapping providers are not supported in Impala group delegation by default. To use them, enable the enable_shell_based_groups_mapping flag.
  • In Impala, user() returns authenticated_user and effective_user() returns the delegated user that the client specified.
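
As a rough illustration of how a client passes the delegated user name, the sketch below builds the two forms mentioned in the list above: the impala.doas.user session property and the doAs parameter in the HTTP path. The helper names doas_session_config and doas_http_path are hypothetical, and the base path "/" is taken from the example above. Real client drivers usually expose their own setting for this, so treat the code as a picture of the values involved rather than as any driver's API.

```python
from urllib.parse import urlencode


def doas_session_config(delegated_user):
    """Session configuration a HiveServer2 client would send on connect."""
    return {"impala.doas.user": delegated_user}


def doas_http_path(delegated_user, base_path="/"):
    """HTTP path carrying the doAs parameter, with the name URL-encoded."""
    return f"{base_path}?{urlencode({'doAs': delegated_user})}"


session_config = doas_session_config("delegated_user")
http_path = doas_http_path("delegated_user")
print(session_config)
print(http_path)
```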

The user or group delegation process works as follows:
  1. The impalad daemon starts with one of the following options:
    • --authorized_proxy_user_config=authenticated_user=delegated_user
    • --authorized_proxy_group_config=authenticated_user=delegated_group
  2. A client connects to Impala via the HiveServer2 protocol with the impala.doas.user configuration property set, e.g. the connected user is authenticated_user and impala.doas.user=delegated_user.
  3. The client user authenticated_user sends a request to Impala as the delegated user delegated_user.
  4. Impala checks authorization:
    • In user delegation, Impala checks if delegated_user is in the list of authorized delegate users for the user authenticated_user.
    • In group delegation, Impala checks if delegated_user belongs to one of the delegated groups for the user authenticated_user (delegated_group in this example).
  5. If delegated_user is an authorized delegated user for authenticated_user, the request is executed as the delegated user delegated_user.
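
The authorization check in step 4 can be pictured with the minimal sketch below. It is not Impala's implementation. The function is_delegation_allowed, the groups_of lookup, and all user and group names are hypothetical. Treating a group wildcard as "allow any delegated user" is also an assumption of the sketch. In a real deployment, group membership comes from the configured Hadoop group mapping provider, not from an in-memory table.

```python
# Hypothetical parsed forms of the two startup options:
# authenticated user -> set of delegated users or groups.
USER_CONFIG = {"hue": {"alice", "bob"}}
GROUP_CONFIG = {"hue": {"analysts"}}

# Stand-in for the group mapping provider (assumption for this sketch).
GROUPS = {"carol": ["analysts"], "dave": ["interns"]}


def groups_of(user):
    """Return the groups a user belongs to, or an empty list if unknown."""
    return GROUPS.get(user, [])


def is_delegation_allowed(authenticated_user, delegated_user,
                          user_config, group_config, group_lookup):
    """Return True if authenticated_user may run queries as delegated_user."""
    # User delegation: is the name listed, or is the wildcard present?
    allowed_users = user_config.get(authenticated_user, set())
    if "*" in allowed_users or delegated_user in allowed_users:
        return True
    # Group delegation: does the delegated user belong to a listed group?
    allowed_groups = group_config.get(authenticated_user, set())
    if "*" in allowed_groups:
        return True  # assumption: the wildcard admits any delegated user
    return bool(allowed_groups & set(group_lookup(delegated_user)))


for name in ("alice", "carol", "dave"):
    allowed = is_delegation_allowed("hue", name, USER_CONFIG, GROUP_CONFIG, groups_of)
    print(name, allowed)
```

In this example, alice passes through user delegation, carol passes through membership in the analysts group, and dave is rejected because neither rule matches.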

See Modifying Impala Startup Options for details about adding or changing impalad startup options.

See this blog post for background information about the delegation capability in HiveServer2.

To set up authentication for the delegated users: