PARQUET_READ_PAGE_INDEX Query Option
Use the PARQUET_READ_PAGE_INDEX query option to enable or disable use of the Parquet page index during scans. The page index contains min/max statistics at page-level granularity. It can be used to skip pages and rows that do not match the conditions in the WHERE clause.
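To illustrate the idea of skipping pages from min/max statistics, here is a minimal conceptual sketch. It is not Impala's implementation: the names `PageStats`, `page_may_match`, and `pages_to_read` are hypothetical, values are plain integers, and NULL handling and type-specific ordering are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Hypothetical per-page min/max entry, loosely modeled on a page index."""
    min_value: int
    max_value: int


def page_may_match(stats: PageStats, op: str, constant: int) -> bool:
    """Return False only when min/max prove that no row in the page
    can satisfy `slot <op> constant`; otherwise the page must be read."""
    if op == "LT":
        return stats.min_value < constant
    if op == "LE":
        return stats.min_value <= constant
    if op == "GT":
        return stats.max_value > constant
    if op == "GE":
        return stats.max_value >= constant
    if op == "EQ":
        return stats.min_value <= constant <= stats.max_value
    # Unknown operator: stay conservative and keep the page.
    return True


def pages_to_read(pages: List[PageStats], op: str, constant: int) -> List[int]:
    """Return the indexes of pages that cannot be skipped."""
    return [i for i, p in enumerate(pages) if page_may_match(p, op, constant)]


# Three pages covering the value ranges 0-9, 10-19, and 20-29.
pages = [PageStats(0, 9), PageStats(10, 19), PageStats(20, 29)]
# For a predicate like `x GT 15`, the first page can be skipped.
remaining = pages_to_read(pages, "GT", 15)
```

The check is deliberately one-sided: statistics can only prove that a page contains no matching rows, so any page that might match still has to be read and filtered row by row.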
This option enables the same optimization as the PARQUET_READ_STATISTICS query option, but at the finer-grained page level.
Impala supports filtering based on Parquet statistics:
- Of the types: Boolean, Integer, Decimal, String, Timestamp
- For simple predicates of the forms <slot> <op> <constant> or <constant> <op> <slot>, where <op> is LT, LE, GE, GT, or EQ
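The two predicate forms are equivalent once the operator is mirrored. The sketch below shows one way a `<constant> <op> <slot>` predicate could be rewritten into the `<slot> <op> <constant>` form. The names `MIRRORED_OP` and `to_slot_first` are hypothetical and only illustrate the equivalence; this is not how Impala represents predicates internally.

```python
from typing import Tuple

# Swapping the operands flips the comparison direction; EQ is symmetric.
MIRRORED_OP = {"LT": "GT", "LE": "GE", "GT": "LT", "GE": "LE", "EQ": "EQ"}


def to_slot_first(constant: int, op: str, slot: str) -> Tuple[str, str, int]:
    """Rewrite `<constant> <op> <slot>` as an equivalent `<slot> <op> <constant>`."""
    if op not in MIRRORED_OP:
        raise ValueError(f"unsupported operator: {op}")
    return (slot, MIRRORED_OP[op], constant)


# `5 LT x` (5 < x) is the same condition as `x GT 5` (x > 5).
rewritten = to_slot_first(5, "LT", "x")
```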
The supported values for the query option are:
- true (1): Read the page-level statistics from the Parquet page index during query processing and filter out pages based on the statistics.
- false (0): Do not use the Parquet page index.
- Any other values are treated as false.
Type: Boolean
Default: TRUE