The OFFSET
clause in a SELECT
query causes the result set to start some
number of rows after the logical first item. The result set is numbered starting from zero, so OFFSET
0
produces the same result as leaving out the OFFSET
clause. Always use this clause
in combination with ORDER BY
(so that it is clear which item should be first, second, and so
on) and LIMIT
(so that the result set covers a bounded range, such as items 0-9, 100-199,
and so on).
In Impala 1.2.1 and higher, you can combine a LIMIT
clause with an
OFFSET
clause to produce a small result set that is different from a
top-N query, for example, to return items 11 through 20. This technique can be used to
simulate "paged" results. Because Impala queries typically involve substantial
amounts of I/O, use this technique only for compatibility in cases where you cannot
rewrite the application logic. For best performance and scalability, wherever practical,
query as many items as you expect to need, cache them on the application side, and
display small groups of results to users using application logic.
Examples:
The following example shows how you could run a "paging" query originally written for a traditional database application. Because typical Impala queries process megabytes or gigabytes of data and read large data files from disk each time, it is inefficient to run a separate query to retrieve each small group of items. Use this technique only for compatibility while porting older applications, then rewrite the application code to use a single query with a large result set, and display pages of results from the cached result set.
[localhost:21000] > create table numbers (x int);
[localhost:21000] > insert into numbers select x from very_long_sequence;
Inserted 1000000 rows in 1.34s
[localhost:21000] > select x from numbers order by x limit 5 offset 0;
+----+
| x |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
[localhost:21000] > select x from numbers order by x limit 5 offset 5;
+----+
| x |
+----+
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
+----+