Thursday, November 11, 2010

Constraining rows of output: LIMIT and OFFSET

Database users often want to take a quick look at just a few rows of tables to get a sense of the data within them. We can do that in PostgreSQL with the LIMIT clause. Let's examine this table:

create table data(name varchar, value int);
insert into data values('joe',1);
insert into data values('bob',2);
insert into data values('jane',3);
insert into data values('anne',4);
insert into data values('kate',5);

apple=# select * from data;
 name | value
------+-------
 joe  |     1
 bob  |     2
 jane |     3
 anne |     4
 kate |     5
(5 rows)

Now, let's select just the first two rows:

apple=# select * from data limit 2;
 name | value
------+-------
 joe  |     1
 bob  |     2
(2 rows)


LIMIT is especially important when dealing with large tables. If you do select * from [very large table] in psql, it will generate the result set then show the first screen of rows. You can then press space to see the next screen of results or 'q' to quit and get back to the psql command line. Importantly, it may take a while to process the select because there are so many rows even if you only want to see the first screen. LIMITing results explicitly however generates results much more quickly.

What if wanted a set of rows that occurred later in the sequence? For this we can use PostgreSQL's OFFSET in combination with LIMIT. Both of these are PostgreSQL specific. (They have other names in other databases: MySQL uses LIMIT, ORACLE use ROWNUM <= number and other use TOP.) So, if we wanted the 4th and 5th row we offset the first 3 so that the 4th row becomes the start of our set and we specify a limit to say that we only want 2 rows from that point:

apple=# select * from data limit 2 offset 3;
 name | value
------+-------
 anne |     4
 kate |     5
(2 rows)

The order of LIMIT and OFFSET does not matter. This gives the same result:

apple=# select * from data offset 3 limit 2;
 name | value
------+-------
 anne |     4
 kate |     5
(2 rows)

We can use these two in combination with ORDER BY. Let's order by name:

apple=# select * from data order by name;
 name | value
------+-------
 anne |     4
 bob  |     2
 jane |     3
 joe  |     1
 kate |     5
(5 rows)

Now if we offset 3 and limit 2 we get the following instead:

apple=# select * from data order by name limit 2 offset 3;
 name | value
------+-------
 joe  |     1
 kate |     5
(2 rows)

Thus, clearly ORDER BY occurs first and then the LIMIT...OFFSET is applied.

2 comments:

  1. thank you so much for doing this!!!!

    ReplyDelete
  2. clients would even call this a 10.5 rendition. We have

    ReplyDelete