Re: [SQL] OFFSET impact on Performance???

From: Christopher Browne
Subject: Re: [SQL] OFFSET impact on Performance???
Date:
Msg-id: m38y6fqwxr.fsf@knuth.knuth.cbbrowne.com
In response to: Re: [SQL] OFFSET impact on Performance???  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List: pgsql-performance
In an attempt to throw the authorities off his trail, merlin.moncure@rcsonline.com ("Merlin Moncure") transmitted:
> Alex wrote:
>> How do you create a temporary view that has only a small subset of the
>> data from the DB init?  (Links to docs are fine - I can read ;).  My
>> query isn't all that complex, and my number of records might be from
>> 10 to 2k depending on how I implement it.
>
> Well, you can't.  My point was that the traditional query/view
> approach is often more appropriate for these cases.

Actually, you can if you assume you can "temporarily materialize" that
view.

You take the initial query and materialize it into a temporary table
which can then be used to browse "detail."

Thus, suppose you've got a case where the selection criteria draw in
8000 objects/transactions, of which you only want to fit 20/page.

It's ugly and slow to fetch the 15th page this way; you essentially
reprocess the whole result set from scratch each time:

  select [details] from [big table] where [criteria]
    order by [something]
    offset 280 limit 20;

Instead, you might start out by doing:

  select [key fields] into temp table my_query
  from [big table] where [criteria];

  create index my_query_idx on my_query(interesting fields);

With 8000 records of just the key fields, and 8 kB pages, the number
of pages in the temp table will roughly equal the number of bytes per
record (8000 records / 8192 bytes per page ≈ 1 page per byte of record
width), which is probably pretty small.

Then, you use a join on my_query to pull the bits you want:

  select [big table.details] from [big table]
  join (select * from my_query
        order by [something] offset 280 limit 20) as page
    on [join criteria between page and big table]
  order by [something];

For this to be fast, my_query must be compact, but that should surely
be so.
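The two steps above can be sketched end to end. This is a minimal
illustration only: the table name `big_table`, the `category` filter,
and the use of SQLite (via Python's sqlite3 module) in place of
PostgreSQL are all assumptions made for the sake of a self-contained,
runnable example.

```python
import sqlite3

# Hypothetical stand-in for the "big table" (names assumed, not from
# the original post).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE big_table (id INTEGER PRIMARY KEY, category TEXT, details TEXT)"
)
cur.executemany(
    "INSERT INTO big_table (id, category, details) VALUES (?, ?, ?)",
    [(i, "wanted" if i % 3 == 0 else "other", "row %d" % i)
     for i in range(1, 9001)],
)

# Step 1: materialize just the matching key fields into a temp table,
# then index it.
cur.execute(
    "CREATE TEMP TABLE my_query AS "
    "SELECT id FROM big_table WHERE category = 'wanted'"
)
cur.execute("CREATE INDEX my_query_idx ON my_query(id)")

# Step 2: page through the small temp table and join back to the big
# table for the details.  Page 15 at 20 rows/page -> offset 280.
page_size, page = 20, 15
offset = (page - 1) * page_size
cur.execute(
    """
    SELECT b.id, b.details
    FROM big_table b
    JOIN (SELECT id FROM my_query
          ORDER BY id LIMIT ? OFFSET ?) p
      ON b.id = p.id
    ORDER BY b.id
    """,
    (page_size, offset),
)
rows = cur.fetchall()
print(len(rows))    # 20 rows on the 15th page
print(rows[0][0])   # 843: the 281st matching id (3 * 281)
```

The OFFSET still scans past 280 rows, but now over a narrow,
single-column temp table rather than the wide base table.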

The big table may be 20 million records; for the result set to be even
vaguely browsable, my_query ought to be relatively small so you can
pull subsets reasonably efficiently.

This also has a merit over paging through a dynamic, possibly-changing
big table: you won't unexpectedly see the result set change size
between pages.

This strikes me as a pretty slick way to handle "data warehouse-style"
browsing...
--
output = ("cbbrowne" "@" "gmail.com")
http://www.ntlug.org/~cbbrowne/oses.html
The first cup of coffee recapitulates phylogeny.
