Обсуждение: speed w/ OFFSET/LIMIT

Поиск
Список
Период
Сортировка

speed w/ OFFSET/LIMIT

От
Damien
Дата:
Hi !

I'm running a pretty simple select query on a pretty large table (70000
records). This table has some "flag" fields, each one textually explained by
another table ( flag VARCHAR(2), flag_details VARCHAR(60))

SELECT t.* , t1.flag1_details , ... , tn.flagn_details
FROM table t
NATURAL JOIN t1
NATURAL JOIN ...
NATURAL JOIN tn
ORDER BY main_field OFFSET x LIMIT 50

There is no where statement, this query is for display purposes on a web page.
My problem is, where OFFSET is low, execution takes only a few milli-seconds,
but where OFFSET is high (50 last results for example), execution can take 10
seconds...
Is there anything I can do to speed up such a query ?


Re: speed w/ OFFSET/LIMIT

От
Damien
Дата:
I found a kind of workaround by :

- first creating a view :
CREATE VIEW table_view AS
SELECT *
FROM table
ORDER BY main_field OFFSET x LIMIT 50

- then joining required tables
SELECT t.* , t1.flag1_details , ... , tn.flagn_details
FROM table_view t
JOINs....

A better solution anyway ?


On Tuesday 27 May 2003 15:45, Damien wrote:
> Hi !
>
> I'm running a pretty simple select query on a pretty large table (70000
> records). This table has some "flag" fields, each one textually explained
> by another table ( flag VARCHAR(2), flag_details VARCHAR(60))
>
> SELECT t.* , t1.flag1_details , ... , tn.flagn_details
> FROM table t
> NATURAL JOIN t1
> NATURAL JOIN ...
> NATURAL JOIN tn
> ORDER BY main_field OFFSET x LIMIT 50
>
> There is no where statement, this query is for display purposes on a web
> page. My problem is, where OFFSET is low, execution takes only a few
> milli-seconds, but where OFFSET is high (50 last results for example),
> execution can take 10 seconds...
> Is there anything I can do to speed up such a query ?
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html


Re: speed w/ OFFSET/LIMIT

От
Stephan Szabo
Дата:
On Tue, 27 May 2003, Damien wrote:

> I'm running a pretty simple select query on a pretty large table (70000
> records). This table has some "flag" fields, each one textually explained by
> another table ( flag VARCHAR(2), flag_details VARCHAR(60))
>
> SELECT t.* , t1.flag1_details , ... , tn.flagn_details
> FROM table t
> NATURAL JOIN t1
> NATURAL JOIN ...
> NATURAL JOIN tn
> ORDER BY main_field OFFSET x LIMIT 50
>
> There is no where statement, this query is for display purposes on a web page.
> My problem is, where OFFSET is low, execution takes only a few milli-seconds,
> but where OFFSET is high (50 last results for example), execution can take 10
> seconds...
> Is there anything I can do to speed up such a query ?

Can you send exact query and explain analyze output for each?  Since it
has to get the x+50 I'm not sure what can be done, but the explain output
will help.

As a side note, the workaround in your following message works as long as
the joins give only one match, but won't if they don't (the results are
different in that case).




Re: speed w/ OFFSET/LIMIT

От
Damien
Дата:
On Tuesday 27 May 2003 17:47, Stephan Szabo wrote:
> Can you send exact query and explain analyze output for each?  Since it
> has to get the x+50 I'm not sure what can be done, but the explain output
> will help.
>
> As a side note, the workaround in your following message works as long as
> the joins give only one match, but won't if they don't (the results are
> different in that case).

Here is the output. As you can see the explainations really differs depending of the given offset :

optima=# EXPLAIN SELECT a.* , p.palette , e.etat_detail , s.status_detail
optima-# FROM da4adresse a
optima-# LEFT OUTER JOIN da4paletier p ON p.adresse = a.adresse
optima-# JOIN da4status s ON s.status = a.status
optima-# JOIN da4etat e ON e.etat = a.etat
optima-# ORDER BY a.adresse LIMIT 50 OFFSET 1500 ;
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Limit  (cost=4022.58..4156.63 rows=50 width=154)
   ->  Nested Loop  (cost=1.05..194138.88 rows=72412 width=154)
         Join Filter: ("inner".etat = "outer".etat)
         ->  Nested Loop  (cost=1.05..115209.80 rows=72412 width=139)
               Join Filter: ("inner".status = "outer".status)
               ->  Merge Join  (cost=1.05..37909.99 rows=72412 width=124)
                     Merge Cond: ("outer".adresse = "inner".adresse)
                     ->  Index Scan using pk_adresse on da4adresse a  (cost=0.00..41296.38 rows=72412 width=106)
                     ->  Sort  (cost=1.05..1.06 rows=3 width=18)
                           Sort Key: p.adresse
                           ->  Seq Scan on da4paletier p  (cost=0.00..1.03 rows=3 width=18)
               ->  Seq Scan on da4status s  (cost=0.00..1.03 rows=3 width=15)
         ->  Seq Scan on da4etat e  (cost=0.00..1.04 rows=4 width=15)
(13 rows)

optima=# EXPLAIN SELECT a.* , p.palette , e.etat_detail , s.status_detail
optima-# FROM da4adresse a
optima-# LEFT OUTER JOIN da4paletier p ON p.adresse = a.adresse
optima-# JOIN da4status s ON s.status = a.status
optima-# JOIN da4etat e ON e.etat = a.etat
optima-# ORDER BY a.adresse LIMIT 50 OFFSET 70000 ;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Limit  (cost=28336.02..28336.15 rows=50 width=154)
   ->  Sort  (cost=28161.02..28342.05 rows=72412 width=154)
         Sort Key: a.adresse
         ->  Merge Join  (cost=21048.72..22315.95 rows=72412 width=154)
               Merge Cond: ("outer".etat = "inner".etat)
               ->  Sort  (cost=1.08..1.09 rows=4 width=15)
                     Sort Key: e.etat
                     ->  Seq Scan on da4etat e  (cost=0.00..1.04 rows=4 width=15)
               ->  Sort  (cost=21047.64..21228.67 rows=72412 width=139)
                     Sort Key: a.etat
                     ->  Merge Join  (cost=13935.34..15202.57 rows=72412 width=139)
                           Merge Cond: ("outer".status = "inner".status)
                           ->  Sort  (cost=13934.29..14115.32 rows=72412 width=124)
                                 Sort Key: a.status
                                 ->  Merge Join  (cost=7758.25..8089.21 rows=72412 width=124)
                                       Merge Cond: ("outer".adresse = "inner".adresse)
                                       ->  Sort  (cost=7757.20..7938.23 rows=72412 width=106)
                                             Sort Key: a.adresse
                                             ->  Seq Scan on da4adresse a  (cost=0.00..1912.12 rows=72412 width=106)
                                       ->  Sort  (cost=1.05..1.06 rows=3 width=18)
                                             Sort Key: p.adresse
                                             ->  Seq Scan on da4paletier p  (cost=0.00..1.03 rows=3 width=18)
                           ->  Sort  (cost=1.05..1.06 rows=3 width=15)
                                 Sort Key: s.status
                                 ->  Seq Scan on da4status s  (cost=0.00..1.03 rows=3 width=15)
(25 rows)


Re: speed w/ OFFSET/LIMIT

От
Stephan Szabo
Дата:
On Tue, 27 May 2003, Damien wrote:

> On Tuesday 27 May 2003 17:47, Stephan Szabo wrote:
> > Can you send exact query and explain analyze output for each?  Since it
> > has to get the x+50 I'm not sure what can be done, but the explain output
> > will help.
> >
> > As a side note, the workaround in your following message works as long as
> > the joins give only one match, but won't if they don't (the results are
> > different in that case).
>
> Here is the output. As you can see the explainations really differs
> depending of the given offset :

Well, it's got to (in theory) do the join and then the limit/offset, so it
seems to be doing a reasonable plan given only the estimates (explain
analyze would give the real time info as well).  In the small number case
it picks a plan that can be stopped when it's got the rows, in the large
number case it does the join and then a sort then limits.

The estimate in the first plan shows that it thinks that trying to get
70000 rows from it would suck alot, whereas the second plan should be
about the same no matter how many rows it gets.  It's possible that doing
something like your view (although I'd do it inline in the query with a
subselect rather than making a view object) is the best plan if you don't
mind the fact that the results can be different.

> optima=# EXPLAIN SELECT a.* , p.palette , e.etat_detail , s.status_detail
> optima-# FROM da4adresse a
> optima-# LEFT OUTER JOIN da4paletier p ON p.adresse = a.adresse
> optima-# JOIN da4status s ON s.status = a.status
> optima-# JOIN da4etat e ON e.etat = a.etat
> optima-# ORDER BY a.adresse LIMIT 50 OFFSET 1500 ;
>                                                    QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
>  Limit  (cost=4022.58..4156.63 rows=50 width=154)
>    ->  Nested Loop  (cost=1.05..194138.88 rows=72412 width=154)
>          Join Filter: ("inner".etat = "outer".etat)
>          ->  Nested Loop  (cost=1.05..115209.80 rows=72412 width=139)
>                Join Filter: ("inner".status = "outer".status)
>                ->  Merge Join  (cost=1.05..37909.99 rows=72412 width=124)
>                      Merge Cond: ("outer".adresse = "inner".adresse)
>                      ->  Index Scan using pk_adresse on da4adresse a  (cost=0.00..41296.38 rows=72412 width=106)
>                      ->  Sort  (cost=1.05..1.06 rows=3 width=18)
>                            Sort Key: p.adresse
>                            ->  Seq Scan on da4paletier p  (cost=0.00..1.03 rows=3 width=18)
>                ->  Seq Scan on da4status s  (cost=0.00..1.03 rows=3 width=15)
>          ->  Seq Scan on da4etat e  (cost=0.00..1.04 rows=4 width=15)
> (13 rows)
>
> optima=# EXPLAIN SELECT a.* , p.palette , e.etat_detail , s.status_detail
> optima-# FROM da4adresse a
> optima-# LEFT OUTER JOIN da4paletier p ON p.adresse = a.adresse
> optima-# JOIN da4status s ON s.status = a.status
> optima-# JOIN da4etat e ON e.etat = a.etat
> optima-# ORDER BY a.adresse LIMIT 50 OFFSET 70000 ;
>                                                      QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=28336.02..28336.15 rows=50 width=154)
>    ->  Sort  (cost=28161.02..28342.05 rows=72412 width=154)
>          Sort Key: a.adresse
>          ->  Merge Join  (cost=21048.72..22315.95 rows=72412 width=154)
>                Merge Cond: ("outer".etat = "inner".etat)
>                ->  Sort  (cost=1.08..1.09 rows=4 width=15)
>                      Sort Key: e.etat
>                      ->  Seq Scan on da4etat e  (cost=0.00..1.04 rows=4 width=15)
>                ->  Sort  (cost=21047.64..21228.67 rows=72412 width=139)
>                      Sort Key: a.etat
>                      ->  Merge Join  (cost=13935.34..15202.57 rows=72412 width=139)
>                            Merge Cond: ("outer".status = "inner".status)
>                            ->  Sort  (cost=13934.29..14115.32 rows=72412 width=124)
>                                  Sort Key: a.status
>                                  ->  Merge Join  (cost=7758.25..8089.21 rows=72412 width=124)
>                                        Merge Cond: ("outer".adresse = "inner".adresse)
>                                        ->  Sort  (cost=7757.20..7938.23 rows=72412 width=106)
>                                              Sort Key: a.adresse
>                                              ->  Seq Scan on da4adresse a  (cost=0.00..1912.12 rows=72412 width=106)
>                                        ->  Sort  (cost=1.05..1.06 rows=3 width=18)
>                                              Sort Key: p.adresse
>                                              ->  Seq Scan on da4paletier p  (cost=0.00..1.03 rows=3 width=18)
>                            ->  Sort  (cost=1.05..1.06 rows=3 width=15)
>                                  Sort Key: s.status
>                                  ->  Seq Scan on da4status s  (cost=0.00..1.03 rows=3 width=15)
> (25 rows)
>