Обсуждение: Index not being used

Поиск
Список
Период
Сортировка

Index not being used

От
Shane Wegner
Дата:
Hello,

I am trying to speed up a select which is taking around a
second to execute.  It's a very common query though so the
faster I can get it the better.  The query which I have
included below used a lot of seq scans so I created a
multicolumn index on books covering the values in the
select, ran analyze and it isn't being used.  If someone
wouldn't mind taking a look and letting me know what I'm
doing wrong, I'd appreciate it.

select query:
select books.id as book_id,title,isbn,
publisher, publishers.id as publisher_id,
place, places.id as place_id,
illustrator, illustrators.id as illustrator_id,
edition, editions.id as edition_id,
type, types.id as type_id,
category, categories.id as category_id,
year,
binding, binding.id as binding_id,
weight,
books.price as price,discount,description,comments,books.status,
ctime,mtime
from books
left join publishers on publisher_id=publishers.id
left join places on place_id=places.id
left join illustrators on illustrator_id=illustrators.id
left join editions on edition_id=editions.id
left join types on type_id=types.id
left join categories on category_id=categories.id
left join binding on binding_id=binding.id
,orders_and_books where order_id = 753 and book_id = books.id

The order_id will vary.
The index I created reads:
create index books_idx1 on books(publisher_id,place_id,illustrator_id,
edition_id,type_id,category_id,binding_id,id);

The other ids in the joining tables are all serial values
and are primary keys so are indexed automatically.

explain analyze output:
                                                                           QUERY PLAN
                                       

----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=29.96..1282.10 rows=2 width=999) (actual time=329.336..609.402 rows=1 loops=1)
   Hash Cond: ("outer".id = "inner".book_id)
   ->  Hash Left Join  (cost=24.65..1248.12 rows=3819 width=999) (actual time=6.344..602.450 rows=3819 loops=1)
         Hash Cond: ("outer".binding_id = "inner".id)
         ->  Hash Left Join  (cost=23.59..1189.90 rows=3819 width=982) (actual time=6.098..511.002 rows=3819 loops=1)
               Hash Cond: ("outer".category_id = "inner".id)
               ->  Hash Left Join  (cost=22.31..1138.06 rows=3819 width=962) (actual time=5.922..428.875 rows=3819
loops=1)
                     Hash Cond: ("outer".type_id = "inner".id)
                     ->  Hash Left Join  (cost=21.30..1117.92 rows=3819 width=931) (actual time=5.857..359.763
rows=3819loops=1) 
                           Hash Cond: ("outer".edition_id = "inner".id)
                           ->  Hash Left Join  (cost=20.10..1097.21 rows=3819 width=916) (actual time=5.703..292.123
rows=3819loops=1) 
                                 Hash Cond: ("outer".illustrator_id = "inner".id)
                                 ->  Hash Left Join  (cost=18.09..1075.19 rows=3819 width=880) (actual
time=5.190..224.422rows=3819 loops=1) 
                                       Hash Cond: ("outer".place_id = "inner".id)
                                       ->  Hash Left Join  (cost=13.01..1015.36 rows=3819 width=855) (actual
time=3.721..148.384rows=3819 loops=1) 
                                             Hash Cond: ("outer".publisher_id = "inner".id)
                                             ->  Seq Scan on books  (cost=0.00..946.19 rows=3819 width=828) (actual
time=0.034..70.883rows=3819 loops=1) 
                                             ->  Hash  (cost=11.41..11.41 rows=641 width=31) (actual time=3.611..3.611
rows=0loops=1) 
                                                   ->  Seq Scan on publishers  (cost=0.00..11.41 rows=641 width=31)
(actualtime=0.010..2.268 rows=641 loops=1) 
                                       ->  Hash  (cost=4.46..4.46 rows=246 width=29) (actual time=1.417..1.417 rows=0
loops=1)
                                             ->  Seq Scan on places  (cost=0.00..4.46 rows=246 width=29) (actual
time=0.008..0.895rows=246 loops=1) 
                                 ->  Hash  (cost=1.81..1.81 rows=81 width=40) (actual time=0.468..0.468 rows=0 loops=1)
                                       ->  Seq Scan on illustrators  (cost=0.00..1.81 rows=81 width=40) (actual
time=0.008..0.291rows=81 loops=1) 
                           ->  Hash  (cost=1.16..1.16 rows=16 width=19) (actual time=0.106..0.106 rows=0 loops=1)
                                 ->  Seq Scan on editions  (cost=0.00..1.16 rows=16 width=19) (actual time=0.008..0.062
rows=16loops=1) 
                     ->  Hash  (cost=1.01..1.01 rows=1 width=35) (actual time=0.015..0.015 rows=0 loops=1)
                           ->  Seq Scan on types  (cost=0.00..1.01 rows=1 width=35) (actual time=0.008..0.010 rows=1
loops=1)
               ->  Hash  (cost=1.22..1.22 rows=22 width=24) (actual time=0.126..0.126 rows=0 loops=1)
                     ->  Seq Scan on categories  (cost=0.00..1.22 rows=22 width=24) (actual time=0.009..0.082 rows=22
loops=1)
         ->  Hash  (cost=1.05..1.05 rows=5 width=21) (actual time=0.043..0.043 rows=0 loops=1)
               ->  Seq Scan on binding  (cost=0.00..1.05 rows=5 width=21) (actual time=0.014..0.031 rows=5 loops=1)
   ->  Hash  (cost=5.30..5.30 rows=2 width=4) (actual time=0.070..0.070 rows=0 loops=1)
         ->  Index Scan using orders_and_books_pkey on orders_and_books  (cost=0.00..5.30 rows=2 width=4) (actual
time=0.057..0.062rows=1 loops=1) 
               Index Cond: (order_id = 753)
 Total runtime: 610.245 ms
(35 rows)

Thanks,
Shane

--
Shane Wegner
http://www.cm.nu/~shane/

Re: Index not being used

От
Manfred Koizar
Дата:
On Tue, 15 Jun 2004 13:06:40 -0700, Shane Wegner
<shane-keyword-pgsql.a1e0d9@cm.nu> wrote:
>The index I created reads:
>create index books_idx1 on books(publisher_id,place_id,illustrator_id,
>edition_id,type_id,category_id,binding_id,id);

This index is useless, drop it.  Is there an index on books(id)?

>The other ids in the joining tables are all serial values
>and are primary keys so are indexed automatically.
>
>explain analyze output: [lots of seq scans and hash joins]

Try

    EXPLAIN ANALYSE
    SELECT *
      FROM orders_and_books AS o
          INNER JOIN books AS b ON o.book_id = b.id
     WHERE o.order_id = 753;

This should give a nested loop using primary key index scans on both
tables.  Then add

      LEFT JOIN publishers ON publisher_id=publishers.id
      LEFT JOIN places ON place_id=places.id
      ...

one by one until the plan changes to hash joins again and show us the
results.

Servus
 Manfred

Re: Index not being used

От
Shane Wegner
Дата:
On Thu, Jun 17, 2004 at 06:38:08AM +0200, Manfred Koizar wrote:
> This index is useless, drop it.  Is there an index on books(id)?

Yes it's a primary key.

> Try
>     EXPLAIN ANALYSE
>     SELECT *
>       FROM orders_and_books AS o
>           INNER JOIN books AS b ON o.book_id = b.id
>      WHERE o.order_id = 753;
>
> This should give a nested loop using primary key index scans on both
> tables.  Then add
>
>       LEFT JOIN publishers ON publisher_id=publishers.id
>       LEFT JOIN places ON place_id=places.id
>       ...
>
> one by one until the plan changes to hash joins again and show us the
> results.

The inner join really speeds up the script.  From it's
current 11 second runtime to 0.3 seconds.  I'm curious for
future optimization efforts as to why this is the case.
That is why an inner join as is shown in your query yeilds
such a more efficient query than a regular join with a
where clause.  Looks like I have some reading to do.

Thanks for your help with this.  Very dramatic improvement.

S



--
Shane Wegner
http://www.cm.nu/~shane/