Join performance

Поиск
Список
Период
Сортировка
От David Fury
Тема Join performance
Дата
Msg-id 199909021158.LAA00462@wombat.teleordering.co.uk
обсуждение исходный текст
Список pgsql-general
I need to find a way of forcing a more efficient execution plan for
the following scenario:

CREATE TABLE live_biblio (
        isbn text NOT NULL UNIQUE PRIMARY KEY,
        author text,
        binding text,
        classif text,
        editor text,
        rrp money,
        imprint text,
        pub_date date,
        title text,
        bic1 text);

loaded using COPY, then vacuum analyzed, c. 550,000 rows

CREATE TABLE charts
(
 isbn   TEXT NOT NULL PRIMARY KEY,
 quantity INT4 NOT NULL,
 value MONEY NOT NULL,
 prev_chart_pos INT4,
 weeks_in_chart INT4,
 current_pos INT4 NOT NULL
)

loaded using COPY then vacuum analyzed, 5000 rows

PostgreSQL 6.5.1 under RedHat on a twin 450MHz Intel box with 512Mb
memory (nothing else running apart from apache)

The idea of this setup is that we keep a central Bibliographic
database keyed on ISBN and use it as a central reference for various
types of book-sales related data. In this case we have a bestsellers
list (the 'charts' table). To retrieve the top 100 chart with
bibliographic attributes attached, I would do the following.

select cht.current_pos, bib.title, bib.... etc.
from live_biblio bib, charts cht
where bib.isbn = cht.isbn
and current_pos < 101
order by current_pos

However the resulting execution plan (involving a scan on the
live_biblio table) makes use of this technique unworkable (response
timeof c. 15-20 seconds with only one user)

NOTICE:  QUERY PLAN:

Sort  (cost=1165023.88 rows=184422 width=40)
  ->  Nested Loop  (cost=1165023.88 rows=184422 width=40)
        ->  Seq Scan on live_biblio bib  (cost=31056.12 rows=553155
        width=24) ->  Index Scan using charts_pkey on charts cht
        (cost=2.05 rows=1667 width=16)

What I would really like to see is the restriction on the charts table
being actioned first, with the resulting list of key values driving an
indexed access to the live_biblio table.

Is there any way that I could rearrange things so that Postgres
carries out the join using the primary key indexes of the two tables?
I don't want to have to de-normalise the data to replicate
bibliographic attributes around the various tables of the database.

Any ideas gratefully received.

David.

David Fury
Software Developer

Whitaker BookTrack

d.fury@teleord.co.uk       Tel: 01252 742578

В списке pgsql-general по дате отправления:

Предыдущее
От: "Natalya S. Makushina"
Дата:
Сообщение: GEQO and KSQO problem.
Следующее
От: José Soares
Дата:
Сообщение: Re: [GENERAL] Max function on Timestamp