Обсуждение: Left Join Not Using Index?

Поиск
Список
Период
Сортировка

Left Join Not Using Index?

От
Hunter Hillegas
Дата:
I have a left join that doesn't seem to be using an index I created, and the
query's performance needs to improve.

I have two tables that model a message board:

              Table "public.message_board_topics"
       Column        |            Type             | Modifiers
---------------------+-----------------------------+-----------
 rec_num             | integer                     | not null
 topic_name          | character varying(255)      |
 topic_body          | text                        |
 topic_author        | character varying(20)       |
 topic_author_email  | character varying(50)       |
 topic_date          | date                        |
 topic_updated       | timestamp without time zone |
 administrator_topic | boolean                     |
 number_of_comments  | integer                     |
Indexes: message_board_topics_pkey primary key btree (rec_num)
Triggers: RI_ConstraintTrigger_819942,
          RI_ConstraintTrigger_819943


           Table "public.message_board_comments"
        Column        |          Type          | Modifiers
----------------------+------------------------+-----------
 rec_num              | integer                | not null
 topic_id             | integer                |
 comment_parent       | integer                |
 comment_name         | character varying(255) |
 comment_body         | text                   |
 comment_author       | character varying(20)  |
 comment_author_email | character varying(50)  |
 comment_date         | date                   |
Indexes: message_board_comments_pkey primary key btree (rec_num),
         message_board_comments_topic_id btree (topic_id)
Triggers: RI_ConstraintTrigger_819941

The query is:

SELECT DISTINCT message_board_topics.rec_num,
message_board_topics.topic_name, message_board_topics.topic_body,
message_board_topics.topic_author, message_board_topics.topic_author_email,
message_board_topics.topic_updated,
message_board_topics.administrator_topic,
message_board_topics.number_of_comments, to_char(topic_date, 'MM.DD.YYYY')
as formatted_date FROM message_board_topics left join message_board_comments
on (message_board_comments.topic_id=message_board_topics.rec_num) WHERE
upper(topic_name) LIKE upper('madbrowser') OR upper(topic_body) LIKE
upper('madbrowser') OR upper(topic_author) LIKE upper('madbrowser') OR
upper(message_board_comments.comment_author) LIKE upper('madbrowser') ORDER
BY message_board_topics.rec_num DESC

Explain outputs:

 Unique  (cost=34847.38..35145.38 rows=1192 width=293)
   ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293)
         Sort Key: message_board_topics.rec_num,
message_board_topics.topic_name, message_board_topics.topic_body,
message_board_topics.topic_author, message_board_topics.topic_author_email,
message_board_topics.topic_updated,
message_board_topics.administrator_topic,
message_board_topics.number_of_comments,
to_char((message_board_topics.topic_date)::timestamp with time zone,
'MM.DD.YYYY'::text)
         ->  Merge Join  (cost=26858.21..33007.14 rows=11920 width=293)
               Merge Cond: ("outer".rec_num = "inner".topic_id)
               Filter: ((upper(("outer".topic_name)::text) ~~
'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
(upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
(upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
               ->  Sort  (cost=2446.01..2475.81 rows=11920 width=265)
                     Sort Key: message_board_topics.rec_num
                     ->  Seq Scan on message_board_topics
(cost=0.00..712.20 rows=11920 width=265)
               ->  Sort  (cost=24412.20..24818.15 rows=162382 width=28)
                     Sort Key: message_board_comments.topic_id
                     ->  Seq Scan on message_board_comments
(cost=0.00..7203.82 rows=162382 width=28)

It doesn't seem to be using the index in topic_id... What can I do to help
the planner figure out about that index?

Thanks,
Hunter


Re: Left Join Not Using Index?

От
Stephan Szabo
Дата:
On Tue, 22 Apr 2003, Hunter Hillegas wrote:

> I have a left join that doesn't seem to be using an index I created, and the
> query's performance needs to improve.
>
> I have two tables that model a message board:
>
>               Table "public.message_board_topics"
>        Column        |            Type             | Modifiers
> ---------------------+-----------------------------+-----------
>  rec_num             | integer                     | not null
>  topic_name          | character varying(255)      |
>  topic_body          | text                        |
>  topic_author        | character varying(20)       |
>  topic_author_email  | character varying(50)       |
>  topic_date          | date                        |
>  topic_updated       | timestamp without time zone |
>  administrator_topic | boolean                     |
>  number_of_comments  | integer                     |
> Indexes: message_board_topics_pkey primary key btree (rec_num)
> Triggers: RI_ConstraintTrigger_819942,
>           RI_ConstraintTrigger_819943
>
>
>            Table "public.message_board_comments"
>         Column        |          Type          | Modifiers
> ----------------------+------------------------+-----------
>  rec_num              | integer                | not null
>  topic_id             | integer                |
>  comment_parent       | integer                |
>  comment_name         | character varying(255) |
>  comment_body         | text                   |
>  comment_author       | character varying(20)  |
>  comment_author_email | character varying(50)  |
>  comment_date         | date                   |
> Indexes: message_board_comments_pkey primary key btree (rec_num),
>          message_board_comments_topic_id btree (topic_id)
> Triggers: RI_ConstraintTrigger_819941
>
> The query is:
>
> SELECT DISTINCT message_board_topics.rec_num,
> message_board_topics.topic_name, message_board_topics.topic_body,
> message_board_topics.topic_author, message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments, to_char(topic_date, 'MM.DD.YYYY')
> as formatted_date FROM message_board_topics left join message_board_comments
> on (message_board_comments.topic_id=message_board_topics.rec_num) WHERE
> upper(topic_name) LIKE upper('madbrowser') OR upper(topic_body) LIKE
> upper('madbrowser') OR upper(topic_author) LIKE upper('madbrowser') OR
> upper(message_board_comments.comment_author) LIKE upper('madbrowser') ORDER
> BY message_board_topics.rec_num DESC
>
> Explain outputs:
>
>  Unique  (cost=34847.38..35145.38 rows=1192 width=293)
>    ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293)
>          Sort Key: message_board_topics.rec_num,
> message_board_topics.topic_name, message_board_topics.topic_body,
> message_board_topics.topic_author, message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments,
> to_char((message_board_topics.topic_date)::timestamp with time zone,
> 'MM.DD.YYYY'::text)
>          ->  Merge Join  (cost=26858.21..33007.14 rows=11920 width=293)
>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>                Filter: ((upper(("outer".topic_name)::text) ~~
> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>                ->  Sort  (cost=2446.01..2475.81 rows=11920 width=265)
>                      Sort Key: message_board_topics.rec_num
>                      ->  Seq Scan on message_board_topics
> (cost=0.00..712.20 rows=11920 width=265)
>                ->  Sort  (cost=24412.20..24818.15 rows=162382 width=28)
>                      Sort Key: message_board_comments.topic_id
>                      ->  Seq Scan on message_board_comments
> (cost=0.00..7203.82 rows=162382 width=28)
>
> It doesn't seem to be using the index in topic_id... What can I do to help
> the planner figure out about that index?

It's deciding that seq scan + sort of all the rows is faster than the
index scan over all the rows (which may very well be true).  What does
explain analyze say with enable_seqscan set to true and false?

If you only had extra conditions on one of the two tables and had
appropriate indexes (and reworked the query a little) I'd think you might
be able to get a better plan.  As it is, I'm not sure what'd work best
I'd think that a query doing the left join with only the
message_board_topics conditions unioned with an inner join and the
message_board_comments condition would give the same results with some
massaging, but I don't really know if that'd ever perform better anyway.


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
Thanks for responding...

With enable_seqscan = false:

 Unique  (cost=545747.57..546045.57 rows=1192 width=293) (actual
time=40851.49..40854.80 rows=51 loops=1)
   ->  Sort  (cost=545747.57..545777.37 rows=11920 width=293) (actual
time=40851.48..40852.09 rows=292 loops=1)
         Sort Key: message_board_topics.rec_num,
message_board_topics.topic_name, message_board_topics.topic_body,
message_board_topics.topic_author, message_board_topics.topic_author_email,
message_board_topics.topic_updated,
message_board_topics.administrator_topic,
message_board_topics.number_of_comments,
to_char((message_board_topics.topic_date)::timestamp with time zone,
'MM.DD.YYYY'::text)
         ->  Merge Join  (cost=0.00..543907.33 rows=11920 width=293) (actual
time=482.05..40847.19 rows=292 loops=1)
               Merge Cond: ("outer".rec_num = "inner".topic_id)
               Filter: ((upper(("outer".topic_name)::text) ~~
'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
(upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
(upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
               ->  Index Scan using message_board_topics_pkey on
message_board_topics  (cost=0.00..2532.12 rows=11920 width=265) (actual
time=428.32..4893.13 rows=11920 loops=1)
               ->  Index Scan using message_board_comments_topic_id on
message_board_comments  (cost=0.00..535662.04 rows=162382 width=28) (actual
time=28.45..32163.18 rows=162382 loops=1)
 Total runtime: 40855.59 msec

With enable_seqscan = true:

 Unique  (cost=34847.38..35145.38 rows=1192 width=293) (actual
time=13005.13..13008.51 rows=51 loops=1)
   ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293) (actual
time=13005.12..13005.73 rows=292 loops=1)
         Sort Key: message_board_topics.rec_num,
message_board_topics.topic_name, message_board_topics.topic_body,
message_board_topics.topic_author, message_board_topics.topic_author_email,
message_board_topics.topic_updated,
message_board_topics.administrator_topic,
message_board_topics.number_of_comments,
to_char((message_board_topics.topic_date)::timestamp with time zone,
'MM.DD.YYYY'::text)
         ->  Merge Join  (cost=26858.21..33007.14 rows=11920 width=293)
(actual time=4930.32..12949.93 rows=292 loops=1)
               Merge Cond: ("outer".rec_num = "inner".topic_id)
               Filter: ((upper(("outer".topic_name)::text) ~~
'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
(upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
(upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
               ->  Sort  (cost=2446.01..2475.81 rows=11920 width=265)
(actual time=628.30..953.50 rows=11920 loops=1)
                     Sort Key: message_board_topics.rec_num
                     ->  Seq Scan on message_board_topics
(cost=0.00..712.20 rows=11920 width=265) (actual time=0.10..223.96
rows=11920 loops=1)
               ->  Sort  (cost=24412.20..24818.15 rows=162382 width=28)
(actual time=4301.14..5788.66 rows=162382 loops=1)
                     Sort Key: message_board_comments.topic_id
                     ->  Seq Scan on message_board_comments
(cost=0.00..7203.82 rows=162382 width=28) (actual time=0.10..1335.26
rows=162382 loops=1)
 Total runtime: 13108.33 msec
(13 rows)

Your suggestion didn't really make a whole lot of sense to me... Based on
this info, what do you think?

Hunter

> From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
> Date: Tue, 22 Apr 2003 21:42:02 -0700 (PDT)
> To: Hunter Hillegas <lists@lastonepicked.com>
> Cc: PostgreSQL <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
> On Tue, 22 Apr 2003, Hunter Hillegas wrote:
>
>> I have a left join that doesn't seem to be using an index I created, and the
>> query's performance needs to improve.
>>
>> I have two tables that model a message board:
>>
>>               Table "public.message_board_topics"
>>        Column        |            Type             | Modifiers
>> ---------------------+-----------------------------+-----------
>>  rec_num             | integer                     | not null
>>  topic_name          | character varying(255)      |
>>  topic_body          | text                        |
>>  topic_author        | character varying(20)       |
>>  topic_author_email  | character varying(50)       |
>>  topic_date          | date                        |
>>  topic_updated       | timestamp without time zone |
>>  administrator_topic | boolean                     |
>>  number_of_comments  | integer                     |
>> Indexes: message_board_topics_pkey primary key btree (rec_num)
>> Triggers: RI_ConstraintTrigger_819942,
>>           RI_ConstraintTrigger_819943
>>
>>
>>            Table "public.message_board_comments"
>>         Column        |          Type          | Modifiers
>> ----------------------+------------------------+-----------
>>  rec_num              | integer                | not null
>>  topic_id             | integer                |
>>  comment_parent       | integer                |
>>  comment_name         | character varying(255) |
>>  comment_body         | text                   |
>>  comment_author       | character varying(20)  |
>>  comment_author_email | character varying(50)  |
>>  comment_date         | date                   |
>> Indexes: message_board_comments_pkey primary key btree (rec_num),
>>          message_board_comments_topic_id btree (topic_id)
>> Triggers: RI_ConstraintTrigger_819941
>>
>> The query is:
>>
>> SELECT DISTINCT message_board_topics.rec_num,
>> message_board_topics.topic_name, message_board_topics.topic_body,
>> message_board_topics.topic_author, message_board_topics.topic_author_email,
>> message_board_topics.topic_updated,
>> message_board_topics.administrator_topic,
>> message_board_topics.number_of_comments, to_char(topic_date, 'MM.DD.YYYY')
>> as formatted_date FROM message_board_topics left join message_board_comments
>> on (message_board_comments.topic_id=message_board_topics.rec_num) WHERE
>> upper(topic_name) LIKE upper('madbrowser') OR upper(topic_body) LIKE
>> upper('madbrowser') OR upper(topic_author) LIKE upper('madbrowser') OR
>> upper(message_board_comments.comment_author) LIKE upper('madbrowser') ORDER
>> BY message_board_topics.rec_num DESC
>>
>> Explain outputs:
>>
>>  Unique  (cost=34847.38..35145.38 rows=1192 width=293)
>>    ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293)
>>          Sort Key: message_board_topics.rec_num,
>> message_board_topics.topic_name, message_board_topics.topic_body,
>> message_board_topics.topic_author, message_board_topics.topic_author_email,
>> message_board_topics.topic_updated,
>> message_board_topics.administrator_topic,
>> message_board_topics.number_of_comments,
>> to_char((message_board_topics.topic_date)::timestamp with time zone,
>> 'MM.DD.YYYY'::text)
>>          ->  Merge Join  (cost=26858.21..33007.14 rows=11920 width=293)
>>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>>                Filter: ((upper(("outer".topic_name)::text) ~~
>> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
>> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
>> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>>                ->  Sort  (cost=2446.01..2475.81 rows=11920 width=265)
>>                      Sort Key: message_board_topics.rec_num
>>                      ->  Seq Scan on message_board_topics
>> (cost=0.00..712.20 rows=11920 width=265)
>>                ->  Sort  (cost=24412.20..24818.15 rows=162382 width=28)
>>                      Sort Key: message_board_comments.topic_id
>>                      ->  Seq Scan on message_board_comments
>> (cost=0.00..7203.82 rows=162382 width=28)
>>
>> It doesn't seem to be using the index in topic_id... What can I do to help
>> the planner figure out about that index?
>
> It's deciding that seq scan + sort of all the rows is faster than the
> index scan over all the rows (which may very well be true).  What does
> explain analyze say with enable_seqscan set to true and false?
>
> If you only had extra conditions on one of the two tables and had
> appropriate indexes (and reworked the query a little) I'd think you might
> be able to get a better plan.  As it is, I'm not sure what'd work best
> I'd think that a query doing the left join with only the
> message_board_topics conditions unioned with an inner join and the
> message_board_comments condition would give the same results with some
> massaging, but I don't really know if that'd ever perform better anyway.
>
>


Re: Left Join Not Using Index?

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: Hunter Hillegas [mailto:lists@lastonepicked.com]
> Sent: Tuesday, April 22, 2003 9:55 PM
> To: Stephan Szabo
> Cc: PostgreSQL
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
>
> Thanks for responding...
>
> With enable_seqscan = false:
>
>  Unique  (cost=545747.57..546045.57 rows=1192 width=293)
> (actual time=40851.49..40854.80 rows=51 loops=1)
>    ->  Sort  (cost=545747.57..545777.37 rows=11920 width=293)
> (actual time=40851.48..40852.09 rows=292 loops=1)
>          Sort Key: message_board_topics.rec_num,
> message_board_topics.topic_name,
> message_board_topics.topic_body,
> message_board_topics.topic_author,
> message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments,
> to_char((message_board_topics.topic_date)::timestamp with time zone,
> 'MM.DD.YYYY'::text)
>          ->  Merge Join  (cost=0.00..543907.33 rows=11920
> width=293) (actual time=482.05..40847.19 rows=292 loops=1)
>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>                Filter: ((upper(("outer".topic_name)::text) ~~
> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~
> 'MADBROWSER'::text) OR
> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>                ->  Index Scan using message_board_topics_pkey
> on message_board_topics  (cost=0.00..2532.12 rows=11920
> width=265) (actual time=428.32..4893.13 rows=11920 loops=1)
>                ->  Index Scan using
> message_board_comments_topic_id on message_board_comments
> (cost=0.00..535662.04 rows=162382 width=28) (actual
> time=28.45..32163.18 rows=162382 loops=1)  Total runtime:
> 40855.59 msec
>
> With enable_seqscan = true:
>
>  Unique  (cost=34847.38..35145.38 rows=1192 width=293)
> (actual time=13005.13..13008.51 rows=51 loops=1)
>    ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293)
> (actual time=13005.12..13005.73 rows=292 loops=1)
>          Sort Key: message_board_topics.rec_num,
> message_board_topics.topic_name,
> message_board_topics.topic_body,
> message_board_topics.topic_author,
> message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments,
> to_char((message_board_topics.topic_date)::timestamp with time zone,
> 'MM.DD.YYYY'::text)
>          ->  Merge Join  (cost=26858.21..33007.14 rows=11920
> width=293) (actual time=4930.32..12949.93 rows=292 loops=1)
>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>                Filter: ((upper(("outer".topic_name)::text) ~~
> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~
> 'MADBROWSER'::text) OR
> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>                ->  Sort  (cost=2446.01..2475.81 rows=11920
> width=265) (actual time=628.30..953.50 rows=11920 loops=1)
>                      Sort Key: message_board_topics.rec_num
>                      ->  Seq Scan on message_board_topics
> (cost=0.00..712.20 rows=11920 width=265) (actual
> time=0.10..223.96 rows=11920 loops=1)
>                ->  Sort  (cost=24412.20..24818.15 rows=162382
> width=28) (actual time=4301.14..5788.66 rows=162382 loops=1)
>                      Sort Key: message_board_comments.topic_id
>                      ->  Seq Scan on message_board_comments
> (cost=0.00..7203.82 rows=162382 width=28) (actual
> time=0.10..1335.26 rows=162382 loops=1)  Total runtime:
> 13108.33 msec (13 rows)
>
> Your suggestion didn't really make a whole lot of sense to
> me... Based on this info, what do you think?

Suggestion:
Perform the actual query with seqscan enabled/disabled and see which one
is literally faster.

This guess:
> 40855.59 msec

Certainly seems slower than this one:
> 13108.33 msec (13 rows)

Indicating that the strategy originally chosen should be correct.

How accurate is the estimate on your machine?


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
Those plans are from 'EXPLAIN ANALYZE', not 'EXPLAIN'... So as I understand
it, that's the actual run time for the query.

Having seqscan turned on is a lot faster, but isn't that the default? The
reason I was given this query to look at was that it's not running fast
enough right now in our production application...

So, anyone have any suggestion about anything else I could do? Those indices
don't help at all?

Hunter

> From: "Dann Corbit" <DCorbit@connx.com>
> Date: Tue, 22 Apr 2003 22:00:12 -0700
> To: "Hunter Hillegas" <lists@lastonepicked.com>, "Stephan Szabo"
> <sszabo@megazone23.bigpanda.com>
> Cc: "PostgreSQL" <pgsql-general@postgresql.org>
> Subject: RE: [GENERAL] Left Join Not Using Index?
>
>> -----Original Message-----
>> From: Hunter Hillegas [mailto:lists@lastonepicked.com]
>> Sent: Tuesday, April 22, 2003 9:55 PM
>> To: Stephan Szabo
>> Cc: PostgreSQL
>> Subject: Re: [GENERAL] Left Join Not Using Index?
>>
>>
>> Thanks for responding...
>>
>> With enable_seqscan = false:
>>
>>  Unique  (cost=545747.57..546045.57 rows=1192 width=293)
>> (actual time=40851.49..40854.80 rows=51 loops=1)
>>    ->  Sort  (cost=545747.57..545777.37 rows=11920 width=293)
>> (actual time=40851.48..40852.09 rows=292 loops=1)
>>          Sort Key: message_board_topics.rec_num,
>> message_board_topics.topic_name,
>> message_board_topics.topic_body,
>> message_board_topics.topic_author,
>> message_board_topics.topic_author_email,
>> message_board_topics.topic_updated,
>> message_board_topics.administrator_topic,
>> message_board_topics.number_of_comments,
>> to_char((message_board_topics.topic_date)::timestamp with time zone,
>> 'MM.DD.YYYY'::text)
>>          ->  Merge Join  (cost=0.00..543907.33 rows=11920
>> width=293) (actual time=482.05..40847.19 rows=292 loops=1)
>>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>>                Filter: ((upper(("outer".topic_name)::text) ~~
>> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~
>> 'MADBROWSER'::text) OR
>> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
>> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>>                ->  Index Scan using message_board_topics_pkey
>> on message_board_topics  (cost=0.00..2532.12 rows=11920
>> width=265) (actual time=428.32..4893.13 rows=11920 loops=1)
>>                ->  Index Scan using
>> message_board_comments_topic_id on message_board_comments
>> (cost=0.00..535662.04 rows=162382 width=28) (actual
>> time=28.45..32163.18 rows=162382 loops=1)  Total runtime:
>> 40855.59 msec
>>
>> With enable_seqscan = true:
>>
>>  Unique  (cost=34847.38..35145.38 rows=1192 width=293)
>> (actual time=13005.13..13008.51 rows=51 loops=1)
>>    ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293)
>> (actual time=13005.12..13005.73 rows=292 loops=1)
>>          Sort Key: message_board_topics.rec_num,
>> message_board_topics.topic_name,
>> message_board_topics.topic_body,
>> message_board_topics.topic_author,
>> message_board_topics.topic_author_email,
>> message_board_topics.topic_updated,
>> message_board_topics.administrator_topic,
>> message_board_topics.number_of_comments,
>> to_char((message_board_topics.topic_date)::timestamp with time zone,
>> 'MM.DD.YYYY'::text)
>>          ->  Merge Join  (cost=26858.21..33007.14 rows=11920
>> width=293) (actual time=4930.32..12949.93 rows=292 loops=1)
>>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>>                Filter: ((upper(("outer".topic_name)::text) ~~
>> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~
>> 'MADBROWSER'::text) OR
>> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
>> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>>                ->  Sort  (cost=2446.01..2475.81 rows=11920
>> width=265) (actual time=628.30..953.50 rows=11920 loops=1)
>>                      Sort Key: message_board_topics.rec_num
>>                      ->  Seq Scan on message_board_topics
>> (cost=0.00..712.20 rows=11920 width=265) (actual
>> time=0.10..223.96 rows=11920 loops=1)
>>                ->  Sort  (cost=24412.20..24818.15 rows=162382
>> width=28) (actual time=4301.14..5788.66 rows=162382 loops=1)
>>                      Sort Key: message_board_comments.topic_id
>>                      ->  Seq Scan on message_board_comments
>> (cost=0.00..7203.82 rows=162382 width=28) (actual
>> time=0.10..1335.26 rows=162382 loops=1)  Total runtime:
>> 13108.33 msec (13 rows)
>>
>> Your suggestion didn't really make a whole lot of sense to
>> me... Based on this info, what do you think?
>
> Suggestion:
> Perform the actual query with seqscan enabled/disabled and see which one
> is literally faster.
>
> This guess:
>> 40855.59 msec
>
> Certainly seems slower than this one:
>> 13108.33 msec (13 rows)
>
> Indicating that the strategy originally chosen should be correct.
>
> How accurate is the estimate on your machine?


Re: Left Join Not Using Index?

От
Tom Lane
Дата:
"Dann Corbit" <DCorbit@connx.com> writes:
> This guess:
>> 40855.59 msec
> Certainly seems slower than this one:
>> 13108.33 msec

Those are not guesses, those are measurements.  Translation: the planner
made the right choice here.  (Hot diggety ;-))

            regards, tom lane


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
Okay... So the planner is in great shape and chose the right solution...

Can anyone point me in another direction to optimize this for some more
speed? Other query suggestions? System setting changes? My impatient users
don't like to wait. :-)

Any help is very much appreciated.

Hunter

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Wed, 23 Apr 2003 01:09:39 -0400
> To: "Dann Corbit" <DCorbit@connx.com>
> Cc: "Hunter Hillegas" <lists@lastonepicked.com>, "Stephan Szabo"
> <sszabo@megazone23.bigpanda.com>, "PostgreSQL" <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
> "Dann Corbit" <DCorbit@connx.com> writes:
>> This guess:
>>> 40855.59 msec
>> Certainly seems slower than this one:
>>> 13108.33 msec
>
> Those are not guesses, those are measurements.  Translation: the planner
> made the right choice here.  (Hot diggety ;-))
>
> regards, tom lane


Re: Left Join Not Using Index?

От
Tom Lane
Дата:
Hunter Hillegas <lists@lastonepicked.com> writes:
> Can anyone point me in another direction to optimize this

AFAICS you cannot improve that without changing the query structure
and/or the database layout.  Because the WHERE clause is a bunch of OR'd
conditions, it's useless for restricting either individual table scan
making up the join --- there is really no implementation short of
forming the entire join described by the FROM ... JOIN ... ON ... part
of the query and then testing each individual row against the WHERE
clause.  While that is the abstract semantic model implied by the SQL
spec, it's not exactly how you want a query to really be done :-(.

            regards, tom lane


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
So, what you're basically saying is that my best bet for improving the speed
of a query to get this result set is to tweak my settings and get faster
hardware?

> From: Tom Lane <tgl@sss.pgh.pa.us>
> Date: Wed, 23 Apr 2003 01:30:05 -0400
> To: Hunter Hillegas <lists@lastonepicked.com>
> Cc: Dann Corbit <DCorbit@connx.com>, Stephan Szabo
> <sszabo@megazone23.bigpanda.com>, PostgreSQL <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
> Hunter Hillegas <lists@lastonepicked.com> writes:
>> Can anyone point me in another direction to optimize this
>
> AFAICS you cannot improve that without changing the query structure
> and/or the database layout.  Because the WHERE clause is a bunch of OR'd
> conditions, it's useless for restricting either individual table scan
> making up the join --- there is really no implementation short of
> forming the entire join described by the FROM ... JOIN ... ON ... part
> of the query and then testing each individual row against the WHERE
> clause.  While that is the abstract semantic model implied by the SQL
> spec, it's not exactly how you want a query to really be done :-(.
>
> regards, tom lane


Re: Left Join Not Using Index?

От
Lincoln Yeoh
Дата:
I think he said you have to change your query or your database layout to
get better speed.

Can you think of a different query or set of queries that will do what you
want?

What are you trying to achieve with your query?

Link.

At 10:35 PM 4/22/2003 -0700, Hunter Hillegas wrote:

>So, what you're basically saying is that my best bet for improving the speed
>of a query to get this result set is to tweak my settings and get faster
>hardware?
>
> > From: Tom Lane <tgl@sss.pgh.pa.us>
> > Date: Wed, 23 Apr 2003 01:30:05 -0400
> > To: Hunter Hillegas <lists@lastonepicked.com>
> > Cc: Dann Corbit <DCorbit@connx.com>, Stephan Szabo
> > <sszabo@megazone23.bigpanda.com>, PostgreSQL <pgsql-general@postgresql.org>
> > Subject: Re: [GENERAL] Left Join Not Using Index?
> >
> > Hunter Hillegas <lists@lastonepicked.com> writes:
> >> Can anyone point me in another direction to optimize this
> >
> > AFAICS you cannot improve that without changing the query structure
> > and/or the database layout.  Because the WHERE clause is a bunch of OR'd
> > conditions, it's useless for restricting either individual table scan
> > making up the join --- there is really no implementation short of


Re: Left Join Not Using Index?

От
Stephan Szabo
Дата:
On Tue, 22 Apr 2003, Hunter Hillegas wrote:

> Your suggestion didn't really make a whole lot of sense to me... Based on
> this info, what do you think?

I was wondering if something like (columns removed because I'd go insane
otherwise, but I think this illustrates it):

select message_board_topics.rec_num from
 message_board_topics where upper(topic_name) LIKE upper('madbrowser')
union
select message_board_topics.rec_num from
 message_board_topics where upper(topic_body) LIKE upper('madbrowser')
union
select message_board_topics.rec_num from
 message_board_topics where upper(topic_author) LIKE upper('madbrowser')
union
select message_board_topics.rec_num from
 message_board_topics,message_board_comments where
 message_board_comments.topic_id=message_board_topics.rec_num
 and upper(message_board_comments.comment_author) LIKE upper('madbrowser')
order by 1 desc;

with indexes on upper(topic_name), upper(topic_body), etc... was
both the same and faster.

However, the best solution is probably some sort of full text indexing
solution.  Putting the keywords from the various columns you want to index
along with the rec_num (or topic_id) of the row and an index on the text.
Then you could join message_board_topics with that and probably get a much
better plan.


Re: Left Join Not Using Index?

От
Ken Williams
Дата:
On Tuesday, April 22, 2003, at 08:44  PM, Hunter Hillegas wrote:

> I have a left join that doesn't seem to be using an index I created,
> and the
> query's performance needs to improve.

[snip]

> The query is:
>
> SELECT DISTINCT message_board_topics.rec_num,
> message_board_topics.topic_name, message_board_topics.topic_body,
> message_board_topics.topic_author,
> message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments, to_char(topic_date,
> 'MM.DD.YYYY')
> as formatted_date FROM message_board_topics left join
> message_board_comments
> on (message_board_comments.topic_id=message_board_topics.rec_num) WHERE
> upper(topic_name) LIKE upper('madbrowser') OR upper(topic_body) LIKE
> upper('madbrowser') OR upper(topic_author) LIKE upper('madbrowser') OR
> upper(message_board_comments.comment_author) LIKE upper('madbrowser')
> ORDER
> BY message_board_topics.rec_num DESC
>

I'm not very good at reading EXPLAIN output in Postgres yet, but it
seems like it's all those "upper(table.foo) LIKE upper('madbrowser')"
conditions that are causing the slowness.  For starters, change it to
"upper(table.foo) LIKE 'MADBROWSER'".  Then since you're not using
wildcards there, change it to "upper(table.foo) = 'MADBROWSER'".

  -Ken


Re: Left Join Not Using Index?

От
Dennis Gearon
Дата:
I think he is saying that efforts in data design, and query design are called for.

Hunter Hillegas wrote:
> So, what you're basically saying is that my best bet for improving the speed
> of a query to get this result set is to tweak my settings and get faster
> hardware?
>
>
>>From: Tom Lane <tgl@sss.pgh.pa.us>
>>Date: Wed, 23 Apr 2003 01:30:05 -0400
>>To: Hunter Hillegas <lists@lastonepicked.com>
>>Cc: Dann Corbit <DCorbit@connx.com>, Stephan Szabo
>><sszabo@megazone23.bigpanda.com>, PostgreSQL <pgsql-general@postgresql.org>
>>Subject: Re: [GENERAL] Left Join Not Using Index?
>>
>>Hunter Hillegas <lists@lastonepicked.com> writes:
>>
>>>Can anyone point me in another direction to optimize this
>>
>>AFAICS you cannot improve that without changing the query structure
>>and/or the database layout.  Because the WHERE clause is a bunch of OR'd
>>conditions, it's useless for restricting either individual table scan
>>making up the join --- there is really no implementation short of
>>forming the entire join described by the FROM ... JOIN ... ON ... part
>>of the query and then testing each individual row against the WHERE
>>clause.  While that is the abstract semantic model implied by the SQL
>>spec, it's not exactly how you want a query to really be done :-(.
>>
>>regards, tom lane
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>


Re: Left Join Not Using Index?

От
"scott.marlowe"
Дата:
On Tue, 22 Apr 2003, Hunter Hillegas wrote:

> Thanks for responding...
>
> With enable_seqscan = false:

SNIP

>  Total runtime: 40855.59 msec
>
> With enable_seqscan = true:
>
>  Unique  (cost=34847.38..35145.38 rows=1192 width=293) (actual
> time=13005.13..13008.51 rows=51 loops=1)
>    ->  Sort  (cost=34847.38..34877.18 rows=11920 width=293) (actual
> time=13005.12..13005.73 rows=292 loops=1)
>          Sort Key: message_board_topics.rec_num,
> message_board_topics.topic_name, message_board_topics.topic_body,
> message_board_topics.topic_author, message_board_topics.topic_author_email,
> message_board_topics.topic_updated,
> message_board_topics.administrator_topic,
> message_board_topics.number_of_comments,
> to_char((message_board_topics.topic_date)::timestamp with time zone,
> 'MM.DD.YYYY'::text)

Note the actual time on the merge join here (about 8000 msec):

>          ->  Merge Join  (cost=26858.21..33007.14 rows=11920 width=293)
> (actual time=4930.32..12949.93 rows=292 loops=1)
>                Merge Cond: ("outer".rec_num = "inner".topic_id)
>                Filter: ((upper(("outer".topic_name)::text) ~~
> 'MADBROWSER'::text) OR (upper("outer".topic_body) ~~ 'MADBROWSER'::text) OR
> (upper(("outer".topic_author)::text) ~~ 'MADBROWSER'::text) OR
> (upper(("inner".comment_author)::text) ~~ 'MADBROWSER'::text))
>                ->  Sort  (cost=2446.01..2475.81 rows=11920 width=265)
> (actual time=628.30..953.50 rows=11920 loops=1)
>                      Sort Key: message_board_topics.rec_num
>                      ->  Seq Scan on message_board_topics
> (cost=0.00..712.20 rows=11920 width=265) (actual time=0.10..223.96
> rows=11920 loops=1)

Note the sort is showing ~ 4000 to 5000 msec

>                ->  Sort  (cost=24412.20..24818.15 rows=162382 width=28)
> (actual time=4301.14..5788.66 rows=162382 loops=1)
>                      Sort Key: message_board_comments.topic_id
>                      ->  Seq Scan on message_board_comments
> (cost=0.00..7203.82 rows=162382 width=28) (actual time=0.10..1335.26
> rows=162382 loops=1)

Run time:

>  Total runtime: 13108.33 msec
> (13 rows)

So, this query, using the seq scan, is 3 times faster.  I.e. the planner
made the right move on the seq scan versus the index scan.

There may be a faster way than using a merge join, but more than likely,
your biggest gain will come from tuning postgresql and your OS to handle
more data at a time therefore faster.

What are your settings for sort_mem, shared_buffers, effective_cache_size?

They're all explained in the 3.4.2. Planner and Optimizer Tuning section
of the 7.3.2 docs quite well.

Look at changing any of these to off and see how it affects the planner as
well.

enable_seqscan
enable_indexscan
enable_tidscan
enable_sort
enable_nestloop
enable_mergejoin
enable_hashjoin

> Your suggestion didn't really make a whole lot of sense to me... Based on
> this info, what do you think?

Well, the problem with the suggestion was that while it did turn on index
scans, it actually resulted in slower performance, since a seq scan proved
3 times faster.  Barring other minor errors in the query planner, you can
either make the machine faster / tune postgresql, change your query to
something that runs faster by the nature of how it's written.  Maybe
having a more selective where clause or using one in a subselect that will
be run first, thus knocking down the amount of data your database has to
sling around, or go to full text indexing.  There were two or three
solutions in the contrib directory last I looked that both were way faster
than the typical "roll your own" solutions.


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
Basically this query searches those two tables, which model a message board,
for text strings...

I'll have to take a closer look and see.

> From: Lincoln Yeoh <lyeoh@pop.jaring.my>
> Date: Wed, 23 Apr 2003 14:00:42 +0800
> To: Hunter Hillegas <lists@lastonepicked.com>
> Cc: PostgreSQL <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
> I think he said you have to change your query or your database layout to
> get better speed.
>
> Can you think of a different query or set of queries that will do what you
> want?
>
> What are you trying to achieve with your query?
>
> Link.
>
> At 10:35 PM 4/22/2003 -0700, Hunter Hillegas wrote:
>
>> So, what you're basically saying is that my best bet for improving the speed
>> of a query to get this result set is to tweak my settings and get faster
>> hardware?
>>
>>> From: Tom Lane <tgl@sss.pgh.pa.us>
>>> Date: Wed, 23 Apr 2003 01:30:05 -0400
>>> To: Hunter Hillegas <lists@lastonepicked.com>
>>> Cc: Dann Corbit <DCorbit@connx.com>, Stephan Szabo
>>> <sszabo@megazone23.bigpanda.com>, PostgreSQL <pgsql-general@postgresql.org>
>>> Subject: Re: [GENERAL] Left Join Not Using Index?
>>>
>>> Hunter Hillegas <lists@lastonepicked.com> writes:
>>>> Can anyone point me in another direction to optimize this
>>>
>>> AFAICS you cannot improve that without changing the query structure
>>> and/or the database layout.  Because the WHERE clause is a bunch of OR'd
>>> conditions, it's useless for restricting either individual table scan
>>> making up the join --- there is really no implementation short of
>
>


Re: Left Join Not Using Index?

От
Hunter Hillegas
Дата:
Wow! This query is MUCH faster than my old query...

About 1/8 of the time.

> From: Stephan Szabo <sszabo@megazone23.bigpanda.com>
> Date: Wed, 23 Apr 2003 06:41:27 -0700 (PDT)
> To: Hunter Hillegas <lists@lastonepicked.com>
> Cc: PostgreSQL <pgsql-general@postgresql.org>
> Subject: Re: [GENERAL] Left Join Not Using Index?
>
>
> On Tue, 22 Apr 2003, Hunter Hillegas wrote:
>
>> Your suggestion didn't really make a whole lot of sense to me... Based on
>> this info, what do you think?
>
> I was wondering if something like (columns removed because I'd go insane
> otherwise, but I think this illustrates it):
>
> select message_board_topics.rec_num from
> message_board_topics where upper(topic_name) LIKE upper('madbrowser')
> union
> select message_board_topics.rec_num from
> message_board_topics where upper(topic_body) LIKE upper('madbrowser')
> union
> select message_board_topics.rec_num from
> message_board_topics where upper(topic_author) LIKE upper('madbrowser')
> union
> select message_board_topics.rec_num from
> message_board_topics,message_board_comments where
> message_board_comments.topic_id=message_board_topics.rec_num
> and upper(message_board_comments.comment_author) LIKE upper('madbrowser')
> order by 1 desc;
>
> with indexes on upper(topic_name), upper(topic_body), etc... was
> both the same and faster.
>
> However, the best solution is probably some sort of full text indexing
> solution.  Putting the keywords from the various columns you want to index
> along with the rec_num (or topic_id) of the row and an index on the text.
> Then you could join message_board_topics with that and probably get a much
> better plan.