Re: reducing number of ANDs speeds up query

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: reducing number of ANDs speeds up query
Дата
Msg-id CAF-3MvNS94v9+gPFvW=eF8WO9=QsbwAP3pK3PmPuSg+ism5fpA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: reducing number of ANDs speeds up query  ("T. E. Lawrence" <t.e.lawrence@icloud.com>)
Ответы Re: reducing number of ANDs speeds up query RESOLVED  ("T. E. Lawrence" <t.e.lawrence@icloud.com>)
Список pgsql-general
On 12 January 2013 12:41, T. E. Lawrence <t.e.lawrence@icloud.com> wrote:
Hi and thank you for your notes!

> You really ought to include the output of EXPLAIN ANALYZE in cases such as these (if it doesn't already point you to the culprit).

I'll do so, it takes quite long...

> Most likely you'll find that the last condition added a sequential scan to the query plan,

Exactly! EXPLAIN says so.

> which can have several causes/reasons. Are the estimated #rows close to the actual #rows?

Yes, this is the problem. I read that in such cases indexes are not read. However if the previous conditions are executed first, the result is zero or just a few rows and there is no need seq scan the whole values column.

You mean they don't match, do you?

The database doesn't know what you know and its making the wrong decision based on incorrect data.

The database won't use an index if it thinks that there aren't many rows to check against a condition. Most likely (the results from explain analyze would tell) the database thinks there are much fewer rows in table b than there actually are.

You'll probably want to read about database maintenance for Postgres and how to keep its statistics up to date. Autovacuum may need some tuning or you need to run manual VACUUM more frequently.
In fact, run VACUUM now and see if the problem goes away.

You'll usually also want to run VACUUM after a large batch job.
 
> Meanwhile, it looks like most of your AND's are involved in joining tables a and b. Perhaps it helps to use an explicit join instead of an implicit one?

I am not quite sure what this means, but will read about it.

You're currently using implicit joins by combining your join conditions in the WHERE clause of your query, like this:
SELECT *
FROM a, b
WHERE a.col1 = b.col1 AND a.col2 = b.col2 AND b.value = 'yadayada';

You can also explicitly put your join conditions with the joins, like so:
SELECT *
FROM a INNER JOIN b ON (a.col1 = b.col1 AND a.col2 = b.col2)
WHERE b.value = 'yadayada';

You explicitly tell the database that those are the conditions to be joined on and that the remaining conditions are filters on the result set. With just two tables the need for such isn't that obvious, but with more tables it quickly becomes difficult to see what condition in an implicit join is part of the joins and which is the result set filter. With explicit joins that's much clearer.
It wouldn't be the first time that I rewrite a query to use explicit joins, only to find that the original query was incorrect.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

В списке pgsql-general по дате отправления:

Предыдущее
От: Tony Theodore
Дата:
Сообщение: Re: reducing number of ANDs speeds up query
Следующее
От: Jeff Janes
Дата:
Сообщение: Re: changes "during checkpointing"