Re: [NOVICE] WHERE clause not used when index is used

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [NOVICE] WHERE clause not used when index is used
Дата
Msg-id 23030.1456862619@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [NOVICE] WHERE clause not used when index is used  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [NOVICE] WHERE clause not used when index is used  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [NOVICE] WHERE clause not used when index is used  (Simon Riggs <simon@2ndQuadrant.com>)
Список pgsql-hackers
I wrote:
> Petr Jelinek <petr@2ndquadrant.com> writes:
>> I can only get the issue when the sort order of the individual keys does 
>> not correlate and the operator sorts according to the first column and 
>> there are duplicate values for the first column.

> Yeah, I think the combination of ASC and DESC columns may be necessary to
> break things.  It needs closer analysis.

Okay, I've now studied the patch more carefully and understood that it
wasn't quite so naive as to not consider multi-column cases at all.
It does work for simple scalar multi-column cases, because the test on
SK_BT_MATCHED is placed so that it only skips checking the first column's
condition, not the conditions on lower-order columns.  Rather, the problem
is failure to consider ROW() comparisons, which do not necessarily have
the same simple behavior as scalar comparisons.  They sort of accidentally
work, *if* the lower-order columns in the ROW() comparison match the
index's lower-order columns as to position and index direction.  Tobias'
example fails because the second column in the ROW() isn't sorted the
same, and you could also make it fail with a 3-or-more-column index in
which the ROW() comparison tested, say, just the first and third columns.

What I think we should do is move the SK_BT_MATCHED flag down to the
first sub-key of the ROW() comparison, which is the one that does behave
the same as in scalar cases.

The fact that it doesn't fail for the case where the lower-order columns
of the ROW() do match the index shows that this still leaves something on
the table: within a ROW() comparison, you could set SK_BT_MATCHED on more
than just the first sub-key, *if* the subsequent columns match the index.
This makes me think that the analysis should be getting done in or near
_bt_mark_scankey_required, not in the rather random place it is now,
because _bt_mark_scankey_required already understands about sub-keys
matching the index or not.


However ... there is an even larger problem, which is that the patch
thinks it can use skey->sk_flags for what's effectively transient state
storage --- not merely transient, but scan-direction-dependent.  This
means that a cursor that reads forwards for awhile and then turns around
and reads backwards will get the wrong answer, even without anything as
complicated as multiple key columns.  It's a bit harder to exhibit such a
bug than you might guess, because of btree's habit of prefetching an index
page at a time; you have to make the scan cross an index page boundary
before you turn around and back it up.  Bearing that in mind, I was soon
able to devise a failure case in the regression database:

regression=# set enable_seqscan TO 0;
SET
regression=# set enable_bitmapscan TO 0;
SET
regression=# begin;
BEGIN
regression=# declare c cursor for select unique1,unique2 from tenk1 where unique1 > 9500;
DECLARE CURSOR
regression=# fetch 20 from c;unique1 | unique2 
---------+---------   9501 |    5916   9502 |    1812   9503 |    1144   9504 |    9202   9505 |    4300   9506 |
5551  9507 |     847   9508 |    4093   9509 |    9418   9510 |    1862   9511 |     848   9512 |    4823   9513 |
1125  9514 |    9276   9515 |    1160   9516 |    5177   9517 |    3600   9518 |    9677   9519 |    5518   9520 |
1429
(20 rows)

regression=# fetch backward 30 from c;unique1 | unique2 
---------+---------   9519 |    5518   9518 |    9677   9517 |    3600   9516 |    5177   9515 |    1160   9514 |
9276  9513 |    1125   9512 |    4823   9511 |     848   9510 |    1862   9509 |    9418   9508 |    4093   9507 |
847  9506 |    5551   9505 |    4300   9504 |    9202   9503 |    1144   9502 |    1812   9501 |    5916   9500 |
3676  9499 |     927   9498 |    2807   9497 |    6558   9496 |    1857   9495 |    1974   9494 |     560   9493 |
3531  9492 |    9875   9491 |    7237   9490 |    8928
 
(30 rows)

Ooops.

I believe the way to fix this would be to stop regarding SK_BT_MATCHED
as state, and instead treat it as a scankey property identified during
_bt_preprocess_keys, analogously to SK_BT_REQFWD/SK_BT_REQBKWD --- and,
like those, you'd need two flags not one since the properties will be
determined independently of knowing which direction you'll be going in.

In any event, I am now of the opinion that this patch needs to be reverted
outright and returned to the authors for redesign.  There are too many
things wrong with it and too little reason to think that we have to have
it in 9.5.
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Steele
Дата:
Сообщение: 2016-03 Commitfest Manager
Следующее
От: Tom Lane
Дата:
Сообщение: Re: 2016-03 Commitfest Manager