Обсуждение: Bug with index-usage?

Поиск
Список
Период
Сортировка

Bug with index-usage?

От
Sebastian Böck
Дата:
Hello,

I get unpredictibale results selecting from a view depending on
index-usage.

Please see the attached script for details.

Is it a bug or some "weird feature"?

Any help appreciated to get predictibale results

Sebastian
CREATE TABLE test1 (
  id SERIAL PRIMARY KEY,
  name TEXT NOT NULL
);

INSERT INTO test1 (name) VALUES ('test1_1');
INSERT INTO test1 (name) VALUES ('test1_2');
INSERT INTO test1 (name) VALUES ('test1_3');

CREATE TABLE test2 (
  id SERIAL PRIMARY KEY,
  type TEXT NOT NULL CHECK (type IN ('a','b','c')),
  test1_id INTEGER REFERENCES test1
);

INSERT INTO test2 (type,test1_id) VALUES ('a',1);
INSERT INTO test2 (type,test1_id) VALUES ('a',2);
INSERT INTO test2 (type,test1_id) VALUES ('a',3);
INSERT INTO test2 (type,test1_id) VALUES ('b',1);
INSERT INTO test2 (type,test1_id) VALUES ('b',2);
INSERT INTO test2 (type,test1_id) VALUES ('b',3);
INSERT INTO test2 (type,test1_id) VALUES ('c',1);
INSERT INTO test2 (type,test1_id) VALUES ('c',2);
INSERT INTO test2 (type,test1_id) VALUES ('c',3);

CREATE OR REPLACE VIEW test AS
  SELECT test2.*
  FROM test2
  LEFT JOIN test2 AS t2 ON
    test2.type IN ('c','b') AND
    t2.type = 'a';

SELECT * from test WHERE type = 'a';

CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
CREATE INDEX index_c ON test2 (id) WHERE type = 'c';

SET enable_seqscan TO OFF;

SELECT * from test WHERE type = 'a';


Re: Bug with index-usage?

От
Scott Marlowe
Дата:
On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:
> Hello,
>
> I get unpredictibale results selecting from a view depending on
> index-usage.

PostgreSQL uses a cost based planner.  So, it tends to not use the plan
you might expect, especially in "toy" test cases with small data sets.
I.e. why use an index to look up 10 values, when they all fit on the
same page.  Just seq scan the data from the table.

Fill up your table with REAL data (or a close substitute) and test
again.  Also, read up on the admin section, specifically the part on the
postgresql.conf file and what the settings in there mean, then read
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Re: Bug with index-usage?

От
Sebastian Böck
Дата:
Scott Marlowe wrote:
> On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:
>
>>Hello,
>>
>>I get unpredictibale results selecting from a view depending on
>>index-usage.
>
>
> PostgreSQL uses a cost based planner.  So, it tends to not use the plan
> you might expect, especially in "toy" test cases with small data sets.
> I.e. why use an index to look up 10 values, when they all fit on the
> same page.  Just seq scan the data from the table.
>
> Fill up your table with REAL data (or a close substitute) and test
> again.  Also, read up on the admin section, specifically the part on the
> postgresql.conf file and what the settings in there mean, then read
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
>

I think you didn't test my small script or don't see the same results.

I don't speak about index-usage per se, I'm talkung about the results.

Without indices I get:

SELECT * from test WHERE typ = 'a';
  id | typ | test1_id
----+-----+----------
   1 | a   |        1
   2 | a   |        2
   3 | a   |        3
(3 rows)

But with defined indices I get:

SELECT * from test WHERE typ = 'a';
  id | typ | test1_id
----+-----+----------
(0 rows)

By the way, this is 8.1 (forgot to mention in my first mail).

Sebastian


Re: Bug with index-usage?

От
Scott Marlowe
Дата:
On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:
> Hello,
>
> I get unpredictibale results selecting from a view depending on
> index-usage.

Also read up on vacuum, analyze, and explain analyze.

Re: Bug with index-usage?

От
Scott Marlowe
Дата:
On Mon, 2005-11-14 at 11:25, Sebastian Böck wrote:
> Scott Marlowe wrote:
> > On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:
> >
> >>Hello,
> >>
> >>I get unpredictibale results selecting from a view depending on
> >>index-usage.
> >
> >
> > PostgreSQL uses a cost based planner.  So, it tends to not use the plan
> > you might expect, especially in "toy" test cases with small data sets.
> > I.e. why use an index to look up 10 values, when they all fit on the
> > same page.  Just seq scan the data from the table.
> >
> > Fill up your table with REAL data (or a close substitute) and test
> > again.  Also, read up on the admin section, specifically the part on the
> > postgresql.conf file and what the settings in there mean, then read
> > http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
> >
>
> I think you didn't test my small script or don't see the same results.
>
> I don't speak about index-usage per se, I'm talkung about the results.
>
> Without indices I get:
>
> SELECT * from test WHERE typ = 'a';
>   id | typ | test1_id
> ----+-----+----------
>    1 | a   |        1
>    2 | a   |        2
>    3 | a   |        3
> (3 rows)
>
> But with defined indices I get:
>
> SELECT * from test WHERE typ = 'a';
>   id | typ | test1_id
> ----+-----+----------
> (0 rows)
>
> By the way, this is 8.1 (forgot to mention in my first mail).

I don't get this problem in 7.4.  I'll try 8.1 and get back to you.

Re: Bug with index-usage?

От
Csaba Nagy
Дата:
The OP was complaining about the results of the above script, which I
could readily reproduce on a 8.1.0 installation on debian (see below).
The same select which returned 3 rows will return nothing after creating
the partial indexes, which looks as a bug to me...
I can't tell anything about why it happens, just confirm that I can
reproduce too...

Cheers,
Csaba.

cnagy=> CREATE TABLE test1 (
cnagy(>   id SERIAL PRIMARY KEY,
cnagy(>   name TEXT NOT NULL
cnagy(> );
NOTICE:  CREATE TABLE will create implicit sequence "test1_id_seq" for
serial column "test1.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"test1_pkey" for table "test1"
CREATE TABLE
cnagy=>
cnagy=> INSERT INTO test1 (name) VALUES ('test1_1');
INSERT 0 1
cnagy=> INSERT INTO test1 (name) VALUES ('test1_2');
INSERT 0 1
cnagy=> INSERT INTO test1 (name) VALUES ('test1_3');
INSERT 0 1
cnagy=>
cnagy=> CREATE TABLE test2 (
cnagy(>   id SERIAL PRIMARY KEY,
cnagy(>   type TEXT NOT NULL CHECK (type IN ('a','b','c')),
cnagy(>   test1_id INTEGER REFERENCES test1
cnagy(> );
NOTICE:  CREATE TABLE will create implicit sequence "test2_id_seq" for
serial column "test2.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"test2_pkey" for table "test2"
CREATE TABLE
cnagy=>
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('a',3);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('b',3);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',1);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',2);
INSERT 0 1
cnagy=> INSERT INTO test2 (type,test1_id) VALUES ('c',3);
INSERT 0 1
cnagy=>
cnagy=> CREATE OR REPLACE VIEW test AS
cnagy->   SELECT test2.*
cnagy->   FROM test2
cnagy->   LEFT JOIN test2 AS t2 ON
cnagy->     test2.type IN ('c','b') AND
cnagy->     t2.type = 'a';
CREATE VIEW
cnagy=>
cnagy=> SELECT * from test WHERE type = 'a';
 id | type | test1_id
----+------+----------
  1 | a    |        1
  2 | a    |        2
  3 | a    |        3
(3 rows)

cnagy=>
cnagy=> CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
CREATE INDEX
cnagy=> CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
CREATE INDEX
cnagy=> CREATE INDEX index_c ON test2 (id) WHERE type = 'c';
CREATE INDEX
cnagy=>
cnagy=> SET enable_seqscan TO OFF;
SET
cnagy=>
cnagy=> SELECT * from test WHERE type = 'a';
 id | type | test1_id
----+------+----------
(0 rows)



On Mon, 2005-11-14 at 18:17, Scott Marlowe wrote:
> On Mon, 2005-11-14 at 11:07, Sebastian Böck wrote:
> > Hello,
> >
> > I get unpredictibale results selecting from a view depending on
> > index-usage.
>
> PostgreSQL uses a cost based planner.  So, it tends to not use the plan
> you might expect, especially in "toy" test cases with small data sets.
> I.e. why use an index to look up 10 values, when they all fit on the
> same page.  Just seq scan the data from the table.
>
> Fill up your table with REAL data (or a close substitute) and test
> again.  Also, read up on the admin section, specifically the part on the
> postgresql.conf file and what the settings in there mean, then read
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend


Re: Bug with index-usage?

От
Andreas Kretschmer
Дата:
Sebastian Böck <sebastianboeck@freenet.de> schrieb:

> Hello,
>
> I get unpredictibale results selecting from a view depending on
> index-usage.
> [ snipp ]
>
> SELECT * from test WHERE type = 'a';

unfortunately, no result. What Du you expect?


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Bug with index-usage?

От
Jaime Casanova
Дата:
On 11/14/05, Sebastian Böck <sebastianboeck@freenet.de> wrote:
> Hello,
>
> I get unpredictibale results selecting from a view depending on
> index-usage.
>
> Please see the attached script for details.
>
> Is it a bug or some "weird feature"?
>
> Any help appreciated to get predictibale results
>
> Sebastian
>
>
> CREATE TABLE test1 (
>  id SERIAL PRIMARY KEY,
>  name TEXT NOT NULL
> );
>
> INSERT INTO test1 (name) VALUES ('test1_1');
> INSERT INTO test1 (name) VALUES ('test1_2');
> INSERT INTO test1 (name) VALUES ('test1_3');
>
> CREATE TABLE test2 (
>  id SERIAL PRIMARY KEY,
>  type TEXT NOT NULL CHECK (type IN ('a','b','c')),
>  test1_id INTEGER REFERENCES test1
> );
>
> INSERT INTO test2 (type,test1_id) VALUES ('a',1);
> INSERT INTO test2 (type,test1_id) VALUES ('a',2);
> INSERT INTO test2 (type,test1_id) VALUES ('a',3);
> INSERT INTO test2 (type,test1_id) VALUES ('b',1);
> INSERT INTO test2 (type,test1_id) VALUES ('b',2);
> INSERT INTO test2 (type,test1_id) VALUES ('b',3);
> INSERT INTO test2 (type,test1_id) VALUES ('c',1);
> INSERT INTO test2 (type,test1_id) VALUES ('c',2);
> INSERT INTO test2 (type,test1_id) VALUES ('c',3);
>
> CREATE OR REPLACE VIEW test AS
>  SELECT test2.*
>  FROM test2
>  LEFT JOIN test2 AS t2 ON
>    test2.type IN ('c','b') AND
>    t2.type = 'a';
>
> SELECT * from test WHERE type = 'a';
>
> CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
> CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
> CREATE INDEX index_c ON test2 (id) WHERE type = 'c';
>
> SET enable_seqscan TO OFF;
>
> SELECT * from test WHERE type = 'a';
>

i don't have my machine at hand but i don't think that even the select
is right, you have a join but without joining clauses you will get a
cartesian product...

what do you believe is the right answer... just for my probe later...

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)

Re: Bug with index-usage?

От
Scott Marlowe
Дата:
On Mon, 2005-11-14 at 11:30, Csaba Nagy wrote:
> The OP was complaining about the results of the above script, which I
> could readily reproduce on a 8.1.0 installation on debian (see below).
> The same select which returned 3 rows will return nothing after creating
> the partial indexes, which looks as a bug to me...
> I can't tell anything about why it happens, just confirm that I can
> reproduce too...

Yep, I just reproduced it too.

In the future, I'd recommend they include the bad output, as I simply
thought "unpredictable output" was referring to performance, not the
actual data.



Re: Bug with index-usage?

От
Sebastian Böck
Дата:
Jaime Casanova wrote:
> On 11/14/05, Sebastian Böck <sebastianboeck@freenet.de> wrote:
>
>>Hello,
>>
>>I get unpredictibale results selecting from a view depending on
>>index-usage.
>>
>>Please see the attached script for details.
>>
>>Is it a bug or some "weird feature"?
>>
>>Any help appreciated to get predictibale results
>>
>>Sebastian
>>
>>
>>CREATE TABLE test1 (
>> id SERIAL PRIMARY KEY,
>> name TEXT NOT NULL
>>);
>>
>>INSERT INTO test1 (name) VALUES ('test1_1');
>>INSERT INTO test1 (name) VALUES ('test1_2');
>>INSERT INTO test1 (name) VALUES ('test1_3');
>>
>>CREATE TABLE test2 (
>> id SERIAL PRIMARY KEY,
>> type TEXT NOT NULL CHECK (type IN ('a','b','c')),
>> test1_id INTEGER REFERENCES test1
>>);
>>
>>INSERT INTO test2 (type,test1_id) VALUES ('a',1);
>>INSERT INTO test2 (type,test1_id) VALUES ('a',2);
>>INSERT INTO test2 (type,test1_id) VALUES ('a',3);
>>INSERT INTO test2 (type,test1_id) VALUES ('b',1);
>>INSERT INTO test2 (type,test1_id) VALUES ('b',2);
>>INSERT INTO test2 (type,test1_id) VALUES ('b',3);
>>INSERT INTO test2 (type,test1_id) VALUES ('c',1);
>>INSERT INTO test2 (type,test1_id) VALUES ('c',2);
>>INSERT INTO test2 (type,test1_id) VALUES ('c',3);
>>
>>CREATE OR REPLACE VIEW test AS
>> SELECT test2.*
>> FROM test2
>> LEFT JOIN test2 AS t2 ON
>>   test2.type IN ('c','b') AND
>>   t2.type = 'a';
>>
>>SELECT * from test WHERE type = 'a';
>>
>>CREATE INDEX index_a ON test2 (id) WHERE type = 'a';
>>CREATE INDEX index_b ON test2 (id) WHERE type = 'b';
>>CREATE INDEX index_c ON test2 (id) WHERE type = 'c';
>>
>>SET enable_seqscan TO OFF;
>>
>>SELECT * from test WHERE type = 'a';
>>
>
>
> i don't have my machine at hand but i don't think that even the select
> is right, you have a join but without joining clauses you will get a
> cartesian product...
>
> what do you believe is the right answer... just for my probe later...

I think it should be:

  id | type | test1_id
----+------+----------
   1 | a    |        1
   2 | a    |        2
   3 | a    |        3


because a

EXPLAIN SELECT * from test WHERE type = 'a';

shows some weird assumptions

Index Scan using index_a on test2  (cost=0.00..4.69 rows=1 width=40)
   Filter: (("type" = 'c'::text) OR ("type" = 'b'::text))

note that index_a is defined as:

CREATE INDEX index_a ON test2 (id) WHERE type = 'a';

Sebastian


Re: Bug with index-usage?

От
Tom Lane
Дата:
=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:
> I get unpredictibale results selecting from a view depending on
> index-usage.

It's not actually *using* the indexes, although presence of the indexes
does seem to be needed to trigger the bug:

regression=# explain SELECT * from test WHERE type = 'a';
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..2.29 rows=1 width=40)
   Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
   ->  Seq Scan on test2  (cost=0.00..1.16 rows=1 width=40)
         Filter: (("type" = 'a'::text) AND (("type" = 'c'::text) OR ("type" = 'b'::text)))
   ->  Seq Scan on test2 t2  (cost=0.00..1.11 rows=1 width=0)
         Filter: ("type" = 'a'::text)
(6 rows)

regression=# drop index index_b;
DROP INDEX
regression=# explain SELECT * from test WHERE type = 'a';
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..2.24 rows=1 width=40)
   Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
   ->  Seq Scan on test2  (cost=0.00..1.11 rows=1 width=40)
         Filter: ("type" = 'a'::text)
   ->  Seq Scan on test2 t2  (cost=0.00..1.11 rows=1 width=0)
         Filter: ("type" = 'a'::text)
(6 rows)

It looks like the problem is that the new 8.1 OR-index-qual code is
confused about when it can apply outer-join conditions.  It shouldn't be
propagating the outer-join condition into the scan condition on test2,
but it is.  Will fix.

            regards, tom lane

Re: Bug with index-usage?

От
Sebastian Böck
Дата:
Tom Lane wrote:
> =?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:
>
>>I get unpredictibale results selecting from a view depending on
>>index-usage.
>
>
> It's not actually *using* the indexes, although presence of the indexes
> does seem to be needed to trigger the bug:
>
> regression=# explain SELECT * from test WHERE type = 'a';
>                                         QUERY PLAN
> -------------------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=0.00..2.29 rows=1 width=40)
>    Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
>    ->  Seq Scan on test2  (cost=0.00..1.16 rows=1 width=40)
>          Filter: (("type" = 'a'::text) AND (("type" = 'c'::text) OR ("type" = 'b'::text)))
>    ->  Seq Scan on test2 t2  (cost=0.00..1.11 rows=1 width=0)
>          Filter: ("type" = 'a'::text)
> (6 rows)
>
> regression=# drop index index_b;
> DROP INDEX
> regression=# explain SELECT * from test WHERE type = 'a';
>                                   QUERY PLAN
> -------------------------------------------------------------------------------
>  Nested Loop Left Join  (cost=0.00..2.24 rows=1 width=40)
>    Join Filter: (("outer"."type" = 'c'::text) OR ("outer"."type" = 'b'::text))
>    ->  Seq Scan on test2  (cost=0.00..1.11 rows=1 width=40)
>          Filter: ("type" = 'a'::text)
>    ->  Seq Scan on test2 t2  (cost=0.00..1.11 rows=1 width=0)
>          Filter: ("type" = 'a'::text)
> (6 rows)
>
> It looks like the problem is that the new 8.1 OR-index-qual code is
> confused about when it can apply outer-join conditions.  It shouldn't be
> propagating the outer-join condition into the scan condition on test2,
> but it is.  Will fix.
>
>             regards, tom lane

Hi,

thanks for lookin into it.

I patched my 8.1 installation with the following changes:

http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461

The error described above doesn't exist any more, but it's still
buggy. Just create a view which is left-joining to an other table. The
joined columns don't show up in the view.

CREATE OR REPLACE VIEW test_ AS
   SELECT test2.*, test1.name
   FROM test2
   LEFT JOIN test1 ON test1.id = test2.test1_id
   LEFT JOIN test2 AS t2 ON
     test2.type IN ('c','b') AND
     t2.type = 'a';

In 8.0 I get:

SELECT * from test WHERE type = 'a';
  id | type | test1_id |  name
----+------+----------+---------
   1 | a    |        1 | test1_1
   2 | a    |        2 | test1_2
   3 | a    |        3 | test1_3
(3 rows)

In 8.1 (with or without your patches) I get:

SELECT * from test_ WHERE type = 'a';
  id | type | test1_id | name
----+------+----------+------
   1 | a    |        1 |
   2 | a    |        2 |
   3 | a    |        3 |
(3 rows)

Hope you could repeat the problem. Otherwise, please contact me.

Sebastian

Re: Bug with index-usage?

От
Tom Lane
Дата:
=?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:
> I patched my 8.1 installation with the following changes:
> http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461

> The error described above doesn't exist any more, but it's still
> buggy.

Yup, you're right :-(.  Looks like we haven't been doing adequate
testing with complex OUTER JOIN clauses ...

Fix committed.  Thanks for the report!

            regards, tom lane

Re: Bug with index-usage?

От
Sebastian Böck
Дата:
Tom Lane wrote:
> =?ISO-8859-1?Q?Sebastian_B=F6ck?= <sebastianboeck@freenet.de> writes:
>
>>I patched my 8.1 installation with the following changes:
>>http://www-new.commandprompt.com/projects/public/pgsql/changeset/23461
>
>
>>The error described above doesn't exist any more, but it's still
>>buggy.
>
>
> Yup, you're right :-(.  Looks like we haven't been doing adequate
> testing with complex OUTER JOIN clauses ...
>
> Fix committed.  Thanks for the report!

Thanks for the quick fix, everything looks good now!

Sebastian