Обсуждение: Strange problem with create table as select * from table;

Поиск
Список
Период
Сортировка

Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
Hi
We have pretty weird situation, which seems to be impossible, but perhaps you'll notice something that will let me fix
theproblem. 

System: SunOS 5.11      snv_130
Pg: PostgreSQL 8.4.7 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10 SunOS_i386 2009/06/03, 64-bit

In there I have a table:

$ \d sssssss.xobjects
                                                   Table "sssssss.xobjects"
          Column           |           Type           |                               Modifiers

---------------------------+--------------------------+-----------------------------------------------------------------------
 xobject_id                | integer                  | not null default
nextval('sssssss.xobjects_xobject_id_seq'::regclass)
 magic_id                  | integer                  |
 xxxxxxxxxxxxxx            | integer                  |
 xxxxxxxxxxxxx             | integer                  |
 creation_tsz              | timestamp with time zone |
 xxxx                      | character varying(255)   |
 xxxxxxxxxx                | character varying(255)   |
 xxxxxxxxxxx               | character varying(255)   |
 xxxx                      | character varying(255)   |
 xxxxx                     | character varying(255)   |
 xxx                       | character varying(255)   |
 xxxxxxxxxx                | integer                  |
 xxxxxxxxxxxxxx            | character varying(128)   |
 xxxxxxxxxxxxx             | character varying(255)   |
 xxxxxxxxxxxxxx            | character varying(24)    |
 xxxxxxxxxxxxxxxxxxx       | text                     |
 xxxxxxxxxxxxxxxxxx        | text                     |
 xxxxxxxx                  | boolean                  | default false
 xxxxxxxxxxx               | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxx       | numeric(24,2)            |
 xxxxxxxxxxxxx             | character varying(6)     | default 'USD'::character varying
 xxxxxxxxxxxxxxxxxxxx      | text                     |
 xxxxxxxxxxx               | boolean                  | default false
 xxxxxxxxxxxxxxxxx         | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxx       | character varying(6)     |
 xxxxxxxxxxxxxxxxxxxxxxxxx | numeric(24,2)            |
 xxxxxxxxxxxxxx            | numeric(24,2)            |
 xxxxxxxxx                 | boolean                  |
 xxxxxxxxxxxxxxxx          | integer                  |
 xxxxxxxxxxxxxxx           | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxxxxx   | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxx        | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxx      | numeric(24,2)            |
 xxxxxxxxx                 | bigint                   |
 xxxxxxxxxxxxxxx           | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxx       | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxxx     | numeric(24,2)            |
 xxxxxxxxxxxxxxxxx         | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxxx     | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxxxxx   | numeric(24,2)            |
 xxxxxxxxxxxxxx            | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxx        | numeric(24,2)            |
 xxxxxxxxxxxxxxxxxxxx      | numeric(24,2)            |
 xxxxxxxxxxxxxxxxx         | integer                  |
 xxxxxxxxxxxxxxxxxxxxxx    | text                     |
 xxxxxxxxx                 | character varying(255)   |
 xxxxxxx                   | integer                  |
 xxxxxxxxxxxxxxxx          | boolean                  |
 xxxxxxxxxxxxxxxx          | numeric(24,2)            |
Indexes:
    "xobjects_pkey" PRIMARY KEY, btree (xobject_id)
    "xobjects_creation_tsz" btree (creation_tsz)

$ select * from pg_class  where oid = 'sssssss.xobjects'::regclass;
-[ RECORD 1 ]--+-------------------------------------------------------------------------------------------
relname        | xobjects
relnamespace   | 9868449
reltype        | 7441360
relowner       | 10
relam          | 0
relfilenode    | 334123501
reltablespace  | 0
relpages       | 5534109
reltuples      | 3.49685e+07
reltoastrelid  | 334123504
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relistemp      | f
relkind        | r
relnatts       | 49
relchecks      | 0
relhasoids     | f
relhaspkey     | t
relhasrules    | f
relhastriggers | t
relhassubclass | f
relfrozenxid   | 75260515
relacl         | {postgres=arwdDxt/postgres,developer=r/postgres,sitemaps=r/postgres,uuuuuuu_ro=r/postgres}
reloptions     | [null]

$ select * from pg_stat_user_tables where relid = 'sssssss.xobjects'::regclass;
-[ RECORD 1 ]----+------------------------------
relid            | 7441358
schemaname       | sssssss
relname          | xobjects
seq_scan         | 302
seq_tup_read     | 8367856283
idx_scan         | 34898129
idx_tup_fetch    | 2836717789
n_tup_ins        | 7772954
n_tup_upd        | 1
n_tup_del        | 5539090
n_tup_hot_upd    | 1
n_live_tup       | 35068206
n_dead_tup       | 137275
last_vacuum      | [null]
last_autovacuum  | 2011-10-30 12:29:38.853241+00
last_analyze     | [null]
last_autoanalyze | 2011-10-30 06:30:28.334954+00

This table looks perfectly OK. What's important - it doesn't have any duplicates in xobject_id column:

$ select xobject_id, count(*) from sssssss.xobjects group by 1 having count(*) > 1;
 xobject_id | count
------------+-------
(0 rows)

All looks good. pg_dump of the table also doesn't show any strange problems, and is duplicate free. But:

$ create table zzz as select * from sssssss.xobjects;
SELECT

$ select xobject_id, count(*) from zzz group by 1 having count(*) > 1 order by 2 desc;
 xobject_id | count
------------+-------
         -1 |    40
(1 row)

$ select magic_id from zzz where xobject_id = -1 order by 1;
 magic_id
----------
 30343295
 30343295
 30408831
 30408831
 30408831
 30539903
 30605439
 30605439
 30670975
 30670975
 30670975
 30802047
 30867583
 30933119
 31195263
 31195263
 31260799
 31326335
 31588479
 31588479
 31588479
 31654015
 31719551
 31785087
 31785087
 31785087
 31850623
 31850623
 31850623
 31850623
 31981695
 31981695
 32047231
 32047231
 32112767
 32309375
 32374911
 32440447
 32505983
 32505983
(40 rows)

What's interesting is that when I did it previously, couple of days ago, and sampled some randon magic_ids that I foudn
withxobject_id = -1, I got: 
$ select xobject_id, magic_id from zzz where magic_id in ( 32440447, 32047231, 32505983);
 xobject_id | magic_id
------------+----------
   35858705 | 32505983
   35793169 | 32440447
         -1 | 32440447
         -1 | 32047231
         -1 | 32505983
         -1 | 32505983
   35399951 | 32047231
         -1 | 32047231
(8 rows)

please note that the same ids are duplicated now too.

and if magic_id was twice with xobject_id = -1, it is the same case now!.

In base sssssss.xobjects, all those rows are just once:

$ select xobject_id, magic_id from sssssss.xobjects where magic_id in (32440447, 32047231, 32505983);
 xobject_id | magic_id
------------+----------
   35858705 | 32505983
   35793169 | 32440447
   35399951 | 32047231
(3 rows)

I also verified that there are no concurrent updates that would set xobject_id to -1, so it's not a problem of
isolation.

During the night I repeated the procedure and the rows that got duplicated seem to be the same - at the very least -
thesame magic_id. 

Does above seem sensible for anyone? Any suggestions on what could be broken?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Thursday, November 03, 2011 1:25:58 am hubert depesz lubaczewski wrote:
> Hi
> We have pretty weird situation, which seems to be impossible, but perhaps
> you'll notice something that will let me fix the problem.
>
> System: SunOS 5.11      snv_130
> Pg: PostgreSQL 8.4.7 on i386-pc-solaris2.11, compiled by cc: Sun C 5.10
> SunOS_i386 2009/06/03, 64-bit
>

>
> I also verified that there are no concurrent updates that would set
> xobject_id to -1, so it's not a problem of isolation.
>
> During the night I repeated the procedure and the rows that got duplicated
> seem to be the same - at the very least - the same magic_id.
>
> Does above seem sensible for anyone? Any suggestions on what could be
> broken?

Do the xobject_id values have other negative numbers or is -1 just a special
case? The only thing I can think of is a corrupted index on xobject_id.

>
> Best regards,
>
> depesz

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 07:00:30AM -0700, Adrian Klaver wrote:
> > I also verified that there are no concurrent updates that would set
> > xobject_id to -1, so it's not a problem of isolation.
> >
> > During the night I repeated the procedure and the rows that got duplicated
> > seem to be the same - at the very least - the same magic_id.
> >
> > Does above seem sensible for anyone? Any suggestions on what could be
> > broken?
>
> Do the xobject_id values have other negative numbers or is -1 just a special
> case? The only thing I can think of is a corrupted index on xobject_id.

minimal xobject_id in source table is 1000.

index on xobject_id might be corrupted, but it doesn't explain that I
don't see duplicates with group_by/having query on xobjects, which uses
seqscan:

$ explain select xobject_id, count(*) from sssssss.xobjects group by 1 having count(*) > 1;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 GroupAggregate  (cost=11718280.34..12682724.26 rows=35070688 width=4)
   Filter: (count(*) > 1)
   ->  Sort  (cost=11718280.34..11805957.06 rows=35070688 width=4)
         Sort Key: xobject_id
         ->  Seq Scan on xobjects  (cost=0.00..5884815.88 rows=35070688 width=4)
(5 rows)

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Alban Hertroys
Дата:
On 3 November 2011 15:15, hubert depesz lubaczewski <depesz@depesz.com> wrote:
>> Do the xobject_id values have other negative numbers or is -1 just a special
>> case? The only thing I can think of is a corrupted index on xobject_id.
>
> minimal xobject_id in source table is 1000.
>
> index on xobject_id might be corrupted, but it doesn't explain that I
> don't see duplicates with group_by/having query on xobjects, which uses
> seqscan:

Actually, it does. A sequential scan doesn't use the index.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 03:19:36PM +0100, Alban Hertroys wrote:
> On 3 November 2011 15:15, hubert depesz lubaczewski <depesz@depesz.com> wrote:
> >> Do the xobject_id values have other negative numbers or is -1 just a special
> >> case? The only thing I can think of is a corrupted index on xobject_id.
> >
> > minimal xobject_id in source table is 1000.
> >
> > index on xobject_id might be corrupted, but it doesn't explain that I
> > don't see duplicates with group_by/having query on xobjects, which uses
> > seqscan:
>
> Actually, it does. A sequential scan doesn't use the index.

sure. but so isn't create table as.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> index on xobject_id might be corrupted, but it doesn't explain that I
> don't see duplicates with group_by/having query on xobjects, which uses
> seqscan:

I was just going to ask you to check that.  Weird as can be.

Does plain old "SELECT COUNT(*)" show a difference between the two
tables?

Do you get similar misbehavior if you break the CREATE TABLE AS into a
CREATE and an INSERT/SELECT?  Also, please note the rowcount returned
by INSERT/SELECT and see how it matches up with the tables afterwards.

Does turning synchronize_seqscans off affect the behavior?

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 10:55:20AM -0400, Tom Lane wrote:
> > index on xobject_id might be corrupted, but it doesn't explain that I
> > don't see duplicates with group_by/having query on xobjects, which uses
> > seqscan:
> I was just going to ask you to check that.  Weird as can be.
> Does plain old "SELECT COUNT(*)" show a difference between the two
> tables?

yes. the xobjects table is written to quite a lot and turning it off is
not an option.

> Do you get similar misbehavior if you break the CREATE TABLE AS into a
> CREATE and an INSERT/SELECT?  Also, please note the rowcount returned
> by INSERT/SELECT and see how it matches up with the tables afterwards.
>
> Does turning synchronize_seqscans off affect the behavior?

interesting. will test and let you know. the test will take couple of
hours, though.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Thursday, November 03, 2011 7:15:22 am hubert depesz lubaczewski wrote:
> On Thu, Nov 03, 2011 at 07:00:30AM -0700, Adrian Klaver wrote:
> > > I also verified that there are no concurrent updates that would set
> > > xobject_id to -1, so it's not a problem of isolation.
> > >
> > > During the night I repeated the procedure and the rows that got
> > > duplicated seem to be the same - at the very least - the same
> > > magic_id.
> > >
> > > Does above seem sensible for anyone? Any suggestions on what could be
> > > broken?
> >
> > Do the xobject_id values have other negative numbers or is -1 just a
> > special case? The only thing I can think of is a corrupted index on
> > xobject_id.
>
> minimal xobject_id in source table is 1000.

So just to be clear there is and never has been a -1 value for xobject_id in the
source table?
So a select count(*) from sssssss.xobjects where xobject_id = -1 on the source
table yields 0?

> depesz

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 08:04:19AM -0700, Adrian Klaver wrote:
> So just to be clear there is and never has been a -1 value for xobject_id in the
> source table?

yes. min value of xobject_id is 1000, and we had trigger in place on the
table which logged all inserts/updates/deletes and the value -1 never
showed up (At least in the last couple of days, during which i was
making the copies).

> So a select count(*) from sssssss.xobjects where xobject_id = -1 on the source
> table yields 0?

yes, that's correct. both using index, and usingf seq scan.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Alban Hertroys
Дата:
On 3 November 2011 09:25, hubert depesz lubaczewski <depesz@depesz.com> wrote:
> All looks good. pg_dump of the table also doesn't show any strange problems, and is duplicate free. But:
>
> $ create table zzz as select * from sssssss.xobjects;
> SELECT
>
> $ select xobject_id, count(*) from zzz group by 1 having count(*) > 1 order by 2 desc;
>  xobject_id | count
> ------------+-------
>         -1 |    40
> (1 row)

Can you verify that these queries both do actually use a sequential
scan, and not, for some reason, an index scan? Just to rule out the
index corruption scenario.
You (or someone near you) might have disabled seqscans, for example.

Another thought: Is it possible that xobject_id just happens to be
used internally as a hidden field by Postgres or by an extension? That
would be another explanation for seeing -1  or duplicates in that
column.

If that's the case, I would have expected an error on creation of that table.

And lastly, is this behaviour after copying a table into a new one
reproducible or did it happen just once?

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Thursday, November 03, 2011 8:05:38 am hubert depesz lubaczewski wrote:
> On Thu, Nov 03, 2011 at 08:04:19AM -0700, Adrian Klaver wrote:
> > So just to be clear there is and never has been a -1 value for xobject_id
> > in the source table?
>
> yes. min value of xobject_id is 1000, and we had trigger in place on the
> table which logged all inserts/updates/deletes and the value -1 never
> showed up (At least in the last couple of days, during which i was
> making the copies).
>
> > So a select count(*) from sssssss.xobjects where xobject_id = -1 on the
> > source table yields 0?
>
> yes, that's correct. both using index, and usingf seq scan.

Hmmm.  Now we await the results of the tests Tom suggested.  Just a thought, any
other strange behavior, hiccups in the database over the past couple of days?

>
> Best regards,
>
> depesz

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 04:21:37PM +0100, Alban Hertroys wrote:
> On 3 November 2011 09:25, hubert depesz lubaczewski <depesz@depesz.com> wrote:
> > All looks good. pg_dump of the table also doesn't show any strange problems, and is duplicate free. But:
> >
> > $ create table zzz as select * from sssssss.xobjects;
> > SELECT
> >
> > $ select xobject_id, count(*) from zzz group by 1 having count(*) > 1 order by 2 desc;
> >  xobject_id | count
> > ------------+-------
> >         -1 |    40
> > (1 row)
>
> Can you verify that these queries both do actually use a sequential
> scan, and not, for some reason, an index scan? Just to rule out the
> index corruption scenario.
> You (or someone near you) might have disabled seqscans, for example.

yes, i tested it with explain.

> Another thought: Is it possible that xobject_id just happens to be
> used internally as a hidden field by Postgres or by an extension? That
> would be another explanation for seeing -1  or duplicates in that
> column.

no. it's not a special field. just plain old "something_id", with
underscore, so it is not special in any way.

> If that's the case, I would have expected an error on creation of that table.
> And lastly, is this behaviour after copying a table into a new one
> reproducible or did it happen just once?

fully reproductible, as i mentioned at the end of my original mail.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 08:23:01AM -0700, Adrian Klaver wrote:
> On Thursday, November 03, 2011 8:05:38 am hubert depesz lubaczewski wrote:
> > On Thu, Nov 03, 2011 at 08:04:19AM -0700, Adrian Klaver wrote:
> > > So just to be clear there is and never has been a -1 value for xobject_id
> > > in the source table?
> >
> > yes. min value of xobject_id is 1000, and we had trigger in place on the
> > table which logged all inserts/updates/deletes and the value -1 never
> > showed up (At least in the last couple of days, during which i was
> > making the copies).
> >
> > > So a select count(*) from sssssss.xobjects where xobject_id = -1 on the
> > > source table yields 0?
> >
> > yes, that's correct. both using index, and usingf seq scan.
>
> Hmmm.  Now we await the results of the tests Tom suggested.  Just a thought, any
> other strange behavior, hiccups in the database over the past couple of days?

no. it's doing it's job without problems.

other tests are running, but simple question - how to get number of rows
affected from psql?

create table xxx as select * from xobjects;
returns just:
SELECT

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> other tests are running, but simple question - how to get number of rows
> affected from psql?

> create table xxx as select * from xobjects;
> returns just:
> SELECT

We fixed that in 9.0, but 8.4 won't provide the count (unless you care to
patch it).  That's why I suggested doing a separate insert/select.

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Thursday, November 03, 2011 8:30:34 am hubert depesz lubaczewski wrote:
> On Thu, Nov 03, 2011 at 08:23:01AM -0700, Adrian Klaver wrote:
> > On Thursday, November 03, 2011 8:05:38 am hubert depesz lubaczewski wrote:
> > > On Thu, Nov 03, 2011 at 08:04:19AM -0700, Adrian Klaver wrote:
> > > > So just to be clear there is and never has been a -1 value for
> > > > xobject_id in the source table?
> > >
> > > yes. min value of xobject_id is 1000, and we had trigger in place on
> > > the table which logged all inserts/updates/deletes and the value -1
> > > never showed up (At least in the last couple of days, during which i
> > > was making the copies).
> > >
> > > > So a select count(*) from sssssss.xobjects where xobject_id = -1 on
> > > > the source table yields 0?
> > >
> > > yes, that's correct. both using index, and usingf seq scan.
> >
> > Hmmm.  Now we await the results of the tests Tom suggested.  Just a
> > thought, any other strange behavior, hiccups in the database over the
> > past couple of days?
>
> no. it's doing it's job without problems.
>
> other tests are running, but simple question - how to get number of rows
> affected from psql?

See Toms answer

>
> create table xxx as select * from xobjects;
> returns just:
> SELECT

The thing that has me puzzled is shown below from you original post:

$ select xobject_id, magic_id from zzz where magic_id in ( 32440447, 32047231,
32505983);
 xobject_id | magic_id
------------+----------
   35858705 | 32505983
   35793169 | 32440447
         -1 | 32440447
         -1 | 32047231
         -1 | 32505983
         -1 | 32505983
   35399951 | 32047231
         -1 | 32047231
(8 rows)

Looks like multiple tuples of the same row where magic_id(s) of 32505983 and
32047231 where touched three times and 32440447 twice.  For some reason the
original table is seeing only the most recent version while the CREATE AS is
pulling it and past versions. Some sort of visibility problem, exactly what is
beyond me at this point.


>
> Best regards,
>
> depesz

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 10:55:20AM -0400, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > index on xobject_id might be corrupted, but it doesn't explain that I
> > don't see duplicates with group_by/having query on xobjects, which uses
> > seqscan:
>
> I was just going to ask you to check that.  Weird as can be.
>
> Does plain old "SELECT COUNT(*)" show a difference between the two
> tables?
>
> Do you get similar misbehavior if you break the CREATE TABLE AS into a
> CREATE and an INSERT/SELECT?  Also, please note the rowcount returned
> by INSERT/SELECT and see how it matches up with the tables afterwards.
>
> Does turning synchronize_seqscans off affect the behavior?

So, did some tests:

$ select count(*) from sssssss.xobjects;
  count
----------
 35179058
(1 row)

$ create table qqq as select * from sssssss.xobjects;
SELECT

$ select count(*) from qqq;
  count
----------
 35179631
(1 row)

$ select count(*) from qqq where xobject_id = -1;
 count
-------
    40
(1 row)

$ drop table qqq;
DROP TABLE

$ select count(*) from sssssss.xobjects;
  count
----------
 35182687
(1 row)

$ create table qqq ( like sssssss.xobjects );
CREATE TABLE

$ insert into qqq select * from sssssss.xobjects;
INSERT 0 35182962

$ select count(*) from qqq;
  count
----------
 35182962
(1 row)

$ select count(*) from qqq where xobject_id = -1;
 count
-------
    40
(1 row)

$ drop table qqq;
DROP TABLE

$ set synchronize_seqscans = off;
SET

$ create table qqq as select * from sssssss.xobjects;

SELECT
$ select count(*) from qqq;
  count
----------
 35185653
(1 row)

$ select count(*) from qqq where xobject_id = -1;
 count
-------
    40
(1 row)

$ drop table qqq;
DROP TABLE

$ create table qqq ( like sssssss.xobjects );
CREATE TABLE

$ insert into qqq select * from sssssss.xobjects;
INSERT 0 35188896

$ select count(*) from qqq;
  count
----------
 35188896
(1 row)

$ select count(*) from qqq where xobject_id = -1;
 count
-------
    40
(1 row)

$ drop table qqq;
DROP TABLE

as you can see counts of rows in created table are more or less
sensible, but whatever method I used - create table as, insert into,
using sychronized_scans (initially) or not (later) - copy of the table,
as long as it's in database, has 40 those "-1" rows.

one note - maybe it wasn't clear from my original mail - when I did
pg_dump of the xobjects table, it didn't have -1 rows.



--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Thursday, November 03, 2011 1:03:12 pm hubert depesz lubaczewski wrote:
> On Thu, Nov 03, 2011 at 10:55:20AM -0400, Tom Lane wrote:

>
> So, did some tests:
>

> as you can see counts of rows in created table are more or less
> sensible, but whatever method I used - create table as, insert into,
> using sychronized_scans (initially) or not (later) - copy of the table,
> as long as it's in database, has 40 those "-1" rows.

At least it is consistent:)
Characteristics of those forty rows.
1) creation_tsz  clustered?
2) magic_id in a range?
3) Non -1  xobject_id in a range?
4) Any other field show a similarity?


>
> one note - maybe it wasn't clear from my original mail - when I did
> pg_dump of the xobjects table, it didn't have -1 rows.

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
Adrian Klaver <adrian.klaver@gmail.com> writes:
> On Thursday, November 03, 2011 1:03:12 pm hubert depesz lubaczewski wrote:
>> as you can see counts of rows in created table are more or less
>> sensible, but whatever method I used - create table as, insert into,
>> using sychronized_scans (initially) or not (later) - copy of the table,
>> as long as it's in database, has 40 those "-1" rows.

> At least it is consistent:)

Yeah, it wasn't clear before, but it looks like there are specific
rows that there's something wrong with.

> Characteristics of those forty rows.
> 1) creation_tsz  clustered?
> 2) magic_id in a range?
> 3) Non -1  xobject_id in a range?
> 4) Any other field show a similarity?

I would like to know the ctid's of the -1 rows in the copied table,
along with the ctid's of the rows they share magic_ids with, and
the ctid's of the rows with those same magic_ids in the original.
I'm wondering whether the affected rows are physically clustered ...

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
> I would like to know the ctid's of the -1 rows in the copied table,
> along with the ctid's of the rows they share magic_ids with, and
> the ctid's of the rows with those same magic_ids in the original.
> I'm wondering whether the affected rows are physically clustered ...

i tried:
create table qqq as select cmax as o_cmax, xmax as o_xmax, cmin as
o_cmin, xmin as o_xmin, ctid as o_ctid, * from sssssss.xobjects;

but the resulting table didn't have -1 values:
$ select xobject_id, count(*) from qqq group by 1 having count(*) > 1;

                                                                              
 xobject_id | count

                                                                              
------------+-------
(0 rows)

$ select o_cmax,o_xmax,o_cmin,o_xmin,o_ctid, xobject_id, order_id from qqq where xobject_id = -1;

                                                                       
 o_cmax | o_xmax | o_cmin | o_xmin | o_ctid | xobject_id | order_id

                                                                              
--------+--------+--------+--------+--------+------------+----------
(0 rows)

i'm checking now something else, but later will try to get better grasp on those bad rows.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> i tried:
> create table qqq as select cmax as o_cmax, xmax as o_xmax, cmin as
> o_cmin, xmin as o_xmin, ctid as o_ctid, * from sssssss.xobjects;

> but the resulting table didn't have -1 values:

Oh, that's pretty interesting ... suggests that the targetlist has to be
exactly "select *".  We have some minor optimizations for that case,
though nothing that could result in extra rows AFAICS.

Are there any dropped columns in the original table?  Try

select * from pg_attribute where attrelid = 'sssssss.xobjects'::regclass
and attisdropped;

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 06:02:04PM -0400, Tom Lane wrote:
> select * from pg_attribute where attrelid = 'sssssss.xobjects'::regclass
> and attisdropped;

no dropped columns.

looking for some other info. will post as soon as i'll gather it, but
that will be in utc morning :(

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Thu, Nov 03, 2011 at 11:03:45PM +0100, hubert depesz lubaczewski wrote:
> looking for some other info. will post as soon as i'll gather it, but
> that will be in utc morning :(


I looked closer at the rows that got -1 xobject_id.
$ select magic_id, count(*) from qqq where xobject_id = -1 group by 1 order by 1;

                                                                              
 magic_id | count
----------+-------
 30343295 |     2
 30408831 |     3
 30539903 |     1
 30605439 |     2
 30670975 |     3
 30802047 |     1
 30867583 |     1
 30933119 |     1
 31195263 |     2
 31260799 |     1
 31326335 |     1
 31588479 |     3
 31654015 |     1
 31719551 |     1
 31785087 |     3
 31850623 |     4
 31981695 |     2
 32047231 |     2
 32112767 |     1
 32309375 |     1
 32374911 |     1
 32440447 |     1
 32505983 |     2
(23 rows)

So, I checked original counts for these magic_id:

$ select magic_id, count(*) from sssssss.xobjects where magic_id in
(30343295,30408831,30539903,30605439,30670975,30802047,30867583,30933119,31195263,31260799,31326335,31588479,31654015,31719551,31785087,31850623,31981695,32047231,32112767,32309375,32374911,32440447,32505983)
groupby 1 order by 1; 
 magic_id | count
----------+-------
 30343295 |     1
 30408831 |     1
 30539903 |     1
 30605439 |     1
 30670975 |     1
 30802047 |     1
 30867583 |     1
 30933119 |     1
 31195263 |     1
 31260799 |     1
 31326335 |     1
 31588479 |     1
 31654015 |     1
 31719551 |     1
 31785087 |     1
 31850623 |     1
 31981695 |     1
 32047231 |     1
 32112767 |     1
 32309375 |     1
 32374911 |     1
 32440447 |     1
 32505983 |     1
(23 rows)

$ select min(magic_id), max(magic_id), count(distinct magic_id), sum( case when magic_id between 30343295 and 32505983
then1 else 0 end ) as count_in_range from sssssss.xobjects; 
 min  |   max    |  count   | count_in_range

                                                                              
------+----------+----------+----------------
 1000 | 37830834 | 32030523 |        2079327
(1 row)

So, the ids are not clustered.

at least the magic_id. but since these are more or less chronological, it means that these rows were added to db quite
sometime apart: 

 magic_id |     creation_tsz
----------+------------------------
 30343295 | 2011-05-28 00:57:36+00
 30408831 | 2011-05-30 01:51:09+00
 30539903 | 2011-06-02 04:06:20+00
 30605439 | 2011-06-03 18:23:06+00
 30670975 | 2011-06-05 16:49:49+00
 30802047 | 2011-06-08 16:46:22+00
 30867583 | 2011-06-10 01:39:41+00
 30933119 | 2011-06-11 19:48:07+00
 31195263 | 2011-06-18 00:33:24+00
 31260799 | 2011-06-20 01:49:46+00
 31326335 | 2011-06-21 17:53:41+00
 31588479 | 2011-06-28 07:07:19+00
 31654015 | 2011-06-29 20:30:52+00
 31719551 | 2011-07-01 09:50:54+00
 31785087 | 2011-07-03 03:42:02+00
 31850623 | 2011-07-05 05:02:27+00
 31981695 | 2011-07-08 04:49:21+00
 32047231 | 2011-07-09 21:59:25+00
 32112767 | 2011-07-11 16:53:10+00
 32309375 | 2011-07-15 21:52:31+00
 32374911 | 2011-07-17 19:20:34+00
 32440447 | 2011-07-19 03:13:21+00
 32505983 | 2011-07-20 16:15:38+00

So, Let's see how the bad rows (-1) look, in comparison to good ones, in copy of the table:

$ select xobject_id, magic_id, creation_tsz from qqq where magic_id in
(30343295,30408831,30539903,30605439,30670975,30802047,30867583,30933119,31195263,31260799,31326335,31588479,31654015,31719551,31785087,31850623,31981695,32047231,32112767,32309375,32374911,32440447,32505983)
orderby 2, 1; 
 xobject_id | magic_id |      creation_tsz

                                                                              
------------+----------+------------------------
         -1 | 30343295 | 2011-05-28 17:24:58+00
         -1 | 30343295 | 2011-05-28 04:57:23+00
   33695980 | 30343295 | 2011-05-28 00:57:36+00
         -1 | 30408831 | 2011-05-31 04:25:44+00
         -1 | 30408831 | 2011-05-31 01:03:03+00
         -1 | 30408831 | 2011-05-30 14:52:29+00
   33761515 | 30408831 | 2011-05-30 01:51:09+00
         -1 | 30539903 | 2011-06-02 05:05:08+00
   33892588 | 30539903 | 2011-06-02 04:06:20+00
         -1 | 30605439 | 2011-06-05 05:13:01+00
         -1 | 30605439 | 2011-06-04 03:22:08+00
   33958124 | 30605439 | 2011-06-03 18:23:06+00
         -1 | 30670975 | 2011-06-05 17:54:06+00
         -1 | 30670975 | 2011-06-06 13:59:01+00
         -1 | 30670975 | 2011-06-06 14:44:22+00
   34023662 | 30670975 | 2011-06-05 16:49:49+00
         -1 | 30802047 | 2011-06-08 19:06:23+00
   34154737 | 30802047 | 2011-06-08 16:46:22+00
         -1 | 30867583 | 2011-06-11 03:35:12+00
   34220289 | 30867583 | 2011-06-10 01:39:41+00
         -1 | 30933119 | 2011-06-13 12:31:46+00
   34285828 | 30933119 | 2011-06-11 19:48:07+00
         -1 | 31195263 | 2011-06-20 01:08:40+00
         -1 | 31195263 | 2011-06-19 06:27:08+00
   34547974 | 31195263 | 2011-06-18 00:33:24+00
         -1 | 31260799 | 2011-06-21 09:10:19+00
   34613511 | 31260799 | 2011-06-20 01:49:46+00
         -1 | 31326335 | 2011-06-22 15:34:12+00
   34679047 | 31326335 | 2011-06-21 17:53:41+00
         -1 | 31588479 | 2011-06-29 00:10:51+00
         -1 | 31588479 | 2011-06-28 11:32:31+00
         -1 | 31588479 | 2011-06-29 10:37:13+00
   34941191 | 31588479 | 2011-06-28 07:07:19+00
         -1 | 31654015 | 2011-06-30 22:32:22+00
   35006735 | 31654015 | 2011-06-29 20:30:52+00
         -1 | 31719551 | 2011-07-03 02:04:12+00
   35072271 | 31719551 | 2011-07-01 09:50:54+00
         -1 | 31785087 | 2011-07-03 14:44:51+00
         -1 | 31785087 | 2011-07-03 17:05:13+00
         -1 | 31785087 | 2011-07-03 15:20:21+00
   35137807 | 31785087 | 2011-07-03 03:42:02+00
         -1 | 31850623 | 2011-07-06 17:51:30+00
         -1 | 31850623 | 2011-07-05 11:45:57+00
         -1 | 31850623 | 2011-07-06 09:58:26+00
         -1 | 31850623 | 2011-07-06 08:53:44+00
   35203343 | 31850623 | 2011-07-05 05:02:27+00
         -1 | 31981695 | 2011-07-08 09:19:59+00
         -1 | 31981695 | 2011-07-08 08:14:24+00
   35334415 | 31981695 | 2011-07-08 04:49:21+00
         -1 | 32047231 | 2011-07-11 16:45:51+00
         -1 | 32047231 | 2011-07-11 14:49:26+00
   35399951 | 32047231 | 2011-07-09 21:59:25+00
         -1 | 32112767 | 2011-07-11 17:51:43+00
   35465487 | 32112767 | 2011-07-11 16:53:10+00
         -1 | 32309375 | 2011-07-16 21:38:56+00
   35662095 | 32309375 | 2011-07-15 21:52:31+00
         -1 | 32374911 | 2011-07-18 07:04:27+00
   35727631 | 32374911 | 2011-07-17 19:20:34+00
         -1 | 32440447 | 2011-07-19 13:18:55+00
   35793169 | 32440447 | 2011-07-19 03:13:21+00
         -1 | 32505983 | 2011-07-21 12:04:21+00
         -1 | 32505983 | 2011-07-21 05:04:36+00
   35858705 | 32505983 | 2011-07-20 16:15:38+00
(63 rows)

What is important, that even if creation_tsz matches for given magic_id - data in other rows are very different - in a
waynot justifiable by simple update of rows. It looks like the content of the row would be made from some other rows? 

Checked the last magic_id that we have problems with: 32505983.

Original data:

$ select xobject_id, magic_id, some_column_01, some_column_2, creation_tsz, md5(xxxx) from qqq where magic_id =
32505983;
 xobject_id | magic_id | some_column_01 | some_column_2 |      creation_tsz      |               md5
------------+----------+----------------+---------------+------------------------+----------------------------------
   35858705 | 32505983 |        5209830 |       8536762 | 2011-07-20 16:15:38+00 | 59ecde2fa2758fa68db3199f5388771f
         -1 | 32505983 |        7492565 |      15395178 | 2011-07-21 05:04:36+00 | 6cadaeab659101ddec2b8b80ab10bea6
         -1 | 32505983 |        5325985 |      10853000 | 2011-07-21 12:04:21+00 | 887f30da9d60237e150e8b788aa3b68c
(3 rows)

$ select xobject_id, magic_id, some_column_01, some_column_2, creation_tsz, md5(xxxx) from sssssss.xobjects where
magic_id= 32505983; 
 xobject_id | magic_id | some_column_01 | some_column_2 |      creation_tsz      |               md5
------------+----------+----------------+---------------+------------------------+----------------------------------
   35858705 | 32505983 |        5209830 |       8536762 | 2011-07-20 16:15:38+00 | 59ecde2fa2758fa68db3199f5388771f
(1 row)

(the selected columns ate the first 6 columns of the table).

As you can see the row with correct xobject_id seems to be perfectly ok.

But what about the rows with xobject_id = -1 ? I used creation_tsz to find them:

$ select xobject_id, magic_id, some_column_01, some_column_2, creation_tsz, md5(xxxx) from sssssss.xobjects where
creation_tszin ('2011-07-21 05:04:36+00', '2011-07-21 12:04:21+00'); 
 xobject_id | magic_id | some_column_01 | some_column_2 |      creation_tsz      |               md5
------------+----------+----------------+---------------+------------------------+----------------------------------
   35891226 | 32538466 |        7492565 |      15395178 | 2011-07-21 05:04:36+00 | 6cadaeab659101ddec2b8b80ab10bea6
   35896800 | 32544040 |        5325985 |      10853000 | 2011-07-21 12:04:21+00 | 887f30da9d60237e150e8b788aa3b68c
(2 rows)

Please note that the rows match *perfectly* aside from the 2 first columns ?!

Let's look into some more, to get more rows to work on:

$ select xobject_id, magic_id, some_column_01, some_column_2, creation_tsz, md5(xxxx) from qqq where magic_id in
(31785087,31850623, 31981695, 31588479, 32505983) order by creation_tsz;
                                                                                     
 xobject_id | magic_id | some_column_01 | some_column_2 |      creation_tsz      |               md5

                                                                              
------------+----------+----------------+---------------+------------------------+----------------------------------
   34941191 | 31588479 |       14187503 |       5243045 | 2011-06-28 07:07:19+00 | fdf95dd37e1f37b8d38eee63bea8fd5c
         -1 | 31588479 |        6665439 |      10951593 | 2011-06-28 11:32:31+00 | 75da687e0823bca3594a52510c658d92
         -1 | 31588479 |        6665439 |      12887953 | 2011-06-29 00:10:51+00 | 6007e35f346df7da585bbfc137df6a20
         -1 | 31588479 |        5325985 |      15444995 | 2011-06-29 10:37:13+00 | 5eac30a00c8d81ca8979ab7f44d7f660
   35137807 | 31785087 |       11559453 |       5052671 | 2011-07-03 03:42:02+00 | 1beb042a8106465f52d8378937791631
         -1 | 31785087 |        5895587 |      10594827 | 2011-07-03 14:44:51+00 | 907fd04a01095ff719092cce391f12e3
         -1 | 31785087 |          41796 |       5785895 | 2011-07-03 15:20:21+00 | dd8c168feeaa78219de7f096824d11a1
         -1 | 31785087 |          64040 |        102195 | 2011-07-03 17:05:13+00 | d03f944d09e6554608d7756b19281526
   35203343 | 31850623 |       12948972 |       7508016 | 2011-07-05 05:02:27+00 | b1e0551397ee61658328a5ece7204a13
         -1 | 31850623 |       10222885 |      15216048 | 2011-07-05 11:45:57+00 | 329b26b67e0333b8c6e524a31150431e
         -1 | 31850623 |        7405758 |       7084671 | 2011-07-06 08:53:44+00 | 27fada1f5777d5b3005230fdc696d69c
         -1 | 31850623 |          21314 |       9122048 | 2011-07-06 09:58:26+00 | 0ad6292210386a3767111df35c2bec26
         -1 | 31850623 |       14364837 |      10260130 | 2011-07-06 17:51:30+00 | d16c07121570e370609493a68c4c4606
   35334415 | 31981695 |       14382062 |      15553123 | 2011-07-08 04:49:21+00 | ca66936edcd8f0e195bdaa3f0887332d
         -1 | 31981695 |          64040 |      10955898 | 2011-07-08 08:14:24+00 | 3d11154de2df1b0758f345df603f14b9
         -1 | 31981695 |          64040 |       7849141 | 2011-07-08 09:19:59+00 | 0593a4250cabf4e0fc90721a0586d5c9
   35858705 | 32505983 |        5209830 |       8536762 | 2011-07-20 16:15:38+00 | 59ecde2fa2758fa68db3199f5388771f
         -1 | 32505983 |        7492565 |      15395178 | 2011-07-21 05:04:36+00 | 6cadaeab659101ddec2b8b80ab10bea6
         -1 | 32505983 |        5325985 |      10853000 | 2011-07-21 12:04:21+00 | 887f30da9d60237e150e8b788aa3b68c
(19 rows)

$ select xobject_id, magic_id, some_column_01, some_column_2, creation_tsz, md5(xxxx) from sssssss.xobjects where
creation_tszin ( select creation_tsz from qqq where magic_id in (31785087, 31850623, 31981695, 31588479, 32505983) and
xobject_id= -1 ) order by creation_tsz;                                              
 xobject_id | magic_id | some_column_01 | some_column_2 |      creation_tsz      |               md5

                                                                              
------------+----------+----------------+---------------+------------------------+----------------------------------
   34943757 | 31591045 |        5801572 |      14146975 | 2011-06-28 11:32:31+00 | 1fb5d088e9be92609f5d2d237033ff71
(irrelevantrow with the same creation_tsz as relevant row) 
   34943758 | 31591046 |        6665439 |      10951593 | 2011-06-28 11:32:31+00 | 75da687e0823bca3594a52510c658d92
   34971890 | 31619170 |        6665439 |      12887953 | 2011-06-29 00:10:51+00 | 6007e35f346df7da585bbfc137df6a20
   34986625 | 31633905 |        5325985 |      15444995 | 2011-06-29 10:37:13+00 | 5eac30a00c8d81ca8979ab7f44d7f660
   35146513 | 31793793 |        5895587 |      10594827 | 2011-07-03 14:44:51+00 | 907fd04a01095ff719092cce391f12e3
   35146514 | 31793794 |        6451300 |      11916744 | 2011-07-03 14:44:51+00 | ca3ae62f59cf76e75ee39dad6762cf16
(irrelevantrow with the same creation_tsz as relevant row) 
   35147344 | 31794624 |        5665781 |      11816238 | 2011-07-03 15:20:21+00 | 42017de7cb7227b25c4092ba0619b18e
(irrelevantrow with the same creation_tsz as relevant row) 
   35147343 | 31794623 |          41796 |       5785895 | 2011-07-03 15:20:21+00 | dd8c168feeaa78219de7f096824d11a1
   35150207 | 31797487 |          64040 |        102195 | 2011-07-03 17:05:13+00 | d03f944d09e6554608d7756b19281526
   35207670 | 31854950 |       10222885 |      15216048 | 2011-07-05 11:45:57+00 | 329b26b67e0333b8c6e524a31150431e
   35248094 | 31895374 |        7405758 |       7084671 | 2011-07-06 08:53:44+00 | 27fada1f5777d5b3005230fdc696d69c
   35248630 | 31895910 |          21314 |       9122048 | 2011-07-06 09:58:26+00 | 0ad6292210386a3767111df35c2bec26
   35262548 | 31909828 |       14364837 |      10260130 | 2011-07-06 17:51:30+00 | d16c07121570e370609493a68c4c4606
   35262549 | 31909829 |       15304799 |       9494587 | 2011-07-06 17:51:30+00 | d70ab6ca0bd99726d3dd66f7287974c5
(irrelevantrow with the same creation_tsz as relevant row) 
   35337757 | 31985037 |          64040 |      10955898 | 2011-07-08 08:14:24+00 | 3d11154de2df1b0758f345df603f14b9
   35338337 | 31985617 |          64040 |       7849141 | 2011-07-08 09:19:59+00 | 0593a4250cabf4e0fc90721a0586d5c9
   35891226 | 32538466 |        7492565 |      15395178 | 2011-07-21 05:04:36+00 | 6cadaeab659101ddec2b8b80ab10bea6
   35896800 | 32544040 |        5325985 |      10853000 | 2011-07-21 12:04:21+00 | 887f30da9d60237e150e8b788aa3b68c
(18 rows)

So we have 14 missing xobject_ids:

$ select xobject_id from qqq where xobject_id in
(34943758,34971890,34986625,35146513,35147343,35150207,35207670,35248094,35248630,35262548,35337757,35338337,35891226,35896800);
 xobject_id

                                                                              
------------
(0 rows)

OK. So based on it all, it looks like for some rows, first two columns got mangled.

But why? System columns for the 14 rows in source table:

$ select xobject_id, ctid, xmin, xmax, cmin, cmax from sssssss.xobjects where xobject_id in
(34943758,34971890,34986625,35146513,35147343,35150207,35207670,35248094,35248630,35262548,35337757,35338337,35891226,35896800);
 xobject_id |     ctid     |   xmin    | xmax | cmin | cmax

                                                                              
------------+--------------+-----------+------+------+------
   35891226 | (1000302,18) | 103473550 |    0 |    8 |    8
   35896800 | (1037570,16) | 103541296 |    0 |    8 |    8
   34986625 | (3113025,5)  |  98897359 |    0 |    8 |    8
   35150207 | (3148101,3)  |  99819079 |    0 |    8 |    8
   35147343 | (3164551,5)  | 100274767 |    0 |    8 |    8
   35207670 | (3179416,9)  | 100564691 |    0 |    8 |    8
   35262548 | (3185219,2)  | 100685664 |    0 |    8 |    8
   35248630 | (3188532,7)  | 100742039 |    0 |    8 |    8
   35337757 | (3189663,9)  | 100782477 |    0 |    8 |    8
   35338337 | (3189785,8)  | 100792384 |    0 |    8 |    8
   35146513 | (3214806,6)  | 101489289 |    0 |    8 |    8
   35248094 | (3219436,2)  | 101542297 |    0 |    8 |    8
   34943758 | (3247211,6)  | 102143923 |    0 |    8 |    8
   34971890 | (3247212,8)  | 102143923 |    0 |    8 |    8
(14 rows)

Does it tell you anything?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On 11/04/2011 01:17 PM, hubert depesz lubaczewski wrote:
> On Thu, Nov 03, 2011 at 11:03:45PM +0100, hubert depesz lubaczewski wrote:
>> looking for some other info. will post as soon as i'll gather it, but
>> that will be in utc morning :(
>
>
> I looked closer at the rows that got -1 xobject_id.

>
> Does it tell you anything?

You are very thorough.
I don't know enough about Postgres internals to be much help there. All
I can point out is the problem seemed to appear over roughly a two month
period 5/28/11-7/20/11.
Any new code or procedures rolled out then?
Any bug reports?

Given the turnover on this table it seems important everything is
'normal' since 7/20/11.

>
> Best regards,
>
> depesz
>


--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 01:43:55PM -0700, Adrian Klaver wrote:
> >Does it tell you anything?
> You are very thorough.

I hate mysteries. Especially the ones that break stuff.

> I don't know enough about Postgres internals to be much help there.
> All I can point out is the problem seemed to appear over roughly a
> two month period 5/28/11-7/20/11.
> Any new code or procedures rolled out then?
> Any bug reports?

not as far as I know, but I'll check with guys. Still - app changes
shouldn't cause something like this in db?

> Given the turnover on this table it seems important everything is
> 'normal' since 7/20/11.

yes, but we can't do pg_reorg of the table (it has quite a lot of bloat)

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On 11/04/2011 01:47 PM, hubert depesz lubaczewski wrote:
> On Fri, Nov 04, 2011 at 01:43:55PM -0700, Adrian Klaver wrote:
>>> Does it tell you anything?
>> You are very thorough.
>
> I hate mysteries. Especially the ones that break stuff.

Know the feeling.

>
>> I don't know enough about Postgres internals to be much help there.
>> All I can point out is the problem seemed to appear over roughly a
>> two month period 5/28/11-7/20/11.
>> Any new code or procedures rolled out then?
>> Any bug reports?
>
> not as far as I know, but I'll check with guys. Still - app changes
> shouldn't cause something like this in db?

Then it wouldn't be a bug:) Seriously, what should happen and what does
happen is very often different.

>
>> Given the turnover on this table it seems important everything is
>> 'normal' since 7/20/11.
>
> yes, but we can't do pg_reorg of the table (it has quite a lot of bloat)

Probably should have been clearer here. I would look at what information
is available for changes on 7/20-21/11.

>
> Best regards,
>
> depesz
>


--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> OK. So based on it all, it looks like for some rows, first two columns got mangled.

Good detective work.  So now we at least have a believable theory about
*what* is happening (something is stomping the first 8 data bytes of
these particular rows), if not *why*.

You said that pg_dump does not show the corruption.  That could be
because the data is coming out through the COPY code path instead of
the SELECT code path.  Could you try a pg_dump with --inserts (which
will fetch the data with SELECTs) and see if it shows corrupt data?

Another thing we probably should ask at this point is whether you have
any nonstandard software loaded into your server, like auto_explain
or pg_stat_statements or some such.

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 05:49:44PM -0400, Tom Lane wrote:
> You said that pg_dump does not show the corruption.  That could be
> because the data is coming out through the COPY code path instead of
> the SELECT code path.  Could you try a pg_dump with --inserts (which
> will fetch the data with SELECTs) and see if it shows corrupt data?

Sure. Testing.

> Another thing we probably should ask at this point is whether you have
> any nonstandard software loaded into your server, like auto_explain
> or pg_stat_statements or some such.

No. Nothing like this. Just base Pg.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
I wrote:
> Good detective work.  So now we at least have a believable theory about
> *what* is happening (something is stomping the first 8 data bytes of
> these particular rows), if not *why*.

Scratch that: something is stomping the first *six* bytes of data.
On a hunch I converted the original and bad xobject_id and magic_id
values to hex, and look at the pattern that pops out:

  badxi   |  badmi  | origxi  | origmi
----------+---------+---------+---------
 ffffffff | 1e2007f | 215a0f2 | 1e27862
 ffffffff | 1e2007f | 215da81 | 1e2b1f1
 ffffffff | 1e2007f | 215330e | 1e20a86
 ffffffff | 1e5007f | 2184b11 | 1e52281
 ffffffff | 1e5007f | 218597f | 1e530ef
 ffffffff | 1e5007f | 2184e4f | 1e525bf
 ffffffff | 1e6007f | 21939f6 | 1e61166
 ffffffff | 1e6007f | 21a1054 | 1e6e7c4
 ffffffff | 1e6007f | 219d7de | 1e6af4e
 ffffffff | 1e6007f | 219d9f6 | 1e6b166
 ffffffff | 1e8007f | 21b3861 | 1e80fd1
 ffffffff | 1e8007f | 21b361d | 1e80d8d
 ffffffff | 1f0007f | 223bde0 | 1f09528
 ffffffff | 1f0007f | 223a81a | 1f07f62

I'm assuming this is little-endian hardware, so the low-order half of
magic_id is adjacent to xobject_id.  We can see that in each case the
first six bytes are being overwritten with ff ff ff ff 7f 00, while the
high-order half of magic_id remains unchanged.

Not sure what it means yet, but this seems like confirmation of the
idea that something's stomping on the data while it passes through
CREATE TABLE AS.

BTW, did you try the separate INSERT/SELECT yet?  Does that show
corruption?

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 06:18:55PM -0400, Tom Lane wrote:
> BTW, did you try the separate INSERT/SELECT yet?  Does that show
> corruption?

pg_dump --inserts is still working.

i did create table (like), insert into ... select and it also shows the
problem, as I showed (with other data) in email:
20111103200312.GA4510@depesz.com

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 05:49:44PM -0400, Tom Lane wrote:
> You said that pg_dump does not show the corruption.  That could be
> because the data is coming out through the COPY code path instead of
> the SELECT code path.  Could you try a pg_dump with --inserts (which
> will fetch the data with SELECTs) and see if it shows corrupt data?

i'm running the pg_dump (it will take some time, so don't hold your
breath), but at the same time - I can select these rows, correctly, with
normal SELECT from table (xobjects table). Doesn't it disprove this
theory?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> On Fri, Nov 04, 2011 at 05:49:44PM -0400, Tom Lane wrote:
>> You said that pg_dump does not show the corruption.  That could be
>> because the data is coming out through the COPY code path instead of
>> the SELECT code path.  Could you try a pg_dump with --inserts (which
>> will fetch the data with SELECTs) and see if it shows corrupt data?

> i'm running the pg_dump (it will take some time, so don't hold your
> breath), but at the same time - I can select these rows, correctly, with
> normal SELECT from table (xobjects table). Doesn't it disprove this
> theory?

Well, we don't know what the triggering condition is for the corruption,
so it's premature to draw conclusions on what will or won't cause it.
I was wondering if selecting the entire table was necessary.

A different line of thought is that there's something about these
specific source rows, and only these rows, that makes them vulnerable to
corruption during INSERT/SELECT.  Do they by any chance contain any
values that are unusual elsewhere in your table?  One thing I'm
wondering about right now is the nulls bitmap --- so do these rows have
nulls (or not-nulls) in any place that's unusual elsewhere?

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Friday, November 04, 2011 3:43:48 pm hubert depesz lubaczewski wrote:
> On Fri, Nov 04, 2011 at 05:49:44PM -0400, Tom Lane wrote:
> > You said that pg_dump does not show the corruption.  That could be
> > because the data is coming out through the COPY code path instead of
> > the SELECT code path.  Could you try a pg_dump with --inserts (which
> > will fetch the data with SELECTs) and see if it shows corrupt data?
>
> i'm running the pg_dump (it will take some time, so don't hold your
> breath), but at the same time - I can select these rows, correctly, with
> normal SELECT from table (xobjects table). Doesn't it disprove this
> theory?

Another question.
Between 07/20/11 and this recent attempt did you do a CREATE TABLE AS on this
table and not have corrupted rows?

>
> Best regards,
>
> depesz

--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 05:49:44PM -0400, Tom Lane wrote:
> You said that pg_dump does not show the corruption.  That could be
> because the data is coming out through the COPY code path instead of
> the SELECT code path.  Could you try a pg_dump with --inserts (which
> will fetch the data with SELECTs) and see if it shows corrupt data?

pg_dump finished. all ids are correct - i.e. there are no rows with
xobject_id = -1.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 05:06:35PM -0700, Adrian Klaver wrote:
> Another question.
> Between 07/20/11 and this recent attempt did you do a CREATE TABLE AS on this
> table and not have corrupted rows?

don't remember.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Adrian Klaver
Дата:
On Friday, November 04, 2011 6:04:02 pm Tom Lane wrote:
> I wrote:
> > A different line of thought is that there's something about these
> > specific source rows, and only these rows, that makes them vulnerable to
> > corruption during INSERT/SELECT.  Do they by any chance contain any
> > values that are unusual elsewhere in your table?  One thing I'm
> > wondering about right now is the nulls bitmap --- so do these rows have
> > nulls (or not-nulls) in any place that's unusual elsewhere?
>
> Hah ... I have a theory.
>

>
> This is trivial to fix, now that we know there's a problem --- the
> function is only using that assumption to save itself a couple lines
> of code.  Penny wise, pound foolish :-(

I killed a few brain cells just reading the explanation:)

>
>             regards, tom lane
>
>


--
Adrian Klaver
adrian.klaver@gmail.com

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
I wrote:
> A different line of thought is that there's something about these
> specific source rows, and only these rows, that makes them vulnerable to
> corruption during INSERT/SELECT.  Do they by any chance contain any
> values that are unusual elsewhere in your table?  One thing I'm
> wondering about right now is the nulls bitmap --- so do these rows have
> nulls (or not-nulls) in any place that's unusual elsewhere?

Hah ... I have a theory.

I will bet that you recently added some column(s) to the source table
using ALTER TABLE ADD COLUMN and no default value, so that the added
columns were nulls and no table rewrite happened.  And that these
troublesome rows predate that addition, but contained no nulls before
that.  And that they are the only rows that, in addition to the above
conditions, contain data fields wide enough to require out-of-line
toasting.

These conditions together are enough to break the assumption in
toast_insert_or_update that the old and new tuples must have the same
value of t_hoff.  But it can only happen when the source tuple is an
original on-disk tuple, which explains why only INSERT ... SELECT *
causes the problem, not any variants that require projection of a new
column set.  When it does happen, toast_insert_or_update correctly
computes the required size of the new tuple ... but then it tells
heap_fill_tuple to fill the data part at offset olddata->t_hoff, which
is wrong (too small) and so the nulls bitmap that heap_fill_tuple
concurrently constructs will overwrite the first few data bytes.  In
your example, the table contains 49 columns so the nulls bitmap requires
7 bytes, just enough to overwrite the first 6 data bytes as observed.
(In fact, given the values we see being filled in, I can confidently say
that you have two added-since-creation null columns, no more, no less.)

I can reproduce the problem with the attached test case (using the
regression database).  With asserts enabled, the
        Assert(new_len == olddata->t_hoff);
fails.  With asserts off, corrupt data.

This is trivial to fix, now that we know there's a problem --- the
function is only using that assumption to save itself a couple lines
of code.  Penny wise, pound foolish :-(

            regards, tom lane


drop table wide;

create table wide as
select
ten as firstc,
unique1 as unique1_1,
unique2 as unique2_1,
two as two_1,
four as four_1,
ten as ten_1,
twenty as twenty_1,
hundred as hundred_1,
thousand as thousand_1,
twothousand as twothousand_1,
fivethous as fivethous_1,
tenthous as tenthous_1,
odd as odd_1,
even as even_1,
stringu1 as stringu1_1,
stringu2 as stringu2_1,
string4 as string4_1,
unique1 as unique1_2,
unique2 as unique2_2,
two as two_2,
four as four_2,
ten as ten_2,
twenty as twenty_2,
hundred as hundred_2,
thousand as thousand_2,
twothousand as twothousand_2,
fivethous as fivethous_2,
tenthous as tenthous_2,
odd as odd_2,
even as even_2,
stringu1 as stringu1_2,
stringu2 as stringu2_2,
string4 as string4_2,
unique1 as unique1_3,
unique2 as unique2_3,
two as two_3,
four as four_3,
ten as ten_3,
twenty as twenty_3,
hundred as hundred_3,
thousand as thousand_3,
twothousand as twothousand_3,
fivethous as fivethous_3,
tenthous as tenthous_3,
odd as odd_3,
even as even_3,
repeat('xyzzyxydlkadlkndvlelfzzy', 20000) as widec
from onek limit 10;

alter table wide add column nullc1 int;
alter table wide add column nullc2 int;

drop table widec;

create table widec as select * from wide;

select firstc, to_hex(unique1_1), unique2_1, to_hex(unique1_2) from widec;

Re: Strange problem with create table as select * from table;

От
Martijn van Oosterhout
Дата:
On Fri, Nov 04, 2011 at 09:04:02PM -0400, Tom Lane wrote:
> Hah ... I have a theory.
>
> I will bet that you recently added some column(s) to the source table
> using ALTER TABLE ADD COLUMN and no default value, so that the added
> columns were nulls and no table rewrite happened.  And that these
> troublesome rows predate that addition, but contained no nulls before
> that.  And that they are the only rows that, in addition to the above
> conditions, contain data fields wide enough to require out-of-line
> toasting.
>
> These conditions together are enough to break the assumption in
> toast_insert_or_update that the old and new tuples must have the same
> value of t_hoff.

Wow! Good catch.

> This is trivial to fix, now that we know there's a problem --- the
> function is only using that assumption to save itself a couple lines
> of code.  Penny wise, pound foolish :-(

No doubt the assumption was true when the code was written, but still.

Hve a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer

Вложения

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> Any chance of getting the fix in patch format so we could test it on
> this system?

http://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=23998fe99c1220ba3a9eefee194e37ec1f14ae07

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Fri, Nov 04, 2011 at 09:04:02PM -0400, Tom Lane wrote:
> that.  And that they are the only rows that, in addition to the above
> conditions, contain data fields wide enough to require out-of-line
> toasting.

checked lengths of the text/varchar columns in database.

there are 16 such columns in the table.
full report of lengths is in
http://www.depesz.com/various/lengths.report.gz

it was obtained using:
select length( "first_text_column" ) as length_1, count(*) from etsy_v2.receipts group by 1 order by 1;
and so on for every text column, and at the end I also made summary of
sum-of-lengths.

there is also:
http://www.depesz.com/various/lengths2.report.gz
which has the same summary, but only of the damaged rows.

As you can see the length of columns is not really special - somewhere
in the middle of all other rows. summarized length is also not special
in any way.

> These conditions together are enough to break the assumption in
> toast_insert_or_update that the old and new tuples must have the same
> value of t_hoff.  But it can only happen when the source tuple is an
> original on-disk tuple, which explains why only INSERT ... SELECT *
> causes the problem, not any variants that require projection of a new
> column set.  When it does happen, toast_insert_or_update correctly
> computes the required size of the new tuple ... but then it tells
> heap_fill_tuple to fill the data part at offset olddata->t_hoff, which
> is wrong (too small) and so the nulls bitmap that heap_fill_tuple
> concurrently constructs will overwrite the first few data bytes.  In
> your example, the table contains 49 columns so the nulls bitmap requires
> 7 bytes, just enough to overwrite the first 6 data bytes as observed.
> (In fact, given the values we see being filled in, I can confidently say
> that you have two added-since-creation null columns, no more, no less.)
>
> I can reproduce the problem with the attached test case (using the
> regression database).  With asserts enabled, the
>         Assert(new_len == olddata->t_hoff);
> fails.  With asserts off, corrupt data.

How can I make the onek table for the test? is it standard table from
something?

> This is trivial to fix, now that we know there's a problem --- the
> function is only using that assumption to save itself a couple lines
> of code.  Penny wise, pound foolish :-(

Any chance of getting the fix in patch format so we could test it on
this system?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: Strange problem with create table as select * from table;

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> checked lengths of the text/varchar columns in database.

> there are 16 such columns in the table.
> full report of lengths is in
> http://www.depesz.com/various/lengths.report.gz

> it was obtained using:
> select length( "first_text_column" ) as length_1, count(*) from etsy_v2.receipts group by 1 order by 1;
> and so on for every text column, and at the end I also made summary of
> sum-of-lengths.

BTW, that probably doesn't prove a lot since it takes no account of
compression, and different values could be compressible to varying
degrees.  It'd be more useful to look at pg_column_size() numbers
to get an idea of whether toasting happened or not.

            regards, tom lane

Re: Strange problem with create table as select * from table;

От
hubert depesz lubaczewski
Дата:
On Sun, Nov 06, 2011 at 09:34:24AM -0500, Tom Lane wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > Any chance of getting the fix in patch format so we could test it on
> > this system?
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=patch;h=23998fe99c1220ba3a9eefee194e37ec1f14ae07

hi
just to close the loop - finally today I could restart the database with
patched 8.4. bug fixed, all works fine. thanks a lot.

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/