Discussion: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound


BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15570
Logged by:          Phil Hildebrand
Email address:      phil.hildebrand@moz.com
PostgreSQL version: 10.6
Operating system:   Ubuntu 16.04.5 LTS (linux)
Description:

While running vacuum analyze before a materialized view refresh, we started
seeing the following error in the logs:

 ERROR:  MultiXactId 1483293027 has not been created yet -- apparent
wraparound

Both selects against and pg_dumps of the table fail with the same error.

We were able to narrow down the selects to at least one row / page that seems
to have problems:

select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
    ctid     |                id                
-------------+----------------------------------
 (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66

select ctid,id from reviews_2018 where id =
'00ec91e42d21ce0b818fe8f63ab94c66';
ERROR:  MultiXactId 1483293027 has not been created yet -- apparent
wraparound

We would not expect to see this without some kind of system issue, but there
is no indication that there were any system or disk issues/errors at the time.

As a workaround, we restored the table from a previous backup.

Linux info:  4.4.0-140-generic #166-Ubuntu SMP Wed Nov 14 20:09:47 UTC 2018
x86_64 x86_64 x86_64 GNU/Linux
Postgres version: PostgreSQL 10.6 (Ubuntu 10.6-1.pgdg16.04+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0
20160609, 64-bit


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Alvaro Herrera
Date:
On 2018-Dec-31, PG Bug reporting form wrote:

> Both selects against and pg_dumps of the table fail with the same error.
> 
> We were able to narrow down the selects to at least one row / page that seems
> to have problems:
> 
> select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
>     ctid     |                id                
> -------------+----------------------------------
>  (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66
> 
> select ctid,id from reviews_2018 where id = 
> '00ec91e42d21ce0b818fe8f63ab94c66';
> ERROR:  MultiXactId 1483293027 has not been created yet -- apparent
> wraparound

Can you please send

 select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
   from heap_page_items(get_raw_page('reviews_2018', 382604)) h;

Also, please send the output of pg_controldata.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Phil Hildebrand
Date:
Sure -

beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
beacon-#        to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
beacon-#    from heap_page_items(get_raw_page('reviews_2018', 382604)) h;
 lp | lp_flags |   t_xmin   |   t_xmax   |  t_field3  |       t_ctid       | infomask | infomask2
----+----------+------------+------------+------------+--------------------+----------+-----------
  1 |        1 |          0 | 2457403396 |    1048582 | (151256864,239)    | 200      | 0
  2 |        2 |            |            |            |                    |          |
  3 |        2 |            |            |            |                    |          |
  4 |        1 |   25532455 |   26137203 |          0 | (382604,15)        | 2102     | c010
  5 |        1 |   25532455 |   26137203 |          0 | (382604,16)        | 2103     | c010
  6 |        1 | 1682137913 |  862335590 |  959460400 | (892612660,26210)  | 6439     | 6562
  7 |        1 | 1179009631 | 1483293027 | 1382768435 | (1717064302,26444) | 3133     | 4c76
  8 |        1 | 1412509997 |  825899313 |  943012409 | (959330611,90)     | 0        | 0
  9 |        1 |  862073140 | 1701209657 | 1265002032 | (1697722679,25911) | 652d     | 3031
 10 |        1 |   25518614 |          0 |          0 | (382604,10)        | 2903     | 8010
 11 |        2 |            |            |            |                    |          |
 12 |        2 |            |            |            |                    |          |
 13 |        2 |            |            |            |                    |          |
 14 |        2 |            |            |            |                    |          |
 15 |        1 |   26137203 |          0 |          0 | (382604,15)        | 2802     | 8010
 16 |        1 |   26137203 |          0 |          0 | (382604,16)        | 2903     | 8010
 17 |        0 |            |            |            |                    |          |
(17 rows)

and

postgres@dallocalbeacondb01b:~$ pg_controldata /data/main/
pg_control version number:            1002
Catalog version number:               201707211
Database system identifier:           6619783480642413267
Database cluster state:               in production
pg_control last modified:             Tue 01 Jan 2019 12:30:04 AM UTC
Latest checkpoint location:           637/49000098
Prior checkpoint location:            637/48000098
Latest checkpoint's REDO location:    637/49000060
Latest checkpoint's REDO WAL file:    000000020000063700000049
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:26398877
Latest checkpoint's NextOID:          7299101
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        548
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  26398877
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Tue 01 Jan 2019 12:30:04 AM UTC
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                on
max_connections setting:              1000
max_worker_processes setting:         8
max_prepared_xacts setting:           1000
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            65802e08ec705db32d404dc5484bdd1f1cbc4075825d6bc5b1b2a4806ad28bc0


On Mon, Dec 31, 2018 at 4:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Dec-31, PG Bug reporting form wrote:
>
> > Both selects against and pg_dumps of the table fail with the same error.
> >
> > We were able to narrow down the selects to at least one row / page that seems
> > to have problems:
> >
> > select ctid,id from reviews_2018 where ctid = '(382604,16)'::tid;
> >     ctid     |                id
> > -------------+----------------------------------
> >  (382604,16) | 00ec91e42d21ce0b818fe8f63ab94c66
> >
> > select ctid,id from reviews_2018 where id =
> > '00ec91e42d21ce0b818fe8f63ab94c66';
> > ERROR:  MultiXactId 1483293027 has not been created yet -- apparent
> > wraparound
>
> Can you please send
>
>  select lp, lp_flags, t_xmin, t_xmax, t_field3, t_ctid,
>         to_hex(t_infomask) as infomask, to_hex(t_infomask2) as infomask2
>    from heap_page_items(get_raw_page('reviews_2018', 382604)) h;
>
> Also, please send the output of pg_controldata.
>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



--
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Andrew Gierth
Date:
>>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:

 Phil> Sure - beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3,
 Phil> t_ctid, beacon-# to_hex(t_infomask) as infomask,
 Phil> to_hex(t_infomask2) as infomask2 beacon-# from
 Phil> heap_page_items(get_raw_page('reviews_2018', 382604)) h;

This data page has obviously been partially overwritten by unrelated
data. (Notice that many of those numbers in the output correspond to
printable ASCII strings.)

A hexdump of the raw page will probably make it easy to see which part
is corrupt and what the offending data is. Something like this is
probably the easiest way to get it (adding any other required psql
options of course):

psql -d beacon -AXqt -c "select get_raw_page('reviews_2018', 382604)" |
  perl -ne '/\\x([[:xdigit:]]+)/ and print pack("H*",$1)' |
  hexdump -C

Assuming that doesn't reveal any sensitive data, can you send the output
of that?

-- 
Andrew (irc:RhodiumToad)


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Phil Hildebrand
Date:
Yeah - There's no sensitive data (it's public domain reviews).

I've attached the hexdump

On Mon, Dec 31, 2018 at 10:57 PM Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>
> >>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:
>
>  Phil> Sure - beacon=# select lp, lp_flags, t_xmin, t_xmax, t_field3,
>  Phil> t_ctid, beacon-# to_hex(t_infomask) as infomask,
>  Phil> to_hex(t_infomask2) as infomask2 beacon-# from
>  Phil> heap_page_items(get_raw_page('reviews_2018', 382604)) h;
>
> This data page has obviously been partially overwritten by unrelated
> data. (Notice that many of those numbers in the output correspond to
> printable ASCII strings.)
>
> A hexdump of the raw page will probably make it easy to see which part
> is corrupt and what the offending data is. Something like this is
> probably the easiest way to get it (adding any other required psql
> options of course):
>
> psql -d beacon -AXqt -c "select get_raw_page('reviews_2018', 382604)" |
>   perl -ne '/\\x([[:xdigit:]]+)/ and print pack("H*",$1)' |
>   hexdump -C
>
> Assuming that doesn't reveal any sensitive data, can you send the output
> of that?
>
> --
> Andrew (irc:RhodiumToad)



-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413

Attachments

Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Andrew Gierth
Date:
>>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:

 Phil> Yeah - There's no sensitive data (it's public domain reviews).
 Phil> I've attached the hexdump

OK. The page is definitely corrupt starting at offset 0x1000 (i.e.
offset 4kB in the 8kB page), but it's harder than usual to spot because
(a) the tear is in the middle of a text field and (b) the data in the
second half of the page is actually from a page of what is presumably a
different partition of the same table (it has the same row structure,
but the data is from 2017 not 2018).

The split being on a 4k boundary points the finger at the hardware or
OS, since pg doesn't care about 4k hardware pages or 4k disk sectors but
rather does everything in 8k blocks.
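
For reference, the intact first half of the page can also be sanity-checked
from SQL with the same pageinspect functions used above; the 24-byte page
header sits well before the 4kB mark, so it is still readable even though the
second half of the page holds foreign data. A minimal query (nothing here
beyond the table and block number already discussed):

 -- page_header() decodes only the fixed-size page header, so it does not
 -- try to interpret the overwritten tuple data in the second half of the block
 select * from page_header(get_raw_page('reviews_2018', 382604));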

-- 
Andrew.


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Phil Hildebrand
Date:
OK.  We didn't find any errors in syslog; in the kern log there were
only some AppArmor denials for ntpd:

...
Dec 19 01:55:45 dallocalbeacondb01c kernel: [960716.006995] audit:
type=1400 audit(1545184545.259:1257): apparmor="DENIED"
operation="open" profile="/usr/sbin/ntpd" name="/usr/local/sbin/"
pid=18444 comm="ntpd" requested_mask="r" denied_mask="r" fsuid=0
ouid=0
...

That said, these are VMs running on ESX hosts with SSD, so it's
certainly possible.  We'll check the hosts as well.

For future reference, is there a straightforward way to get all rows
that don't have any data on a given corrupt page (rather than restoring
to a point in time from a backup)?
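
One approach that might work (purely a sketch, assuming a reviews_2018_salvaged
table with the same structure has been created beforehand, e.g. with
CREATE TABLE reviews_2018_salvaged (LIKE reviews_2018)) would be to copy rows
out block by block with TID scans, skipping the damaged block, since a TID
scan only reads the one page it points at:

 do $$
 declare
   -- total number of 8kB blocks in the table
   nblocks bigint := pg_relation_size('reviews_2018')
                     / current_setting('block_size')::int;
 begin
   for blk in 0 .. nblocks - 1 loop
     continue when blk = 382604;        -- skip the known-corrupt block
     for off in 1 .. 291 loop           -- 291 = max heap line pointers per 8kB page
       -- a literal ctid lets the planner use a TID scan, so only block blk is read
       execute format(
         'insert into reviews_2018_salvaged
            select * from reviews_2018 where ctid = ''(%s,%s)''::tid',
         blk, off);
     end loop;
   end loop;
 end $$;

(Slow, since it issues one statement per possible line pointer, but it never
reads the damaged page.)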

On Mon, Dec 31, 2018 at 11:57 PM Andrew Gierth
<andrew@tao11.riddles.org.uk> wrote:
>
> >>>>> "Phil" == Phil Hildebrand <phil.hildebrand@moz.com> writes:
>
>  Phil> Yeah - There's no sensitive data (it's public domain reviews).
>  Phil> I've attached the hexdump
>
> OK. The page is definitely corrupt starting at offset 0x1000 (i.e.
> offset 4kB in the 8kB page), but it's harder than usual to spot because
> (a) the tear is in the middle of a text field and (b) the data in the
> second half of the page is actually from a page of what is presumably a
> different partition of the same table (it has the same row structure,
> but the data is from 2017 not 2018).
>
> The split being on a 4k boundary points the finger at the hardware or
> OS, since pg doesn't care about 4k hardware pages or 4k disk sectors but
> rather does everything in 8k blocks.
>
> --
> Andrew.



-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413


Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Michael Paquier
Date:
On Tue, Jan 01, 2019 at 12:13:05AM -0800, Phil Hildebrand wrote:
> That said, these are VMs running on ESX hosts with SSD, so it's
> certainly possible.  We'll check the hosts as well.

You may have been bitten by a recent bug on the ESX side.  Looking around,
there was a release two months ago which fixes an issue with similar
symptoms:
https://docs.vmware.com/en/VMware-vSphere/6.5/rn/esxi650-201810002.html

I am referring to this one in particular:
"PR 2209189: Heavy I/Os issued to a snapshot virtual machine using the
SEsparse format might cause guest OS file system inconsistencies."

And in this case this could be referring to 4kB blocks written where
they should not be.
--
Michael

Attachments

Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Phil Hildebrand
Date:
Thanks!

On Wed, Jan 2, 2019 at 5:53 AM Michael Paquier <michael@paquier.xyz> wrote:
> On Tue, Jan 01, 2019 at 12:13:05AM -0800, Phil Hildebrand wrote:
> > That said, these are VMs running on ESX hosts with SSD, so it's
> > certainly possible.  We'll check the hosts as well.
>
> You may have been bitten by a recent bug on the ESX side.  Looking around,
> there was a release two months ago which fixes an issue with similar
> symptoms:
> https://docs.vmware.com/en/VMware-vSphere/6.5/rn/esxi650-201810002.html
>
> I am referring to this one in particular:
> "PR 2209189: Heavy I/Os issued to a snapshot virtual machine using the
> SEsparse format might cause guest OS file system inconsistencies."
>
> And in this case this could be referring to 4kB blocks written where
> they should not be.
> --
> Michael
--
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413

Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Stephen Frost
Date:
Greetings,

* Phil Hildebrand (phil.hildebrand@moz.com) wrote:
> Data page checksum version:           0

You really might want to consider fixing that and using some tool to
regularly verify the checksums of all the pages in the cluster, so that
you catch something like this happening again sooner than you otherwise
would, while you still have backups you can go to and restore the data
from.
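
For what it's worth, the current state is also visible from SQL; this just
reads a read-only setting and is not specific to this cluster:

 -- "off" here corresponds to "Data page checksum version: 0" in the
 -- pg_controldata output above.  On PostgreSQL 10 checksums can only be
 -- enabled at initdb time (initdb --data-checksums), so turning them on
 -- means rebuilding the cluster, e.g. via dump/restore or logical replication.
 show data_checksums;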

Thanks!

Stephen

Attachments

Re: BUG #15570: Vacuum analyze ERROR: MultiXactId XXXX has not been created yet -- apparent wraparound

From
Phil Hildebrand
Date:
Thanks for the tip - we'll look into enabling the page checksums as well.

On Thu, Jan 3, 2019 at 1:08 PM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Phil Hildebrand (phil.hildebrand@moz.com) wrote:
> > Data page checksum version:           0
>
> You really might want to consider fixing that and using some tool to
> regularly verify the checksums of all the pages in the cluster, so that
> you catch something like this happening again sooner than you otherwise
> would, while you still have backups you can go to and restore the data
> from.
>
> Thanks!
>
> Stephen



-- 
Phil Hildebrand
Sr. DBE @ Moz
206.696.3413