Discussion: BUG #9741: Minimal case for BUG #9735: Error: "ERROR: tuple offset out of range: 0" during bitmap scan
From: maxim.boguk@gmail.com
Date:
The following bug has been logged on the website:

Bug reference:      9741
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 9.3.3
Operating system:   Linux (Ubuntu)

Description:

Hi,

The situation with BUG #9735 has become even more curious. One query
executes successfully on the master DB but fails on both streaming
replicas. After much digging I located the problem and reproduced it in
a pretty simple way:

gate_platbox=# set enable_indexscan to 0;
SET
(force bitmap scan)
gate_platbox=# explain analyze select * from transactions where id=53265020;
ERROR:  tuple offset out of range: 0

This tuple had been frozen not that long ago:

gate_platbox=# select id,xmin,xmax,cmin,cmax,ctid from transactions where id=53265020;
    id    | xmin | xmax | cmin | cmax |    ctid
----------+------+------+------+------+-------------
 53265020 |    2 |    0 |    1 |    1 | (2413168,1)

pageinspect over the index used also shows nothing suspicious:

select gs.i,t.* from (select generate_series(1,94961) as i) as gs,
bt_page_items('transactions_pkey', i) as t where ctid::text='(2413168,1)';
   i   | itemoffset |    ctid     | itemlen | nulls | vars |          data
-------+------------+-------------+---------+-------+------+-------------------------
 88472 |         93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00

Any ideas what could be wrong with the database?
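[Editor's note: the 8-byte `data` column shown by bt_page_items() is the
indexed key stored in the platform's byte order (little-endian here). A
quick Python sketch — not part of the original report, and the helper
name is made up — decodes those bytes back into the id value, which
confirms the index entry really does carry key 53265020:]

```python
def decode_index_data(data_hex: str) -> int:
    """Decode the hex byte string from bt_page_items()'s `data` column,
    assuming a bigint key stored little-endian."""
    raw = bytes.fromhex(data_hex.replace(" ", ""))
    return int.from_bytes(raw, byteorder="little", signed=True)

# The entry found for ctid (2413168,1):
print(decode_index_data("7c c2 2c 03 00 00 00 00"))  # 53265020
```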
> gate_platbox=# set enable_indexscan to 0;
> SET
> (force bitmap scan)
> gate_platbox=# explain analyze select * from transactions where id=53265020;
> ERROR:  tuple offset out of range: 0
>
> This tuple had been frozen not that long ago:
>
> gate_platbox=# select id,xmin,xmax,cmin,cmax,ctid from transactions where id=53265020;
>     id    | xmin | xmax | cmin | cmax |    ctid
> ----------+------+------+------+------+-------------
>  53265020 |    2 |    0 |    1 |    1 | (2413168,1)
>
> pageinspect over the index used also shows nothing suspicious:
> select gs.i,t.* from (select generate_series(1,94961) as i) as gs,
> bt_page_items('transactions_pkey', i) as t where ctid::text='(2413168,1)';
>    i   | itemoffset |    ctid     | itemlen | nulls | vars |          data
> -------+------------+-------------+---------+-------+------+-------------------------
>  88472 |         93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
>
> Any ideas what could be wrong with the database?

Some playing with GDB landed me in the btgetbitmap function in nbtree.c,
with the following contents of the BTScanOpaque so = (BTScanOpaque) scan->opaque
structure:

(gdb) p *(BTScanOpaque) scan->opaque
$21 = {
  qual_ok = 1 '\001',
  numberOfKeys = 1,
  keyData = 0x7f099afaad80,
  arrayKeyData = 0x0,
  numArrayKeys = 0,
  arrayKeys = 0x0,
  arrayContext = 0x0,
  killedItems = 0x0,
  numKilled = 0,
  currTuples = 0x0,
  markTuples = 0x0,
  markItemIndex = -1,
  currPos = {
    buf = 137250,
    nextPage = 88473,
    moreLeft = 0 '\000',
    moreRight = 0 '\000',
    nextTupleOffset = 0,
    firstItem = 0,
    lastItem = 1,
    itemIndex = 1,
    items = {
      [0] = {
        heapTid = {
          ip_blkid = { bi_hi = 36, bi_lo = 53872 },
          ip_posid = 1
        },
        indexOffset = 93,
        tupleOffset = 0
      },
      [1] = {
        heapTid = {
          ip_blkid = { bi_hi = 35, bi_lo = 44171 },
          ip_posid = 0
        },
        indexOffset = 94,
        tupleOffset = 0
      },
      ...

As can be seen, there is a weird second element in the items array with
ip_posid = 0.
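[Editor's note: the heapTid values in that GDB dump can be decoded by hand.
An ItemPointer stores the heap block number split into two 16-bit halves
(bi_hi, bi_lo) plus a 1-based line pointer offset (ip_posid). A small
sketch of that arithmetic (the helper name is invented for illustration):]

```python
def heap_tid(bi_hi: int, bi_lo: int, ip_posid: int):
    """Reassemble a heap TID from the fields shown in the GDB dump."""
    blkno = (bi_hi << 16) | bi_lo  # block number is split across two uint16s
    return (blkno, ip_posid)

# items[0]: matches the sane ctid (2413168,1) seen earlier
print(heap_tid(36, 53872, 1))  # (2413168, 1)

# items[1]: block 2337931, but ip_posid = 0 -- valid line pointer
# offsets start at 1, which is why the scan errors out
print(heap_tid(35, 44171, 0))  # (2337931, 0)
```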
Now that led me to check the next entry in the transactions_pkey index
(with offset=94):

select * from bt_page_items('transactions_pkey', 88472) where itemoffset in (93,94);
 itemoffset |    ctid     | itemlen | nulls | vars |          data
------------+-------------+---------+-------+------+-------------------------
         93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
         94 | (2337931,0) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00

OK, so now we have two entries for the same value in the index.
The index on the master database has the same content, but the second
entry is very likely not visible on the master DB. So it confirms my idea
that there is some permanent corruption of the visibility map on the
replicas (and probably on the master database as well).

What's strange is that the problem could not be fixed via VACUUM FREEZE
with vacuum_freeze_table_age=0.

Any suggestions what I can do next?
Unfortunately a dump/restore of the database is not an option at this
moment (it's a huge project with strict online read-write requirements).
Is there any way to force a rebuild of the visibility map for the table
without an access exclusive lock (i.e. without CLUSTER/VACUUM FULL)?

I am going to try upgrading the database to 9.3.4 today and see whether
it fixes the issue, but I see nothing about the visibility map in the
release notes.

Kind Regards,
Maksym
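[Editor's note: the "two entries for the same value" observation can be
checked mechanically by grouping index entries on their key bytes. A
sketch, with the two rows hard-coded from the output above; note that in
a unique index a duplicate key can also legitimately come from a dead
row version, so this only flags candidates for closer inspection:]

```python
from collections import defaultdict

entries = [
    # (itemoffset, ctid, data) copied from the bt_page_items output above
    (93, "(2413168,1)", "7c c2 2c 03 00 00 00 00"),
    (94, "(2337931,0)", "7c c2 2c 03 00 00 00 00"),
]

# Group ctids by key bytes; any key with more than one ctid is suspect.
by_key = defaultdict(list)
for itemoffset, ctid, data in entries:
    by_key[data].append(ctid)

dups = {key: ctids for key, ctids in by_key.items() if len(ctids) > 1}
print(dups)  # the duplicated key points at two different heap tuples
```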
Maxim Boguk <maxim.boguk@gmail.com> writes:
> select * from bt_page_items('transactions_pkey', 88472) where itemoffset in (93,94);
>  itemoffset |    ctid     | itemlen | nulls | vars |          data
> ------------+-------------+---------+-------+------+-------------------------
>          93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
>          94 | (2337931,0) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00

This is sort of what I expected.  Surely reindexing the index will get
rid of that bogus entry?

			regards, tom lane
On Thu, Mar 27, 2014 at 10:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maxim Boguk <maxim.boguk@gmail.com> writes:
> > select * from bt_page_items('transactions_pkey', 88472) where itemoffset in (93,94);
> >  itemoffset |    ctid     | itemlen | nulls | vars |          data
> > ------------+-------------+---------+-------+------+-------------------------
> >          93 | (2413168,1) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
> >          94 | (2337931,0) |      16 | f     | f    | 7c c2 2c 03 00 00 00 00
>
> This is sort of what I expected.  Surely reindexing the index will get
> rid of that bogus entry?
>
>                         regards, tom lane

Yep, reindexing that particular index fixed the problem.
But this index had been reindexed just 2 days ago in the same situation,
and now the problem has reappeared (with different table rows/index entries).

If the problem reappears again after this latest reindex, now on version
9.3.4, I will investigate further.

It would be interesting to know how a simple integer btree index could
end up with an entry with a definitely wrong ctid.
It is very strange that the problem keeps reappearing on only one table
(there are a couple of other tables in the database with a similar load
but no issues at all); it looks too selective and repeatable for a
hardware glitch.

Kind Regards,
Maksym
On Thu, Mar 27, 2014 at 11:27 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
> On Thu, Mar 27, 2014 at 10:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This is sort of what I expected.  Surely reindexing the index will get
>> rid of that bogus entry?
>>
>>                         regards, tom lane
>
> Yep, reindexing that particular index fixed the problem.
> But this index had been reindexed just 2 days ago in the same situation,
> and now the problem has reappeared (with different table rows/index entries).
>
> If the problem reappears again after this latest reindex, now on version
> 9.3.4, I will investigate further.
>
> It would be interesting to know how a simple integer btree index could
> end up with an entry with a definitely wrong ctid.
> It is very strange that the problem keeps reappearing on only one table
> (there are a couple of other tables in the database with a similar load
> but no issues at all); it looks too selective and repeatable for a
> hardware glitch.

Hi,

The situation reappeared just 3 hours after the reindex (and no database
crash/restart happened during that period).
The problem was found via a manual search for the potentially troublesome ids.

Step 1: search for suspicious index entries/ids (where the index ctid has
offset zero):

select *,(get_byte(val, 0))+(get_byte(val, 1)<<8)+(get_byte(val, 2)<<16)+(get_byte(val, 3)<<24) as id
from (
    select gs.i,t.*,(E'\\x'||replace(trim(trailing ' 0' from data), ' ', ''))::bytea as val
    from (select generate_series(1,90000) as i) as gs,
         bt_page_items('transactions_pkey', i) as t
    where ctid::text like '%,0)'
) as _t
order by id,i,itemoffset;
   i   | itemoffset |    ctid     | itemlen | nulls | vars |          data           |    val     |    id
-------+------------+-------------+---------+-------+------+-------------------------+------------+----------
 83604 |         60 | (2233938,0) |      16 | f     | f    | df aa 11 03 00 00 00 00 | \xdfaa1103 | 51489503
 83648 |         92 | (2245568,0) |      16 | f     | f    | e7 e9 11 03 00 00 00 00 | \xe7e91103 | 51505639
 83819 |         37 | (2253801,0) |      16 | f     | f    | bc dc 12 03 00 00 00 00 | \xbcdc1203 | 51567804
 83907 |        184 | (2256376,0) |      16 | f     | f    | 1f 5b 13 03 00 00 00 00 | \x1f5b1303 | 51600159
...
Found 25 rows.

Now the situation has become much worse: the test query

set enable_indexscan to 0;
select * from transactions where id in (25 rows from the previous query result);

produces the same error on the master database as well as on the replicas.
And "reindex index transactions_pkey;" doesn't help.
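[Editor's note: the get_byte() arithmetic in the query above is hand-rolled
little-endian decoding. The same scan can be sketched in Python over rows
copied from the output (the helper `ctid_offset` is invented; only the
first two rows are included here):]

```python
rows = [
    # (i, itemoffset, ctid, data) copied from the query output above
    (83604, 60, "(2233938,0)", "df aa 11 03 00 00 00 00"),
    (83648, 92, "(2245568,0)", "e7 e9 11 03 00 00 00 00"),
]

def ctid_offset(ctid: str) -> int:
    """Extract the line pointer offset from a ctid string like '(blk,off)'."""
    return int(ctid.strip("()").split(",")[1])

for i, itemoffset, ctid, data in rows:
    if ctid_offset(ctid) == 0:  # offset 0 is never valid -> corrupt entry
        key = int.from_bytes(bytes.fromhex(data.replace(" ", "")), "little")
        print(i, ctid, key)
```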
However, the filenode of the index definitely changed (so the reindex did
happen):

select *,pg_relation_filenode('transactions_pkey'),pg_relation_filepath('transactions_pkey')
from pg_stat_file(pg_relation_filepath('transactions_pkey'));
   size    |         access         |      modification      |         change         | creation | isdir | pg_relation_filenode | pg_relation_filepath
-----------+------------------------+------------------------+------------------------+----------+-------+----------------------+----------------------
 760471552 | 2014-03-28 04:02:22+04 | 2014-03-28 04:16:34+04 | 2014-03-28 04:16:34+04 |          | f     |               408909 | base/16424/408909

reindex index transactions_pkey;
REINDEX

select *,pg_relation_filenode('transactions_pkey'),pg_relation_filepath('transactions_pkey')
from pg_stat_file(pg_relation_filepath('transactions_pkey'));
   size    |         access         |      modification      |         change         | creation | isdir | pg_relation_filenode | pg_relation_filepath
-----------+------------------------+------------------------+------------------------+----------+-------+----------------------+----------------------
 760438784 | 2014-03-28 04:18:20+04 | 2014-03-28 04:19:39+04 | 2014-03-28 04:19:39+04 |          | f     |               408931 | base/16424/408931

set enable_indexscan to 0;
SET
select id from transactions where id in (51489503,51505639,51567804,... 25 total);
ERROR:  tuple offset out of range: 0

Any suggestions what I can do next and where the problem could be?
It no longer seems to be a replication issue... maybe some repeatable
low-level database corruption? But I have no idea what type of corruption
could produce such weird results.

Kind Regards,
Maksym