Обсуждение: ALTER TABLE name RENAME TO new_name; does not work immediately
Hello, I don't know if this is expected behaviour or not, but I've found that this sequence of commands does not work all the time: ALTER TABLE name RENAME TO new_name; ALTER TABLE new_name RENAME COLUMN x TO y; It usually works with small work_mem and maintenance_work_mem values. If I increase work_mem and maintenance_work_mem to 256MB it stops working and prints following error message: ERROR: relation "new_name" does not exist It doesn't matter whether the commands are run within the transaction or not. I'm running postgresql 8.3.3 on Core 2 Duo with 2GB ram. Actual table has 700 000 rows. I think the issue might be timing related as I've managed to "solve" the issue by putting SAVEPOINT between the commands on one machine. But on other machine this did not help. I think the issue might be reproducible by creating table with a lot of rows, setting high work_mem and issuing the commands, but I did not try to reproduce it this way.
Oh, I've forgot to mention that the commands are issued from the script. When issued by hand there is no problem at all (probably because I'm a slow typist :-D). Thank you and regards, Ondrej Jirman
<megous@gmail.com> writes: > ALTER TABLE name RENAME TO new_name; > ALTER TABLE new_name > RENAME COLUMN x TO y; > ... > I think the issue might be reproducible by creating table with a lot > of rows, setting high work_mem and issuing the commands, but I did not > try to reproduce it this way. The commands you described should take the same length of time regardless of the size of table and the memory settings are not relevant. I suspect you're actually running some different commands? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
Here is a script that is able to reproduce the problem (on my machine anyway), you may need to play with the number of rows or workmem settings to be able to reproduce it: regards, Ondrej Jirman #!/bin/sh dropdb bug createdb bug psql bug << EOF CREATE TABLE xxx (id SERIAL, col1 TEXT, col2 TEXT); INSERT INTO xxx SELECT s AS id, s || 'XXXX' AS col1, s || 'YYYYY' AS col2 FROM generate_series(0, 2000000) AS s; EOF psql bug << EOF SET maintenance_work_mem = 256000; SET work_mem = 256000; SHOW maintenance_work_mem; SHOW work_mem; VACUUM FULL ANALYZE; REINDEX DATABASE bug; BEGIN; ALTER TABLE xxx RENAME TO yyy; ALTER TABLE yyy RENAME COLUMN col1 TO colA; ALTER TABLE yyy RENAME COLUMN col2 TO colB; COMMIT; EOF
<megous@gmail.com> writes: > Here is a script that is able to reproduce the problem (on my machine > anyway), you may need to play with the number of rows or workmem > settings to be able to reproduce it: Egads, i can confirm that this script reproduces the problem on my machine as well: ALTER TABLE xxx RENAME TO yyy; ALTER TABLE ALTER TABLE yyy RENAME COLUMN col1 TO colA; ERROR: relation "yyy" does not exist -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Gregory Stark escribió: > > <megous@gmail.com> writes: > > > Here is a script that is able to reproduce the problem (on my machine > > anyway), you may need to play with the number of rows or workmem > > settings to be able to reproduce it: > > Egads, i can confirm that this script reproduces the problem on my machine as > well: > > ALTER TABLE xxx RENAME TO yyy; > ALTER TABLE > ALTER TABLE yyy RENAME COLUMN col1 TO colA; > ERROR: relation "yyy" does not exist On what version? Sounds like a problem in sinval message processing. I could not reproduce it here, but didn't try very hard. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
"Alvaro Herrera" <alvherre@commandprompt.com> writes: > Gregory Stark escribi=F3: >>=20 >> <megous@gmail.com> writes: >>=20 >> > Here is a script that is able to reproduce the problem (on my machine >> > anyway), you may need to play with the number of rows or workmem >> > settings to be able to reproduce it: >>=20 >> Egads, i can confirm that this script reproduces the problem on my machi= ne as >> well: >>=20 >> ALTER TABLE xxx RENAME TO yyy; >> ALTER TABLE >> ALTER TABLE yyy RENAME COLUMN col1 TO colA; >> ERROR: relation "yyy" does not exist > > On what version? Sounds like a problem in sinval message processing. My first suspicion was in the same area. It's two commands within the same session so it wouldn't actually be receiving any sinval messages just processing the local list. The two commands are within the same transaction= so only the end-of-command list processed during CommandCounterIncrement() is happening (or should be happening). Either a) we're not actually issuing a CommandCounterIncrement() or b) the command is not being properly marked as non-read-only when the update happe= ns or, uh, c) something else is broken with generating or executing those invalidation messages. > I could not reproduce it here, but didn't try very hard. Did you try running the script he provided? It happened for me first try. --=20 Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL tra= ining!
Gregory Stark escribió: > Either a) we're not actually issuing a CommandCounterIncrement() or b) the > command is not being properly marked as non-read-only when the update happens > or, uh, c) something else is broken with generating or executing those > invalidation messages. Well, the CommandCounterIncrement call is either in postgres.c line 1032, or in the finish_xact_command call in line 1024 (may I suggest a breakpoint in CCI while reproducing the bug?). I'm not sure I understand your (b) argument (if we failed to produce a sinval message for this then the bug would have been detected eons ago), so I think the most likely explanation is (c) :-) > > I could not reproduce it here, but didn't try very hard. > > Did you try running the script he provided? It happened for me first try. Yep. No luck. I didn't try changing the args in generate_series though. How many CPUs are you running this on? My system has 2. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
"Alvaro Herrera" <alvherre@commandprompt.com> writes: >> > I could not reproduce it here, but didn't try very hard. >> >> Did you try running the script he provided? It happened for me first try. > > Yep. No luck. I didn't try changing the args in generate_series though. > > How many CPUs are you running this on? My system has 2. Huh, I've been able to simplify the reproducing script somewhat and still trigger the bug. It is *not* necessary to run the table rename and the column rename in the same transaction. That means it's *not* a problem with a missing CommandCounterIncrement() or anything like that. Even at transaction end the cache entry isn't being invalidated up if that's the cause. Also, as expected the size of the table is irrelevant. On the other hand the problem does not occur if the CREATE TABLE is in the same session. Nor if the REINDEX DATABASE is skipped. #!/bin/sh /usr/local/pgsql/bin/dropdb bug /usr/local/pgsql/bin/createdb bug /usr/local/pgsql/bin/psql -X bug << EOF CREATE TABLE xxx (id SERIAL, col1 TEXT, col2 TEXT); EOF /usr/local/pgsql/bin/psql -e -X bug << EOF REINDEX DATABASE bug; ALTER TABLE xxx RENAME TO yyy; ALTER TABLE yyy RENAME COLUMN col1 TO colA; ALTER TABLE yyy RENAME COLUMN col2 TO colB; EOF -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Gregory Stark <stark@enterprisedb.com> writes: > On the other hand the problem does not occur if the CREATE TABLE is in the > same session. Nor if the REINDEX DATABASE is skipped. It appears that you have to do the RENAME TABLE in the same session as the REINDEX DATABASE. Furthermore, the failure then persists into subsequent sessions. There is a "yyy" entry in pg_class, but index searches using pg_class_relname_nsp_index don't find it. Also, I can still reproduce it with just REINDEX TABLE pg_class instead of REINDEX DATABASE. What I suspect is we've broken the logic that manages REINDEX applied to pg_class itself. There's a mighty delicate dance that goes on there, and we haven't tested it too much lately. regards, tom lane
I wrote: > What I suspect is we've broken the logic that manages REINDEX applied to > pg_class itself. Hm, this is a HOT bug. (Unless the OP found it on something older than 8.3, in which case we might have more than one issue involved.) What I'm seeing is that when heap_update checks to see if the new pg_class tuple is HOT-updatable, RelationGetIndexAttrBitmap tells it that only the OID is an indexed attribute, and so it does a HOT update, leaving the wrong index entry in place. Apparently, rd_indexattr needs to get cleared someplace where it isn't being cleared right now. I'm sure this is related to the fancy dance we do for pg_class reindexing, but not sure yet where to fix it. regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Also, I can still reproduce it with just REINDEX TABLE pg_class instead > of REINDEX DATABASE. Ah, I had tried just a reindex xxx but not a reindex pg_class. * reindex_index will attempt to update the pg_class rows for the relation * and index. If we are processing pg_class itself, we want to make sure * that the updates do not try to insert index entries into indexes we * have not processed yet. (When we are trying to recover from corrupted * indexes, that could easily cause a crash.) We can accomplish this * because CatalogUpdateIndexes will use the relcache's index list to know * which indexes to update. We just force the index list to be only the * stuff we've processed. Uhm. Is it possible we're mistakenly doing a HOT update because we're lying about what indexes exist? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Gregory Stark <stark@enterprisedb.com> writes: > Uhm. Is it possible we're mistakenly doing a HOT update because we're lying > about what indexes exist? Yup, exactly. Here's my proposed fix... regards, tom lane Index: src/backend/catalog/index.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/catalog/index.c,v retrieving revision 1.300 diff -c -r1.300 index.c *** src/backend/catalog/index.c 19 Jun 2008 00:46:04 -0000 1.300 --- src/backend/catalog/index.c 10 Aug 2008 18:49:08 -0000 *************** *** 2380,2388 **** * problem. */ is_pg_class = (RelationGetRelid(rel) == RelationRelationId); ! doneIndexes = NIL; /* Reindex all the indexes. */ foreach(indexId, indexIds) { Oid indexOid = lfirst_oid(indexId); --- 2380,2392 ---- * problem. */ is_pg_class = (RelationGetRelid(rel) == RelationRelationId); ! ! /* Ensure rd_indexattr is valid; see comments for RelationSetIndexList */ ! if (is_pg_class) ! (void) RelationGetIndexAttrBitmap(rel); /* Reindex all the indexes. */ + doneIndexes = NIL; foreach(indexId, indexIds) { Oid indexOid = lfirst_oid(indexId); Index: src/backend/utils/cache/relcache.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/cache/relcache.c,v retrieving revision 1.272 diff -c -r1.272 relcache.c *** src/backend/utils/cache/relcache.c 12 May 2008 00:00:52 -0000 1.272 --- src/backend/utils/cache/relcache.c 10 Aug 2008 18:49:08 -0000 *************** *** 2986,2991 **** --- 2986,2998 ---- * messages. In practice it is only used on pg_class (see REINDEX). * * It is up to the caller to make sure the given list is correctly ordered. + * + * We deliberately do not change rd_indexattr here: even when operating + * with a temporary partial index list, HOT-update decisions must be made + * correctly with respect to the full index set. It is up to the caller + * to ensure that a correct rd_indexattr set has been cached before first + * calling RelationSetIndexList; else a subsequent inquiry might cause a + * wrong rd_indexattr set to get computed and cached. */ void RelationSetIndexList(Relation relation, List *indexIds, Oid oidIndex) *************** *** 3004,3010 **** relation->rd_indexvalid = 2; /* mark list as forced */ /* must flag that we have a forced index list */ need_eoxact_work = true; - /* we deliberately do not change rd_indexattr */ } /* --- 3011,3016 ----
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > Apparently, rd_indexattr needs to get cleared someplace where it > isn't being cleared right now. I'm sure this is related to the > fancy dance we do for pg_class reindexing, but not sure yet where > to fix it. Yeah, doing the equivalent of this in gdb seems to fix it (haven't rebuilt yet to test that I haven't done something silly in the C code though) --- index.c 30 Jul 2008 21:56:59 +0100 1.300 +++ index.c 10 Aug 2008 19:55:21 +0100 @@ -2382,6 +2382,11 @@ is_pg_class = (RelationGetRelid(rel) == RelationRelationId); doneIndexes = NIL; + /* have to build the HOT indexed column bitmap before we start lying about + * what indexes exist... */ + if (is_pg_class && !rel->rd_indexattr) + (void)RelationGetIndexAttrBitmap(rel); + /* Reindex all the indexes. */ foreach(indexId, indexIds) { -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
On 9 Srp, 19:38, st...@enterprisedb.com (Gregory Stark) wrote: > <meg...@gmail.com> writes: > > ALTER TABLE name RENAME TO new_name; > > ALTER TABLE new_name > > =A0 RENAME COLUMN x TO y; > > ... > > I think the issue might be reproducible by creating table with a lot > > of rows, setting high work_mem and issuing the commands, but I did not > > try to reproduce it this way. > > The commands you described should take the same length of time regardless= of > the size of table and the memory settings are not relevant. I suspect you= 're > actually running some different commands? > > -- > =A0 Gregory Stark > =A0 EnterpriseDB =A0 =A0 =A0 =A0 =A0http://www.enterprisedb.com > =A0 Ask me about EnterpriseDB's On-Demand Production Tuning > > -- > Sent via pgsql-bugs mailing list (pgsql-b...@postgresql.org) > To make changes to your subscription:http://www.postgresql.org/mailpref/p= gsql-bugs -- This script will migrate shema from the version 1.1.18 to the version 1.2.0 -- (i.e. Gardeners integration) This is what I actually run: SET search_path =3D public, isn, pg_catalog; SET maintenance_work_mem =3D 256000; SET work_mem =3D 256000; VACUUM FULL ANALYZE; REINDEX DATABASE shakes; BEGIN; ALTER TABLE btbooks RENAME TO exbooks; -- following command returns the error ALTER TABLE exbooks RENAME COLUMN bt_price TO list_price; -- many more other commands .... COMMIT;