Обсуждение: Allow replacement of bloated primary key indexes without foreign key rebuilds
Allow replacement of bloated primary key indexes without foreign key rebuilds
От
Gurjeet Singh
Дата:
<div dir="ltr">Today I learnt [1,2,3] that the feature ALTER TABLE .. ADD CONSTRAINT ... USING INDEX we added back in theday is not so useful in the field. Having to drop foreign key constraints before this command, and recreate them afterwardsmakes this command useless to most database setups. I feel sorry that no one brought this up when we were implementingthe feature; maybe we could've done something about it right then.<br /><br />I wish to correct it now, and didsome research. Long story short, I realized that the foreign key constraint depends on the index relation of the primarykey, and does not depend on the constraint object as I had expected (Please see rows 5 and 11 of the result set shownbelow). This behaviour is also seen when the FKey references a unique constraint. As much as that perplexes me, I thinkit makes our job a bit easier.<br /><br />All we need to do is allow swapping of pg_class.relfilenode of two indexes.This will let the dependency entries stand as they are and allow us to drop the bloated primary key index structurewithout having to rebuild the foreign key constraints.<br /><br />As for the syntactical sugar, this can be addedto either ALTER TABLE or to ALTER INDEX. Although under no normal circumstances one would need to use ALTER INDEX toswap two indexes' relfilenode (because one can easily create a duplicate index and drop/rename-in-place the old one), Ithink it would make more sense here since it is just an operation on two indexes and has nothing to do with the constraints,apart from the fact that we want to use this feature to meddle with the constraints.<br /><br />Syntax options:<br/><br />ALTER TABLE tbl REPLACE [CONSTRAINT constr] {PRIMARY KEY | UNIQUE} USING INDEX new_index;<br /><br />ALTERINDEX ind REPLACE WITH new_index;<br /><br />Note that in both the syntaxes, it is assumed that all remnants of new_indexwill be gone after the command completes successfully; that is, the commands will behave as if they deleted theindex structure of the index being replaced and placed the new structure in its place, while dropping the index that wasused for the replacement.<br /><br />I don't think we need to ensure that the new_index is completely flushed to diskbefore the operation, but we do need to issue relevant cache invalidation messages after the operation is done.<br /><br/>For replacement to be successful, new_index should not be associated with any constraints, and, new_index should beidentical to the index being replaced, except for the index names.<br /><br />The ALTER TABLE syntax closely emulates theexisting syntax of replacing a constraint using an existing index, but looking at the grammar construction I feel thatit may be more complex to implement than the ALTER INDEX syntax.<br /><br /> ALTER INDEX feels easier to do, since wewon't have to jump through hoops like in ALTER TABLE's multi-command support (ATExec*() functions), and dropping the new_indexmight be easier to do.<br /><br />Thoughts?<br /><br /><span style="font-family:courier new,monospace">postgres=#\d+ test</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> Table "public.test"</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> Column | Type | Modifiers| Storage | Stats target | Description </span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">--------+---------+-----------+---------+--------------+-------------</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> a | integer | not null | plain | | </span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> b | integer | | plain | | </span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace">Indexes:</span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace"> "test_pkey" PRIMARY KEY, btree (a)</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> "temp_idx" UNIQUE, btree (a)</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"></span><span style="font-family:couriernew,monospace">Referenced by:</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> TABLE "test2" CONSTRAINT "test2_b_fkey" FOREIGN KEY (b) REFERENCES test(a)</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">Has OIDs: no</span><brstyle="font-family:courier new,monospace" /><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace">postgres=# \d+ test2</span><br style="font-family:courier new,monospace" /><spanstyle="font-family:courier new,monospace"> Table "public.test2"</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> Column | Type | Modifiers| Storage | Stats target | Description </span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">--------+---------+-----------+---------+--------------+-------------</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> a | integer | | plain | | </span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> b | integer | | plain | | </span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace">Foreign-key constraints:</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> "test2_b_fkey"FOREIGN KEY (b) REFERENCES test(a)</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace">Has OIDs: no</span><br style="font-family:courier new,monospace" /><br />Relevantoutput of query [4] on pg_depend:<br /><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> classid | objid | objid | refclassid | refobjid| refobjid | deptype </span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace">---------------+-------+------------------------------+---------------+----------+--------------+---------<br /> pg_class | 16413 | test | pg_namespace | 2200 | public | n</span><br style="font-family:couriernew,monospace" /><span style="font-family:courier new,monospace"> pg_type | 16415 | test | pg_class | 16413 | test | i</span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace"> pg_type | 16414 | test[] | pg_type | 16415 | test | i</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> pg_constraint | 16417 | test_pkey | pg_class | 16413| test | a</span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_class | 16416 | test_pkey | pg_constraint | 16417 | test_pkey | i</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_class | 16418| test2 | pg_namespace | 2200 | public | n</span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace"> pg_type | 16420 | test2 | pg_class | 16418 | test2 | i</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> pg_type | 16419 | test2[] | pg_type | 16420| test2 | i</span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_constraint| 16421 | test2_b_fkey | pg_class | 16413 | test | n</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_constraint | 16421| test2_b_fkey | pg_class | 16418 | test2 | a</span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace"> pg_constraint | 16421 | test2_b_fkey | pg_class | 16416 | test_pkey | n</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"> pg_trigger | 16422 | RI_ConstraintTrigger_a_16422 | pg_constraint | 16421| test2_b_fkey | i</span><br style="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_trigger | 16423 | RI_ConstraintTrigger_a_16423 | pg_constraint | 16421 | test2_b_fkey | i</span><brstyle="font-family:courier new,monospace" /><span style="font-family:courier new,monospace"> pg_trigger | 16424| RI_ConstraintTrigger_c_16424 | pg_constraint | 16421 | test2_b_fkey | i</span><br style="font-family:courier new,monospace"/><span style="font-family:courier new,monospace"> pg_trigger | 16425 | RI_ConstraintTrigger_c_16425 | pg_constraint| 16421 | test2_b_fkey | i</span><br style="font-family:courier new,monospace" /><span style="font-family:couriernew,monospace"></span><br />[1] <a href="http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00104.php</a><br />[2]<a href="http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00105.php</a><br />[3] <a href="http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php">http://archives.postgresql.org/pgsql-general/2012-07/msg00110.php</a><br clear="all"/>[4] select classid::regclass, objid, case classid::regclass::text when 'pg_class' then objid::regclass::textwhen 'pg_type' then objid::regtype::text when 'pg_constraint' then (select conname from pg_constraintwhere oid = objid) when 'pg_namespace' then (select nspname from pg_namespace where oid = objid) when 'pg_trigger'then (select tgname from pg_trigger where oid = objid) else objid::text end, refclassid::regclass, refobjid,case refclassid::regclass::text when 'pg_class' then refobjid::regclass::text when 'pg_type' then refobjid::regtype::textwhen 'pg_constraint' then (select conname from pg_constraint where oid = refobjid) when 'pg_namespace'then (select nspname from pg_namespace where oid = refobjid) when 'pg_trigger' then (select tgname from pg_triggerwhere oid = refobjid) else refobjid::text end, deptype from pg_depend;<br /><br />-- <br /><div dir="ltr">GurjeetSingh<br />EnterpriseDB Corporation<br />The Enterprise PostgreSQL Company<br /></div><br /></div>
On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote: > All we need to do is allow swapping of pg_class.relfilenode of two indexes. > This will let the dependency entries stand as they are and allow us to drop > the bloated primary key index structure without having to rebuild the > foreign key constraints. Fwiw I don't like swapping relfilenodes on indexes the user created. REINDEX currently does this but it's a bit of a hack and only works because reindex carefully builds the new index with exactly the same definition as the old one. The problem you describe is one of constraints and dependencies and not one of indexes. It seems what you really want is a way to alter foreign key dependencies to depend on a new index. Either an explicit command that lets you set the new dependency or what seems even better would be to have DROP INDEX check any dependent objects to see if there's another index that can satisfy them and change their dependency. These might suffer from deadlock problems but hopefully they could be manageable since it's not a frequent operation and there aren't any other operations that rejigger dependencies. -- greg
Greg Stark <stark@mit.edu> writes: > On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote: >> All we need to do is allow swapping of pg_class.relfilenode of two indexes. > Fwiw I don't like swapping relfilenodes on indexes the user created. > REINDEX currently does this but it's a bit of a hack and only works > because reindex carefully builds the new index with exactly the same > definition as the old one. Yes. The swap-relfilenodes operation would have to carefully check that the index definitions were exactly equivalent, and there would be a constant risk for bugs of omission if that code weren't taught about any new index properties we invent. > The problem you describe is one of constraints and dependencies and > not one of indexes. It seems what you really want is a way to alter > foreign key dependencies to depend on a new index. Either an explicit > command that lets you set the new dependency or what seems even better > would be to have DROP INDEX check any dependent objects to see if > there's another index that can satisfy them and change their > dependency. Either of these have exactly the same issue, namely their correctness depends on determining if two indexes have identical properties. All of these things seem like ugly, hard-to-use kluges anyway (the make-sure-the-indexes-match business is just as much of a PITA for the DBA as it is for the system). What we really want is REINDEX CONCURRENTLY. regards, tom lane
On Tue, Jul 10, 2012 at 3:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The problem you describe is one of constraints and dependencies and >> not one of indexes. It seems what you really want is a way to alter >> foreign key dependencies to depend on a new index. Either an explicit >> command that lets you set the new dependency or what seems even better >> would be to have DROP INDEX check any dependent objects to see if >> there's another index that can satisfy them and change their >> dependency. > > Either of these have exactly the same issue, namely their correctness > depends on determining if two indexes have identical properties. This doesn't sound right to me. In these cases all it would have to know about is the same set of properties that CREATE CONSTRAINT looks for to find a satisfactory index to depend on. -- greg
Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds
От
Alvaro Herrera
Дата:
Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012: > All of these things seem like ugly, hard-to-use kluges anyway (the > make-sure-the-indexes-match business is just as much of a PITA for the > DBA as it is for the system). What we really want is REINDEX > CONCURRENTLY. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Tom Lane's message of mar jul 10 10:44:03 -0400 2012: >> What we really want is REINDEX CONCURRENTLY. > http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.47.9961 ? Hm ... that paper looks like something we might want to incorporate into btree's VACUUM processing, but it's not very on-point if someone really wants to rebuild the index totally. regards, tom lane
Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds
От
Gurjeet Singh
Дата:
On Tue, Jul 10, 2012 at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
IMHO there must be many other places in this code-base where we run that risk.
The way I am planning to do it was to compare all relevant fields of the FormData_pg_index. And I am assuming anybody changing the struct members will take care of relevant changes needed for this code too.
We can add a runtime/compile-time assert to make sure that Natts_pg_index==17. That way, if a new column gets added, we will get alerted promptly.
Greg Stark <stark@mit.edu> writes:
> On Sat, Jul 7, 2012 at 4:53 AM, Gurjeet Singh <singh.gurjeet@gmail.com> wrote:
>> All we need to do is allow swapping of pg_class.relfilenode of two indexes.> Fwiw I don't like swapping relfilenodes on indexes the user created.Yes. The swap-relfilenodes operation would have to carefully check that
> REINDEX currently does this but it's a bit of a hack and only works
> because reindex carefully builds the new index with exactly the same
> definition as the old one.
the index definitions were exactly equivalent, and there would be a
constant risk for bugs of omission if that code weren't taught about
any new index properties we invent.
IMHO there must be many other places in this code-base where we run that risk.
The way I am planning to do it was to compare all relevant fields of the FormData_pg_index. And I am assuming anybody changing the struct members will take care of relevant changes needed for this code too.
We can add a runtime/compile-time assert to make sure that Natts_pg_index==17. That way, if a new column gets added, we will get alerted promptly.
All of these things seem like ugly, hard-to-use kluges anyway (themake-sure-the-indexes-match business is just as much of a PITA for the
DBA as it is for the system). What we really want is REINDEX
CONCURRENTLY.
+1, but I can't take on that task.
--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Re: Re: Allow replacement of bloated primary key indexes without foreign key rebuilds
От
Gurjeet Singh
Дата:
<div dir="ltr"><div class="gmail_quote">On Tue, Jul 10, 2012 at 11:11 AM, Greg Stark <span dir="ltr"><<a href="mailto:stark@mit.edu"target="_blank">stark@mit.edu</a>></span> wrote:<br /><blockquote class="gmail_quote" style="margin:00 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On Tue, Jul 10, 2012 at 3:44 PM, TomLane <<a href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>> wrote:<br /> >> The problem you describe isone of constraints and dependencies and<br /> >> not one of indexes. It seems what you really want is a way to alter<br/> >> foreign key dependencies to depend on a new index. Either an explicit<br /> >> command that letsyou set the new dependency or what seems even better<br /> >> would be to have DROP INDEX check any dependent objectsto see if<br /> >> there's another index that can satisfy them and change their<br /> >> dependency.<br/> ><br /> > Either of these have exactly the same issue, namely their correctness<br /> > dependson determining if two indexes have identical properties.<br /><br /></div>This doesn't sound right to me. In thesecases all it would have to<br /> know about is the same set of properties that CREATE CONSTRAINT looks<br /> for tofind a satisfactory index to depend on.<br /><span class="HOEnZb"><font color="#888888"></font></span></blockquote></div><brclear="all" />I like the DROP index idea, but the silent side-effectmay not make people happy. Can you give me a pointer to relevant code.<br /><br />-- <br /><div dir="ltr">GurjeetSingh<br />EnterpriseDB Corporation<br />The Enterprise PostgreSQL Company<br /></div><br /></div>