Обсуждение: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

Поиск
Список
Период
Сортировка

BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17720
Logged by:          reiner peterke
Email address:      zedaardv@drizzle.com
PostgreSQL version: 15.1
Operating system:   openSUSE Leap 15.4
Description:

I have a table
create table hamster(under integer, over text);
I create a unique index on the under column with nulls not distinct
create unique index uq_not_distinct on hamster (under) nulls not distinct;
i now create a primary key using the unique index

alter table hamster add constraint pk_hamster primary key using index
uq_not_distinct;
NOTICE:  ALTER TABLE / ADD CONSTRAINT USING INDEX will rename index
"uq_not_distinct" to "pk_hamster"
table looks good
\d hamster 
               Table "moon.hamster"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 under  | integer |           | not null | 
 over   | text    |           |          | 
Indexes:
    "pk_hamster" PRIMARY KEY, btree (under) NULLS NOT DISTINCT

Do a pg_dump of the database.
the dump creates the code for a primary key that cannot be restored
pg_dump -p 5632 -Of tranquility.sql  -d tranquility
on restore, I get the following error
psql:tranquility.sql:90: ERROR:  syntax error at or near "NULLS"
LINE 2:     ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
in the dump itself the create constraint command is
ALTER TABLE ONLY moon.hamster
     ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
which does not work with the NULLS NOT DISTINCT  string


Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

От
"David G. Johnston"
Дата:
On Wednesday, December 14, 2022, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      17720
Logged by:          reiner peterke
Email address:      zedaardv@drizzle.com
PostgreSQL version: 15.1
Operating system:   openSUSE Leap 15.4
Description:       

Do a pg_dump of the database.
the dump creates the code for a primary key that cannot be restored
pg_dump -p 5632 -Of tranquility.sql  -d tranquility
on restore, I get the following error
psql:tranquility.sql:90: ERROR:  syntax error at or near "NULLS"
LINE 2:     ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
in the dump itself the create constraint command is
ALTER TABLE ONLY moon.hamster
     ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
which does not work with the NULLS NOT DISTINCT  string

There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also doesn’t hurt so long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing.


David J.
 
> On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
>
> On Wednesday, December 14, 2022, PG Bug reporting form <noreply@postgresql.org <mailto:noreply@postgresql.org>>
wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17720
> Logged by:          reiner peterke
> Email address:      zedaardv@drizzle.com <mailto:zedaardv@drizzle.com>
> PostgreSQL version: 15.1
> Operating system:   openSUSE Leap 15.4
> Description:
>
> Do a pg_dump of the database.
> the dump creates the code for a primary key that cannot be restored
> pg_dump -p 5632 -Of tranquility.sql  -d tranquility
> on restore, I get the following error
> psql:tranquility.sql:90: ERROR:  syntax error at or near "NULLS"
> LINE 2:     ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT...
> in the dump itself the create constraint command is
> ALTER TABLE ONLY moon.hamster
>      ADD CONSTRAINT pk_hamster PRIMARY KEY NULLS NOT DISTINCT (under);
> which does not work with the NULLS NOT DISTINCT  string
>
> There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any
ofits columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also doesn’t
hurtso long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing. 

Even if we prohibit this, there is still the case of all existing systems which
can't be dumped.  I wonder if the solution is to teach pg_dump to not create
NULLS NOT DISTINCT primary key constraints?  The simple attached fix creates a
valid PK constraint on the above schema.

--
Daniel Gustafsson        https://vmware.com/


Вложения
Daniel Gustafsson <daniel@yesql.se> writes:
> On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
>> There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any
ofits columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also doesn’t
hurtso long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing. 

> Even if we prohibit this, there is still the case of all existing systems which
> can't be dumped.  I wonder if the solution is to teach pg_dump to not create
> NULLS NOT DISTINCT primary key constraints?  The simple attached fix creates a
> valid PK constraint on the above schema.

It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things.  So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.

            regards, tom lane



> On 14 Dec 2022, at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
>>> There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in
anyof its columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also
doesn’thurt so long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing. 
>
>> Even if we prohibit this, there is still the case of all existing systems which
>> can't be dumped.  I wonder if the solution is to teach pg_dump to not create
>> NULLS NOT DISTINCT primary key constraints?  The simple attached fix creates a
>> valid PK constraint on the above schema.
>
> It doesn't make sense for pg_dump to editorialize on a schema that
> we otherwise consider valid; people would rightfully complain that
> dump/restore changed things.  So we need to do both things: prohibit
> adopting such an index as a PK constraint (but I guess it's okay
> for plain unique constraints?), and adjust pg_dump to compensate
> for the legacy case where it was already done.

Agreed, I'll expand the patch.

--
Daniel Gustafsson        https://vmware.com/




Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

От
"David G. Johnston"
Дата:
On Wed, Dec 14, 2022 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
> On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
>> There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in any of its columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also doesn’t hurt so long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing.

> Even if we prohibit this, there is still the case of all existing systems which
> can't be dumped.  I wonder if the solution is to teach pg_dump to not create
> NULLS NOT DISTINCT primary key constraints?  The simple attached fix creates a
> valid PK constraint on the above schema.

It doesn't make sense for pg_dump to editorialize on a schema that
we otherwise consider valid; people would rightfully complain that
dump/restore changed things.  So we need to do both things: prohibit
adopting such an index as a PK constraint (but I guess it's okay
for plain unique constraints?), and adjust pg_dump to compensate
for the legacy case where it was already done.


The WHERE clause of CREATE INDEX doesn't pose an issue as we report "Cannot create a primary key or unique constraint using such an index".
It is also not possible to specify an opclass in PRIMARY KEY, but since we document that unique indexes are only currently implemented by B-tree, and that is what you get from PRIMARY KEY, that is also not a problem (presently - unless extension authors were to bypass this).

The remaining elements: INCLUDE, WITH, and TABLESPACE, all exist in both formulations.

I am thinking now that the failure to include NULLS [NOT[ DISTINCT in the CREATE TABLE syntax is an oversight that needs to be fixed.  It just doesn't make sense to have the two commands expose different features.

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> I am thinking now that the failure to include NULLS [NOT] DISTINCT in the
> CREATE TABLE syntax is an oversight that needs to be fixed.  It just
> doesn't make sense to have the two commands expose different features.

It looks to me like it was pretty intentional, because both CREATE
and ALTER TABLE let you write UNIQUE NULLS NOT DISTINCT but not
PRIMARY KEY NULLS NOT DISTINCT.  That doesn't seem like an oversight.

            regards, tom lane



Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

От
"David G. Johnston"
Дата:
On Wed, Dec 14, 2022 at 9:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> I am thinking now that the failure to include NULLS [NOT] DISTINCT in the
> CREATE TABLE syntax is an oversight that needs to be fixed.  It just
> doesn't make sense to have the two commands expose different features.

It looks to me like it was pretty intentional, because both CREATE
and ALTER TABLE let you write UNIQUE NULLS NOT DISTINCT but not
PRIMARY KEY NULLS NOT DISTINCT.  That doesn't seem like an oversight.

OK, that was me tunnel-visioned on the "index_parameters" syntax section where INCLUDE, etc... are listed and not seeing it there.

So, don't document that PRIMARY KEY accepts the NULLS [NOT] DISTINCT but make it do so anyway?

David J.

> On 14 Dec 2022, at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> On 14 Dec 2022, at 13:54, David G. Johnston <david.g.johnston@gmail.com> wrote:
>>> There is a decent chance that the fix here is to prohibit doing what you did here - a PK cannot contain nulls in
anyof its columns so indeed choosing an index that specifies how nulls behave is non-sensical.  That said, it also
doesn’thurt so long as the column itself is indeed not null.  But extending the syntax doesn’t seem that appealing. 
>
>> Even if we prohibit this, there is still the case of all existing systems which
>> can't be dumped.  I wonder if the solution is to teach pg_dump to not create
>> NULLS NOT DISTINCT primary key constraints?  The simple attached fix creates a
>> valid PK constraint on the above schema.
>
> It doesn't make sense for pg_dump to editorialize on a schema that
> we otherwise consider valid; people would rightfully complain that
> dump/restore changed things.  So we need to do both things: prohibit
> adopting such an index as a PK constraint (but I guess it's okay
> for plain unique constraints?), and adjust pg_dump to compensate
> for the legacy case where it was already done.

The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
constraints but allow them for unique constraints.  Is this along the lines of
what you had in mind?

--
Daniel Gustafsson        https://vmware.com/


Вложения
Daniel Gustafsson <daniel@yesql.se> writes:
> The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
> constraints but allow them for unique constraints.  Is this along the lines of
> what you had in mind?

Needs more than zero comments in the code, and why bother testing
is_alter_table in index_check_primary_key?  We're disallowing
this case across-the-board, no matter how you get to it.

I'll defer to Peter on whether this is in fact the right way to go,
or we should relax the syntax restriction as David suggests.

            regards, tom lane



> On 15 Dec 2022, at 00:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
>> constraints but allow them for unique constraints.  Is this along the lines of
>> what you had in mind?
>
> Needs more than zero comments in the code, and why bother testing
> is_alter_table in index_check_primary_key?  We're disallowing
> this case across-the-board, no matter how you get to it.

That was an, admittedly poor, attempt at self-documenting code.  Removed and
comments added in the attached.

--
Daniel Gustafsson        https://vmware.com/


Вложения
On 12/15/22 14:18, Daniel Gustafsson wrote:
>> On 15 Dec 2022, at 00:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Daniel Gustafsson <daniel@yesql.se> writes:
>>> The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
>>> constraints but allow them for unique constraints.  Is this along the lines of
>>> what you had in mind?
>>
>> Needs more than zero comments in the code, and why bother testing
>> is_alter_table in index_check_primary_key?  We're disallowing
>> this case across-the-board, no matter how you get to it.
> 
> That was an, admittedly poor, attempt at self-documenting code.  Removed and
> comments added in the attached.

[reviewing patch...]

If the point in adding a primary key USING INDEX is to avoid building an 
index, then this restriction defeats that purpose.  We have no ALTER 
INDEX command to switch or drop the <unique null treatment>.

Since the primary key can't have any nulls anyway, I think we should 
just flip this, perhaps with a NOTICE like the rename does.
-- 
Vik Fearing




Vik Fearing <vik@postgresfriends.org> writes:
> If the point in adding a primary key USING INDEX is to avoid building an 
> index, then this restriction defeats that purpose.  We have no ALTER 
> INDEX command to switch or drop the <unique null treatment>.

Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?

I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not.  You'd have had to go out
of your way to make the index incompatible.  That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.

            regards, tom lane



On 12/15/22 16:54, Tom Lane wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>> If the point in adding a primary key USING INDEX is to avoid building an
>> index, then this restriction defeats that purpose.  We have no ALTER
>> INDEX command to switch or drop the <unique null treatment>.
> 
> Well, if you want to build an index to then use as a primary key,
> it's incumbent on you to make the index with the correct properties.

This is assuming you initially created the index for the purpose of 
making it the primary key, perhaps for concurrency.

> Do you expect the system to silently fix it for you if the index is
> on the wrong columns?

Of course not.

> I might have more sympathy for this argument if the correct <unique null
> treatment> were non-default, but it is not.  You'd have had to go out
> of your way to make the index incompatible.  That in turn suggests that
> there's a mistake somewhere, so having the system automatically fix
> it for you might just be masking something that would be better dealt
> with manually.

The index may have been originally created for something else.  The only 
way to deal with this manually is to drop and recreate the index, just 
to remove something that has no effect anyway because the index contains 
no nulls.

I am not going to die on this hill, but I think it is a shame to make 
users go through that.  A NOTICE saying "btw, we did that" should be 
sufficient.
-- 
Vik Fearing




Re: BUG #17720: pg_dump creates a dump with primary key that cannot be restored, when specifying 'using index ...'

От
"David G. Johnston"
Дата:
On Thu, Dec 15, 2022 at 8:54 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vik Fearing <vik@postgresfriends.org> writes:
> If the point in adding a primary key USING INDEX is to avoid building an
> index, then this restriction defeats that purpose.  We have no ALTER
> INDEX command to switch or drop the <unique null treatment>.

Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?

I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not.  You'd have had to go out
of your way to make the index incompatible.  That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.


No matter what we do we are either masking the situation that a user has a primary key backing index defined NULLS NOT DISTINCT which is a pointless, but not invalid, definition, or causing an error to happen because we've decided that we shouldn't be masking that inconsistent configuration.

Green-field, maybe we should have made it a "NULLS { NOT ALLOWED | [NOT] DISTINCT }" option and been better at informing/prohibiting the user from choosing an index that is not compatible with the constraint they want it to support.  But that isn't really something we make an effort to do generally and it seems too late to get on that horse now.

Sleeping on this, I feel even more strongly that touching pg_dump now is wrong.  The configuration, while nonsensical, is technically valid.  Which means that the argument for causing existing pg_dumps to become broken is basically zero.  We need to simply synchronize PRIMARY KEY to accept this configuration in the interest of dump/restore preservation.

David J.



On Thu, Dec 15, 2022 at 4:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Vik Fearing <vik@postgresfriends.org> writes:
> If the point in adding a primary key USING INDEX is to avoid building an
> index, then this restriction defeats that purpose.  We have no ALTER
> INDEX command to switch or drop the <unique null treatment>.

Well, if you want to build an index to then use as a primary key,
it's incumbent on you to make the index with the correct properties.
Do you expect the system to silently fix it for you if the index is
on the wrong columns?

I might have more sympathy for this argument if the correct <unique null
treatment> were non-default, but it is not.  You'd have had to go out
of your way to make the index incompatible.  That in turn suggests that
there's a mistake somewhere, so having the system automatically fix
it for you might just be masking something that would be better dealt
with manually.

                        regards, tom lane


Can you explain the distinction?
If i build an index nulls distinct/nulls not distinct on a column that will later be used in an
alter index add constraint ... using index
both indexes can have at least one null.
from that point using an index in the creation of a constraint should not matter if the index could potentially have one or many nulls.
If the pre version 15 syntax is correct, then the post version 15 syntax should also be correct.
If one is not valid, then both should not be valid.
Or am i missing something here?

Reiner
On 15.12.22 00:27, Tom Lane wrote:
> Daniel Gustafsson <daniel@yesql.se> writes:
>> The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
>> constraints but allow them for unique constraints.  Is this along the lines of
>> what you had in mind?
> 
> Needs more than zero comments in the code, and why bother testing
> is_alter_table in index_check_primary_key?  We're disallowing
> this case across-the-board, no matter how you get to it.
> 
> I'll defer to Peter on whether this is in fact the right way to go,
> or we should relax the syntax restriction as David suggests.

My first instinct was to just fix pg_dump to not dump syntax that can't 
be loaded in.

It shouldn't matter what null treatment the underlying unique index has, 
since the primary key can't have nulls anyway, so either type of index 
should be acceptable.  But then we'd need to think through a bunch of 
possible ALTER behaviors.  For example, if we just change pg_dump and 
leave the index as is, a subsequent dump and restore would lose the 
original null treatment flag.  What if someone then wants to re-detach 
the constraint from the index?  (Does that exist now?  Maybe not, but it 
could.)  What should happen then?  This could all be worked out, I 
think, but it would need more thought.

In short, I think preventing the ALTER command, as proposed in this 
patch, seems like a good solution for the moment.  Additional work to 
enable some of this could follow later independently.




> On 16 Dec 2022, at 09:07, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>
> On 15.12.22 00:27, Tom Lane wrote:
>> Daniel Gustafsson <daniel@yesql.se> writes:
>>> The attached prohibits the use of NULLS NOT DISTINCT for backing primary key
>>> constraints but allow them for unique constraints.  Is this along the lines of
>>> what you had in mind?
>> Needs more than zero comments in the code, and why bother testing
>> is_alter_table in index_check_primary_key?  We're disallowing
>> this case across-the-board, no matter how you get to it.
>> I'll defer to Peter on whether this is in fact the right way to go,
>> or we should relax the syntax restriction as David suggests.
>
> My first instinct was to just fix pg_dump to not dump syntax that can't be loaded in.
>
> It shouldn't matter what null treatment the underlying unique index has, since the primary key can't have nulls
anyway,so either type of index should be acceptable.  But then we'd need to think through a bunch of possible ALTER
behaviors. For example, if we just change pg_dump and leave the index as is, a subsequent dump and restore would lose
theoriginal null treatment flag.  What if someone then wants to re-detach the constraint from the index?  (Does that
existnow?  Maybe not, but it could.)  What should happen then?  This could all be worked out, I think, but it would
needmore thought. 
>
> In short, I think preventing the ALTER command, as proposed in this patch, seems like a good solution for the moment.
Additional work to enable some of this could follow later independently. 

The attached removes the change from pg_dump and only prohibits the ALTER TABLE
command for attaching the index.  Since it will render dumps unable to be
restored I also added a check to pg_upgrade to cover the case.

--
Daniel Gustafsson        https://vmware.com/


Вложения
Daniel Gustafsson <daniel@yesql.se> writes:
> The attached removes the change from pg_dump and only prohibits the ALTER TABLE
> command for attaching the index.  Since it will render dumps unable to be
> restored I also added a check to pg_upgrade to cover the case.

That doesn't seem like a great answer.  I understood Peter to be
favoring both aspects of the previous patch.

            regards, tom lane



> On 16 Dec 2022, at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> The attached removes the change from pg_dump and only prohibits the ALTER TABLE
>> command for attaching the index.  Since it will render dumps unable to be
>> restored I also added a check to pg_upgrade to cover the case.
>
> That doesn't seem like a great answer.  I understood Peter to be
> favoring both aspects of the previous patch.

Oh, ok, if so then I misunderstood. No worries, v2 is then the one to consider.

--
Daniel Gustafsson        https://vmware.com/




Daniel Gustafsson <daniel@yesql.se> writes:
> Oh, ok, if so then I misunderstood. No worries, v2 is then the one to consider.

I'm good with v2.

            regards, tom lane



> On 16 Dec 2022, at 19:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Daniel Gustafsson <daniel@yesql.se> writes:
>> Oh, ok, if so then I misunderstood. No worries, v2 is then the one to consider.
>
> I'm good with v2.

Returning to an old thread that got sidetracked by christmas et.al.  Attached
is a rebased v4 based on the v2 patch above.  I'll attach this to the CF to get
the full CFbot treatment and if all is green I'll push this.

--
Daniel Gustafsson


Вложения