Обсуждение: Allow composite foreign keys to reference a superset of unique constraint columns?

Поиск
Список
Период
Сортировка

Allow composite foreign keys to reference a superset of unique constraint columns?

От
Paul Martinez
Дата:
Hey, hackers!

While working with some foreign keys, I noticed some mildly unexpected
behavior. The columns referenced by a unique constraint must naturally
have a unique constraint on them:

CREATE TABLE foo (a integer);
CREATE TABLE bar (x integer REFERENCES foo(a));
> ERROR:  there is no unique constraint matching given keys for referenced
  table "foo"

But Postgres doesn't allow a foreign key to reference a set of columns
without a unique constraint, even if there a unique constraint on a
subset of those columns (i.e., it doesn't allow referencing a superset
of a unique constraint).

CREATE TABLE foo (a integer PRIMARY KEY, b integer);
CREATE TABLE bar (x integer, y integer, FOREIGN KEY (x, y) REFERENCES
foo(a, b));
> ERROR:  there is no unique constraint matching given keys for referenced
  table "foo"

It seems to me like there would be nothing wrong in this case to allow this
foreign key constraint to exist. Because there is a unique constraint on foo(a),
foo(a, b) will also be unique. And it doesn't seem like it would be too complex
to implement.

Neither MATCH SIMPLE nor MATCH FULL constraints would have any issues
with this. MATCH PARTIAL may, but, alas, it's not implemented. (I've had
a few ideas about foreign keys, and MATCH PARTIAL seems to always come
up, and I still don't understand what its use case is.)




A real-world use case that uses denormalization could run into this. Imagine a
naive music database that has a list of artists, albums, and songs, where each
album is by one artist and each song is on one album, but we still store a
reference to the artist on each song:

CREATE TABLE artists (id serial PRIMARY KEY, name text);
CREATE TABLE albums (id serial PRIMARY KEY, artist_id REFERENCES
artists(id) name text);
CREATE TABLE songs (
  id serial PRIMARY KEY,
  artist_id REFERENCES artists(id) ON DELETE CASCADE,
  album_id REFERENCES albums(id) ON DELETE CASCADE,
  name text,
);

To ensure that artist deletions are fast, we need to create an index on
songs(artist_id) and songs(album_id). But, suppose we wanted to save on index
space, and we never needed to query JUST by album_id. We could then do:

CREATE TABLE songs (
  id serial PRIMARY KEY,
  artist_id REFERENCES artists(id) ON DELETE CASCADE,
  album_id integer,
  name text,
  FOREIGN KEY (artist_id, album_id) REFERENCES albums(artist_id, id)
ON DELETE CASCADE
);

And then we could have a single index on songs(artist_id, album_id) that would
serve both ON CASCADE DELETE triggers:

-- Delete artist
DELETE FROM songs WHERE artist_id = <artist.id>;
-- Delete artist
DELETE FROM songs
  WHERE artist_id = <album.artist_id> AND album_id = <album.id>;

But Postgres wouldn't let us create the composite foreign key described.



It seems like a somewhat useful feature. If people think it would be useful to
implement, I might take a stab at it when I have time.

- Paul



Re: Allow composite foreign keys to reference a superset of unique constraint columns?

От
"David G. Johnston"
Дата:
On Mon, Aug 16, 2021 at 4:37 PM Paul Martinez <hellopfm@gmail.com> wrote:

It seems like a somewhat useful feature. If people think it would be useful to
implement, I might take a stab at it when I have time.


This doesn't seem useful enough for us to be the only implementation to go above and beyond the SQL Standard's specification for the references feature (I assume that is what this proposal suggests).

This example does a good job of explaining but its assumptions aren't that impactful and thus isn't that good at inducing desirability.

David J.

Re: Allow composite foreign keys to reference a superset of unique constraint columns?

От
Jack Christensen
Дата:
On Mon, Aug 16, 2021 at 7:01 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Aug 16, 2021 at 4:37 PM Paul Martinez <hellopfm@gmail.com> wrote:

It seems like a somewhat useful feature. If people think it would be useful to
implement, I might take a stab at it when I have time.


This doesn't seem useful enough for us to be the only implementation to go above and beyond the SQL Standard's specification for the references feature (I assume that is what this proposal suggests).

This example does a good job of explaining but its assumptions aren't that impactful and thus isn't that good at inducing desirability.
 

I have no opinion on the broader concerns about this proposed feature, but speaking simply as a user I have wanted this on multiple occasions. In my case, it is usually because of the need to maintain consistency in a diamond table relationship. For example:

create table tenants (
  id serial primary key
);

create table users (
  id serial primary key,
  tenant_id int references tenants
);

create table user_groups (
  id serial primary key,
  tenant_id int references tenants
);

create table user_group_memberships (
  tenant_id int,
  user_id int,
  user_group_id,
  primary key (user_id, user_group_id),
  foreign key (user_id, tenant_id) references users (id, tenant_id),
  foreign key (user_group_id, tenant_id) references user_groups (id, tenant_id)
);

The only way to ensure a user can only be a member of a group in the same tenant is to user_group_memberships.tenant_id be part of the foreign key. And that will only work with a unique key on id and tenant_id in both users and user_groups. It's a bit inelegant to create multiple extra indexes to ensure consistency when existing indexes are enough to ensure uniqueness.

Jack

Re: Allow composite foreign keys to reference a superset of unique constraint columns?

От
Paul Martinez
Дата:
On Tue, Aug 17, 2021 at 8:41 AM Jack Christensen <jack@jncsoftware.com> wrote:
>
> The only way to ensure a user can only be a member of a group in the same
> tenant is to user_group_memberships.tenant_id be part of the foreign key. And
> that will only work with a unique key on id and tenant_id in both users and
> user_groups. It's a bit inelegant to create multiple extra indexes to ensure
> consistency when existing indexes are enough to ensure uniqueness.
>
> Jack

You could accomplish this by using composite primary keys on the users and
user_groups tables:

CREATE TABLE users (
  id serial,
  tenant_id int REFERENCES tenants(id),
  PRIMARY KEY (tenant_id, id)
);

This approach works pretty well for multi-tenant databases, because then your
indexes all start with tenant_id, which should help with performance, and, in
theory, would make your database easier to shard. But then it requires
including a tenant_id in *every* query (and subquery!), which may be difficult
to enforce in a codebase.

One downside of the composite primary/foreign key approach is that ON DELETE
SET NULL foreign keys no longer work properly because they try to set both
columns to NULL, the true foreign key id, AND the shared tenant_id that is part
of the referencing table's primary key. I have a patch [1] out to add new
functionality to solve this problem though.

- Paul

[1]:
https://www.postgresql.org/message-id/flat/CACqFVBZQyMYJV%3DnjbSMxf%2BrbDHpx%3DW%3DB7AEaMKn8dWn9OZJY7w%40mail.gmail.com



Re: Allow composite foreign keys to reference a superset of unique constraint columns?

От
Laurenz Albe
Дата:
On Tue, 2021-08-17 at 10:45 -0700, Paul Martinez wrote:
> On Tue, Aug 17, 2021 at 8:41 AM Jack Christensen <jack@jncsoftware.com> wrote:
> > The only way to ensure a user can only be a member of a group in the same
> > tenant is to user_group_memberships.tenant_id be part of the foreign key. And
> > that will only work with a unique key on id and tenant_id in both users and
> > user_groups. It's a bit inelegant to create multiple extra indexes to ensure
> > consistency when existing indexes are enough to ensure uniqueness.
> 
> You could accomplish this by using composite primary keys on the users and
> user_groups tables:
> 
> CREATE TABLE users (
>   id serial,
>   tenant_id int REFERENCES tenants(id),
>   PRIMARY KEY (tenant_id, id)
> );

That is not a proper solution, because it does not guarantee uniqueness of
the "id" column, which is typically what you want.

So I think Jack's example illustrates the benefit of this proposal well.

On the other hand, the SQL standard requires that a foreign key references
a unique constraint, see chapter 11.8 <referential constraint definition>,
Syntax Rules 3) a):

 "If the <referenced table and columns> specifies a <reference column list>,
  then there shall be a one-to-one correspondence between the set of
  <column name>s contained in that <reference column list> and the set of
  <column name>s contained in the <unique column list> of a unique constraint
  of the referenced table such that corresponding <column name>s are equivalent."

So while I personally agree that the proposed feature is useful, I am not
sure if it is useful enough to break the standard in a way that may be
incompatible with future extensions of the standard.

Yours,
Laurenz Albe