Discussion: unique constraint with several null values
I have a relation such as
create table order_item (
    id uuid not null primary key,
    order_id numeric not null,
    item_code text,
    make text,
    model text,
    reason text,
    size text,
    expiration_date timestamp
);
where the combination of the columns order_id, item_code, make, model, reason, size must be unique (unless there's an expiration date).
I'm inclined to use a unique index:
create unique index unique_index_order_item_1 on order_item (order_id, item_code, make, model, reason, size)
where expiration_date is null;
This works as expected, and the duplicate row is rejected:
insert into order_item values ( 1, 88, 'CODE', 'MAKE', 'MODEL', 'REASON', 'SIZE', null); <- first adds
insert into order_item values ( 1, 88, 'CODE', 'MAKE', 'MODEL', 'REASON', 'SIZE', null); <- rejects
However, NULLs are allowed for all the columns except order_id, so when I insert a NULL value it fails to meet my expectations:
insert into order_item values ( 1, 88, 'CODE', 'MAKE', null, 'REASON', 'SIZE', null); <- first adds
insert into order_item values ( 1, 88, 'CODE', 'MAKE', null, 'REASON', 'SIZE', null); <- adds, but should reject.
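This behavior follows from SQL's NULL semantics: a unique (or partial unique) index compares rows with the = operator, and NULL = NULL yields NULL rather than TRUE, so two rows that differ only in sharing a NULL are both accepted. A minimal, self-contained reproduction (table and column names here are illustrative, not from the thread):

```sql
-- NULLs are treated as distinct by a unique index, because NULL = NULL
-- evaluates to NULL, not TRUE.
create table t (a int, b text);
create unique index t_a_b on t (a, b);

insert into t values (1, 'x');   -- accepted
insert into t values (1, 'x');   -- rejected: duplicate key
insert into t values (2, null);  -- accepted
insert into t values (2, null);  -- also accepted: the two NULLs are not "equal"
```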
This leads me to think I need to create 2^5 = 32 unique constraints to handle the various combinations of data that I can store. Until now, this integrity has been handled in the application code. That breaks down when the application is multi-threaded and the rules are not enforced at the database level.
Another solution I can think of is to just use a trigger to prevent the duplicate rows.
Any thoughts are certainly appreciated. I can't do much about the data model itself right now, I need to protect the integrity of the data.
Thanks!
-mark-
Experimenting using 9.6
You may or may not find this helpful...
DROP TABLE IF EXISTS order_item;
DROP TYPE IF EXISTS base_order_item;
CREATE TYPE base_order_item AS (id int, order_id numeric, item_code text, make text, model text, reason text);
create table order_item (base_item base_order_item, expiration_date timestamp,
exclude (base_item with =) where (expiration_date is null)
);
insert into order_item VALUES ( (1, 88, 'CODE', 'MAKE', null, 'REASON')::base_order_item, null); -- adds
insert into order_item VALUES ( (1, 88, 'CODE', 'MAKE', null, 'REASON')::base_order_item, null); -- rejected by the exclusion constraint
I could not figure out a way to specify an arbitrary record type within the exclusion constraint - which is kinda what you want - though it seems to have its own issues...
SELECT ROW(1, 88, 'CODE', 'MAKE', null, 'REASON')
     = ROW(1, 88, 'CODE', 'MAKE', null, 'REASON');
-- Yields NULL

SELECT ROW(1, 88, 'CODE', 'MAKE', null, 'REASON')::base_order_item
     = ROW(1, 88, 'CODE', 'MAKE', null, 'REASON')::base_order_item;
-- Yields TRUE
Though:

SELECT ROW(1, 88, 'CODE', 'MAKE', null, 'REASON')
  IS DISTINCT FROM ROW(1, 88, 'CODE', 'MAKE', null, 'REASON');
-- Yields FALSE
So I suppose using IS DISTINCT FROM within a trigger is going to be your least cumbersome option.
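A sketch of that trigger-based approach, using IS NOT DISTINCT FROM so that NULLs compare as equal. The function and trigger names are illustrative, not from the thread, and as noted elsewhere in this discussion a trigger alone still leaves a race window between concurrent transactions unless you add serializable isolation or explicit locking:

```sql
-- Illustrative trigger that rejects "duplicate" unexpired rows,
-- treating NULLs as equal via IS NOT DISTINCT FROM.
create or replace function order_item_reject_dup() returns trigger as $$
begin
  if exists (
    select 1
    from order_item o
    where o.order_id = new.order_id
      and o.item_code is not distinct from new.item_code
      and o.make      is not distinct from new.make
      and o.model     is not distinct from new.model
      and o.reason    is not distinct from new.reason
      and o.size      is not distinct from new.size
      and o.expiration_date is null
      and o.id <> new.id
  ) then
    raise exception 'duplicate order_item for order_id %', new.order_id;
  end if;
  return new;
end;
$$ language plpgsql;

create trigger order_item_no_dups
  before insert or update on order_item
  for each row
  when (new.expiration_date is null)
  execute procedure order_item_reject_dup();
```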
Because exclusion constraints and indexes are operator-based, in those contexts you are stuck with two NULL values being considered unequal.
David J.
On Wed, Jul 20, 2016 at 1:48 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Wed, Jul 20, 2016 at 2:14 PM, Mark Lybarger <mlybarger@gmail.com> wrote:
>> Another solution I can think of is to just use a trigger to
>> prevent the duplicate rows.

If you go that route you will need to use serializable transactions, explicit locking, or a trigger-based update of some otherwise-unneeded column to avoid race conditions. See:

https://www.postgresql.org/docs/current/static/mvcc.html

That can be a perfectly valid option as long as you handle race conditions somehow.

>> Any thoughts are certainly appreciated. I can't do much about
>> the data model itself right now, I need to protect the integrity
>> of the data.

Rather than unique constraints, you could add a unique index on the COALESCE of each column with some impossible value.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
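Applied to the original table, Kevin's COALESCE suggestion might look like the following. The '--NULL--' sentinel is an assumption; it must be a value that can never occur in real data:

```sql
-- order_id is NOT NULL, so only the nullable columns need COALESCE.
-- With every NULL mapped to the same sentinel, two rows that differ
-- only in sharing NULLs now collide in the index.
create unique index unique_order_item_coalesced on order_item (
    order_id,
    coalesce(item_code, '--NULL--'),
    coalesce(make,      '--NULL--'),
    coalesce(model,     '--NULL--'),
    coalesce(reason,    '--NULL--'),
    coalesce(size,      '--NULL--')
) where expiration_date is null;
```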
Mark Lybarger <mlybarger@gmail.com> writes:

> I have a relation such as
> create table order_item ( id uuid not null primary key, order_id numeric not null, item_code text, make text, model text, reason text, size text, expiration_date timestamp );
> where the combination of the columns order_id, item_code, make, model, reason, size must be unique (unless there's an expiration date).
>
> I'm inclined to use a unique index:
>
> create unique index unique_index_order_item_1 on order_item (order_id, item_code, make, model, reason, size)
> where expiration_date is null;
>
> this works as expected and the duplicate row is rejected:
> insert into order_item values ( 1, 88, 'CODE', 'MAKE','MODEL','REASON', 'SIZE', null); <- first adds
> insert into order_item values ( 1, 88, 'CODE', 'MAKE','MODEL','REASON', 'SIZE', null); <- rejects
>
> however, nulls are allowed for all the columns except the order_id. so, when I add a null value, it fails to meet my expectations,

sj$ cat q
begin;
create table foo (a int, b int);
create unique index foox on foo (coalesce(a, 0), coalesce(b, 0));
insert into foo select 1, null;
insert into foo select 1, null;

sj$ psql -f q
SET
BEGIN
CREATE TABLE
CREATE INDEX
INSERT 0 1
psql:q:7: ERROR:  duplicate key value violates unique constraint "foox"
DETAIL:  Key ((COALESCE(a, 0)), (COALESCE(b, 0)))=(1, 0) already exists.
sj$

> insert into order_item values ( 1, 88, 'CODE', 'MAKE',null,'REASON', 'SIZE', null); <- first adds
> insert into order_item values ( 1, 88, 'CODE', 'MAKE',null,'REASON', 'SIZE', null); <- adds, but should reject.
>
> This leads me to think I need to create 2^5 or 32 unique constraints to handle the various combinations of data that I can store. Until now, this integrity is handled in the application code. That breaks when the application is multi-threaded and the rules are not applied at the database level.
>
> Another solution I can think of is to just use a trigger to prevent the duplicate rows.
>
> Any thoughts are certainly appreciated. I can't do much about the data model itself right now, I need to protect the integrity of the data.
>
> Thanks!
> -mark-

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@comcast.net
p: 312.241.7800
On Wed, Jul 20, 2016 at 8:14 PM, Mark Lybarger <mlybarger@gmail.com> wrote:
> I have a relation such as
> create table order_item ( id uuid not null primary key, order_id numeric not null, item_code text, make text, model text, reason text, size text, expiration_date timestamp );
> where the combination of the columns order_id, item_code, make, model, reason, size must be unique (unless there's an expiration date).
>
> I'm inclined to use a unique index:
>
> create unique index unique_index_order_item_1 on order_item (order_id, item_code, make, model, reason, size)
> where expiration_date is null;
You can also:
create unique index unique_index_order_item_1 on order_item (
    coalesce(order_id, -1),
    coalesce(item_code, '--NULL--'),
    coalesce(make, '--NULL--'),
    coalesce(model, '--NULL--'),
    coalesce(reason, '--NULL--'),
    coalesce(size, '--NULL--'),
    coalesce(expiration_date, '1700-01-01 00:00:00')
);
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
On 7/20/16 1:14 PM, Mark Lybarger wrote:
> This leads me to think I need to create 2^5 or 32 unique constraints to
> handle the various combinations of data that I can store.

Another option would be to create a unique index on a bit varying field that sets a bit to true for each field that was NULL, WHERE <bit varying field> != 0. Let me know if you want to go that route; I could probably add that to http://pgxn.org/dist/count_nulls/ without much difficulty.

Though probably a better way to accomplish that would be to add a function to count_nulls that spits out an array of the fields that are NULL; you could then do a unique index on that WHERE array != array[].

Maybe a less obtuse option would be to use a boolean array. Storage would be ~8x larger, but since there should be very few rows I doubt that matters.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461