Обсуждение: SERIAL datatype column skipping values.
Hi all,
Please check the below behavior for the "SERIAL" datatype.
postgres=# CREATE TABLE t1(c1 int, c2 serial);
CREATE TABLE
postgres=# insert into t1 values (generate_series(1,3));
CREATE TABLE
postgres=# insert into t1 values (generate_series(1,3));
INSERT 0 3
postgres=# insert into t1 values (generate_series(4,6));
INSERT 0 3
postgres=# select * from t1;
c1 | c2
----+----
1 | 1
2 | 2
3 | 3
4 | 5
5 | 6
6 | 7
(6 rows)
postgres=# insert into t1 values (generate_series(4,6));
INSERT 0 3
postgres=# select * from t1;
c1 | c2
----+----
1 | 1
2 | 2
3 | 3
4 | 5
5 | 6
6 | 7
(6 rows)
In this above case, the serial column "c2" is skipping the value "4" in select output.
Is this an expected behavior?
With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com
On 3/11/20 11:15 AM, Prabhat Sahu wrote: > Hi all, > Please check the below behavior for the "SERIAL" datatype. > > [...] > > In this above case, the serial column "c2" is skipping the value "4" in > select output. > Is this an expected behavior? Curious, it seems like DEFAULT expressions of a table are executed an extra time if a set returning function is used like in your example. And the SERIAL type is implemented using DEFAULT. On the other hand if you use "INSERT ... SELECT" the DEFAULT expression is only executed once per row inserted. # CREATE FUNCTION test_default() RETURNS int LANGUAGE plpgsql AS $$ BEGIN RAISE NOTICE 'Ran test_default()'; RETURN 42; END $$; CREATE FUNCTION # CREATE TABLE t2 (c1 int, c2 int DEFAULT test_default()); CREATE TABLE # INSERT INTO t2 VALUES (generate_series(1,2)); NOTICE: Ran test_default() NOTICE: Ran test_default() NOTICE: Ran test_default() INSERT 0 2 # INSERT INTO t2 SELECT generate_series(1,2); NOTICE: Ran test_default() NOTICE: Ran test_default() INSERT 0 2 Andreas
Andreas Karlsson <andreas@proxel.se> writes: > On 3/11/20 11:15 AM, Prabhat Sahu wrote: >> Is this an expected behavior? > Curious, it seems like DEFAULT expressions of a table are executed an > extra time if a set returning function is used like in your example. And > the SERIAL type is implemented using DEFAULT. Yeah, it's the same as if you do regression=# select generate_series(1,2), test_default(); NOTICE: Ran test_default() NOTICE: Ran test_default() NOTICE: Ran test_default() generate_series | test_default -----------------+-------------- 1 | 42 2 | 42 (2 rows) The generated plan is regression=# explain verbose select generate_series(1,2), test_default(); QUERY PLAN ------------------------------------------------- ProjectSet (cost=0.00..0.28 rows=2 width=8) Output: generate_series(1, 2), test_default() -> Result (cost=0.00..0.01 rows=1 width=0) (3 rows) and if you read nodeProjectSet.c you'll see that it needs to evaluate the target list three times. On the third iteration, generate_series() returns isdone == ExprEndResult indicating that it has no more results, so we don't emit an output tuple --- but we still run test_default() while scanning the tlist. Possibly the planner should try to avoid putting volatile expressions into ProjectSet's tlist. On the other hand, it's worked this way for an awfully long time, so I wonder if anyone is relying on the behavior. Even in versions before we used ProjectSet nodes, you still see three calls to the volatile function. Anyway, to get back to the OP's implied question, no you should never assume that a SERIAL column's values won't have holes in the sequence. Rolled-back transactions will have that effect in any case. regards, tom lane