Обсуждение: partitioning / rules - strange behavior

Поиск
Список
Период
Сортировка

partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
Hi,

we're running PostgreSQL 8.1.4 and I've encountered some strange
problems with partitioning / rules, and I really don't know how to
fix it.

We have a table 'sessions' partitioned along the 'id' column which is a
primary key. Each partition holds at most 500.000 rows, i.e. partition
'sessions_0' holds rows with ids 0 - 499.999, 'sessions_500000' holds
rows with ids 500.000 - 999.999 and so on. We're currently using
partition 'sessions_8500000'.

I think the structure of the table is not important here, but there's
nothing special about it - several columns, some foreign keys etc. There
are no triggers on it or on the partitions.

Inserts are redirected into the correct partition by RULES on the
'sessions' table, i.e.

CREATE OR REPLACE RULE insert_8500000 AS ON INSERT TO sessions WHERE
(id BETWEEN 8500000 AND 8999999) DO INSTEAD INSERT INTO sessions_8500000
(... columns here ...) VALUES (... NEW.columns ...);

The problem is that once I create a rule for the next partition (it
already exists), it simply stops working. It seems AS if the INSERT is
succesfully redirected into the correct partition, but no data are inserted.

So for example when I create a rule

CREATE OR REPLACE RULE insert_9000000 AS ON INSERT TO sessions WHERE
(id BETWEEN 9000000 AND 9499999) DO INSTEAD INSERT INTO sessions_9000000
(... columns here ...) VALUES (... NEW.columns ...);

the rule insert_8500000 does not work anymore. When I insert a row with
id between 8500000 and 8999999 it prints 'INSERT 0 0' as usual but when
I try to fetch it no rows are found. That is

db=> INSERT INTO sessions(... non-pk columns ...) VALUES (... data ...);
INSERT 0 0
db=> SELECT * FROM sessions WHERE id = currval('sessions_id_seq');
(0 rows)

which seems really strange to me. We're using sequence to generate the
ids, but that shouldn't be a problem (at least it was not till today).

All these problems started when one of our stored procedures used for a
maintenance crashed. This procedure checks the currently used partition,
checks if the 'next one' exists and creates the partition & rules / drop
the old rules if needed. So there are some dynamic SQL, basically

  EXECUTE 'CREATE TABLE ...';
  EXECUTE 'CREATE RULE ...';
  EXECUTE 'CREATE INDEX ...';
  EXECUTE 'DROP RULE ...';

I did some changes to this procedure recently and a stupid mistype in
one of the SQL commands caused a runtime EXCEPTION yesterday. It created
the next partition (sessions_9000000), and then failed because of the
mistype, and this is the moment the problems described above began.

This is really strange, as I thought all the procedures are running as a
transaction, so a failure shouldn't cause such problems. I've fixed the
mistype and rerun the procedure (succesfully) but it didn't help.

I've tried to recreate the rules (drop / create) manually, but still no
change. But when I drop the new rule (insert_9000000) it works fine, and
once I create it again it stops working.

What I suspect is the failure in the stored procedure did something
wrong to the catalogs but maybe I'm completely wrong. Anyway I really
don't know how to fix it.

thanks for all advices
Tomas




Re: partitioning / rules - strange behavior

От
"Greg Sabino Mullane"
Дата:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> db=> INSERT INTO sessions(... non-pk columns ...) VALUES (... data ...);
> INSERT 0 0
> db=> SELECT * FROM sessions WHERE id = currval('sessions_id_seq');
> (0 rows)
>
> which seems really strange to me. We're using sequence to generate the
> ids, but that shouldn't be a problem (at least it was not till today).

Looks to me as though you have forgottent the table inheritance part of the
table partitioning trick. Are you sure that sessions_900000 inherit from
the sessions table? Does a direct count from the table referenced by the
rule return a non-zero count? In other words:

SELECT count(*) FROM sessions_900000 where id = currval('sessions_id_seq');

(or some other similar table) may show your "missing" rows. The other
possibility is that the you are not using sessions_id_seq in the way
you think you are.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200702041952
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFFxoDWvJuQZxSWSsgRAxqLAKDNedJ2pPg9otabsrinS1002SL0XQCeMwyF
8dqMxeB/x4XoZoj2WATZyPc=
=mvMI
-----END PGP SIGNATURE-----



Re: partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
> Looks to me as though you have forgottent the table inheritance part of the
> table partitioning trick. Are you sure that sessions_900000 inherit from
> the sessions table? Does a direct count from the table referenced by the
> rule return a non-zero count? In other words:
>
> SELECT count(*) FROM sessions_900000 where id = currval('sessions_id_seq');

No, the tables are inherited, i.e. there's something like

CREATE TABLE sessions_8000000 (
  CHECK (id BETWEEN 8000000 AND 8499999
) INHERITS (sessions);

I've checked this for all the partitions (we're using three right now)
as well as the direct count (returns nothing).

As I already said - it works until I create a rule for the next
partition (not used yet) - then is suddenly stops working.

> (or some other similar table) may show your "missing" rows. The other
> possibility is that the you are not using sessions_id_seq in the way
> you think you are.

What do you mean? I do a 'SELECT currval(...)' after the insert and it
gives me the right value. It works until I create the next RULE, so this
does not seem as a sequence problem.

Tomas

Re: partitioning / rules - strange behavior

От
Tom Lane
Дата:
Tomas Vondra <tv@fuzzy.cz> writes:
> As I already said - it works until I create a rule for the next
> partition (not used yet) - then is suddenly stops working.

Suddenly stops working, or continues doing exactly what it did before?
I'm wondering if you are relying on a cached plan that doesn't include
the new rule.

            regards, tom lane

Re: partitioning / rules - strange behavior

От
tv@fuzzy.cz
Дата:
> Suddenly stops working, or continues doing exactly what it did before?
> I'm wondering if you are relying on a cached plan that doesn't include
> the new rule.
>
>             regards, tom lane
>

If there´s only the insert_8500000 RULE then everything works as expected - the
insert prints "INSERT 0 0", the row is inserted into the correct partition
which is sessions_8500000 - I can fetch it using either

SELECT * FROM sessions WHERE id = currval('sessions_id_seq');

or direcly by

SELECT * FROM sessions_8500000 WHERE id = currval('sessions_id_seq');

When I create the next next rule (insert_9000000 for ids between 9000000 and
9499999) it stops working - it prints "INSERT 0 0" just as before, everything
seems fine, but the row disappears - it's not available .

I'm not sure about the query plans, but I think I've checked that and everything
seemed ok - all the partitions were used as far as I remember. But this
shouldn't be a problem as we have not reached the 9000000 limit yet (so the new
partition is not used at all). And we've tried to restart the PostgreSQL as the
last hope, yesterday, so there really should be no old plans.

I don't have an access to the production database (I have not been able to
simulate this on the development/testing system) - I'll play with that at night
(european time). I'll try to drop / recreate the partition (I've tried to
recreate only the RULEs, not the partitions).

Tomas

Re: partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
> If there´s only the insert_8500000 RULE then everything works as expected - the
> insert prints "INSERT 0 0", the row is inserted into the correct partition
> which is sessions_8500000 - I can fetch it using either
>
> SELECT * FROM sessions WHERE id = currval('sessions_id_seq');
>
> or direcly by
>
> SELECT * FROM sessions_8500000 WHERE id = currval('sessions_id_seq');
>
> When I create the next next rule (insert_9000000 for ids between 9000000 and
> 9499999) it stops working - it prints "INSERT 0 0" just as before, everything
> seems fine, but the row disappears - it's not available .
>
> I'm not sure about the query plans, but I think I've checked that and everything
> seemed ok - all the partitions were used as far as I remember. But this
> shouldn't be a problem as we have not reached the 9000000 limit yet (so the new
> partition is not used at all). And we've tried to restart the PostgreSQL as the
> last hope, yesterday, so there really should be no old plans.
>
> I don't have an access to the production database (I have not been able to
> simulate this on the development/testing system) - I'll play with that at night
> (european time). I'll try to drop / recreate the partition (I've tried to
> recreate only the RULEs, not the partitions).
>
> Tomas

OK, I did some tests a while ago and the bad news is I still was not
able to fix it. The table structure is this

=======================================================================

db=> \d sessions
                                       Table "public.sessions"
     Column      |            Type             |
   Modifiers
------------------+-----------------------------+---------------------
id               | integer                     | not null default
nextval('sessions_id_seq'::regclass)
browser_id       | integer                     |
os_id            | integer                     |
arch_id          | integer                     |
language_id      | character(2)                |
country_id       | character(2)                |
visitor_id       | integer                     | not null
ip               | inet                        | not null
ip_forward       | inet                        |
session_date     | timestamp without time zone | not null default now()
user_agent       | character varying(255)      |
screen_width     | smallint                    |
screen_height    | smallint                    |
screen_bit_depth | smallint                    |
javascript       | boolean                     | default false
browser_minor    | character varying(16)       |
browser_major    | character varying(16)       |
referer          | text                        |
last_action      | integer                     | not null default 0
Indexes:
   "sessions_pkey" PRIMARY KEY, btree (id)
Check constraints:
   ... some foreign keys, not important here ...
Rules:
   insert_8500000 AS
   ON INSERT TO sessions
  WHERE new.id >= 8500000 AND new.id <= 8999999 DO INSTEAD  INSERT INTO
sessions_8500000 (id, browser_id, os_id, arch_id, language_id,
country_id, visitor_id, ip, ip_forward, session_date, user_agent,
screen_width, screen_height, screen_bit_depth, javascript,
browser_minor, browser_major, referer, last_action)
 VALUES (new.id, new.browser_id, new.os_id, new.arch_id,
new.language_id, new.country_id, new.visitor_id, new.ip, new.ip_forward,
new.session_date, new.user_agent, new.screen_width, new.screen_height,
new.screen_bit_depth, new.javascript, new.browser_minor,
new.browser_major, new.referer, new.last_action)

=======================================================================

We're using sequence to generate the sessions(id) value, but that should
not be a problem - with the structure / rules everything works fine (the
current value in sessions_id_seq is about 8700000 so the values are
inserted into the sessions_8500000 partition).

The I create the 'next partition' for values between 9000000 and 9499999
 using

=======================================================================

   CREATE TABLE sessions_9000000 (
     CHECK (id BETWEEN 9000000 AND 9499999),
     PRIMARY KEY (id)
   ) INHERITS (sessions);

=======================================================================

and everything still seems fine, even the execution plans reflect this
new child table:

=======================================================================

db=> explain select * from sessions;
                     QUERY PLAN
----------------------------------------------------------------------
Result  (cost=0.00..52262.48 rows=1052924 width=775)
  ->  Append  (cost=0.00..52262.48 rows=1052924 width=775)
        ->  Seq Scan on sessions  (cost=0.00..12.00 rows=100 width=775)
        ->  Seq Scan on sessions_8000000 sessions  (cost=0.00..23128.78
rows=500539 width=280)
        ->  Seq Scan on sessions_8500000 sessions  (cost=0.00..6147.60
rows=51230 width=775)
        ->  Seq Scan on sessions_9000000 sessions  (cost=0.00..12.00
rows=100 width=775)

=======================================================================

but one I create a RULE for the new partition, thing go wrong. That is I
execute this (I ommited the list of columns)

=======================================================================

CREATE RULE insert_9000000 AS ON INSERT TO sessions WHERE (id BETWEEN
9000000 AND 9499999) DO INSTEAD INSERT INTO sessions_9000000 ( ... all
the columns in sessions) VALUES ( ... all the columns in sessions
prefixed with 'NEW' ...);

=======================================================================

Now when I do for example

=======================================================================

INSERT INTO sessions(id,visitor_id,ip) VALUES (8900000,0,'127.0.0.1');

=======================================================================

this new row should be inserted into the session_8500000 partition as
the 8900000 is clearly between 8500000 AND 8999999. It even seems
succesfully inserted (no exception, returns INSERT 0 0 as usual), but
once I do

   SELECT * FROM sessions WHERE id = 8900000

it returns no rows. Even

   SELECT * FROM sessions_8500000 WHERE id = 8900000

returns no rows. Here is the execution plan for the INSERT (the
execution plan for the SELECT can be found above).

=======================================================================

db=> EXPLAIN ANALYZE INSERT INTO sessions(id,visitor_id,ip) VALUES
(8900000,0,'127.0.0.1');
                                     QUERY PLAN
----------------------------------------------------------------------
Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.002..0.002
rows=0 loops=1)
  One-Time Filter: ((true IS NOT TRUE) AND (false IS NOT TRUE))
Total runtime: 0.063 ms

Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.012..0.013
rows=1 loops=1)
... some triggers for foreign keys on 'sessions' ...
Total runtime: 0.209 ms

Result  (cost=0.00..0.03 rows=1 width=0) (actual time=0.001..0.001
rows=0 loops=1)
  One-Time Filter: false
Total runtime: 0.052 ms
(15 rows)

====================================================================

The 'funny' thing is once I drop that new rule (insert_9000000) it
starts working again.

I really don't know how to solve this - today I've tried to drop /
recreate the new _9000000 partitions (which are still empty) but no
luck. Tomorrow I'll restore a fresh backup on a development system, and
try if it 'works' in the same way.

Maybe I'm missing something, but I see nothing wrong in the partitions
and rules. I've taken the current backup (taken at night) and loaded
that at the development system - everything works exactly as expected
with exactly the same set-up. BTW we're using PostgreSQL 8.1.4 (on
Linux) on both machines.

Thanks in advance for all your advices how to fix this, optimally with
as little downtime as possible.

Tomas

Re: partitioning / rules - strange behavior

От
Bernd Helmle
Дата:

On Tue, 06 Feb 2007 21:28:49 +0100, Tomas Vondra <tv@fuzzy.cz> wrote:

>
> We're using sequence to generate the sessions(id) value, but that should
> not be a problem - with the structure / rules everything works fine (the
> current value in sessions_id_seq is about 8700000 so the values are
> inserted into the sessions_8500000 partition).

It's likely to be a problem because of multiple evaluations of volatile expressions
in the rule rewrite system....short example:

CREATE TABLE sessions(id SERIAL PRIMARY KEY, value TEXT);    ^
CREATE TABLE sessions_100(CHECK(id BETWEEN 1 AND 100), PRIMARY KEY(id))
INHERITS(sessions);
CREATE TABLE sessions_200(CHECK(id BETWEEN 101 AND 200), PRIMARY KEY(id))
INHERITS(sessions);

CREATE OR REPLACE RULE insert_100
AS ON INSERT TO sessions
WHERE NEW.id BETWEEN 1 AND 100
DO INSTEAD
INSERT INTO sessions_100(id, value) VALUES(NEW.id, NEW.value);

CREATE OR REPLACE RULE insert_200
AS ON INSERT TO sessions
WHERE NEW.id BETWEEN 101 AND 200
DO INSTEAD
INSERT INTO sessions_200(id, value) VALUES(NEW.id, NEW.value);

INSERT INTO sessions(value) VALUES('bernd');

SELECT * FROM sessions;
 id | value
----+-------
  5 | bernd
(1 row)

but...

SELECT * FROM sessions WHERE id = currval('sessions_id_seq');
 id | value
----+-------
(0 rows)

SELECT currval('sessions_id_seq');
 currval
---------
       6
(1 row)

[...]

>
> Now when I do for example
>
> =======================================================================
>
> INSERT INTO sessions(id,visitor_id,ip) VALUES (8900000,0,'127.0.0.1');
>
> =======================================================================
>
> this new row should be inserted into the session_8500000 partition as
> the 8900000 is clearly between 8500000 AND 8999999. It even seems
> succesfully inserted (no exception, returns INSERT 0 0 as usual), but
> once I do
>
>    SELECT * FROM sessions WHERE id = 8900000
>
> it returns no rows. Even
>
>    SELECT * FROM sessions_8500000 WHERE id = 8900000
>
> returns no rows. Here is the execution plan for the INSERT (the
> execution plan for the SELECT can be found above).
>

Maybe i'm missing something, but with constant values i'm not able
to reproduce this in my example above:

INSERT INTO sessions VALUES(200, 'xyz');
SELECT * FROM sessions_200 WHERE id = 200;
 id  | value
-----+-------
 200 | xyz
(1 row)

INSERT INTO sessions VALUES(87, 'xyz');
SELECT * FROM sessions_100 WHERE id = 87;
 id | value
----+-------
 87 | xyz
(1 row)


Bernd

Re: partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
> It's likely to be a problem because of multiple evaluations of volatile expressions
> in the rule rewrite system....short example:
>
> CREATE TABLE sessions(id SERIAL PRIMARY KEY, value TEXT);    ^
> CREATE TABLE sessions_100(CHECK(id BETWEEN 1 AND 100), PRIMARY KEY(id))
> INHERITS(sessions);
> CREATE TABLE sessions_200(CHECK(id BETWEEN 101 AND 200), PRIMARY KEY(id))
> INHERITS(sessions);
>
> CREATE OR REPLACE RULE insert_100
> AS ON INSERT TO sessions
> WHERE NEW.id BETWEEN 1 AND 100
> DO INSTEAD
> INSERT INTO sessions_100(id, value) VALUES(NEW.id, NEW.value);
>
> CREATE OR REPLACE RULE insert_200
> AS ON INSERT TO sessions
> WHERE NEW.id BETWEEN 101 AND 200
> DO INSTEAD
> INSERT INTO sessions_200(id, value) VALUES(NEW.id, NEW.value);
>
> INSERT INTO sessions(value) VALUES('bernd');
>
> SELECT * FROM sessions;
>  id | value
> ----+-------
>   5 | bernd
> (1 row)
>
> but...
>
> SELECT * FROM sessions WHERE id = currval('sessions_id_seq');
>  id | value
> ----+-------
> (0 rows)
>
> SELECT currval('sessions_id_seq');
>  currval
> ---------
>        6
> (1 row)

Ouch! I've never noticed this behavior! When I do

db=> SELECT currval('sessions_id_seq');
 currval
---------
       6

db=> INSERT INTO sessions(value) VALUES('bernd');
INSERT 0 0

db=> SELECT currval('sessions_id_seq');
 currval
---------
       12

I'll check if this is the reason why it works on the development system
and not on the production.

But I don't understand why the nextval('sessions_id_seq') is evaluated
multiple times? Even when I do

INSERT INTO sessions(id,value) VALUES(nextval('sessions_id_seq','x');

it calls sessions_id_seq several times. I'll fix it by first fetching
the ID and then using it as a constant value in the INSERT, but I'd like
to know the reason why it works this way.

Tomas

Re: partitioning / rules - strange behavior

От
Martijn van Oosterhout
Дата:
On Wed, Feb 07, 2007 at 08:09:57PM +0100, Tomas Vondra wrote:
> But I don't understand why the nextval('sessions_id_seq') is evaluated
> multiple times? Even when I do

RULE expand like macros, so every expression passed in is simply copied
to each place, leading to multiple evaluations.

If you've played with the C preprocessor: that has the same basic
problem.

It's rather unexpected for novices, which is why triggers are usually
recommended.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Вложения

Re: partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
>> But I don't understand why the nextval('sessions_id_seq') is evaluated
>> multiple times? Even when I do
>
> RULE expand like macros, so every expression passed in is simply copied
> to each place, leading to multiple evaluations.
>
> If you've played with the C preprocessor: that has the same basic
> problem.
>
> It's rather unexpected for novices, which is why triggers are usually
> recommended.

I wouldn't call myself a novice, but yes - this surprised me a little.
Now it seems so natural ... I think there should be a clear explanation
/ example in the docs about this - I've read the documentation on RULE
sytem several times and as far as I remember.

Anyway thanks to those helped me to solve this problem, now it works as
expected.

Tomas


Re: partitioning / rules - strange behavior

От
Tomas Vondra
Дата:
> I wouldn't call myself a novice, but yes - this surprised me a little.
> Now it seems so natural ... I think there should be a clear explanation
> / example in the docs about this - I've read the documentation on RULE
> sytem several times and as far as I remember.

... and as far as I remember there's no clear statement about this.

Tomas