Discussion: ENABLE ROW LEVEL SECURITY causes a huge increase in checkpoints
Hi,
we tried the new RLS feature on PostgreSQL 9.5.3 / CentOS 6. When we turned on ENABLE ROW LEVEL SECURITY + FORCE ROW LEVEL SECURITY under a normal workload, checkpoint activity increased enormously (about 30x or more); our disk partition for xlog filled up, and log shipping to the replica may have delayed the removal of old segments. Has anybody had the same experience after turning on RLS? It looks like many more buffers are being dirtied. Yes, we could provide more space for xlog, but it would also take much more space for xlog backups, and we do not know whether it is worth it. We had log_checkpoints on, and I attach the log (RLS was turned on at 13:26).
Thanks for any idea, or an explanation if this is normal behavior.
David
(See attached file: checkpoints)
--
-------------------------------------
Ing. David TUROŇ
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava
tel.: +420 591 166 224
fax: +420 596 621 273
mobil: +420 732 589 152
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------
Attachments
On Wed, Nov 2, 2016 at 12:09 AM, <david.turon@linuxbox.cz> wrote:
> we tried the new RLS feature on PostgreSQL 9.5.3 / CentOS 6. When we
> turned on ENABLE ROW LEVEL SECURITY + FORCE ROW LEVEL SECURITY under a
> normal workload, checkpoint activity increased enormously (about 30x or
> more); our disk partition for xlog filled up, and log shipping to the
> replica may have delayed the removal of old segments. Has anybody had
> the same experience after turning on RLS? It looks like many more
> buffers are being dirtied. We could provide more space for xlog, but it
> would also take much more space for xlog backups, and we do not know
> whether it is worth it. We had log_checkpoints on, and I attach the log
> (RLS was turned on at 13:26).

Interesting, I don't recall RLS generating a burst in activity. The first heavier checkpoints happen 20 minutes after enabling RLS, and those are triggered by time. Then things cool down, and one hour later comes the real deal, with a set of checkpoints triggered by volume. It is difficult to draw a conclusion, though, without more information about your load, the WAL records generated, etc.
--
Michael
Michael Paquier <michael.paquier@gmail.com> writes:
> Interesting, I don't recall RLS generating a burst in activity.

I have an idea about a partial explanation for that. RLS restrictions on UPDATE, DELETE, or SELECT FOR UPDATE target tables cause extra LockRows plan nodes to be generated, and in turn that would cause extra WAL activity to track the row-lock-taking. But it's hard to see how you get to a 30X bloat factor that way. The XLOG_HEAP_LOCK records would be no larger than the subsequent updates, certainly, so if RLS isn't really doing anything then this should be strictly less than a 2X penalty.

The row lock occurs before the RLS filter, so if you were using RLS in such a way that it rejected a very large fraction of rows that updates or deletes were attempted on, maybe you could get to 30X from this. But that would be a weird way to use RLS, IMO.

(Hm ... actually, if this were happening in otherwise-seldom-changed pages, maybe you get to that from the full-page images in the XLOG_HEAP_LOCK records? Normally, if you needed an FPI you were going to pay for it anyway, either here or at the update proper; but for a locked and then RLS-rejected row it would be pure overhead. It's hard to credit needing so many FPIs this way, though.)
I'm of the opinion that the extra row locks are actually semantically wrong, or at least pretty darn dubious: locking rows that you don't have privilege to see is not nice behavior, and neither is the fact that the RLS filter functions see different data than they would in read-only cases (due to LockRows fetching the most up-to-date version). The RLS planner rewrite that I posted earlier gets rid of the extra LockRows nodes. It's not something we'd think of back-patching, though.

regards, tom lane
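[The extra LockRows behavior described above can be observed in a plan. A minimal sketch, assuming a hypothetical table and policy (names are illustrative, not from the thread), of how one might reproduce it on 9.5:]

```sql
-- Hypothetical setup; table, column, and policy names are invented
-- for illustration.
CREATE TABLE projects (id int PRIMARY KEY, owner name, payload text);
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
CREATE POLICY owner_only ON projects USING (owner = current_user);

-- On 9.5, the plan for an UPDATE on the RLS-protected table includes
-- an extra LockRows node: the row lock (an XLOG_HEAP_LOCK WAL record)
-- is taken before the policy qual filters the row, so even rows the
-- policy later rejects are locked and generate WAL.
EXPLAIN UPDATE projects SET payload = 'x' WHERE id = 1;
```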
Hello,
thanks for the reply. I investigated this, and thanks to pg_xlogdump I found:
/usr/pgsql-9.5/bin/pg_xlogdump 00000001000008700000007C 00000001000008700000007D | head -1
rmgr: Heap len (rec/tot): 7/ 53, tx: 284003096, lsn: 870/7C000030, prev 870/7BFFFFD0, desc: LOCK off 2: xid 284003096 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 1663/16404/191292060 blk 15561
The whole xlog file contains only rows like this one; they are for a table with RLS enabled.
/usr/pgsql-9.5/bin/pg_xlogdump --stats=record 00000001000008700000007C 00000001000008700000007C
Type                      N    (%)     Record size    (%)     FPI size    (%)     Combined size    (%)
----                      -    ---     -----------    ---     --------    ---     -------------    ---
Transaction/COMMIT        5 (  0.00)           160 (  0.00)          0 (  0.00)             160 (  0.00)
Heap/INSERT               5 (  0.00)           135 (  0.00)          0 (  0.00)             135 (  0.00)
Heap/LOCK            298674 ( 99.99)       9258894 ( 99.99)          0 (  0.00)         9258894 ( 99.99)
Btree/INSERT_LEAF        20 (  0.01)           520 (  0.01)          0 (  0.00)             520 (  0.01)
Sequence/LOG              1 (  0.00)           182 (  0.00)          0 (  0.00)             182 (  0.00)
                   --------            ----------             --------              -------------
Total                298705
And thanks to the xid, I found a transaction with:
SELECT
/*lot joins*/
FOR UPDATE
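[A query like the one above, FOR UPDATE with no OF clause, locks the matching rows in every table of the join. A sketch, with hypothetical table names, of the difference the OF clause makes:]

```sql
-- Without OF, FOR UPDATE locks matching rows in ALL joined tables.
-- With OF, only rows from the listed tables are locked, so other
-- joined tables do not accumulate XLOG_HEAP_LOCK records:
SELECT t.id, p.name
FROM tasks t
JOIN projects p ON p.id = t.project_id
FOR UPDATE OF t;  -- lock only rows of "tasks", not "projects"
```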
The OF table_name clause was missing there, but that alone did not help much... I then found that I had made a badly designed POLICY on the table, something like:
CREATE POLICY some_policy ON projects FOR ALL TO role_name USING (pg_has_role("current_user"(), 'some_role'::name, 'member'::text));
and I had made a sub-role that had access without granting permissions on the other tables. But a better solution was to create a new role, grant it the access rights, and in the policy use true instead of an expression, for the role that should have access to all rows:
CREATE POLICY some_policy ON projects FOR ALL TO role_name USING (True);
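[A fuller sketch of this fix, with hypothetical role and grant names: the role that should see everything gets its own grants and a policy with a constant qual, so no per-row pg_has_role() call is evaluated. As reported above, this removed the extra WAL records in this case.]

```sql
-- Hypothetical role name; "projects" is the table from the thread.
CREATE ROLE project_admin;
GRANT SELECT, INSERT, UPDATE, DELETE ON projects TO project_admin;

-- Members of project_admin see all rows; the qual is a constant true,
-- so no per-row function call is needed to decide visibility.
CREATE POLICY admin_all ON projects FOR ALL TO project_admin
    USING (true);
```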
So it was just a bad idea, a bad design.
Thanks, no need to solve this anymore... it no longer produces the extra WAL records.
David