Thread: BUG #10155: BUG? Can't remove newly generated tuples after a repeatable read transaction starts.
BUG #10155: BUG? Can't remove newly generated tuples after a repeatable read transaction starts.
From
digoal@126.com
Date:
The following bug has been logged on the website:

Bug reference: 10155
Logged by: digoal.zhou
Email address: digoal@126.com
PostgreSQL version: 9.3.3
Operating system: CentOS 6.4 x64

Description:

SESSION A:

digoal=# begin isolation level repeatable read;
BEGIN
digoal=# \dt
                List of relations
 Schema |         Name          | Type  |  Owner
--------+-----------------------+-------+----------
 public | stat_pg_stat_database | table | postgres
 public | tbl_cost_align        | table | postgres
 public | test                  | table | postgres
 public | ttt                   | table | postgres
 public | tttt                  | table | postgres
 public | ttttt                 | table | postgres
(6 rows)

SESSION B:

digoal=# create table t as select * from pg_class;
SELECT 306

SESSION A:

Table t does not appear: A can't see t's metadata in pg_class, pg_attribute, and so on.

digoal=# \dt
                List of relations
 Schema |         Name          | Type  |  Owner
--------+-----------------------+-------+----------
 public | stat_pg_stat_database | table | postgres
 public | tbl_cost_align        | table | postgres
 public | test                  | table | postgres
 public | ttt                   | table | postgres
 public | tttt                  | table | postgres
 public | ttttt                 | table | postgres
(6 rows)

SESSION B:

But B can't reclaim the rows from table t. Why? I thought PostgreSQL only fails to reclaim tuples that already existed before the repeatable read transaction started; why can't it reclaim t's tuples, which were created after session A began?

digoal=# delete from t;
DELETE 306
digoal=# vacuum freeze verbose t;
INFO:  vacuuming "public.t"
INFO:  "t": found 0 removable, 306 nonremovable row versions in 2 out of 2 pages
DETAIL:  306 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  vacuuming "pg_toast.pg_toast_33578"
INFO:  index "pg_toast_33578_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_33578": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

Once session A ends, session B can reclaim these dead tuples.
Re: BUG #10155: BUG? Can't remove newly generated tuples after a repeatable read transaction starts.
From
Heikki Linnakangas
Date:
On 04/28/2014 11:37 AM, digoal@126.com wrote:
> SESSION B:
> but B can't reclaim rows from table t.
> why?
> i think postgresql can't reclaim tuples already exists before repeatable
> read transaction start, why this case t's tuples after session a and can't
> reclaim.

I think what you're arguing is that the system should be smarter and be able to reclaim the dead tuples. Because session A began before the table was even created, and there are no other backends that would need to see them either, they could indeed be safely vacuumed. The system just isn't smart enough to distinguish the case.

The short answer is that such an optimization just doesn't exist in PostgreSQL. It's certainly not a bug.

The long answer is that actually, even though the table was created after the transaction in session A began, session A *can* access the table. Schema changes don't follow the normal MVCC rules. If you do "SELECT * FROM t" in session A, it will work. However, the rows still won't be visible to session A, because they were inserted after the snapshot was taken, so they could still be vacuumed if the system tracked the snapshots more carefully and was able to deduce that. But the fact that a new table was created is not relevant.

- Heikki
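Heikki's distinction between catalog visibility and row visibility can be seen directly in the transcript's own sessions; a sketch of what session A would observe (the table name t comes from the report above, the exact output formatting is an assumption):

```sql
-- In session A, still inside the REPEATABLE READ transaction that began
-- before table t was created:
SELECT count(*) FROM t;  -- the table itself is accessible: DDL is not MVCC-filtered

--  count
-- -------
--      0
-- (1 row)
--
-- The 306 rows are invisible to A because they postdate A's snapshot,
-- yet A's snapshot is what prevents VACUUM in session B from removing them.
```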
Hi,

This is really bad for pg_dump when the database is big, because pg_dump also uses repeatable read isolation: while pg_dump is backing up the database, the database will bloat.

Can we optimize it?
--
Public service is a lifelong commitment; I'm Digoal, Just Do It.
At 2014-04-29 02:53:33, "Heikki Linnakangas" <hlinnakangas@vmware.com> wrote:
>On 04/28/2014 11:37 AM, digoal@126.com wrote:
>> SESSION B :
>> but B cann't reclaim rows from table t.
>> why?
>> i think postgresql cann't reclaim tuples already exists before repeatable
>> read transaction start, why this case t's tuples after session a and cann't
>> reclaim.
>
>I think what you're arguing is that the system should be smarter and be
>able to reclaim the dead tuples. Because session A began before the
>table was even created, and there are no other backends that would need
>to see them either, they could indeed be safely vacuumed. The system
>just isn't smart enough to distinguish the case.
>
>The short answer is that such an optimization just doesn't exist in
>PostgreSQL. It's certainly not a bug.
>
>The long answer is that actually, even though the table was created
>after the transaction in session A began, session A *can* access the
>table. Schema changes don't follow the normal MVCC rules. If you do
>"SELECT * FROM t" in session A, it will work. However, the rows still
>won't be visible, to sessin A, because they were inserted after the
>snapshot was taken, so they could still be vacuumed if the system
>tracked the snapshots more carefully and was able to deduce that. But
>the fact that a new table was created is not relevant.
>
>- Heikki
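Not part of the thread, but when diagnosing the bloat digoal describes, a common first step is to find the long-lived transaction whose snapshot is holding back cleanup. A hedged sketch using pg_stat_activity (the xact_start column is standard; backend_xmin, which shows the held-back horizon directly, was only added in 9.4):

```sql
-- List open transactions, oldest first: the oldest repeatable read
-- transaction (e.g. a long pg_dump) pins the vacuum horizon.
SELECT pid, usename, state, xact_start,
       now() - xact_start AS xact_age
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;
```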
Re: BUG #10155: BUG? Can't remove newly generated tuples after a repeatable read transaction starts.
From
Heikki Linnakangas
Date:
On 04/29/2014 02:18 AM, 德哥 wrote:
> HI,
> This is so bad in pg_dump use, when a database so big.
> because pg_dump is also use repeatable read isolation. and when pg_dump backup database , the database will bloat .

Yep. One approach is to take a filesystem-level backup (ie. with pg_start/stop_backup() or pg_basebackup), start a second server from that backup, and run pg_dump against that.

> Can we optimize it?

Not easily. I'd love to do something about it, but it's going to be a long project.

- Heikki
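Heikki's workaround can be sketched as a shell sequence. This is only an illustration, not from the thread: the hostname, replication user, directories, port, and database name are all placeholder assumptions, and it presumes a primary configured to allow replication connections:

```shell
# 1. Take a filesystem-level base backup of the primary.
#    (Host, user, and target directory are hypothetical.)
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/pgsql/clone -X fetch

# 2. Start a throwaway server from the backup on a different port.
pg_ctl -D /var/lib/pgsql/clone -o "-p 5433" start

# 3. Run the long logical dump against the clone, so its repeatable
#    read snapshot holds back VACUUM only there, not on production.
pg_dump -p 5433 digoal > digoal.dump

# 4. Tear the clone down when finished.
pg_ctl -D /var/lib/pgsql/clone stop
```

The production server never sees the long-lived snapshot, so it does not bloat while the dump runs.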