Обсуждение: pgsql-server/ oc/src/sgml/ref/cluster.sgml rc/ ...
pgsql-server/ oc/src/sgml/ref/cluster.sgml rc/ ...
От
momjian@postgresql.org (Bruce Momjian - CVS)
Дата:
CVSROOT: /cvsroot Module name: pgsql-server Changes by: momjian@postgresql.org 02/08/10 16:43:46 Modified files: doc/src/sgml/ref: cluster.sgml src/backend/commands: cluster.c Added files: src/test/regress/output: cluster.out src/test/regress/sql: cluster.sql Log message: Major improvement in CLUSTER which preserves table characteristics using relfilenode. I sent the CLUSTER patch a few days ago and I think it was missed. I append it again, this time including the regression test files. For the committer, please note that you have to cvs add the files as they don't exist. Maybe add to the parallel and serial schedules also, but I don't know such stuff. Alvaro Herrera (<alvherre[a]atentus.com>)
momjian@postgresql.org (Bruce Momjian - CVS) writes: > Major improvement in CLUSTER which preserves table characteristics using > relfilenode. This patch is still a few bricks shy of a load. In particular, it completely destroys TOASTed data. regression=# create table f11 (f1 int unique, f2 text); NOTICE: CREATE TABLE / UNIQUE will create implicit index 'f11_f1_key' for table 'f11' CREATE TABLE regression=# insert into f11 values(0, repeat('xyzzy', 100000)); INSERT 691177 1 regression=# select f1,length(f2) from f11; f1 | length ----+-------- 0 | 500000 (1 row) regression=# cluster f11_f1_key on f11; CLUSTER regression=# select f1,length(f2) from f11; ERROR: Relation 691181 does not exist As-is the patch is entirely unacceptable. Ideally we should find a way to move tuples into the new table without invoking the TOAST code at all, but I'm not sure what that will entail. regards, tom lane
Tom Lane wrote: > momjian@postgresql.org (Bruce Momjian - CVS) writes: > > Major improvement in CLUSTER which preserves table characteristics using > > relfilenode. > > This patch is still a few bricks shy of a load. In particular, > it completely destroys TOASTed data. > > regression=# create table f11 (f1 int unique, f2 text); > NOTICE: CREATE TABLE / UNIQUE will create implicit index 'f11_f1_key' for table 'f11' > CREATE TABLE > regression=# insert into f11 values(0, repeat('xyzzy', 100000)); > INSERT 691177 1 > regression=# select f1,length(f2) from f11; > f1 | length > ----+-------- > 0 | 500000 > (1 row) > > regression=# cluster f11_f1_key on f11; > CLUSTER > regression=# select f1,length(f2) from f11; > ERROR: Relation 691181 does not exist > > > As-is the patch is entirely unacceptable. Ideally we should find a way ^^^^^^^^^^^^^^^^^^^^^ > to move tuples into the new table without invoking the TOAST code at > all, but I'm not sure what that will entail. OK, Tom, you found a problem. No need to make the submitter feel any worse by adding critical language to your comments. The basic question is should be back it out or put it on the open items list for 7.3 and keep working on it? Either way, it will be working for 7.3, I am sure. Myself, I thought the toast table would come along with the heap table during the relfilenode change, and I am not sure why it didn't. Also, do we still need the FlushRelationBuffers() calls now that we handle the local buffer differently? Would clustering temp table still require the call? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
I said: > Ideally we should find a way > to move tuples into the new table without invoking the TOAST code at > all, but I'm not sure what that will entail. I was too tired last night to really think about this. What I had in mind when I wrote the above was to somehow disable the TOAST machinery while copying tuples, so that any toasted values in the new heap would still reference the old toast table. That would avoid "unnecessary" copying of the toasted datums ... but it looks ugly to do. This morning, however, I prefer the idea of letting the system build a new TOAST table along with the new heap, and then simply swapping the toast-table links along with the relfilenode links. This would be a trivial addition to Alvaro's patch. The net effect would be that the contents of the TOAST table would also be re-sorted into the clustered order, which seems at least potentially a win for access time. The main downside of this approach is that you'd need enough free disk space for a new copy of the TOAST table, not only the main table and its indexes. In some scenarios this will mean a very large increase in the disk space needed for CLUSTER, since the TOAST table may well out-bulk its parent by large amounts. I'm engaged in code review for Alvaro's patch today, and will add the toast-table link swap as a quick-and-dirty solution. If anyone is sufficiently swayed by the disk-space issue to investigate doing it the other way, go right ahead --- we could remove the link swap again if another solution is offered. regards, tom lane
Bruce Momjian dijo: > Tom Lane wrote: > > momjian@postgresql.org (Bruce Momjian - CVS) writes: > > > Major improvement in CLUSTER which preserves table characteristics using > > > relfilenode. > > > > This patch is still a few bricks shy of a load. In particular, > > it completely destroys TOASTed data. > The basic question is should be back it out or put it on the open items > list for 7.3 and keep working on it? Either way, it will be working for > 7.3, I am sure. Don't back it out just yet. I think I'm finding why this happens and how to correct it. I'm working on it. One question (that is unrelated to the fix I'm currently working on): do each TOASTed tuple have a pointer to the TOAST table? Not to the tuple in the table, but to the table itself? -- Alvaro Herrera (<alvherre[a]atentus.com>) Voy a acabar con todos los humanos / con los humanos yo acabaré voy a acabar con todos / con todos los humanos acabaré (Bender)
Tom Lane dijo: > I'm engaged in code review for Alvaro's patch today, and will add the > toast-table link swap as a quick-and-dirty solution. I already did that. I was bitten by the dependency tracking thing that drops the new TOAST table; I just got up to fix it. -- Alvaro Herrera (<alvherre[a]atentus.com>) Officer Krupke, what are we to do? Gee, officer Krupke, Krup you! (West Side Story, "Gee, Officer Krupke")
Alvaro Herrera <alvherre@atentus.com> writes: > I already did that. I was bitten by the dependency tracking thing that > drops the new TOAST table; I just got up to fix it. Yeah, I'm on that now. I got distracted by the problems the code induced in the relcache --- didn't you get any WARNING: trying to delete a rd_node reldesc that does not exist. messages in your testing? They popped up for me anytime I modified cluster.c at all. Anyway I think I'm close to a version that actually works ;-) regards, tom lane
Tom Lane dijo: > Alvaro Herrera <alvherre@atentus.com> writes: > > I already did that. I was bitten by the dependency tracking thing that > > drops the new TOAST table; I just got up to fix it. > > Yeah, I'm on that now. I got distracted by the problems the code > induced in the relcache --- didn't you get any > WARNING: trying to delete a rd_node reldesc that does not exist. > messages in your testing? They popped up for me anytime I modified > cluster.c at all. No, I didn't. But I didn't came down to writing even one line about the dependency fixing. > Anyway I think I'm close to a version that actually works ;-) Ok, I leave it in your hands then. Thank you. -- Alvaro Herrera (<alvherre[a]atentus.com>) "La Primavera ha venido. Nadie sabe como ha sido" (A. Machado)
TOAST & DROP COLUMN (Was: RE: pgsql-server/ oc/src/sgml/ref/cluster.sgml rc/ ... )
От
"Christopher Kings-Lynne"
Дата:
> I was too tired last night to really think about this. What I had in > mind when I wrote the above was to somehow disable the TOAST machinery > while copying tuples, so that any toasted values in the new heap would > still reference the old toast table. That would avoid "unnecessary" > copying of the toasted datums ... but it looks ugly to do. Incidentally - will a dropped column definitely have its TOAST data reclaimed when you do the "update set col=col;" ? Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > Incidentally - will a dropped column definitely have its TOAST data > reclaimed when you do the "update set col=col;" ? The code is there to do it, but I must admit I didn't test that it followed through. Please check it ... regards, tom lane
Tom, I am glad you added this repeat()/TOAST test to the cluster regression test. It is a nice way to test TOAST. Folks, are there other areas where we should testing TOAST in the regression tests? --------------------------------------------------------------------------- Tom Lane wrote: > momjian@postgresql.org (Bruce Momjian - CVS) writes: > > Major improvement in CLUSTER which preserves table characteristics using > > relfilenode. > > This patch is still a few bricks shy of a load. In particular, > it completely destroys TOASTed data. > > regression=# create table f11 (f1 int unique, f2 text); > NOTICE: CREATE TABLE / UNIQUE will create implicit index 'f11_f1_key' for table 'f11' > CREATE TABLE > regression=# insert into f11 values(0, repeat('xyzzy', 100000)); > INSERT 691177 1 > regression=# select f1,length(f2) from f11; > f1 | length > ----+-------- > 0 | 500000 > (1 row) > > regression=# cluster f11_f1_key on f11; > CLUSTER > regression=# select f1,length(f2) from f11; > ERROR: Relation 691181 does not exist > > > As-is the patch is entirely unacceptable. Ideally we should find a way > to move tuples into the new table without invoking the TOAST code at > all, but I'm not sure what that will entail. > > regards, tom lane > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian wrote: > Tom, I am glad you added this repeat()/TOAST test to the cluster > regression test. It is a nice way to test TOAST. Folks, are there > other areas where we should testing TOAST in the regression tests? > Possibly strings.sql for the text substring tests (think toast slicing). I'm working in this area right now, so I'll add some tests to cover toasted data (and bytea substrings as well) before I'm done. Joe