Обсуждение: Avoid repeated PQfnumber() in pg_dump
Hi, When reviewing some pg_dump related code, I found some existsing code (getTableAttrs() and dumpEnumType()) invoke PQfnumber() repeatedly which seems unnecessary. Example ----- for (int j = 0; j < ntups; j++) { if (j + 1 != atoi(PQgetvalue(res, j, PQfnumber(res, "attnum")))) ----- Since PQfnumber() is not a cheap function, I think we'd better invoke PQfnumber() out of the loop like the attatched patch. After applying this change, I can see about 8% performance gain in my test environment when dump table definitions which have many columns. Best regards, Hou zhijie
Вложения
On Wed, Jul 14, 2021 at 08:54:32AM +0000, houzj.fnst@fujitsu.com wrote: > Since PQfnumber() is not a cheap function, I think we'd better invoke > PQfnumber() out of the loop like the attatched patch. +1 Please add to the next CF -- Justin
> On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote: > Since PQfnumber() is not a cheap function, I think we'd better invoke > PQfnumber() out of the loop like the attatched patch. Looks good on a quick readthrough, and I didn't see any other similar codepaths in pg_dump on top of what you've fixed. > After applying this change, I can see about 8% performance gain in my test environment > when dump table definitions which have many columns. Out of curiosity, how many columns are "many columns"? -- Daniel Gustafsson https://vmware.com/
On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote: > > On 14 Jul 2021, at 10:54, houzj.fnst@fujitsu.com wrote: > > > Since PQfnumber() is not a cheap function, I think we'd better invoke > > PQfnumber() out of the loop like the attatched patch. > > Looks good on a quick readthrough, and I didn't see any other similar > codepaths in pg_dump on top of what you've fixed. Thanks for reviewing the patch. Added to the CF: https://commitfest.postgresql.org/34/3254/ > > After applying this change, I can see about 8% performance gain in my > > test environment when dump table definitions which have many columns. > > Out of curiosity, how many columns are "many columns"? I tried dump 10 table definitions while each table has 1000 columns (maybe not real world case). Best regards, houzj
> On 15 Jul 2021, at 04:51, houzj.fnst@fujitsu.com wrote: > On July 15, 2021 5:35 AM Daniel Gustafsson <daniel@yesql.se> wrote: >> Out of curiosity, how many columns are "many columns"? > > I tried dump 10 table definitions while each table has 1000 columns > (maybe not real world case). While unlikely to be common, very wide tables aren’t unheard of. Either way, I think it has merit to pull out the PQfnumber before the loop even if it’s a wash performance wise for many users, as it’s a pattern used elsewhere in pg_dump. -- Daniel Gustafsson https://vmware.com/
On 7/15/21, 12:48 PM, "Daniel Gustafsson" <daniel@yesql.se> wrote: > While unlikely to be common, very wide tables aren’t unheard of. Either way, I > think it has merit to pull out the PQfnumber before the loop even if it’s a > wash performance wise for many users, as it’s a pattern used elsewhere in > pg_dump. +1 The patch looks good to me. I am marking it as ready-for-committer. Nathan
> On 23 Jul 2021, at 01:39, Bossart, Nathan <bossartn@amazon.com> wrote: > The patch looks good to me. I am marking it as ready-for-committer. I took another look at this today and pushed it after verifying with a pgindent run. Thanks! -- Daniel Gustafsson https://vmware.com/
On 8/27/21, 7:27 AM, "Daniel Gustafsson" <daniel@yesql.se> wrote: > I took another look at this today and pushed it after verifying with a pgindent > run. Thanks! Thank you! Nathan