Обсуждение: bloated heapam.h
Hi, I noticed heapam.h is included in way too many places. This is bad IMHO because heapam.h itself includes a lot of other headers. A lot of these are easy to fix; the source files just need to include some other headers. Standard cleanup, I don't think anybody would object to that. For example, Index: src/backend/access/gin/ginvacuum.c =================================================================== RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/access/gin/ginvacuum.c,v retrieving revision 1.19 diff -c -p -r1.19 ginvacuum.c *** src/backend/access/gin/ginvacuum.c 1 Jan 2008 19:45:46 -0000 1.19 --- src/backend/access/gin/ginvacuum.c 9 May 2008 18:44:31 -0000 *************** *** 15,24 **** #include "postgres.h" #include "access/genam.h" #include "access/gin.h" - #include "access/heapam.h" #include "miscadmin.h" #include "storage/freespace.h" ! #include "storage/freespace.h" #include "commands/vacuum.h" typedef struct --- 15,23 ---- #include "postgres.h" #include "access/genam.h" #include "access/gin.h" #include "miscadmin.h" #include "storage/freespace.h" ! #include "storage/lmgr.h" #include "commands/vacuum.h" typedef struct Others are more conflictive. For example syncscan.c is keeping the prototypes for its own functions on heapam.h. Also pruneheap.c and rewriteheap.c. As a result, not only themselves need to include heapam.h (without any other need for it), but they force some other files into including heapam.h to get their prototypes. I think this is a mistake; I propose splitting those prototypes to their own files, and #including those as appropriate. Objections? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera napsal(a): > Others are more conflictive. For example syncscan.c is keeping the > prototypes for its own functions on heapam.h. Also pruneheap.c and > rewriteheap.c. As a result, not only themselves need to include > heapam.h (without any other need for it), but they force some other > files into including heapam.h to get their prototypes. I think this is > a mistake; I propose splitting those prototypes to their own files, and > #including those as appropriate. > > Objections? I have similar thing in my TODO list. See my patch from March commit fest and discussion. I need solve two main issues - remove postgres.h from binaries and keep history of structures (for pg_upgrade project). My idea is split structures and functions in separate header files. Zdenek http://archives.postgresql.org/pgsql-patches/2007-10/msg00197.php http://archives.postgresql.org/pgsql-patches/2008-04/msg00149.php
Zdenek Kotala wrote: > Alvaro Herrera napsal(a): > >> Others are more conflictive. For example syncscan.c is keeping the >> prototypes for its own functions on heapam.h. Also pruneheap.c and >> rewriteheap.c. As a result, not only themselves need to include >> heapam.h (without any other need for it), but they force some other >> files into including heapam.h to get their prototypes. I think this is >> a mistake; I propose splitting those prototypes to their own files, and >> #including those as appropriate. >> >> Objections? > > I have similar thing in my TODO list. See my patch from March commit fest > and discussion. I need solve two main issues - remove postgres.h from > binaries and keep history of structures (for pg_upgrade project). Yeah, I remember that. Is there any progress on that front? BTW I noticed that I was a bit careless in the description. rewriteheap already has its own rewriteheap.h file; and there's no point at all in separating pruneheap.c declarations into another file. The one that makes a bit more sense is a new syncscan.h. And there are a lot of things in heapam.h that actually correspond to tuple manipulation (heap_form_tuple and so on), so perhaps a new header file would be appropriate, but there's already htup.h which contains tuple-related stuff. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > The one that makes a bit more sense is a new syncscan.h. And there are > a lot of things in heapam.h that actually correspond to tuple > manipulation (heap_form_tuple and so on), so perhaps a new header file > would be appropriate, but there's already htup.h which contains > tuple-related stuff. After actually looking at the header a bit ... +1 for moving fastgetattr, heap_getattr, and the heaptuple.c functions to htup.h. I don't see any big gain from relocating the other stuff; it seems to largely all use about the same set of typedefs. It looks to me actually that a large part of your complaint is that heapam.h #includes more than it has to. Have you tried just cutting its #include list to the minimum needed to compile its function declarations? Likely this would force more #includes in .c files but I don't object to that. (I might be wrong, but I believe that Bruce's script for removing "unnecessary" #includes is not bright enough to make such tradeoffs, so it'd let bloated #include lists in headers survive.) regards, tom lane
Tom Lane wrote: > +1 for moving fastgetattr, heap_getattr, and the heaptuple.c functions > to htup.h. I don't see any big gain from relocating the other stuff; > it seems to largely all use about the same set of typedefs. Ultimately that was my conclusion too. I tried moving the heaptuple.c functions to htup.h, but hit the problem that pg_dump.c wants to include that file. It then fails to compile because it doesn't know Datum (defined in postgres.h, so unavailable to frontend programs). I worked around that by enclosing the prototypes in #ifndef FRONTEND, but that seems ugly. Apparently the reason for pg_dump.c to need htup.h is getAttrName which needs system columns' attribute numbers. Of course, the first thing that comes to mind is that we should fix pg_dump to not require that header in the first place. Perhaps we can get the names by querying the server, at the same time we get the user column names. (Apparently this is only used to dump unique indexes on system columns.) The easiest actual solution, however, seems to be to move those attribute numbers to a separate header, say access/sysattrs.h. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Apparently the reason for pg_dump.c to need htup.h is getAttrName which > needs system columns' attribute numbers. Of course, the first thing > that comes to mind is that we should fix pg_dump to not require that > header in the first place. Perhaps we can get the names by querying the > server, at the same time we get the user column names. (Apparently this > is only used to dump unique indexes on system columns.) > The easiest actual solution, however, seems to be to move those > attribute numbers to a separate header, say access/sysattrs.h. Yeah, probably. There is certainly no reason for a frontend program to be dealing in Datum; but it isn't unreasonable for it to know the system column numbers, since it can see those in the catalogs. I'm sure we have work in front of us to get these things separated out a bit better. regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes: > So here's a patch (includes the #ifndef FRONTEND hack in htup.h.) I like this except for the #ifndef FRONTEND hack --- you're going to need to fix that before applying. I'm good with doing that by pushing the system attribute numbers into a separate header. BTW, you didn't compress the patch, which likely explains why it didn't show up on -hackers. If you want to post a shorter form, I think that the diffs in the header files are the only interesting part; the ensuing additions in .c files are just mechanical. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > So here's a patch (includes the #ifndef FRONTEND hack in htup.h.) > > I like this except for the #ifndef FRONTEND hack --- you're going to > need to fix that before applying. I'm good with doing that by pushing > the system attribute numbers into a separate header. Committed with the attribute numbers moved to access/sysattr.h, which is what pg_dump now uses. Thanks. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Tom Lane wrote: > > Alvaro Herrera <alvherre@commandprompt.com> writes: > > > So here's a patch (includes the #ifndef FRONTEND hack in htup.h.) > > > > I like this except for the #ifndef FRONTEND hack --- you're going to > > need to fix that before applying. I'm good with doing that by pushing > > the system attribute numbers into a separate header. > > Committed with the attribute numbers moved to access/sysattr.h, which is > what pg_dump now uses. Oops :-( I just noticed that I removed bufmgr.h from bufpage.h, which is a change you had objected to previously :-( http://archives.postgresql.org/pgsql-patches/2008-04/msg00149.php -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote: > Oops :-( I just noticed that I removed bufmgr.h from bufpage.h, which > is a change you had objected to previously :-( However, it seems the right fix is to move BufferGetPageSize and BufferGetPage from bufpage.h to bufmgr.h. (Digging further, it seems like bufpage.h should also include transam.h in order to get TransactionIdIsNormal ... I start to wonder how many problems of this nature we have on our headers. Without having a way to detect whether the defined macros are valid, it seems hard to check programatically, however.) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Oops :-( I just noticed that I removed bufmgr.h from bufpage.h, which > is a change you had objected to previously :-( > http://archives.postgresql.org/pgsql-patches/2008-04/msg00149.php Hmm. I did notice that the patch seemed to need to add bufmgr.h inclusions to an awful lot of files, but I'd forgotten the argument about the bufpage.h macros needing it. Do you want to undo that aspect of it? regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes: > However, it seems the right fix is to move BufferGetPageSize and > BufferGetPage from bufpage.h to bufmgr.h. That sounds sane if it fixes the problem. > (Digging further, it seems like bufpage.h should also include transam.h > in order to get TransactionIdIsNormal ... I start to wonder how many > problems of this nature we have on our headers. Without having a way to > detect whether the defined macros are valid, it seems hard to check > programatically, however.) Yeah, I'm certain there's some other problems of the same kind, but I don't see any easy way to find 'em. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > However, it seems the right fix is to move BufferGetPageSize and > > BufferGetPage from bufpage.h to bufmgr.h. > > That sounds sane if it fixes the problem. Actually it's not, because bufmgr.h does not include bufpage.h, and it knows nothing about "pages". However, I think most additions of bufmgr.h to source files are not all that bogus -- a lot of the warnings I got were about BufferAccessStrategyType. I'm still checking what's the best way to deal with this, and if I can't find anything else I'll re-add bufmgr.h to bufpage.h. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Alvaro Herrera wrote: > >> Oops :-( I just noticed that I removed bufmgr.h from bufpage.h, which >> is a change you had objected to previously :-( > > However, it seems the right fix is to move BufferGetPageSize and > BufferGetPage from bufpage.h to bufmgr.h. +1 It makes more sense. > (Digging further, it seems like bufpage.h should also include transam.h > in order to get TransactionIdIsNormal ... I start to wonder how many > problems of this nature we have on our headers. Without having a way to > detect whether the defined macros are valid, it seems hard to check > programatically, however.) > I attached script which should check it. In first step it runs C preprocessor on each header (postgres.h is included as well). The output from first step is processed again with preprocessor and define.h is included. Define.h contains "all" used macros in following format: #define SIGABRT "NOT_EXPANDED_SIGABRT" Main problem is how to generate define.h. I used following magic: grep "^#define" `find . -name "*.h"` | cut -d" " -f 2 | cut -f 1 | cut -f 1 -d"(" but it generates some noise as well. Maybe some Perl or AWK magic should be better. Zdenek
Вложения
Tom Lane wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: > > Oops :-( I just noticed that I removed bufmgr.h from bufpage.h, which > > is a change you had objected to previously :-( > > http://archives.postgresql.org/pgsql-patches/2008-04/msg00149.php > > Hmm. I did notice that the patch seemed to need to add bufmgr.h > inclusions to an awful lot of files, but I'd forgotten the argument > about the bufpage.h macros needing it. Do you want to undo that > aspect of it? Done -- I put back bufmgr.h into bufpage.h. Surely there is a better structure for this, perhaps involving more header files, but I don't have time for that ATM. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Zdenek Kotala wrote: > Alvaro Herrera wrote: >> (Digging further, it seems like bufpage.h should also include transam.h >> in order to get TransactionIdIsNormal ... I start to wonder how many >> problems of this nature we have on our headers. Without having a way to >> detect whether the defined macros are valid, it seems hard to check >> programatically, however.) > > I attached script which should check it. In first step it runs C > preprocessor on each header (postgres.h is included as well). The output > from first step is processed again with preprocessor and define.h is > included. Define.h contains "all" used macros in following format: > > #define SIGABRT "NOT_EXPANDED_SIGABRT" > > Main problem is how to generate define.h. I used following magic: > > grep "^#define" `find . -name "*.h"` | cut -d" " -f 2 | cut -f 1 | cut -f 1 -d"(" > > but it generates some noise as well. Maybe some Perl or AWK magic should be better. So were you able to detect anything bogus with this technique? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera napsal(a): > Zdenek Kotala wrote: >> Alvaro Herrera wrote: > >>> (Digging further, it seems like bufpage.h should also include transam.h >>> in order to get TransactionIdIsNormal ... I start to wonder how many >>> problems of this nature we have on our headers. Without having a way to >>> detect whether the defined macros are valid, it seems hard to check >>> programatically, however.) >> I attached script which should check it. In first step it runs C >> preprocessor on each header (postgres.h is included as well). The output >> from first step is processed again with preprocessor and define.h is >> included. Define.h contains "all" used macros in following format: >> >> #define SIGABRT "NOT_EXPANDED_SIGABRT" >> >> Main problem is how to generate define.h. I used following magic: >> >> grep "^#define" `find . -name "*.h"` | cut -d" " -f 2 | cut -f 1 | cut -f 1 -d"(" >> >> but it generates some noise as well. Maybe some Perl or AWK magic should be better. > > So were you able to detect anything bogus with this technique? > No, everything looks OK. Zdenek
Zdenek Kotala wrote: > Alvaro Herrera napsal(a): >> Zdenek Kotala wrote: >>> I attached script which should check it. In first step it runs C >>> preprocessor on each header (postgres.h is included as well). The >>> output from first step is processed again with preprocessor and >>> define.h is included. >> So were you able to detect anything bogus with this technique? > > No, everything looks OK. Well, then it is not working, because transam.h is missing from bufpage.h ... -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera napsal(a): > Zdenek Kotala wrote: >> Alvaro Herrera napsal(a): >>> Zdenek Kotala wrote: > >>>> I attached script which should check it. In first step it runs C >>>> preprocessor on each header (postgres.h is included as well). The >>>> output from first step is processed again with preprocessor and >>>> define.h is included. > >>> So were you able to detect anything bogus with this technique? >> No, everything looks OK. > > Well, then it is not working, because transam.h is missing from > bufpage.h ... > It tried catch problems with defines not with datatypes. If you mean that TransactionID is not defined. I try to play with lint now if it gets better results. Zdenek
Zdenek Kotala wrote: > Alvaro Herrera napsal(a): >> Well, then it is not working, because transam.h is missing from >> bufpage.h ... > > It tried catch problems with defines not with datatypes. If you mean that > TransactionID is not defined. No, I mean that TransactionIdIsNormal() is used in PageSetPrunable(). > I try to play with lint now if it gets better results. Ok, good. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera napsal(a): >> I try to play with lint now if it gets better results. > > Ok, good. Hmm, It generates a lot of unnecessary includes in *.c files. I have checked only couple of them and it seems that they are really unnecessary. A attach output. Unfortunately, it does not detect missing heapam.h from bufpage.h. However, I have not tested all magic switches yet :-). There are also several reports about system headers file, but it could be platform specific warning. Zdenek
Вложения
Zdenek Kotala wrote: > Alvaro Herrera napsal(a): > >>> I try to play with lint now if it gets better results. >> >> Ok, good. > > Hmm, It generates a lot of unnecessary includes in *.c files. I have > checked only couple of them and it seems that they are really > unnecessary. A attach output. Unfortunately, it does not detect missing > heapam.h from bufpage.h. However, I have not tested all magic switches > yet :-). There are also several reports about system headers file, but > it could be platform specific warning. Interesting. It seems that Bruce's script could be replaced by lint. Please share the switches you used. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote: > Zdenek Kotala wrote: >> Alvaro Herrera napsal(a): >> >>>> I try to play with lint now if it gets better results. >>> Ok, good. >> Hmm, It generates a lot of unnecessary includes in *.c files. I have >> checked only couple of them and it seems that they are really >> unnecessary. A attach output. Unfortunately, it does not detect missing >> heapam.h from bufpage.h. However, I have not tested all magic switches >> yet :-). There are also several reports about system headers file, but >> it could be platform specific warning. > > Interesting. It seems that Bruce's script could be replaced by lint. > Please share the switches you used. > I used following switches which only shows unused included file. I run command in backend directory. lint -I ../include -errtags -errhdr=%all -Ncheck=%none -Nlevel=1 -erroff=%all -erroff=no%E_INCL_NUSD -c `find . -name "*.c"` > include.out Good to mention that I use lint from Sun Studio 12. Zdenek