Discussion: Path-length follies
Whilst cleaning up query-length dependencies, I noticed that our
handling of maximum file pathname lengths is awfully messy.

Different parts of the system rely on no fewer than four different
symbols that they import from several different system header
files (any one of which might not exist on a particular platform):
    MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
And on top of that, postgres.h defines MAXPGPATH which is used
by yet other places.

On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
(a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
is 128. So there is absolutely no consistency to the pathname length
limits being imposed in different parts of Postgres.

AFAIK, most or all flavors of Unix have kernel limits on the maximum
length of a pathname that will be accepted by the kernel's file-access
calls (it's 1024 on my box). So I don't feel any need to remove
hardwired limits on pathname lengths in favor of indefinitely-expansible
buffers. But it does seem that a little more consistency in the
hardwired limits is called for.

From the information I have, it seems that the various
allegedly-standard #defines for max pathname length are not too standard,
and I don't think that Postgres internal buffers ought to constrain
path lengths to much less than the kernel limit (so using the
seemingly "standard" _POSIX_PATH_MAX symbol would be a loser).
So my inclination is to define MAXPGPATH as 1024 in config.h, and
remove all uses of the other four symbols in favor of MAXPGPATH.
That would at least provide a single point of tweaking for anyone
who didn't like the value of 1024.

Does anyone have a better idea? Is it worth trying to extract a
system limit on pathlength during configure, rather than leaving
MAXPGPATH as a manual configuration item --- and if so, exactly how
should configure go about it?

			regards, tom lane
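The inconsistency described above is easy to check on any given machine. What follows is a minimal sketch, not code from the Postgres tree: the helper names report_symbol and report_path_limits are made up for illustration, and the values printed will differ from platform to platform.

```c
#include <limits.h>     /* PATH_MAX and _POSIX_PATH_MAX, where available */
#include <stdio.h>
#ifndef _WIN32
#include <sys/param.h>  /* MAXPATHLEN on BSD-derived and Linux systems */
#endif

/* Hypothetical helper: print one symbol's value, or note that the headers
 * included above do not define it (signalled by a negative value). */
static void report_symbol(const char *name, long value)
{
    if (value < 0)
        printf("%-16s (not defined)\n", name);
    else
        printf("%-16s %ld\n", name, value);
}

/* Hypothetical helper: report all four symbols named in the message and
 * return how many of them are defined on this platform. */
int report_path_limits(void)
{
    int defined = 0;

#ifdef MAXPATHLEN
    report_symbol("MAXPATHLEN", (long) MAXPATHLEN);
    defined++;
#else
    report_symbol("MAXPATHLEN", -1L);
#endif

#ifdef _POSIX_PATH_MAX
    report_symbol("_POSIX_PATH_MAX", (long) _POSIX_PATH_MAX);
    defined++;
#else
    report_symbol("_POSIX_PATH_MAX", -1L);
#endif

#ifdef MAX_PATH
    report_symbol("MAX_PATH", (long) MAX_PATH);
    defined++;
#else
    report_symbol("MAX_PATH", -1L);
#endif

#ifdef PATH_MAX
    report_symbol("PATH_MAX", (long) PATH_MAX);
    defined++;
#else
    report_symbol("PATH_MAX", -1L);
#endif

    return defined;
}
```

Calling report_path_limits() from a small main() prints one line per symbol; a return value below 4 means at least one of the symbols is missing from the headers included here.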
> Whilst cleaning up query-length dependencies, I noticed that our
> handling of maximum file pathname lengths is awfully messy.
>
> Different parts of the system rely on no fewer than four different
> symbols that they import from several different system header
> files (any one of which might not exist on a particular platform):
>     MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
> And on top of that, postgres.h defines MAXPGPATH which is used
> by yet other places.
>
> On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
> (a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
> is 128. So there is absolutely no consistency to the pathname length
> limits being imposed in different parts of Postgres.
>
> AFAIK, most or all flavors of Unix have kernel limits on the maximum
> length of a pathname that will be accepted by the kernel's file-access
> calls (it's 1024 on my box). So I don't feel any need to remove
> hardwired limits on pathname lengths in favor of indefinitely-expansible
> buffers. But it does seem that a little more consistency in the
> hardwired limits is called for.
>
> From the information I have, it seems that the various
> allegedly-standard #defines for max pathname length are not too standard,
> and I don't think that Postgres internal buffers ought to constrain
> path lengths to much less than the kernel limit (so using the
> seemingly "standard" _POSIX_PATH_MAX symbol would be a loser).
> So my inclination is to define MAXPGPATH as 1024 in config.h, and
> remove all uses of the other four symbols in favor of MAXPGPATH.
> That would at least provide a single point of tweaking for anyone
> who didn't like the value of 1024.
>
> Does anyone have a better idea? Is it worth trying to extract a
> system limit on pathlength during configure, rather than leaving
> MAXPGPATH as a manual configuration item --- and if so, exactly how
> should configure go about it?

I don't like the 128 or 256 numbers, but isn't there a predefined place
for this value in standard system headers?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> Does anyone have a better idea? Is it worth trying to extract a
>> system limit on pathlength during configure, rather than leaving
>> MAXPGPATH as a manual configuration item --- and if so, exactly how
>> should configure go about it?

> I don't like the 128 or 256 numbers, but isn't there a predefined place
> for this value in standard system headers?

There are too many of 'em, actually --- I had never realized this
before, but there are three or four *different* "standard" symbols that
all purport to be max pathlength. On my box they actually have three
different values, which doesn't leave a warm feeling in the stomach.

As I was just commenting off-list, we do not need to enforce the local
kernel's pathlength limit --- it's perfectly capable of doing that for
itself. All we really need to do is make sure we are not a bottleneck
preventing reasonable usage. So, although I was thinking last night
that a configure test might be a good idea, I now believe it's a waste
of cycles. (It could even be counterproductive, if it seized on a
bogusly small value, as _POSIX_PATH_MAX appears to be on both of the
systems I've checked.)

Let's just set the value at something generous like 1K and forget it.
But we should use a consistent, tweakable-in-one-place value, just in
case.

			regards, tom lane
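A rough sketch of what "consistent, tweakable in one place" could look like in practice. The value 1024 is the one proposed in the thread; the helper build_path is hypothetical and is not an actual Postgres function. The kernel still enforces its own limit when the path is used; the buffer size only has to be generous enough not to get in the way.

```c
#include <stdio.h>

/* Single point of tweaking, as proposed in the thread. In Postgres this
 * would live in config.h; it is defined inline here only for the sketch. */
#ifndef MAXPGPATH
#define MAXPGPATH 1024
#endif

/* Hypothetical helper: write "dir/file" into buf, which must have room
 * for MAXPGPATH bytes. Returns the length of the resulting path, or -1
 * if it would not fit (the result is never silently truncated). */
int build_path(char *buf, const char *dir, const char *file)
{
    int n = snprintf(buf, MAXPGPATH, "%s/%s", dir, file);

    if (n < 0 || n >= MAXPGPATH)
        return -1;
    return n;
}
```

Every caller declares its buffer as char buf[MAXPGPATH], so changing the one #define changes the limit everywhere.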

> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> Does anyone have a better idea? Is it worth trying to extract a
> >> system limit on pathlength during configure, rather than leaving
> >> MAXPGPATH as a manual configuration item --- and if so, exactly how
> >> should configure go about it?
>
> > I don't like the 128 or 256 numbers, but isn't there a predefined place
> > for this value in standard system headers?
>
> There are too many of 'em, actually --- I had never realized this
> before, but there are three or four *different* "standard" symbols that
> all purport to be max pathlength. On my box they actually have three
> different values, which doesn't leave a warm feeling in the stomach.

Couldn't we pick one of the standard ones for use in setting a value for
our own define, or at least test one of the standard ones against ours
to see that it is equal to or greater than the 1024 we chose?

--
  Bruce Momjian

This came and went already, but I did some research on it and it isn't
as bad as it seems.

On 1999-10-23, Tom Lane mentioned:

> Different parts of the system rely on no fewer than four different
> symbols that they import from several different system header
> files (any one of which might not exist on a particular platform):
>     MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
> And on top of that, postgres.h defines MAXPGPATH which is used
> by yet other places.
>
> On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
> (a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
> is 128. So there is absolutely no consistency to the pathname length
> limits being imposed in different parts of Postgres.

The Posix.1 symbol is PATH_MAX, which, in theory, describes the "uniform
system limit". The symbol _POSIX_PATH_MAX defines the minimum which
PATH_MAX is required to be on any Posix system, therefore that value
should be fixed at 255 in the whole world. (Which makes code such as
this:

    #ifndef MAXPATHLEN
    #define MAXPATHLEN _POSIX_PATH_MAX
    #endif

--from the actual source-- conceptually incorrect.)

From my linux/limits.h (which propagates through to limits.h):

    #define PATH_MAX        4095    /* # chars in a path name */

In addition there is FILENAME_MAX, which is defined even if there is,
in fact, no limit on the filename length, in which case it is set to
some really large number. (Thus it is no good for allocating fixed size
buffers.) This seems to be an ANSI C symbol for stdio sort of stuff, not
a kernel thing. (And of course in the GNU "Any Day Now" System, there is
no such limit. ;)

MAXPATHLEN is the BSD name for PATH_MAX. From my sys/param.h:

    /* BSD names for some <limits.h> values. */
    . . .
    #define MAXPATHLEN      PATH_MAX

Although this seems to be the most popular thing to use, I can hardly
see it referenced in any documentation at all on this machine.

If one wishes to be anally proper one could use pathconf() to find out
the limits on the fly as they apply to a particular file system.

Finally, the symbol MAX_PATH is not described anywhere and I didn't find
it in the source either.

Which would lead one to suggest the following as the most portable way
out:

    #if defined(PATH_MAX)
    #define MAXPGPATH PATH_MAX
    #else
    #if defined(MAXPATHLEN)
    #define MAXPGPATH MAXPATHLEN
    #else
    #define MAXPGPATH 255  /* because this is the lowest common
                              denominator on Posix systems */
    #endif
    #endif

That ought to cover all bases really. And if your system doesn't have
either Posix or BSD includes (whoops!) you can tweak it yourself. Put
that in config.h and everyone is happy.

Then again, I would be even happier if we just used PATH_MAX and didn't
invent a PostgreSQL-specific constant for everything in the world, but
I'm not sure about the Posix'ness of other systems in the crowd out
there. How about simply:

    #ifndef PATH_MAX
    #define PATH_MAX 255
    #endif

in c.h (not config.h) -- end of story. (Of course the code would
actually have to use this as well. Currently, MAXPATHLEN is most
widespread.)

	-Peter

--
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden
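For reference, the pathconf() approach mentioned above might look roughly like the sketch below. pathconf() and _PC_PATH_MAX are standard POSIX; the wrapper name path_limit_for and its fallback parameter are assumptions made for this example only.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: ask the system for the pathname length limit that
 * applies to the file system holding "dir". pathconf() returns -1 both on
 * error (errno is set) and when there is no fixed limit (errno is left
 * unchanged); in either case this sketch simply returns the caller's
 * fallback value. */
long path_limit_for(const char *dir, long fallback)
{
    long limit;

    errno = 0;
    limit = pathconf(dir, _PC_PATH_MAX);
    if (limit < 0)
        return fallback;
    return limit;
}
```

For example, path_limit_for("/", 1024) yields the limit the system reports for the root file system, or 1024 if it reports none. Since the answer is only known at run time, it cannot be used to size a fixed buffer, which is presumably why the thread sticks to compile-time constants.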

Peter Eisentraut <peter_e@gmx.net> writes:
> Which would lead one to suggest the following as portable as possible way
> out:

> #if defined(PATH_MAX)
> #define MAXPGPATH PATH_MAX
> #else
> #if defined(MAXPATHLEN)
> #define MAXPGPATH MAXPATHLEN
> #else
> #define MAXPGPATH 255  /* because this is the lowest common
>                           denominator on Posix systems */
> #endif
> #endif

I don't think this would be an improvement. The main problem with it is
that the above code could yield different values for MAXPGPATH *on the
same system* depending on which system include file(s) you had included
before reading config.h. Of course it would be a very bad thing if
different Postgres source files had different ideas about the value of
MAXPGPATH --- it could lead to different interpretations of a struct
layout, for example. (I'm not sure that we actually have any such
structs, but there's obviously potential for trouble.)

If it were really important to have MAXPGPATH exactly equal to the local
filename length limit, I'd be more interested in trying to configure it
just so. One possibility would be to have the configure script do the
equivalent of the above logic once at configure time, and then put the
nailed-down value into config.h.

But I can't see that it's worth the trouble. As long as we are not
getting in people's way with an unreasonably small limit on pathlengths,
it doesn't much matter exactly what the limit is. IMHO anyway.

However, this line of thought does lead to something that maybe we
should change: right now, most of the source files are set up as

    #include <all necessary system header files>

    #include "postgres.h"

    #include "necessary postgres headers"

where config.h is read as part of postgres.h. I wonder whether it's
such a good idea to have different source files reading different
sets of system headers before config.h. Maybe the standard order
ought to be

    #include "postgres.h"

    #include <all necessary system header files>

    #include "necessary postgres headers"

so that config.h is always read in a uniform context.

			regards, tom lane
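The include-order hazard can be reproduced within a single file. The sketch below is an illustration only: EARLY_LIMIT and LATE_LIMIT are made-up names standing in for MAXPGPATH as it would be computed by the proposed #if cascade in two source files that include their system headers in a different order. Whether the two values really differ depends on the platform and on what has been included before this code.

```c
/* Stand-in for config.h being read BEFORE <limits.h>: unless something
 * earlier has already defined PATH_MAX or MAXPATHLEN, the cascade falls
 * through to the 255 default. */
#if defined(PATH_MAX)
#define EARLY_LIMIT PATH_MAX
#elif defined(MAXPATHLEN)
#define EARLY_LIMIT MAXPATHLEN
#else
#define EARLY_LIMIT 255
#endif

#include <limits.h>
#include <stddef.h>

/* The same cascade evaluated AFTER <limits.h>: on systems whose
 * <limits.h> defines PATH_MAX, the first branch is taken this time. */
#if defined(PATH_MAX)
#define LATE_LIMIT PATH_MAX
#elif defined(MAXPATHLEN)
#define LATE_LIMIT MAXPATHLEN
#else
#define LATE_LIMIT 255
#endif

/* Two "views" of what is meant to be the same struct. If the limits
 * differ, so do the sizes, which is the layout mismatch described above. */
struct early_view { char path[EARLY_LIMIT]; };
struct late_view  { char path[LATE_LIMIT]; };

size_t early_view_size(void) { return sizeof(struct early_view); }
size_t late_view_size(void)  { return sizeof(struct late_view); }
```

Comparing early_view_size() with late_view_size() shows whether the two orderings agree on a given system. Either a fixed constant in config.h or a uniform include order removes the ambiguity.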

On 1999-11-06, Tom Lane mentioned:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > Which would lead one to suggest the following as portable as possible way
> > out:
>
> > #if defined(PATH_MAX)
> > #define MAXPGPATH PATH_MAX
> > #else
> > #if defined(MAXPATHLEN)
> > #define MAXPGPATH MAXPATHLEN
> > #else
> > #define MAXPGPATH 255  /* because this is the lowest common
> >                           denominator on Posix systems */
> > #endif
> > #endif
>
> I don't think this would be an improvement. The main problem with it is

That's why I suggested:

    #ifndef PATH_MAX
    #define PATH_MAX 255
    #endif

instead. Then remove all references to MAXPATHLEN and MAXPGPATH. That
can be done rather quickly. The above is standardized and then we'll
have a uniform limit throughout the source, that should be equal to the
actual system limit on 99% of all systems. And it makes the source
simpler along the way. As it is right now, the vast majority of files
doesn't use MAXPGPATH anyway.

Of course, this is a stupid topic to discuss, but please consider the
point.

> However, this line of thought does lead to something that maybe we
> should change: right now, most of the source files are set up as
>
> #include <all necessary system header files>
>
> #include "postgres.h"
>
> #include "necessary postgres headers"
>
> where config.h is read as part of postgres.h. I wonder whether it's
> such a good idea to have different source files reading different
> sets of system headers before config.h. Maybe the standard order
> ought to be
>
> #include "postgres.h"
>
> #include <all necessary system header files>
>
> #include "necessary postgres headers"
>
> so that config.h is always read in a uniform context.

Definitely.

--
Peter Eisentraut

Peter Eisentraut <peter_e@gmx.net> writes:
> As it is right now, the vast majority of files doesn't use
> MAXPGPATH anyway.

?? I think you are looking at out-of-date sources, because I changed
everything to use MAXPGPATH a week or two ago...

			regards, tom lane