Discussion: "Make" versus effective stack limit in regression tests
I wondered why some of the buildfarm machines were showing max_stack_depth = 100kB, and Andrew Dunstan kindly lent me the use of "dungbeetle" to check it out. What I found out:

1. max_stack_depth has the expected value (equal to ulimit -s) in any manually started postmaster. It only drops to 100kB in the "make check" environment.

2. postgres.c's getrlimit(RLIMIT_STACK) call returns the expected values in a manual start:

    rlim_cur = 10485760 rlim_max = -1

but in a "make check" run:

    rlim_cur = -1 rlim_max = -1

ie, the soft limit has been reset to RLIM_INFINITY. get_stack_depth_rlimit chooses to treat this as "unknown", resulting in setting max_stack_depth to the minimal value.

3. Further experimentation proves that "make" is resetting the limit for any program it invokes.

I couldn't reproduce this on my Fedora 13 machine, even though it is nominally running the same gmake 3.81 as dungbeetle's Fedora 6. So I took a look into Fedora git, and sure enough, there's a relevant patch there. It seems that gmake 3.81 tries to force up the RLIMIT_STACK rlim_cur to rlim_max, because it relies heavily on alloca() and so needs lots of stack space. Fedora 7 and up have patched it to restore the caller's setting before actually invoking any programs:

https://bugzilla.redhat.com/show_bug.cgi?id=214033

I haven't done the research to find out which gmake versions have this behavior or which other Linux distros are carrying similar patches, but I'm sure this explains why some of the buildfarm members report max_stack_depth = 100kB when most others don't.

Anyway, what this points up is that we are making a very conservative assumption about what to do when getrlimit() returns RLIM_INFINITY. It does not seem real reasonable to interpret that as 100kB on any modern platform. I'm inclined to interpret it as 4MB, which is the same default stack limit that we use on Windows.

Thoughts?

regards, tom lane
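For readers who want to repeat the comparison between a manual start and a "make check" run, here is a minimal sketch of a helper that reports the soft and hard stack limits of the calling process. It is not the code in postgres.c; the names format_rlimit and report_stack_rlimit are hypothetical, and it prints "unlimited" where the debugging output above shows -1 for RLIM_INFINITY.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: format one rlimit value into buf.
 * RLIM_INFINITY is rendered as "unlimited"; anything else as a byte count.
 * Returns the value returned by snprintf.
 */
int
format_rlimit(rlim_t value, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (value == RLIM_INFINITY)
        return snprintf(buf, len, "unlimited");
    return snprintf(buf, len, "%llu", (unsigned long long) value);
}

/*
 * Hypothetical helper: print the soft (rlim_cur) and hard (rlim_max)
 * stack limits of the calling process to 'out'.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
int
report_stack_rlimit(FILE *out)
{
    struct rlimit rl;
    char        cur[32];
    char        max[32];

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
    {
        fprintf(out, "getrlimit(RLIMIT_STACK) failed\n");
        return -1;
    }

    format_rlimit(rl.rlim_cur, cur, sizeof(cur));
    format_rlimit(rl.rlim_max, max, sizeof(max));
    fprintf(out, "rlim_cur = %s rlim_max = %s\n", cur, max);
    return 0;
}
```

Calling report_stack_rlimit(stdout) from a small program, once from an interactive shell and once from a makefile rule, should show whether the make in use changes the soft limit for the programs it invokes.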
On 11/05/2010 05:45 PM, Tom Lane wrote:
> Anyway, what this points up is that we are making a very conservative
> assumption about what to do when getrlimit() returns RLIM_INFINITY.
> It does not seem real reasonable to interpret that as 100kB on any
> modern platform. I'm inclined to interpret it as 4MB, which is the
> same default stack limit that we use on Windows.

+1.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> On 11/05/2010 05:45 PM, Tom Lane wrote:
>> Anyway, what this points up is that we are making a very conservative
>> assumption about what to do when getrlimit() returns RLIM_INFINITY.
>> It does not seem real reasonable to interpret that as 100kB on any
>> modern platform. I'm inclined to interpret it as 4MB, which is the
>> same default stack limit that we use on Windows.

> +1.

After looking a bit closer, I think the real problem is that get_stack_depth_rlimit's API fails to distinguish between "unknown" and "unlimited". In the first case we ought to have a conservative default, whereas in the second case not.

It's already the case that (a) max_stack_depth is a SUSET parameter, and (b) for either unknown or unlimited RLIMIT_STACK, we will let a superuser set whatever value he wants, and it's on his head whether that value is safe or not. That part of the behavior seems OK.

What's not OK is using the same built-in default value in both cases. We need to fix it so that InitializeGUCOptions can tell the difference. If it can, I think the current default of 2MB is OK --- most people will be fine with that, and those who aren't can select some other value.

regards, tom lane
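The distinction proposed above can be sketched as a three-way result instead of a single number. This is only an illustration under stated assumptions, not the actual get_stack_depth_rlimit or InitializeGUCOptions code: the enum, the function names, and the 512kB safety margin are all hypothetical, while the 100kB minimum and the 2MB default are the figures quoted in this thread.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical three-way classification of the stack rlimit. */
typedef enum
{
    STACK_LIMIT_UNKNOWN,        /* getrlimit() failed or is unavailable */
    STACK_LIMIT_UNLIMITED,      /* soft limit is RLIM_INFINITY */
    STACK_LIMIT_KNOWN           /* soft limit is a finite byte count */
} StackLimitKind;

#define MIN_DEPTH_KB      100L  /* minimal value mentioned in the thread */
#define DEFAULT_DEPTH_KB  2048L /* "current default of 2MB" */
#define SAFETY_MARGIN_KB  512L  /* assumption: headroom left below the limit */

/* Classify a getrlimit() result; *limit_kb is set only for KNOWN. */
StackLimitKind
classify_stack_limit(int rc, rlim_t soft, long *limit_kb)
{
    *limit_kb = 0;
    if (rc != 0)
        return STACK_LIMIT_UNKNOWN;
    if (soft == RLIM_INFINITY)
        return STACK_LIMIT_UNLIMITED;
    *limit_kb = (long) (soft / 1024);
    return STACK_LIMIT_KNOWN;
}

/* Pick a built-in default (in kB) that differs for unknown vs unlimited. */
long
default_max_stack_depth_kb(StackLimitKind kind, long limit_kb)
{
    long        usable;

    switch (kind)
    {
        case STACK_LIMIT_UNLIMITED:
            return DEFAULT_DEPTH_KB;
        case STACK_LIMIT_KNOWN:
            usable = limit_kb - SAFETY_MARGIN_KB;
            if (usable < MIN_DEPTH_KB)
                return MIN_DEPTH_KB;
            return (usable < DEFAULT_DEPTH_KB) ? usable : DEFAULT_DEPTH_KB;
        case STACK_LIMIT_UNKNOWN:
        default:
            return MIN_DEPTH_KB;    /* stay conservative when we cannot tell */
    }
}

/* Combine both steps for the calling process. */
long
stack_depth_default_for_this_process(void)
{
    struct rlimit rl = {0, 0};
    long        limit_kb;
    int         rc = getrlimit(RLIMIT_STACK, &rl);
    StackLimitKind kind = classify_stack_limit(rc, rl.rlim_cur, &limit_kb);

    return default_max_stack_depth_kb(kind, limit_kb);
}
```

With this shape, a process started under a make that raises the soft limit to RLIM_INFINITY would get the 2MB default, and only a genuinely unknown limit would fall back to 100kB.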
On 06.11.2010 00:39, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 11/05/2010 05:45 PM, Tom Lane wrote:
>>> Anyway, what this points up is that we are making a very conservative
>>> assumption about what to do when getrlimit() returns RLIM_INFINITY.
>>> It does not seem real reasonable to interpret that as 100kB on any
>>> modern platform. I'm inclined to interpret it as 4MB, which is the
>>> same default stack limit that we use on Windows.
>
>> +1.
>
> After looking a bit closer, I think the real problem is that
> get_stack_depth_rlimit's API fails to distinguish between "unknown" and
> "unlimited". In the first case we ought to have a conservative default,
> whereas in the second case not. It's already the case that (a)
> max_stack_depth is a SUSET parameter, and (b) for either unknown or
> unlimited RLIMIT_STACK, we will let a superuser set whatever value he
> wants, and it's on his head whether that value is safe or not. That
> part of the behavior seems OK. What's not OK is using the same
> built-in default value in both cases. We need to fix it so that
> InitializeGUCOptions can tell the difference. If it can, I think the
> current default of 2MB is OK --- most people will be fine with that,
> and those who aren't can select some other value.

Yeah, I bumped into this two years ago but it didn't lead to a patch:

http://archives.postgresql.org/pgsql-hackers/2008-07/msg00918.php

+1 on choosing 2MB for RLIM_INFINITY.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com