Discussion: Backend working directories and absolute file paths
Ciprian Popovici discovered an entirely new way to break the safety interlocks that are meant to prevent you from starting a postmaster in a data directory of the wrong version:
http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php

While one could say this is pilot error, it's still annoying that the database manages to hose itself so thoroughly.

The problem as I see it is that we address all data files (including xlog, pg_control, etc) via absolute path names, and so renaming a different data directory into place exposes its contents to being clobbered by the already-running postmaster.

What I am speculating about is:
1. At postmaster start (or standalone backend start), chdir into $PGDATA.
2. Henceforth, address everything under $PGDATA by relative paths; don't use DataDir in the path at all.

This way, if someone moves a data directory with a running postmaster in it, nothing breaks at all. It would probably run a bit faster too, since file open calls would have fewer directories to traverse through.

The only downside I can see to it is that backend and postmaster crashes would all consistently dump core into $PGDATA (on platforms where cores dump into the working directory, which is many but not all). The current arrangement makes backends dump core into the subdirectory for the database they are in, which sometimes makes it a bit easier to identify what's what. But I can't see that that's a valuable enough property to override the advantages of using relative paths.

Thoughts?

regards, tom lane
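The two-step proposal above can be illustrated with a small sketch. This is not PostgreSQL code: it is a Python stand-in that assumes a POSIX system (where a process's working directory may be renamed underneath it), and the directory names `data` and `data.old` and the function name are made up for the demonstration.

```python
import os
import shutil
import tempfile


def demo_relative_paths_survive_rename():
    """Return (relative_ok, absolute_stale) after renaming the working dir."""
    start_cwd = os.getcwd()
    root = os.path.realpath(tempfile.mkdtemp())
    old_dir = os.path.join(root, "data")      # hypothetical stand-in for $PGDATA
    new_dir = os.path.join(root, "data.old")  # where the admin moves it
    os.mkdir(old_dir)
    try:
        # Step 1: chdir into the data directory at startup.
        os.chdir(old_dir)
        # Someone renames the directory while the process is still running.
        os.rename(old_dir, new_dir)
        # Step 2: a relative path still resolves inside the moved directory,
        # because the working directory follows the directory, not its name.
        with open("pg_control", "w") as f:
            f.write("written via relative path\n")
        relative_ok = os.path.exists(os.path.join(new_dir, "pg_control"))
        # A path built from the original absolute name no longer points here.
        absolute_stale = not os.path.exists(os.path.join(old_dir, "pg_control"))
        return relative_ok, absolute_stale
    finally:
        os.chdir(start_cwd)
        shutil.rmtree(root)
```

If a different directory were renamed into place as `data`, an absolute path would write into that newcomer, which is the clobbering scenario described above; the relative path keeps writing to the directory the process started in.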
On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
> Ciprian Popovici discovered an entirely new way to break the safety interlocks that are meant to prevent you from starting a postmaster in a data directory of the wrong version:
> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php
>
> While one could say this is pilot error, it's still annoying that the database manages to hose itself so thoroughly.

There will always be a way for a user with enough knowledge to hose a database completely. I think it's significant that Mr. Popovici is the first to manage this one, in the sense that it takes an especially creative combination of a little knowledge and rushing in where angels fear to tread to reproduce the problem. There will never be a solution to human foolishness, so I say we just tell him and others like him to restore from backup and move on.

Just my $.02

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!
David Fetter <david@fetter.org> writes:
> On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
>> Ciprian Popovici discovered an entirely new way to break the safety interlocks that are meant to prevent you from starting a postmaster in a data directory of the wrong version:
>> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php
>> While one could say this is pilot error, it's still annoying that the database manages to hose itself so thoroughly.

> There will always be a way for a user with enough knowledge to hose a database completely. I think it's significant that Mr. Popovici is the first to manage this one, in the sense that it takes an especially creative combination of a little knowledge and rushing in where angels fear to tread to reproduce the problem. There will never be a solution to human foolishness, so I say we just tell him and others like him to restore from backup and move on.

Well, I'm not sure that he's the first to manage it --- he's just the first to report it in an identifiable way (which is the usual criterion for assigning credit for discoveries ;-)).

Renaming data directories around is not that uncommon, especially if you're using a platform that really really wants the active database to be /var/lib/pgsql/data (if you're running Red Hat's current selinux policy, you don't have a whole lotta choice about that). All you have to do is rename and shut down the postmaster in the wrong order, and you're hosed. (The terminating checkpoint will be able to write some files and not others, depending on what it already had open, so I think this could be a recipe for corrupting the moved-away database as well as the moved-in one :-()

Do you have a specific objection to switching over to relative paths, or are you just saying that this one report doesn't excite you enough to change it?

regards, tom lane
On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > On Thu, Jun 30, 2005 at 10:55:58AM -0400, Tom Lane wrote:
> >> Ciprian Popovici discovered an entirely new way to break the safety interlocks that are meant to prevent you from starting a postmaster in a data directory of the wrong version:
> >> http://archives.postgresql.org/pgsql-general/2005-06/msg01349.php
> >> While one could say this is pilot error, it's still annoying that the database manages to hose itself so thoroughly.
>
> > There will always be a way for a user with enough knowledge to hose a database completely. I think it's significant that Mr. Popovici is the first to manage this one, in the sense that it takes an especially creative combination of a little knowledge and rushing in where angels fear to tread to reproduce the problem. There will never be a solution to human foolishness, so I say we just tell him and others like him to restore from backup and move on.
>
> Well, I'm not sure that he's the first to manage it --- he's just the first to report it in an identifiable way (which is the usual criterion for assigning credit for discoveries ;-)).

True ;)

> Renaming data directories around is not that uncommon,

With all due respect, I believe that this falls under the category of prying off cover plates. When people do this, they're responsible for knowing what they're about, and taking the consequences if they don't.

In other words, it's pilot error, and that's Not Our Problem.

> especially if you're using a platform that really really wants the active database to be /var/lib/pgsql/data (if you're running Red Hat's current selinux policy, you don't have a whole lotta choice about that). All you have to do is rename and shut down the postmaster in the wrong order, and you're hosed. (The terminating checkpoint will be able to write some files and not others, depending on what it already had open, so I think this could be a recipe for corrupting the moved-away database as well as the moved-in one :-()
>
> Do you have a specific objection to switching over to relative paths, or are you just saying that this one report doesn't excite you enough to change it?

The latter, because I believe that this isn't a situation a reasonable person can stumble into.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!
Tom Lane wrote:
> What I am speculating about is:
> 1. At postmaster start (or standalone backend start), chdir into $PGDATA.
> 2. Henceforth, address everything under $PGDATA by relative paths; don't use DataDir in the path at all.
>
> This way, if someone moves a data directory with a running postmaster in it, nothing breaks at all. It would probably run a bit faster too, since file open calls would have fewer directories to traverse through.

Makes plenty of sense, and is a common way of working.

> The only downside I can see to it is that backend and postmaster crashes would all consistently dump core into $PGDATA (on platforms where cores dump into the working directory, which is many but not all). The current arrangement makes backends dump core into the subdirectory for the database they are in, which sometimes makes it a bit easier to identify what's what. But I can't see that that's a valuable enough property to override the advantages of using relative paths.

Maybe I have misunderstood. Could the backends not chdir into the db subdir and then do everything relative to that (using .. if necessary)?

How does this all play with tablespaces?

cheers

andrew
David Fetter wrote:
> On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
>> Renaming data directories around is not that uncommon,
>
> With all due respect, I believe that this falls under the category of prying off cover plates. When people do this, they're responsible for knowing what they're about, and taking the consequences if they don't.
>
> In other words, it's pilot error, and that's Not Our Problem.

We provide many defences against pilot error. So does the Air Force - that's part of why you see pilots wearing parachutes.

More to the point, there's not much compelling reason *not* to do this.

cheers

andrew
On Thu, Jun 30, 2005 at 02:31:01PM -0400, Andrew Dunstan wrote:
> David Fetter wrote:
> > On Thu, Jun 30, 2005 at 11:42:59AM -0400, Tom Lane wrote:
> >> Renaming data directories around is not that uncommon,
> >
> > With all due respect, I believe that this falls under the category of prying off cover plates. When people do this, they're responsible for knowing what they're about, and taking the consequences if they don't.
> >
> > In other words, it's pilot error, and that's Not Our Problem.
>
> We provide many defences against pilot error. So does the Air Force - that's part of why you see pilots wearing parachutes.
>
> More to the point, there's not much compelling reason *not* to do this.

OK, let's. I'm hesitant to talk about doing new stuff, as I'm still not qualified to do any of it, and there are things that I think are higher priority ahead of this.

Cheers,
D
--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!
Andrew Dunstan <andrew@dunslane.net> writes:
> Maybe I have misunderstood. Could the backends not chdir into the db subdir and then do everything relative to that (using .. if necessary)?

If we do that then the path to things from the postmaster is different than it is for the children, which is going to make things quite a bit more complicated (eg, md.c will have to be aware of whether it is running in a backend or the bgwriter). I'm certain we can make it work if everyplace uses the same relative paths, but I'm less certain about the reliability of using varying paths.

Also that would break setups where $PGDATA/base or one of its immediate children is a symlink. Now the need to set things up that way is certainly a lot less than it was before we had tablespaces, but I'm still inclined to avoid depending on .. for addressing stuff.

> How does this all play with tablespaces?

I don't think it matters, since we address those via pg_tblspc anyway.

regards, tom lane
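The symlink objection can be made concrete with a short sketch. It assumes a POSIX system with symlink support; the layout (`pgdata`, `otherdisk`, `db1`) and the function name are hypothetical, and the marker file is only a placeholder for something that lives at the top of the data directory.

```python
import os
import shutil
import tempfile


def demo_dotdot_through_symlink():
    """Return (landed_outside_pgdata, marker_found_via_dotdot)."""
    start_cwd = os.getcwd()
    root = os.path.realpath(tempfile.mkdtemp())
    pgdata = os.path.join(root, "pgdata")    # hypothetical data directory
    other = os.path.join(root, "otherdisk")  # hypothetical second volume
    os.makedirs(pgdata)
    os.makedirs(os.path.join(other, "base", "db1"))
    # $PGDATA/base is a symlink to a directory that lives elsewhere.
    os.symlink(os.path.join(other, "base"), os.path.join(pgdata, "base"))
    # A marker file at the top level of the data directory.
    with open(os.path.join(pgdata, "PG_VERSION"), "w") as f:
        f.write("demo\n")
    try:
        # A backend that has chdir'ed into its database subdirectory.
        os.chdir(os.path.join(pgdata, "base", "db1"))
        # ".." is resolved against the physical directory, so "../.."
        # ends up in otherdisk rather than in pgdata.
        landed_in = os.path.realpath(os.path.join("..", ".."))
        found = os.path.exists(os.path.join("..", "..", "PG_VERSION"))
        return landed_in == other, found
    finally:
        os.chdir(start_cwd)
        shutil.rmtree(root)
```

With every process sitting in the top-level directory instead, the same file is simply `PG_VERSION`, and the symlink under `base` is only ever traversed downward.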
Tom Lane <tgl@sss.pgh.pa.us> writes:
> This way, if someone moves a data directory with a running postmaster in it, nothing breaks at all. It would probably run a bit faster too, since file open calls would have fewer directories to traverse through.

On reasonable platforms the time spent traversing shouldn't be a problem -- however, if there are a lot of metadata operations happening at the same time, absolute file paths can cause contention, especially on the root and first few path elements.

> The only downside I can see to it is that backend and postmaster crashes would all consistently dump core into $PGDATA (on platforms where cores dump into the working directory, which is many but not all). The current arrangement makes backends dump core into the subdirectory for the database they are in, which sometimes makes it a bit easier to identify what's what. But I can't see that that's a valuable enough property to override the advantages of using relative paths.

Having dumps occur in per-database directories vs per-cluster directories isn't really that big a deal.

However it might be nice to have dumps go to a configurable place, even a place that can be set by a session-settable GUC. That would make debugging by non-root users feasible. (You might need a second GUC to enable this feature for security reasons though.)

There's another approach that seems more robust. When initdb is run, randomly generate a unique id. Then whenever creating files, include that unique id in the first block of the file. Whenever you open a file, sanity check the first block. If it doesn't match, PANIC immediately. (hm, actually you don't even need to PANIC, just shutting down the one backend should be enough.)

This would ensure that you don't accidentally restore the wrong files from your cold backup too. Or anything else anyone might try involving swapping files around.

--
greg
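The unique-id idea in the message above might look roughly like the following. This is only an illustrative sketch, not anything PostgreSQL does: the 16-byte id length, the `cluster_id` file name, and every function name are assumptions, and the id is stored as a short prefix rather than in a real first block.

```python
import os
import secrets
import shutil
import tempfile

ID_LEN = 16  # assumed id size for this sketch


class ClusterMismatch(Exception):
    """Raised when a file carries a different cluster's id."""


def init_cluster(datadir):
    """Generate a random id once, at (hypothetical) initdb time."""
    cluster_id = secrets.token_bytes(ID_LEN)
    os.makedirs(datadir, exist_ok=True)
    with open(os.path.join(datadir, "cluster_id"), "wb") as f:
        f.write(cluster_id)
    return cluster_id


def create_data_file(path, cluster_id, payload=b""):
    """Write the cluster id at the start of every new file."""
    with open(path, "wb") as f:
        f.write(cluster_id + payload)


def open_checked(path, cluster_id):
    """Open a file, refusing it if its leading id does not match."""
    f = open(path, "rb")
    if f.read(ID_LEN) != cluster_id:
        f.close()
        raise ClusterMismatch(path)
    return f  # positioned just after the id


def demo():
    """Return (payload read from own file, foreign file rejected?)."""
    root = tempfile.mkdtemp()
    try:
        id_a = init_cluster(os.path.join(root, "a"))
        id_b = init_cluster(os.path.join(root, "b"))
        file_a = os.path.join(root, "a", "rel1")
        file_b = os.path.join(root, "b", "rel1")
        create_data_file(file_a, id_a, b"rows")
        create_data_file(file_b, id_b, b"rows")
        with open_checked(file_a, id_a) as f:
            own_payload = f.read()
        try:
            open_checked(file_b, id_a).close()
            foreign_rejected = False
        except ClusterMismatch:
            foreign_rejected = True
        return own_payload, foreign_rejected
    finally:
        shutil.rmtree(root)
```

The extra read on every open is the overhead cost of this scheme, in exchange for catching swapped-in files regardless of how they got there.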
Greg Stark <gsstark@mit.edu> writes:
> However it might be nice to have dumps go to a configurable place.

You'd have to talk to your kernel provider about that one; we don't have any direct control over where or even whether core dumps occur.

> There's another approach that seems more robust. When initdb is run, randomly generate a unique id. Then whenever creating files, include that unique id in the first block of the file. Whenever you open a file, sanity check the first block. If it doesn't match, PANIC immediately. (hm, actually you don't even need to PANIC, just shutting down the one backend should be enough.)

This adds overhead, rather than removing it as I was hoping to do.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > However it might be nice to have dumps go to a configurable place.
>
> You'd have to talk to your kernel provider about that one; we don't have any direct control over where or even whether core dumps occur.

Well, on most platforms setting the cwd would suffice. This would also potentially allow you to control profiling output (though I suspect that gets created at fork time, which would be too late) and other such things.

For that matter, would depending on the cwd interact well with trusted PL languages that can change the cwd? Would they have to guarantee to set it back when they're done?

> This adds overhead, rather than removing it as I was hoping to do.

That's true. Hm. If the id were short it could go on every page.

--
greg
Greg Stark <gsstark@mit.edu> writes:
> For that matter, would depending on the cwd interact well with trusted PL languages that can change the cwd?

That would definitely be in the category of "don't do that" --- but there is such a long list of ways to hose your backend in a trusted PL that adding one more doesn't make me blink.

regards, tom lane
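For reference, the "set it back when they're done" discipline asked about in this exchange could be sketched as below. It is a generic POSIX-style illustration, not a PostgreSQL or PL facility, and `preserve_cwd` and `demo` are made-up names. It remembers the directory by file descriptor instead of by path, so the restore still works if the directory was renamed in the meantime.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def preserve_cwd():
    """Restore the working directory on exit, whatever the body did."""
    fd = os.open(".", os.O_RDONLY)  # handle on the current directory
    try:
        yield
    finally:
        try:
            os.fchdir(fd)  # go back by descriptor, not by name
        finally:
            os.close(fd)


def demo():
    """Return (cwd changed inside the block, cwd restored afterwards)."""
    before = os.getcwd()
    scratch = tempfile.mkdtemp()
    try:
        with preserve_cwd():
            os.chdir(scratch)  # stand-in for user code changing the cwd
            inside = os.path.realpath(os.getcwd()) == os.path.realpath(scratch)
        restored = os.getcwd() == before
    finally:
        os.rmdir(scratch)
    return inside, restored
```

A wrapper like this only helps for code that runs inside it; it does nothing about code that changes the directory and then crashes the process.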
Tom Lane wrote:
> You'd have to talk to your kernel provider about that one; we don't have any direct control over where or even whether core dumps occur.

Apache used to have (still has?) a way to configure that. I think they must have done the chdir() in the SIGSEGV handler. Not that I'm proposing we do that... ;-)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/