Обсуждение: Re: warn if GUC set to an invalid shared library
Thanks for working on this! I tried it out and it worked for me. I reviewed the patch and didn't see any problems, but I'm not much of a C programmer. On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > 0002 adds context when failing to start. > > 2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot openshared object file: No such file or directory > 2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory > 2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries" > 2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down For whatever reason, I get slightly different (and somewhat redundant) output on failing to start: 2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory 2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: No such file or directory 2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down I'm on a pretty standard Ubuntu 20.04 (with PGDG packages installed for 11, 12, and 13). I did configure to a --prefix and set LD_LIBRARY_PATH and PATH. Not sure if this is an issue in my environment or a platform difference? Also, regarding the original warning: 2022-01-08 12:57:24.953 PST [324338] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory I think this is pretty clear to users familiar with shared_preload_libraries. However, for someone less experienced with Postgres and just following instructions on how to set up, e.g., auto_explain (and making a typo), it's not clear from this message that your server will fail to start again after this if you shut it down (or it crashes!), and how to get out of this situation. Should we add a HINT to that effect? Similarly, for > 0001 adds WARNINGs when doing SET: > > postgres=# SET local_preload_libraries=xyz; > WARNING: could not load library: xyz: cannot open shared object file: No such file or directory > SET This works for me, but should this explain the impact (especially if used with something like ALTER ROLE)? I guess that's probably because the context may not be easily accessible. I think shared_preload_libraries is much more common, though, so I'm more interested in a warning there. Thanks, Maciek
On Sat, Jan 08, 2022 at 01:29:24PM -0800, Maciek Sakrejda wrote: > Thanks for working on this! I tried it out and it worked for me. I > reviewed the patch and didn't see any problems, but I'm not much of a > C programmer. Thanks for looking at it. I was just hacking on it myself. Unfortunately, the output for dlopen() is not portable, which (I think) means most of what I wrote can't be made to work.. Since it doesn't work to call dlopen() when using SET, I tried using just stat(). But that also fails on windows, since one of the regression tests has an invalid filename involving unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET cannot warn portably, unless it includes no details at all (or we specially handle the windows case), or change the pre-existing regression test. But there's a 2nd instability, too, apparently having to do with timing. So I'm planning to drop the 0001 patch. > On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > 0002 adds context when failing to start. > > > > 2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot openshared object file: No such file or directory > > 2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory > > 2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries" > > 2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down > > For whatever reason, I get slightly different (and somewhat redundant) > output on failing to start: > > 2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open sharedobject file: No such file or directory > 2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: Nosuch file or directory > 2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down I think the first WARNING is from the GUC mechanism "setting" the library. And then the FATAL is from trying to apply the GUC. It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ? $ ./tmp_install/usr/local/pgsql/bin/postgres -D src/test/regress/tmp_check/data -c shared_preload_libraries=asdf 2022-01-08 16:05:00.050 CST postmaster[2588] FATAL: could not access file "asdf": No such file or directory 2022-01-08 16:05:00.050 CST postmaster[2588] CONTEXT: while loading shared libraries for GUC "shared_preload_libraries" 2022-01-08 16:05:00.050 CST postmaster[2588] LOG: database system is shut down -- Justin
Вложения
On Sat, Jan 8, 2022 at 2:07 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > Unfortunately, the output for dlopen() is not portable, which (I think) means > most of what I wrote can't be made to work.. Since it doesn't work to call > dlopen() when using SET, I tried using just stat(). But that also fails on > windows, since one of the regression tests has an invalid filename involving > unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET > cannot warn portably, unless it includes no details at all (or we specially > handle the windows case), or change the pre-existing regression test. But > there's a 2nd instability, too, apparently having to do with timing. So I'm > planning to drop the 0001 patch. Hmm. I think 001 is a big part of the usability improvement here. Could we not at least warn generically, without relaying the underlying error? The notable thing in this situation is that the specified library could not be loaded (and that it will almost certainly cause problems on restart). The specific error would be nice to have, but it's less important. What is the timing instability? > > On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote: > > For whatever reason, I get slightly different (and somewhat redundant) > > output on failing to start: > > > > 2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open sharedobject file: No such file or directory > > 2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: Nosuch file or directory > > 2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down > > I think the first WARNING is from the GUC mechanism "setting" the library. > And then the FATAL is from trying to apply the GUC. > It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ? I still had the terminal open where I tested this, and the scrollback did show me applying the patch (and building after). I tried a make clean and applying the patch again, and I do see the CONTEXT line now. I'm not sure what the problem was but seems like PEBKAC--sorry about that. Thanks, Maciek