Thread: 7.2.1 segfaults.

7.2.1 segfaults.

From: Stephen Amadei

Hello.

I am new to the bugs list, so I hope this hasn't been covered before, but
I have a Slackware 8.0 system that is fairly up to date.  Possibly to a
fault.  Glibc is 2.2.5, gcc is 2.95.3, zlib is 1.1.4, kernel is 2.4.18
with the GRSecurity patch.

I have been trying to run Postgres 7.2.1 in a chrooted environment, but
once I try to connect to the server with a "psql -l", it segfaults as it
tries to read from the data/global/1262.  I ran Postgres out of the chroot
and with a non-GRSecurity kernel, and it still segfaults.

I saw similar problems with a few other applications that use libz, since
I upgraded to 1.1.4 for security reasons.  If I compile Postgres 7.1.3,
it runs fine... Any ideas?

                    ----Steve
Stephen Amadei
Dandy.NET!  CTO
Atlantic City, NJ

Re: 7.2.1 segfaults.

From: Tom Lane

Stephen Amadei <amadei@dandy.net> writes:
> I have been trying to run Postgres 7.2.1 in a chrooted environment, but
> once I try to connect to the server with a "psql -l", it segfaults as it
> tries to read from the data/global/1262.

Urgh.  Can you provide a stack trace?

            regards, tom lane

Re: 7.2.1 segfaults.

From: Stephen Amadei

On Fri, 3 May 2002, Tom Lane wrote:

> Stephen Amadei <amadei@dandy.net> writes:
> > I have been trying to run Postgres 7.2.1 in a chrooted environment, but
> > once I try to connect to the server with a "psql -l", it segfaults as it
> > tries to read from the data/global/1262.
>
> Urgh.  Can you provide a stack trace?

You mean using strace?  Yeah.  The strace created quite a bit of logs, but
the process that segfaulted is included below.

If you need more, let me know.

                    ----Steve
Stephen Amadei
Dandy.NET!  CTO
Atlantic City, NJ


--------------------------------------------------

close(4)                                = 0
close(5)                                = 0
close(7)                                = 0
close(8)                                = 0
getpid()                                = 5217
rt_sigaction(SIGTERM, {0x810cfe8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGQUIT, {0x810cfe8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x810cfe8, [], 0x4000000}, {SIG_IGN}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE SEGV ALRM TERM CONT UNUSED], NULL, 8) = 0
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={60, 0}}, {it_interval={0, 0}, it_value={0, 0}}) = 0
recv(9, "\0\0\1(\0\2\0\0template1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 0) = 296
getpid()                                = 5217
send(9, "R\0\0\0\0", 5, 0)              = 5
setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 0}}, {it_interval={0, 0}, it_value={60, 0}}) = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP ABRT BUS FPE SEGV CONT UNUSED], NULL, 8) = 0
write(2, "DEBUG:  connection: host=209.128"..., 69) = 69
gettimeofday({1020378083, 363259}, {240, 0}) = 0
rt_sigaction(SIGHUP, {0x810d0a0, [], SA_RESTART|0x4000000}, {0x80f4564, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGINT, {0x810cff8, [], SA_RESTART|0x4000000}, {0x80f45c8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGTERM, {0x810cf78, [], SA_RESTART|0x4000000}, {0x810cfe8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGQUIT, {0x810cf44, [], SA_RESTART|0x4000000}, {0x810cfe8, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGALRM, {0x8108ef0, [], 0x4000000}, {0x810cfe8, [], 0x4000000}, 8) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGUSR1, {SIG_IGN}, {0x80f5480, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGUSR2, {0x80a8294, [], SA_RESTART|0x4000000}, {0x80f554c, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGFPE, {0x810d088, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGCHLD, {SIG_DFL}, {0x80f4808, [], SA_RESTART|0x4000000}, 8) = 0
rt_sigaction(SIGTTIN, {SIG_DFL}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGTTOU, {SIG_DFL}, {SIG_IGN}, 8) = 0
rt_sigaction(SIGCONT, {SIG_DFL}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGWINCH, {SIG_DFL}, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE SEGV CONT UNUSED], NULL, 8) = 0
fcntl(2, F_GETFD)                       = 0
brk(0x820a000)                          = 0x820a000
brk(0x820d000)                          = 0x820d000
brk(0x8214000)                          = 0x8214000
open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
--- SIGSEGV (Segmentation fault) ---

Re: 7.2.1 segfaults.

From: Tom Lane

Stephen Amadei <amadei@dandy.net> writes:
>> Urgh.  Can you provide a stack trace?

> You mean using strace?  Yeah.  The strace created quite a bit of logs, but
> the process that segfaulted is included below.

> open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
> read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
> --- SIGSEGV (Segmentation fault) ---

Hmm, this does not square with your prior statement that it's a chroot
can't-call-/bin/cp issue.  Would you set things up to allow a core dump
(ie, not ulimit -c 0) and then do "gdb postgres-executable corefile"
followed by "bt"?

            regards, tom lane
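
For readers following along: "not ulimit -c 0" means the core-file size limit (RLIMIT_CORE) of the process must be nonzero, which from a shell is usually arranged with "ulimit -c unlimited" before starting the postmaster. The same limit can be raised from C with the POSIX getrlimit/setrlimit calls. The sketch below is only an illustration; the helper name enable_core_dumps is hypothetical and is not part of PostgreSQL.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file limit to the hard limit.
 * Returns 0 on success, -1 if getrlimit() or setrlimit() fails. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit is itself 0, this call succeeds but still no core file will be written; the hard limit then has to be raised by whoever controls it.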

Re: 7.2.1 segfaults.

From: Stephen Amadei

On Fri, 3 May 2002, Tom Lane wrote:

> Stephen Amadei <amadei@dandy.net> writes:
> >> Urgh.  Can you provide a stack trace?
>
> > You mean using strace?  Yeah.  The strace created quite a bit of logs, but
> > the process that segfaulted is included below.
>
> > open("/usr/local/pgsql/data/global/1262", O_RDONLY) = 4
> > read(4, "\0\0\0\0\f\222\20\0\7\0\0\0\34\0\244\37\0 \0 \244\37\0"..., 8192) = 8192
> > --- SIGSEGV (Segmentation fault) ---
>
> Hmm, this does not square with your prior statement that it's a chroot
> can't-call-/bin/cp issue.

It's not.  I don't mean to confuse the two separate problems; that's why
I made two threads.  In order to be sure that neither GRSecurity nor the
chroot was causing the segfault, I disabled these features and ran
postmaster as a normal user would... but I still connected via TCP/IP.

> Would you set things up to allow a core dump
> (ie, not ulimit -c 0) and then do "gdb postgres-executable corefile"
> followed by "bt"?

Uh... sure.   This will take a moment.

O.K... I think I have the info.

#0 0x255843 in strncpy (s1=0xbfffead0 "n\013", s2=0x8213414 "n\013",
   n=4294967292) at ../sysdeps/generic/strncpy.c:82
#1 0x81516ab in GetRawDatabaseInfo ()
#2 0x81511fb in InitPostgres ()

I am not very familiar with gdb, so I only vaguely know what this shows,
besides the stack.  And in the above info, the 'n' in "n\013" actually has a
little '~' above it, but I figured that character might get mangled by the
email.

                    ----Steve
Stephen Amadei
Dandy.NET!  CTO
Atlantic City, NJ

Re: 7.2.1 segfaults.

From: Tom Lane

Stephen Amadei <amadei@dandy.net> writes:
> #0 0x255843 in strncpy (s1=0xbfffead0 "n\013", s2=0x8213414 "n\013",
>    n=4294967292) at ../sysdeps/generic/strncpy.c:82
> #1 0x81516ab in GetRawDatabaseInfo ()
> #2 0x81511fb in InitPostgres ()

Hmm.  It looks like GetRawDatabaseInfo is reading a zero for the VARSIZE
of datpath, and then computing -4 (which strncpy will take as a huge
unsigned value) as the string length to copy.  You could try applying
a patch like this, in src/backend/utils/misc/database.c (about line
225 in current sources):

                /* Found it; extract the OID and the database path. */
                *db_id = tup.t_data->t_oid;
                pathlen = VARSIZE(&(tup_db->datpath)) - VARHDRSZ;
+               if (pathlen < 0)
+                   pathlen = 0;                /* pure paranoia */
                if (pathlen >= MAXPGPATH)
                    pathlen = MAXPGPATH - 1;    /* pure paranoia */
                strncpy(path, VARDATA(&(tup_db->datpath)), pathlen);
                path[pathlen] = '\0';

However this really shouldn't be needed; I'm wondering whether the
database's row in pg_database has been clobbered somehow.  If so,
it probably won't get much further before dying.

Two questions: does the same thing happen for all available databases?
Have you tried to create a database with a nonstandard location
(nondefault datpath)?

            regards, tom lane
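
A minimal sketch of the failure mode described above, using stand-in constants rather than the real PostgreSQL headers: DEMO_VARHDRSZ and DEMO_MAXPGPATH are assumed substitutes for VARHDRSZ and MAXPGPATH, and all function names are illustrative only. A stored size of zero yields a length of -4; once converted to size_t for strncpy, that becomes a huge value (4294967292 where size_t is 32 bits, matching the backtrace). Clamping before the copy avoids the overrun.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_VARHDRSZ   4      /* assumed header size, stand-in for VARHDRSZ */
#define DEMO_MAXPGPATH  1024   /* assumed buffer size, stand-in for MAXPGPATH */

/* What strncpy() receives when it is handed a negative int length. */
size_t length_as_size_t(int len)
{
    return (size_t) len;
}

/* Derive the payload length from a stored total size,
 * clamped to the range [0, DEMO_MAXPGPATH - 1]. */
int clamped_pathlen(int stored_size)
{
    int pathlen = stored_size - DEMO_VARHDRSZ;

    if (pathlen < 0)
        pathlen = 0;                    /* corrupt or zero stored size */
    if (pathlen >= DEMO_MAXPGPATH)
        pathlen = DEMO_MAXPGPATH - 1;   /* leave room for the terminator */
    return pathlen;
}

/* Copy the payload into path (at least DEMO_MAXPGPATH bytes) and terminate it. */
void copy_path(char *path, const char *data, int stored_size)
{
    int pathlen = clamped_pathlen(stored_size);

    strncpy(path, data, (size_t) pathlen);
    path[pathlen] = '\0';
}
```

With a stored size of 0, clamped_pathlen returns 0 and copy_path produces an empty string instead of crashing. As the message notes, this only masks the symptom if the underlying pg_database row is corrupt.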

Re: 7.2.1 segfaults.

From: Stephen Amadei

On Sat, 4 May 2002, Tom Lane wrote:

> Hmm.  It looks like GetRawDatabaseInfo is reading a zero for the VARSIZE
> of datpath, and then computing -4 (which strncpy will take as a huge
> unsigned value) as the string length to copy.  You could try applying
> a patch like this, in src/backend/utils/misc/database.c (about line
> 225 in current sources):

Weird.

> However this really shouldn't be needed; I'm wondering whether the
> database's row in pg_database has been clobbered somehow.  If so,
> it probably won't get much further before dying.

Good point.  And deleting the $PGDATA directory and recreating it
fixed it without a patch.

> Two questions: does the same thing happen for all available databases?
> Have you tried to create a database with a nonstandard location
> (nondefault datpath)?

No... I was creating the database in /usr/local/pgsql/data and then
'cp -aRp'ing it into the chroot.  So I had two copies of the same corrupt
database.

Thanks for the help.

                    ----Steve
Stephen Amadei
Dandy.NET!  CTO
Atlantic City, NJ