Обсуждение: Compiling 6.4 on NetBSD-current/pc532

Поиск
Список
Период
Сортировка

Compiling 6.4 on NetBSD-current/pc532

От
Jon Buller
Дата:
OK,

I have a current version of PostgreSQL running on my pc532 now.
(It's a NS32K based machine.  Somewhat of an antique really...)

I had to apply the following diffs to get it to compile, but it
then passed the spinlock test, and most of the regression tests,
including the int8 test.  It did not, however, get the datetime
test correct.

For some reason which I haven't figured out yet, it thinks the
following:

template1=> select ('epoch'::datetime) as ZeroSecs;
zerosecs
----------------------------
Fri Dec 31 18:00:00 1999 CST
(1 row)

template1=> select ('current'::datetime) as ZeroSecs;
zerosecs
----------------------------
Fri Dec 31 18:00:00 1999 CST
(1 row)

template1=> select ('now'::datetime) as ZeroSecs;
zerosecs
----------------------------
Sun Sep 13 19:05:43 1998 CDT
(1 row)


So it knows how to get the date and time, but not always...  I'd
think this was machine independent, but then I'd expect a MI problem
like that to get fixed very quickly.  So I don't know if it's a
NetBSD/pc532 problem, a NetBSD problem, or a PostgreSQL problem,
but I suspect the first due to a lack of people screaming.

I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough
(yet?) to properly deduce anything yet.  It appeared to all be
correct until it printed the results, implying that libc or a
syscall was returning some funny constant perhaps?

Well, with all those disclaimers, here's the patch.  (Beware, I
think I have it reversed, so you probably want to type patch -r...)


*** /usr/local/pgsql/src/include/storage/s_lock.h    Fri Sep 11 19:00:55 1998
--- s_lock.h    Sat Sep 12 00:27:51 1998
***************
*** 213,219 ****
  #endif     /* NEED_I386_TAS_ASM */


! /* NS32K code is in s_lock.c */

  #endif     /* defined(__GNUC__) */

--- 213,234 ----
  #endif     /* NEED_I386_TAS_ASM */


!
! #if defined(NEED_NS32K_TAS_ASM)
!
! #define S_LOCK(lock)                \
! {                        \
!     slock_t res = 1;                \
!     while (res) {                \
!       __asm__("movqd 0, r0");            \
!       __asm__("sbitd r0, %0" : "=m"(*lock));    \
!       __asm__("sprb us, %0" : "=r" (res));    \
!       res = ((res >> 5) & 1);            \
!     }                        \
! }
!
! #endif     /* NEED_NS32K_TAS_ASM */
!

  #endif     /* defined(__GNUC__) */

*** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c    Thu Sep 10 23:08:00 1998
--- s_lock.c    Sat Sep 12 00:23:04 1998
***************
*** 118,134 ****
  #endif     /* PPC */


- #if defined(__ns32k__)
- int
- tas(volatile slock_t *lock)
- {
-   int res;
-   __asm__("sbitb 0, %0" : "=m"(*lock));
-   __asm__("sprb us, %0" : "=r"(res));
-   res = (res >> 5) & 1;
-   return res;
- }
- #endif

  #else                            /* defined(__GNUC__) */
  /***************************************************************************
--- 118,123 ----


BTW, does the spinlock code need a TAS function so it can spin for
a while and then declare itself stuck, or does a second process/thread
take care of that.  It would be simpler for the NS32K to just make
the whole lock function be 2 lines of inline assembler, but that
would contain an infinite loop if the lock was stuck...

Jon Buller  <jonb@metronet.com>

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
"Thomas G. Lockhart"
Дата:
> I have a current version of PostgreSQL running on my pc532 now.
> (It's a NS32K based machine.  Somewhat of an antique really...)
> It did not, however, get the datetime test correct.
> ... I suspect (a NetBSD/pc532 problem) due to a lack of people
> screaming.

Screaming means you get to help do development :)

> I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough
> (yet?) to properly deduce anything yet.  It appeared to all be
> correct until it printed the results, implying that libc or a
> syscall was returning some funny constant perhaps?

Ah. It's slowly coming back to me, so here are some suggestions:

"epoch" and "current" are stored internally in the database so that the
support code can pull one back out and say "oh! that's supposed to be
'epoch'", for example. So, I used _very_ small floating point numbers to
represent those special constants (numbers much smaller than one could
get by doing the usual arithmetic, at least under normal circumstances).

I'll bet that your machine is somehow pushing those numbers to be
exactly zero, which corresponds to Y2K (with the timezone offset, that
is what you are seeing). So, try looking at the numbers, and try seeing
what they are being set to. In src/include/utils/dt.h, you will see that
current and epoch are being set to DBL_MIN and -DBL_MIN, respectively.
Make sure that these are not identical to zero (they are something like
1e-308 on my machine). You can change dt.h to make these some other
usually unique number, say +/-1e-20 for now.

And track down why DBL_MIN on your machine isn't the smallest allowed
double-precision number...

                    - Tom

Re: [HACKERS] Also on SVR4 (Compiling 6.4 on NetBSD-current/pc532)

От
Frank Ridderbusch
Дата:
Thomas G. Lockhart writes:
 > > I have a current version of PostgreSQL running on my pc532 now.
 > > (It's a NS32K based machine.  Somewhat of an antique really...)
 > > It did not, however, get the datetime test correct.
 > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people
 > > screaming.
 >
 > Screaming means you get to help do development :)
 >
 > > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough
 > > (yet?) to properly deduce anything yet.  It appeared to all be
 > > correct until it printed the results, implying that libc or a
 > > syscall was returning some funny constant perhaps?
 >
 > Ah. It's slowly coming back to me, so here are some suggestions:
 >
 > "epoch" and "current" are stored internally in the database so that the
 > support code can pull one back out and say "oh! that's supposed to be
 > 'epoch'", for example. So, I used _very_ small floating point numbers to
 > represent those special constants (numbers much smaller than one could
 > get by doing the usual arithmetic, at least under normal circumstances).
 >
 > I'll bet that your machine is somehow pushing those numbers to be
 > exactly zero, which corresponds to Y2K (with the timezone offset, that
 > is what you are seeing). So, try looking at the numbers, and try seeing
 > what they are being set to. In src/include/utils/dt.h, you will see that
 > current and epoch are being set to DBL_MIN and -DBL_MIN, respectively.
 > Make sure that these are not identical to zero (they are something like
 > 1e-308 on my machine). You can change dt.h to make these some other
 > usually unique number, say +/-1e-20 for now.
 >
 > And track down why DBL_MIN on your machine isn't the smallest allowed
 > double-precision number...
 >
 >                     - Tom

Well, finally somebody, who sees the same problems as me. I had the
same problems with 6.3, which I didn't pursue due to lack off time.

However, now I retried with the 6.4 snapshot from Sep. 11. I compiled
under Siemens SINIX (Mips based SVR4) and the system cc.

If I issue the following sql statement, the backend dies with a core.

  regression=> select ('now'::datetime - 'current'::datetime) as "Zero";

I guess the following result is ok.

  regression=> select ('current'::datetime) as "Zero";
  Zero
  -------
  current
  (1 row)

I guess, somehow the applying math operators to datetime causes the
core dumps.

Here is the GDB stack trace from the core. (Only dt.c

$ gdb ../../../bin/postgres core
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15.1 (mips-sni-sysv4), Copyright 1995 Free Software Foundation, Inc...
Core was generated by `/local/pgsql-6.4/bin/postmaster -i -p 2222 '.
Program terminated with signal 10, Bus error.
Reading symbols from /lib/libnsl.so...done.
Reading symbols from /lib/libsocket.so...done.
Reading symbols from /lib/libdl.so...done.
Reading symbols from /lib/libmutex.so...done.
Reading symbols from /usr/lib/libc.so.1...done.
(gdb) where
#0  0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314,
    tzp=0x5db6d4, result=0x542658) at dt.c:2517
#1  0x50e8cc in SetDateTime (dt=2.2250738585072022e-308) at dt.c:346
                                ^^^^^^^^^^^^^^^^^^^^^^^
                this is DBL_MIN

#2  0x510884 in datetime_mi (datetime1=0x60d370, datetime2=0x60d3d0)
    at dt.c:783
#3  0x5420b0 in fmgr_c ()
#4  0x46cf98 in ExecMakeFunctionResult ()
#5  0x46d034 in ExecEvalOper ()
#6  0x46d418 in ExecEvalExpr ()
#7  0x46d838 in ExecTargetList ()
#8  0x46dbf8 in ExecProject ()
#9  0x476e14 in ExecResult ()
#10 0x46b32c in ExecProcNode ()
#11 0x46a3e0 in ExecutePlan ()
#12 0x469ab0 in ExecutorRun ()
#13 0x4fc9dc in ProcessQueryDesc ()
#14 0x4fcabc in ProcessQuery ()
#15 0x4faa10 in pg_exec_query_dest ()
#16 0x4fa7dc in pg_exec_query ()
#17 0x4fc0a8 in PostgresMain ()
#18 0x4d6a64 in DoBackend ()
#19 0x4d6260 in BackendStartup ()
#20 0x4d5050 in ServerLoop ()
#21 0x4d4840 in PostmasterMain ()
#22 0x4875c0 in main ()
#23 0x418004 in __start ()
(gdb)

Does this give you some additional hints?

Oh, one more thing. With 6.3, when psql lost connection with the
backend, I got an error message saying so. With 6.4 it psql seems to
be simply hanging.
--
MfG/Regards
--
     /====                         Siemens Nixdorf Informationssysteme AG
    /    Ridderbusch        / ,   Abt.: OEC XS QM4
   /                       /./   Heinz Nixdorf Ring
  /=== /,== ,===/  /,==,  //    33106 Paderborn, Germany
 /    //   /   /  //   / / \   Tel.: (49) 5251-8-15211
/    /     `==/\ /    / /   \ Email: ridderbusch.pad@sni.de

Since I have taken all the Gates out of my computer, it finally works!!

Re: [HACKERS] Also on SVR4 (Compiling 6.4 on NetBSD-current/pc532)

От
"Thomas G. Lockhart"
Дата:
>  > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people
>  > > screaming.
>  > Screaming means you get to help do development :)
> Well, finally somebody, who sees the same problems as me. I had the
> same problems with 6.3, which I didn't pursue due to lack off time.

Frank, do I hear you screaming? :)

> However, now I retried with the 6.4 snapshot from Sep. 11. I compiled
> under Siemens SINIX (Mips based SVR4) and the system cc.
> If I issue the following sql statement, the backend dies with a core.
>   regression=> select ('now'::datetime - 'current'::datetime);
> I guess the following result is ok.
>   regression=> select ('current'::datetime) as "Zero";

Yes.

> I guess, somehow the applying math operators to datetime causes the
> core dumps.

No. (or, not likely...)

> Here is the GDB stack trace from the core. (Only dt.c ...
> (gdb) where
> #0  0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314,
>     tzp=0x5db6d4, result=0x542658) at dt.c:2517

Here is the call in the code:

        tm2datetime(&tt, 0, NULL, &dt);

Note that fsec is not the zero-value it should be, and that tzp is not
null (zero usually) as it should be. So, somehow your system is not
converting the second argument zero in the call to
  (double) 0.0e0
and that is also screwing up the alignment for the next argument.

Here is the declaration for tm2datetime() (in include/utils/dt.h):

extern int  tm2datetime(struct tm * tm, double fsec,
                        int *tzp, DateTime *dt);

which is pretty clear on what the arguments should be. So, you can
probably change/coerce the call inside SetDateTime() to make sure that
the second argument is a double zero:

        tm2datetime(&tt, 0.0e0, NULL, &dt);

but I'm not sure why your system would be ignoring the declaration for
the routine. Are there some switches you can set on your compiler so it
does the right thing? Something like

  --stop-annoying-frank
or
  --actually-use-standard-C-conventions

?? ;-)

> Oh, one more thing. With 6.3, when psql lost connection with the
> backend, I got an error message saying so. With 6.4 it psql seems to
> be simply hanging.

One thing at a time. Although I find that mine core dumps about as well
as ever...

                        - Tom

Re: [HACKERS] Also on SVR4 (Compiling 6.4 on NetBSD-current/pc532)

От
Tom Lane
Дата:
Frank Ridderbusch <ridderbusch.pad@sni.de> writes:
> Oh, one more thing. With 6.3, when psql lost connection with the
> backend, I got an error message saying so. With 6.4 it psql seems to
> be simply hanging.

Drat, I thought I fixed that.  I better make sure all the libpq patches
I sent in have been applied...

What point is psql "hanging" at, exactly?  Does it happen when you next
try to send a command, or when?  Are you using a TCP or Unix-socket
connection to the backend?

            regards, tom lane

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
Jon Buller
Дата:
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> wrote:

> > I have a current version of PostgreSQL running on my pc532 now.
> > (It's a NS32K based machine.  Somewhat of an antique really...)
> > It did not, however, get the datetime test correct.
> > ... I suspect (a NetBSD/pc532 problem) due to a lack of people
> > screaming.
>
> Screaming means you get to help do development :)

Doh! opened my mouth one too many times... 8-)

> > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough
> > (yet?) to properly deduce anything yet.  It appeared to all be
> > correct until it printed the results, implying that libc or a
> > syscall was returning some funny constant perhaps?
>
> Ah. It's slowly coming back to me, so here are some suggestions:
>
> "epoch" and "current" are stored internally in the database so that the
> support code can pull one back out and say "oh! that's supposed to be
> 'epoch'", for example. So, I used _very_ small floating point numbers to
> represent those special constants (numbers much smaller than one could
> get by doing the usual arithmetic, at least under normal circumstances).
>
> I'll bet that your machine is somehow pushing those numbers to be
> exactly zero, which corresponds to Y2K (with the timezone offset, that
> is what you are seeing). So, try looking at the numbers, and try seeing
> what they are being set to. In src/include/utils/dt.h, you will see that
> current and epoch are being set to DBL_MIN and -DBL_MIN, respectively.
> Make sure that these are not identical to zero (they are something like
> 1e-308 on my machine). You can change dt.h to make these some other
> usually unique number, say +/-1e-20 for now.
>
> And track down why DBL_MIN on your machine isn't the smallest allowed
> double-precision number...

I think I saw PostgreSQL say "0" in some of those debug messages,
I'll recompile tonight using some number that's a bit bigger and
see what happens.  I'm not totally sure that's the problem though.
(It probably used to be, since the NS32K floating point hardware
has no concept of a denormalized number, except that it can trap
to some software routine to take care of doing the real work for
it.  Ian Dall wrote all that last year, bless his heart.)

However, I just ran this test code that turns up some problems with
the libc routines, but conclusively shows that the number is not read
correctly with scanf, but is not zero either...

#include <stdio.h>
#define DBL_MIN         2.2250738585072014E-308
void main () {
  double x;
  scanf ("%g", &x);
  printf ("%g   %g %d %d\n", DBL_MIN, x, DBL_MIN == x, x == 0.0);
}

It spits out the following (notice that I cut and pasted the DBL_MIN value):
a.out
2.2250738585072014E-308
2.22507e-308   6.79039e-313 0 0

I'll chase this further with Phil Nelson the NetBSD/pc532 port-master.
Thanks everyone for all your help,

Jon Buller

Re: [HACKERS] Found it. Also on SVR4 (Compiling 6.4 on NetBSD...

От
Frank Ridderbusch
Дата:
Thanks to your suggestion, the '0' is indeed the problem. Apparently a
compiler problem, since the function prototypes are not correctly
coerced. Prefixing the 0 with (double) did it.

 > Here is the call in the code:
 >
 >         tm2datetime(&tt, 0, NULL, &dt);
 >

Thanx again.

Thomas G. Lockhart writes:
 > >  > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people
 > >  > > screaming.
 > >  > Screaming means you get to help do development :)
 > > Well, finally somebody, who sees the same problems as me. I had the
 > > same problems with 6.3, which I didn't pursue due to lack off time.
 >
 > Frank, do I hear you screaming? :)
 >
 > > However, now I retried with the 6.4 snapshot from Sep. 11. I compiled
 > > under Siemens SINIX (Mips based SVR4) and the system cc.
 > > If I issue the following sql statement, the backend dies with a core.
 > >   regression=> select ('now'::datetime - 'current'::datetime);
 > > I guess the following result is ok.
 > >   regression=> select ('current'::datetime) as "Zero";
 >
 > Yes.
 >
 > > I guess, somehow the applying math operators to datetime causes the
 > > core dumps.
 >
 > No. (or, not likely...)
 >
 > > Here is the GDB stack trace from the core. (Only dt.c ...
 > > (gdb) where
 > > #0  0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314,
 > >     tzp=0x5db6d4, result=0x542658) at dt.c:2517
 >
 > Here is the call in the code:
 >
 >         tm2datetime(&tt, 0, NULL, &dt);
 >
 > Note that fsec is not the zero-value it should be, and that tzp is not
 > null (zero usually) as it should be. So, somehow your system is not
 > converting the second argument zero in the call to
 >   (double) 0.0e0
 > and that is also screwing up the alignment for the next argument.
 >
 > Here is the declaration for tm2datetime() (in include/utils/dt.h):
 >
 > extern int  tm2datetime(struct tm * tm, double fsec,
 >                         int *tzp, DateTime *dt);
 >
 > which is pretty clear on what the arguments should be. So, you can
 > probably change/coerce the call inside SetDateTime() to make sure that
 > the second argument is a double zero:
 >
 >         tm2datetime(&tt, 0.0e0, NULL, &dt);
 >
 > but I'm not sure why your system would be ignoring the declaration for
 > the routine. Are there some switches you can set on your compiler so it
 > does the right thing? Something like
 >
 >   --stop-annoying-frank
 > or
 >   --actually-use-standard-C-conventions
 >
 > ?? ;-)
 >
 > > Oh, one more thing. With 6.3, when psql lost connection with the
 > > backend, I got an error message saying so. With 6.4 it psql seems to
 > > be simply hanging.
 >
 > One thing at a time. Although I find that mine core dumps about as well
 > as ever...
 >
 >                         - Tom

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
Bruce Momjian
Дата:
Applied.

> OK,
>
> I have a current version of PostgreSQL running on my pc532 now.
> (It's a NS32K based machine.  Somewhat of an antique really...)
>
> I had to apply the following diffs to get it to compile, but it
> then passed the spinlock test, and most of the regression tests,
> including the int8 test.  It did not, however, get the datetime
> test correct.
>
> For some reason which I haven't figured out yet, it thinks the
> following:
>
> template1=> select ('epoch'::datetime) as ZeroSecs;
> zerosecs
> ----------------------------
> Fri Dec 31 18:00:00 1999 CST
> (1 row)
>
> template1=> select ('current'::datetime) as ZeroSecs;
> zerosecs
> ----------------------------
> Fri Dec 31 18:00:00 1999 CST
> (1 row)
>
> template1=> select ('now'::datetime) as ZeroSecs;
> zerosecs
> ----------------------------
> Sun Sep 13 19:05:43 1998 CDT
> (1 row)
>
>
> So it knows how to get the date and time, but not always...  I'd
> think this was machine independent, but then I'd expect a MI problem
> like that to get fixed very quickly.  So I don't know if it's a
> NetBSD/pc532 problem, a NetBSD problem, or a PostgreSQL problem,
> but I suspect the first due to a lack of people screaming.
>
> I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough
> (yet?) to properly deduce anything yet.  It appeared to all be
> correct until it printed the results, implying that libc or a
> syscall was returning some funny constant perhaps?
>
> Well, with all those disclaimers, here's the patch.  (Beware, I
> think I have it reversed, so you probably want to type patch -r...)
>
>
> *** /usr/local/pgsql/src/include/storage/s_lock.h    Fri Sep 11 19:00:55 1998
> --- s_lock.h    Sat Sep 12 00:27:51 1998
> ***************
> *** 213,219 ****
>   #endif     /* NEED_I386_TAS_ASM */
>
>
> ! /* NS32K code is in s_lock.c */
>
>   #endif     /* defined(__GNUC__) */
>
> --- 213,234 ----
>   #endif     /* NEED_I386_TAS_ASM */
>
>
> !
> ! #if defined(NEED_NS32K_TAS_ASM)
> !
> ! #define S_LOCK(lock)                \
> ! {                        \
> !     slock_t res = 1;                \
> !     while (res) {                \
> !       __asm__("movqd 0, r0");            \
> !       __asm__("sbitd r0, %0" : "=m"(*lock));    \
> !       __asm__("sprb us, %0" : "=r" (res));    \
> !       res = ((res >> 5) & 1);            \
> !     }                        \
> ! }
> !
> ! #endif     /* NEED_NS32K_TAS_ASM */
> !
>
>   #endif     /* defined(__GNUC__) */
>
> *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c    Thu Sep 10 23:08:00 1998
> --- s_lock.c    Sat Sep 12 00:23:04 1998
> ***************
> *** 118,134 ****
>   #endif     /* PPC */
>
>
> - #if defined(__ns32k__)
> - int
> - tas(volatile slock_t *lock)
> - {
> -   int res;
> -   __asm__("sbitb 0, %0" : "=m"(*lock));
> -   __asm__("sprb us, %0" : "=r"(res));
> -   res = (res >> 5) & 1;
> -   return res;
> - }
> - #endif
>
>   #else                            /* defined(__GNUC__) */
>   /***************************************************************************
> --- 118,123 ----
>
>
> BTW, does the spinlock code need a TAS function so it can spin for
> a while and then declare itself stuck, or does a second process/thread
> take care of that.  It would be simpler for the NS32K to just make
> the whole lock function be 2 lines of inline assembler, but that
> would contain an infinite loop if the lock was stuck...
>
> Jon Buller  <jonb@metronet.com>
>
>


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
http://www.op.net/~candle              |  (610) 353-9879(w)
  +  If your life is a hard drive,     |  (610) 853-3000(h)
  +  Christ can be your backup.        |

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
dg@informix.com (David Gould)
Дата:
>
> Applied.
>
> > OK,
> >
> > I have a current version of PostgreSQL running on my pc532 now.
> > (It's a NS32K based machine.  Somewhat of an antique really...)
> >
> >
> > *** /usr/local/pgsql/src/include/storage/s_lock.h    Fri Sep 11 19:00:55 1998
> > --- s_lock.h    Sat Sep 12 00:27:51 1998
> > ***************
> > *** 213,219 ****
> >   #endif     /* NEED_I386_TAS_ASM */
> >
> >
> > ! /* NS32K code is in s_lock.c */
> >
> >   #endif     /* defined(__GNUC__) */
> >
> > --- 213,234 ----
> >   #endif     /* NEED_I386_TAS_ASM */
> >
> >
> > !
> > ! #if defined(NEED_NS32K_TAS_ASM)
> > !
> > ! #define S_LOCK(lock)                \
> > ! {                        \
> > !     slock_t res = 1;                \
> > !     while (res) {                \
> > !       __asm__("movqd 0, r0");            \
> > !       __asm__("sbitd r0, %0" : "=m"(*lock));    \
> > !       __asm__("sprb us, %0" : "=r" (res));    \
> > !       res = ((res >> 5) & 1);            \
> > !     }                        \
> > ! }
> > !
> > ! #endif     /* NEED_NS32K_TAS_ASM */
> > !
> >
> >   #endif     /* defined(__GNUC__) */
> >
> > *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c    Thu Sep 10 23:08:00 1998
> > --- s_lock.c    Sat Sep 12 00:23:04 1998
> > ***************
> > *** 118,134 ****
> >   #endif     /* PPC */
> >
> >
> > - #if defined(__ns32k__)
> > - int
> > - tas(volatile slock_t *lock)
> > - {
> > -   int res;
> > -   __asm__("sbitb 0, %0" : "=m"(*lock));
> > -   __asm__("sprb us, %0" : "=r"(res));
> > -   res = (res >> 5) & 1;
> > -   return res;
> > - }
> > - #endif
> >
> >   #else                            /* defined(__GNUC__) */
> >   /***************************************************************************
> > --- 118,123 ----
> >
> >
> > BTW, does the spinlock code need a TAS function so it can spin for
> > a while and then declare itself stuck, or does a second process/thread
> > take care of that.  It would be simpler for the NS32K to just make
> > the whole lock function be 2 lines of inline assembler, but that
> > would contain an infinite loop if the lock was stuck...
> >
> > Jon Buller  <jonb@metronet.com>

I wish I had noticed this before Bruce applied it.

The TAS function is needed so that stuck spinlocks can be recovered from.
Also, it enables the pseudo random back off which helps performance when
there are many backends.

In any case, this patch does not "follow the one true path" that I tried
to outline in s_lock.c and s_lock.h. In fact it is exactly backwards.

Basically the preferred way is:

 - in s_lock.h do nothing, the defaults should take care of you.

  -in s_lock.c define a TAS function that sets the spinlock and returns the
   previous state of the lock.

I see from your asm()s that you are using gcc. In this case, your TAS function
should be called tas(), and should be defined inside the __GNUC__ section.

-dg

David Gould            dg@informix.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
 - If simplicity worked, the world would be overrun with insects. -

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
Bruce Momjian
Дата:
> >
> > Applied.
> >
> > > OK,
> > >
> > > I have a current version of PostgreSQL running on my pc532 now.
> > > (It's a NS32K based machine.  Somewhat of an antique really...)
> > >
> > >
> > > *** /usr/local/pgsql/src/include/storage/s_lock.h    Fri Sep 11 19:00:55 1998
> > > --- s_lock.h    Sat Sep 12 00:27:51 1998
> > > ***************
> > > *** 213,219 ****
> > >   #endif     /* NEED_I386_TAS_ASM */
> > >
> > >
> > > ! /* NS32K code is in s_lock.c */
> > >
> > >   #endif     /* defined(__GNUC__) */
> > >
> > > --- 213,234 ----
> > >   #endif     /* NEED_I386_TAS_ASM */
> > >
> > >
> > > !
> > > ! #if defined(NEED_NS32K_TAS_ASM)
> > > !
> > > ! #define S_LOCK(lock)                \
> > > ! {                        \
> > > !     slock_t res = 1;                \
> > > !     while (res) {                \
> > > !       __asm__("movqd 0, r0");            \
> > > !       __asm__("sbitd r0, %0" : "=m"(*lock));    \
> > > !       __asm__("sprb us, %0" : "=r" (res));    \
> > > !       res = ((res >> 5) & 1);            \
> > > !     }                        \
> > > ! }
> > > !
> > > ! #endif     /* NEED_NS32K_TAS_ASM */
> > > !
> > >
> > >   #endif     /* defined(__GNUC__) */
> > >
> > > *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c    Thu Sep 10 23:08:00 1998
> > > --- s_lock.c    Sat Sep 12 00:23:04 1998
> > > ***************
> > > *** 118,134 ****
> > >   #endif     /* PPC */
> > >
> > >
> > > - #if defined(__ns32k__)
> > > - int
> > > - tas(volatile slock_t *lock)
> > > - {
> > > -   int res;
> > > -   __asm__("sbitb 0, %0" : "=m"(*lock));
> > > -   __asm__("sprb us, %0" : "=r"(res));
> > > -   res = (res >> 5) & 1;
> > > -   return res;
> > > - }
> > > - #endif
> > >
> > >   #else                            /* defined(__GNUC__) */
> > >   /***************************************************************************
> > > --- 118,123 ----
> > >
> > >
> > > BTW, does the spinlock code need a TAS function so it can spin for
> > > a while and then declare itself stuck, or does a second process/thread
> > > take care of that.  It would be simpler for the NS32K to just make
> > > the whole lock function be 2 lines of inline assembler, but that
> > > would contain an infinite loop if the lock was stuck...
> > >
> > > Jon Buller  <jonb@metronet.com>
>
> I wish I had noticed this before Bruce applied it.
>
> The TAS function is needed so that stuck spinlocks can be recovered from.
> Also, it enables the pseudo random back off which helps performance when
> there are many backends.
>
> In any case, this patch does not "follow the one true path" that I tried
> to outline in s_lock.c and s_lock.h. In fact it is exactly backwards.
>
> Basically the preferred way is:
>
>  - in s_lock.h do nothing, the defaults should take care of you.
>
>   -in s_lock.c define a TAS function that sets the spinlock and returns the
>    previous state of the lock.
>
> I see from your asm()s that you are using gcc. In this case, your TAS function
> should be called tas(), and should be defined inside the __GNUC__ section.

The patch had been on the list for a while, so I figured it was safe.
I will back it out, and the user will have to study what is already
there and resubmit a new patch.

I belive the problem is that 6.3 users are trying to get their platforms
into the new locking code, and naturally using their old code to do it.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
http://www.op.net/~candle              |  (610) 353-9879(w)
  +  If your life is a hard drive,     |  (610) 853-3000(h)
  +  Christ can be your backup.        |

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
Jon Buller
Дата:
dg@informix.com (David Gould) writes:

[ reversed, and possibly bad NS32K patch removed from quoted material. ]

> I wish I had noticed this before Bruce applied it.

Me too, I wrote a quick hack that worked, and I always like people
who know what's really going on to double check things when possible.

> The TAS function is needed so that stuck spinlocks can be recovered from.
> Also, it enables the pseudo random back off which helps performance when
> there are many backends.

I suspected as much, which is why I wrote a tas function to fix
the link errors, rather than try to remove the code that called
tas.

> In any case, this patch does not "follow the one true path" that I tried
> to outline in s_lock.c and s_lock.h. In fact it is exactly backwards.
>
> Basically the preferred way is:
>
>  - in s_lock.h do nothing, the defaults should take care of you.
>
>   -in s_lock.c define a TAS function that sets the spinlock and returns the
>    previous state of the lock.
>
> I see from your asm()s that you are using gcc. In this case, your TAS function
> should be called tas(), and should be defined inside the __GNUC__ section.

Your quote removes the comment that says something to the effect
of:  I think I built a bogus patch, you might need patch -r to make
it work because it's backwards. In fact you might need more than
that to get the right pathnames in the right places.  (My quoting
removes the whole patch... 8-)  I built the patch by hand, since
cvs patch spewed out a whole bunch of stuff in files I know I never
touched.  I figured they were altered in the build process or
something...

However, what I did on my machine was to remove a #define S_LOCK ...
from s_lock.h and add function tas in s_lock.c.  It wouldn't link
the way it came out of cvs checkout, but it would with the changes
I just described.

I suspect Bruce got it right, otherwise hed have a handful of
garbled code.  I think it *might* be like you describe it should
be, but please double check it, I will.  (Like I said above, the
more the better, since I don't really know what I'm doing, just
making some guesses.)

Re: [HACKERS] Compiling 6.4 on NetBSD-current/pc532

От
dg@informix.com (David Gould)
Дата:
>
> dg@informix.com (David Gould) writes:
>
> [ reversed, and possibly bad NS32K patch removed from quoted material. ]
>
> > I wish I had noticed this before Bruce applied it.
>
> Me too, I wrote a quick hack that worked, and I always like people
> who know what's really going on to double check things when possible.
>
> > The TAS function is needed so that stuck spinlocks can be recovered from.
> > Also, it enables the pseudo random back off which helps performance when
> > there are many backends.
>
> I suspected as much, which is why I wrote a tas function to fix
> the link errors, rather than try to remove the code that called
> tas.
>
> > In any case, this patch does not "follow the one true path" that I tried
> > to outline in s_lock.c and s_lock.h. In fact it is exactly backwards.
> >
> > Basically the preferred way is:
> >
> >  - in s_lock.h do nothing, the defaults should take care of you.
> >
> >   -in s_lock.c define a TAS function that sets the spinlock and returns the
> >    previous state of the lock.
> >
> > I see from your asm()s that you are using gcc. In this case, your TAS function
> > should be called tas(), and should be defined inside the __GNUC__ section.
>
> Your quote removes the comment that says something to the effect
> of:  I think I built a bogus patch, you might need patch -r to make
> it work because it's backwards. In fact you might need more than
> that to get the right pathnames in the right places.  (My quoting
> removes the whole patch... 8-)  I built the patch by hand, since
> cvs patch spewed out a whole bunch of stuff in files I know I never
> touched.  I figured they were altered in the build process or
> something...
>
> However, what I did on my machine was to remove a #define S_LOCK ...
> from s_lock.h and add function tas in s_lock.c.  It wouldn't link
> the way it came out of cvs checkout, but it would with the changes
> I just described.
>
> I suspect Bruce got it right, otherwise hed have a handful of
> garbled code.  I think it *might* be like you describe it should
> be, but please double check it, I will.  (Like I said above, the
> more the better, since I don't really know what I'm doing, just
> making some guesses.)

I completely missed the part about the patch being reversed... duh.

Ok, so if what you did is undefine the S_LOCK for your platform (to select
the default), and add a tas() function to s_lock.c then that is exactly
what was supposed to happen and I was just confused by the reversed patch.

Since Bruce un-applied your change, the simplest way to fix things from here
might be for you to submit a new (un-reversed this time) patch and after
Bruce applies it I will pull it down and verify it.

Sorry for the confusion.

-dg

David Gould            dg@informix.com           510.628.3783 or 510.305.9468
Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
 - If simplicity worked, the world would be overrun with insects. -