Обсуждение: Inconsistent shift operator

Поиск
Список
Период
Сортировка

Inconsistent shift operator

От
Roman Kononov
Дата:
The below test cases show the obvious inconsistency between different
integer types.

Regards,

Roman

test=# \t
Showing only tuples.
test=# select 1::int2 << 17;
         0

test=# select 1::int4 << 33;
         2

test=# select 1::int8 << 65;
         2

test=# select 2::int2 >> 17;
         0

test=# select 2::int4 >> 33;
         1

test=# select 2::int8 >> 65;
         1

Re: Inconsistent shift operator

От
Tom Lane
Дата:
Roman Kononov <kononov@dls.net> writes:
> The below test cases show the obvious inconsistency between different
> integer types.

[ shrug... ]  The << and >> operators just expose the behavior of the
local C compiler's shift operators, and it's clearly stated in the C
spec that shifting by more than the word width produces unspecified
results.  I don't see a bug here, any more than it's a bug that the
floating point operators expose machine-specific behavior.

            regards, tom lane

Re: Inconsistent shift operator

От
Sam Mason
Дата:
On Sat, Apr 19, 2008 at 11:26:57AM -0400, Tom Lane wrote:
> Roman Kononov <kononov@dls.net> writes:
> > The below test cases show the obvious inconsistency between different
> > integer types.
>
> [ shrug... ]  The << and >> operators just expose the behavior of the
> local C compiler's shift operators, and it's clearly stated in the C
> spec that shifting by more than the word width produces unspecified
> results.

I thought that getting the correct answer was more important than
getting the "wrong" answer quickly.  The current code also introduces
its own set of strangeness in the 16 bit case on my x86_64 box.  It does
something closer to:

  int16 shl (int val, int n) {
    n %= 32;
    return n < 16 ? val << n : 0;
  }

This is exactly what the code says, it's just the resulting semantics
aren't very nice for the user.

Wouldn't it be easy to put some code like this in:

  if (arg2 < 16)
    return PG_RETURN_INT16(arg1 << arg2);
  return PG_RETURN_INT16(0);

People would have one less platform specific weirdo to worry about.

As an aside, I thought it would be interesting to see what MySQL did and
it seems to check for and handle this case--albeit only the 64bit case,
but as far as I can tell it only really knows about "long long" ints.


  Sam

Re: Inconsistent shift operator

От
Tom Lane
Дата:
Sam Mason <sam@samason.me.uk> writes:
> Wouldn't it be easy to put some code like this in:

>   if (arg2 < 16)
>     return PG_RETURN_INT16(arg1 << arg2);
>   return PG_RETURN_INT16(0);

This is a straw man.  It covers *one* of the behaviors left undefined
by the C standard.  I quote from C99:

       [#3] The integer promotions are performed  on  each  of  the
       operands.   The  type  of the result is that of the promoted
       left operand.  If the value of the right operand is negative
       or  is  greater  than  or equal to the width of the promoted
       left operand, the behavior is undefined.

       [#4] The result of E1  <<  E2  is  E1  left-shifted  E2  bit
       positions; vacated bits are filled with zeros.  If E1 has an
       unsigned type, the value of the result  is  E1<<E2,  reduced
       modulo  one more than the maximum value representable in the
       result type.  If E1 has a signed type and nonnegative value,
       and E1<<E2 is representable in the result type, then that is
       the resulting value; otherwise, the behavior is undefined.

       [#5] The result of E1 >>  E2  is  E1  right-shifted  E2  bit
       positions.  If E1 has an unsigned type or if E1 has a signed
       type and a nonnegative value, the value of the result is the
       integral part of the quotient of E1 divided by the quantity,
       2 raised to the power E2.  If E1 has a  signed  type  and  a
       negative  value,  the  resulting  value  is  implementation-
       defined.

We are shifting signed types so we are exposed to every one of these
unspecified behaviors.  In particular, since we don't know whether the
behavior of >> will be zero-fill or sign-fill, it's not going to be
exactly trivial to make a consistent extension of it to shift values
greater than the word width.

By the time you get done replicating all this for the int2, int4,
and int8 shift operators, it's not looking like such a small patch
anymore.  And I still find the premise entirely unconvincing.
Maybe the user *wants* to see the local behavior of shift, whatever
it might be.  It's certainly not impossible that we'd break applications
that worked fine before (at least on the hardware they were being
used on).

            regards, tom lane

Re: Inconsistent shift operator

От
Peter Eisentraut
Дата:
Tom Lane wrote:
> And I still find the premise entirely unconvincing.
> Maybe the user *wants* to see the local behavior of shift, whatever
> it might be.  It's certainly not impossible that we'd break applications
> that worked fine before (at least on the hardware they were being
> used on).

Certainly someone who uses shift operators is more prone to accept
architecture-dependent behaviors.  Those who like to take a more abstract
view can use regular arithmetic.

Re: Inconsistent shift operator

От
Sam Mason
Дата:
On Sun, Apr 20, 2008 at 12:27:38PM -0400, Tom Lane wrote:
> Sam Mason <sam@samason.me.uk> writes:
> > Wouldn't it be easy to put some code like this in:
> >   if (arg2 < 16)
> >     return PG_RETURN_INT16(arg1 << arg2);
> >   return PG_RETURN_INT16(0);
>
> This is a straw man.  It covers *one* of the behaviors left undefined
> by the C standard.  I quote from C99:

 [...]

wow, I never realised how little semantics C actually defines.  I'm a
fan of formally defined semantics and the above just seems like a big
cop-out.  The case of E1 being negative seems completely implementation
defined!

> We are shifting signed types so we are exposed to every one of these
> unspecified behaviors.  In particular, since we don't know whether the
> behavior of >> will be zero-fill or sign-fill, it's not going to be
> exactly trivial to make a consistent extension of it to shift values
> greater than the word width.

About the only reasonable thing I can think of that would remain within
C's spec (at least with respect of shifting) would be to always treat
E1 as an unsigned value.  This would allow it to be used consistently
with the other bit-wise operators, but would cause any non-bitwise
interpretation of the result to become implementation defined.  Not very
nice.

> By the time you get done replicating all this for the int2, int4,
> and int8 shift operators, it's not looking like such a small patch
> anymore.  And I still find the premise entirely unconvincing.

After an afternoon of getting utterly bogged down looking into what
other languages do and getting very distracted with ring theory I'm
tempted to reluctantly agree.  Maybe the warning about floats could be
extended to cover the shift operators as well.  Maybe:

  The bit shift operators only return consistent results when both the
  RHS is within the bit-size of the arguments' data type and the LHS is
  positive, in all other cases the behaviour is platform specific.

I think that's technically correct, but seems to come across as very
pessimistic.

> Maybe the user *wants* to see the local behavior of shift, whatever
> it might be.  It's certainly not impossible that we'd break applications
> that worked fine before (at least on the hardware they were being
> used on).

Yes.  It always surprises me how hard getting formal semantics into an
existing language seems to be.


  Sam