Обсуждение: Modulus operator returns negative values / numeric division rounds up sometimes
Modulus operator returns negative values / numeric division rounds up sometimes
От
Paul Tillotson
Дата:
Alvaro Herrera (and others) have noticed that the numeric modulus operator sometimes gives negative results even when thedividend and divisor are positive: > Oh, and while at it, it would be nice to solve the modulo bug that still lurks there: alvherre=# select 12345678901234567890 % 123; ?column? ---------- -45 (1 fila) alvherre=# select 12345678901234567890 % 123::numeric(4,1); ?column? ---------- 78.0 (1 fila) alvherre=# select 12345678901234567890 % 123::numeric(3,0); ?column? ---------- -45 (1 fila) alvherre=# select version(); version ---------------------------------------------------------------------------------------------- PostgreSQL 8.1devel on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.6 (Debian 1:3.3.6-4) (1 fila) This "bug" happens happens because modulus(a, b) is calculated as a - (truncate(a / b) * b). However, since the divisionoperator does its own rounding, then sometimes a / b has already been rounded up, such that truncate(a / b) * b isgreater than a, thus giving the negative modulus observed. Now, Alvaro called it a "bug," but the code comments appear to indicate that this is a feature: /* * div_var() - * * Division on variable level. Quotient of var1 / var2 is stored * in result. Result is ROUNDED to no more than rscale fractional digits. */ (From /src/backend/utils/adt/numeric.c Capitalization added.) It looks like the "bug" can be easily fixed by changing the end of div_var where it says round_var(result, rscale); to trunc_var(result, scale); I did so myself and recompiled, producing these results: paul=# select 12345678901234567890 % 123; ?column? ---------- 78 (1 row) paul=# select version(); version ------------------------------------------------------------------------------------------------------------ PostgreSQL 8.0.0rc1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.2 20031022 (Red Hat Linux 3.3.2-1) (1 row) I don't think anyone wants to defend the negative modulus as such, but to fix it, we have to do one of these: (1) Keep rounding division, but rewrite the numeric modulus operator to use a form of division that always rounds towardszero. This violates the identity ((a / b) * b) + (a % b) = a, where (a / b) is coerced to an integer. (Logically, this correspondsto saying that a % b is the remainder obtained when dividing a / b.) This would probably be preferred by people who use numeric division as a sort of more-precise-than-double floating pointnumber, as they want division to round to the nearest integer rather than to towards zero. or (2) Give up rounding division in favor of truncating towards zero. This would probably be preferred by people who think of numeric as a very large integer. Consider: paul=# select 9 / 5; ?column? ---------- 1 (1 row) paul=# select 9 % 5; ?column? ---------- 4 (1 row) Also, here is what you get with bc and python: bc 1.06 Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc. This is free software with ABSOLUTELY NO WARRANTY. For details type `warranty'. >>> 12345678901234567890 % 123 78 >>> 12345678901234567890 / 123 100371373180768844 >>> scale=2 >>> 12345678901234567890 / 123 100371373180768844.63 And with python's built-in multi-precision library: Python 2.2.3 (#1, Oct 15 2003, 23:33:35) [GCC 3.3.1 20030930 (Red Hat Linux 3.3.1-6)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> 12345678901234567890 % 123 78L >>> To me, having numeric division (at least numerics with no fractional part) behave like integer division seems more useful. Thoughts, anyone? Regards, Paul Tillotson
Paul Tillotson <pntil@shentel.net> writes: > I don't think anyone wants to defend the negative modulus as such, but to fix it, we have to do one of these: > (1) Keep rounding division, but rewrite the numeric modulus operator to use a form of division that always rounds towardszero. > or > (2) Give up rounding division in favor of truncating towards zero. or (3) increase the calculation precision (rscale), as suggested by Alvaro's message. Possibly that cannot work, but I haven't seen a proof. > It looks like the "bug" can be easily fixed by changing the end of div_var where it says > round_var(result, rscale); > to > trunc_var(result, scale); I cannot believe that that won't create problems at least as bad as it solves. Have you even tried the regression tests on this? regards, tom lane
Tom Lane wrote: >Paul Tillotson <pntil@shentel.net> writes: > > >>I don't think anyone wants to defend the negative modulus as such, but to fix it, we have to do one of these: >> >> > > > >>(1) Keep rounding division, but rewrite the numeric modulus operator to use a form of division that always rounds towardszero. >> >> > > > >>or >> >> > > > >>(2) Give up rounding division in favor of truncating towards zero. >> >> > >or (3) increase the calculation precision (rscale), as suggested by >Alvaro's message. > >Possibly that cannot work, but I haven't seen a proof. > > > I don't think that will work. Before switching round_var() to trunc_var() at the end of div_var(), I tried recompiling it to say div_var(var1, var2, &tmp, rscale + 1); instead of div_var(var1, var2, &tmp, rscale); Around line 4129 in mod_var(). (Which would perform the division with one extra decimal place when calculating a modulus.) It fixed the case that Alvaro used as a test, but I was still able to get a negative modulus by trying other values. I think that adding digits to rscale will cause the negative modulus to become more rare, but it will always be possible to do get it. For example, 123456789012345678980 / 123 is 100371373180768844.6341 (rounded to 4 decimal places.) If you divide with no extra decimal places you get 45 at the tens' and ones' digits. If you divide with one extra decimal place, you get 44.6, which is truncated to 44. But suppose that dividing that gave you 100371373180768844.999997 In that case, you would need to work it to at least 6 extra places before truncation would actually give you the expected 44 rather than 45, because even when working it to 5 decimal places, the carry propagation would eventually carry into the ones digit, changing the 4 to a 5. In other words, no arbitrary number of extra decimal places when calling div_var() will be always sufficient to prevent rounding up at some other decimal place. >>It looks like the "bug" can be easily fixed by changing the end of div_var where it says >> round_var(result, rscale); >>to >> trunc_var(result, scale); >> >> > >I cannot believe that that won't create problems at least as bad as it >solves. Have you even tried the regression tests on this? > > regards, tom lane > > > > <sheepish grin> No. Can you tell me how to do that? Paul
On Thu, May 26, 2005 at 08:56:34AM -0400, Paul Tillotson wrote: > Tom Lane wrote: > > >Paul Tillotson <pntil@shentel.net> writes: > In other words, no arbitrary number of extra decimal places when calling > div_var() will be always sufficient to prevent rounding up at some other > decimal place. No, an arbitrary number won't do. I found I could make it work by adding as much extra decimals as digits in the divisor. At least it worked for these test cases I made up. (Attached) > >I cannot believe that that won't create problems at least as bad as it > >solves. Have you even tried the regression tests on this? > <sheepish grin> No. Can you tell me how to do that? make installcheck in src/test/regress -- Alvaro Herrera (<alvherre[a]surnet.cl>) "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en La Feria de las Tinieblas, R. Bradbury)