Обсуждение: tighten input to float4/float8/oid

Поиск
Список
Период
Сортировка

tighten input to float4/float8/oid

От
Neil Conway
Дата:
This patch makes the following changes

- emit a warning when an empty string is input to the float4, float8,
and oid types, per earlier discussion on -hackers
- emit a warning when there is trailing whitespace in the input to the
oid type, per suggestion from Tom (BTW, I couldn't see anything about
this in the standard, although I may have been looking in the wrong
place; in any case, rejecting trailing whitespace seems clearly
correct behavior to me)
- update the regression tests

The first two changes are intended as a temporary measures for
backward compatibility: 7.5 will emit warnings, but 7.6 will reject
this input. That should hopefully avoid a similar situation to the
pg_atoi() change.

I invented a new SQLSTATE code for these warnings,
ERRCODE_WARNING_DEPRECATED_FEATURE. Did I do this correctly?

-Neil

Index: doc/src/sgml/errcodes.sgml
===================================================================
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/errcodes.sgml,v
retrieving revision 1.2
diff -c -r1.2 errcodes.sgml
*** a/doc/src/sgml/errcodes.sgml    29 Nov 2003 19:51:37 -0000    1.2
--- b/doc/src/sgml/errcodes.sgml    4 Mar 2004 02:05:49 -0000
***************
*** 95,100 ****
--- 95,104 ----
  <entry>WARNING STRING DATA RIGHT TRUNCATION</entry>
  </row>

+ <row>
+ <entry><literal>01P01</literal></entry>
+ <entry>WARNING DEPRECATED FEATURE</entry>
+ </row>

  <row>
  <entry>Class 02</entry>
Index: src/backend/utils/adt/float.c
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/backend/utils/adt/float.c,v
retrieving revision 1.96
diff -c -r1.96 float.c
*** a/src/backend/utils/adt/float.c    29 Nov 2003 19:51:58 -0000    1.96
--- b/src/backend/utils/adt/float.c    4 Mar 2004 00:31:59 -0000
***************
*** 209,214 ****
--- 209,227 ----
      }

      /*
+      * In releases prior to 7.5, we accepted an empty string as valid
+      * input (yielding a float4 of 0). In 7.5, we accept empty
+      * strings, but emit a warning noting that the feature is
+      * deprecated. In 7.6+, the warning should be replaced by an error.
+      */
+     if (num == endptr)
+         ereport(WARNING,
+                 (errcode(ERRCODE_WARNING_DEPRECATED_FEATURE),
+                  errmsg("deprecated input syntax for type real: \"\""),
+                  errdetail("this input will be rejected in "
+                            "a future release of PostgreSQL")));
+
+     /*
       * if we get here, we have a legal double, still need to check to see
       * if it's a legal float
       */
***************
*** 309,314 ****
--- 322,340 ----
                       errmsg("\"%s\" is out of range for type double precision", num)));
      }

+     /*
+      * In releases prior to 7.5, we accepted an empty string as valid
+      * input (yielding a float8 of 0). In 7.5, we accept empty
+      * strings, but emit a warning noting that the feature is
+      * deprecated. In 7.6+, the warning should be replaced by an error.
+      */
+     if (num == endptr)
+         ereport(WARNING,
+                 (errcode(ERRCODE_WARNING_DEPRECATED_FEATURE),
+                  errmsg("deprecated input syntax for type double precision: \"\""),
+                  errdetail("this input will be rejected in "
+                            "a future release of PostgreSQL")));
+
      CheckFloat8Val(val);

      PG_RETURN_FLOAT8(val);
Index: src/backend/utils/adt/oid.c
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/backend/utils/adt/oid.c,v
retrieving revision 1.54
diff -c -r1.54 oid.c
*** a/src/backend/utils/adt/oid.c    18 Feb 2004 00:01:34 -0000    1.54
--- b/src/backend/utils/adt/oid.c    4 Mar 2004 00:56:48 -0000
***************
*** 40,54 ****
       * strtoul() normally only sets ERANGE.  On some systems it also may
       * set EINVAL, which simply means it couldn't parse the input string.
       * This is handled by the second "if" consistent across platforms.
-      * Note that for historical reasons we accept an empty string as
-      * meaning 0.
       */
      if (errno && errno != ERANGE && errno != EINVAL)
          ereport(ERROR,
                  (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                   errmsg("invalid input syntax for type oid: \"%s\"",
                          s)));
!     if (endptr == s && *endptr)
          ereport(ERROR,
                  (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                   errmsg("invalid input syntax for type oid: \"%s\"",
--- 40,66 ----
       * strtoul() normally only sets ERANGE.  On some systems it also may
       * set EINVAL, which simply means it couldn't parse the input string.
       * This is handled by the second "if" consistent across platforms.
       */
      if (errno && errno != ERANGE && errno != EINVAL)
          ereport(ERROR,
                  (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                   errmsg("invalid input syntax for type oid: \"%s\"",
                          s)));
!
!     /*
!      * In releases prior to 7.5, we accepted an empty string as valid
!      * input (yielding an OID of 0). In 7.5, we accept empty strings,
!      * but emit a warning noting that the feature is deprecated. In
!      * 7.6+, the warning should be replaced by an error.
!      */
!     if (*s == '\0')
!         ereport(WARNING,
!                 (errcode(ERRCODE_WARNING_DEPRECATED_FEATURE),
!                  errmsg("deprecated input syntax for type oid: \"\""),
!                  errdetail("this input will be rejected in "
!                            "a future release of PostgreSQL")));
!
!     if (endptr == s && *s)
          ereport(ERROR,
                  (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                   errmsg("invalid input syntax for type oid: \"%s\"",
***************
*** 66,79 ****
      }
      else
      {
!         /* allow only whitespace after number */
          while (*endptr && isspace((unsigned char) *endptr))
              endptr++;
          if (*endptr)
              ereport(ERROR,
                      (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                       errmsg("invalid input syntax for type oid: \"%s\"",
                              s)));
      }

      result = (Oid) cvt;
--- 78,111 ----
      }
      else
      {
!         bool seen_whitespace = false;
!         /* allow only whitespace after number; see below */
          while (*endptr && isspace((unsigned char) *endptr))
+         {
              endptr++;
+             seen_whitespace = true;
+         }
+
          if (*endptr)
              ereport(ERROR,
                      (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
                       errmsg("invalid input syntax for type oid: \"%s\"",
                              s)));
+
+         /*
+          * In releases prior to 7.5, we allowed any number of trailing
+          * whitespace characters following the numeric portion of the
+          * OID. In 7.5, we still accept this kind of input, but issue
+          * a warning that it is deprecated. In 7.6+, we will issue an
+          * error rather than a warning.
+          */
+         if (seen_whitespace)
+             ereport(WARNING,
+                     (errcode(ERRCODE_WARNING_DEPRECATED_FEATURE),
+                      errmsg("deprecated input syntax for type oid: \"%s\"", s),
+                      errhint("trailing whitespace in this context has been deprecated"),
+                      errdetail("this input will be rejected in "
+                                "a future release of PostgreSQL")));
      }

      result = (Oid) cvt;
Index: src/include/utils/errcodes.h
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/include/utils/errcodes.h,v
retrieving revision 1.7
diff -c -r1.7 errcodes.h
*** a/src/include/utils/errcodes.h    29 Nov 2003 22:41:15 -0000    1.7
--- b/src/include/utils/errcodes.h    4 Mar 2004 02:05:04 -0000
***************
*** 44,49 ****
--- 44,50 ----
  #define ERRCODE_WARNING_IMPLICIT_ZERO_BIT_PADDING    MAKE_SQLSTATE('0','1', '0','0','8')
  #define ERRCODE_WARNING_NULL_VALUE_ELIMINATED_IN_SET_FUNCTION    MAKE_SQLSTATE('0','1', '0','0','3')
  #define ERRCODE_WARNING_STRING_DATA_RIGHT_TRUNCATION    MAKE_SQLSTATE('0','1', '0','0','4')
+ #define ERRCODE_WARNING_DEPRECATED_FEATURE    MAKE_SQLSTATE('0','1', 'P','0','1')

  /* Class 02 - No Data --- this is also a warning class per SQL99 */
  /* (do not use this class for failure conditions!) */
Index: src/test/regress/expected/oid.out
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/test/regress/expected/oid.out,v
retrieving revision 1.8
diff -c -r1.8 oid.out
*** a/src/test/regress/expected/oid.out    18 Feb 2004 00:01:34 -0000    1.8
--- b/src/test/regress/expected/oid.out    4 Mar 2004 00:35:57 -0000
***************
*** 7,13 ****
  INSERT INTO OID_TBL(f1) VALUES ('987');
  INSERT INTO OID_TBL(f1) VALUES ('-1040');
  INSERT INTO OID_TBL(f1) VALUES ('99999999');
- INSERT INTO OID_TBL(f1) VALUES ('');
  -- bad inputs
  INSERT INTO OID_TBL(f1) VALUES ('asdfasd');
  ERROR:  invalid input syntax for type oid: "asdfasd"
--- 7,12 ----
***************
*** 21,28 ****
       |        987
       | 4294966256
       |   99999999
!      |          0
! (6 rows)

  SELECT '' AS one, o.* FROM OID_TBL o WHERE o.f1 = 1234;
   one |  f1
--- 20,26 ----
       |        987
       | 4294966256
       |   99999999
! (5 rows)

  SELECT '' AS one, o.* FROM OID_TBL o WHERE o.f1 = 1234;
   one |  f1
***************
*** 37,59 ****
        |        987
        | 4294966256
        |   99999999
!       |          0
! (5 rows)

  SELECT '' AS three, o.* FROM OID_TBL o WHERE o.f1 <= '1234';
   three |  f1
  -------+------
         | 1234
         |  987
!        |    0
! (3 rows)

  SELECT '' AS two, o.* FROM OID_TBL o WHERE o.f1 < '1234';
   two | f1
  -----+-----
       | 987
!      |   0
! (2 rows)

  SELECT '' AS four, o.* FROM OID_TBL o WHERE o.f1 >= '1234';
   four |     f1
--- 35,54 ----
        |        987
        | 4294966256
        |   99999999
! (4 rows)

  SELECT '' AS three, o.* FROM OID_TBL o WHERE o.f1 <= '1234';
   three |  f1
  -------+------
         | 1234
         |  987
! (2 rows)

  SELECT '' AS two, o.* FROM OID_TBL o WHERE o.f1 < '1234';
   two | f1
  -----+-----
       | 987
! (1 row)

  SELECT '' AS four, o.* FROM OID_TBL o WHERE o.f1 >= '1234';
   four |     f1
Index: src/test/regress/sql/oid.sql
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/test/regress/sql/oid.sql,v
retrieving revision 1.4
diff -c -r1.4 oid.sql
*** a/src/test/regress/sql/oid.sql    21 Nov 2000 03:23:21 -0000    1.4
--- b/src/test/regress/sql/oid.sql    4 Mar 2004 00:33:10 -0000
***************
*** 14,21 ****

  INSERT INTO OID_TBL(f1) VALUES ('99999999');

- INSERT INTO OID_TBL(f1) VALUES ('');
-
  -- bad inputs

  INSERT INTO OID_TBL(f1) VALUES ('asdfasd');
--- 14,19 ----

Re: tighten input to float4/float8/oid

От
Tom Lane
Дата:
Neil Conway <neilc@samurai.com> writes:
> - emit a warning when there is trailing whitespace in the input to the
> oid type, per suggestion from Tom (BTW, I couldn't see anything about
> this in the standard, although I may have been looking in the wrong
> place; in any case, rejecting trailing whitespace seems clearly
> correct behavior to me)

I think this is wrong.  We silently accept leading whitespace in
(IIRC) all the numeric datatypes, and I believe we should accept
trailing whitespace too.  The SQL spec does not speak anywhere that
I can see to the external representation of datatypes, but it has
plenty to say about the behavior of casts, and it seems reasonable
to me to consider the spec's description of casting to and from
character-string data as normative for I/O.  (Especially since we
tend to use the same code for both purposes.)  In SQL92 section
6.10 <cast specification> I find

         3) If TD is exact numeric, then
            ...
            b) If SD is character string, then SV is replaced by SV with any
              leading or trailing <space>s removed.
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

              Case:

              i) If SV does not comprise a <signed numeric literal> as
                 defined by the rules for <literal> in Subclause 5.3,
                 "<literal>", then an exception condition is raised: data
                 exception-invalid character value for cast.

             ii) Otherwise, let LT be that <signed numeric literal>. The
                 <cast specification> is equivalent to

                   CAST ( LT AS TD )

This looks to me like it's perfectly clear that the input can have
leading or trailing spaces, but not leading or trailing non-space
junk.

The spec means <space> to indicate the space character only, so
allowing other whitespace characters (as the existing code with
isspace() does) is perhaps a bit arguable.  I think it's reasonable
to allow any whitespace, though.

> +                  errdetail("this input will be rejected in "
> +                            "a future release of PostgreSQL")));

Minor stylistic gripe here: errdetail and errhint messages are
supposed to be complete sentences --- this is a deliberate departure
from the style for primary errmsg messages.  So the above should be

+                  errdetail("This input will be rejected in "
+                            "a future release of PostgreSQL.")));

            regards, tom lane

Re: tighten input to float4/float8/oid

От
Neil Conway
Дата:
Tom Lane wrote:
> I think this is wrong.  We silently accept leading whitespace in
> (IIRC) all the numeric datatypes, and I believe we should accept
> trailing whitespace too.

(Sorry, I had misremembered your suggestion -- you had earlier said
that the spec probably allows for leading and trailing whitespace.)

I think there's a case to be made that we shouldn't accept either
leading or trailing whitespace, but it seems too late to go down that
path: it isn't worth breaking backward compatibility over, for one thing.

So if we're going to continue to accept leading whitespace, it seems
only reasonable to consistently accept trailing whitespace as well.

> Minor stylistic gripe here: errdetail and errhint messages are
> supposed to be complete sentences

Thanks for catching that, I'll fix it before applying.

I'll apply the patch this evening without the trailing whitespace
change; if there's a consensus that allowing trailing whitespace is
the best course, I'll post a separate patch for that shortly.

-Neil

Re: tighten input to float4/float8/oid

От
Tom Lane
Дата:
Neil Conway <neilc@samurai.com> writes:
> I think there's a case to be made that we shouldn't accept either
> leading or trailing whitespace, but it seems too late to go down that
> path: it isn't worth breaking backward compatibility over, for one thing.

To do that we'd have to use separate code for I/O and casting.  As
an example, this current behavior is unquestionably contrary to the
spec text I cited:

regression=# select '  42'::text::int;
 int4
------
   42
(1 row)

regression=# select '  42 '::text::int;
ERROR:  invalid input syntax for integer: "  42 "
regression=#

            regards, tom lane