Обсуждение: BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

Поиск
Список
Период
Сортировка

BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

От
"Geoff Tolley"
Дата:
The following bug has been logged online:

Bug reference:      5102
Logged by:          Geoff Tolley
Email address:      geoff.tolley@yougov.com
PostgreSQL version: 8.3.8
Operating system:   Ubuntu 8.04 x86_64
Description:        Silent IN (list of strings) failure to detect syntax
error when list is linewrapped
Details:

I would expect the last of these queries to give the same syntax error as
the penultimate one.  Reproducible on my test instance of 8.4.1 as well:


postgres=# SELECT version();
                                             version
----------------------------------------------------------------------------
---------------------
 PostgreSQL 8.3.8 on x86_64-pc-linux-gnu, compiled by GCC cc (GCC) 4.2.4
(Ubuntu 4.2.4-1ubuntu3)
(1 row)

postgres=# SELECT 'hello' WHERE '1' IN ('1', '2');
 ?column?
----------
 hello
(1 row)

postgres=# SELECT 'hello' WHERE '1' IN ('1' '2');
ERROR:  syntax error at or near "'2'"
LINE 1: SELECT 'hello' WHERE '1' IN ('1' '2');
                                         ^
postgres=# SELECT 'hello' WHERE '1' IN ('1'
postgres(#  '2');
 ?column?
----------
(0 rows)

Re: BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

От
Alvaro Herrera
Дата:
Geoff Tolley wrote:

> postgres=# SELECT 'hello' WHERE '1' IN ('1' '2');
> ERROR:  syntax error at or near "'2'"
> LINE 1: SELECT 'hello' WHERE '1' IN ('1' '2');
>                                          ^
> postgres=# SELECT 'hello' WHERE '1' IN ('1'
> postgres(#  '2');
>  ?column?
> ----------
> (0 rows)

I guess what's happening here is this:

alvherre=# select 'a'
alvherre-# 'b';
 ?column?
----------
 ab
(1 fila)


I don't know much about this oddity but my first guess is that it's
mandated by the SQL standard.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

От
Alvaro Herrera
Дата:
Alvaro Herrera wrote:

> alvherre=# select 'a'
> alvherre-# 'b';
>  ?column?
> ----------
>  ab
> (1 fila)

Ah, yes -- saith scan.l:

/*
 * SQL requires at least one newline in the whitespace separating
 * string literals that are to be concatenated.  Silly, but who are we
 * to argue?  Note that {whitespace_with_newline} should not have * after
 * it, whereas {whitespace} should generally have a * after it...
 */

special_whitespace      ({space}+|{comment}{newline})
horiz_whitespace        ({horiz_space}|{comment})
whitespace_with_newline ({horiz_whitespace}*{newline}{special_whitespace}*)


--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

От
"Kevin Grittner"
Дата:
"Geoff Tolley" <geoff.tolley@yougov.com> wrote:

> postgres=# SELECT 'hello' WHERE '1' IN ('1'
> postgres(#  '2');

Per the SQL standard, that is the same as

SELECT 'hello' WHERE '1' IN ('12');

I believe that's intended to make it easier to code long string
literals without creating query text which has long line lengths, but
they (understandably) don't require a minimum length for the string
fragments.

-Kevin
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I don't know much about this oddity but my first guess is that it's
> mandated by the SQL standard.

Yeah.  The spec defines that you can split a literal across lines this
way.  Notable quotes from SQL92:

         <separator> ::= { <comment> | <space> | <newline> }...


         1) In a <character string literal> or <national character string
            literal>, the sequence:

              <quote> <character representation>... <quote>
              <separator>... <quote> <character representation>... <quote>

            is equivalent to the sequence

              <quote> <character representation>... <character representa-
              tion>... <quote>


         4) In a <character string literal>, <national character string
            literal>, <bit string literal>, or <hex string literal>, a <sep-
            arator> shall contain a <newline>.

            regards, tom lane
... Actually, I just noticed that there *is* a bug here:

regression=# select '1' /* foo
regression*# */
regression-# '2';
ERROR:  syntax error at or near "'2'"
LINE 3: '2';
        ^
regression=#

The above should be accepted, but it isn't.  I think the problem is
here:

special_whitespace        ({space}+|{comment}{newline})

Shouldn't that be

special_whitespace        ({space}+|{comment}|{newline})

?  Although I'm not quite sure how this leads to the error, because
I didn't type anything between the second return and the '2', so in

whitespace_with_newline    ({horiz_whitespace}*{newline}{special_whitespace}*)

the {special_whitespace}* should have been satisfied with zero
repetitions no matter what.  Odd ...

            regards, tom lane

Re: BUG #5102: Silent IN (list of strings) failure to detect syntax error when list is linewrapped

От
"Kevin Grittner"
Дата:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> ... Actually, I just noticed that there *is* a bug here:
>
> regression=# select '1' /* foo
> regression*# */
> regression-# '2';
> ERROR:  syntax error at or near "'2'"
> LINE 3: '2';
>         ^
> regression=#
>
> The above should be accepted, but it isn't.

It works with the -- comment format.  Has the C format been added to
the standard, or is it an extension?  If the latter, support for it
would be up to the PostgreSQL community -- it's only a bug if we say
it is.

cc=> select 'a' --comment
'b';
 ?column?
----------
 ab
(1 row)

cc=> select 'a' -- comment
-- comment
'b';
 ?column?
----------
 ab
(1 row)

-Kevin
I wrote:
> Shouldn't that be
> special_whitespace        ({space}+|{comment}|{newline})

No, scratch that ... {comment} is the -- form of comment and the newline
is to terminate the comment.

The actual issue here seems to be that the whitespace productions only
deal with the -- form of comment, so that's the only kind you can embed
between sections of a string literal.  This is okay per SQL92 but we
didn't upgrade it to SQL99, which appears to allow /* comments between
sections too.

Right offhand that looks like it would be *vastly* more work than it'd
be worth :-( --- there's no way to do /* comments without multiple
lexer rules.

            regards, tom lane