Обсуждение: Broken type checking for empty subqueries

Поиск
Список
Период
Сортировка

Broken type checking for empty subqueries

От
Marek Malevič
Дата:
Hello,

after upgrade from PG11 to PG14.8 the following statement started failing:

select
     a::BIGINT
FROM (
     SELECT
         '$maybeNumber'::VARCHAR AS a
     WHERE NOT '$maybeNumber' LIKE '%maybeNumber'
) AS foo;

Expected (as in PG11): The query should return empty result with one column.
Actual:  SQL Error [22P02]: ERROR: invalid input syntax for type bigint: 
"$maybeNumber".

Where is a workaround that shows the behavior is quite non-stable. 
Following passes normally while still using the same subquery with one 
more empty result from another one:

select
     a::BIGINT
FROM (
     SELECT
         '$maybeNumber'::VARCHAR AS a
     WHERE NOT '$maybeNumber' LIKE '%maybeNumber'
     UNION
     SELECT '0' AS a WHERE false
) AS foo;

Thanks all for the great work you do on PG.
Marek Malevic



Re: Broken type checking for empty subqueries

От
Vik Fearing
Дата:
On 9/28/23 11:08, Marek Malevič wrote:
> Hello,
> 
> after upgrade from PG11 to PG14.8 the following statement started failing:
> 
> select
>      a::BIGINT
> FROM (
>      SELECT
>          '$maybeNumber'::VARCHAR AS a
>      WHERE NOT '$maybeNumber' LIKE '%maybeNumber'
> ) AS foo;
> 
> Expected (as in PG11): The query should return empty result with one 
> column.
> Actual:  SQL Error [22P02]: ERROR: invalid input syntax for type bigint: 
> "$maybeNumber".
> 
> Where is a workaround that shows the behavior is quite non-stable. 
> Following passes normally while still using the same subquery with one 
> more empty result from another one:
> 
> select
>      a::BIGINT
> FROM (
>      SELECT
>          '$maybeNumber'::VARCHAR AS a
>      WHERE NOT '$maybeNumber' LIKE '%maybeNumber'
>      UNION
>      SELECT '0' AS a WHERE false
> ) AS foo;

I bisected this back to 4be058fe9ec5e630239b656af21fc083371f30ed, "In 
the planner, replace an empty FROM clause with a dummy RTE."
-- 
Vik Fearing




Re: Broken type checking for empty subqueries

От
David Rowley
Дата:
On Fri, 29 Sept 2023 at 00:28, Vik Fearing <vik@postgresfriends.org> wrote:
> I bisected this back to 4be058fe9ec5e630239b656af21fc083371f30ed, "In
> the planner, replace an empty FROM clause with a dummy RTE."

This just seems to be due to the fact that that commit allowed
FROM-less subqueries to be pulled up and that now allows constant
folding to attempt to perform the cast to BIGINT. An OFFSET 0 in the
subquery would prevent the subquery being pulled up.

After the pullup, and constant folding the WHERE clause, the query becomes:

select '$maybeNumber'::bigint where false;

When we constant-fold the target list, we'll get the error as it'll
try and convert the string to a bigint.

Without that ability, we'd have to do more run-time evaluation in
queries such as:

postgres=# explain verbose select '10' + 1;
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=4)
   Output: 11

but as you can see, the planner knows the value is 11.

There have been reports about these things before. For example [1].

I've not looked at the code to see if this would be practical or not,
but I wonder if we could reduce these bug reports by using the new
soft error reporting that's now done in the input functions to have
constant folding just silently not do any folding for the expression
if a cast fails.  I don't particularly see anything wrong with making
these queries work if they currently fail. I imagine most people who
have had them fail during constant folding have just redesigned or
found some hack to prevent the folding from taking place anyway.

Overall, it wouldn't be great if we had to ensure we keep these sorts
of queries working as they did in previous versions.  That would just
prevent us from adding certain optimisations that might be very
useful.

David

[1] https://www.postgresql.org/message-id/17637-5904e3fdee533c7f@postgresql.org



Re: Broken type checking for empty subqueries

От
David Rowley
Дата:
On Fri, 29 Sept 2023 at 01:41, David Rowley <dgrowleyml@gmail.com> wrote:
> I've not looked at the code to see if this would be practical or not,
> but I wonder if we could reduce these bug reports by using the new
> soft error reporting that's now done in the input functions to have
> constant folding just silently not do any folding for the expression
> if a cast fails.  I don't particularly see anything wrong with making
> these queries work if they currently fail. I imagine most people who
> have had them fail during constant folding have just redesigned or
> found some hack to prevent the folding from taking place anyway.

I got curious if this would be difficult or not and came up with the
attached attempt-to-do-it-to-see-if-it-works grade patch.

On a very quick test, it does seem to work.

Patched:
postgres=# explain verbose select '123a'::varchar::bigint where false;
-- previously failed
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=8)
   Output: ('123a'::cstring)::bigint
   One-Time Filter: false
(3 rows)

postgres=# explain verbose select '123'::varchar::bigint where false;
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=8)
   Output: '123'::bigint
   One-Time Filter: false
(3 rows)

Master:
postgres=# explain verbose select '123a'::varchar::bigint where false;
ERROR:  invalid input syntax for type bigint: "123a"
postgres=# explain verbose select '123'::varchar::bigint where false;
                QUERY PLAN
------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=8)
   Output: '123'::bigint
   One-Time Filter: false
(3 rows)

I didn't really follow the soft error development work done in
d9f7f5d32 very closely. I'm assuming any extension that has not
updated its type input function(s) to use errsave/ereturn instead of
elog/ereport will just continue to fail and there's nothing really we
can do about that.  Maybe that's not really a problem as constant
folding would raise errors today for invalid inputs for the given type
anyway. It would at least work nicer for all our built-in types that
did have their input functions modified to suppress invalid inputs
when there's an ErrorSaveContext.

Should we consider doing something like this?

David

Вложения

Re: Broken type checking for empty subqueries

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> I've not looked at the code to see if this would be practical or not,
> but I wonder if we could reduce these bug reports by using the new
> soft error reporting that's now done in the input functions to have
> constant folding just silently not do any folding for the expression
> if a cast fails.

Sadly, I doubt that would cover enough of the problem space to make
much difference to people who try to do this sort of thing.

> ... I imagine most people who
> have had them fail during constant folding have just redesigned or
> found some hack to prevent the folding from taking place anyway.

Yeah.  The given query looks like it was already hacked to avoid
constant-folding, though I wonder if whoever wrote it really understood
that.  Otherwise it'd be a lot more natural to use something like CASE.
Anyway, adding OFFSET 0 to suppress sub-select folding is probably
the best answer here.

            regards, tom lane



Re: Broken type checking for empty subqueries

От
David Rowley
Дата:
On Fri, 29 Sept 2023 at 03:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > I've not looked at the code to see if this would be practical or not,
> > but I wonder if we could reduce these bug reports by using the new
> > soft error reporting that's now done in the input functions to have
> > constant folding just silently not do any folding for the expression
> > if a cast fails.
>
> Sadly, I doubt that would cover enough of the problem space to make
> much difference to people who try to do this sort of thing.

hmm, I guess it does nothing for stuff like overflow errors since we
didn't adjust the ereports of those functions:

We'd still get:
postgres=# explain verbose select 0x7FFFFFFF + 1 where false;
ERROR:  integer out of range

David



Re: Broken type checking for empty subqueries

От
Tom Lane
Дата:
David Rowley <dgrowleyml@gmail.com> writes:
> On Fri, 29 Sept 2023 at 03:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Sadly, I doubt that would cover enough of the problem space to make
>> much difference to people who try to do this sort of thing.

> hmm, I guess it does nothing for stuff like overflow errors since we
> didn't adjust the ereports of those functions:

Yeah.  I suppose we could start trying to expand the use of soft
error reporting, but it'd be a herculean, probably multi-year
task to get to where even "most cases" would be covered.

I was actually wondering after sending my previous message whether
CASE wouldn't do the job for the OP.  We do have a rule that once
a CASE test expression has reduced to constant-true or constant-false,
we skip const-folding for CASE arms that are thereby proven unreachable.
Thus for instance

regression=# select case when 1 < 2 then 42 else 1/0 end;
 case 
------
   42
(1 row)

The big trick in using this is to understand what is covered by
const-folding and what is not.  Initial conversion of a literal string
isn't, because that's done at parse time.  So this still fails:

regression=# select case when 1 < 2 then 42 else 'foo'::int end;
ERROR:  invalid input syntax for type integer: "foo"
LINE 1: select case when 1 < 2 then 42 else 'foo'::int end;
                                            ^

but you can get around that like this:

regression=# select case when 1 < 2 then 42 else 'foo'::text::int end;
 case 
------
   42
(1 row)

because the text-to-int conversion is considered a run-time not
parse-time operation.

So I wonder whether that weird-looking sub-select couldn't be converted to
a simple SELECT CASE with both more readability and better future-proofing
against optimizer improvements.

Another way is to push it all into procedural logic, ie a plpgsql function
that doesn't try to evaluate the problematic expression until it's
checked the test expression.

            regards, tom lane



Re: Broken type checking for empty subqueries

От
David Rowley
Дата:
On Fri, 29 Sept 2023 at 10:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah.  I suppose we could start trying to expand the use of soft
> error reporting, but it'd be a herculean, probably multi-year
> task to get to where even "most cases" would be covered.

I had a go at trying to categorise the reports we've received over the
years about this.  My search was just limited to the search term
"constant folding"

It looks like of the 9 below, the input function soft errors would
cover #0 (this report), #1 and #4.  Division by 0 covers another 2 (#5
and #8). So we could fix "most cases" if we added soft errors to
arithmetic.

Also, from looking at [1], there has been some interest in the past to
stop these surprising const-folding errors.

0. This thread.

Casting issue. text can't be parsed by type's input function.

effectively:
# explain select '$maybeNumber'::bigint where false;
ERROR:  invalid input syntax for type bigint: "$maybeNumber"

1. Bug 17637 https://www.postgresql.org/message-id/17637-5904e3fdee533c7f%40postgresql.org

Casting issue in case statement

with tmp as (select 1::int8 id, '123.4'::text v)
select t.id,
case m.mapped_to when 1 then v::float8 else null end,
case m.mapped_to when 2 then v::bool else null end
from tmp t
join tmap m on m.id = t.id;

ERROR:  invalid input syntax for type boolean: "123.4"

2. https://www.postgresql.org/message-id/CAAT35tGXUYgjjViNZ5%2B9nFkrOcmgE4ce%2BVvekjgKDy_C38RT2g%40mail.gmail.com

immutable function raising an exception

SELECT raise_exception_immutable('foo') WHERE false;
ERROR: foo

3. Bug 16545 https://www.postgresql.org/message-id/16545-affff840bc4e72ca%40postgresql.org

Complaint about coalesce evaluating arguments after the first non-NULL value.

# SELECT coalesce((SELECT 'ONE'),
                (SELECT 'TWO'
                  WHERE '123' ~
((xpath('/tag/text()','<tag>[</tag>'))[1]::TEXT)
                )
);
ERROR:  invalid regular expression: brackets [] not balanced

ereport in RE_compile_and_cache().

4. https://www.postgresql.org/message-id/C0FDEC5E-0E01-4FAB-A7A6-3FAC1F94B51E%40gmail.com

Appears an error from the t7.value::numeric case below.

-- CASE WHEN t9.is_internal_namespace = true
-- AND t9.code = 'STORAGE_POSITION.STORAGE_RACK_ROW'
-- AND (t10.code = 'INTEGER' OR t10.code = 'REAL')
-- THEN t7.value::numeric = 1
-- ELSE false
-- END

not much further details.

5. https://www.postgresql.org/message-id/A737B7A37273E048B164557ADEF4A58B17BB4EF8%40ntex2010a.host.magwien.gv.at

division by zero.

test=> CREATE FUNCTION zero() RETURNS integer IMMUTABLE LANGUAGE SQL
AS 'SELECT 0';
CREATE FUNCTION
test=> SELECT CASE WHEN (SELECT zero()) = 0 THEN -1 ELSE 1/zero() END;
ERROR: division by zero

6. https://www.postgresql.org/message-id/flat/11494.1144794560%40sss.pgh.pa.us#68696f2d794cd2d64f9c596782ee8f3a

Some procedural language error.

7. https://www.postgresql.org/message-id/flat/815.1049942992%40sss.pgh.pa.us#0874c4dffad1d389fe3f812f18039583

A very old one. Unsure if it relates to casting or a bug that was fixed.

8. https://www.postgresql.org/message-id/20020414165222.914FB475451%40postgresql.org

A very old one from 2002

SELECT
CASE
WHEN 1 = 2 THEN 1 / 0
WHEN 1 = 1 THEN 1.0
END;
ERROR: floating point exception! The last floating point operation
either exceeded legal ranges or was a divide by zero

Stopped as these reports are getting very old and less valuable.

David

[1] https://www.postgresql.org/message-id/265964.1595523454%40sss.pgh.pa.us