Обсуждение: "A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one"

Поиск
Список
Период
Сортировка
Does the “Tip” call-out box, from which the “Subject” here is copied, and the larger story that I copied below, apply even when the executable section of the block statement in question does nothing at all that could be rolled back?

This is my block:

begin
  year_as_int := year_as_text;
exception
  when invalid_text_representation then
    bad_integer := true;
end;


The typecast to integer will cause an error if the input text does not represent an integer. I need to detect this, when I get such a text value, and then go on to do something appropriate after the block. My tests show that I get exactly the behavior that I want. It would be tedious to program the test (presumably by typecasting to a numeric value and comparing that with the result of rounding it) to avoid the "significant" performance penalty that the doc describes.

It seems to me that the PG implementation has, in a block that has an exception section, an opportunity to wait to start a subtraction until it first hits a statement whose effect could be rolled back—and, of course, not do this if it doesn't hit such a statement. I see that this would require first preparing the statement and only then, if it needs it, starting its subtransaction. But this sounds doable.
________________________________________________________________________________

The current PL/pgSQL doc, in this section:

43.6.8. Trapping Errors

says this:

When an error is caught by an EXCEPTION clause, the local variables of the PL/pgSQL function remain as they were when the error occurred, but all changes to persistent database state within the block are rolled back. 

A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.

and then on Stack Overflow, here

Tom Lane said:

Yeah, it's about the overhead of setting up and ending a subtransaction. That's a fairly expensive mechanism, but we don't have anything cheaper that is able to recover from arbitrary errors.

and David G. Johnston said:

Setting up the pl/pgsql execution layer to trap "arbitrary SQL-layer exceptions"​ is fairly expensive. Even if the user specifies specific errors the error handling mechanism in pl/pgsql is code for generic (arbitrary) errors being given to it.

Hi

po 13. 6. 2022 v 7:45 odesílatel Bryn Llewellyn <bryn@yugabyte.com> napsal:
Does the “Tip” call-out box, from which the “Subject” here is copied, and the larger story that I copied below, apply even when the executable section of the block statement in question does nothing at all that could be rolled back?

This is my block:

begin
  year_as_int := year_as_text;
exception
  when invalid_text_representation then
    bad_integer := true;
end;


The typecast to integer will cause an error if the input text does not represent an integer. I need to detect this, when I get such a text value, and then go on to do something appropriate after the block. My tests show that I get exactly the behavior that I want. It would be tedious to program the test (presumably by typecasting to a numeric value and comparing that with the result of rounding it) to avoid the "significant" performance penalty that the doc describes.

It seems to me that the PG implementation has, in a block that has an exception section, an opportunity to wait to start a subtraction until it first hits a statement whose effect could be rolled back—and, of course, not do this if it doesn't hit such a statement. I see that this would require first preparing the statement and only then, if it needs it, starting its subtransaction. But this sounds doable.

The block is a PLpgSQL statement (internally it is not just syntax) - and a safepoint is created before execution of any statement inside the block, when the block has EXCEPTION parts. PL/pgSQL is an AST interpreter. It knows nothing about statements inside the block. The lazy implementation theoretically can be possible, but why? Any statement can raise an exception. There is not any evidence what statements are safe and what not.

Regards

Pavel


 
________________________________________________________________________________

The current PL/pgSQL doc, in this section:

43.6.8. Trapping Errors

says this:

When an error is caught by an EXCEPTION clause, the local variables of the PL/pgSQL function remain as they were when the error occurred, but all changes to persistent database state within the block are rolled back. 

A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.

and then on Stack Overflow, here

Tom Lane said:

Yeah, it's about the overhead of setting up and ending a subtransaction. That's a fairly expensive mechanism, but we don't have anything cheaper that is able to recover from arbitrary errors.

and David G. Johnston said:

Setting up the pl/pgsql execution layer to trap "arbitrary SQL-layer exceptions" is fairly expensive. Even if the user specifies specific errors the error handling mechanism in pl/pgsql is code for generic (arbitrary) errors being given to it.

pavel.stehule@gmail.com wrote:

bryn@yugabyte.com napsal:

Does the “Tip” call-out box, from which the “Subject” here is copied, and the larger story that I copied below, apply even when the executable section of the block statement in question does nothing at all that could be rolled back?

This is my block:

begin
  year_as_int := year_as_text;
exception
  when invalid_text_representation then
    bad_integer := true;
end;


The typecast to integer will cause an error if the input text does not represent an integer.

The block is a PLpgSQL statement (internally it is not just syntax) - and a safepoint is created before execution of any statement inside the block, when the block has EXCEPTION parts. PL/pgSQL is an AST interpreter. It knows nothing about statements inside the block. The lazy implementation theoretically can be possible, but why? Any statement can raise an exception. There is not any evidence what statements are safe and what not.

Thanks for the quick response. I'll take this to mean that the present behavior will never change—in spite of your:

The lazy implementation [that Bryn sketched] theoretically can be possible

You then said:

but why?

OK, so I'm obliged to answer.

Because SQL rests on the principle that you just say *what* you want but not *how*. Here, I want to cast my string, which putatively represents an integer, to an "int" value. The text comes from the outside world, and what is meant to be "42017" might arrive as "42O17". Or, might arrive properly, as "42000e-03".

Consider this:

create function is_int_nn(t in text)
  returns boolean
  language plpgsql
as $body$
declare
  ok boolean not null := (t is not null);
begin
  if ok then
    -- Placeholder naïve REGEXPR test.
    ok := (t != '') and not (t ~ 'x');
  end if;

  if ok then
    declare
      n  constant numeric not null := t;
      r  constant numeric not null := round(n);
    begin
      ok := (r = n);
    end;
  end if;

  return ok;
end;
$body$;

select
  (select is_int_nn(null       )::text) as test_1,
  (select is_int_nn(''         )::text) as test_2,
  (select is_int_nn('42000x-04')::text) as test_3,
  (select is_int_nn('42000e-04')::text) as test_4,
  (select is_int_nn('42000e-03')::text) as test_5;


Neither the design nor the implementation of the code is by any means finished yet. And this holds, therefore, for the tests too. So this is a loud denial of « just say *what* you want ».

You might argue that any junior programmer could manage the complete exercise in a morning. But then somebody else has to review it. And it's another artifact to be managed. Generic utilities like this always present a challenge when they need to be used in more than one distinct application. You need to invent a "single source of truth" scheme.

Compare everything that "function is_int_nn(t in text)" implies with the block that I showed above.

Oracle Database 12c Release 2 (and later) has a validate_conversion() built-in.

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/VALIDATE_CONVERSION.html#GUID-DC485EEB-CB6D-42EF-97AA-4487884CB2CD

This arrived in (some time around) 2017.

Is there any chance that PG might ship an equivalent?





po 13. 6. 2022 v 21:02 odesílatel Bryn Llewellyn <bryn@yugabyte.com> napsal:
pavel.stehule@gmail.com wrote:

bryn@yugabyte.com napsal:

Does the “Tip” call-out box, from which the “Subject” here is copied, and the larger story that I copied below, apply even when the executable section of the block statement in question does nothing at all that could be rolled back?

This is my block:

begin
  year_as_int := year_as_text;
exception
  when invalid_text_representation then
    bad_integer := true;
end;


The typecast to integer will cause an error if the input text does not represent an integer.

The block is a PLpgSQL statement (internally it is not just syntax) - and a safepoint is created before execution of any statement inside the block, when the block has EXCEPTION parts. PL/pgSQL is an AST interpreter. It knows nothing about statements inside the block. The lazy implementation theoretically can be possible, but why? Any statement can raise an exception. There is not any evidence what statements are safe and what not.

Thanks for the quick response. I'll take this to mean that the present behavior will never change—in spite of your:

The lazy implementation [that Bryn sketched] theoretically can be possible

You then said:

but why?

OK, so I'm obliged to answer.

Because SQL rests on the principle that you just say *what* you want but not *how*. Here, I want to cast my string, which putatively represents an integer, to an "int" value. The text comes from the outside world, and what is meant to be "42017" might arrive as "42O17". Or, might arrive properly, as "42000e-03".

Consider this:

create function is_int_nn(t in text)
  returns boolean
  language plpgsql
as $body$
declare
  ok boolean not null := (t is not null);
begin
  if ok then
    -- Placeholder naïve REGEXPR test.
    ok := (t != '') and not (t ~ 'x');
  end if;

  if ok then
    declare
      n  constant numeric not null := t;
      r  constant numeric not null := round(n);
    begin
      ok := (r = n);
    end;
  end if;

  return ok;
end;
$body$;

select
  (select is_int_nn(null       )::text) as test_1,
  (select is_int_nn(''         )::text) as test_2,
  (select is_int_nn('42000x-04')::text) as test_3,
  (select is_int_nn('42000e-04')::text) as test_4,
  (select is_int_nn('42000e-03')::text) as test_5;


Neither the design nor the implementation of the code is by any means finished yet. And this holds, therefore, for the tests too. So this is a loud denial of « just say *what* you want ».

You might argue that any junior programmer could manage the complete exercise in a morning. But then somebody else has to review it. And it's another artifact to be managed. Generic utilities like this always present a challenge when they need to be used in more than one distinct application. You need to invent a "single source of truth" scheme.

Compare everything that "function is_int_nn(t in text)" implies with the block that I showed above.

Oracle Database 12c Release 2 (and later) has a validate_conversion() built-in.

https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/VALIDATE_CONVERSION.html#GUID-DC485EEB-CB6D-42EF-97AA-4487884CB2CD

This arrived in (some time around) 2017.

Is there any chance that PG might ship an equivalent?

I remember a long and still not closed discussion about fault tolerant copy implementation. The problem is a lot of possibly redundant code for exception safe input functions, if I remember well. And it is not enough for fault tolerant copy still. Maybe it needs some refactoring of the PostgreSQL exceptions handling system to be able to handle some exceptions that come from a non-storage engine without the necessity to use safepoints. I have no idea if somebody is working on this issue now, but I don't expect so it is easy to fix it. Maybe a more probable fix can be to reduce an overhead of savepoints. This issue is more complex than can be visible from user perspective - and the complexity is based on how pg has implemented exceptions.

Regards

Pavel

Bryn Llewellyn <bryn@yugabyte.com> writes:
> OK, so I'm obliged to answer.
> Because SQL rests on the principle that you just say *what* you want but
> not *how*.

It also rests on the principle that the programmer shouldn't be too
concerned about micro-efficiencies.  You've given a perfectly good
six-line implementation of what you want; use it and be happy.

> Oracle Database 12c Release 2 (and later) has a validate_conversion() built-in.

[ shrug... ]  We are not Oracle.  One of the main ways in which we
are not Oracle is that we support extensible database functionality.
To write a "validate_conversion" function that supports extension
datatypes, but doesn't use something morally equivalent to a
subtransaction, would be a nightmare: large, fragile, and probably
not all that much faster.

            regards, tom lane




> On Jun 12, 2022, at 23:07, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> The lazy implementation theoretically can be possible, but why?

Isn't one of the reasons for the savepoint (in fact, the principal reason) to reset the connection back to non-error
stateso that execution can continue?  In that case, it really does need to create the savepoint at the start of the
block,regardless of what's in it, since any statement can raise an error. 


xof@thebuild.com wrote:

Isn't one of the reasons for the savepoint (in fact, the principal reason) to reset the connection back to non-error state so that execution can continue? In that case, it really does need to create the savepoint at the start of the block, regardless of what's in it, since any statement can raise an error.

tgl@sss.pgh.pa.us wrote:

It also rests on the principle that the programmer shouldn't be too concerned about micro-efficiencies. You've given a perfectly good six-line implementation of what you want; use it and be happy.

...

[ shrug... ] We are not Oracle. One of the main ways in which we are not Oracle is that we support extensible database functionality. To write a "validate_conversion" function that supports extension datatypes, but doesn't use something morally equivalent to a subtransaction, would be a nightmare: large, fragile, and probably not all that much faster.


I remember a long and still not closed discussion about fault tolerant copy implementation. The problem is a lot of possibly redundant code for exception safe input functions, if I remember well. And it is not enough for fault tolerant copy still. Maybe it needs some refactoring of the PostgreSQL exceptions handling system to be able to handle some exceptions that come from a non-storage engine without the necessity to use safepoints. I have no idea if somebody is working on this issue now, but I don't expect so it is easy to fix it. Maybe a more probable fix can be to reduce an overhead of savepoints. This issue is more complex than can be visible from user perspective - and the complexity is based on how pg has implemented exceptions.
——————————

Thanks, Tom Lane, for your reply. The note in the doc:

A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. 

uses the word "significantly". This scares the application programmer. Would you (all) consider this revised wording:

« If the executable section of a block that has an exception section makes changes to the database, then this brings significantly more expense than when there is no exception section. However, if the executable section makes only changes to local variables or session parameters, then the additional expense brought by an exception section is negligible. »

Oracle Database users had to wait about 40 years for the "validate_conversion()" built-in—despite strong and persistent user-pressure. PostreSQL is about 25 years old. So there's plenty of time to reconsider...
——————————

Thanks, Pavel Stehule, for your reply. Forget savepoints and blocks with exception sections. And forget the general case that Oracle's "validate_conversion()" handles.

It would seem to me that the code that implements "year_as_int := year_as_text" and, maybe, says "text into int won't go", is distinct from the code that it notifies to raise an error. It ought to be possible to factor out this code for re-use and use it both to do what it presently does and to implement a built-in "is_int_nn(t in text)". And maybe the same for numeric and for timestamp[tz]. These must be the common cases when such putative values come in from the UI. Proper practice insists on re-validating the conversions in the data base even though it's to be hoped that the UI will have done this.

However, in the light of Tom's « You've given a perfectly good six-line implementation of what you want; use it and be happy. », the need (almost) vanishes—except for the point that I mentioned earlier about packaging up the test in a the application development show's "utilities" kit.
——————————

Thanks, Christophe Pettus, for your reply.

it really does need to create the savepoint at the start of the block, regardless of what's in it, since any statement can raise an error.

This is necessary only when it can't be proved that the executable section that precedes the exception section cannot make database changes—hereinafter "is safe". Such safety tests are always over cautious. So, for example, it would be assumed that any transfer of control out of the executable section, and then back, was unsafe.

However, the experts have insisted that even the safety of this case cannot be proved:

> an executable section with only simple assignments that use only built-in functionality.

I suppose that "PL/pgSQL is an AST interpreter" is the clue here.

Given this, then yes, you're right.