Re: plpgsql; execute query inside exists

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: plpgsql; execute query inside exists
Дата
Msg-id CAHyXU0zS79MX1E5nNMuR=eSZ7KwSmDVBjxbJ=-y3=YkP_nX2Ug@mail.gmail.com
обсуждение исходный текст
Ответ на Re: plpgsql; execute query inside exists  (Alban Hertroys <haramrae@gmail.com>)
Список pgsql-general
On Mon, Oct 17, 2011 at 10:28 AM, Alban Hertroys <haramrae@gmail.com> wrote:
> On 17 October 2011 16:24, Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Mon, Oct 17, 2011 at 8:44 AM, Alban Hertroys <haramrae@gmail.com> wrote:
>>> On 17 October 2011 15:20, Merlin Moncure <mmoncure@gmail.com> wrote:
>>>> A better way to do this is to query information_schema:
>>>>
>>>> PERFORM 1 FROM information_schema.tables where schema_name = x and
>>>> table_name = y;
>>>>
>>>> IF FOUND THEN
>>>>  CREATE TABLE ...
>>>> END IF;
>>>>
>>>> (there is a race condition in the above code -- do you see it? if
>>>> concurrent access to this function is an issue, you have to LOCK an
>>>> object before running the PERFORM or perhaps use an advisory lock).
>>>
>>> Is there? You'd think that with transactional DDL and the code running
>>> in a single transaction (namely inside a stored function) it would be
>>> concurrency-safe.
>>
>> Transactional DDL does not protect you from race conditions any more
>> than MVCC protects you from race conditions in regular DML.  What
>> transactional DDL does is roll back the changes in the event of an
>> error so you don't have half written schema changes in your database.
>> MVCC gives a rigorous definition of visibility rules and transactions
>> guarantee only a complete unit of work getting committed to the
>> database.  You still have to code defensively against multi-user
>> access however.  The good news is that multi user coding is about an
>> order of magnitude easier in sql (especially postgres variant) than in
>> any other development platform that I'm aware of.
>>
>> The race condition here is basically the same problem that affects
>> 'upsert' patterns:
>>
>> test record(s) if found update if not found insert;
>>
>> The problem comes that in between the test and the insert case someone
>> else can also test and get the insert in before you do.  You have two
>> general strategies to get around this: locking and retry.  I greatly
>> advise going the locking route unless your concurrency requirements
>> are very high.  It's much simpler, and since you're not invoking a
>> subtransaction, faster in the uncontested case.
>
> So what would happen if you don't lock? I think it's this:
>
>  Session A              | Session B
> ----------------+-----------------
>  SELECT x               | SELECT x
>  NOT FOUND              | NOT FOUND
>   ...                  | CREATE TABLE
>  CREATE TABLE   | <commit>
>  <error>                |
>  <rollback>             |
>
> If I understand correctly, if you don't mind the error and the
> subsequent rollback in Session A, than there's not much need to lock,
> or is there? It is important to be aware of the possible rollback of
> such a transaction, of course.

right -- allowing for rollback and retrying (either from the client or
in the procedure via sub-transaction) is always an option.  I prefer
to lock -- it's faster (usually) and gives more regular behavior.

> And what would you lock? A record in information_schema.tables? That's
> a read-only view. A table that doesn't exist yet? Can't do. A record
> in the pg_ schema? Rather not...
> I suppose you could work around that problem by keeping track of your
> own tables that were generated using aforementioned plpgsql function.
> Then you have a table (that you own) with records to lock.

yeah -- you could use an advisory lock or a special table created for
that purpose, or a row of a table that does your tracking.  Agree that
locking system catalogs is *not* advisable.

merlin

В списке pgsql-general по дате отправления:

Предыдущее
От: Alban Hertroys
Дата:
Сообщение: Re: 9.1 got really fast ;)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: 9.1 got really fast ;)