Обсуждение: LIMIT for UPDATE and DELETE

Поиск
Список
Период
Сортировка

LIMIT for UPDATE and DELETE

От
Rukh Meski
Дата:
Greetings,

Based on the feedback on my previous patch, I've separated only the
LIMIT part into its own feature.  This version plays nicely with
inheritance.  The intended use is splitting up big UPDATEs and DELETEs
into batches more easily and efficiently.


♜

Вложения

Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
Hi Rukh,

(2014/08/15 6:18), Rukh Meski wrote:
> Based on the feedback on my previous patch, I've separated only the
> LIMIT part into its own feature.  This version plays nicely with
> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
> into batches more easily and efficiently.

Before looking into the patch, I'd like to know the use cases in more 
details.

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Amit Kapila
Дата:
On Mon, Aug 25, 2014 at 12:18 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>>
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature.  This version plays nicely with
>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
>
> Before looking into the patch, I'd like to know the use cases in more details.

You can once check the previous commit fest thread [1] for
this feature, in that you can find some use cases and what are
the difficulties to implement this feature, it might aid you in
review of this feature.

Re: LIMIT for UPDATE and DELETE

От
Kevin Grittner
Дата:
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature.  This version plays nicely with
>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> Before looking into the patch, I'd like to know the use cases in more
> details.

There have been a few times I wanted something like this, so that
it was easier to do an update that affects a very high percentage
of rows in a table, while making the old version of the row no
longer match the selection criteria for the UPDATE.  There are
workarounds using cursors or subselects returning ctid, but they
are kludgy and error prone.  Basically I wanted to alternate UPDATE
of a subset of the rows with table VACUUM so that subsequent
iterations can re-use space and avoid bloating the table.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: LIMIT for UPDATE and DELETE

От
Jeff Janes
Дата:
On Sun, Aug 24, 2014 at 11:48 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
Hi Rukh,


(2014/08/15 6:18), Rukh Meski wrote:
Based on the feedback on my previous patch, I've separated only the
LIMIT part into its own feature.  This version plays nicely with
inheritance.  The intended use is splitting up big UPDATEs and DELETEs
into batches more easily and efficiently.

Before looking into the patch, I'd like to know the use cases in more details.


There are two common use cases I can think of:

1) 

I've just added a column to an existing table, and it is all NULL.  I've changed the code to populate that column appropriately for new or updated rows, but I need to back fill the existing rows.  I have a (slow) method to compute the new value.  (I've not yet changed the code to depend on that column being populated)

The obvious solution is:

update the_table set new_col=populate_new_col(whatever) where new_col is null.

But this will bloat the table because vacuum cannot intervene, and will take a very long time.  The first row to be update will remain locked until the last row gets updated, which is not acceptable.  And if something goes wrong before the commit, you've lost all the work.

With the limit clause, you can just do this:

update the_table set new_col=populate_new_col(whatever) where new_col is null limit 50000;

In a loop with appropriate vacuuming and throttling.

2) 

I've introduced or re-designed partitioning, and need to migrate rows to the appropriate partitions without long lived row locks.

create table pgbench_accounts2 () inherits (pgbench_accounts);

and then in a loop:

with t as (delete from only pgbench_accounts where aid < 500000 limit 5000 returning *) 
  insert into pgbench_accounts2 select * from t;

Cheers,

Jeff

Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/08/25 15:48), Etsuro Fujita wrote:
> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature.  This version plays nicely with
>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> Before looking into the patch, I'd like to know the use cases in more
> details.

Thanks for the input, Amit, Kevin and Jeff!  I understand that the patch 
is useful.

I've looked at the patch a bit closely.  Here is my initial thought 
about the patch.

The patch places limit-counting inside ModifyTable, and works well for 
inheritance trees, but I'm not sure that that is the right way to go.  I 
think that this feature should be implemented in the way that we can 
naturally extend it to the ORDER-BY-LIMIT case in future.  But honestly 
the patch doesn't seem to take into account that, I might be missing 
something, though.  What plan do you have for the future extensibility?

I think that the approach discussed in [1] would be promissing, so ISTM 
that it would be better to implement this feature by allowing for plans 
in the form of eg, ModifyTModifyTable+Limit+Append.

Thanks,

[1] http://www.postgresql.org/message-id/26819.1291133045@sss.pgh.pa.us

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 8/29/14 12:20 PM, Etsuro Fujita wrote:
> The patch places limit-counting inside ModifyTable, and works well for
> inheritance trees, but I'm not sure that that is the right way to go.  I
> think that this feature should be implemented in the way that we can
> naturally extend it to the ORDER-BY-LIMIT case in future.  But honestly
> the patch doesn't seem to take into account that, I might be missing
> something, though.

The LIMIT part *has* to happen after the rows have been locked or it 
will work very surprisingly under concurrency (sort of like how FOR 
SHARE / FOR UPDATE worked before 9.0).  So either it has to be inside 
ModifyTable or the ModifyTable has to somehow pass something to a Limit 
node on top of it which would then realize that the tuples from 
ModifyTable aren't supposed to be sent to the client (unless there's a 
RETURNING clause).  I think it's a lot nicer to do the LIMITing inside 
ModifyTable, even though that duplicates a small portion of code that 
already exists in the Limit node.

> What plan do you have for the future extensibility?
>
> I think that the approach discussed in [1] would be promissing, so ISTM
> that it would be better to implement this feature by allowing for plans
> in the form of eg, ModifyTModifyTable+Limit+Append.

I don't see an approach discussed there, just a listing of problems with 
no solutions.

This is just my personal opinion, but what I think should happen is:
  1) We put the LIMIT inside ModifyTable like this patch does.  This 
doesn't prevent us from doing ORDER BY in the future, but helps numerous 
people who today have to  2) We allow ORDER BY on tables with no inheritance children using 
something similar to Rukh's previous patch.  3) Someone rewrites how UPDATE works based on Tom's suggestion here: 
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us, 
which allows us to support ORDER BY on all tables (or perhaps maybe not 
FDWs, I don't know how those work).  The LIMIT functionality in this 
patch is unaffected.

Now, I know some people disagree with me on whether step #2 is worth 
taking or not, but that's a separate discussion.  My point w.r.t. this 
patch still stands: I don't see any forwards compatibility problems with 
this approach, nor do I really see any viable alternatives either.


.marko



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 8/29/14 1:53 PM, I wrote:
> This is just my personal opinion, but what I think should happen is:
>
>     1) We put the LIMIT inside ModifyTable like this patch does.  This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to

Oops, looks like I didn't finish my thought here.

.. but helps numerous people who today have to achieve the same thing 
via tedious, slow and problematic subqueries (or a choose-any-two 
combination of these).


.marko



Re: LIMIT for UPDATE and DELETE

От
Tom Lane
Дата:
Marko Tiikkaja <marko@joh.to> writes:
> The LIMIT part *has* to happen after the rows have been locked or it 
> will work very surprisingly under concurrency (sort of like how FOR 
> SHARE / FOR UPDATE worked before 9.0).

Good point.

> So either it has to be inside 
> ModifyTable or the ModifyTable has to somehow pass something to a Limit 
> node on top of it

... or we add a LockRows node below the Limit node.  Yeah, that would make
UPDATE/LIMIT a tad slower, but I think that might be preferable to what
you're proposing anyway.  Raw speed of what is fundamentally a fringe
feature ought not trump every other concern.

> This is just my personal opinion, but what I think should happen is:

>    1) We put the LIMIT inside ModifyTable like this patch does.  This 
> doesn't prevent us from doing ORDER BY in the future, but helps numerous 
> people who today have to
>    2) We allow ORDER BY on tables with no inheritance children using 
> something similar to Rukh's previous patch.
>    3) Someone rewrites how UPDATE works based on Tom's suggestion here: 
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us, 
> which allows us to support ORDER BY on all tables (or perhaps maybe not 
> FDWs, I don't know how those work).  The LIMIT functionality in this 
> patch is unaffected.

I still think we should skip #2 and go directly to work on #3.  Getting
rid of the unholy mess that is inheritance_planner would be a very nice
thing.
        regards, tom lane



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 8/29/14 4:33 PM, Tom Lane wrote:
>> So either it has to be inside
>> ModifyTable or the ModifyTable has to somehow pass something to a Limit
>> node on top of it
>
> ... or we add a LockRows node below the Limit node.  Yeah, that would make
> UPDATE/LIMIT a tad slower, but I think that might be preferable to what
> you're proposing anyway.  Raw speed of what is fundamentally a fringe
> feature ought not trump every other concern.

I don't consider this a fringe feature, but in any case, the main use 
case for LIMIT without ORDER BY in UPDATE and DELETE is to split up 
large transactions into smaller batches.  And considering that, I think 
raw speed should be a concern (though it shouldn't trump every other 
concern, obviously).

More to the point, personally, I think the changes to nodeModifyTable.c 
are very reasonable so it's not clear to me that the "extra 
LockRows+Limit nodes" approach would be inherently better (even ignoring 
performance concerns).

>> This is just my personal opinion, but what I think should happen is:
>
>>     2) We allow ORDER BY on tables with no inheritance children using
>> something similar to Rukh's previous patch.
>>     3) Someone rewrites how UPDATE works based on Tom's suggestion here:
>> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
>
> I still think we should skip #2 and go directly to work on #3.  Getting
> rid of the unholy mess that is inheritance_planner would be a very nice
> thing.

Ideally?  Yeah, that would be great.  But I don't see anyone 
volunteering to do that work, and I think holding back a useful feature 
(ORDER BY with UPDATE/DELETE) in hopes of getting someone to volunteer 
to do it is insane.  Now, you're free to argue that ORDER BY with 
UPDATE/DELETE isn't that useful, of course, but I'm sure there are lots 
of people who agree with me.


.marko



Re: LIMIT for UPDATE and DELETE

От
Robert Haas
Дата:
On Mon, Sep 1, 2014 at 8:06 AM, Marko Tiikkaja <marko@joh.to> wrote:
> Ideally?  Yeah, that would be great.  But I don't see anyone volunteering to
> do that work, and I think holding back a useful feature (ORDER BY with
> UPDATE/DELETE) in hopes of getting someone to volunteer to do it is insane.
> Now, you're free to argue that ORDER BY with UPDATE/DELETE isn't that
> useful, of course, but I'm sure there are lots of people who agree with me.

I still agree with Tom.  Arbitrary restrictions on which features can
be used in combination with each other piss off and alienate users.
We've put quite a bit of effort into making table inheritance not suck
(e.g. statistics on inheritance trees, Merge Append, etc.).  Making it
suck more because you don't think it's as important as your feature
is, in my opinion, not cool.

This is not to say that I don't like the feature.  I like it a lot.
But I like a product where you can be sure that if walking works and
chewing gum works you can also walk and chew gum at the same time even
more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 9/3/14 4:46 PM, Robert Haas wrote:
> Making it
> suck more because you don't think it's as important as your feature
> is, in my opinion, not cool.

I really can't see how that would make inheritance suck *more*.  You 
can't do UPDATE .. ORDER BY now, and you wouldn't be able to do it after 
the patch.  Yeah, sure, perhaps people using inheritance might feel left 
out, but surely that would just motivate them to work on improving it.


.marko



Re: LIMIT for UPDATE and DELETE

От
Robert Haas
Дата:
On Wed, Sep 3, 2014 at 11:02 AM, Marko Tiikkaja <marko@joh.to> wrote:
> On 9/3/14 4:46 PM, Robert Haas wrote:
>> Making it
>> suck more because you don't think it's as important as your feature
>> is, in my opinion, not cool.
>
> I really can't see how that would make inheritance suck *more*.  You can't
> do UPDATE .. ORDER BY now, and you wouldn't be able to do it after the
> patch.  Yeah, sure, perhaps people using inheritance might feel left out,
> but surely that would just motivate them to work on improving it.

I think it's entirely reasonable for us to require that all new SQL
features should be required to work with or without inheritance.  If
we took the opposition position, and said that the only things that
need to work with inheritance are the ones that existed at the time
inheritance was introduced, then we'd not need to worry about it not
only for this feature but for row-level security and SKIP LOCKED and
GROUPING SETS and, going back a bit further, materialized views and
security-barrier views and LATERAL and CTEs and on and on.  Perhaps
not all of those require any special handling for inheritance
hierarchies, but some of them surely did, and if even 10% of the SQL
features that we have added since table inheritance were allowed to
opt out of supporting it, we'd have a broken and unusable feature
today.

Now some people might argue that we have that anyway, but the fact is
that a lot of people are using inheritance today, even with all its
flaws, and they wouldn't be if there were a long laundry list of
limitations that didn't apply to regular tables.  We should be looking
to lift the limitations that currently exist, not add more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: LIMIT for UPDATE and DELETE

От
Heikki Linnakangas
Дата:
On 09/03/2014 06:39 PM, Robert Haas wrote:
> On Wed, Sep 3, 2014 at 11:02 AM, Marko Tiikkaja <marko@joh.to> wrote:
>> On 9/3/14 4:46 PM, Robert Haas wrote:
>>> Making it
>>> suck more because you don't think it's as important as your feature
>>> is, in my opinion, not cool.
>>
>> I really can't see how that would make inheritance suck *more*.  You can't
>> do UPDATE .. ORDER BY now, and you wouldn't be able to do it after the
>> patch.  Yeah, sure, perhaps people using inheritance might feel left out,
>> but surely that would just motivate them to work on improving it.
>
> I think it's entirely reasonable for us to require that all new SQL
> features should be required to work with or without inheritance.  If
> we took the opposition position, and said that the only things that
> need to work with inheritance are the ones that existed at the time
> inheritance was introduced, then we'd not need to worry about it not
> only for this feature but for row-level security and SKIP LOCKED and
> GROUPING SETS and, going back a bit further, materialized views and
> security-barrier views and LATERAL and CTEs and on and on.  Perhaps
> not all of those require any special handling for inheritance
> hierarchies, but some of them surely did, and if even 10% of the SQL
> features that we have added since table inheritance were allowed to
> opt out of supporting it, we'd have a broken and unusable feature
> today.
>
> Now some people might argue that we have that anyway, but the fact is
> that a lot of people are using inheritance today, even with all its
> flaws, and they wouldn't be if there were a long laundry list of
> limitations that didn't apply to regular tables.  We should be looking
> to lift the limitations that currently exist, not add more.

I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should 
work with inheritance. So the path forward is (using Marko's phrasing 
upthread):
   1) We put the LIMIT inside ModifyTable like this patch does.  This
doesn't prevent us from doing ORDER BY in the future, but helps numerous
people who today have to   2) Someone rewrites how UPDATE works based on Tom's suggestion here:
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
which allows us to support ORDER BY on all tables (or perhaps maybe not
FDWs, I don't know how those work).  The LIMIT functionality in this
patch is unaffected.

What's not clear to me is whether it make sense to do 1) without 2) ? Is 
UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we 
apply this patch now, how much of it needs to be rewritten after 2) ? If 
the answers are "yes" and "not much", then we should review this patch 
now, and put 2) on the TODO list. Otherwise 2) should do done first.

Etsuro, Kaigei, please take a look at the patch and try to make a guess 
on how much of this still needs to be rewritten if we do 2). If not 
much, then please continue to review it, with the aim of getting it into 
a committable state.

- Heikki




Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 9/9/14 11:57 AM, Heikki Linnakangas wrote:
> What's not clear to me is whether it make sense to do 1) without 2) ? Is
> UPDATE .. LIMIT without support for an ORDER BY useful enough?

I'd say so; I could use it right now.  I have to remove millions of 
lines from a table, but I don't want the live transaction processing to 
take a hit, so I have to do it in batches.  Granted, some kind of rate 
limiting would achieve the same thing for DELETE, but with UPDATE you 
also have to consider row locking, and rate limiting wouldn't help with 
that at all; it would, in fact, just make it worse.  I'll also be 
running a big UPDATE like that later today, so that's two uses today 
alone for me.  And no, these are not routine things so keep your "use 
partitions" suggestions to yourselves.

> And if we
> apply this patch now, how much of it needs to be rewritten after 2) ? If
> the answers are "yes" and "not much", then we should review this patch
> now, and put 2) on the TODO list. Otherwise 2) should do done first.

I'd say "not much, if anything at all".


.marko



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 9/9/14 12:37 PM, I wrote:
> And no, these are not routine things so keep your "use
> partitions" suggestions to yourselves.

My apologies.  This was not directed at you personally, Heikki, and in 
any case it was unnecessarily hostile.


.marko



Re: LIMIT for UPDATE and DELETE

От
Heikki Linnakangas
Дата:
On 09/09/2014 01:46 PM, Marko Tiikkaja wrote:
> On 9/9/14 12:37 PM, I wrote:
>> And no, these are not routine things so keep your "use
>> partitions" suggestions to yourselves.
>
> My apologies.  This was not directed at you personally, Heikki, and in
> any case it was unnecessarily hostile.

No worries, it made me smile :-)

- Heikki




Re: LIMIT for UPDATE and DELETE

От
Jeff Janes
Дата:
On Tue, Sep 9, 2014 at 2:57 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
 
I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should work with inheritance. So the path forward is (using Marko's phrasing upthread):

   1) We put the LIMIT inside ModifyTable like this patch does.  This
doesn't prevent us from doing ORDER BY in the future, but helps numerous
people who today have to
   2) Someone rewrites how UPDATE works based on Tom's suggestion here:
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
which allows us to support ORDER BY on all tables (or perhaps maybe not
FDWs, I don't know how those work).  The LIMIT functionality in this
patch is unaffected.

What's not clear to me is whether it make sense to do 1) without 2) ? Is UPDATE .. LIMIT without support for an ORDER BY useful enough?

I've wanted LIMIT even without ORDER BY many times, so I'd vote yes.

 
And if we apply this patch now, how much of it needs to be rewritten after 2) ? If the answers are "yes" and "not much", then we should review this patch now, and put 2) on the TODO list. Otherwise 2) should do done first.

On that I can't give any useful feedback. 


Cheers,

Jeff

Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/09/09 18:57), Heikki Linnakangas wrote:
> On 09/03/2014 06:39 PM, Robert Haas wrote:
>> Now some people might argue that we have that anyway, but the fact is
>> that a lot of people are using inheritance today, even with all its
>> flaws, and they wouldn't be if there were a long laundry list of
>> limitations that didn't apply to regular tables.  We should be looking
>> to lift the limitations that currently exist, not add more.

> I agree. If we are to support UPDATE .. ORDER BY .. LIMIT, it should
> work with inheritance. So the path forward is (using Marko's phrasing
> upthread):
>
>     1) We put the LIMIT inside ModifyTable like this patch does.  This
> doesn't prevent us from doing ORDER BY in the future, but helps numerous
> people who today have to
>     2) Someone rewrites how UPDATE works based on Tom's suggestion here:
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us,
> which allows us to support ORDER BY on all tables (or perhaps maybe not
> FDWs, I don't know how those work).  The LIMIT functionality in this
> patch is unaffected.
>
> What's not clear to me is whether it make sense to do 1) without 2) ? Is
> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
> apply this patch now, how much of it needs to be rewritten after 2) ? If
> the answers are "yes" and "not much", then we should review this patch
> now, and put 2) on the TODO list. Otherwise 2) should do done first.

My answers are "yes" but "completely rewritten".  So, I think we should 
work on 2) first ideally, but 2) seems a large project at least to me. 
So, I think it would be reasonable to implement UPDATE/DELETE .. LIMIT 
based on Rukh' patch for now and put 2) and the re-implementation of 1) 
after 2) on the TODO list.

I don't have enough time to review it for a while, so I'd like to work 
on it (and postgres_fdw's "update pushdown" enhancement related to it) 
at the next CF (I think I can do it earlier).  I must apologize to Rukh 
for not having enough time for the patch review.

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 2014-09-10 04:25, Etsuro Fujita wrote:
> (2014/09/09 18:57), Heikki Linnakangas wrote:
>> What's not clear to me is whether it make sense to do 1) without 2) ? Is
>> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
>> apply this patch now, how much of it needs to be rewritten after 2) ? If
>> the answers are "yes" and "not much", then we should review this patch
>> now, and put 2) on the TODO list. Otherwise 2) should do done first.
>
> My answers are "yes" but "completely rewritten".

Any particular reason for you to say that?  Because an UPDATE might have 
a RETURNING clause, all the updated tuples have to go through the 
ModifyTable node one at a time.  I don't see why we couldn't LIMIT there 
after implementing #2.


.marko



Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/09/10 16:57), Marko Tiikkaja wrote:
> On 2014-09-10 04:25, Etsuro Fujita wrote:
>> (2014/09/09 18:57), Heikki Linnakangas wrote:
>>> What's not clear to me is whether it make sense to do 1) without 2) ? Is
>>> UPDATE .. LIMIT without support for an ORDER BY useful enough? And if we
>>> apply this patch now, how much of it needs to be rewritten after 2) ? If
>>> the answers are "yes" and "not much", then we should review this patch
>>> now, and put 2) on the TODO list. Otherwise 2) should do done first.
>>
>> My answers are "yes" but "completely rewritten".
>
> Any particular reason for you to say that?  Because an UPDATE might have
> a RETURNING clause, all the updated tuples have to go through the
> ModifyTable node one at a time.  I don't see why we couldn't LIMIT there
> after implementing #2.

The reason is because I think that after implementing #2, we should 
re-implement this feature by extending the planner to produce a plan 
tree such as ModifyTable+Limit+Append, maybe with LockRows below the 
Limit node.  Such an approach would prevent duplication of the LIMIT 
code in ModifyTable, making the ModifyTable code more simple, I think.

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 9/10/14 11:25 AM, Etsuro Fujita wrote:
> The reason is because I think that after implementing #2, we should
> re-implement this feature by extending the planner to produce a plan
> tree such as ModifyTable+Limit+Append, maybe with LockRows below the
> Limit node.  Such an approach would prevent duplication of the LIMIT
> code in ModifyTable, making the ModifyTable code more simple, I think.

You can already change *this patch* to do ModifyTable <- Limit <- 
LockRows.  If we think that's what we want, we should rewrite this patch 
right now.  This isn't a reason not to implement LIMIT without ORDER BY.

Like I said upthread, I think LockRows is a bad idea, but I'll need to 
run some performance tests first.  But whichever method we decide to 
implement for this patch shouldn't need to be touched when the changes 
to UPDATE land, so your reasoning is incorrect.


.marko



Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/09/10 18:33), Marko Tiikkaja wrote:
> On 9/10/14 11:25 AM, Etsuro Fujita wrote:
>> The reason is because I think that after implementing #2, we should
>> re-implement this feature by extending the planner to produce a plan
>> tree such as ModifyTable+Limit+Append, maybe with LockRows below the
>> Limit node.  Such an approach would prevent duplication of the LIMIT
>> code in ModifyTable, making the ModifyTable code more simple, I think.

> You can already change *this patch* to do ModifyTable <- Limit <-
> LockRows.  If we think that's what we want, we should rewrite this patch
> right now.

I think it might be relatively easy to do that for single-table cases, 
but for inheritance cases, inheritance_planner needs work and I'm not 
sure how much work it would take ...

> Like I said upthread, I think LockRows is a bad idea, but I'll need to
> run some performance tests first.  But whichever method we decide to
> implement for this patch shouldn't need to be touched when the changes
> to UPDATE land, so your reasoning is incorrect.

Yeah, as you say, we need the performance tests, and I think it would 
depend on those results whether LockRows is a bad idea or not.

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Marko Tiikkaja
Дата:
On 9/10/14 12:05 PM, Etsuro Fujita wrote:
> (2014/09/10 18:33), Marko Tiikkaja wrote:
>> You can already change *this patch* to do ModifyTable <- Limit <-
>> LockRows.  If we think that's what we want, we should rewrite this patch
>> right now.
>
> I think it might be relatively easy to do that for single-table cases,
> but for inheritance cases, inheritance_planner needs work and I'm not
> sure how much work it would take ...

Uh.  Yeah, I think I'm an idiot and you're right.

I'll try and get some benchmarking done and see what comes out.


.marko



Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/08/15 6:18), Rukh Meski wrote:
> Based on the feedback on my previous patch, I've separated only the
> LIMIT part into its own feature.  This version plays nicely with
> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
> into batches more easily and efficiently.

IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is 
that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be 
allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be 
better for the patch to support OFFSET at this point.  No?

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Kevin Grittner
Дата:
Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:

> (2014/08/15 6:18), Rukh Meski wrote:
>> Based on the feedback on my previous patch, I've separated only the
>> LIMIT part into its own feature.  This version plays nicely with
>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>> into batches more easily and efficiently.
>
> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
> that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
> allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
> better for the patch to support OFFSET at this point.  No?

Without ORDER BY you really would have no idea *which* rows the
OFFSET would be skipping.  Even more dangerously, you might *think*
you do, and get a surprise when you see the results (if, for
example, a seqscan starts at a point other than the start of the
heap, due to a concurrent seqscan from an unrelated query).  It
might be better not to provide an illusion of a degree of control
you don't have, especially for UPDATE and DELETE operations.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: LIMIT for UPDATE and DELETE

От
Robert Haas
Дата:
On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
>> (2014/08/15 6:18), Rukh Meski wrote:
>>> Based on the feedback on my previous patch, I've separated only the
>>> LIMIT part into its own feature.  This version plays nicely with
>>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>>> into batches more easily and efficiently.
>>
>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
>> that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
>> allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
>> better for the patch to support OFFSET at this point.  No?
>
> Without ORDER BY you really would have no idea *which* rows the
> OFFSET would be skipping.  Even more dangerously, you might *think*
> you do, and get a surprise when you see the results (if, for
> example, a seqscan starts at a point other than the start of the
> heap, due to a concurrent seqscan from an unrelated query).  It
> might be better not to provide an illusion of a degree of control
> you don't have, especially for UPDATE and DELETE operations.

Fair point, but I'd lean toward including it.  I think we all agree
the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
meaning.  If we don't include it now, we'll just end up adding it
later.  It makes for fewer patches, and fewer changes for users, if we
do it all at once.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: LIMIT for UPDATE and DELETE

От
Etsuro Fujita
Дата:
(2014/09/17 1:58), Robert Haas wrote:
> On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
>> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
>>> (2014/08/15 6:18), Rukh Meski wrote:
>>>> Based on the feedback on my previous patch, I've separated only the
>>>> LIMIT part into its own feature.  This version plays nicely with
>>>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>>>> into batches more easily and efficiently.
>>>
>>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
>>> that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
>>> allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
>>> better for the patch to support OFFSET at this point.  No?
>>
>> Without ORDER BY you really would have no idea *which* rows the
>> OFFSET would be skipping.  Even more dangerously, you might *think*
>> you do, and get a surprise when you see the results (if, for
>> example, a seqscan starts at a point other than the start of the
>> heap, due to a concurrent seqscan from an unrelated query).  It
>> might be better not to provide an illusion of a degree of control
>> you don't have, especially for UPDATE and DELETE operations.
>
> Fair point, but I'd lean toward including it.  I think we all agree
> the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
> meaning.  If we don't include it now, we'll just end up adding it
> later.  It makes for fewer patches, and fewer changes for users, if we
> do it all at once.

I agree with Robert.

Rukh, what do you think as an author?

Thanks,

Best regards,
Etsuro Fujita



Re: LIMIT for UPDATE and DELETE

От
Heikki Linnakangas
Дата:
On 09/24/2014 05:22 AM, Etsuro Fujita wrote:
> (2014/09/17 1:58), Robert Haas wrote:
>> On Tue, Sep 16, 2014 at 11:31 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
>>> Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
>>>> (2014/08/15 6:18), Rukh Meski wrote:
>>>>> Based on the feedback on my previous patch, I've separated only the
>>>>> LIMIT part into its own feature.  This version plays nicely with
>>>>> inheritance.  The intended use is splitting up big UPDATEs and DELETEs
>>>>> into batches more easily and efficiently.
>>>>
>>>> IIUC, the patch doesn't support OFFSET with UPDATE/DELETE ... LIMIT.  Is
>>>> that OK?  When we support ORDER BY ... LIMIT/OFFSET, we will also be
>>>> allowing for OFFSET with UPDATE/DELETE ... LIMIT.  So, ISTM it would be
>>>> better for the patch to support OFFSET at this point.  No?
>>>
>>> Without ORDER BY you really would have no idea *which* rows the
>>> OFFSET would be skipping.  Even more dangerously, you might *think*
>>> you do, and get a surprise when you see the results (if, for
>>> example, a seqscan starts at a point other than the start of the
>>> heap, due to a concurrent seqscan from an unrelated query).  It
>>> might be better not to provide an illusion of a degree of control
>>> you don't have, especially for UPDATE and DELETE operations.
>>
>> Fair point, but I'd lean toward including it.  I think we all agree
>> the end goal is ORDER BY .. LIMIT, and there OFFSET certainly has
>> meaning.  If we don't include it now, we'll just end up adding it
>> later.  It makes for fewer patches, and fewer changes for users, if we
>> do it all at once.
>
> I agree with Robert.
>
> Rukh, what do you think as an author?

I have marked this as "returned with feedback" in the commitfest app. 
What I'd like to see happen next is:

Rewrite how UPDATEs work, per Tom's suggestion here: 
http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us. Then 
implement ORDER BY ... LIMIT on top of that.


A lot of people would also be willing to settle for just implementing 
LIMIT without ORDER BY, as a stopgap measure. But the UPDATE rewrite is 
what would make everyone most happy.

- Heikki