Discussion: Alter/update large tables - VERRRY annoying behaviour!
Hi, everybody!

I was wondering if this is a known "feature" to begin with, and, if so, whether there are any plans to fix it in the future. I had two very large tables in my database (about 30 million rows each), connected by a foreign key, and wanted to merge them together. Something like this:

create table a (id int primary key, some_data int);
create table b (id int unique references a, other_data int);

So, what I did was:

alter table a add other_data int;
update a set other_data = b.other_data from b where b.id = a.id;

This took awfully long, but worked (I guess). I say 'I guess' because I haven't been able to verify that so far - when I tried to do

select * from a limit 1;

it just hung on me... at least, it looks like it did. Lucky me, I had compiled the backend from source with full debug info, because if I hadn't done that (as most users haven't), I would certainly have thought that my database was hopelessly corrupted, and would have had to recreate it from scratch :-( Instead, I loaded the whole thing into a debugger, because that seemed to be the only way to figure out what the hell it was thinking about...

What I found out was that it seems to have recreated my entire table when I updated it, and left all the old tuples in it as well; so my 'select * ... limit 1' query was cycling through 30 million deleted tuples, trying to find the first one that was still valid, and that's what was taking so long.

First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert? Or is my problem specific to the fact that I altered the table and added a new column?

Now, I understand that if I had vacuumed it, the problem would have been resolved. The problems I have with that, though, are:

- As I said, the behaviour was so unexpected that I would have trashed the whole database if I hadn't been able to debug it.
If there is no other possible solution, I think, at the very least, it should give the user some indication that it's not hopelessly hung when doing that query.

- Vacuum isn't the speediest thing in the world either (it's been running for an hour now and still has not finished). I was hoping to finish modifying my schema first, and then just vacuum everything once.

So it would be REALLY, REALLY helpful for situations like this if PG were smart enough to keep track somehow of those deleted tuples, to avoid having to scan through them all every time. In my particular situation, the solution would be trivial (just remembering the address of the first valid tuple would suffice, because the entire table was updated). I am not familiar enough with the internals to suggest anything more general than this, but EVEN fixing only this particular scenario would, I believe, be extremely useful. Do you agree?

Thanks a lot!

Dima.
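[Editor's note: for readers hitting the same problem, one way to avoid leaving 30 million dead tuples behind is to build the merged table in a single pass rather than rewriting every row with ALTER + UPDATE. A sketch, reusing the table and column names from the message above; the constraint-recreation step is only outlined:]

```sql
-- Build the merged table fresh, so the table you then query contains
-- no dead row versions at all:
CREATE TABLE a_new AS
    SELECT a.id, a.some_data, b.other_data
    FROM a
    LEFT JOIN b ON b.id = a.id;

-- Then swap the tables:
DROP TABLE a;
ALTER TABLE a_new RENAME TO a;
-- (re-add the primary key and any foreign key constraints as needed)
```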
On Mon, 15 Apr 2002 13:07:20 -0400
"Dmitry Tkach" <dmitry@openratings.com> wrote:

> Hi, everybody!

Hi Dmitry! Don't cross-post! It's annoying!

> This took me awfully long, but worked (I guess).
> I say 'I guess', because I wasn't able so far to verify that - when I tried to do
>
> select * from a limit 1;
>
> It just hung on me ... at least, it looks like it does.

This didn't hang, it just requires a sequential scan of the whole table. As you observe below, it will also need to scan through dead tuples, but that is just a product of MVCC and there's no real way around it. Once you VACUUM, the dead tuples will be removed and sequential scans should be fast once more.

And before assuming that something has hung, it's a good idea to look at the output of EXPLAIN for that query, as well as monitoring system performance (through top, vmstat, etc.) to see what the system is doing.

> Lucky me, I have compiled the backend from sources with full debug info, because if I hadn't done that,
> (as most users), I would certainly have thought that my database was hopelessly corrupted, and would have to
> recreate it from scratch :-(

That's a ludicrous conclusion.

> First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?

Yes, AFAIK -- MVCC requires this.

> - Vacuum isn't the speediest thing in the world either (it's been running for an hour now, and still has not finished).

Is this 7.2? If not, VACUUM should be substantially faster in 7.2. In any case, you'll always want to VACUUM or VACUUM FULL (and ANALYZE) when you change your tables in such a dramatic fashion.

Cheers,

Neil

-- 
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
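[Editor's note: Neil's two suggestions, written out as concrete commands one might run in psql - a sketch, with the table name taken from the thread:]

```sql
-- Check the plan before assuming a hang:
EXPLAIN SELECT * FROM a LIMIT 1;

-- After a full-table UPDATE, reclaim the dead tuples and refresh the
-- planner statistics in one pass:
VACUUM ANALYZE a;
```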
Neil Conway wrote:

> On Mon, 15 Apr 2002 13:07:20 -0400
> "Dmitry Tkach" <dmitry@openratings.com> wrote:
>
>> Hi, everybody!
>
> Hi Dmitry! Don't cross-post! It's annoying!

What do you mean by 'cross-post'? Are you saying that posting to several lists at a time is annoying? I just thought that this problem might be interesting to people who read those lists (and not necessarily ALL of them)... What's annoying about it?

>> This took me awfully long, but worked (I guess).
>> I say 'I guess', because I wasn't able so far to verify that - when I tried to do
>>
>> select * from a limit 1;
>>
>> It just hung on me ... at least, it looks like it does.
>
> This didn't hang, it just requires a sequential scan of the whole table.

I know it does (as I said below). The point is that it SHOULD NOT, and especially that I can't imagine anyone not familiar with postgres internals expecting that it would - all it needs to do is grab the first row and return immediately. That's what it would do if you just created a new table and populated it with data.

> As you observe below, it will also need to scan through dead tuples,

Not 'also' - JUST the dead ones! That's what's especially annoying about it - as soon as it finds the first tuple that's not dead, it returns.

> but that is just a product of MVCC and there's no real way around
> it.

My whole point is that I don't believe it (that there is no way around) :-) For one thing, I have never seen ANY database engine (Oracle, Informix, DB2) that would take more than a second to get the first row from a table, regardless of what has been done to that table before. That (and my common sense too) tells me that there MUST be a 'way around it'. I can see that it's not currently implemented in postgres, but I do believe (and that's the whole point of me posting this message in the first place) that it is a huge usability issue and really needs to be fixed.

> Once you VACUUM the dead tuples will be removed and sequential
> scans should be fast once more.

Yeah... I hope so. I am running vacuum on that table. It's been running for 6 hours now and still has not finished. Doesn't it look to you like a little too much trouble to go through just to take a look at the first row of a table? :-)

And, once again, I am not done modifying that schema - this is just an intermediate step, which means I will have to do the vacuum all over again when I am finished...

This seems like WAY too much trouble to me :-(

> And before assuming that something has hung, it's a good idea to
> look at the output of EXPLAIN for that query, as well as monitoring
> system performance (through top, vmstat, etc) to see what the
> system is doing.

Yeah, right...

explain select * from a limit 1;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..1.01 rows=1 width=46)
  ->  Seq Scan on a  (cost=0.00..32529003.00 rows=32243660 width=46)

EXPLAIN

There is absolutely nothing in this plan that would suggest it will go on executing for ages... Look at the 'cost' value, for example... In any event, there is nothing different in this plan from what I was getting before I modified the table (when the query would take just a few milliseconds to execute).

As for monitoring system performance... Well, I could see it maxing out on CPU usage and disk IO at times... How exactly does that help me realize it had not hung?

(Let me clarify that - by 'hung' I mean 'not going to return the results in any reasonable time', not necessarily 'not doing anything at all')

>> Lucky me, I have compiled the backend from sources with full debug info, because if I hadn't done that,
>> (as most users), I would certainly have thought that my database was hopelessly corrupted, and would have to
>> recreate it from scratch :-(
>
> That's a ludicrous conclusion.

Why is it?

>> First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?
>
> Yes, AFAIK -- MVCC requires this.

What's MVCC?

>> - Vacuum isn't the speediest thing in the world either (it's been running for an hour now, and still has not finished).
>
> Is this 7.2? If not, VACUUM should be substantially faster in 7.2.

Yes, it is 7.2.

> In any case, you'll always want to VACUUM or VACUUM FULL (and
> ANALYZE) when you change your tables in such a dramatic fashion.

I know... Once again, I was hoping to be able to complete my changes before doing the vacuum :-(

Thanks for your reply!

Dima
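[Editor's note: to make the "update = delete + insert" point concrete, here is a small sketch one can run in psql. The system columns ctid and xmin expose the new row version that an UPDATE creates; the table name t is invented for the example:]

```sql
CREATE TABLE t (id int PRIMARY KEY, val int);
INSERT INTO t VALUES (1, 10);

SELECT ctid, xmin, * FROM t;      -- one row version, e.g. at ctid (0,1)

UPDATE t SET val = 20 WHERE id = 1;

-- The visible row now lives at a new ctid; the old version remains in
-- the heap as a dead tuple until VACUUM reclaims it:
SELECT ctid, xmin, * FROM t;
```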
> Hi Dmitry! Don't cross-post! It's annoying!

Is overly cross-posting or HTML mail more annoying? I can't decide. :-)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Bruce Momjian wrote:

>> Hi Dmitry! Don't cross-post! It's annoying!
>
> Is overly cross posting or HTML mail more annoying? I can't decide. :-)

Do you mean I am posting in HTML???? I am sorry if that's the case - I was sure that I had turned it off :-( Somehow, when I CC myself, I get a text-only message... Do you see it as HTML?

Dima
On Mon, 15 Apr 2002, Dmitry Tkach wrote:

> Neil Conway wrote:
>
>> On Mon, 15 Apr 2002 13:07:20 -0400
>> "Dmitry Tkach" <dmitry@openratings.com> wrote:
>
>>> This took me awfully long, but worked (I guess).
>>> I say 'I guess', because I wasn't able so far to verify that - when I tried to do
>>> select * from a limit 1;
>>> It just hung on me ... at least, it looks like it does.
>
>> This didn't hang, it just requires a sequential scan of the whole table.
>
> I know it does (as I said below). The point is that it SHOULD NOT, and
> especially that I can't imagine anyone not familiar with postgres
> internals expecting that it would - all it needs to do is grab the
> first row and return immediately.
> That's what it would do if you just created a new table and populated it
> with data.
>
>> As you observe below, it will also need to scan through dead tuples,
>
> Not 'also' - JUST the dead ones! That's what's especially annoying about
> it - as soon as it finds the first tuple that's not dead, it returns.
>
>> but that is just a product of MVCC and there's no real way around
>> it.
>
> My whole point is that I don't believe it (that there is no way around)
> :-)
> For one thing, I have never seen ANY database engine (Oracle, Informix,
> DB2) that would take more than a second to get the first row from a
> table, regardless of what has been done to that table before.
> That (and my common sense too) tells me that there MUST be a 'way around
> it'.
> I can see that it's not currently implemented in postgres, but I do
> believe (and that's the whole point of me posting this message in the
> first place) that it is a huge usability issue and really needs to be
> fixed.

Ok, so I realise that addressing this issue of using a limit of 1 doesn't address anything associated with using this table for something useful; however, I find it an interesting point. All that has been asked for is that a single row be returned. This doesn't have to be the first valid row at all. Indeed, isn't it one of the axioms of SQL that data is returned in some arbitrary order unless specifically forced into an order? Therefore, shouldn't doing SELECT * FROM x LIMIT 1 use an index, any index, to just fetch a single row?

Or is this an effect of a row needing to be visited in order to determine its 'visibility', as proposed as a reason why a sequential scan was being performed for one of my queries I posted about recently? I am still a little confused about that, though. If there is an index and deletes have been committed, shouldn't the index have been updated and forced to forget about the deleted rows, unless there is a transaction open that can still access those deleted items?

As for the rest of the argument, is it constructive? I for one thought the original poster had done more than a normal user, me included, would have done before posting.

> [stuff deleted]
>
>>> First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?
>
>> Yes, AFAIK -- MVCC requires this.
>
> What's MVCC?

Funny, I was about to ask that question. Something about variable size of fields in the physical storage?

>>> - Vacuum isn't the speediest thing in the world either (it's been running for an hour now, and still has not finished).
>
>> Is this 7.2? If not, VACUUM should be substantially faster in 7.2.
>
> Yes, it is 7.2

Pleased I haven't tried what I'm doing in a version older than 7.2 in that case, and my table's only 1 million rows.

> [more deletions]

However, a useful thread, since it explains a 'feature' I ran into earlier today after managing about 4/5 of an update of every row of my table. I had been thinking the problem was large storage usage by the txtidx type; I realise now it is the updated = deleted+inserted storage requirement. I'm thinking of changing what I'm doing to a way I considered earlier [without thinking about this thread] so I can stay within the limits of my resources.

-- 
Nigel J. Andrews
Director
Logictree Systems Limited
Computer Consultants
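[Editor's note on Nigel's index question: adding an ORDER BY on an indexed column can make the planner walk the index instead of seq-scanning, but in this era of PostgreSQL the index still points at dead row versions too - each index entry must be checked against the heap for visibility, so dead tuples are skipped but not free. A sketch, using the a table from the original post, where id is the primary key:]

```sql
-- Can use an index scan on the primary key rather than a sequential
-- scan of the heap:
EXPLAIN SELECT * FROM a ORDER BY id LIMIT 1;
-- Expected shape of the plan (details vary):
-- Limit
--   ->  Index Scan using a_pkey on a
```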
It's sad replying to my own question [of sorts] but...

On Mon, 15 Apr 2002, Nigel J. Andrews wrote:

> On Mon, 15 Apr 2002, Dmitry Tkach wrote:
>
>>> First of all, a question for you - is ANY update to a table equivalent (in this respect) to a delete+insert?
>
>>> Yes, AFAIK -- MVCC requires this.
>
>> What's MVCC?
>
> Funny, I was about to ask that question. Something about variable size of
> fields in the physical storage?

I've remembered, almost: Multi Version ConCurrency?

[plenty deleted from those quoted messages, but you all realised that or don't care]

-- 
Nigel J. Andrews
Director
Logictree Systems Limited
Computer Consultants
Nigel J. Andrews wrote:

> On Mon, 15 Apr 2002, Nigel J. Andrews wrote:
>
>> On Mon, 15 Apr 2002, Dmitry Tkach wrote:
>>
>>> What's MVCC?
>>
>> Funny, I was about to ask that question. Something about variable size of
>> fields in the physical storage?
>
> I've remembered, almost: Multi Version ConCurrency?

Almost: Multi-Version Concurrency Control

http://www.postgresql.org/idocs/index.php?mvcc.html
http://www.onlamp.com/pub/a/onlamp/2001/05/25/postgresql_mvcc.html

If you need more info, Google is your friend.

Jochem
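[Editor's note: for anyone else wondering what MVCC buys - the old row versions are kept precisely so that readers never block writers, and vice versa. A two-session sketch (S1 and S2 are separate psql connections; table t is invented for the example):]

```sql
-- S1:
BEGIN;
SELECT val FROM t WHERE id = 1;      -- reads the current row version

-- S2, while S1's transaction is still open:
UPDATE t SET val = val + 1 WHERE id = 1;   -- does not block on S1's read
COMMIT;

-- The UPDATE created a new row version instead of overwriting the one
-- S1 was reading, which is why neither session had to wait.

-- S1:
COMMIT;
```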
Revisiting this thread, at least I think it was this thread, I don't remember anyone pointing out contrib/pgstattuple for monitoring the number of dead tuples. Just thought I'd mention it. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
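[Editor's note: a sketch of how contrib/pgstattuple is used, assuming the module is installed; column names per its documentation, table name from the thread:]

```sql
SELECT * FROM pgstattuple('a');
-- Reports, among other things, tuple_count, dead_tuple_count and
-- dead_tuple_percent - enough to see whether a VACUUM is overdue.
```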