Обсуждение: UPDATE and Indexes and Performance

Поиск
Список
Период
Сортировка

UPDATE and Indexes and Performance

От
Bill Thoen
Дата:
Does PG (8.1) ever use existing indexes when executing an UPDATE?

I've got some tables with millions of records and whenever I update a
column that involves most or all the records the EXPLAIN command seems
to indicate that it isn't using the pre-existing indexes. This result in
a slow update, which is further slowed by the presence of indexes. So
when doing a large update should I just drop the indexes first, or is
there some good reason to keep them?

Thanks,
- Bill Thoen


Re: UPDATE and Indexes and Performance

От
"Joshua Tolley"
Дата:
On Wed, Oct 15, 2008 at 10:42 AM, Bill Thoen <bthoen@gisnet.com> wrote:
> Does PG (8.1) ever use existing indexes when executing an UPDATE?
>
> I've got some tables with millions of records and whenever I update a column
> that involves most or all the records the EXPLAIN command seems to indicate
> that it isn't using the pre-existing indexes. This result in a slow update,
> which is further slowed by the presence of indexes. So when doing a large
> update should I just drop the indexes first, or is there some good reason to
> keep them?
>
> Thanks,
> - Bill Thoen

You have to hit the disk twice if you use an index -- once to read the
index and once to read the table itself. If the planner guesses that
an operation involves most of the records in a table, it will avoid
indexes and just seqscan, because it's faster.

- Josh / eggyknap

Re: UPDATE and Indexes and Performance

От
"Scott Marlowe"
Дата:
On Wed, Oct 15, 2008 at 10:42 AM, Bill Thoen <bthoen@gisnet.com> wrote:
> Does PG (8.1) ever use existing indexes when executing an UPDATE?
>
> I've got some tables with millions of records and whenever I update a column
> that involves most or all the records the EXPLAIN command seems to indicate
> that it isn't using the pre-existing indexes. This result in a slow update,
> which is further slowed by the presence of indexes. So when doing a large
> update should I just drop the indexes first, or is there some good reason to
> keep them?

You're assuming that seq scan is making it slow.  You can always use
the enable_xxx settings to turn off sequential scan etc to see if it
runs faster with an index.  Also, you might have a tuning issue going
on and indexed lookups would be faster.

If you're hitting every record, it's probably best to do a seq scan as
index scans, as previously mentioned hit both the index and the table.

What's your work_mem set to?  What about random_page_cost,
effective_cache_size, and shared_buffers?

Re: UPDATE and Indexes and Performance

От
Craig Ringer
Дата:
Bill Thoen wrote:
> Does PG (8.1) ever use existing indexes when executing an UPDATE?
>
> I've got some tables with millions of records and whenever I update a
> column that involves most or all the records the EXPLAIN command seems
> to indicate that it isn't using the pre-existing indexes. This result in
> a slow update, which is further slowed by the presence of indexes. So
> when doing a large update should I just drop the indexes first, or is
> there some good reason to keep them?

Joshua Tolley explained why it's doing a sequential scan, and why that's
a good thing.

As for the added cost of maintaining indexes when doing the UPDATE -
yes, you might want to consider dropping the index(es) before issuing
the UPDATE and then recreating it/them afterwards. That can be
considerably faster.

I have the feeling you'd need to drop the index then COMMIT before you
ran the update and recreated the index, though, since Pg probably can't
really get rid of the index if it's still visible to other transactions
and might be restored by a ROLLBACK anyway. I'm not sure, though -
explicit locking might be used to handle that, I haven't looked into it.

--
Craig Ringer