Обсуждение: Hash index performance/operation questions...

Поиск
Список
Период
Сортировка

Hash index performance/operation questions...

От
scott jacobs
Дата:
Hi! I'm a bit confused regarding hashed indexes in postgres. I'm hoping
someone can either answer my questions or point me towards some
documentation that answers them.

How come the performance of a hashed index created on an empty table into
which data is copied is so poor (on my machine)? Selects from this table
are really no faster than on an unindexed table. Once the table is loaded,
if I drop and recreate the index, performance is much better (at least 9X
in my case).  Am I misunderstanding the part of the create_index man page
that says:
    "We mention the algorithms used solely to indicate that all
     of these access methods are fully dynamic and do not have to
     be optimized periodically (as is the case with, for example,
    static hash access methods). "

Is there a chance that copying the data into the table has something to do
with it? Does the index still get dynamically optimised if I'm not
inserting the data?  I am going to try inserting the data later, but don't
have access to the database right now and I thought maybe I can get an
answer before then.

Also...

Is there a performance hit for making an SQL query into a hashed table
which returns no results? I've got ~50,000 unique keys and am iterating
over them, testing for the presence of each one in a 400 entry (not
necessarily keyed-unique) hash-indexed table. This is taking _much_ longer
than if I iterate over the 50,000 keys, looking each one up in the
hash-indexed 50,000 entry table. I can't understand why.  (note: Before
anyone gets confused: It's me who knows the keys are unique, not the
database. Hashed indexes don't support the UNIQUE keyword.)

If it makes any difference, I am doing all of the CREATE TABLE, CREATE
INDEX, and COPY FROM calls within the psql program and all the SELECT calls
using perl's DBI.

Thanks for any help...
scott

-----------------------------------------------------------------------
 scott jacobs                                        interactive magic
 net admin                              strategy and simulation gaming
-----------------------------------------------------------------------

Re: [GENERAL] Hash index performance/operation questions...

От
Vadim Mikheev
Дата:
scott jacobs wrote:
>
> Hi! I'm a bit confused regarding hashed indexes in postgres. I'm hoping
> someone can either answer my questions or point me towards some
> documentation that answers them.
>
> How come the performance of a hashed index created on an empty table into
> which data is copied is so poor (on my machine)? Selects from this table
> are really no faster than on an unindexed table. Once the table is loaded,
> if I drop and recreate the index, performance is much better (at least 9X
> in my case).  Am I misunderstanding the part of the create_index man page
> that says:
>         "We mention the algorithms used solely to indicate that all
>          of these access methods are fully dynamic and do not have to
>          be optimized periodically (as is the case with, for example,
>         static hash access methods). "
>
> Is there a chance that copying the data into the table has something to do
> with it? Does the index still get dynamically optimised if I'm not
> inserting the data?  I am going to try inserting the data later, but don't
> have access to the database right now and I thought maybe I can get an
> answer before then.

Indices are dynamic, but statistic about table pages/rows is not:
run vacuum or, better, create index AFTER loading data into table -
this is faster.

Also note, that nothing was done for hash indices last 2 years -
btree are the most supported indices...

Vadim