Thread: Very long times to build hash indexes

Very long times to build hash indexes

From
"David Monarchi"
Date:
Hello -

We have a database of about 250GB.  The core table contains about 140M rows, and every column in it is an integer or small integer.  Aside from the key field, the rest are all foreign keys.  Virtually all of our queries use equalities rather than inequalities/ranges.  The database changes at a fairly even rate due to insertions and deletions.  Updates are rare.  Insertions dominate the environment, and we expect to have about 400M rows in the core table by the end of the year.

We need to build indexes on 10 foreign key fields in the core table.  Based on the type of queries and the fact that insertions in it are fast, we are building hash indexes on those fields.  We have successfully built 5 of the 10 hash indexes.  Each one required about 20 hours to construct. 

When we got to the 6th field, we found that the indexing process would not terminate even after 70 hours.  We then tried the 7th field with the same result.  Is there something that we've overlooked?  Is there a limit of some kind that we've missed?

Any advice/suggestions would be very much appreciated.

Thank you.

david

Re: Very long times to build hash indexes

From
Tom Lane
Date:
"David Monarchi" <david.e.monarchi@gmail.com> writes:
> We need to build indexes on 10 foreign key fields in the core table.  Based
> on the type of queries and the fact that insertions in it are fast, we are
> building hash indexes on those fields.  We have successfully built 5 of the
> 10 hash indexes.  Each one required about 20 hours to construct.

> When we got to the 6th field, we found that the indexing process would not
> terminate even after 70 hours.  We then tried the 7th field with the same
> result.  Is there something that we've overlooked?

The short answer is that Postgres' hash indexes suck.  The degree of
suckiness varies by PG version (which you failed to mention) but there
is no release currently in which I would use them in preference to a
btree index.  The lack of WAL support is alone a sufficient reason why
they're unacceptable for production use, but on top of that they don't
actually have any performance advantage in any tests that I've seen.

            regards, tom lane
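
For anyone following the advice above, here is a minimal sketch of what switching to btree might look like. The thread never names the table or its columns, so `core_table` and `fk_1` through `fk_10` are hypothetical placeholders, and the helper functions are illustrative only. The script just prints SQL: a `SELECT version();` to capture the PG version that was missing from the original post, followed by one btree `CREATE INDEX` per foreign key field.

```python
# Hypothetical names: the thread does not give the real table or column names.
TABLE = "core_table"
FK_COLUMNS = [f"fk_{i}" for i in range(1, 11)]  # stand-ins for the 10 foreign key fields


def btree_index_ddl(table, column):
    """Return a CREATE INDEX statement for a btree index on one column."""
    index_name = f"{table}_{column}_idx"
    # btree is PostgreSQL's default index type; USING btree is spelled out
    # here only to contrast with USING hash.
    return f"CREATE INDEX {index_name} ON {table} USING btree ({column});"


def build_script(table, columns):
    """Assemble a SQL script: report the server version, then build each index."""
    statements = ["SELECT version();"]
    statements += [btree_index_ddl(table, col) for col in columns]
    return "\n".join(statements)


if __name__ == "__main__":
    print(build_script(TABLE, FK_COLUMNS))
```

The output can be reviewed and then fed to psql. Build times will still depend on the PG version and hardware, so this says nothing about how long each index would take.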