Обсуждение: Indexed leading substring searches - worked, now doesn't

Поиск
Список
Период
Сортировка

Indexed leading substring searches - worked, now doesn't

От
Wes
Дата:
I know my leading substring searches used to be done via indexes.  We
specifically tested that.  Since the last time I tested it, the database has
probably been reloaded to fix a corruption problem.  Now all I can get is
sequential leading substring searches.  In the examples below, the database
was vacuumed last night.

The database is very large (currently about 100 GB, and will be 100's of
gigabytes), so performance is important.  This particular table in the
example has only about 2.8 million rows.

PostgreSQL version is 7.4.5.  Prior to the reload, it was 7.4.1.

The locale is showing up as en_US.iso885915.  As far as I know, it was
always this (default RedHat install), so I don't understand why it worked
before.  Did something change between 7.4.1 and 7.4.5?  I supposed it's
possible that I specified locale=C on the original database and don't
remember that...

I'm not going to have to "initdb --locale=C" and am I?  I looked at index
classes, and that doesn't appear to be something I want to do, due to
performance.  What kind of performance hit do you actually take by using an
index class?

Wes


Pg_controldata shows:

Maximum length of locale name:        128
LC_COLLATE:                           en_US.iso885915
LC_CTYPE:                             en_US.iso885915


narc=> \d addresses
             Table "public.addresses"
   Column    |          Type          | Modifiers
-------------+------------------------+-----------
 address_key | numeric(12,0)          | not null
 address     | character varying(255) | not null
Indexes:
    "addresses_pkey" primary key, btree (address_key)
    "addresses_i_address" btree (address)

narc=> select count(*) from addresses;
  count
---------
 2829640
(1 row)

narc=> explain select * from addresses where address = 'blah';
                                      QUERY PLAN
----------------------------------------------------------------------------
----------
 Index Scan using addresses_i_address on addresses  (cost=0.00..2.81 rows=1
width=40)
   Index Cond: ((address)::text = 'blah'::text)
(2 rows)

narc=> explain select * from addresses where address like 'blah%';
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on addresses  (cost=0.00..61244.68 rows=2 width=40)
   Filter: ((address)::text ~~ 'blah%'::text)
(2 rows)

narc=> explain analyze select * from addresses where address like 'blah%';
                                                  QUERY PLAN
----------------------------------------------------------------------------
----------------------------------
 Seq Scan on addresses  (cost=0.00..61244.68 rows=2 width=40) (actual
time=1445.386..8913.435 rows=6 loops=1)
   Filter: ((address)::text ~~ 'blah%'::text)
 Total runtime: 8913.504 ms
(3 rows)


Something else doesn't make sense..  I did the same query a few minutes ago
and the cost was totally different:


narc=> explain select * from addresses where address like 'blah%';
                                QUERY PLAN
--------------------------------------------------------------------------
 Seq Scan on addresses  (cost=100000000.00..100061244.67 rows=2 width=40)
   Filter: ((address)::text ~~ 'blah%'::text)
(2 rows)




Re: Indexed leading substring searches - worked, now doesn't

От
Tom Lane
Дата:
Wes <wespvp@syntegra.com> writes:
> The locale is showing up as en_US.iso885915.  As far as I know, it was
> always this (default RedHat install), so I don't understand why it worked
> before.  Did something change between 7.4.1 and 7.4.5?  I supposed it's
> possible that I specified locale=C on the original database and don't
> remember that...

You got it in one ...

> I'm not going to have to "initdb --locale=C" and am I?

You could use the alternative operator classes for pattern searches, but
if you want the same indexes to also serve for normal text sorting,
C locale is the better bet.

            regards, tom lane

Re: Indexed leading substring searches - worked, now

От
Wes
Дата:
On 2/3/05 2:29 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> You could use the alternative operator classes for pattern searches, but
> if you want the same indexes to also serve for normal text sorting,
> C locale is the better bet.

As an interim solution until I can reload the database (takes a weekend),
can I define two indexes on the same field, one using operator classes and
one not (and have them automatically used as appropriate)?

Because of the time involved, I'm trying to hold off on another reload until
we upgrade to 8.x.

Wes



Re: Indexed leading substring searches - worked, now doesn't

От
Tom Lane
Дата:
Wes <wespvp@syntegra.com> writes:
> As an interim solution until I can reload the database (takes a weekend),
> can I define two indexes on the same field, one using operator classes and
> one not (and have them automatically used as appropriate)?

Certainly.

            regards, tom lane