Обсуждение: index on timestamp performance

Поиск
Список
Период
Сортировка

index on timestamp performance

От
Eric Cholet
Дата:
I have this schema:


 motid  | integer                     | not null
 objid  | integer                     | not null
 date   | timestamp without time zone | not null
Indexes: dico_frs_motid_date btree (motid, date)
         dico_frs_objid btree (objid)

The performance I'm getting from the index that contains
'date' is much slower than when using the objid index
(different queries of course). This is a 10 million row
table. Am I right to assume that postgres needs to do
more work because it has to convert the dates to some
internal (integer?) format?
Consequently would I gain performance if I changed the
schema so that 'date' is an integer (which would not
be a problem for my application) ?

Thanks,
--
Eric Cholet


UNSUBSCRIBE ME PLEASE!

От
Evan Macosko
Дата:
To whomever runs this list:

I have tried repeatedly to get majordomo to unsubscribe me from this list,
to no avail.  I do not currently have the means of filtering my inbox at
this address, so receiving these emails is EXTREMELY annoying for me.
PLEASE PLEASE PLEASE PLEASE unsubscribe me.  PLEASE.





On Wed, 29 Jan 2003, Eric Cholet wrote:

> I have this schema:
>
>
>  motid  | integer                     | not null
>  objid  | integer                     | not null
>  date   | timestamp without time zone | not null
> Indexes: dico_frs_motid_date btree (motid, date)
>          dico_frs_objid btree (objid)
>
> The performance I'm getting from the index that contains
> 'date' is much slower than when using the objid index
> (different queries of course). This is a 10 million row
> table. Am I right to assume that postgres needs to do
> more work because it has to convert the dates to some
> internal (integer?) format?
> Consequently would I gain performance if I changed the
> schema so that 'date' is an integer (which would not
> be a problem for my application) ?
>
> Thanks,
> --
> Eric Cholet
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html
>


Re: index on timestamp performance

От
Stephan Szabo
Дата:
On Wed, 29 Jan 2003, Eric Cholet wrote:

> I have this schema:
>
>
>  motid  | integer                     | not null
>  objid  | integer                     | not null
>  date   | timestamp without time zone | not null
> Indexes: dico_frs_motid_date btree (motid, date)
>          dico_frs_objid btree (objid)
>
> The performance I'm getting from the index that contains
> 'date' is much slower than when using the objid index
> (different queries of course). This is a 10 million row
> table. Am I right to assume that postgres needs to do
> more work because it has to convert the dates to some
> internal (integer?) format?

What does explain (analyze if possible) show for the two queries?
It could just be a difference in plans or estimates.


Re: index on timestamp performance

От
Tom Lane
Дата:
Eric Cholet <cholet@logilune.com> writes:
> Consequently would I gain performance if I changed the
> schema so that 'date' is an integer

No, you wouldn't, at least not enough to notice.  (But if you only need
a date and not a time of day, consider changing the column type to
"date".)

I suspect the issue is selectivity of the queries, but with no actual
query examples or EXPLAIN ANALYZE results to look at, that's just idle
speculation.

            regards, tom lane

Re: index on timestamp performance

От
Eric Cholet
Дата:

--On Wednesday, January 29, 2003 08:51:30 -0800 Stephan Szabo
<sszabo@megazone23.bigpanda.com> wrote:

> On Wed, 29 Jan 2003, Eric Cholet wrote:
>
>> I have this schema:
>>
>>
>>  motid  | integer                     | not null
>>  objid  | integer                     | not null
>>  date   | timestamp without time zone | not null
>> Indexes: dico_frs_motid_date btree (motid, date)
>>          dico_frs_objid btree (objid)
>>
>> The performance I'm getting from the index that contains
>> 'date' is much slower than when using the objid index
>> (different queries of course). This is a 10 million row
>> table. Am I right to assume that postgres needs to do
>> more work because it has to convert the dates to some
>> internal (integer?) format?
>
> What does explain (analyze if possible) show for the two queries?

I can't really run two equivalent queries that will each use
a different index. Here's a query that uses the index with 'date'
(output wrapped manually)

=> explain analyze select objid from dico_frs where motid=1247
   and date <= '2003-01-29 17:55:17' and date >= '2002-10-29 17:55:17'
   order by date desc limit 11;

 Limit  (cost=4752.14..4752.17 rows=11 width=12) (actual time=63.20..63.37
rows=11 loops=1)
   ->  Sort  (cost=4752.14..4755.11 rows=1187 width=12) (actual
time=63.17..63.23 rows=12 loops=1)
         Sort Key: date
         ->  Index Scan using dico_frs_motid_date on dico_frs
(cost=0.00..4691.50 rows=1187 width=12) (actual time=0.08..41.88 rows=2924
loops=1)
               Index Cond: ((motid = 1247) AND (date <= '2003-01-29
17:55:17'::timestamp without time zone) AND (date >= '2002-10-29
17:55:17'::timestamp without time zone))
 Total runtime: 63.93 msec
(6 rows)


> It could just be a difference in plans or estimates.

Right, but still I'd like to know whether the timestamp datatype in the
index
results in more work than an integer datatype.

Thanks,
--
Eric Cholet


Re: index on timestamp performance

От
Stephan Szabo
Дата:
On Wed, 29 Jan 2003, Eric Cholet wrote:

> --On Wednesday, January 29, 2003 08:51:30 -0800 Stephan Szabo
> <sszabo@megazone23.bigpanda.com> wrote:
>
> > On Wed, 29 Jan 2003, Eric Cholet wrote:
> >
> >> I have this schema:
> >>
> >>
> >>  motid  | integer                     | not null
> >>  objid  | integer                     | not null
> >>  date   | timestamp without time zone | not null
> >> Indexes: dico_frs_motid_date btree (motid, date)
> >>          dico_frs_objid btree (objid)
> >>
> >> The performance I'm getting from the index that contains
> >> 'date' is much slower than when using the objid index
> >> (different queries of course). This is a 10 million row
> >> table. Am I right to assume that postgres needs to do
> >> more work because it has to convert the dates to some
> >> internal (integer?) format?
> >
> > What does explain (analyze if possible) show for the two queries?
>
> I can't really run two equivalent queries that will each use
> a different index. Here's a query that uses the index with 'date'
> (output wrapped manually)

Well, I was wondering for example, the objid queries you were comparing
to, were they returning an equivalent number of about 3000 rows from the
index scan?  I think part of the difference may be that here the plan
grabs the rows from the index and then sorts them all so the limit doesn't
actually save you significant time.  For better speed on this particular
sort of query, you might be better off with an order of :
 order by motid desc, date desc


> => explain analyze select objid from dico_frs where motid=1247
>    and date <= '2003-01-29 17:55:17' and date >= '2002-10-29 17:55:17'
>    order by date desc limit 11;
>
>  Limit  (cost=4752.14..4752.17 rows=11 width=12) (actual time=63.20..63.37
> rows=11 loops=1)
>    ->  Sort  (cost=4752.14..4755.11 rows=1187 width=12) (actual
> time=63.17..63.23 rows=12 loops=1)
>          Sort Key: date
>          ->  Index Scan using dico_frs_motid_date on dico_frs
> (cost=0.00..4691.50 rows=1187 width=12) (actual time=0.08..41.88 rows=2924
> loops=1)
>                Index Cond: ((motid = 1247) AND (date <= '2003-01-29
> 17:55:17'::timestamp without time zone) AND (date >= '2002-10-29
> 17:55:17'::timestamp without time zone))
>  Total runtime: 63.93 msec
> (6 rows)
>
>
> > It could just be a difference in plans or estimates.
>
> Right, but still I'd like to know whether the timestamp datatype in the
> index
> results in more work than an integer datatype.

Almost certainly a little bit, but not so much that I would expect to see
order of magnitude differences or anything.


Re: index on timestamp performance

От
Eric Cholet
Дата:
--On Wednesday, January 29, 2003 10:45:34 -0800 Stephan Szabo
<sszabo@megazone23.bigpanda.com> wrote:

>
>> >>  motid  | integer                     | not null
>> >>  objid  | integer                     | not null
>> >>  date   | timestamp without time zone | not null
>> >> Indexes: dico_frs_motid_date btree (motid, date)
>> >>          dico_frs_objid btree (objid)
>> >>

> Well, I was wondering for example, the objid queries you were comparing
> to, were they returning an equivalent number of about 3000 rows from the
> index scan?  I think part of the difference may be that here the plan
> grabs the rows from the index and then sorts them all so the limit doesn't
> actually save you significant time.  For better speed on this particular
> sort of query, you might be better off with an order of :
>  order by motid desc, date desc

Indeed, the speed gain is amazing, thanks for your explanation, I understand
things a little better now. Note to self: next time don't make any
assumptions,
just post the explain analyze!

>> => explain analyze select objid from dico_frs where motid=1247
>>    and date <= '2003-01-29 17:55:17' and date >= '2002-10-29 17:55:17'
>>    order by date desc limit 11;
>>
>>  Limit  (cost=4752.14..4752.17 rows=11 width=12) (actual
>>  time=63.20..63.37 rows=11 loops=1)
>>    ->  Sort  (cost=4752.14..4755.11 rows=1187 width=12) (actual
>> time=63.17..63.23 rows=12 loops=1)
>>          Sort Key: date
>>          ->  Index Scan using dico_frs_motid_date on dico_frs
>> (cost=0.00..4691.50 rows=1187 width=12) (actual time=0.08..41.88
>> rows=2924 loops=1)
>>                Index Cond: ((motid = 1247) AND (date <= '2003-01-29
>> 17:55:17'::timestamp without time zone) AND (date >= '2002-10-29
>> 17:55:17'::timestamp without time zone))
>>  Total runtime: 63.93 msec
>> (6 rows)

--
Eric Cholet