Обсуждение: An optimisation question

Поиск
Список
Период
Сортировка

An optimisation question

От
Constantin Teodorescu
Дата:
Hello,

I have a very big table (valori) with the following columns:
- data datetime
- debitor float8
- creditor float8

It has a btree index on data (non unique).

The following select is using the index:

select * from valori where data > '25-10-1999'

NOTICE:  QUERY PLAN:
Index Scan using valori_data on valori  (cost=1550.17 rows=24324
width=8)


But this one:

select data from valori order by desc limit 1
NOTICE:  QUERY PLAN:

Sort  (cost=3216.01 rows=72970 width=8) ->  Seq Scan on valori  (cost=3216.01 rows=72970 width=8)

I thought that if the 'order by' implies an column which have a btree
index, the sort would not be actually executed and the index will be
used instead. But it seems that it won't.

Then, the question is : How should I retrieve extremely fast the first
'data' greater than a given value from that table.

Also, the following query :

select max(data) from valori where data<'2-3-1999'

is not using optimally the index, it just limit the records for the
aggregate function instead of picking the first value from the left of
the index tree lower than '2-3-1999'.


Waiting for some ideas,
Constantin Teodorescu
FLEX Consulting Braila, ROMANIA


Re: [HACKERS] An optimisation question

От
Tom Lane
Дата:
Constantin Teodorescu <teo@flex.ro> writes:
> select data from valori order by desc limit 1
> NOTICE:  QUERY PLAN:
> Sort  (cost=3216.01 rows=72970 width=8)
-> Seq Scan on valori  (cost=3216.01 rows=72970 width=8)

> I thought that if the 'order by' implies an column which have a btree
> index, the sort would not be actually executed and the index will be
> used instead. But it seems that it won't.

That's fixed for 6.6.  A workaround that partially solves the problem
for 6.5 is to add a dummy WHERE clause referencing the ORDER-BY item:select data from valori where data >
'1/1/1800'orderby data limit 1;
 
The WHERE is needed to get the 6.5 optimizer to consider the index
at all.  In a quick test it seems this works for normal order but not
DESC order... you could try applying the backwards-index patch that
someone (Hiroshi or Tatsuo, I think) posted recently.

> Also, the following query :
> select max(data) from valori where data<'2-3-1999'
> is not using optimally the index, it just limit the records for the
> aggregate function instead of picking the first value from the left of
> the index tree lower than '2-3-1999'.

There's no prospect of that happening anytime soon, I fear; there is no
connection between aggregate functions and indexes in the system, and
no easy way of making one.  This workaround works in current sources:

explain select data from valori where data<'2-3-1999'
order by data desc limit 1;
NOTICE:  QUERY PLAN:

Index Scan Backward using valori_i on valori  (cost=21.67 rows=334 width=8)
        regards, tom lane


Re: [HACKERS] An optimisation question

От
Constantin Teodorescu
Дата:
Tom Lane wrote:
> 
> That's fixed for 6.6.  A workaround that partially solves the problem
> for 6.5 is to add a dummy WHERE clause referencing the ORDER-BY item:
>         select data from valori where data > '1/1/1800'
>         order by data limit 1;
> The WHERE is needed to get the 6.5 optimizer to consider the index
> at all.  In a quick test it seems this works for normal order but not
> DESC order... you could try applying the backwards-index patch that
> someone (Hiroshi or Tatsuo, I think) posted recently.

Yeap , I will search for it.

> There's no prospect of that happening anytime soon, I fear; there is no
> connection between aggregate functions and indexes in the system, and
> no easy way of making one.

Understand that, but.
Selects that deal ONLY with columns included in an index should operate
exclusively on that index and return the results. Example : select
sum(price) , price*1.2, max(price) from products , assuming that price
is included in an index it would be less cost to scan the index rather
than the whole table.

I remember that Paradox tables had  indexes and the index was also a
Paradox table or some sort of that. Internally it's possible that a
number of procedures related to tables could be applied to indexes. So,
a sum(price) from a_table could be easily switched to be done on any
index that contain the price field.

What do you think?

Constantin Teodorescu
FLEX Consulting BRaila, ROMANIA