Обсуждение: Some Improvement

Поиск
Список
Период
Сортировка

Some Improvement

От
Tim Perdue
Дата:
I added the suggested index and changed my sql and the subjective tests
seem to be improved somewhat. I checked EXPLAIN and it is using the new
index.

I still think there must be sorting going on, as the result is returned
instantly if you remove the ORDER BY. I don't know - I do think it's
much better now.

Thanks for all your help - I (of course) will let you know if I have any
troubles with corruption on 7.0.2 ;-)

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Some Improvement

От
Tom Lane
Дата:
Tim Perdue <tperdue@valinux.com> writes:
> I added the suggested index and changed my sql and the subjective tests
> seem to be improved somewhat. I checked EXPLAIN and it is using the new
> index.

> I still think there must be sorting going on, as the result is returned
> instantly if you remove the ORDER BY.

You "think"?  What does EXPLAIN show in the two cases?
        regards, tom lane


Re: Some Improvement

От
Tim Perdue
Дата:
Tom Lane wrote:
> 
> Tim Perdue <tperdue@valinux.com> writes:
> > I added the suggested index and changed my sql and the subjective tests
> > seem to be improved somewhat. I checked EXPLAIN and it is using the new
> > index.
> 
> > I still think there must be sorting going on, as the result is returned
> > instantly if you remove the ORDER BY.
> 
> You "think"?  What does EXPLAIN show in the two cases?
> 
>                         regards, tom lane

Following is the info - again thanks for your help. If you need, I can
try to re-install 6.5.3 and re-import the database. Although with tables
of this size, it is a true nightmare to do this. If you feel the info is
valuable, I'd like to help.

Tim


With the ORDER BY


db_geocrawler=# explain verbose SELECT fld_mailid, fld_mail_date,
fld_mail_is_followup, fld_mail_from, fld_mail_subject FROM
tbl_mail_archive WHERE fld_mail_list='35' AND fld_mail_date between
'20000100' AND '20000199' ORDER BY fld_mail_date DESC LIMIT 51 OFFSET 0;
NOTICE:  QUERY DUMP:

{ SORT :startup_cost 5.03 :total_cost 5.03 :rows 1 :width 44 :state <>
:qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23
:restypmod -1 :resname fld_mailid :reskey 0 :reskeyop 0 :ressortgroupref
0 :resjunk false } :expr { VAR :varno 1 :varattno 1 :vartype 23
:vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} { TARGETENTRY
:resdom { RESDOM :resno 2 :restype 1042 :restypmod 18 :resname
fld_mail_date :reskey 1 :reskeyop 1051 :ressortgroupref 1 :resjunk false
} :expr { VAR :varno 1 :varattno 3 :vartype 1042 :vartypmod 18 
:varlevelsup 0 :varnoold 1 :varoattno 3}} { TARGETENTRY :resdom { RESDOM
:resno 3 :restype 23 :restypmod -1 :resname fld_mail_is_followup :reskey
0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1
:varattno 4 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1
:varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25
:restypmod -1 :resname fld_mail_from :reskey 0 :reskeyop 0
:ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 5
:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 5}} {
TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname
fld_mail_subject :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false
} :expr { VAR :varno 1 :varattno 6 :vartype 25 :vartypmod -1 
:varlevelsup 0 :varnoold 1 :varoattno 6}}) :qpqual <> :lefttree {
INDEXSCAN :startup_cost 0.00 :total_cost 5.02 :rows 1 :width 44 :state
<> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23
:restypmod -1 :resname fld_mailid :reskey 0 :reskeyop 0 :ressortgroupref
0 :resjunk false } :expr { VAR :varno 1 :varattno 1 :vartype 23
:vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} { TARGETENTRY
:resdom { RESDOM :resno 2 :restype 1042 :restypmod 18 :resname
fld_mail_date :reskey 0 :reskeyop 0 :ressortgroupref 1 :resjunk false }
:expr { VAR :varno 1 :varattno 3 :vartype 1042 :vartypmod 18 
:varlevelsup 0 :varnoold 1 :varoattno 3}} { TARGETENTRY :resdom { RESDOM
:resno 3 :restype 23 :restypmod -1 :resname fld_mail_is_followup :reskey
0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1
:varattno 4 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1
:varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25
:restypmod -1 :resname fld_mail_from :reskey 0 :reskeyop 0
:ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 5
:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 5}} {
TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname
fld_mail_subject :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false
} :expr { VAR :varno 1 :varattno 6 :vartype 25 :vartypmod -1 
:varlevelsup 0 :varnoold 1 :varoattno 6}}) :qpqual <> :lefttree <>
:righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1
:indxid ( 5913536) :indxqual (({ EXPR :typeOid 16  :opType op :oper {
OPER :opno 96 :opid 65 :opresulttype 16 } :args ({ VAR :varno 1
:varattno 1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1
:varoattno 2} { CONST :consttype 23 :constlen 4 :constisnull false
:constvalue  4 [ 35 0 0 0 ]  :constbyval true })} { EXPR :typeOid 16 
:opType op :oper { OPER :opno 1061 :opid 1052 :opresulttype 16 } :args
({ VAR :varno 1 :varattno 2 :vartype 1042 :vartypmod 18  :varlevelsup 0
:varnoold 1 :varoattno 3} { CONST :consttype 1042 :constlen -1
:constisnull false :constvalue  12 [ 12 0 0 0 50 48 48 48 48 49 48 48 ] 
:constbyval false })} { EXPR :typeOid 16  :opType op :oper { OPER :opno
1059 :opid 1050 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 2
:vartype 1042 :vartypmod 18  :varlevelsup 0 :varnoold 1 :varoattno 3} {
CONST :consttype 1042 :constlen -1 :constisnull false :constvalue  12 [
12 0 0 0 50 48 48 48 48 49 57 57 ]  :constbyval false })}))
:indxqualorig (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 96
:opid 65 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 2 :vartype
23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 2} { CONST
:consttype 23 :constlen 4 :constisnull false :constvalue  4 [ 35 0 0 0
]  :constbyval true })} { EXPR :typeOid 16  :opType op :oper { OPER
:opno 1061 :opid 1052 :opresulttype 16 } :args ({ VAR :varno 1 :varattno
3 :vartype 1042 :vartypmod 18  :varlevelsup 0 :varnoold 1 :varoattno 3}
{ CONST :consttype 1042 :constlen -1 :constisnull false :constvalue  12
[ 12 0 0 0 50 48 48 48 48 49 48 48 ]  :constbyval false })} { EXPR
:typeOid 16  :opType op :oper { OPER :opno 1059 :opid 1050 :opresulttype
16 } :args ({ VAR :varno 1 :varattno 3 :vartype 1042 :vartypmod 18 
:varlevelsup 0 :varnoold 1 :varoattno 3} { CONST :consttype 1042
:constlen -1 :constisnull false :constvalue  12 [ 12 0 0 0 50 48 48 48
48 49 57 57 ]  :constbyval false })})) :indxorderdir 1 } :righttree <>
:extprm () :locprm () :initplan <> :nprm 0  :nonameid 0 :keycount 1 }
NOTICE:  QUERY PLAN:

Sort  (cost=5.03..5.03 rows=1 width=44) ->  Index Scan using idx_archive_list_date on tbl_mail_archive 
(cost=0.00..5.02 rows=1 width=44)

EXPLAIN
db_geocrawler=#


Without the ORDER BY



db_geocrawler=#
db_geocrawler=# explain verbose SELECT fld_mailid, fld_mail_date,
fld_mail_is_followup, fld_mail_from, fld_mail_subject FROM
tbl_mail_archive WHERE fld_mail_list='35' AND fld_mail_date between
'20000100' AND '20000199' LIMIT 51 OFFSET 0;
NOTICE:  QUERY DUMP:

{ INDEXSCAN :startup_cost 0.00 :total_cost 5.02 :rows 1 :width 44 :state
<> :qptargetlist ({ TARGETENTRY :resdom { RESDOM :resno 1 :restype 23
:restypmod -1 :resname fld_mailid :reskey 0 :reskeyop 0 :ressortgroupref
0 :resjunk false } :expr { VAR :varno 1 :varattno 1 :vartype 23
:vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 1}} { TARGETENTRY
:resdom { RESDOM :resno 2 :restype 1042 :restypmod 18 :resname
fld_mail_date :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false }
:expr { VAR :varno 1 :varattno 3 :vartype 1042 :vartypmod 18 
:varlevelsup 0 :varnoold 1 :varoattno 3}} { TARGETENTRY :resdom { RESDOM
:resno 3 :restype 23 :restypmod -1 :resname fld_mail_is_followup :reskey
0 :reskeyop 0 :ressortgroupref 0 :resjunk false } :expr { VAR :varno 1
:varattno 4 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1
:varoattno 4}} { TARGETENTRY :resdom { RESDOM :resno 4 :restype 25
:restypmod -1 :resname fld_mail_from :reskey 0 :reskeyop 0
:ressortgroupref 0 :resjunk false } :expr { VAR :varno 1 :varattno 5
:vartype 25 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 5}} {
TARGETENTRY :resdom { RESDOM :resno 5 :restype 25 :restypmod -1 :resname
fld_mail_subject :reskey 0 :reskeyop 0 :ressortgroupref 0 :resjunk false
} :expr { VAR :varno 1 :varattno 6 :vartype 25 :vartypmod -1 
:varlevelsup 0 :varnoold 1 :varoattno 6}}) :qpqual <> :lefttree <>
:righttree <> :extprm () :locprm () :initplan <> :nprm 0  :scanrelid 1
:indxid ( 5913536) :indxqual (({ EXPR :typeOid 16  :opType op :oper {
OPER :opno 96 :opid 65 :opresulttype 16 } :args ({ VAR :varno 1
:varattno 1 :vartype 23 :vartypmod -1  :varlevelsup 0 :varnoold 1
:varoattno 2} { CONST :consttype 23 :constlen 4 :constisnull false
:constvalue  4 [ 35 0 0 0 ]  :constbyval true })} { EXPR :typeOid 16
:opType op :oper { OPER :opno 1061 :opid 1052 :opresulttype 16 } :args
({ VAR :varno 1 :varattno 2 :vartype 1042 :vartypmod 18  :varlevelsup 0
:varnoold 1 :varoattno 3} { CONST :consttype 1042 :constlen -1
:constisnull false :constvalue  12 [ 12 0 0 0 50 48 48 48 48 49 48 48 ] 
:constbyval false })} { EXPR :typeOid 16  :opType op :oper { OPER :opno
1059 :opid 1050 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 2
:vartype 1042 :vartypmod 18  :varlevelsup 0 :varnoold 1 :varoattno 3} {
CONST :consttype 1042 :constlen -1 :constisnull false :constvalue  12 [
12 0 0 0 50 48 48 48 48 49 57 57 ]  :constbyval false })}))
:indxqualorig (({ EXPR :typeOid 16  :opType op :oper { OPER :opno 96
:opid 65 :opresulttype 16 } :args ({ VAR :varno 1 :varattno 2 :vartype
23 :vartypmod -1  :varlevelsup 0 :varnoold 1 :varoattno 2} { CONST
:consttype 23 :constlen 4 :constisnull false :constvalue  4 [ 35 0 0 0
]  :constbyval true })} { EXPR :typeOid 16  :opType op :oper { OPER
:opno 1061 :opid 1052 :opresulttype 16 } :args ({ VAR :varno 1 :varattno
3 :vartype 1042 :vartypmod 18  :varlevelsup 0 :varnoold 1 :varoattno 3}
{ CONST :consttype 1042 :constlen -1 :constisnull false :constvalue  12
[ 12 0 0 0 50 48 48 48 48 49 48 48 ]  :constbyval false })} { EXPR
:typeOid 16  :opType op :oper { OPER :opno 1059 :opid 1050 :opresulttype
16 } :args ({ VAR :varno 1 :varattno 3 :vartype 1042 :vartypmod 18 
:varlevelsup 0 :varnoold 1 :varoattno 3} { CONST :consttype 1042
:constlen -1 :constisnull false :constvalue  12 [ 12 0 0 0 50 48 48 48
48 49 57 57 ]  :constbyval false })})) :indxorderdir 1 }
NOTICE:  QUERY PLAN:

Index Scan using idx_archive_list_date on tbl_mail_archive 
(cost=0.00..5.02 rows=1 width=44)

EXPLAIN



-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723


Re: Some Improvement

От
Tom Lane
Дата:
Tim Perdue <tperdue@valinux.com> writes:
> Tom Lane wrote:
>> Tim Perdue <tperdue@valinux.com> writes:
>>>> I still think there must be sorting going on, as the result is returned
>>>> instantly if you remove the ORDER BY.
>> 
>> You "think"?  What does EXPLAIN show in the two cases?

> With the ORDER BY

> db_geocrawler=# explain verbose SELECT fld_mailid, fld_mail_date,
> fld_mail_is_followup, fld_mail_from, fld_mail_subject FROM
> tbl_mail_archive WHERE fld_mail_list='35' AND fld_mail_date between
> '20000100' AND '20000199' ORDER BY fld_mail_date DESC LIMIT 51 OFFSET 0;

> NOTICE:  QUERY PLAN:

> Sort  (cost=5.03..5.03 rows=1 width=44)
>    -> Index Scan using idx_archive_list_date on tbl_mail_archive 
>       (cost=0.00..5.02 rows=1 width=44)

Well, you obviously are getting a sort step here, which you want to
avoid because the LIMIT isn't doing you much good when there's a SORT
in between --- the indexscan has to run over the whole month then.

I assume idx_archive_list_date is an index on tbl_mail_archive
(fld_mail_list, fld_mail_date) in that order?  The reason you're
getting the extra sort is that the planner believes the indexscan
will produce data ordered likeORDER BY fld_mail_list, fld_mail_date
which is not what you asked for: you asked for a sort by fld_mail_date,
period.  (Now you know and I know that since the query retrieves only
tuples with a single value of fld_mail_list, there's no practical
difference.  The planner, however, is less bright than we are and does
not make the connection.)  To avoid the extra sort, you need to specify
an ORDER BY that the planner will recognize as compatible with the
index:ORDER BY fld_mail_list DESC, fld_mail_date DESC
Note it's important that both clauses be marked DESC or neither;
otherwise the clause still won't look like it matches the index's
ordering.  But with the correct ORDER BY incantation, you should
get a plan like

Index Scan Backwards using idx_archive_list_date on tbl_mail_archive 

and then you will be happy ;-).

(Alternatively, you could declare the index on (fld_mail_date,
fld_mail_list) and then ORDER BY fld_mail_date DESC would work by
itself.  You should think about which ordering you'd want for a
query retrieving rows from more than one list before you decide.)

BTW, the 6.5 planner was quite incapable of generating a plan like
this, so I'm still not sure why you saw better performance with 6.5.
Was there anything to the theory about LOCALE slowing down the sort?
        regards, tom lane


Re: Some Improvement

От
Tom Lane
Дата:
> But I don't see the "Backwards index scan" you mentioned.

Then we're not there yet.  It looks like there may indeed be a bug
here.  Trying it with a dummy table:

regression=# create table ff1 (f1 int, f2 char(14));
CREATE
regression=# create index ff1i on ff1(f1,f2);
CREATE
regression=# explain select * from ff1 where f1 = 3 and f2 between '4' and '5'
regression-# order by f1,f2;
NOTICE:  QUERY PLAN:

Index Scan using ff1i on ff1  (cost=0.00..2.02 rows=1 width=16)

EXPLAIN
regression=# explain select * from ff1 where f1 = 3 and f2 between '4' and '5'
regression-# order by f1 desc,f2 desc;
NOTICE:  QUERY PLAN:

Sort  (cost=2.03..2.03 rows=1 width=16) ->  Index Scan using ff1i on ff1  (cost=0.00..2.02 rows=1 width=16)

EXPLAIN
regression=# set enable_sort TO off;
SET VARIABLE

regression=# explain select * from ff1 where f1 = 3 and f2 between '4' and '5'
regression-# order by f1 desc, f2 desc;
NOTICE:  QUERY PLAN:

Index Scan Backward using ff1i on ff1  (cost=0.00..67.50 rows=1 width=16)

EXPLAIN

So it knows how to generate an indexscan backwards plan, but it's not
choosing that because there's something wacko with the cost estimate.
Hmm.  This works great for single-column indexes, I wonder what's wrong
with the multi-column case?  Will start digging.

I hesitate to suggest that you throw "SET enable_sort TO off" and then
"SET enable_sort TO on" around your query, because it's so ugly,
but that might be the best short-term answer.

>> Was there anything to the theory about LOCALE slowing down the sort?

> Well, I didn't intentionally compile LOCALE support. Just did the usual 

> ./configure --with-max-backends=128 (or whatever)
> gmake

That shouldn't cause LOCALE to get compiled.  I'm still at a loss why
6.5 would be faster for your original query.  For sure it's not
generating a more intelligent plan...
        regards, tom lane


Re: Some Improvement

От
Tom Lane
Дата:
>> But I don't see the "Backwards index scan" you mentioned.
> Then we're not there yet.

> I hesitate to suggest that you throw "SET enable_sort TO off" and then
> "SET enable_sort TO on" around your query, because it's so ugly,
> but that might be the best short-term answer.

No, actually that's no short-term answer at all.  It turns out that the
"bogus" cost estimate was perfectly correct, because what the planner
was actually generating was a plan for a backwards index scan over the
whole table, with restrictions applied after the fact :-(.  Forcing it
to use that plan won't help.

I have corrected this silly oversight.  Attached is the patch needed to
make backwards index scans work properly in 7.0.*.
        regards, tom lane

*** src/backend/optimizer/path/indxpath.c.orig    Sun Apr 16 00:41:01 2000
--- src/backend/optimizer/path/indxpath.c    Thu Jul 13 01:49:51 2000
***************
*** 196,202 ****              useful_for_ordering(root, rel, index, ForwardScanDirection))
add_path(rel,(Path *)                          create_index_path(root, rel, index,
 
!                                            NIL,                                            ForwardScanDirection));
    } 
 
--- 196,202 ----              useful_for_ordering(root, rel, index, ForwardScanDirection))
add_path(rel,(Path *)                          create_index_path(root, rel, index,
 
!                                            restrictclauses,
ForwardScanDirection));        } 
 
***************
*** 208,214 ****         if (useful_for_ordering(root, rel, index, BackwardScanDirection))             add_path(rel,
(Path*)                      create_index_path(root, rel, index,
 
!                                        NIL,                                        BackwardScanDirection));
/*
--- 208,214 ----         if (useful_for_ordering(root, rel, index, BackwardScanDirection))             add_path(rel,
(Path*)                      create_index_path(root, rel, index,
 
!                                        restrictclauses,
BackwardScanDirection));         /*
 


Re: Some Improvement

От
Tim Perdue
Дата:
Tom Lane wrote:
> I have corrected this silly oversight.  Attached is the patch needed to
> make backwards index scans work properly in 7.0.*.

God - don't you love Open Source Software? You probably don't remember,
but 1 1/2 years ago, I ran into this "2GB Disaster limit" and I think
you were the one that came up with a patch to bring the segment size
down to 1GB within like 24hrs.

I'll apply the patch and rebuild - thanks.

Tim

-- 
Founder - PHPBuilder.com / Geocrawler.com
Lead Developer - SourceForge
VA Linux Systems
408-542-5723