Обсуждение: Speedups

Поиск
Список
Период
Сортировка

Speedups

От
jwieck@debis.com (Jan Wieck)
Дата:
Hi,

    just to let anyone know:

    I  did  some analyzing and searched for areas that could gain
    more speedups  for  6.4.   First  I  had  something  like  an
    optimizer cache in mind (planner remembers parsetree and if a
    subsequent parsetree only differs in const values, substitute
    consts  by params and reuse saved plans instead of creating a
    new plan all the time).

    But this is what I got for the complete regression test (only
    queries that went through the planner counted):

        Parsing and rule rewriting    14 %
        Optimizer and planning         6 %
        Query execution               80 %
                                    ------
        Total time in backend        100 %

    It  clearly  shows  that  there's  no  need  to  speedup  the
    optimizer.  The parser and the executor  are  the  ones  that
    consume  the  time.   Making  the  planner/optimizer  smarter
    resulting better plans faster to execute is the way.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Speedups

От
Bruce Momjian
Дата:
>         Parsing and rule rewriting    14 %
>         Optimizer and planning         6 %
>         Query execution               80 %
>                                     ------
>         Total time in backend        100 %
>

Nice analysis.  Certainly looks like Query Execution is the way to go.
profiling has shown quite a lot to help us.  Usually it is not the
executor itself, but the subsystems it calls.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Speedups

От
ocie@paracel.com
Дата:
Jan Wieck wrote:
>
> Hi,
>
>     just to let anyone know:
>
>     I  did  some analyzing and searched for areas that could gain
>     more speedups  for  6.4.   First  I  had  something  like  an
>     optimizer cache in mind (planner remembers parsetree and if a
>     subsequent parsetree only differs in const values, substitute
>     consts  by params and reuse saved plans instead of creating a
>     new plan all the time).
>
>     But this is what I got for the complete regression test (only
>     queries that went through the planner counted):
>
>         Parsing and rule rewriting    14 %
>         Optimizer and planning         6 %
>         Query execution               80 %
>                                     ------
>         Total time in backend        100 %
>
>     It  clearly  shows  that  there's  no  need  to  speedup  the
>     optimizer.  The parser and the executor  are  the  ones  that
>     consume  the  time.   Making  the  planner/optimizer  smarter
>     resulting better plans faster to execute is the way.

This may sound like an obvious question, but if a user defines a
query, do we save the query plan?  This would reduce the
communications between the client and server (a small gain), and allow
the server to start executing the query as soon as it recognized the
name of the stored query and parsed the arguments.

Ocie Mitchell

Re: [HACKERS] Speedups

От
Peter T Mount
Дата:
On Wed, 4 Mar 1998 ocie@paracel.com wrote:

> This may sound like an obvious question, but if a user defines a
> query, do we save the query plan?  This would reduce the
> communications between the client and server (a small gain), and allow
> the server to start executing the query as soon as it recognized the
> name of the stored query and parsed the arguments.

Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
CallableStatement classes

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk


Re: [HACKERS] Speedups

От
"Vadim B. Mikheev"
Дата:
Peter T Mount wrote:
>
> On Wed, 4 Mar 1998 ocie@paracel.com wrote:
>
> > This may sound like an obvious question, but if a user defines a
> > query, do we save the query plan?  This would reduce the
> > communications between the client and server (a small gain), and allow
> > the server to start executing the query as soon as it recognized the
> > name of the stored query and parsed the arguments.
>
> Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
> CallableStatement classes

We can implement it very easy, and fast. Execution plan may be reused
many times. Is this feature in standard ?
What is proposed syntax if not ?

Vadim

Re: [HACKERS] Speedups

От
ocie@paracel.com
Дата:
Vadim B. Mikheev wrote:
>
> Peter T Mount wrote:
> >
> > On Wed, 4 Mar 1998 ocie@paracel.com wrote:
> >
> > > This may sound like an obvious question, but if a user defines a
> > > query, do we save the query plan?  This would reduce the
> > > communications between the client and server (a small gain), and allow
> > > the server to start executing the query as soon as it recognized the
> > > name of the stored query and parsed the arguments.
> >
> > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
> > CallableStatement classes
>
> We can implement it very easy, and fast. Execution plan may be reused
> many times. Is this feature in standard ?
> What is proposed syntax if not ?

I don't think it is so much a question of syntax as it is a question
of what we do in the backend.  Suppose I create a stored query in SQL.
We already store the SQL source for this in the database, right?  So
when it comes time to execute the query, we take this SQL and execute
it as if the user had entered it directly.  What I am proposing would
be to basically store the compiled query plan as well.

I do see a couple sticky points:

We would need some information about which variables are to be
substituted into this query plan, but this should be fairly
straightforward.

Some querys may not respond well to this, for example, if a table had
an index on an integer field f1, this would probably be the best way
to satisfy a select where f1<10.  But if this were in a query as f1<x,
then a sufficiently high value of x might make this not such a good
way to run the query.  I haven't looked into this, but I would assume
that the optimizer relies on the specific values in such cases.

We need to be able to handle changes to the structures and contents of
the tables.  If the query plan is built and we add 10000 rows to a
table it references, the query should probably be recompiled.  We
could probably do this at vacuum time.  There is also a small chance
that a table or index that the query plan was using is dropped.  We
could automatically rebuild the query if the table was created after
the query was compiled.


Boy, to look at this, you'd think I had already built one of these :)
I haven't but I'm willing to give it a shot.

Ocie

Re: [HACKERS] Speedups

От
"Vadim B. Mikheev"
Дата:
ocie@paracel.com wrote:
>
> > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
> > > CallableStatement classes
> >
> > We can implement it very easy, and fast. Execution plan may be reused
> > many times. Is this feature in standard ?
> > What is proposed syntax if not ?
>
> I do see a couple sticky points:
>
> We would need some information about which variables are to be
> substituted into this query plan, but this should be fairly
> straightforward.

Parser, Planner/Optimizer and Executor are able to handle parameters!
No problems with this.

> Some querys may not respond well to this, for example, if a table had
> an index on an integer field f1, this would probably be the best way
> to satisfy a select where f1<10.  But if this were in a query as f1<x,
> then a sufficiently high value of x might make this not such a good
> way to run the query.  I haven't looked into this, but I would assume
> that the optimizer relies on the specific values in such cases.
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Unfortunately, no!
We have to add this feature of 'course.
I don't know how we could deal with pre-compiled plans after this :(
May be, we could prepare/store not single plan, but some number of
possible plans.

> We need to be able to handle changes to the structures and contents of
> the tables.  If the query plan is built and we add 10000 rows to a
> table it references, the query should probably be recompiled.  We
> could probably do this at vacuum time.  There is also a small chance
> that a table or index that the query plan was using is dropped.  We
> could automatically rebuild the query if the table was created after
> the query was compiled.

We could mark stored plans as durty in such cases to force re-compiling
when an application tries to use this plan.

Vadim

Re: [HACKERS] Speedups

От
jwieck@debis.com (Jan Wieck)
Дата:
Vadim wrote:
>
> ocie@paracel.com wrote:
> >
> > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
> > > > CallableStatement classes
> > >
> > > We can implement it very easy, and fast. Execution plan may be reused
> > > many times. Is this feature in standard ?
> > > What is proposed syntax if not ?
> >
> > I do see a couple sticky points:
> >
> > We would need some information about which variables are to be
> > substituted into this query plan, but this should be fairly
> > straightforward.
>
> Parser, Planner/Optimizer and Executor are able to handle parameters!
> No problems with this.

    Nice discussion - especially when looking at what I initially
    posted.

    I assume you think about using SPI's saved plan  feature  for
    it. Right?

>
> > Some querys may not respond well to this, for example, if a table had
> > an index on an integer field f1, this would probably be the best way
> > to satisfy a select where f1<10.  But if this were in a query as f1<x,
> > then a sufficiently high value of x might make this not such a good
> > way to run the query.  I haven't looked into this, but I would assume
> > that the optimizer relies on the specific values in such cases.
>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Unfortunately, no!
> We have to add this feature of 'course.
> I don't know how we could deal with pre-compiled plans after this :(
> May be, we could prepare/store not single plan, but some number of
> possible plans.

    That's  something  I  thought  about  when  I  used  the  SPI
    functions when I built PL/Tcl. Since the saved plan  will  be
    executed via SPI_execp(), we could change some details there.
    Currently SPI_prepare() and SPI_saveplan()  return  the  plan
    itself.  But  they  could also return a little control struct
    that contains the plan and other information. Since  I  don't
    expect  someone  uses  these  plans  for  something else than
    calling SPI_execp(), it wouldn't break anything.

    SPI_execp()  can  do  some  timing  calculations.  For   each
    execution  of  a plan it collects the runtime in microseconds
    (gettimeofday()). After the 5th or 10th call,  it  builds  an
    average  and  remembers that permanently.  For all subsequent
    calls it calculates the average time of the last 10 calls and
    if that gets much higher than the initial average it wouldn't
    hurt to silently prepare  and  save  the  plan  again.  Using
    averages   lowers   the   problem  that  differences  in  the
    parameters can cause the runtime differences.

    Another possible reason for the runtime  differences  is  the
    overall  workload  of  the  server.  This  could be very high
    during the initial average calculation. So I think  it  could
    be  smart  to rebuild the plan after e.g. 1000 calls ignoring
    any runtimes.

>
> > We need to be able to handle changes to the structures and contents of
> > the tables.  If the query plan is built and we add 10000 rows to a
> > table it references, the query should probably be recompiled.  We
> > could probably do this at vacuum time.  There is also a small chance
> > that a table or index that the query plan was using is dropped.  We
> > could automatically rebuild the query if the table was created after
> > the query was compiled.
>
> We could mark stored plans as durty in such cases to force re-compiling
> when an application tries to use this plan.

    Yep. SPI must remember all  prepared  and  saved  plans  (and
    forget  about  only prepared ones at transaction end). Things
    like dropping an index or modifying a table  structure  cause
    invalidations in the relcache, syscache and catcache (even if
    another backend did it in some cases).  I think  it  must  be
    possible  to  tell SPI from there that something happened and
    which relations are affected. If a plans rangetable  contains
    the affected relation, the plan is marked durty.

    Things  like  functions,  operators  and  aggregates are also
    objects that might change (drop/recreate function -> funcnode
    in plan get's unusable).

    I  think  the  best  would  be  that SPI_prepare() set's up a
    collection of Oid's  that  cause  plan  invalidation  in  the
    control  structure.   These  are  the  Oid's  of  ALL objects
    (relations, indices, functions etc.)  used in the plan.  Then
    a  call  to  SPI_invalidate(Oid)  from the cache invalidation
    handlers doesn't have to walk through the plan itself.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #

Re: [HACKERS] Speedups

От
Bruce Momjian
Дата:
> I don't think it is so much a question of syntax as it is a question
> of what we do in the backend.  Suppose I create a stored query in SQL.
> We already store the SQL source for this in the database, right?  So
> when it comes time to execute the query, we take this SQL and execute
> it as if the user had entered it directly.  What I am proposing would
> be to basically store the compiled query plan as well.
>
> I do see a couple sticky points:
>
> We would need some information about which variables are to be
> substituted into this query plan, but this should be fairly
> straightforward.
>
> Some querys may not respond well to this, for example, if a table had
> an index on an integer field f1, this would probably be the best way
> to satisfy a select where f1<10.  But if this were in a query as f1<x,
> then a sufficiently high value of x might make this not such a good
> way to run the query.  I haven't looked into this, but I would assume
> that the optimizer relies on the specific values in such cases.

I have thought about this.  If we take a query string, remove all quoted
constants and numeric constants, we can automatically split apart the
query from the parameters.  We can then look up the non-parameter query
in our cache, and if it matches, replace the new contants with the old
and run the query.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Speedups

От
Peter T Mount
Дата:
On Thu, 5 Mar 1998, Vadim B. Mikheev wrote:

> ocie@paracel.com wrote:
> >
> > > > Not sure ofhand, but it would be useful for JDBC's PreparedStatement and
> > > > CallableStatement classes
> > >
> > > We can implement it very easy, and fast. Execution plan may be reused
> > > many times. Is this feature in standard ?
> > > What is proposed syntax if not ?
> >
> > I do see a couple sticky points:
> >
> > We would need some information about which variables are to be
> > substituted into this query plan, but this should be fairly
> > straightforward.
>
> Parser, Planner/Optimizer and Executor are able to handle parameters!
> No problems with this.
>
> > Some querys may not respond well to this, for example, if a table had
> > an index on an integer field f1, this would probably be the best way
> > to satisfy a select where f1<10.  But if this were in a query as f1<x,
> > then a sufficiently high value of x might make this not such a good
> > way to run the query.  I haven't looked into this, but I would assume
> > that the optimizer relies on the specific values in such cases.
>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Unfortunately, no!
> We have to add this feature of 'course.
> I don't know how we could deal with pre-compiled plans after this :(
> May be, we could prepare/store not single plan, but some number of
> possible plans.

For inserts & updates I would have thought that the plan would be almost
identical (correct me here if I'm wrong here), and this is an area where
Java really crawls at the moment. Try some 40k inserts in JDBC... and go
out for the night. If we can improve things, it would be a big bonus.

--
Peter T Mount  petermount@earthling.net or pmount@maidast.demon.co.uk
Main Homepage: http://www.demon.co.uk/finder
Work Homepage: http://www.maidstone.gov.uk Work EMail: peter@maidstone.gov.uk