Обсуждение: Statement Pooling

Поиск
Список
Период
Сортировка

Statement Pooling

От
Janning
Дата:
Hi,

we are running java6/hibernate/c3p0/postgresql stack.
Our JDBC Driver is 8.4-701.jdbc3

I have a few questions about Prepared Statements. I have read
<http://www.theserverside.com/news/1365244/Why-Prepared-Statements-are-
important-and-how-to-use-them-properly>

At the moment we have
  c3p0.maxStatements = 0
  c3p0.maxStatementsPerConnection  =   0

I don't know if I do understand statement pooling properly, so please help me.
In my understanding it is like this:

Our hibernate stack uses prepared statements. Postgresql is caching the
execution plan. Next time the same statement is used, postgresql reuses the
execution plan. This saves time planning statements inside DB.

Additionally c3p0 can cache java instances of "java.sql.PreparedStatement"
which means it is caching the java object. So when using
c3p0.maxStatementsPerConnection  =   100 it caches at most 100 different
objects. It saves time on creating objects, but this has nothing to do with
the postgresql database and its prepared statements.

Right?

As we use about 100 different statements I would set
  c3p0.maxStatementsPerConnection  =   100

Is this reasonable? Is there a real benefit activating it?

I remember postgresql 8.4 is replanning prepared statements when statistics
change occur, but I didn't find it in the release notes. It is just saying
"Invalidate cached plans when referenced schemas, functions, operators, or
operator classes are modified". Does PG replans prepared statements from time
to time if underlying data statistics change?

I am glad if you help me and give me some insights to managing my connection
pool the right way.

kind regards
Janning

PS: I think its ok to post this on general and not on pgsql-jdbc as I am not
subcribed to pgsql-jdbc. If not, please let me know and I will repost there.



Re: Statement Pooling

От
Joshua Tolley
Дата:
On Tue, May 25, 2010 at 05:28:10PM +0200, Janning wrote:
> Our hibernate stack uses prepared statements. Postgresql is caching the
> execution plan. Next time the same statement is used, postgresql reuses the
> execution plan. This saves time planning statements inside DB.

It only uses the cached plan if you prepare the statement and run that
prepared statement. Running "SELECT foo FROM bar" twice in a row without any
preparing will result in the query being parsed, planned, and executed twice.
On the other hand, doing something like this:

p = conn.prepareStatement("SELECT foo FROM bar");

...and then repeatedly executed p, parsing and planning for the query would
occur only once, at the time of the prepareStatement call.

> Additionally c3p0 can cache java instances of "java.sql.PreparedStatement"
> which means it is caching the java object. So when using
> c3p0.maxStatementsPerConnection  =   100 it caches at most 100 different
> objects. It saves time on creating objects, but this has nothing to do with
> the postgresql database and its prepared statements.
>
> Right?

That's the idea.

> As we use about 100 different statements I would set
>   c3p0.maxStatementsPerConnection  =   100
>
> Is this reasonable? Is there a real benefit activating it?

Answering that question for your situation really requires benchmarking with
and without statement caching turned on. Your best bet is probably to set it
to a value that seems decent, and revisit it if you find a performance
bottleneck you need to resolve which looks like it's related to statement
caching.

> I remember postgresql 8.4 is replanning prepared statements when statistics
> change occur, but I didn't find it in the release notes. It is just saying
> "Invalidate cached plans when referenced schemas, functions, operators, or
> operator classes are modified". Does PG replans prepared statements from time
> to time if underlying data statistics change?

I don't think so, though I may be wrong. The change you refer to replans such
things when the actual objects change, such as when you remove a column or
something that would make the plan fail to execute.


--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Вложения

Re: Statement Pooling

От
Janning
Дата:
On Thursday 27 May 2010 16:56:28 you wrote:
> > Additionally c3p0 can cache java instances of
> > "java.sql.PreparedStatement" which means it is caching the java object.
> > So when using
> > c3p0.maxStatementsPerConnection  =   100 it caches at most 100 different
> > objects. It saves time on creating objects, but this has nothing to do
> > with the postgresql database and its prepared statements.
> >
> > Right?
>
> That's the idea.

Really? Today I think c3p0 statement pooling it is not only about saving time
in object creation, but to let postgresql reuse an already prwpeared
statement.

> > As we use about 100 different statements I would set
> >   c3p0.maxStatementsPerConnection  =   100
> >
> > Is this reasonable? Is there a real benefit activating it?
>
> Answering that question for your situation really requires benchmarking
> with and without statement caching turned on. Your best bet is probably to
> set it to a value that seems decent, and revisit it if you find a
> performance bottleneck you need to resolve which looks like it's related to
> statement caching.

Ok thanks for your detailed answer. I did some benchmarking and it seems to be
faster with c3p0 statement cache.

> > I remember postgresql 8.4 is replanning prepared statements when
> > statistics change occur, but I didn't find it in the release notes. It is
> > just saying "Invalidate cached plans when referenced schemas, functions,
> > operators, or operator classes are modified". Does PG replans prepared
> > statements from time to time if underlying data statistics change?
>
> I don't think so, though I may be wrong. The change you refer to replans
> such things when the actual objects change, such as when you remove a
> column or something that would make the plan fail to execute.

Ok, so i should be aware of this. if the execution plan postgresql is caching
gets old, i can run into trouble.

So i should close my connections in the pool from time to time to force
replanning of my prepared statements.

i have asked this question on stackoverflow too, because i thought it was to
jdbc/hibernate specific for this list after sending and not getting an answer.
So for further reference i post a link to this question, where you can find an
excellent answer, too:

http://stackoverflow.com/questions/2920740/should-i-activate-c3p0-statement-
pooling/2921340#2921340

kind regards
Janning

http://www.kicktipp.de/

>
> --
> Joshua Tolley / eggyknap
> End Point Corporation
> http://www.endpoint.com



Re: Statement Pooling

От
Joshua Tolley
Дата:
On Fri, May 28, 2010 at 10:09:22PM +0200, Janning wrote:
> On Thursday 27 May 2010 16:56:28 you wrote:
> > > Additionally c3p0 can cache java instances of
> > > "java.sql.PreparedStatement" which means it is caching the java object.
> > > So when using
> > > c3p0.maxStatementsPerConnection  =   100 it caches at most 100 different
> > > objects. It saves time on creating objects, but this has nothing to do
> > > with the postgresql database and its prepared statements.
> > >
> > > Right?
> >
> > That's the idea.
>
> Really? Today I think c3p0 statement pooling it is not only about saving time
> in object creation, but to let postgresql reuse an already prwpeared
> statement.

Well, it was probably too strong to say it "has nothing to do with" your
database. Anyway, it sounds like you've got it right.

<snip>

> > > I remember postgresql 8.4 is replanning prepared statements when
> > > statistics change occur, but I didn't find it in the release notes. It is
> > > just saying "Invalidate cached plans when referenced schemas, functions,
> > > operators, or operator classes are modified". Does PG replans prepared
> > > statements from time to time if underlying data statistics change?
> >
> > I don't think so, though I may be wrong. The change you refer to replans
> > such things when the actual objects change, such as when you remove a
> > column or something that would make the plan fail to execute.
>
> Ok, so i should be aware of this. if the execution plan postgresql is caching
> gets old, i can run into trouble.
>
> So i should close my connections in the pool from time to time to force
> replanning of my prepared statements.

I'm not sure this is a great idea, but it would depend on your application.
With caching and pooling and such, a prepared statement might live a really
long time, but your database's statistics might take even longer to change
sufficiently to suggest one plan is better than another. Explicitly replanning
sounds like it's probably more work than it's worth, unless you see specific
plan problems.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Вложения

Re: Statement Pooling

От
Janning
Дата:
> > > > I remember postgresql 8.4 is replanning prepared statements when
> > > > statistics change occur, but I didn't find it in the release notes.
> > > > It is just saying "Invalidate cached plans when referenced schemas,
> > > > functions, operators, or operator classes are modified". Does PG
> > > > replans prepared statements from time to time if underlying data
> > > > statistics change?
> > >
> > > I don't think so, though I may be wrong. The change you refer to
> > > replans such things when the actual objects change, such as when you
> > > remove a column or something that would make the plan fail to execute.
> >
> > Ok, so i should be aware of this. if the execution plan postgresql is
> > caching gets old, i can run into trouble.
> >
> > So i should close my connections in the pool from time to time to force
> > replanning of my prepared statements.
>
> I'm not sure this is a great idea, but it would depend on your application.
> With caching and pooling and such, a prepared statement might live a really
> long time, but your database's statistics might take even longer to change
> sufficiently to suggest one plan is better than another. Explicitly
> replanning sounds like it's probably more work than it's worth, unless you
> see specific plan problems.

I know close my connections every 5th hour, so a new connection can create new
preparedStatements with a new execution plan.

I had some situations in the past, where an old plan took so much longer. I
think planning a statement doesn't cost so much compared with executing a
statement with a wrong plan (like sequence scan instead of using an index)

But I remember a post of tom lane who said that prepared statements get
replaned since 8.4 if underlying statistics change. But I really can't find it.
Maybe I do not remember correctly.

kind regards
Janning

> --
> Joshua Tolley / eggyknap
> End Point Corporation
> http://www.endpoint.com


Re: Statement Pooling

От
Joshua Tolley
Дата:
On Wed, Jun 02, 2010 at 10:10:24AM +0200, Janning wrote:
> But I remember a post of tom lane who said that prepared statements get
> replaned since 8.4 if underlying statistics change. But I really can't find it.
> Maybe I do not remember correctly.

I saw a similar post yesterday, and think I was wrong in my original
assertion that they wouldn't be replanned. So now I'm thinking don't bother
manually refreshing your prepared statements, because postgresql will do it
for you :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Вложения

Re: Statement Pooling

От
Tom Lane
Дата:
Janning <ml@planwerk6.de> writes:
> But I remember a post of tom lane who said that prepared statements get
> replaned since 8.4 if underlying statistics change.

Since 8.3, IIRC.  8.4 extended the mechanism to apply in some more
cases, but an ANALYZE should cause a replan in 8.3.

            regards, tom lane