Обсуждение: multicolumn index and setting effective_cache_size using human-readable-numbers

Поиск
Список
Период
Сортировка

multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:
I'm sure I'm missing something here.

A query takes 50 seconds; it's doing a seq-scan on a joined table,
even though the table is joined via a field that's the leftmost column
in a multicolumn index
(http://www.postgresql.org/docs/9.5/static/indexes-multicolumn.html
says "equality constraints on leading columns ... will be used to
limit the portion of the index that is scanned")

http://explain.depesz.com/s/suv

If I create an individual index on just the linked key, the explain
shows the index being used and the query takes 1.7s.

http://explain.depesz.com/s/b9ZS

Now here's the odd bit:

  SET effective_cache_size TO '2146435072'

causes the index to be used.

   SET effective_cache_size TO '2047MB'

causes it to use tablescan. Shouldn't those two be equivalent? Is
there a blowup in the planner checking effective_cache_size value not
expecting the human-readable value?

Thanks for suggestions

Geoff


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Jim Mlodgenski
Дата:


On Mon, Feb 29, 2016 at 8:56 AM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
I'm sure I'm missing something here.

A query takes 50 seconds; it's doing a seq-scan on a joined table,
even though the table is joined via a field that's the leftmost column
in a multicolumn index
(http://www.postgresql.org/docs/9.5/static/indexes-multicolumn.html
says "equality constraints on leading columns ... will be used to
limit the portion of the index that is scanned")

http://explain.depesz.com/s/suv

If I create an individual index on just the linked key, the explain
shows the index being used and the query takes 1.7s.

http://explain.depesz.com/s/b9ZS

Now here's the odd bit:

  SET effective_cache_size TO '2146435072'

causes the index to be used.

   SET effective_cache_size TO '2047MB'

causes it to use tablescan. Shouldn't those two be equivalent?

No they are not the same. When you don't include a unit for effective_cache_size, it defaults to page size so you're saying 2146435072 * 8K
 
Is
there a blowup in the planner checking effective_cache_size value not
expecting the human-readable value?

Thanks for suggestions

Geoff


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:
On 29 February 2016 at 14:06, Jim Mlodgenski <jimmy76@gmail.com> wrote:
> No they are not the same. When you don't include a unit for
> effective_cache_size, it defaults to page size so you're saying 2146435072 *
> 8K

Hah.

Thanks Jim, like I said I was sure I'd be missing something :)

Geoff


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:
On 29 February 2016 at 14:07, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
> On 29 February 2016 at 14:06, Jim Mlodgenski <jimmy76@gmail.com> wrote:
>> No they are not the same. When you don't include a unit for
>> effective_cache_size, it defaults to page size so you're saying 2146435072 *
>> 8K
>
> Hah.
>
> Thanks Jim, like I said I was sure I'd be missing something :)

So ignoring my effective_cache_size vs units stupidity, and coming
back to the problem I was originally going to email about before I got
sidetracked...

Is there a reason why the single-column index is used when
effective_cache_size is so much lower, even though the index sizes are
not much different (2.3GB vs 3.2GB)? I can increase
effective_cache_size from (the current) 3GB up to 8GB before it starts
using the multicolumn index, which seems excessive given the relative
index sizes.

Geoff


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:
Just as a continuation of this, I can set effective_cache_size to 64MB
and it will still use the single-column index, but PG flatly refuses
to use the multicolumn index without effective_cache_size being an
unfeasibly large number (2x the RAM in the machine, in this case).

Geoff


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
"Joshua D. Drake"
Дата:
On 02/29/2016 10:05 AM, Geoff Winkless wrote:
> Just as a continuation of this, I can set effective_cache_size to 64MB
> and it will still use the single-column index, but PG flatly refuses
> to use the multicolumn index without effective_cache_size being an
> unfeasibly large number (2x the RAM in the machine, in this case).

I haven't been following this thread but did you try looking at the costs?

#seq_page_cost = 1.0                    # measured on an arbitrary scale
#random_page_cost = 4.0                 # same scale as above
#cpu_tuple_cost = 0.01                  # same scale as above
#cpu_index_tuple_cost = 0.005           # same scale as above
#cpu_operator_cost = 0.0025             # same scale as above
#effective_cache_size = 128MB

Especially seq_page_cost, random_page_cost and cpu_index_tuple_cost?

JD

>
> Geoff
>
>


--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:
On 29 February 2016 at 18:31, Joshua D. Drake <jd@commandprompt.com> wrote:
> I haven't been following this thread but did you try looking at the costs?

Thanks for the response...

> #seq_page_cost = 1.0                    # measured on an arbitrary scale
> #random_page_cost = 4.0                 # same scale as above
> #cpu_tuple_cost = 0.01                  # same scale as above
> #cpu_index_tuple_cost = 0.005           # same scale as above
> #cpu_operator_cost = 0.0025             # same scale as above
> #effective_cache_size = 128MB
>
> Especially seq_page_cost, random_page_cost and cpu_index_tuple_cost?

seq_page_cost: 1
random_page_cost: 4
cpu_tuple_cost: 0.01
cpu_index_tuple_cost: 0.005
cpu_operator_cost: 0.0025
effective_cache_size: 3GB

I'm not really sure what changes I could make that would make one
index that's ostensibly equivalent to the other not be attractive to
the planner though. I can mess with those figures but as I said before
the only one that flicks the switch is to change effective_cache_size
to 8GB, which makes no sense to me.

Geoff


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Kevin Grittner
Дата:
On Mon, Feb 29, 2016 at 2:10 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:

> I'm not really sure what changes I could make that would make one
> index that's ostensibly equivalent to the other not be attractive to
> the planner though. I can mess with those figures but as I said before
> the only one that flicks the switch is to change effective_cache_size
> to 8GB, which makes no sense to me.

effective_cache_size doesn't affect how memory is allocated, it
tells the optimizer what to assume about the combined cache space
(essentially shared_buffers + OS cache) so that it can estimate the
amount of random storage I/O needed to use an indexed plan.  If you
tell it that you only have 64MB between those two types of cache,
it will assume that the index (particularly if it is deep and/or
wide) will be very expensive.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Geoff Winkless
Дата:


On 29 Feb 2016 22:47, "Kevin Grittner" <kgrittn@gmail.com> wrote:
>
> On Mon, Feb 29, 2016 at 2:10 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
>
> > I'm not really sure what changes I could make that would make one
> > index that's ostensibly equivalent to the other not be attractive to
> > the planner though. I can mess with those figures but as I said before
> > the only one that flicks the switch is to change effective_cache_size
> > to 8GB, which makes no sense to me.
>
> effective_cache_size doesn't affect how memory is allocated, it
> tells the optimizer what to assume about the combined cache space
> (essentially shared_buffers + OS cache) so that it can estimate the
> amount of random storage I/O needed to use an indexed plan.  If you
> tell it that you only have 64MB between those two types of cache,
> it will assume that the index (particularly if it is deep and/or
> wide) will be very expensive

Perhaps I'm not being clear. Index 1 has field a and is used in the join no matter how small I set effective_cache_size (even 32mb). Index 2 has fields a,b but will not be used at ecs of 3gb, 6gb, whatever up til 8gb, when it's suddenly used.

For the variation in size of the two indexes (say 20%?) that just doesn't add up.

Geoff

Re: multicolumn index and setting effective_cache_size using human-readable-numbers

От
Jeff Janes
Дата:
On Mon, Feb 29, 2016 at 3:02 PM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
>
> Perhaps I'm not being clear. Index 1 has field a and is used in the join no
> matter how small I set effective_cache_size (even 32mb). Index 2 has fields
> a,b but will not be used at ecs of 3gb, 6gb, whatever up til 8gb, when it's
> suddenly used.
>
> For the variation in size of the two indexes (say 20%?) that just doesn't
> add up.

What version of PostgreSQL are you using?  Any chance you can share a
data-generator which reproduces this problem?

Cheers,

Jeff