Обсуждение: Do table-level CHECK constraints affect the query optimizer?

Поиск
Список
Период
Сортировка

Do table-level CHECK constraints affect the query optimizer?

От
Ron
Дата:
Postgresql 12.5

I've got a big (about 50M rows, but 1.4TB because of xml attachments) 
partitioned table full of data that we're seeing sequential scans on, even 
though there are supporting indices. Will adding CHECK constraints on the 
children, which match the partition ranges influence the query optimizer?

(We'd try this on the test system -- which has just a few weeks less data 
than prod -- but queries on it use the proper indices, and thus perform as 
expected.  Both "test" and "prod" have identical IO systems, and 
postgresql.conf files.)

Partitions: request_xml_p2015_07 FOR VALUES FROM ('2015-07-01 00:00:00') TO 
('2015-08-01 00:00:00'),
             request_xml_p2015_08 FOR VALUES FROM ('2015-08-01 00:00:00') TO 
('2015-09-01 00:00:00'),
             request_xml_p2015_09 FOR VALUES FROM ('2015-09-01 00:00:00') TO 
('2015-10-01 00:00:00'),
             request_xml_p2015_10 FOR VALUES FROM ('2015-10-01 00:00:00') TO 
('2015-11-01 00:00:00'),
[snip]
             request_xml_p2021_06 FOR VALUES FROM ('2021-06-01 00:00:00') TO 
('2021-07-01 00:00:00'),
             request_xml_p2021_07 FOR VALUES FROM ('2021-07-01 00:00:00') TO 
('2021-08-01 00:00:00'),
             request_xml_p2021_08 FOR VALUES FROM ('2021-08-01 00:00:00') TO 
('2021-09-01 00:00:00')


-- 
Angular momentum makes the world go 'round.



Re: Do table-level CHECK constraints affect the query optimizer?

От
Michael Lewis
Дата:
Are vacuum and analyze happening regularly on the live system? What's an example query that uses indexes on test and does not on live? Does the live system show poor estimates when executing 'explain analyze select...' and the test system show semi-accurate row estimates?

50 million seems to be a fairly low row count to be partitioned. What version is this on?

Re: Do table-level CHECK constraints affect the query optimizer?

От
Ron
Дата:
On 6/29/21 10:41 AM, Michael Lewis wrote:
Are vacuum and analyze happening regularly on the live system?

Yes.  There's a nightly cron job which vacuums those it thinks needs it (though it's INSERT-heavy), and ditto ANALYZE.

Specifically, I ran ANALYZE on the prod table just before running the query.

Also, the sampling rate on all tables in both prod and test is 60000 rows.

What's an example query that uses indexes on test and does not on live?

SELECT COUNT(*) FROM sep_info_report_extract;

On prod, there's a list of "Parallel Seq Scan on xxxx_partname" records in the EXPLAIN output, while the test system has a list of "Parallel Index Only Scan using ..._idx" records.

(Yes, this is a simple COUNT(*) but it's a simple reproducer, which demonstrates the kind of problems we're having on much more complex queries.)

Does the live system show poor estimates when executing 'explain analyze select...' and the test system show semi-accurate row estimates?

They're within 2% of each other.

50 million seems to be a fairly low row count to be partitioned. What version is this on?

As explained in the OP, each record has a (sometimes large) XML record; months can have up to 240GB.  Besides, partitioning makes for easy archiving.

--
Angular momentum makes the world go 'round.

Re: Do table-level CHECK constraints affect the query optimizer?

От
Michael Lewis
Дата:
Other than rows being frozen on test (and not on live), I'm not aware of anything that would push the planner to choose to do an index scan on an entire table. Maybe someone else will chime in. Or, if you try running vacuum freeze on live and can verify if that changes the result.

I'm not sure why sampling rate would matter at all if you are reading the entire set of data.

What version?

Re: Do table-level CHECK constraints affect the query optimizer?

От
Tom Lane
Дата:
Ron <ronljohnsonjr@gmail.com> writes:
> On 6/29/21 10:41 AM, Michael Lewis wrote:
>> What's an example query that uses indexes on test and does not on live?

> SELECT COUNT(*) FROM sep_info_report_extract;

> On prod, there's a list of "Parallel Seq Scan on xxxx_partname" records in
> the EXPLAIN output, while the test system has a list of "Parallel Index Only
> Scan using ..._idx" records.

It'd be worth checking pg_class.relallvisible page counts for the
partitions on both systems.  If an IOS is possible, the main thing
that might push the planner to do a seqscan instead is if it thinks
that too little of the table is all-visible, which would tend to
inflate the index-only scan towards the same cost as a regular index
scan (which'll almost always be considered slower than seqscan).

If there's a significant difference in relallvisible fractions, that
would point to something different in your VACUUM housekeeping on
the two systems.

            regards, tom lane



Re: Do table-level CHECK constraints affect the query optimizer?

От
Ron
Дата:
On 6/29/21 11:42 AM, Tom Lane wrote:
> Ron <ronljohnsonjr@gmail.com> writes:
>> On 6/29/21 10:41 AM, Michael Lewis wrote:
>>> What's an example query that uses indexes on test and does not on live?
>> SELECT COUNT(*) FROM sep_info_report_extract;
>> On prod, there's a list of "Parallel Seq Scan on xxxx_partname" records in
>> the EXPLAIN output, while the test system has a list of "Parallel Index Only
>> Scan using ..._idx" records.
> It'd be worth checking pg_class.relallvisible page counts for the
> partitions on both systems.

Lots of 0 records in prod, and lots of "numbers" in test.

>   If an IOS is possible, the main thing
> that might push the planner to do a seqscan instead is if it thinks
> that too little of the table is all-visible, which would tend to
> inflate the index-only scan towards the same cost as a regular index
> scan (which'll almost always be considered slower than seqscan).
>
> If there's a significant difference in relallvisible fractions, that
> would point to something different in your VACUUM housekeeping on
> the two systems.

Prod is brand new.  Loaded on Saturday; we saw this problem on Sunday during 
pre-acceptance.  Thus, while running ANALYZE was top of the list of Things 
To Do, running VACUUM was low.

Is that a mistaken belief?

-- 
Angular momentum makes the world go 'round.



Re: Do table-level CHECK constraints affect the query optimizer?

От
Tom Lane
Дата:
Ron <ronljohnsonjr@gmail.com> writes:
> On 6/29/21 11:42 AM, Tom Lane wrote:
>> If there's a significant difference in relallvisible fractions, that
>> would point to something different in your VACUUM housekeeping on
>> the two systems.

> Prod is brand new.  Loaded on Saturday; we saw this problem on Sunday during
> pre-acceptance.  Thus, while running ANALYZE was top of the list of Things
> To Do, running VACUUM was low.

> Is that a mistaken belief?

ANALYZE won't update relallvisible AFAIR, while VACUUM will.
So if you are depending on lots of IOS, you need a round of
vacuuming.

            regards, tom lane



Re: Do table-level CHECK constraints affect the query optimizer?

От
Mark Dilger
Дата:

> On Jun 29, 2021, at 10:33 AM, Ron <ronljohnsonjr@gmail.com> wrote:
>
> Prod is brand new.  Loaded on Saturday; we saw this problem on Sunday during pre-acceptance.  Thus, while running
ANALYZEwas top of the list of Things To Do, running VACUUM was low. 
>
> Is that a mistaken belief?

You might want to run VACUUM FREEZE and then retry your test query using EXPLAIN.  See if it switches to an index only
scanafter the VACUUM FREEZE. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Do table-level CHECK constraints affect the query optimizer?

От
Ron
Дата:
On 6/29/21 12:46 PM, Tom Lane wrote:
> Ron <ronljohnsonjr@gmail.com> writes:
>> On 6/29/21 11:42 AM, Tom Lane wrote:
>>> If there's a significant difference in relallvisible fractions, that
>>> would point to something different in your VACUUM housekeeping on
>>> the two systems.
>> Prod is brand new.  Loaded on Saturday; we saw this problem on Sunday during
>> pre-acceptance.  Thus, while running ANALYZE was top of the list of Things
>> To Do, running VACUUM was low.
>> Is that a mistaken belief?
> ANALYZE won't update relallvisible AFAIR, while VACUUM will.
> So if you are depending on lots of IOS, you need a round of
> vacuuming.

What's an IOS?

-- 
Angular momentum makes the world go 'round.



Re: Do table-level CHECK constraints affect the query optimizer?

От
Mark Dilger
Дата:

> On Jun 29, 2021, at 11:02 AM, Ron <ronljohnsonjr@gmail.com> wrote:
>
> What's an IOS?

An Index Only Scan.  See https://www.postgresql.org/docs/14/indexes-index-only-scans.html
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company