Thread: row estimate for partial index
Hi,

I'm trying to figure out why a certain query doesn't use a partial index.

The context: the table has about 100M rows of "contacts". Every contact
belongs to an "org", and can be marked "deleted". Since we're generally only
interested in non-deleted contacts, we have a partial index on the table:

    "contact_organization_id" btree (org_id, id) WHERE deleted IS NULL

for queries such as:

    select * from contacts where org_id = 123 and deleted is null order by id

That works well enough. However, we now have an org_id which has > 10% of the
rows, but only a handful of rows where "deleted is null" matches (so the org
has a lot of "deleted" contacts). The planner doesn't like this, and it falls
back to a full table scan for the above query.

I've added a dedicated index just for that org_id, to see if that helps:

    "org123" btree (id) WHERE deleted IS NULL AND org_id = 123

The planner uses it now, however the row estimate is way off:

    my_db=> explain SELECT c.id, COALESCE(c.first_name, '')
            FROM contacts AS c
            WHERE c.org_id = 123 AND c.deleted IS NULL
            ORDER BY id ASC;
                                                QUERY PLAN
    ────────────────────────────────────────────────────────────────────────────────────────────────────
     Gather Merge  (cost=1761204.72..1883324.98 rows=1046672 width=36)
       Workers Planned: 2
       ->  Sort  (cost=1760204.70..1761513.04 rows=523336 width=36)
             Sort Key: id
             ->  Parallel Bitmap Heap Scan on contacts c  (cost=318.13..1696183.14 rows=523336 width=36)
                   Recheck Cond: ((deleted IS NULL) AND (org_id = 123))
                   ->  Bitmap Index Scan on org123  (cost=0.00..4.13 rows=1256006 width=0)
    (7 rows)

    Time: 3.337 ms

That "rows=1256006" estimate from the org123 index is way off. There are 7
matching contacts in that org:

    ->  Bitmap Index Scan on org123  (cost=0.00..4.13 rows=1256006 width=0)
        (actual time=0.841..0.841 rows=7 loops=1)

Is this expected? Is there a way to improve this?

Thanks!
Harmen
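[Not part of the original mail: one way to see where an estimate like this comes from is to look at the per-column statistics the planner combines. A sketch, reusing the table and column names above; without extended statistics the planner multiplies the individual selectivities as if the columns were independent:]

```sql
-- Inputs the planner uses for each condition separately:
-- the fraction of NULLs in "deleted", and the frequency of org_id = 123.
SELECT attname, null_frac, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'contacts'
  AND attname IN ('deleted', 'org_id');

-- With assumed independence, the row estimate is roughly:
--   reltuples * freq(org_id = 123) * null_frac(deleted)
SELECT reltuples FROM pg_class WHERE relname = 'contacts';
```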
Harmen <harmen@lijzij.de> writes:
> Works well enough. However, we now have an org_id which has > 10% of the rows,
> but only a handful rows where "deleted is null" matches (so the org has a lot
> of "deleted" contacts). The planner doesn't like this and it falls back to a
> full table scan for the above query.

> I've added a dedicated index just for that org_id, to see if that helps:
> "org123" btree (id) WHERE deleted IS NULL AND org_id = 123
> The planner seems to use it now, however the row estimate is way off:

Yeah, so that indicates that it isn't producing a good selectivity
estimate for the combination of those two conditions: it will assume
the org_id and deleted columns are independent, which per your statements
they are not.

If you are running a reasonably recent PG version you should be able to
fix that by setting up "extended statistics" on that pair of columns:

https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED

(I might be wrong, but I think that will help even when one of
the troublesome conditions is a null-check. If it doesn't, then
we have something to improve there ...)

			regards, tom lane
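[For reference, a minimal sketch of the suggestion above, using the table and column names from the first mail; the statistics object name is made up. Omitting the kind list creates all statistics kinds:]

```sql
-- Tell the planner that org_id and deleted are correlated,
-- then re-gather statistics so it can use the new stats object.
CREATE STATISTICS contacts_org_deleted ON org_id, deleted FROM contacts;
ANALYZE contacts;
```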
On Sat, Jan 14, 2023 at 11:23:07AM -0500, Tom Lane wrote:
> Harmen <harmen@lijzij.de> writes:
> > Works well enough. However, we now have an org_id which has > 10% of the rows,
> > but only a handful rows where "deleted is null" matches (so the org has a lot
> > of "deleted" contacts). The planner doesn't like this and it falls back to a
> > full table scan for the above query.
>
> > I've added a dedicated index just for that org_id, to see if that helps:
> > "org123" btree (id) WHERE deleted IS NULL AND org_id = 123
> > The planner seems to use it now, however the row estimate is way off:
>
> Yeah, so that indicates that it isn't producing a good selectivity
> estimate for the combination of those two conditions: it will assume
> the org_id and deleted columns are independent, which per your statements
> they are not.
>
> If you are running a reasonably recent PG version you should be able to
> fix that by setting up "extended statistics" on that pair of columns:
>
> https://www.postgresql.org/docs/current/planner-stats.html#PLANNER-STATS-EXTENDED
>
> (I might be wrong, but I think that will help even when one of
> the troublesome conditions is a null-check. If it doesn't, then
> we have something to improve there ...)

Thanks for your explanation, Tom.

I've set up a local test scenario and added a "dependencies" statistic, but
that doesn't give a better plan, unfortunately.

This is my test table (I use a boolean field for "deleted" to keep the test
case as simple as possible; in my real case it's a "timestamptz null" field):

    DROP TABLE IF EXISTS contactsbool;
    CREATE TABLE contactsbool (
        id int NOT NULL,
        org_id int NOT NULL,
        deleted boolean NOT NULL,
        firstname text
    );
    CREATE INDEX contactsbool_orgs ON contactsbool (org_id, id) WHERE NOT deleted;

The test data has a very low number of "orgs", and one org has almost only
deleted contacts:

    WITH ids AS (SELECT * FROM generate_series(0, 10000000))
    INSERT INTO contactsbool
    SELECT ids.generate_series, mod(ids.generate_series, 7), false, 'hello world'
    FROM ids;

    UPDATE contactsbool SET deleted = true WHERE id > 100 AND org_id = 5;
    ANALYZE contactsbool;

Now the new stats:

    CREATE STATISTICS dist4 (ndistinct) ON deleted, org_id FROM contactsbool;
    CREATE STATISTICS dist4b (ndistinct) ON org_id, deleted FROM contactsbool;
    ANALYZE contactsbool;

    harmen=> explain (analyze) select id, firstname from contactsbool
             where org_id = 5 and not deleted order by id;
                                                              QUERY PLAN
    ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
     Gather Merge  (cost=181983.91..299104.42 rows=1003820 width=16) (actual time=448.244..454.770 rows=14 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       ->  Sort  (cost=180983.88..182238.66 rows=501910 width=16) (actual time=413.761..413.762 rows=5 loops=3)
             Sort Key: id
             Sort Method: quicksort  Memory: 25kB
             Worker 0:  Sort Method: quicksort  Memory: 25kB
             Worker 1:  Sort Method: quicksort  Memory: 25kB
             ->  Parallel Seq Scan on contactsbool  (cost=0.00..124881.86 rows=501910 width=16) (actual time=267.318..413.673 rows=5 loops=3)
                   Filter: ((NOT deleted) AND (org_id = 5))
                   Rows Removed by Filter: 3333329
     Planning Time: 0.565 ms
     JIT:
       Functions: 12
       Options: Inlining false, Optimization false, Expressions true, Deforming true
       Timing: Generation 2.444 ms, Inlining 0.000 ms, Optimization 1.163 ms, Emission 13.288 ms, Total 16.895 ms
     Execution Time: 456.498 ms
    (17 rows)

The "rows=501910" estimate is what I don't expect. I expect/want/hope the
plan to use the contactsbool_orgs index. (If I really (hard) delete the
"deleted" contacts, everything works perfectly for all orgs.)

Any ideas?

Thanks again,
Harmen
Harmen <harmen@lijzij.de> writes:
> On Sat, Jan 14, 2023 at 11:23:07AM -0500, Tom Lane wrote:
>> If you are running a reasonably recent PG version you should be able to
>> fix that by setting up "extended statistics" on that pair of columns:

> CREATE STATISTICS dist4 (ndistinct) ON deleted, org_id FROM contactsbool;
> CREATE STATISTICS dist4b (ndistinct) ON org_id, deleted FROM contactsbool;

1. ndistinct is not the correct stats type for this problem.
(I think dependencies is, but generally speaking, it's not worth
trying to be smarter than the system about which ones you need.
Just create 'em all.)

2. Per the CREATE STATISTICS man page, the order of the columns is
not significant, so you're just doubling the amount of work for
ANALYZE without gaining anything.

I think you will find that

    CREATE STATISTICS stats1 ON deleted, org_id FROM contactsbool;

is enough to fix this.  It improved the estimate for me in
v14 and HEAD, anyway.

			regards, tom lane
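[For reference: after ANALYZE, the functional-dependency coefficients that such a statistics object produced can be inspected through the pg_stats_ext view (PostgreSQL 12+). A sketch, not part of the original reply:]

```sql
-- Shows the dependency strengths ANALYZE computed for each statistics
-- object on the table; values near 1.0 mean strongly correlated columns.
SELECT statistics_name, dependencies
FROM pg_stats_ext
WHERE tablename = 'contactsbool';
```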
On Mon, Jan 16, 2023 at 09:59:38AM -0500, Tom Lane wrote:
> Harmen <harmen@lijzij.de> writes:
> > On Sat, Jan 14, 2023 at 11:23:07AM -0500, Tom Lane wrote:
> >> If you are running a reasonably recent PG version you should be able to
> >> fix that by setting up "extended statistics" on that pair of columns:
>
> > CREATE STATISTICS dist4 (ndistinct) ON deleted, org_id FROM contactsbool;
> > CREATE STATISTICS dist4b (ndistinct) ON org_id, deleted FROM contactsbool;

Hello Tom, thanks again for your help, it's much appreciated.

> 1. ndistinct is not the correct stats type for this problem.
> (I think dependencies is, but generally speaking, it's not worth
> trying to be smarter than the system about which ones you need.
> Just create 'em all.)

I don't understand the "dependencies" stat here. The documentation has as
examples zip codes with city names, and month/day columns; those I follow.
But in this case I generally select a single "org_id", which can have an
arbitrary number of "deleted" rows. I don't understand how having the org_id
available gives a better prediction than taking the plain ratio of all
"deleted" rows in the whole table.

> 2. Per the CREATE STATISTICS man page, the order of the columns is
> not significant, so you're just doubling the amount of work for
> ANALYZE without gaining anything.

Noted. Are these statistics expensive, or do you expect them to be basically
noise?

> I think you will find that
>
>     CREATE STATISTICS stats1 ON deleted, org_id FROM contactsbool;
>
> is enough to fix this. It improved the estimate for me in
> v14 and HEAD, anyway.

It does for me as well! I'm not 100% sure whether that's just because the
test data is too simple, but I'll test.

Unfortunately, as you already remarked might be the case in your original
reply, I don't get the same good results if I change the "deleted" column
from "boolean not null" to "timestamptz null".
Then it's back to (pg 15.1):

    ->  Parallel Seq Scan on contacts  (cost=0.00..126284.02 rows=513913 width=16)
                                       (actual time=254.677..393.448 rows=5 loops=3)
          Filter: ((deleted IS NULL) AND (org_id = 5))
          Rows Removed by Filter: 3333329

Test table, in case that's convenient to have:

    DROP TABLE IF EXISTS contacts;
    CREATE TABLE contacts (
        id int NOT NULL,
        org_id int NOT NULL,
        deleted timestamptz NULL,
        firstname text
    );
    CREATE INDEX contacts_orgs ON contacts (org_id, id) WHERE deleted IS NULL;

    WITH ids AS (SELECT * FROM generate_series(0, 10000000))
    INSERT INTO contacts
    SELECT ids.generate_series, mod(ids.generate_series, 7), null, 'hello world'
    FROM ids;

    UPDATE contacts SET deleted = now() WHERE id > 100 AND org_id = 5;

    CREATE STATISTICS dist4 (ndistinct) ON deleted, org_id FROM contacts;
    CREATE STATISTICS dist5 (dependencies) ON deleted, org_id FROM contacts;
    ANALYZE contacts;

    EXPLAIN (ANALYZE) SELECT id, firstname FROM contacts
    WHERE org_id = 5 AND deleted IS NULL ORDER BY id;

Again, this is only a problem when a single org_id takes up such a large
percentage of the table that the planner switches to table scans, which can
only ever be a few org_ids. On production I've added an index with the
problematic org_id in the "WHERE" part, and that avoids the problem for now
(milliseconds vs. minutes), but it's not a great fix.

Thanks!
Harmen
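[One untested avenue for the timestamptz case, not from the thread: PostgreSQL 14 added support for expressions in extended statistics, so a statistics object can be built on the same expression the partial index uses. Whether the planner matches the plain "deleted IS NULL" clause against it would need to be verified; the statistics name is made up:]

```sql
-- Statistics on the expression the partial index predicate uses, so the
-- planner can correlate (deleted IS NULL) with org_id directly (PG 14+).
CREATE STATISTICS contacts_org_live (dependencies)
    ON (deleted IS NULL), org_id FROM contacts;
ANALYZE contacts;

-- Re-check whether the estimate for the problematic org improves:
EXPLAIN SELECT id, firstname FROM contacts
WHERE org_id = 5 AND deleted IS NULL ORDER BY id;
```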