Обсуждение: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
[HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
От
David Rowley
Дата:
Basically, $subject is causing us not to properly find matching extended stats in this case. The attached patch fixes it. The following test cases is an example of the misbehaviour. Note rows=1 vs rows=98 in the Gather node. create table ab (a varchar, b varchar); insert into ab select (x%1000)::varchar, (x%10000)::Varchar from generate_Series(1,1000000)x; create statistics ab_a_b_stats (dependencies) on a,b from ab; analyze ab; -- Unpatched explain analyze select * from ab where a = '1' and b = '1'; QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Gather (cost=1000.00..12466.10 rows=1 width=7) (actual time=0.441..90.515 rows=100 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on ab (cost=0.00..11466.00 rows=1 width=7) (actual time=1.081..74.944 rows=33 loops=3) Filter: (((a)::text = '1'::text) AND ((b)::text = '1'::text)) Rows Removed by Filter: 333300 Planning time: 0.184 ms Execution time: 105.878 ms (8 rows) -- Patched explain analyze select * from ab where a = '1' and b = '1'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ Gather (cost=1000.00..12475.80 rows=98 width=7) (actual time=1.076..92.595 rows=100 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on ab (cost=0.00..11466.00 rows=41 width=7) (actual time=0.491..77.833 rows=33 loops=3) Filter: (((a)::text = '1'::text) AND ((b)::text = '1'::text)) Rows Removed by Filter: 333300 Planning time: 2.175 ms Execution time: 106.326 ms (8 rows) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
[HACKERS] Re: Extended statistics is not working on Vars hidden under aRelabelType
От
Tomas Vondra
Дата:
On 10/10/2017 05:03 AM, David Rowley wrote: > Basically, $subject is causing us not to properly find matching > extended stats in this case. > > The attached patch fixes it. > > The following test cases is an example of the misbehaviour. Note > rows=1 vs rows=98 in the Gather node. > Thanks for noticing this. The patch seems fine to me. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Extended statistics is not working on Vars hidden under aRelabelType
От
Alvaro Herrera
Дата:
Tomas Vondra wrote: > On 10/10/2017 05:03 AM, David Rowley wrote: > > Basically, $subject is causing us not to properly find matching > > extended stats in this case. > > > > The attached patch fixes it. > > > > The following test cases is an example of the misbehaviour. Note > > rows=1 vs rows=98 in the Gather node. > > Thanks for noticing this. The patch seems fine to me. I propose this slightly larger change. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
От
David Rowley
Дата:
On 13 October 2017 at 02:17, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > I propose this slightly larger change. hmm, this is not right. You're not checking that there's a Var below the RelabelType. I tried with: explain select * from ab where (a||a)::varchar = '' and b = ''; and your code assumed the OpExpr was a Var. The reason Tomas coded it the way it was coded is due to the fact that there's already code that works exactly the same way in clauselist_selectivity(). Personally, I don't particularly like that code, but I'd rather not invent a new way to do the same thing. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: Extended statistics is not working on Vars hidden under aRelabelType
От
Alvaro Herrera
Дата:
David Rowley wrote: > The reason Tomas coded it the way it was coded is due to the fact that > there's already code that works exactly the same way in > clauselist_selectivity(). Personally, I don't particularly like that > code, but I'd rather not invent a new way to do the same thing. I pushed your original fix. I still maintain that this should be written differently, but I'm not going to try to change that in a hurry and together with a bugfix. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
От
David Rowley
Дата:
On 13 October 2017 at 04:56, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > I pushed your original fix. Thanks for committing -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Robert Haas
Дата:
On Mon, Oct 9, 2017 at 11:03 PM, David Rowley <david.rowley@2ndquadrant.com> wrote: > -- Unpatched > Planning time: 0.184 ms > Execution time: 105.878 ms > > -- Patched > Planning time: 2.175 ms > Execution time: 106.326 ms This might not be the best example to show the advantages of the patch, honestly. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Tomas Vondra
Дата:
On 10/13/2017 10:04 PM, Robert Haas wrote: > On Mon, Oct 9, 2017 at 11:03 PM, David Rowley > <david.rowley@2ndquadrant.com> wrote: >> -- Unpatched >> Planning time: 0.184 ms >> Execution time: 105.878 ms >> >> -- Patched >> Planning time: 2.175 ms >> Execution time: 106.326 ms > > This might not be the best example to show the advantages of the > patch, honestly. > Not sure what exactly is your point? If you're suggesting this example is bad because the planning time increased from 0.184 to 2.175 ms, then perhaps consider the plans were likely generated on a assert-enabled build and on a laptop (both of which adds quite a bit of noise to occasional timings). The patch has no impact on planning time (at least I've been unable to measure any). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
David Rowley
Дата:
On 14 October 2017 at 09:04, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Oct 9, 2017 at 11:03 PM, David Rowley > <david.rowley@2ndquadrant.com> wrote: >> -- Unpatched >> Planning time: 0.184 ms >> Execution time: 105.878 ms >> >> -- Patched >> Planning time: 2.175 ms >> Execution time: 106.326 ms > > This might not be the best example to show the advantages of the > patch, honestly. The focus was on the row estimate. I try to highlight that by mentioning "Note rows=1 vs rows=98 in the Gather node.". I can't imagine the test I added would have made the planner about 12 times slower, but just for the record: create table ab (a varchar, b varchar); insert into ab select (x%1000)::varchar, (x%10000)::Varchar from generate_Series(1,1000000)x; create statistics ab_a_b_stats (dependencies) on a,b from ab; vacuum analyze ab; $ cat a.sql explain select * from ab where a = '1' and b = '1'; e9ef11ac8bb2acc2d2462fc17ec3291a959589e7 (Patched) $ pgbench -f a.sql -T 60 -n transaction type: a.sql scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 496950 latency average = 0.121 ms tps = 8282.481310 (including connections establishing) tps = 8282.750821 (excluding connections establishing) e9ef11ac8bb2acc2d2462fc17ec3291a959589e7~1 (Unpatched) $ pgbench -f a.sql -T 60 -n transaction type: a.sql scaling factor: 1 query mode: simple number of clients: 1 number of threads: 1 duration: 60 s number of transactions actually processed: 511250 latency average = 0.117 ms tps = 8520.822410 (including connections establishing) tps = 8521.132784 (excluding connections establishing) With the patch we are making use of the extended statistics, which we do expect to be more work for the planner. Although, we didn't add extended statistics to speed up the planner. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Robert Haas
Дата:
On Fri, Oct 13, 2017 at 4:44 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 10/13/2017 10:04 PM, Robert Haas wrote: >> On Mon, Oct 9, 2017 at 11:03 PM, David Rowley >> <david.rowley@2ndquadrant.com> wrote: >>> -- Unpatched >>> Planning time: 0.184 ms >>> Execution time: 105.878 ms >>> >>> -- Patched >>> Planning time: 2.175 ms >>> Execution time: 106.326 ms >> >> This might not be the best example to show the advantages of the >> patch, honestly. > > Not sure what exactly is your point? If you're suggesting this example > is bad because the planning time increased from 0.184 to 2.175 ms, then > perhaps consider the plans were likely generated on a assert-enabled > build and on a laptop (both of which adds quite a bit of noise to > occasional timings). The patch has no impact on planning time (at least > I've been unable to measure any). I don't really think there's a problem with the patch; I just noticed that with the patch applied both the planning and execution time went up. I understand that's because this is a toy example, not a real one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Robert Haas
Дата:
On Fri, Oct 13, 2017 at 4:49 PM, David Rowley <david.rowley@2ndquadrant.com> wrote: > tps = 8282.481310 (including connections establishing) > tps = 8282.750821 (excluding connections establishing) vs. > tps = 8520.822410 (including connections establishing) > tps = 8521.132784 (excluding connections establishing) > > With the patch we are making use of the extended statistics, which we > do expect to be more work for the planner. Although, we didn't add > extended statistics to speed up the planner. Sure, I understand. That's actually a pretty substantial regression - I guess that means that it's pretty important to avoid creating extended statistics that are not needed, at least for short-running queries. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Tomas Vondra
Дата:
On 10/14/2017 07:49 PM, Robert Haas wrote: > On Fri, Oct 13, 2017 at 4:49 PM, David Rowley > <david.rowley@2ndquadrant.com> wrote: >> tps = 8282.481310 (including connections establishing) >> tps = 8282.750821 (excluding connections establishing) > > vs. > >> tps = 8520.822410 (including connections establishing) >> tps = 8521.132784 (excluding connections establishing) >> >> With the patch we are making use of the extended statistics, which >> we do expect to be more work for the planner. Although, we didn't >> add extended statistics to speed up the planner. > > Sure, I understand. That's actually a pretty substantial regression > - I guess that means that it's pretty important to avoid creating > extended statistics that are not needed, at least for short-running > queries. > Well, it's only about 3% difference in a single run, which may be easily due to slightly different binary layout, random noise etc. So I wouldn't call that "substantial regression", at least not based on this one test. I've done more thorough testing, and what I see is 1.0-1.2% drop, but on a test that's rather extreme (statistics on empty table). So again, likely well within noise, and on larger tables it'll get even less significant. But of course - it's not free. It's a bit more work we need to do. But if you don't need multi-column statistics, don't create them. If your queries are already fast, you probably don't need them at all. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
David Rowley
Дата:
On 15 October 2017 at 06:49, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Oct 13, 2017 at 4:49 PM, David Rowley > <david.rowley@2ndquadrant.com> wrote: >> tps = 8282.481310 (including connections establishing) >> tps = 8282.750821 (excluding connections establishing) > > vs. > >> tps = 8520.822410 (including connections establishing) >> tps = 8521.132784 (excluding connections establishing) >> >> With the patch we are making use of the extended statistics, which we >> do expect to be more work for the planner. Although, we didn't add >> extended statistics to speed up the planner. > > Sure, I understand. That's actually a pretty substantial regression - > I guess that means that it's pretty important to avoid creating > extended statistics that are not needed, at least for short-running > queries. To be honest, I ran that on a VM on my laptop. I was getting quite a bit of noise. I just posted that to show that the 12x slowdown didn't exist. I don't know what the actual slowdown is. I just know extended stats are not free and that nobody expected that they ever would be. The good news is that they're off by default and if the bad ever outweighs the good then the fix for that starts with "DROP STATISTICS" I personally think it's great we're starting to see a useful feature materialise that can help with poor row estimates from the planner. -- David Rowley http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended statistics is not working on Vars hidden undera RelabelType
От
Robert Haas
Дата:
On Mon, Oct 16, 2017 at 2:36 AM, David Rowley <david.rowley@2ndquadrant.com> wrote: > I personally think it's great we're starting to see a useful feature > materialise that can help with poor row estimates from the planner. I agree. My original post to this thread was more of a throw-away comment than anything, and I'm not attacking the feature. I didn't think it was a very clear example and, TBH, I still don't. But I don't want to blow that up into a big debate on the virtues of this feature, which I never intended to question, or on the correctness of the patch, which I also did not intend to question. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers