Обсуждение: Effect of the WindowAgg on the Nested Loop
Greetings. I've been playing with a small query that I've been asked to optimize and noticed a strange (for me) effect. Query uses this table: Table "clc06_tiles" Column | Type | Modifiers ------------+-----------------------+----------------------------------------------------------- geometry | geometry | code_06 | character varying(3) | gid | bigint | not null default nextval('clc06_tiles_gid_seq'::regclass) Indexes: "clc06_tiles_pkey" PRIMARY KEY, btree (gid) "i_clc06_tiles_geometry" gist (geometry) Check constraints: "enforce_dims_geom" CHECK (st_ndims(geometry) = 2) "enforce_geotype_geom" CHECK (geometrytype(geometry) = 'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR geometry IS NULL) "enforce_srid_geom" CHECK (st_srid(geometry) = 3035) and this function: CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8, y2 float8) RETURNS geometry AS $my_trans$ SELECT st_Transform( st_GeomFromText('LINESTRING('||x1::text||' '||y1::text|| ', '||x2::text||' '||y2::text||')',4326),3035); $my_trans$ LANGUAGE sql IMMUTABLE STRICT; and these constants: \set x1 4.56 \set y1 52.54 \set x2 5.08 \set y2 53.34 Original query looks like this ( http://explain.depesz.com/s/pzv ): SELECT n, i.*, st_NumGeometries(i.geom) FROM ( SELECT a.code_06 as code_06, st_Multi(st_Intersection(a.geometry, my_trans(:x1,:y1,:x2,:y2))) as geom FROM clc06_tiles a WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom); After a while I added row_number() to the inner part ( http://explain.depesz.com/s/hfs ): SELECT n, i.*, st_NumGeometries(i.geom) FROM ( SELECT row_number() OVER () AS rn, a.code_06 as code_06, st_Multi(st_Intersection(a.geometry, my_trans(:x1,:y1,:x2,:y2))) as geom FROM clc06_tiles a WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom); It was really surprising to see a "side" effect of 8x performance boost. The only difference I can see is an extra WindowAgg step in the second variant. Could you kindly explain how WindowAgg node affects the overall performance, please? PostgreSQL 9.2.1 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1, 64-bit archive_command | (disabled) | configuration file bgwriter_delay | 100ms | configuration file bgwriter_lru_maxpages | 200 | configuration file checkpoint_segments | 30 | configuration file default_text_search_config | pg_catalog.english | configuration file effective_cache_size | 3GB | configuration file listen_addresses | * | configuration file log_checkpoints | on | configuration file log_connections | on | configuration file log_destination | csvlog | configuration file log_disconnections | on | configuration file log_lock_waits | on | configuration file log_min_duration_statement | 100ms | configuration file log_rotation_age | 1d | configuration file log_temp_files | 20MB | configuration file log_timezone | UTC | configuration file logging_collector | on | configuration file maintenance_work_mem | 64MB | configuration file max_connections | 100 | configuration file max_stack_depth | 2MB | environment variable max_wal_senders | 2 | configuration file port | 5432 | configuration file shared_buffers | 768MB | configuration file temp_buffers | 32MB | configuration file TimeZone | UTC | configuration file wal_level | hot_standby | configuration file work_mem | 8MB | configuration file -- Victor Y. Yegorov
On Tue, Jan 22, 2013 at 3:57 PM, Виктор Егоров <vyegorov@gmail.com> wrote: > Greetings. > > I've been playing with a small query that I've been asked to optimize > and noticed a strange (for me) effect. > Query uses this table: > > Table "clc06_tiles" > Column | Type | > Modifiers > ------------+-----------------------+----------------------------------------------------------- > geometry | geometry | > code_06 | character varying(3) | > gid | bigint | not null default > nextval('clc06_tiles_gid_seq'::regclass) > Indexes: > "clc06_tiles_pkey" PRIMARY KEY, btree (gid) > "i_clc06_tiles_geometry" gist (geometry) > Check constraints: > "enforce_dims_geom" CHECK (st_ndims(geometry) = 2) > "enforce_geotype_geom" CHECK (geometrytype(geometry) = > 'MULTIPOLYGON'::text OR geometrytype(geometry) = 'POLYGON'::text OR > geometry IS NULL) > "enforce_srid_geom" CHECK (st_srid(geometry) = 3035) > > and this function: > CREATE OR REPLACE FUNCTION my_trans(x1 float8, y1 float8, x2 float8, > y2 float8) RETURNS geometry AS $my_trans$ > SELECT st_Transform( > st_GeomFromText('LINESTRING('||x1::text||' '||y1::text|| > ', '||x2::text||' '||y2::text||')',4326),3035); > $my_trans$ LANGUAGE sql IMMUTABLE STRICT; > > and these constants: > \set x1 4.56 > \set y1 52.54 > \set x2 5.08 > \set y2 53.34 > > > Original query looks like this ( http://explain.depesz.com/s/pzv ): > > SELECT n, i.*, st_NumGeometries(i.geom) > FROM ( > SELECT a.code_06 as code_06, > st_Multi(st_Intersection(a.geometry, > my_trans(:x1,:y1,:x2,:y2))) as geom > FROM clc06_tiles a > WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i > JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom); > > > After a while I added row_number() to the inner part ( > http://explain.depesz.com/s/hfs ): > > SELECT n, i.*, st_NumGeometries(i.geom) > FROM ( > SELECT row_number() OVER () AS rn, a.code_06 as code_06, > st_Multi(st_Intersection(a.geometry, > my_trans(:x1,:y1,:x2,:y2))) as geom > FROM clc06_tiles a > WHERE st_Intersects(a.geometry, my_trans(:x1,:y1,:x2,:y2))) i > JOIN generate_series(1,10) n ON n <= st_NumGeometries(i.geom); > > > It was really surprising to see a "side" effect of 8x performance boost. > The only difference I can see is an extra WindowAgg step in the second variant. > > Could you kindly explain how WindowAgg node affects the overall > performance, please? Apologies for resurrecting an old thread, but I just came across this post while doing some research and I don't see any responses. This seems like a mighty interesting example. I'm not sure what's going on here, but let me guess. I think that the WindowAgg is forcing some operation - detoasting, maybe? - to happen under the materialize node. As a result, it only gets done once. But in the other plan, the detoast happens at the nested loop level, above the materialize node, and therefore it happens 10x instead of 1x. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
2013/5/15 Robert Haas <robertmhaas@gmail.com>
I was playing with the query a while ago and put it aside since then,
--
Victor Y. Yegorov
Apologies for resurrecting an old thread, but I just came across this> Original query looks like this ( http://explain.depesz.com/s/pzv ):
>
> After a while I added row_number() to the inner part (
> http://explain.depesz.com/s/hfs ):
>
> It was really surprising to see a "side" effect of 8x performance boost.
> The only difference I can see is an extra WindowAgg step in the second variant.
post while doing some research and I don't see any responses.
This seems like a mighty interesting example. I'm not sure what's
going on here, but let me guess. I think that the WindowAgg is
forcing some operation - detoasting, maybe? - to happen under the
materialize node. As a result, it only gets done once. But in the
other plan, the detoast happens at the nested loop level, above the
materialize node, and therefore it happens 10x instead of 1x.
need time to come back to this thing.
I will try to put together a testcase for this example, I'd like to achieve
the same behavior on a non-GIS data set.
Victor Y. Yegorov