Обсуждение: Slow query: select * order by XXX desc offset 10 limit 10
Hello, I use PostgreSQL 8.4.7 on CentOS 6.0 / 64 bit machine with Quad-Core AMD Opteron(tm) Processor 2352 and 16 GB RAM and use it for 1 PHP script - which selects and displays data in jQuery DataTables (i.e. an HTML-table which can be viewed page by page). I select records from 1 view which unites 2 identical tables: quincy=> \d quincyview View "public.quincyview" Column | Type | Modifiers -------------+-----------------------------+----------- qdatetime | timestamp without time zone | id | character varying(20) | name | character varying(20) | category | character varying(120) | appsversion | character varying(30) | osversion | character varying(30) | beta_prog | character varying(20) | catinfo | character varying(120) | details | character varying(50) | devinfo | character varying(4000) | email | character varying(320) | emailid | character varying(16) | imei | character varying(25) | pin | character varying(12) | formfactor | character varying(10) | copied | timestamp without time zone | View definition: SELECT quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied FROM quincynoreset UNION SELECT quincytrack.qdatetime, quincytrack.id, quincytrack.name, quincytrack.category, quincytrack.appsversion, quincytrack.osversion, quincytrack.beta_prog, quincytrack.catinfo, quincytrack.details, quincytrack.devinfo, quincytrack.email, quincytrack.emailid, quincytrack.imei, quincytrack.pin, quincytrack.formfactor, quincytrack.copied FROM quincytrack; And here is 1 of the 2 tables (the other is same, except its name): quincy=> \d quincytrack; Table "public.quincytrack" Column | Type | Modifiers -------------+-----------------------------+--------------- appsversion | character varying(30) | beta_prog | character varying(20) | category | character varying(120) | catinfo | character varying(120) | details | character varying(50) | devinfo | character varying(4000) | emailid | character varying(16) | email | character varying(320) | formfactor | character varying(10) | id | character varying(20) | not null imei | character varying(25) | name | character varying(20) | osversion | character varying(30) | pin | character varying(12) | qdatetime | timestamp without time zone | copied | timestamp without time zone | default now() Indexes: "quincytrack_pkey" PRIMARY KEY, btree (id) There are around 1 mio records in the view: quincy=> select count(*) from quincyview ; count -------- 950476 (1 row) My problem is, that select's are very slow and using my script is no fun despite all the AJAX stuff - which only tries to retrieve "offset X limit Y" records: quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------- Limit (cost=600344.67..600344.70 rows=10 width=1172) -> Sort (cost=600344.65..602859.16 rows=1005804 width=1172) Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) -> Subquery Scan quincyview (cost=518261.35..573580.57 rows=1005804 width=1172) -> Unique (cost=518261.35..561008.02 rows=1005804 width=252) -> Sort (cost=518261.35..520775.86 rows=1005804 width=252) Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.cate gory, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese t.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q uincynoreset.formfactor, quincynoreset.copied -> Append (cost=0.00..57003.60 rows=1005804 width=252) -> Seq Scan on quincynoreset (cost=0.00..40011.20 rows=863394 width=242) Filter: (qdatetime <= now()) -> Seq Scan on quincytrack (cost=0.00..6934.36 rows=142410 width=312) Filter: (qdatetime <= now()) Does anybody please have an idea, how to speed up my select statements? Regards Alex
Does anybody please have an idea,
how to speed up my select statements?
Create one or more indexes.
David J.
In response to Alexander Farber <alexander.farber@gmail.com>: > Hello, > > I use PostgreSQL 8.4.7 on CentOS 6.0 / 64 bit machine > with Quad-Core AMD Opteron(tm) Processor 2352 and > 16 GB RAM and use it for 1 PHP script - which selects > and displays data in jQuery DataTables (i.e. an > HTML-table which can be viewed page by page). > > I select records from 1 view which unites 2 identical tables: > > quincy=> \d quincyview > View "public.quincyview" > Column | Type | Modifiers > -------------+-----------------------------+----------- > qdatetime | timestamp without time zone | > id | character varying(20) | > name | character varying(20) | > category | character varying(120) | > appsversion | character varying(30) | > osversion | character varying(30) | > beta_prog | character varying(20) | > catinfo | character varying(120) | > details | character varying(50) | > devinfo | character varying(4000) | > email | character varying(320) | > emailid | character varying(16) | > imei | character varying(25) | > pin | character varying(12) | > formfactor | character varying(10) | > copied | timestamp without time zone | > View definition: > SELECT quincynoreset.qdatetime, quincynoreset.id, > quincynoreset.name, quincynoreset.category, quincynoreset.appsversion, > quincynoreset.osversion, quincynoreset.beta_prog, > quincynoreset.catinfo, quincynoreset.details, quincynoreset.devinfo, > quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, > quincynoreset.pin, quincynoreset.formfactor, quincynoreset.copied > FROM quincynoreset > UNION > SELECT quincytrack.qdatetime, quincytrack.id, > quincytrack.name, quincytrack.category, quincytrack.appsversion, > quincytrack.osversion, quincytrack.beta_prog, quincytrack.catinfo, > quincytrack.details, quincytrack.devinfo, quincytrack.email, > quincytrack.emailid, quincytrack.imei, quincytrack.pin, > quincytrack.formfactor, quincytrack.copied > FROM quincytrack; > > And here is 1 of the 2 tables (the other is same, except its name): > > quincy=> \d quincytrack; > Table "public.quincytrack" > Column | Type | Modifiers > -------------+-----------------------------+--------------- > appsversion | character varying(30) | > beta_prog | character varying(20) | > category | character varying(120) | > catinfo | character varying(120) | > details | character varying(50) | > devinfo | character varying(4000) | > emailid | character varying(16) | > email | character varying(320) | > formfactor | character varying(10) | > id | character varying(20) | not null > imei | character varying(25) | > name | character varying(20) | > osversion | character varying(30) | > pin | character varying(12) | > qdatetime | timestamp without time zone | > copied | timestamp without time zone | default now() > Indexes: > "quincytrack_pkey" PRIMARY KEY, btree (id) > > There are around 1 mio records in the view: > > quincy=> select count(*) from quincyview ; > count > -------- > 950476 > (1 row) > > My problem is, that select's are very slow and > using my script is no fun despite all the AJAX stuff - > which only tries to retrieve "offset X limit Y" records: > > quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as > QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from > quincyview where qdatetime <= now() order by QDATETIME desc offset 10 > limit 10; > > > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------- > Limit (cost=600344.67..600344.70 rows=10 width=1172) > -> Sort (cost=600344.65..602859.16 rows=1005804 width=1172) > Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) > -> Subquery Scan quincyview (cost=518261.35..573580.57 > rows=1005804 width=1172) > -> Unique (cost=518261.35..561008.02 rows=1005804 width=252) > -> Sort (cost=518261.35..520775.86 rows=1005804 > width=252) > Sort Key: quincynoreset.qdatetime, > quincynoreset.id, quincynoreset.name, quincynoreset.cate > gory, quincynoreset.appsversion, quincynoreset.osversion, > quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese > t.details, quincynoreset.devinfo, quincynoreset.email, > quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q > uincynoreset.formfactor, quincynoreset.copied > -> Append (cost=0.00..57003.60 > rows=1005804 width=252) > -> Seq Scan on quincynoreset > (cost=0.00..40011.20 rows=863394 width=242) > Filter: (qdatetime <= now()) > -> Seq Scan on quincytrack > (cost=0.00..6934.36 rows=142410 width=312) > Filter: (qdatetime <= now()) > > Does anybody please have an idea, > how to speed up my select statements? #1 Add indexes on qdatetime on both tables #2 don't try to order/filter by a calculated value. Instead modify the query to order and filter by the raw timestamptz column, which will allow that to be done without converting it all to text first. This will require you to change your aliasing in your query. A possible solution to #2: select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIMEFORMATTED, ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/
Hello Bill and others, On Thu, Oct 13, 2011 at 4:09 PM, Bill Moran <wmoran@potentialtech.com> wrote: > In response to Alexander Farber <alexander.farber@gmail.com>: >> I use PostgreSQL 8.4.7 on CentOS 6.0 / 64 bit machine >> with Quad-Core AMD Opteron(tm) Processor 2352 and >> 16 GB RAM and use it for 1 PHP script - which selects >> and displays data in jQuery DataTables (i.e. an >> HTML-table which can be viewed page by page). >> >> My problem is, that select's are very slow and >> using my script is no fun despite all the AJAX stuff - >> which only tries to retrieve "offset X limit Y" records: >> >> quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as >> QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from >> quincyview where qdatetime <= now() order by QDATETIME desc offset 10 >> limit 10; >> >> >> QUERY PLAN >> >> >> ---------------------------------------------------------------------------------------------------------------------- >> ---------------------------------------------------------------------------------------------------------------------- >> ---------------------------------------------------------------------------------------------------------------------- >> ---------------------------------------------- >> Limit (cost=600344.67..600344.70 rows=10 width=1172) >> -> Sort (cost=600344.65..602859.16 rows=1005804 width=1172) >> Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) >> -> Subquery Scan quincyview (cost=518261.35..573580.57 >> rows=1005804 width=1172) >> -> Unique (cost=518261.35..561008.02 rows=1005804 width=252) >> -> Sort (cost=518261.35..520775.86 rows=1005804 >> width=252) >> Sort Key: quincynoreset.qdatetime, >> quincynoreset.id, quincynoreset.name, quincynoreset.cate >> gory, quincynoreset.appsversion, quincynoreset.osversion, >> quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese >> t.details, quincynoreset.devinfo, quincynoreset.email, >> quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q >> uincynoreset.formfactor, quincynoreset.copied >> -> Append (cost=0.00..57003.60 >> rows=1005804 width=252) >> -> Seq Scan on quincynoreset >> (cost=0.00..40011.20 rows=863394 width=242) >> Filter: (qdatetime <= now()) >> -> Seq Scan on quincytrack >> (cost=0.00..6934.36 rows=142410 width=312) >> Filter: (qdatetime <= now()) >> > > #1 Add indexes on qdatetime on both tables > #2 don't try to order/filter by a calculated value. Instead modify the > query to order and filter by the raw timestamptz column, which will > allow that to be done without converting it all to text first. This > will require you to change your aliasing in your query. > > A possible solution to #2: > select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIMEFORMATTED, > ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO > from quincyview > where qdatetime <= now() > order by QDATETIME desc > offset 10 limit 10; I've added 3 new indices on both tables: quincy=> \d quincynoreset Table "public.quincynoreset" Column | Type | Modifiers -------------+-----------------------------+--------------- appsversion | character varying(30) | beta_prog | character varying(20) | category | character varying(120) | catinfo | character varying(120) | details | character varying(50) | devinfo | character varying(4000) | emailid | character varying(16) | email | character varying(320) | formfactor | character varying(10) | id | character varying(20) | not null imei | character varying(25) | name | character varying(20) | osversion | character varying(30) | pin | character varying(12) | qdatetime | timestamp without time zone | copied | timestamp without time zone | default now() Indexes: "quincynoreset_pkey" PRIMARY KEY, btree (id) "quincynoreset_appsversion_index" btree (appsversion) "quincynoreset_osversion_index" btree (osversion) "quincynoreset_qdatetime_index" btree (qdatetime) And in my query I've renamed the string column to QDATETIME_2 (if I've got your suggestion #2 correctly) - still no visible improvement: quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------- Limit (cost=558551.88..558551.91 rows=10 width=1172) -> Sort (cost=558551.86..560883.79 rows=932773 width=1172) Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) -> Subquery Scan quincyview (cost=482428.59..533731.10 rows=932773 width=1172) -> Unique (cost=482428.59..522071.44 rows=932773 width=252) -> Sort (cost=482428.59..484760.52 rows=932773 width=252) Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.cate gory, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese t.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q uincynoreset.formfactor, quincynoreset.copied -> Append (cost=0.00..55177.71 rows=932773 width=252) -> Seq Scan on quincynoreset (cost=0.00..39171.89 rows=807446 width=242) Filter: (qdatetime <= now()) -> Seq Scan on quincytrack (cost=0.00..6678.09 rows=125327 width=315) Filter: (qdatetime <= now()) (12 rows) (XXX same query below but with QDATETIME_2 as column name XXX): quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME_2,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------- Limit (cost=558551.88..558551.91 rows=10 width=1172) -> Sort (cost=558551.86..560883.79 rows=932773 width=1172) Sort Key: quincyview.qdatetime -> Subquery Scan quincyview (cost=482428.59..533731.10 rows=932773 width=1172) -> Unique (cost=482428.59..522071.44 rows=932773 width=252) -> Sort (cost=482428.59..484760.52 rows=932773 width=252) Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.cate gory, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese t.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q uincynoreset.formfactor, quincynoreset.copied -> Append (cost=0.00..55177.71 rows=932773 width=252) -> Seq Scan on quincynoreset (cost=0.00..39171.89 rows=807446 width=242) Filter: (qdatetime <= now()) -> Seq Scan on quincytrack (cost=0.00..6678.09 rows=125327 width=315) Filter: (qdatetime <= now()) (12 rows) Regards Alex
I've also tried opening cursor: quincy=> open ref for select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc ; ERROR: syntax error at or near "open" LINE 1: open ref for select to_char(qdatetime, 'YYYY-MM-DD') as QDAT... ^
Hello you should to use a DECLARE statement http://www.postgresql.org/docs/9.1/interactive/sql-declare.html and fetch statement http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html Regards Pavel Stehule 2011/10/14 Alexander Farber <alexander.farber@gmail.com>: > I've also tried opening cursor: > > quincy=> open ref for select to_char(qdatetime, 'YYYY-MM-DD') as > QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from > quincyview where qdatetime <= now() order by QDATETIME desc ; > ERROR: syntax error at or near "open" > LINE 1: open ref for select to_char(qdatetime, 'YYYY-MM-DD') as QDAT... > ^ > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Thank you - On Fri, Oct 14, 2011 at 11:30 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > you should to use a DECLARE statement > http://www.postgresql.org/docs/9.1/interactive/sql-declare.html > and fetch statement > http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html I've managed to create a cursor and can fetch the data row by row: quincy=> start TRANSACTION; quincy=> declare XXX cursor for select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc ; quincy=> fetch XXX; ..... quincy=> fetch XXX; ..... But how do I "go back"? For my jQuery HTML table (DataTables.net) I need to be able to go back and forth. Regards Alex
2011/10/14 Alexander Farber <alexander.farber@gmail.com>: > Thank you - > > On Fri, Oct 14, 2011 at 11:30 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: >> you should to use a DECLARE statement >> http://www.postgresql.org/docs/9.1/interactive/sql-declare.html >> and fetch statement >> http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html > > I've managed to create a cursor > and can fetch the data row by row: > > quincy=> start TRANSACTION; > quincy=> declare XXX cursor for select to_char(qdatetime, > 'YYYY-MM-DD') as > QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from > quincyview where qdatetime <= now() order by QDATETIME desc ; > quincy=> fetch XXX; > ..... > quincy=> fetch XXX; > ..... > > But how do I "go back"? > > For my jQuery HTML table (DataTables.net) > I need to be able to go back and forth. > > Regards > Alex you can use a scrollable cursors. BEGIN WORK; -- Set up a cursor: DECLARE liahona SCROLL CURSOR FOR SELECT * FROM films; -- Fetch the first 5 rows in the cursor liahona: FETCH FORWARD 5 FROM liahona; code | title | did | date_prod | kind | len -------+-------------------------+-----+------------+----------+------- BL101 | The Third Man | 101 | 1949-12-23 | Drama | 01:44 BL102 | The African Queen | 101 | 1951-08-11 | Romantic | 01:43 JL201 | Une Femme est une Femme | 102 | 1961-03-12 | Romantic | 01:25 P_301 | Vertigo | 103 | 1958-11-14 | Action | 02:08 P_302 | Becket | 103 | 1964-02-03 | Drama | 02:28 -- Fetch the previous row: FETCH PRIOR FROM liahona; code | title | did | date_prod | kind | len -------+---------+-----+------------+--------+------- P_301 | Vertigo | 103 | 1958-11-14 | Action | 02:08 -- Close the cursor and end the transaction: CLOSE liahona; COMMIT WORK; this example is from doc http://www.postgresql.org/docs/9.1/interactive/sql-fetch.html Regards Pavel > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
On 14 Oct 2011, at 11:14, Alexander Farber wrote: > I've added 3 new indices on both tables: > > > quincy=> \d quincynoreset > Table "public.quincynoreset" > Column | Type | Modifiers > -------------+-----------------------------+--------------- > appsversion | character varying(30) | > beta_prog | character varying(20) | > category | character varying(120) | > catinfo | character varying(120) | > details | character varying(50) | > devinfo | character varying(4000) | > emailid | character varying(16) | > email | character varying(320) | > formfactor | character varying(10) | > id | character varying(20) | not null > imei | character varying(25) | > name | character varying(20) | > osversion | character varying(30) | > pin | character varying(12) | > qdatetime | timestamp without time zone | > copied | timestamp without time zone | default now() > Indexes: > "quincynoreset_pkey" PRIMARY KEY, btree (id) > "quincynoreset_appsversion_index" btree (appsversion) > "quincynoreset_osversion_index" btree (osversion) > "quincynoreset_qdatetime_index" btree (qdatetime) (...) > quincy=> explain select to_char(qdatetime, 'YYYY-MM-DD') as > QDATETIME_2,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO > from quincyview where qdatetime <= now() order by QDATETIME desc > offset 10 limit 10; > > > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------- > Limit (cost=558551.88..558551.91 rows=10 width=1172) > -> Sort (cost=558551.86..560883.79 rows=932773 width=1172) > Sort Key: quincyview.qdatetime > -> Subquery Scan quincyview (cost=482428.59..533731.10 > rows=932773 width=1172) > -> Unique (cost=482428.59..522071.44 rows=932773 width=252) > -> Sort (cost=482428.59..484760.52 rows=932773 width=252) > Sort Key: quincynoreset.qdatetime, > quincynoreset.id, quincynoreset.name, quincynoreset.cate > gory, quincynoreset.appsversion, quincynoreset.osversion, > quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese > t.details, quincynoreset.devinfo, quincynoreset.email, > quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q > uincynoreset.formfactor, quincynoreset.copied > -> Append (cost=0.00..55177.71 > rows=932773 width=252) > -> Seq Scan on quincynoreset > (cost=0.00..39171.89 rows=807446 width=242) > Filter: (qdatetime <= now()) > -> Seq Scan on quincytrack > (cost=0.00..6678.09 rows=125327 width=315) > Filter: (qdatetime <= now()) > (12 rows) An explain analyse would have been a bit more informative. Anyway, I think you get the sequential scans because the UNION requires to sort all the data from both tables to guaranteethat the results are unique (hence that long Sort Key at the 7th line of explain output). For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(), whichis probably the case. It doesn't matter that you only want 10 results from that set, the database will first have to figure out which those rowsare. That gets more complicated because they can come from two different tables, due to the UNION. Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, try UNIONALL instead of UNION. If you do need unique results, then you could create an index on the combination of all those fields. That should take outthe need for those sequential scans. Alban Hertroys -- The scale of a problem often equals the size of an ego.
Hi Alban and others - On Fri, Oct 14, 2011 at 1:34 PM, Alban Hertroys <haramrae@gmail.com> wrote: > Anyway, I think you get the sequential scans because the UNION requires to sort all the data from both tables to guaranteethat the results are unique (hence that long Sort Key at the 7th line of explain output). > For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(),which is probably the case. > > It doesn't matter that you only want 10 results from that set, the database will first have to figure out which those rowsare. That gets more complicated because they can come from two different tables, due to the UNION. > > Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, tryUNION ALL instead of UNION. I don't need unique at all! So I've run "explain analyse" on the old view: quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------- ---------------------------------------------- Limit (cost=559200.14..559200.16 rows=10 width=1172) (actual time=11311.537..11311.541 rows=10 loops=1) -> Sort (cost=559200.11..561534.85 rows=933894 width=1172) (actual time=11311.532..11311.536 rows=20 loops=1) Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) Sort Method: top-N heapsort Memory: 27kB -> Subquery Scan quincyview (cost=482985.36..534349.53 rows=933894 width=1172) (actual time=5778.592..9004. 663 rows=934084 loops=1) -> Unique (cost=482985.36..522675.85 rows=933894 width=254) (actual time=5777.972..7320.816 rows=9340 84 loops=1) -> Sort (cost=482985.36..485320.09 rows=933894 width=254) (actual time=5777.969..6557.012 rows= 934084 loops=1) Sort Key: quincynoreset.qdatetime, quincynoreset.id, quincynoreset.name, quincynoreset.cate gory, quincynoreset.appsversion, quincynoreset.osversion, quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese t.details, quincynoreset.devinfo, quincynoreset.email, quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q uincynoreset.formfactor, quincynoreset.copied Sort Method: external merge Disk: 180992kB -> Append (cost=0.00..55205.73 rows=933894 width=254) (actual time=11.592..2242.501 rows= 934084 loops=1) -> Seq Scan on quincynoreset (cost=0.00..39188.71 rows=808567 width=244) (actual ti me=11.591..1739.695 rows=808647 loops=1) Filter: (qdatetime <= now()) -> Seq Scan on quincytrack (cost=0.00..6678.09 rows=125327 width=315) (actual time= 6.801..298.642 rows=125437 loops=1) Filter: (qdatetime <= now()) Total runtime: 11363.393 ms (15 rows) Then I'v dropped and recreated the view with "union all": quincy=> drop view quincyview ; DROP VIEW quincy=> create view quincyview as quincy-> select quincy-> qdatetime, quincy-> id, quincy-> name, quincy-> category, quincy-> appsversion, quincy-> osversion, quincy-> beta_prog, quincy-> catinfo, quincy-> details, quincy-> devinfo, quincy-> email, quincy-> emailid, quincy-> imei, quincy-> pin, quincy-> formfactor, quincy-> copied quincy-> from quincynoreset quincy-> union all quincy-> select quincy-> qdatetime, quincy-> id, quincy-> name, quincy-> category, quincy-> appsversion, quincy-> osversion, quincy-> beta_prog, quincy-> catinfo, quincy-> details, quincy-> devinfo, quincy-> email, quincy-> emailid, quincy-> imei, quincy-> pin, quincy-> formfactor, quincy-> copied quincy-> from quincytrack quincy-> ; CREATE VIEW Then "explain analyse" on the same select query again: quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from quincyview where qdatetime <= now() order by QDATETIME desc offset 10 limit 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- -------------------------- Limit (cost=73052.13..73052.16 rows=10 width=111) (actual time=3782.645..3782.649 rows=10 loops=1) -> Sort (cost=73052.11..75386.84 rows=933894 width=111) (actual time=3782.640..3782.643 rows=20 loops=1) Sort Key: (to_char(quincynoreset.qdatetime, 'YYYY-MM-DD'::text)) Sort Method: top-N heapsort Memory: 27kB -> Result (cost=0.00..48201.53 rows=933894 width=111) (actual time=0.039..2660.561 rows=934084 loops=1) -> Append (cost=0.00..45866.79 rows=933894 width=111) (actual time=0.021..1239.916 rows=934084 loops= 1) -> Seq Scan on quincynoreset (cost=0.00..39188.71 rows=808567 width=95) (actual time=0.020..916 .249 rows=808647 loops=1) Filter: (qdatetime <= now()) -> Seq Scan on quincytrack (cost=0.00..6678.09 rows=125327 width=215) (actual time=0.030..125.6 49 rows=125437 loops=1) Filter: (qdatetime <= now()) Total runtime: 3782.759 ms (11 rows) Now the script is noticably more enjoyable, thank you! Do I still need to add indices over the whole union and what's the syntax please? I'm also thinking about adding some "pipelining" (i.e. prefetching 5-10 pages for the HTML-table): http://datatables.net/release-datatables/examples/server_side/pipeline.html Regards Alex > > If you do need unique results, then you could create an index on the combination of all those fields. That should takeout the need for those sequential scans.
On 14 Oct 2011, at 13:58, Alexander Farber wrote: > Hi Alban and others - > > On Fri, Oct 14, 2011 at 1:34 PM, Alban Hertroys <haramrae@gmail.com> wrote: >> Anyway, I think you get the sequential scans because the UNION requires to sort all the data from both tables to guaranteethat the results are unique (hence that long Sort Key at the 7th line of explain output). >> For that reason, an index on qdatetime alone won't help much, especially when most of your data has qdatetime <= now(),which is probably the case. >> >> It doesn't matter that you only want 10 results from that set, the database will first have to figure out which thoserows are. That gets more complicated because they can come from two different tables, due to the UNION. >> >> Do you really need unique results from that view, or are duplicates acceptable (one from each table)? In that case, tryUNION ALL instead of UNION. > > I don't need unique at all! > > So I've run "explain analyse" on the old view: > > quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as > QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from > quincyview where qdatetime <= now() order by QDATETIME desc offset 10 > limit 10; > > > QUERY PLAN > > > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------------------------------------------------------------------------------- > ---------------------------------------------- > Limit (cost=559200.14..559200.16 rows=10 width=1172) (actual > time=11311.537..11311.541 rows=10 loops=1) > -> Sort (cost=559200.11..561534.85 rows=933894 width=1172) > (actual time=11311.532..11311.536 rows=20 loops=1) > Sort Key: (to_char(quincyview.qdatetime, 'YYYY-MM-DD'::text)) > Sort Method: top-N heapsort Memory: 27kB > -> Subquery Scan quincyview (cost=482985.36..534349.53 > rows=933894 width=1172) (actual time=5778.592..9004. > 663 rows=934084 loops=1) > -> Unique (cost=482985.36..522675.85 rows=933894 > width=254) (actual time=5777.972..7320.816 rows=9340 > 84 loops=1) > -> Sort (cost=482985.36..485320.09 rows=933894 > width=254) (actual time=5777.969..6557.012 rows= > 934084 loops=1) > Sort Key: quincynoreset.qdatetime, > quincynoreset.id, quincynoreset.name, quincynoreset.cate > gory, quincynoreset.appsversion, quincynoreset.osversion, > quincynoreset.beta_prog, quincynoreset.catinfo, quincynorese > t.details, quincynoreset.devinfo, quincynoreset.email, > quincynoreset.emailid, quincynoreset.imei, quincynoreset.pin, q > uincynoreset.formfactor, quincynoreset.copied > Sort Method: external merge Disk: 180992kB > -> Append (cost=0.00..55205.73 > rows=933894 width=254) (actual time=11.592..2242.501 rows= > 934084 loops=1) > -> Seq Scan on quincynoreset > (cost=0.00..39188.71 rows=808567 width=244) (actual ti > me=11.591..1739.695 rows=808647 loops=1) > Filter: (qdatetime <= now()) > -> Seq Scan on quincytrack > (cost=0.00..6678.09 rows=125327 width=315) (actual time= > 6.801..298.642 rows=125437 loops=1) > Filter: (qdatetime <= now()) > Total runtime: 11363.393 ms > (15 rows) You can paste those in http://explain.depesz.com/ to get an analysis of where the most time gets spent in your query. That'soften a bit quicker at pointing you to the sore points. It shows that much of the time in the query is spent on those sorts, and we just eliminated the worst offender ;) > Then I'v dropped and recreated the view with "union all": > Then "explain analyse" on the same select query again: > > quincy=> explain analyse select to_char(qdatetime, 'YYYY-MM-DD') as > QDATETIME,ID,NAME,CATEGORY,APPSVERSION,OSVERSION,DETAILS,DEVINFO from > quincyview where qdatetime <= now() order by QDATETIME desc offset 10 > limit 10; > QUERY PLAN > > ---------------------------------------------------------------------------------------------------------------------- > -------------------------- > Limit (cost=73052.13..73052.16 rows=10 width=111) (actual > time=3782.645..3782.649 rows=10 loops=1) > -> Sort (cost=73052.11..75386.84 rows=933894 width=111) (actual > time=3782.640..3782.643 rows=20 loops=1) > Sort Key: (to_char(quincynoreset.qdatetime, 'YYYY-MM-DD'::text)) > Sort Method: top-N heapsort Memory: 27kB > -> Result (cost=0.00..48201.53 rows=933894 width=111) > (actual time=0.039..2660.561 rows=934084 loops=1) > -> Append (cost=0.00..45866.79 rows=933894 width=111) > (actual time=0.021..1239.916 rows=934084 loops= > 1) > -> Seq Scan on quincynoreset > (cost=0.00..39188.71 rows=808567 width=95) (actual time=0.020..916 > .249 rows=808647 loops=1) > Filter: (qdatetime <= now()) > -> Seq Scan on quincytrack (cost=0.00..6678.09 > rows=125327 width=215) (actual time=0.030..125.6 > 49 rows=125437 loops=1) > Filter: (qdatetime <= now()) > Total runtime: 3782.759 ms > (11 rows) > > Now the script is noticably more enjoyable, thank you! This time it's spending a large portion of it's time sorting on that to_char function. As Bill mentioned, the qdatetime inthe SELECT list causes that the qdatetime in the ORDER BY uses the "updated definition" from your SELECT list. It doesn'tneed to do that, the actual timestamp is just as good at that and on that column you have an index! > Do I still need to add indices over the whole union > and what's the syntax please? Nope, you only needed that because the query was sorting on all those columns. For the record, an index like that is called a multi-column index and the definition would be: CREATE INDEX quincynoreset_full_idx ON quincynoreset (qdatetime, id, name, category, appsversion, osversion, beta_prog, catinfo,details, devinfo, email, emailid, imei, pin, formfactor, copied); (Don't pay too much attention to the order of columns there, I just took the columns and their respective order from theearlier query plan) > I'm also thinking about adding some "pipelining" > (i.e. prefetching 5-10 pages for the HTML-table): > http://datatables.net/release-datatables/examples/server_side/pipeline.html Firing more selective queries at the database would help as well. It looks like you're implementing some kind of paging through the result set, but do you really think people will want topage through 100,000 pages of results? I think they'd be much happier if you would rank the results somehow, so that they get what they're looking for relativelyquickly. Or you could divide up the information - if they have some idea of when the event they're looking for occurred, that helpsnarrow down the data set a lot and actually increases their chances of finding it. Those are just a few examples, it much depends on the data and the users you're working with. Alban Hertroys -- The scale of a problem often equals the size of an ego.