Обсуждение: Testing 9.2 in ~production environment
I'm giving 9.2-beta2 a test simulating a production workflow. Everything looks OK except the speed. Most (all?) queries take about five to six times as long as they do with 9.1. The configurations are essentially the same, the query plans are the same. A (hot) example, pulled semi-randomly from a run, with the names mangled to protect the innocent: ===================================== 9.1 =====================================Nested Loop (cost=0.00..26.92 rows=1 width=28)(actual time=0.114..0.514 rows=19 loops=1) -> Index Scan using ms_pkey on ms msg (cost=0.00..26.03 rows=1 width=20)(actual time=0.026..0.207 rows=19 loops=1) Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222)) Filter: (status = ANY ('{0,1,2}'::integer[])) -> Index Scan using ph_pkey on ph pm (cost=0.00..0.87 rows=1width=16) (actual time=0.010..0.010 rows=1 loops=19) Index Cond: (id = msg.ph_id)Total runtime: 0.605 ms ===================================== 9.2 =====================================Nested Loop (cost=0.00..30.12 rows=1 width=28)(actual time=0.439..2.540 rows=19 loops=1) -> Index Scan using ms_pkey on ms msg (cost=0.00..29.18 rows=1 width=20)(actual time=0.155..1.157 rows=19 loops=1) Index Cond: ((ms_id >= 407) AND (ms_id <= 435) AND (mb_id = 50222)) Filter: (status = ANY ('{0,1,2}'::integer[])) -> Index Scan using ph_pkey on ph pm (cost=0.00..0.93 rows=1width=16) (actual time=0.053..0.054 rows=1 loops=19) Index Cond: (id = msg.ph_id)Total runtime: 2.752 ms All of the tables and indices for the run in question fit into ram. The effective cache, work mem, costs, etc were optimized in 9.0, and kept for 9.1 and the beta. That the plans are the same suggests that isn't the problem, yes? I think I recall mention from a previous beta (but goog isn't helping me confirm) that there is some extra debugging or such enabled in the betas. If so, and if turning that off would provide a better comparison, where in the src should I look? -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
Hi, On Monday, June 18, 2012 12:51:51 AM James Cloos wrote: > I'm giving 9.2-beta2 a test simulating a production workflow. > > Everything looks OK except the speed. Most (all?) queries take about > five to six times as long as they do with 9.1. > > The configurations are essentially the same, the query plans are the same. Is it possible that you compiled with assertions enabled? That would roughly fit that magnitude. SHOW debug_assertions; Should show you whether it was enabled. Greetings, Andres
On sön, 2012-06-17 at 18:51 -0400, James Cloos wrote: > I think I recall mention from a previous beta (but goog isn't helping > me confirm) that there is some extra debugging or such enabled in the > betas. That depends on how you built it. Just being a beta by itself doesn't turn on any extra debugging. > > If so, and if turning that off would provide a better comparison, > where in the src should I look? Compare the output of pg_config --configure from both installations.
>>>>> "AF" == Andres Freund <andres@2ndquadrant.com> writes: AF> Is it possible that you compiled with assertions enabled? That would roughly AF> fit that magnitude. SHOW debug_assertions; Should show you whether it was AF> enabled. Thanks, but SHOW debug_assertions reports off. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
>>>>> "PE" == Peter Eisentraut <peter_e@gmx.net> writes: PE> That depends on how you built it. Just being a beta by itself doesn't PE> turn on any extra debugging. OK. So either I misremembered or it was something no longer done. PE> That depends on how you built it. Its a Gentoo box; both were build from their ebuilds, with the same gcc, flags, etc. PE> Compare the output of pg_config --configure from both installations. The only differences are 9.1 vs 9.2 in the paths. Thanks, -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
> PE> Compare the output of pg_config --configure from both installations. > > The only differences are 9.1 vs 9.2 in the paths. Can you check the collations of the two databases? I'm wondering if 9.1 is in "C" collation and 9.2 is something else. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
>>>>> "JB" == Josh Berkus <josh@agliodbs.com> writes: JB> Can you check the collations of the two databases? I'm wondering if 9.1 JB> is in "C" collation and 9.2 is something else. Thanks! pg_dump -C tells me these two differences: -SET client_encoding = 'SQL_ASCII';+SET client_encoding = 'UTF8'; -CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' LC_COLLATE = 'C' LC_CTYPE = 'C';+CREATE DATABASE dbmWITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'C' LC_CTYPE = 'en_US.UTF-8'; for every db in the clusters. I presume that lc_ctype is the significant difference? LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2 cluster, so it must have been overridden by pg_restore. I see that my dist's /etc rc script now sets LC_CTYPE. Would that explain why lc_ctype changed between the two clusters? Is there any way to alter a db's lc_ctype w/o dumping and restoring? I want to preserve some of the changes made since I copied the 9.1 cluster. Alter database reports that lc_ctype cannot be changed. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote: > >>>>> "JB" == Josh Berkus <josh@agliodbs.com> writes: > > JB> Can you check the collations of the two databases? I'm wondering if 9.1 > JB> is in "C" collation and 9.2 is something else. > > Thanks! > > pg_dump -C tells me these two differences: > > -SET client_encoding = 'SQL_ASCII'; > +SET client_encoding = 'UTF8'; > > -CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'SQL_ASCII' LC_COLLATE = 'C' LC_CTYPE = 'C'; > +CREATE DATABASE dbm WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 'C' LC_CTYPE = 'en_US.UTF-8'; > > for every db in the clusters. > > I presume that lc_ctype is the significant difference? It certainly makes some difference, but it's a bit shocking that makes things that much slower. > LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2 > cluster, so it must have been overridden by pg_restore. I see that my > dist's /etc rc script now sets LC_CTYPE. Would that explain why lc_ctype > changed between the two clusters? It's possible, depending on how exactly the start up script maze is set up on your particular OS. > Is there any way to alter a db's lc_ctype w/o dumping and restoring? I > want to preserve some of the changes made since I copied the 9.1 cluster. > Alter database reports that lc_ctype cannot be changed. Not really, but in practice you can probably just update pg_database directly. If you don't have any case-insensitive indexes, nothing should change. Worst case, reindex everything.
Peter Eisentraut <peter_e@gmx.net> writes: > On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote: >> I presume that lc_ctype is the significant difference? > It certainly makes some difference, but it's a bit shocking that makes > things that much slower. If James is testing text-comparison-heavy operations, it doesn't seem shocking in the least. strcoll() in most non-C locales is a pig. >> LC_CTYPE *is* specified as 'C' in the dump from which I created the 9.2 >> cluster, so it must have been overridden by pg_restore. I see that my >> dist's /etc rc script now sets LC_CTYPE. Would that explain why lc_ctype >> changed between the two clusters? > It's possible, depending on how exactly the start up script maze is set > up on your particular OS. pg_dumpall should generate a script that correctly restores database locales. However, pg_dump+pg_restore is dependent on user creation of the specific database, which is likely to be environment sensitive. We really oughta do something about that ... regards, tom lane
On tis, 2012-06-19 at 02:38 -0400, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On mån, 2012-06-18 at 17:57 -0400, James Cloos wrote: > >> I presume that lc_ctype is the significant difference? > > > It certainly makes some difference, but it's a bit shocking that > makes > > things that much slower. > > If James is testing text-comparison-heavy operations, it doesn't seem > shocking in the least. strcoll() in most non-C locales is a pig. Ah yes, of course, having lc_ctype != C also selects strcoll instead of strcmp.
Peter Eisentraut <peter_e@gmx.net> writes: > On tis, 2012-06-19 at 02:38 -0400, Tom Lane wrote: >> If James is testing text-comparison-heavy operations, it doesn't seem >> shocking in the least. strcoll() in most non-C locales is a pig. > Ah yes, of course, having lc_ctype != C also selects strcoll instead of > strcmp. Come to think of it, another possible factor is that LIKE can't use ordinary indexes on text if the locale isn't C. regards, tom lane
On tis, 2012-06-19 at 09:33 -0400, Tom Lane wrote: > Come to think of it, another possible factor is that LIKE can't use > ordinary indexes on text if the locale isn't C. But he reported that the plans are the same.
Updating pg_database to set datctype='C' did solve the speed issues with the two largs dbs. Presumably, since LC_CTYPE=en_US.UTF-8 was in the env when I ran pg_restore, it overrode the ctype setting in the dump files. Some of the slow selects do use ilike; even w/ datctype='C' the indices are skipped for at least this query: # explain analyze SELECT mb_id FROM mb WHERE name ILIKE 'foo@bar' AND ownerid=7; QUERY PLAN -------------------------------------------------------------------------------------------------------------Seq Scan onmb (cost=0.00..570.96 rows=3 width=4) (actual time=9.443..25.039 rows=1 loops=1) Filter: ((name ~~* 'foo@bar'::text) AND(ownerid = 7)) Rows Removed by Filter: 34827Total runtime: 25.071 ms (4 rows) The mb table has several indices, including separate ones on name and ownerid. (not my design, btw. And I really do need to re-write the middleware....) Whether it is strcoll(3) (even though LC_COLLATE is explicitly C) or LIKE, it does make a significant difference for those two apps. -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
As a followup, I find that I can avoid the seq scan by adding an index to that table as: create index mb_name_own_idx on mb ( lower(name), ownerid ); and changing the query from using the idiom: WHERE name ILIKE 'foo@bar' AND ownerid=7; to using: WHERE lower(name) = lower('foo@bar') AND ownerid=7; which saves 20+ ms on each of the 30+ k such selects in a full run. I haven't tested how fast it would be with that change and a utf8 ctype. Because of how the middleware achives its portability between pg, my et al, changing it to use lower and = will require significant surgery. Is there any way to specify the index such that the ILIKE query will use said index? -JimC -- James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
James Cloos <cloos@jhcloos.com> wrote: > create index mb_name_own_idx on mb ( lower(name), ownerid ); > WHERE lower(name) = lower('foo@bar') AND ownerid=7; If you expect to be using an equality test on ownerid, you should put that first in the index. BTW, this is starting to sound more like something for the pgsql-performance list than the pgsql-hackers list. -Kevin