Discussion: Speed of max
Greetings all!

Quick question: how long should finding the maximum value in a column take?

Table: syslog_event

      Column      |           Type           | Modifiers
------------------+--------------------------+-----------
 event_id         | bigint                   | not null
 ...

Index: syslog_event_pkey

  Column  |  Type
----------+--------
 event_id | bigint
unique btree (primary key)

The query:

explain select event_id, timestamp, clean_message from syslog_event
    order by event_id desc limit 5;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..3.49 rows=5 width=119)
  ->  Index Scan Backward using syslog_event_pkey on syslog_event
      (cost=0.00..2435803.36 rows=3490163 width=119)

and this returns almost instantaneously, as expected.

explain select max(event_id) from syslog_event;
NOTICE:  QUERY PLAN:

Aggregate  (cost=237923.04..237923.04 rows=1 width=8)
  ->  Seq Scan on syslog_event  (cost=0.00..229197.63 rows=3490163 width=8)

This takes forever! Now, shouldn't this query be answered out of the index (possibly with a check for validity if necessary)? I mean, isn't this really equivalent to:

explain select max(event_id) from (select event_id from syslog_event
    order by event_id desc limit 1) as syslog_event;
NOTICE:  QUERY PLAN:

Aggregate  (cost=0.70..0.70 rows=1 width=8)
  ->  Subquery Scan syslog_event  (cost=0.00..0.70 rows=1 width=8)
        ->  Limit  (cost=0.00..0.70 rows=1 width=8)
              ->  Index Scan Backward using syslog_event_pkey on syslog_event
                  (cost=0.00..2435803.36 rows=3490163 width=8)

which flies, as expected? Now, this type of thing gets me really worried about how good the optimizer is. I have a number of fairly complicated queries that are created via criteria from the web, and it would be a pain to have them all "hand-optimized" if I cannot rely on the optimizer at least picking reasonable methods.

Miscellaneous: using 7.2.1 under OpenBSD 3.0.

Ed
Whoops, a little fast on the keyboard. This bit:

> I mean, isn't this really equivalent to:
>
> explain select max(event_id) from (select event_id from syslog_event
>     order by event_id desc limit 1) as syslog_event;
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=0.70..0.70 rows=1 width=8)
>   ->  Subquery Scan syslog_event  (cost=0.00..0.70 rows=1 width=8)
>         ->  Limit  (cost=0.00..0.70 rows=1 width=8)
>               ->  Index Scan Backward using syslog_event_pkey on syslog_event
>                   (cost=0.00..2435803.36 rows=3490163 width=8)

can obviously be shortened to:

explain select event_id from syslog_event order by event_id desc limit 1;
NOTICE:  QUERY PLAN:

Limit  (cost=0.00..0.70 rows=1 width=8)
  ->  Index Scan Backward using syslog_event_pkey on syslog_event
      (cost=0.00..2435803.36 rows=3490163 width=8)

Ed
On Tue, May 14, 2002 at 01:35:47PM -0400, Edmund Dengler wrote:
>
> which flies as expected? Now, this type of thing gets me real worried
> about how good the optimizer really is.

The max() (and min()) operations are not rewritten, as you note. There's no rule for it. There was a recent discussion about this -- check the archives.

There's a similar set of worries, apparently, for NOT EXISTS versus NOT IN. In the vast majority of cases, it seems, NOT EXISTS is considerably faster than NOT IN; but not in all cases. So the queries are executed differently, and never get rewritten one to the other.

The only answer is to try to make sure your criteria-written queries are generated in a way that is usually pretty good. For sure, min() and max(), and NOT IN, and a couple of others (count() comes to mind) are almost always losers, so you should avoid them.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110
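For readers following along, the NOT IN / NOT EXISTS rewrite Andrew is talking about looks roughly like this (the archived_event table is made up for illustration; only the shape of the two queries matters):

```sql
-- Often slow on 7.x: the planner evaluates the subselect as an
-- uncorrelated list and rescans it for every outer row.
SELECT e.event_id
FROM syslog_event e
WHERE e.event_id NOT IN (SELECT a.event_id FROM archived_event a);

-- Usually much faster: the correlated NOT EXISTS can probe an index
-- on archived_event(event_id) once per outer row.
SELECT e.event_id
FROM syslog_event e
WHERE NOT EXISTS (SELECT 1 FROM archived_event a
                  WHERE a.event_id = e.event_id);
```

As Andrew says, the two are not rewritten into each other automatically, so which form your query generator emits matters.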
On Tue, May 14, 2002 at 01:27:30PM -0400, Edmund Dengler wrote:
> Greetings all!
>
> Quick question: how long should finding the maximum value in a column
> take?

This is a known issue: max() requires a sequential scan of the table. Since PostgreSQL allows user-defined aggregates, this is somewhat difficult to optimize. No one has yet bothered to create special cases for max(), min() and perhaps count(), although it could probably be done. I'd suggest simply using the workaround you noticed (ORDER BY, LIMIT 1).

> Now, this type of thing gets me real worried about how good the
> optimizer really is.

The problem doesn't involve the optimizer, AFAICT.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
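The workaround Neil mentions generalizes to min() as well -- the same index just gets scanned in the other direction:

```sql
-- Instead of:  SELECT max(event_id) FROM syslog_event;   (seq scan)
SELECT event_id FROM syslog_event ORDER BY event_id DESC LIMIT 1;

-- The min() equivalent scans the same btree forward:
SELECT event_id FROM syslog_event ORDER BY event_id ASC LIMIT 1;
```

One caveat: on a column that allows NULLs, NULLs sort as larger than every non-NULL value, so the DESC form would need a WHERE event_id IS NOT NULL qualification; event_id here is NOT NULL, so it doesn't matter.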
> Since PostgreSQL allows user-defined aggregates, this is somewhat
> difficult to optimize. No one has yet bothered to create special
> cases for max(), min() and perhaps count(), although it could
> probably be done.

Perhaps you can suggest the fastest way of getting a table count, if it is not

SELECT COUNT(*) FROM x WHERE ...;

Thanks,

Doug
On Tue, May 14, 2002 at 09:18:43PM -0400, Doug Fields wrote:
> > Since PostgreSQL allows user-defined aggregates, this is somewhat
> > difficult to optimize. No one has yet bothered to create special
> > cases for max(), min() and perhaps count(), although it could
> > probably be done.
>
> Perhaps you can suggest the fastest way of getting a table count, if it is
> not
>
> SELECT COUNT(*) FROM x WHERE ...;

Well, if you have a qualification (a WHERE ... clause), then count() can be fast: it depends on the number of rows that match the qualification. I can't see an obvious way to optimize count() on a large subset of a table.

If you don't have a qualification (i.e. SELECT count(*) FROM x), there are a couple of ways to do it: use triggers to increment/decrement a counter of the number of rows in the table; create a btree-indexed SERIAL column and do an ORDER BY serial_column DESC LIMIT 1 on it (note that this is fragile -- your sequence could easily have holes); or, if you only need an approximation, you could use the pg_class attribute "reltuples" for the OID of your table. My impression is that most people choose the first method.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC
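The trigger approach Neil describes could be sketched roughly as follows. The row_counts table, function, and trigger names are made up for illustration; the PL/pgSQL syntax (opaque return type, TG_OP) is what the 7.2-era docs describe:

```sql
CREATE TABLE row_counts (
    table_name  name PRIMARY KEY,
    n_rows      bigint NOT NULL
);
INSERT INTO row_counts VALUES ('x', 0);

CREATE FUNCTION count_x() RETURNS opaque AS '
BEGIN
    IF TG_OP = ''INSERT'' THEN
        UPDATE row_counts SET n_rows = n_rows + 1 WHERE table_name = ''x'';
    ELSE  -- DELETE
        UPDATE row_counts SET n_rows = n_rows - 1 WHERE table_name = ''x'';
    END IF;
    RETURN NULL;
END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER x_count AFTER INSERT OR DELETE ON x
    FOR EACH ROW EXECUTE PROCEDURE count_x();

-- The "count" is then a single-row lookup:
SELECT n_rows FROM row_counts WHERE table_name = 'x';
```

Note the cost: every insert or delete on x now also updates the one counter row, so concurrent writers contend for it -- which connects to Ed's and Martijn's points below about this not being free.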
Greetings all!

On Tue, 14 May 2002, Neil Conway wrote:
> On Tue, May 14, 2002 at 09:18:43PM -0400, Doug Fields wrote:
> > Perhaps you can suggest the fastest way of getting a table count, if it is
> > not
> >
> > SELECT COUNT(*) FROM x WHERE ...;
>
> Well, if you have a qualification (a WHERE ... clause), then count()
> can be fast: it depends on the number of rows that match the
> qualification. I can't see an obvious way to optimize count() on a
> large subset of a table.
>
> If you don't have a qualification (i.e. SELECT count(*) FROM x), there
> are a couple ways to do it: use triggers to increment/decrement a
> counter of the number of rows in the table, create a btree-indexed
> SERIAL column and do an ORDER BY serial_column DESC LIMIT 1 on it (note
> that this is fragile, your sequence could easily have holes), or if you
> only need an approximation, you could use the pg_class attribute
> "reltuples" for the OID of your table. My impression is that most
> people choose the first method.

Unfortunately, for any database used in a drilldown manner (i.e., allowing the almost arbitrary addition of constraints to select subsets -- note that this is quite a common configuration for data-mining systems!), the use of triggers is almost a non-starter. It basically requires that you know _what_ queries will be run, and that you rewrite those queries to use the special table. In addition, even with a limited set of queries, each trigger slows down inserts, potentially to the point of being slower than the data arrival rate (I have a system where this could be the case very soon, in which event PostgreSQL will unfortunately have to be replaced by something else).

Is there any timeline in which these problems will be fixed/alleviated, or should I be looking at alternatives now?

Regards,
Ed
Neil Conway wrote:
> [...]
> If you don't have a qualification (i.e. SELECT count(*) FROM x), there
> are a couple ways to do it: use triggers to increment/decrement a
> counter of the number of rows in the table, create a btree-indexed
> SERIAL column and do an ORDER BY serial_column DESC LIMIT 1 on it (note
> that this is fragile, your sequence could easily have holes), or if you
> only need an approximation, you could use the pg_class attribute
> "reltuples" for the OID of your table. My impression is that most
> people choose the first method.

The real question is: why is reltuples only an approximation?
On Wed, May 15, 2002 at 10:24:55AM -0400, Jean-Luc Lachance wrote:
> The real question is:
>
> Why is reltuples only an approximation?

It's only an approximation because it is updated by VACUUM. It's used to estimate the cost of queries.

Secondly, remember that there is not really a canonical number-of-tuples-in-a-table. If you start a transaction and insert a row, you'll see one more row than any other transaction running at the time. If you're using a trigger to keep a count of the total, you'll get different answers depending on whether your trigger is deferred or not. I have no idea what happens if the trigger is not deferred but a transaction aborts. Deadlock?

I wish people would remember this before declaring the total number of tuples in a table a trivial problem.

HTH,
--
Martijn van Oosterhout <kleptog@svana.org>   http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
Just curious, but I have often needed only a "fairly" accurate count of rows -- that is, outside the context of a transaction. In other words, how difficult would it be to keep track of the number of fully committed and visible rows in a table?

I would be willing to accept a limitation such as: I open a transaction, delete a bunch of rows, and before committing my transaction get a row count that returns the number of rows in the table "outside of" my transaction (so the count would still include the rows I have just deleted inside my transaction). Would it be difficult to count the number of rows in a table that exist outside of all transactions?
Andy DePue <adepue@eworksmart.com> writes:
> Just curious... but often, I have only needed a "fairly" accurate
> count of rows. As in, outside the context of a transaction. In other
> words, how difficult would it be to keep track of the number of fully
> committed and visible rows in a table?

Well, if you run vacuum (or analyze) on a reasonably frequent basis, seems like pg_class.reltuples would do fine.

			regards, tom lane
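For completeness, the pg_class.reltuples lookup Tom suggests is itself a single-row catalog query, so it costs essentially nothing:

```sql
-- Approximate row count for table x, as of the last VACUUM/ANALYZE:
SELECT reltuples FROM pg_class WHERE relname = 'x';
```

The number goes stale between vacuums, which is exactly the "approximation" trade-off discussed above.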