Обсуждение: select is fast, update based on same where clause is slow
I have a select statement that returns a set of 74,000+ results back in under a minute as follows: select s.sessid, s.membid, s.datetime from sessions2 s, (select min(datetime) as datetime, membid from sessions2 where membid is not null group by membid) as minsess where s.membid = minsess.membid and s.datetime = minsess.datetime; The final cost from explain for the above select is 22199.15 ... 24318.40 with rows = 5 and width = 28. Then I issue an update as follows (to update those 74,000+ rows): update sessions2 set sinceinception = 0 from sessions2 s, (select min(datetime) as datetime, membid from sessions2 group by membid) as mindate where s.membid = mindate.membid and s.datetime = mindate.datetime; The final cost from explain for the above update is 31112.11...98869.91 with rows = 2013209 and width=87. This update statement has been left running over night and does not complete. The ram usage on the machine is at about 3/4 capacity (800mb) during this time and CPU usage is near 100%. The machine has the -F option set and memory segments of 200mb and is running 7.1.2. What could be causing this update statement to not complete? Why are the costs so different since it seems to me that besides the cost of the update they are the same query? Any help would be great! Jeff Barrett
On Fri, 21 Sep 2001, Jeff Barrett wrote: > I have a select statement that returns a set of 74,000+ results back in > under a minute as follows: > > select s.sessid, s.membid, s.datetime > from sessions2 s, (select min(datetime) as datetime, membid > from sessions2 > where membid is not null > group by membid) as minsess > where s.membid = minsess.membid > and s.datetime = minsess.datetime; > > The final cost from explain for the above select is 22199.15 ... 24318.40 > with rows = 5 and width = 28. > > Then I issue an update as follows (to update those 74,000+ rows): > update sessions2 set sinceinception = 0 > from sessions2 s, (select min(datetime) as datetime, membid from sessions2 > group by membid) as mindate > where s.membid = mindate.membid > and s.datetime = mindate.datetime; > > The final cost from explain for the above update is 31112.11...98869.91 with > rows = 2013209 and width=87. > > This update statement has been left running over night and does not > complete. The ram usage on the machine is at about 3/4 capacity (800mb) > during this time and CPU usage is near 100%. The machine has the -F option > set and memory segments of 200mb and is running 7.1.2. > > What could be causing this update statement to not complete? > Why are the costs so different since it seems to me that besides the cost of > the update they are the same query? I thought that the updated table is always in your from list (implicitly), so you'd want: update sessions2 set sinceinception = 0 from (select min(datetime) as datetime, membid from sessions2 group bymembid) as mindate where sessions2.membid=mindate.membid and sessions2.datetime=mindate.datetime; I think your query would be joining the s/mindate results against sessions2.
Jeff, I think that you're running into a syntactical problem here: > Then I issue an update as follows (to update those 74,000+ rows): > update sessions2 set sinceinception = 0 > from sessions2 s, (select min(datetime) as datetime, membid from > sessions2 > group by membid) as mindate > where s.membid = mindate.membid > and s.datetime = mindate.datetime; In the query above, there is no linkage between the instance of sessions2 (sessions2) you are updating and the instance you are limiting (s). As a result, you are running an update on 74,000^2 rows (about 55 trillion), and if the query ever completed you would find that sinceinception would be set to 0 for all rows in sessions2. Your query should read: update sessions2 set sinceinception = 0from (select min(datetime) as datetime, membid from sessions2 group by membid)as mindatewhere sessions2.membid = mindate.membid and sessions2.datetime = mindate.datetime; It's a peculiarity of Update queries that the name of the updated table cannot be aliased anywhere in the query. To add to the confusion, in MS Transact SQL, you are expected to repeat the name of the updated table in the FROM clause, while in Postgres such repetition is prohibited. This is mainly due to the fact that UPDATE ... FROM is not well-defined in the SQL 92 standard. -Josh ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
That did the trick. Thank you for the quick detailed answer. It runs in about a minute now. Jeff Barrett "Stephan Szabo" <sszabo@megazone23.bigpanda.com> wrote in message news:Pine.BSF.4.21.0109210830550.88512-100000@megazone23.bigpanda.com... > On Fri, 21 Sep 2001, Jeff Barrett wrote: > > > I have a select statement that returns a set of 74,000+ results back in > > under a minute as follows: > > > > select s.sessid, s.membid, s.datetime > > from sessions2 s, (select min(datetime) as datetime, membid > > from sessions2 > > where membid is not null > > group by membid) as minsess > > where s.membid = minsess.membid > > and s.datetime = minsess.datetime; > > > > The final cost from explain for the above select is 22199.15 ... 24318.40 > > with rows = 5 and width = 28. > > > > Then I issue an update as follows (to update those 74,000+ rows): > > update sessions2 set sinceinception = 0 > > from sessions2 s, (select min(datetime) as datetime, membid from sessions2 > > group by membid) as mindate > > where s.membid = mindate.membid > > and s.datetime = mindate.datetime; > > > > The final cost from explain for the above update is 31112.11...98869.91 with > > rows = 2013209 and width=87. > > > > This update statement has been left running over night and does not > > complete. The ram usage on the machine is at about 3/4 capacity (800mb) > > during this time and CPU usage is near 100%. The machine has the -F option > > set and memory segments of 200mb and is running 7.1.2. > > > > What could be causing this update statement to not complete? > > Why are the costs so different since it seems to me that besides the cost of > > the update they are the same query? > > I thought that the updated table is always in your from list (implicitly), > so you'd want: > update sessions2 set sinceinception = 0 > from (select min(datetime) as datetime, membid from sessions2 group by > membid) as mindate > where sessions2.membid=mindate.membid and > sessions2.datetime=mindate.datetime; > > I think your query would be joining the s/mindate results against > sessions2. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly