Обсуждение: Need SQL Help Finding Current Status of members

Поиск
Список
Период
Сортировка

Need SQL Help Finding Current Status of members

От
"Michael Avila"
Дата:
I have a table which keeps track of the status of members. In the table is


member_id int(8)
status_code char(1)
status_date date

KEY member_id (member_id,status_code,status_date)


Each member can have multiple records because a record is added each time
the status changes but the old record is kept for history.

What I want to do is find the latest status for each member. Actually I want
to find all those with an status of "A". But it must be the current (latest)
status. How do I find the most current date for each member in a pile of
many records for many members with many status settings with one SQL
statement? This is a bit beyond my capabilities so I am asking for help.

My initial SQL is

SELECT * FROM memberstatus WHERE status_code = 'a'

but that is my limit. I know an AND comes next but need help after that! I
know that MAX is not for dates so that is out. Is there a LATEST DATE? I did
not see one when I was looking at the date functions.

Appreciate the help.

Mike


Вложения

Re: Need SQL Help Finding Current Status of members

От
Michael Fuhr
Дата:
On Thu, Dec 15, 2005 at 08:31:09PM -0500, Michael Avila wrote:
> What I want to do is find the latest status for each member. Actually I want
> to find all those with an status of "A". But it must be the current (latest)
> status. How do I find the most current date for each member in a pile of
> many records for many members with many status settings with one SQL
> statement?

Suppose you have this table:

SELECT * FROM memberstatus;
member_id | status_code | status_date 
-----------+-------------+-------------        1 | a           | 2005-01-01        2 | x           | 2005-01-01
3| x           | 2005-01-01        4 | x           | 2005-01-01        1 | x           | 2005-12-15        2 | a
  | 2005-12-15        3 | y           | 2005-12-15        4 | a           | 2005-12-15
 
(8 rows)

Let's order the data so all of a member's records are shown together,
with the latest one first:

SELECT * FROM memberstatus
ORDER BY member_id, status_date DESC;
member_id | status_code | status_date 
-----------+-------------+-------------        1 | x           | 2005-12-15        1 | a           | 2005-01-01
2| a           | 2005-12-15        2 | x           | 2005-01-01        3 | y           | 2005-12-15        3 | x
  | 2005-01-01        4 | a           | 2005-12-15        4 | x           | 2005-01-01
 
(8 rows)

One way to get only the first record for each member is to use
PostgreSQL's nonstandard DISTINCT ON construct:

SELECT DISTINCT ON (member_id) * FROM memberstatus
ORDER BY member_id, status_date DESC;
member_id | status_code | status_date 
-----------+-------------+-------------        1 | x           | 2005-12-15        2 | a           | 2005-12-15
3| y           | 2005-12-15        4 | a           | 2005-12-15
 
(4 rows)

We could put the above in a subquery and restrict the output to the
records we want:

SELECT * FROM ( SELECT DISTINCT ON (member_id) * FROM memberstatus ORDER BY member_id, status_date DESC
) AS s
WHERE status_code = 'a'
ORDER BY member_id;
member_id | status_code | status_date 
-----------+-------------+-------------        2 | a           | 2005-12-15        4 | a           | 2005-12-15
(2 rows)

This isn't the only way; search the archives for alternatives.

-- 
Michael Fuhr


Re: Need SQL Help Finding Current Status of members

От
Richard Huxton
Дата:
Michael Avila wrote:
> I have a table which keeps track of the status of members. In the table is 
> 
> 
> member_id int(8)
> status_code char(1)
> status_date date 
> 
> KEY member_id (member_id,status_code,status_date)
> 
> 
> Each member can have multiple records because a record is added each time
> the status changes but the old record is kept for history.
> 
> What I want to do is find the latest status for each member.

Michael Fuhr has already described on solution, but if you can alter the 
table definition then there might be a neater solution.

Replace "status_date" with "status_expires" and make it a "timestamp 
with time zone". Set the expiry to 'infinity' for the current record and  you then have a simple select to find the
mostrecent.
 

If you regularly want to find which record was active on a particular 
time you'll want two columns: valid_from and valid_to. This makes it 
much easier to find a row for a specific date.
--   Richard Huxton  Archonet Ltd


Re: Need SQL Help Finding Current Status of members

От
Patrick JACQUOT
Дата:
Richard Huxton wrote:

> Michael Avila wrote:
>
>> I have a table which keeps track of the status of members. In the 
>> table is
>>
>> member_id int(8)
>> status_code char(1)
>> status_date date
>> KEY member_id (member_id,status_code,status_date)
>>
>>
>> Each member can have multiple records because a record is added each 
>> time
>> the status changes but the old record is kept for history.
>>
>> What I want to do is find the latest status for each member.
>
>
> Michael Fuhr has already described on solution, but if you can alter 
> the table definition then there might be a neater solution.
>
> Replace "status_date" with "status_expires" and make it a "timestamp 
> with time zone". Set the expiry to 'infinity' for the current record 
> and  you then have a simple select to find the most recent.
>
> If you regularly want to find which record was active on a particular 
> time you'll want two columns: valid_from and valid_to. This makes it 
> much easier to find a row for a specific date.

There is a standard way :

Select * from memberstatus A where not exists  (select * from emberstatus B where B.member_id=A.member_id and  
B.status_date >A.status_date)


Re: Need SQL Help Finding Current Status of members

От
"Michael Avila"
Дата:
Interesting. I think I understand that. I have never worked with a SELECT
within a SELECT (I think that is called a subquery). I am guessing that it
works its way through the member status records until the latest date
"floats" to the top (nothing is > than it).

Will that be a problem performance-wise if there are thousands of records?

Thanks for the help.

Mike


> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Patrick JACQUOT
> Sent: Friday, December 16, 2005 5:12 AM
> Cc: SQL PostgreSQL MailList
> Subject: Re: [SQL] Need SQL Help Finding Current Status of members
>
>
> Richard Huxton wrote:
>
> > Michael Avila wrote:
> >
> >> I have a table which keeps track of the status of members. In the
> >> table is
> >>
> >> member_id int(8)
> >> status_code char(1)
> >> status_date date
> >> KEY member_id (member_id,status_code,status_date)
> >>
> >>
> >> Each member can have multiple records because a record is added each
> >> time
> >> the status changes but the old record is kept for history.
> >>
> >> What I want to do is find the latest status for each member.
> >
> >
> > Michael Fuhr has already described on solution, but if you can alter
> > the table definition then there might be a neater solution.
> >
> > Replace "status_date" with "status_expires" and make it a "timestamp
> > with time zone". Set the expiry to 'infinity' for the current record
> > and  you then have a simple select to find the most recent.
> >
> > If you regularly want to find which record was active on a particular
> > time you'll want two columns: valid_from and valid_to. This makes it
> > much easier to find a row for a specific date.
>
> There is a standard way :
>
> Select * from memberstatus A where not exists
>    (select * from emberstatus B where B.member_id=A.member_id and
> B.status_date >A.status_date)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings



Re: Need SQL Help Finding Current Status of members

От
Alvaro Herrera
Дата:
Michael Avila wrote:
> Interesting. I think I understand that. I have never worked with a SELECT
> within a SELECT (I think that is called a subquery). I am guessing that it
> works its way through the member status records until the latest date
> "floats" to the top (nothing is > than it).
>
> Will that be a problem performance-wise if there are thousands of records?

The actual execution of the query may be something completely different,
depending on what the optimizer decides.  I doubt it will suck much,
because the optimizer is quite good.  Try executing

EXPLAIN query;

to see how it would be executed, and 

EXPLAIN ANALYZE query;

to see the above plus the real numbers that the executor got after
executing it (number of times each loop was executed, time spent at each
execution step, etc).

Read the ANALYZE page to see how are these things supposed to be read.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Need SQL Help Finding Current Status of members

От
"Michael Avila"
Дата:
Just tried it and it returned nothing.

> > Select * from memberstatus A where not exists
> >    (select * from emberstatus B where B.member_id=A.member_id and
> > B.status_date >A.status_date)

Why the WHERE NOT EXISTS?

Here is a copy and paste of my code

$query = "SELECT * FROM memberstatus A WHERE NOT EXISTS (SELECT * from
memberstatus B WHERE B.member_id=A.member_id AND B.status_date >
A.status_date)";

Mike


> -----Original Message-----
> From: Michael Avila [mailto:Michael.Avila.1@sbcglobal.net]
> Sent: Friday, December 16, 2005 8:46 AM
> To: SQL PostgreSQL MailList
> Subject: RE: [SQL] Need SQL Help Finding Current Status of members
>
>
> Interesting. I think I understand that. I have never worked with
> a SELECT within a SELECT (I think that is called a subquery). I
> am guessing that it works its way through the member status
> records until the latest date "floats" to the top (nothing is > than it).
>
> Will that be a problem performance-wise if there are thousands of records?
>
> Thanks for the help.
>
> Mike
>
>
> > -----Original Message-----
> > From: pgsql-sql-owner@postgresql.org
> > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Patrick JACQUOT
> > Sent: Friday, December 16, 2005 5:12 AM
> > Cc: SQL PostgreSQL MailList
> > Subject: Re: [SQL] Need SQL Help Finding Current Status of members
> >
> >
> > Richard Huxton wrote:
> >
> > > Michael Avila wrote:
> > >
> > >> I have a table which keeps track of the status of members. In the
> > >> table is
> > >>
> > >> member_id int(8)
> > >> status_code char(1)
> > >> status_date date
> > >> KEY member_id (member_id,status_code,status_date)
> > >>
> > >>
> > >> Each member can have multiple records because a record is added each
> > >> time
> > >> the status changes but the old record is kept for history.
> > >>
> > >> What I want to do is find the latest status for each member.
> > >
> > >
> > > Michael Fuhr has already described on solution, but if you can alter
> > > the table definition then there might be a neater solution.
> > >
> > > Replace "status_date" with "status_expires" and make it a "timestamp
> > > with time zone". Set the expiry to 'infinity' for the current record
> > > and  you then have a simple select to find the most recent.
> > >
> > > If you regularly want to find which record was active on a particular
> > > time you'll want two columns: valid_from and valid_to. This makes it
> > > much easier to find a row for a specific date.
> >
> > There is a standard way :
> >
> > Select * from memberstatus A where not exists
> >    (select * from emberstatus B where B.member_id=A.member_id and
> > B.status_date >A.status_date)
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 5: don't forget to increase your free space map settings



Re: Need SQL Help Finding Current Status of members

От
Michael Fuhr
Дата:
On Fri, Dec 16, 2005 at 07:44:46PM -0500, Michael Avila wrote:
> Just tried it and it returned nothing.
> 
> > > Select * from memberstatus A where not exists
> > >    (select * from emberstatus B where B.member_id=A.member_id and
> > > B.status_date >A.status_date)
> 
> Why the WHERE NOT EXISTS?

The query selects each row in memberstatus for which no other rows
(WHERE NOT EXISTS) with the same member_id have a later status_date;
in other words, the row(s) with the latest status_date for each
member_id.  For example, given
member_id | status_code | status_date 
-----------+-------------+-------------        1 | a           | 2005-12-01        1 | b           | 2005-12-02
1| c           | 2005-12-03        2 | x           | 2005-12-11        2 | y           | 2005-12-12        2 | z
  | 2005-12-13
 

the query should return
member_id | status_code | status_date 
-----------+-------------+-------------        1 | c           | 2005-12-03        2 | z           | 2005-12-13

Offhand I can't think of why the query would return nothing unless
the table is empty, but maybe I'm overlooking something or making
unwarranted assumptions about the data.  Can you post a sample data
set for which the query returns no rows?

-- 
Michael Fuhr


Re: Need SQL Help Finding Current Status of members

От
"Michael Avila"
Дата:
The table is not empty. I did some playing around with the SQL Statement but
got no where. So I added a record status column that will be non-NULL when
the status becomes history because a new status is added. Yeah, I cheated to
make it easy!! LOL But it now works and that is what counts!

Thanks for the help.

Mike


> -----Original Message-----
> From: Michael Fuhr [mailto:mike@fuhr.org]
> Sent: Saturday, December 17, 2005 7:26 PM
> To: Michael Avila
> Cc: SQL PostgreSQL MailList
> Subject: Re: [SQL] Need SQL Help Finding Current Status of members
>
>
> On Fri, Dec 16, 2005 at 07:44:46PM -0500, Michael Avila wrote:
> > Just tried it and it returned nothing.
> >
> > > > Select * from memberstatus A where not exists
> > > >    (select * from emberstatus B where B.member_id=A.member_id and
> > > > B.status_date >A.status_date)
> >
> > Why the WHERE NOT EXISTS?
>
> The query selects each row in memberstatus for which no other rows
> (WHERE NOT EXISTS) with the same member_id have a later status_date;
> in other words, the row(s) with the latest status_date for each
> member_id.  For example, given
>
>  member_id | status_code | status_date
> -----------+-------------+-------------
>          1 | a           | 2005-12-01
>          1 | b           | 2005-12-02
>          1 | c           | 2005-12-03
>          2 | x           | 2005-12-11
>          2 | y           | 2005-12-12
>          2 | z           | 2005-12-13
>
> the query should return
>
>  member_id | status_code | status_date
> -----------+-------------+-------------
>          1 | c           | 2005-12-03
>          2 | z           | 2005-12-13
>
> Offhand I can't think of why the query would return nothing unless
> the table is empty, but maybe I'm overlooking something or making
> unwarranted assumptions about the data.  Can you post a sample data
> set for which the query returns no rows?
>
> --
> Michael Fuhr



Re: Need SQL Help Finding Current Status of members

От
Michael Fuhr
Дата:
On Sat, Dec 17, 2005 at 07:34:22PM -0500, Michael Avila wrote:
> The table is not empty. I did some playing around with the SQL Statement but
> got no where. So I added a record status column that will be non-NULL when
> the status becomes history because a new status is added. Yeah, I cheated to
> make it easy!! LOL But it now works and that is what counts!

Even if you're satisfied with whatever you ended up doing, it might
still be educational to find out why the query didn't work.  If you
post a test case we'll take a look; we might even be able to suggest
a better solution than the "cheat" you used.

-- 
Michael Fuhr