Обсуждение: grouping subsets

Поиск
Список
Период
Сортировка

grouping subsets

От
Rainer Stengele
Дата:
Hi,

having a table similar to

| 1 | B | [2010-07-15 Do] |
| 1 | B | [2010-07-16 Fr] |
|---+---+-----------------|
| 2 | C | [2010-07-17 Sa] |
| 2 | C | [2010-07-18 So] |
|---+---+-----------------|
| 1 | B | [2010-07-19 Mo] |
| 1 | B | [2010-07-20 Di] |
| 1 | B | [2010-07-21 Mi] |
| 1 | B | [2010-07-22 Do] |
|---+---+-----------------|
| 3 | D | [2010-07-23 Fr] |

a simple group by gives me:

| 6 | B |
| 4 | C |
| 3 | D |


What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until the
columchanges.
 
Is there a way to get

| 2 | B |
| 4 | C |
| 4 | B |
| 3 | D |

by SQL only?

- Rainer






Re: grouping subsets

От
"A. Kretschmer"
Дата:
In response to Rainer Stengele :
> Hi,
> 
> having a table similar to
> 
> | 1 | B | [2010-07-15 Do] |
> | 1 | B | [2010-07-16 Fr] |
> |---+---+-----------------|
> | 2 | C | [2010-07-17 Sa] |
> | 2 | C | [2010-07-18 So] |
> |---+---+-----------------|
> | 1 | B | [2010-07-19 Mo] |
> | 1 | B | [2010-07-20 Di] |
> | 1 | B | [2010-07-21 Mi] |
> | 1 | B | [2010-07-22 Do] |
> |---+---+-----------------|
> | 3 | D | [2010-07-23 Fr] |
> 
> a simple group by gives me:
> 
> | 6 | B |
> | 4 | C |
> | 3 | D |
> 
> 
> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until the
columchanges.
 
> Is there a way to get
> 
> | 2 | B |
> | 4 | C |
> | 4 | B |
> | 3 | D |
> 
> by SQL only?

I think, the problem is that there are 2 identical groups. I think, you
can write a pl/pgsql-proc, selecting all ordered by the date-field and
walking through the result to do the grouping, checking if the 2nd
column is different from the previous. 

With plain SQL it's maybe possible too, but i don't know how ...

Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99


Re: grouping subsets

От
Richard Huxton
Дата:
On 22/07/10 11:02, A. Kretschmer wrote:
> In response to Rainer Stengele :
>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes.
 
>> Is there a way to get
>>
>> | 2 | B |
>> | 4 | C |
>> | 4 | B |
>> | 3 | D |
>>
>> by SQL only?
>
> I think, the problem is that there are 2 identical groups. I think, you
> can write a pl/pgsql-proc, selecting all ordered by the date-field and
> walking through the result to do the grouping, checking if the 2nd
> column is different from the previous.
>
> With plain SQL it's maybe possible too, but i don't know how ...

It should be do-able in 8.4 onwards, look into windowing functions. In 
particular the lag() function:

SELECT    mycode,    mydate,    lag(mycode) OVER (ORDER BY mydate) AS prev_code
FROM    mytable
ORDER BY mydate;

It should be possible to use that as a subquery with an outer query that 
compares mycode=prev_code to get a run length.

--   Richard Huxton  Archonet Ltd


Re: grouping subsets

От
Tim Landscheidt
Дата:
Richard Huxton <dev@archonet.com> wrote:

>>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes.
 
>>> Is there a way to get

>>> | 2 | B |
>>> | 4 | C |
>>> | 4 | B |
>>> | 3 | D |

>>> by SQL only?

>> I think, the problem is that there are 2 identical groups. I think, you
>> can write a pl/pgsql-proc, selecting all ordered by the date-field and
>> walking through the result to do the grouping, checking if the 2nd
>> column is different from the previous.

>> With plain SQL it's maybe possible too, but i don't know how ...

> It should be do-able in 8.4 onwards, look into windowing
> functions. In particular the lag() function:

> SELECT
>     mycode,
>     mydate,
>     lag(mycode) OVER (ORDER BY mydate) AS prev_code
> FROM
>     mytable
> ORDER BY mydate;

> It should be possible to use that as a subquery with an
> outer query that compares mycode=prev_code to get a run
> length.

Hmmm. Can the outer query be done without using "WITH
RECURSIVE"?

Tim



Re: grouping subsets

От
"Oliveiros d'Azevedo Cristina"
Дата:
Howdy, Rainer.

Please advice me,

The dates always follow that sequential pattern?

Or can be holes on the dates sequence?

Best,
Oliveiros

----- Original Message ----- 
From: "Rainer Stengele" <rainer.stengele@diplan.de>
To: <pgsql-sql@postgresql.org>
Sent: Thursday, July 22, 2010 9:09 AM
Subject: [SQL] grouping subsets


> Hi,
>
> having a table similar to
>
> | 1 | B | [2010-07-15 Do] |
> | 1 | B | [2010-07-16 Fr] |
> |---+---+-----------------|
> | 2 | C | [2010-07-17 Sa] |
> | 2 | C | [2010-07-18 So] |
> |---+---+-----------------|
> | 1 | B | [2010-07-19 Mo] |
> | 1 | B | [2010-07-20 Di] |
> | 1 | B | [2010-07-21 Mi] |
> | 1 | B | [2010-07-22 Do] |
> |---+---+-----------------|
> | 3 | D | [2010-07-23 Fr] |
>
> a simple group by gives me:
>
> | 6 | B |
> | 4 | C |
> | 3 | D |
>
>
> What I want to get is the values grouped by "subset", where a subset is a 
> set of rows with identical column until the colum changes.
> Is there a way to get
>
> | 2 | B |
> | 4 | C |
> | 4 | B |
> | 3 | D |
>
> by SQL only?
>
> - Rainer
>
>
>
>
>
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql 



Re: grouping subsets

От
Rainer Stengele
Дата:
Hi Oliveiros,

yes, the date is always incremented - but anyway the date column is not really the point!
Actually the first tow columns are relevant.
I want them gouped together as indicated, adding up column 1 in the blocks with identical second column, but not adding
upover all the rows.
 

Hope I could express it!

Mit freundlichen Grüßen / Best Regards   Rainer Stengele 

__|___  | Dipl. Inf. (Univ.) Rainer Stengele    | Technical Control - System Administration   | | email    :
Rainer.Stengele@diplan.de | voice/fax: ++49-9131-7778-85/88 | WWW      : http://www.diplan.de  | | diplan GmbH
                      | Wetterkreuz 27 | 91058 Erlangen, Germany                      
 
Diese E-Mail kann vertrauliche und/oder rechtlich geschützte Informationen enthalten. Wenn Sie nicht der richtige
Adressatsind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte den Absender und vernichten Sie diese
Mail.Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet.
 

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have
receivedthis e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying,
disclosureor distribution of the material in this e-mail is strictly forbidden.
 



Am 22.07.2010 15:18, schrieb Oliveiros d'Azevedo Cristina:
> Howdy, Rainer.
>
> Please advice me,
>
> The dates always follow that sequential pattern?
>
> Or can be holes on the dates sequence?
>
> Best,
> Oliveiros
>
> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
> To: <pgsql-sql@postgresql.org>
> Sent: Thursday, July 22, 2010 9:09 AM
> Subject: [SQL] grouping subsets
>
>
>> Hi,
>>
>> having a table similar to
>>
>> | 1 | B | [2010-07-15 Do] |
>> | 1 | B | [2010-07-16 Fr] |
>> |---+---+-----------------|
>> | 2 | C | [2010-07-17 Sa] |
>> | 2 | C | [2010-07-18 So] |
>> |---+---+-----------------|
>> | 1 | B | [2010-07-19 Mo] |
>> | 1 | B | [2010-07-20 Di] |
>> | 1 | B | [2010-07-21 Mi] |
>> | 1 | B | [2010-07-22 Do] |
>> |---+---+-----------------|
>> | 3 | D | [2010-07-23 Fr] |
>>
>> a simple group by gives me:
>>
>> | 6 | B |
>> | 4 | C |
>> | 3 | D |
>>
>>
>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes.
 
>> Is there a way to get
>>
>> | 2 | B |
>> | 4 | C |
>> | 4 | B |
>> | 3 | D |
>>
>> by SQL only?
>>
>> - Rainer
>>
>>
>>
>>
>>
>> -- 
>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql 
>
>


Re: grouping subsets

От
Tim Landscheidt
Дата:
Rainer Stengele <rainer.stengele@diplan.de> wrote:

> yes, the date is always incremented - but anyway the date
> column is not really the point! Actually the first tow
> columns are relevant. I want them gouped together as
> indicated, adding up column 1 in the blocks with identical
> second column, but not adding up over all the rows.
> [...]

If the date column wasn't relevant, how would you group the
first two columns?

Tim



Re: grouping subsets

От
"Oliveiros d'Azevedo Cristina"
Дата:
Howdy, Rainer.

It's been a while, so I don't know if you are still interested in this 
problem or if you, in the meantime, found yourself a solution,
but I've tried this on a local copy of the example you provided   and it 
seems to work.

The problem is that I suspect that if you have several thousands of records 
on your table it will become slow...

Best,
Oliveiros

SELECT SUM(tudo.parcela),tudo.a
FROM
(
SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
FROM
(
SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
FROM
yourTable se
LEFT JOIN
(
SELECT a.*
FROM yourTable a
JOIN yourTable b
ON (b.b <> a.b)
AND ((age(a.c,b.c) = '1 day'::interval)

)
) pr
ON pr.b = se.b
AND    se.c >= pr.c
GROUP BY se.a,se.b,se.c
) fo
LEFT JOIN
(
SELECT a.*
FROM yourTable a
JOIN yourTable b
ON (b.b <> a.b)
AND ((age(a.c,b.c) = '-1 day'::interval)
)
) th
ON fo.a = th.b
AND fo.b <= th.c
GROUP BY fo.parcela,fo.a,fo.b,fo.c
) tudo
GROUP BY tudo.a,tudo.c,tudo.d




To: <pgsql-sql@postgresql.org>
Sent: Thursday, July 22, 2010 9:09 AM
Subject: [SQL] grouping subsets


> Hi,
>
> having a table similar to
>
> | 1 | B | [2010-07-15 Do] |
> | 1 | B | [2010-07-16 Fr] |
> |---+---+-----------------|
> | 2 | C | [2010-07-17 Sa] |
> | 2 | C | [2010-07-18 So] |
> |---+---+-----------------|
> | 1 | B | [2010-07-19 Mo] |
> | 1 | B | [2010-07-20 Di] |
> | 1 | B | [2010-07-21 Mi] |
> | 1 | B | [2010-07-22 Do] |
> |---+---+-----------------|
> | 3 | D | [2010-07-23 Fr] |
>
> a simple group by gives me:
>
> | 6 | B |
> | 4 | C |
> | 3 | D |
>
>
> What I want to get is the values grouped by "subset", where a subset is a 
> set of rows with identical column until the colum changes.
> Is there a way to get
>
> | 2 | B |
> | 4 | C |
> | 4 | B |
> | 3 | D |
>
> by SQL only?
>
> - Rainer
>
>
>
>
>
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql 



Re: grouping subsets

От
Rainer Stengele
Дата:
Howdy Cristina,

unfortunately things are more complicated. I have inserted an excerpt of the real data here:

================================================================================
TableID         MasterID dtBegin                           dtEnd                              idR      idL      idB
consumption
4057312           295530 2010-07-01 00:59:21.077           2010-07-01 01:32:59.670           1726     3212     1428
279
4061043           295574 2010-07-01 01:59:31.137           2010-07-01 02:32:09.373           1726     3212     1428
183
4083397           295838 2010-07-01 07:57:51.327           2010-07-01 08:28:28.117           318      1846     1012
30
4090858           295920 2010-07-01 09:52:33.777           2010-07-01 10:31:34.393           318      1846     1012
487
4094589           295961 2010-07-01 10:47:59.370           2010-07-01 11:32:20.903           318      1846     1012
472
4098330           296013 2010-07-01 11:58:53.890           2010-07-01 12:31:35.730           318      1846     1012
195
4102069           296058 2010-07-01 12:36:19.170           2010-07-01 13:32:13.950           318      1846     1012
338
4105809           296102 2010-07-01 13:58:53.170           2010-07-01 14:02:57.710           318      1846     1012
105
4109555           296150 2010-07-01 14:59:11.663           2010-07-01 15:32:33.810           318      1846     1012
187
4113305           296194 2010-07-01 15:59:01.797           2010-07-01 16:02:27.260           318      1846     1012
108
4117048           296238 2010-07-01 16:20:47.997           2010-07-01 17:32:49.367           318      1846     1012
179
4120791           296282 2010-07-01 17:58:27.657           2010-07-01 18:29:01.733           318      1846     1012
256
4128291           296370 2010-07-01 19:54:17.687           2010-07-01 20:32:53.850           318      1846     1012
239
4132044           296413 2010-07-01 20:31:37.653           2010-07-01 21:29:13.497           318      1846     1012
39
4135797           296458 2010-07-01 21:59:13.983           2010-07-01 22:32:46.503           318      1846     1012
157
4139572           296506 2010-07-01 22:58:49.530           2010-07-01 23:32:22.543           318      1846     1012
218
4142941           296554 2010-07-01 23:59:13.857           2010-07-02 00:32:30.390           318      1846     1012
248
4146289           296598 2010-07-02 00:58:55.763           2010-07-02 01:32:41.983           318      1846     1012
204
4149616           296642 2010-07-02 01:46:57.357           2010-07-02 02:32:56.983           318      1846     1012
42
4152952           296686 2010-07-02 02:55:19.653           2010-07-02 03:32:28.013           318      1846     1012
135
4156289           296730 2010-07-02 03:43:52.777           2010-07-02 04:32:55.250           318      1846     1012
743
4159624           296774 2010-07-02 04:43:15.310           2010-07-02 05:32:44.547           318      1846     1012
277
4162961           296817 2010-07-02 05:58:59.483           2010-07-02 06:32:37.340           318      1846     1012
121
4166303           296862 2010-07-02 06:58:50.733           2010-07-02 07:32:39.113           318      1846     1012
239
4172981           296950 2010-07-02 07:28:55.293           2010-07-02 09:33:01.200           318      1846     1012
512
4176322           296993 2010-07-02 09:59:04.607           2010-07-02 10:33:01.903           318      1846     1012
139
4179667           297038 2010-07-02 10:55:27.760           2010-07-02 11:32:56.560           318      1846     1012
722
4183012           297082 2010-07-02 11:59:33.650           2010-07-02 12:32:14.700            318      1846     1012
163
 
4186351           297126 2010-07-02 12:23:45.997           2010-07-02 13:32:59.500            318      1846     1012
284
 
4189689           297169 2010-07-02 13:44:21.253           2010-07-02 14:18:05.080            318      1846     1012
254
 
4196371           297258 2010-07-02 16:16:19.123           2010-07-02 16:32:53.437           1706     3541     1511
161
4199720           297301 2010-07-02 16:59:35.127           2010-07-02 17:32:57.950           1706     3541     1511
250
4203068           297346 2010-07-02 17:59:34.027           2010-07-02 18:32:54.337           1706     3541     1511
302
4206413           297389 2010-07-02 18:59:28.730           2010-07-02 19:32:37.950           1706     3541     1511
276
4209758           297434 2010-07-02 19:54:00.243           2010-07-02 20:32:57.433           1706     3541     1511
209
4213102           297473 2010-07-02 20:49:10.963           2010-07-02 21:30:44.540           1706     3541     1511
76
4216447           297511 2010-07-02 21:59:34.810           2010-07-02 22:33:00.603           1706     3541     1511
287
4219818           297569 2010-07-02 22:56:52.750           2010-07-02 23:59:31.607           1706     3541     1511
1877
4219819           297570 2010-07-02 23:59:21.577           2010-07-03 00:54:40.153           1706     3541     1511
1798
4219821           297572 2010-07-03 00:48:03.310           2010-07-03 01:59:37.920           1706     3541     1511
1125
4219823           297574 2010-07-03 01:51:01.057           2010-07-03 02:59:45.433           1706     3541     1511
1629
4219820           297571 2010-07-03 02:59:29.393           2010-07-03 03:59:54.920           1706     3541     1511
2462
4219822           297573 2010-07-03 03:59:18.663           2010-07-03 04:01:48.810           1706     3541     1511
70
4225738           297656 2010-07-03 06:13:34.980           2010-07-03 06:28:09.697           1726     3212     1428
46
4228694           297695 2010-07-03 06:59:15.560           2010-07-03 07:32:45.653           1726     3212     1428
251
4231649           297733 2010-07-03 07:59:11.937           2010-07-03 08:32:57.217           1726     3212     1428
284
4234604           297771 2010-07-03 08:57:00.357           2010-07-03 09:32:47.903           1726     3212     1428
227
4237559           297809 2010-07-03 09:59:19.813           2010-07-03 10:33:02.063           1726     3212     1428
285
4261156           298596 2010-07-04 22:59:09.863           2010-07-04 23:33:45.530           1726     3212     1428
1286
4264114           298646 2010-07-04 23:59:16.967           2010-07-05 00:33:08.107           1726     3212     1428
297
4267067           298690 2010-07-05 00:59:15.187           2010-07-05 01:32:48.300           1726     3212     1428
333
4270023           298734 2010-07-05 01:59:02.497           2010-07-05 02:32:48.780           1726     3212     1428
270
4272977           298778 2010-07-05 02:41:43.737           2010-07-05 03:32:56.043           1726     3212     1428
317
4275927           298822 2010-07-05 03:59:17.027           2010-07-05 04:33:14.947           1726     3212     1428
1623
================================================================================

Description:
1. Column: some ID
2. Column: reference to another table
3. and 4. column: timestamp from/to of the item
5. Column: ID R
6. Column: ID L
7. Column: ID B
8. Column: Sum of components

Requirement:
Sum over all components (from column 8) for each combination of ID R, ID L, ID B, but (!)
rows with same keys (R,L,B) should be summed up only until the keys change.
Do not sum up the components for identical keys, if there are other keys between them.

Example result:

idR       idL      idB                 SUM
1726     3212     1428                 462318     1846     1012                 ...
1706     3541     1511                 ...
1726     3212     1428                 ...


Note that the first and last entry here has the same keys

Maybe you find a similar monster SQL solving such a requirement.
Thanks for considering!

Rainer



Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
> Howdy, Rainer.
> 
> It's been a while, so I don't know if you are still interested in this problem or if you, in the meantime, found
yourselfa solution, 
> but I've tried this on a local copy of the example you provided   and it seems to work.
> 
> The problem is that I suspect that if you have several thousands of records on your table it will become slow...
> 
> Best,
> Oliveiros
> 
> SELECT SUM(tudo.parcela),tudo.a
> FROM
> (
> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
> FROM
> (
> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
> FROM
> yourTable se
> LEFT JOIN
> (
> SELECT a.*
> FROM yourTable a
> JOIN yourTable b
> ON (b.b <> a.b)
> AND ((age(a.c,b.c) = '1 day'::interval)
> 
> )
> ) pr
> ON pr.b = se.b
> AND    se.c >= pr.c
> GROUP BY se.a,se.b,se.c
> ) fo
> LEFT JOIN
> (
> SELECT a.*
> FROM yourTable a
> JOIN yourTable b
> ON (b.b <> a.b)
> AND ((age(a.c,b.c) = '-1 day'::interval)
> )
> ) th
> ON fo.a = th.b
> AND fo.b <= th.c
> GROUP BY fo.parcela,fo.a,fo.b,fo.c
> ) tudo
> GROUP BY tudo.a,tudo.c,tudo.d
> 
> 
> 
> 
> To: <pgsql-sql@postgresql.org>
> Sent: Thursday, July 22, 2010 9:09 AM
> Subject: [SQL] grouping subsets
> 
> 
>> Hi,
>>
>> having a table similar to
>>
>> | 1 | B | [2010-07-15 Do] |
>> | 1 | B | [2010-07-16 Fr] |
>> |---+---+-----------------|
>> | 2 | C | [2010-07-17 Sa] |
>> | 2 | C | [2010-07-18 So] |
>> |---+---+-----------------|
>> | 1 | B | [2010-07-19 Mo] |
>> | 1 | B | [2010-07-20 Di] |
>> | 1 | B | [2010-07-21 Mi] |
>> | 1 | B | [2010-07-22 Do] |
>> |---+---+-----------------|
>> | 3 | D | [2010-07-23 Fr] |
>>
>> a simple group by gives me:
>>
>> | 6 | B |
>> | 4 | C |
>> | 3 | D |
>>
>>
>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes.
 
>> Is there a way to get
>>
>> | 2 | B |
>> | 4 | C |
>> | 4 | B |
>> | 3 | D |
>>
>> by SQL only?
>>
>> - Rainer
>>
>>
>>
>>
>>
>> -- 
>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql 
> 
> 



Re: grouping subsets

От
"Oliveiros d'Azevedo Cristina"
Дата:
Yes. This is somewhat more complicated because it has more constraints.
I've noticed that a given combination doesn't appear with holes on a certain 
day.

For ex, on a daily basis, we have every three key combinations together.

We dont have things like
2010-7-01  1726 3212 1428
2010-7-01  1726 3212 1428
...                318   1846 1012
2010-7-01  1726 3212 1428

Can I assume that, for a certain day , the records for the same three 
combination are all together? There is just one set per day for a given 
combination?

Or is it possible to have the same combination on one day with several sets?

Best,
Oliveiros


----- Original Message ----- 
From: "Rainer Stengele" <rainer.stengele@diplan.de>
Newsgroups: gmane.comp.db.postgresql.sql
To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
Cc: <>
Sent: Thursday, July 29, 2010 10:41 AM
Subject: Re: grouping subsets


> Howdy Cristina,
>
> unfortunately things are more complicated. I have inserted an excerpt of 
> the real data here:
>
> ================================================================================
> TableID         MasterID dtBegin                           dtEnd 
> idR      idL      idB     consumption
> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 
> 01:32:59.670           1726     3212     1428     279
> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 
> 02:32:09.373           1726     3212     1428     183
> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 
> 08:28:28.117           318      1846     1012     30
> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 
> 10:31:34.393           318      1846     1012     487
> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 
> 11:32:20.903           318      1846     1012     472
> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 
> 12:31:35.730           318      1846     1012     195
> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 
> 13:32:13.950           318      1846     1012     338
> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 
> 14:02:57.710           318      1846     1012     105
> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 
> 15:32:33.810           318      1846     1012     187
> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 
> 16:02:27.260           318      1846     1012     108
> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 
> 17:32:49.367           318      1846     1012     179
> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 
> 18:29:01.733           318      1846     1012     256
> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 
> 20:32:53.850           318      1846     1012     239
> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 
> 21:29:13.497           318      1846     1012     39
> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 
> 22:32:46.503           318      1846     1012     157
> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 
> 23:32:22.543           318      1846     1012     218
> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 
> 00:32:30.390           318      1846     1012     248
> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 
> 01:32:41.983           318      1846     1012     204
> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 
> 02:32:56.983           318      1846     1012     42
> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 
> 03:32:28.013           318      1846     1012     135
> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 
> 04:32:55.250           318      1846     1012     743
> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 
> 05:32:44.547           318      1846     1012     277
> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 
> 06:32:37.340           318      1846     1012     121
> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 
> 07:32:39.113           318      1846     1012     239
> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 
> 09:33:01.200           318      1846     1012     512
> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 
> 10:33:01.903           318      1846     1012     139
> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 
> 11:32:56.560           318      1846     1012     722
> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 
> 12:32:14.700            318      1846     1012     163
> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 
> 13:32:59.500            318      1846     1012     284
> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 
> 14:18:05.080            318      1846     1012     254
> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 
> 16:32:53.437           1706     3541     1511     161
> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 
> 17:32:57.950           1706     3541     1511     250
> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 
> 18:32:54.337           1706     3541     1511     302
> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 
> 19:32:37.950           1706     3541     1511     276
> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 
> 20:32:57.433           1706     3541     1511     209
> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 
> 21:30:44.540           1706     3541     1511     76
> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 
> 22:33:00.603           1706     3541     1511     287
> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 
> 23:59:31.607           1706     3541     1511     1877
> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 
> 00:54:40.153           1706     3541     1511     1798
> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 
> 01:59:37.920           1706     3541     1511     1125
> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 
> 02:59:45.433           1706     3541     1511     1629
> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 
> 03:59:54.920           1706     3541     1511     2462
> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 
> 04:01:48.810           1706     3541     1511     70
> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 
> 06:28:09.697           1726     3212     1428     46
> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 
> 07:32:45.653           1726     3212     1428     251
> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 
> 08:32:57.217           1726     3212     1428     284
> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 
> 09:32:47.903           1726     3212     1428     227
> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 
> 10:33:02.063           1726     3212     1428     285
> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 
> 23:33:45.530           1726     3212     1428     1286
> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 
> 00:33:08.107           1726     3212     1428     297
> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 
> 01:32:48.300           1726     3212     1428     333
> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 
> 02:32:48.780           1726     3212     1428     270
> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 
> 03:32:56.043           1726     3212     1428     317
> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05
> 04:33:14.947           1726     3212     1428     1623
> ================================================================================
>
> Description:
> 1. Column: some ID
> 2. Column: reference to another table
> 3. and 4. column: timestamp from/to of the item
> 5. Column: ID R
> 6. Column: ID L
> 7. Column: ID B
> 8. Column: Sum of components
>
> Requirement:
> Sum over all components (from column 8) for each combination of ID R, ID 
> L, ID B, but (!)
> rows with same keys (R,L,B) should be summed up only until the keys 
> change.
> Do not sum up the components for identical keys, if there are other keys 
> between them.
>
> Example result:
>
> idR       idL      idB                 SUM
> 1726     3212     1428                 462
> 318     1846     1012                 ...
> 1706     3541     1511                 ...
> 1726     3212     1428                 ...
>
>
> Note that the first and last entry here has the same keys
>
> Maybe you find a similar monster SQL solving such a requirement.
> Thanks for considering!
>
> Rainer
>
>
>
> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>> Howdy, Rainer.
>>
>> It's been a while, so I don't know if you are still interested in this 
>> problem or if you, in the meantime, found yourself a solution,
>> but I've tried this on a local copy of the example you provided   and it 
>> seems to work.
>>
>> The problem is that I suspect that if you have several thousands of 
>> records on your table it will become slow...
>>
>> Best,
>> Oliveiros
>>
>> SELECT SUM(tudo.parcela),tudo.a
>> FROM
>> (
>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>> FROM
>> (
>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>> FROM
>> yourTable se
>> LEFT JOIN
>> (
>> SELECT a.*
>> FROM yourTable a
>> JOIN yourTable b
>> ON (b.b <> a.b)
>> AND ((age(a.c,b.c) = '1 day'::interval)
>>
>> )
>> ) pr
>> ON pr.b = se.b
>> AND    se.c >= pr.c
>> GROUP BY se.a,se.b,se.c
>> ) fo
>> LEFT JOIN
>> (
>> SELECT a.*
>> FROM yourTable a
>> JOIN yourTable b
>> ON (b.b <> a.b)
>> AND ((age(a.c,b.c) = '-1 day'::interval)
>> )
>> ) th
>> ON fo.a = th.b
>> AND fo.b <= th.c
>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>> ) tudo
>> GROUP BY tudo.a,tudo.c,tudo.d
>>
>>
>>
>>
>> To: <pgsql-sql@postgresql.org>
>> Sent: Thursday, July 22, 2010 9:09 AM
>> Subject: [SQL] grouping subsets
>>
>>
>>> Hi,
>>>
>>> having a table similar to
>>>
>>> | 1 | B | [2010-07-15 Do] |
>>> | 1 | B | [2010-07-16 Fr] |
>>> |---+---+-----------------|
>>> | 2 | C | [2010-07-17 Sa] |
>>> | 2 | C | [2010-07-18 So] |
>>> |---+---+-----------------|
>>> | 1 | B | [2010-07-19 Mo] |
>>> | 1 | B | [2010-07-20 Di] |
>>> | 1 | B | [2010-07-21 Mi] |
>>> | 1 | B | [2010-07-22 Do] |
>>> |---+---+-----------------|
>>> | 3 | D | [2010-07-23 Fr] |
>>>
>>> a simple group by gives me:
>>>
>>> | 6 | B |
>>> | 4 | C |
>>> | 3 | D |
>>>
>>>
>>> What I want to get is the values grouped by "subset", where a subset is 
>>> a set of rows with identical column until the colum changes.
>>> Is there a way to get
>>>
>>> | 2 | B |
>>> | 4 | C |
>>> | 4 | B |
>>> | 3 | D |
>>>
>>> by SQL only?
>>>
>>> - Rainer
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-sql
>>
>> 



Re: grouping subsets

От
Rainer Stengele
Дата:
No. This is by accident.
We have to assume that the combinations do change anytime, and many times per day.

So

"Or is it possible to have the same combination on one day with several sets?"

YES!

Rainer



Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
> Yes. This is somewhat more complicated because it has more constraints.
> I've noticed that a given combination doesn't appear with holes on a certain day.
>
> For ex, on a daily basis, we have every three key combinations together.
>
> We dont have things like
> 2010-7-01  1726 3212 1428
> 2010-7-01  1726 3212 1428
> ...                318   1846 1012
> 2010-7-01  1726 3212 1428
>
> Can I assume that, for a certain day , the records for the same three combination are all together? There is just one
setper day for a given combination?
 
>
> Or is it possible to have the same combination on one day with several sets?
>
> Best,
> Oliveiros
>
>
> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
> Newsgroups: gmane.comp.db.postgresql.sql
> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
> Cc: <>
> Sent: Thursday, July 29, 2010 10:41 AM
> Subject: Re: grouping subsets
>
>
>> Howdy Cristina,
>>
>> unfortunately things are more complicated. I have inserted an excerpt of the real data here:
>>
>> ================================================================================
>> TableID         MasterID dtBegin                           dtEnd idR      idL      idB     consumption
>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 01:32:59.670           1726     3212     1428
  279
 
>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 02:32:09.373           1726     3212     1428
  183
 
>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 08:28:28.117           318      1846     1012
  30
 
>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 10:31:34.393           318      1846     1012
  487
 
>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 11:32:20.903           318      1846     1012
  472
 
>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 12:31:35.730           318      1846     1012
  195
 
>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 13:32:13.950           318      1846     1012
  338
 
>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 14:02:57.710           318      1846     1012
  105
 
>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 15:32:33.810           318      1846     1012
  187
 
>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 16:02:27.260           318      1846     1012
  108
 
>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 17:32:49.367           318      1846     1012
  179
 
>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 18:29:01.733           318      1846     1012
  256
 
>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 20:32:53.850           318      1846     1012
  239
 
>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 21:29:13.497           318      1846     1012
  39
 
>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 22:32:46.503           318      1846     1012
  157
 
>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 23:32:22.543           318      1846     1012
  218
 
>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 00:32:30.390           318      1846     1012
  248
 
>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 01:32:41.983           318      1846     1012
  204
 
>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 02:32:56.983           318      1846     1012
  42
 
>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 03:32:28.013           318      1846     1012
  135
 
>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 04:32:55.250           318      1846     1012
  743
 
>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 05:32:44.547           318      1846     1012
  277
 
>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 06:32:37.340           318      1846     1012
  121
 
>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 07:32:39.113           318      1846     1012
  239
 
>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 09:33:01.200           318      1846     1012
  512
 
>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 10:33:01.903           318      1846     1012
  139
 
>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 11:32:56.560           318      1846     1012
  722
 
>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 12:32:14.700            318      1846     1012
   163
 
>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 13:32:59.500            318      1846     1012
   284
 
>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 14:18:05.080            318      1846     1012
   254
 
>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 16:32:53.437           1706     3541     1511
  161
 
>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 17:32:57.950           1706     3541     1511
  250
 
>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 18:32:54.337           1706     3541     1511
  302
 
>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 19:32:37.950           1706     3541     1511
  276
 
>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 20:32:57.433           1706     3541     1511
  209
 
>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 21:30:44.540           1706     3541     1511
  76
 
>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 22:33:00.603           1706     3541     1511
  287
 
>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 23:59:31.607           1706     3541     1511
  1877
 
>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 00:54:40.153           1706     3541     1511
  1798
 
>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 01:59:37.920           1706     3541     1511
  1125
 
>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 02:59:45.433           1706     3541     1511
  1629
 
>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 03:59:54.920           1706     3541     1511
  2462
 
>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 04:01:48.810           1706     3541     1511
  70
 
>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 06:28:09.697           1726     3212     1428
  46
 
>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 07:32:45.653           1726     3212     1428
  251
 
>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 08:32:57.217           1726     3212     1428
  284
 
>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 09:32:47.903           1726     3212     1428
  227
 
>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 10:33:02.063           1726     3212     1428
  285
 
>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 23:33:45.530           1726     3212     1428
  1286
 
>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 00:33:08.107           1726     3212     1428
  297
 
>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 01:32:48.300           1726     3212     1428
  333
 
>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 02:32:48.780           1726     3212     1428
  270 
>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 03:32:56.043           1726     3212     1428
  317
 
>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 04:33:14.947           1726     3212     1428
  1623
 
>> ================================================================================
>>
>> Description:
>> 1. Column: some ID
>> 2. Column: reference to another table
>> 3. and 4. column: timestamp from/to of the item
>> 5. Column: ID R
>> 6. Column: ID L
>> 7. Column: ID B
>> 8. Column: Sum of components
>>
>> Requirement:
>> Sum over all components (from column 8) for each combination of ID R, ID L, ID B, but (!)
>> rows with same keys (R,L,B) should be summed up only until the keys change.
>> Do not sum up the components for identical keys, if there are other keys between them.
>>
>> Example result:
>>
>> idR       idL      idB                 SUM
>> 1726     3212     1428                 462
>> 318     1846     1012                 ...
>> 1706     3541     1511                 ...
>> 1726     3212     1428                 ...
>>
>>
>> Note that the first and last entry here has the same keys
>>
>> Maybe you find a similar monster SQL solving such a requirement.
>> Thanks for considering!
>>
>> Rainer
>>
>>
>>
>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>> Howdy, Rainer.
>>>
>>> It's been a while, so I don't know if you are still interested in this problem or if you, in the meantime, found
yourselfa solution,
 
>>> but I've tried this on a local copy of the example you provided   and it seems to work.
>>>
>>> The problem is that I suspect that if you have several thousands of records on your table it will become slow...
>>>
>>> Best,
>>> Oliveiros
>>>
>>> SELECT SUM(tudo.parcela),tudo.a
>>> FROM
>>> (
>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>> FROM
>>> (
>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>> FROM
>>> yourTable se
>>> LEFT JOIN
>>> (
>>> SELECT a.*
>>> FROM yourTable a
>>> JOIN yourTable b
>>> ON (b.b <> a.b)
>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>
>>> )
>>> ) pr
>>> ON pr.b = se.b
>>> AND    se.c >= pr.c
>>> GROUP BY se.a,se.b,se.c
>>> ) fo
>>> LEFT JOIN
>>> (
>>> SELECT a.*
>>> FROM yourTable a
>>> JOIN yourTable b
>>> ON (b.b <> a.b)
>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>> )
>>> ) th
>>> ON fo.a = th.b
>>> AND fo.b <= th.c
>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>> ) tudo
>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>
>>>
>>>
>>>
>>> To: <pgsql-sql@postgresql.org>
>>> Sent: Thursday, July 22, 2010 9:09 AM
>>> Subject: [SQL] grouping subsets
>>>
>>>
>>>> Hi,
>>>>
>>>> having a table similar to
>>>>
>>>> | 1 | B | [2010-07-15 Do] |
>>>> | 1 | B | [2010-07-16 Fr] |
>>>> |---+---+-----------------|
>>>> | 2 | C | [2010-07-17 Sa] |
>>>> | 2 | C | [2010-07-18 So] |
>>>> |---+---+-----------------|
>>>> | 1 | B | [2010-07-19 Mo] |
>>>> | 1 | B | [2010-07-20 Di] |
>>>> | 1 | B | [2010-07-21 Mi] |
>>>> | 1 | B | [2010-07-22 Do] |
>>>> |---+---+-----------------|
>>>> | 3 | D | [2010-07-23 Fr] |
>>>>
>>>> a simple group by gives me:
>>>>
>>>> | 6 | B |
>>>> | 4 | C |
>>>> | 3 | D |
>>>>
>>>>
>>>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes.
 
>>>> Is there a way to get
>>>>
>>>> | 2 | B |
>>>> | 4 | C |
>>>> | 4 | B |
>>>> | 3 | D |
>>>>
>>>> by SQL only?
>>>>
>>>> - Rainer
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -- 
>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>
>>>
>


Re: grouping subsets

От
"Oliveiros d'Azevedo Cristina"
Дата:
Fine.

Please advice me,

How long can
your table be? Thousands? Millions of records?

Do you really need it in pure SQL
?

It seems to me that it might be possible, I'm just affraid that the query 
would become too complex and thus slow...

Best,
Oliveiros

----- Original Message ----- 
From: "Rainer Stengele" <rainer.stengele@diplan.de>
To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
Cc: <pgsql-sql@postgresql.org>
Sent: Thursday, July 29, 2010 1:10 PM
Subject: Re: grouping subsets


> No. This is by accident.
> We have to assume that the combinations do change anytime, and many times 
> per day.
>
> So
>
> "Or is it possible to have the same combination on one day with several 
> sets?"
>
> YES!
>
> Rainer
>
>
>
> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>> Yes. This is somewhat more complicated because it has more constraints.
>> I've noticed that a given combination doesn't appear with holes on a 
>> certain day.
>>
>> For ex, on a daily basis, we have every three key combinations together.
>>
>> We dont have things like
>> 2010-7-01  1726 3212 1428
>> 2010-7-01  1726 3212 1428
>> ...                318   1846 1012
>> 2010-7-01  1726 3212 1428
>>
>> Can I assume that, for a certain day , the records for the same three 
>> combination are all together? There is just one set per day for a given 
>> combination?
>>
>> Or is it possible to have the same combination on one day with several 
>> sets?
>>
>> Best,
>> Oliveiros
>>
>>
>> ----- Original Message ----- From: "Rainer Stengele" 
>> <rainer.stengele@diplan.de>
>> Newsgroups: gmane.comp.db.postgresql.sql
>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>> Cc: <>
>> Sent: Thursday, July 29, 2010 10:41 AM
>> Subject: Re: grouping subsets
>>
>>
>>> Howdy Cristina,
>>>
>>> unfortunately things are more complicated. I have inserted an excerpt of 
>>> the real data here:
>>>
>>> ================================================================================
>>> TableID         MasterID dtBegin                           dtEnd idR 
>>> idL      idB     consumption
>>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 
>>> 01:32:59.670           1726     3212     1428     279
>>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 
>>> 02:32:09.373           1726     3212     1428     183
>>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 
>>> 08:28:28.117           318      1846     1012     30
>>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 
>>> 10:31:34.393           318      1846     1012     487
>>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 
>>> 11:32:20.903           318      1846     1012     472
>>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 
>>> 12:31:35.730           318      1846     1012     195
>>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 
>>> 13:32:13.950           318      1846     1012     338
>>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 
>>> 14:02:57.710           318      1846     1012     105
>>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 
>>> 15:32:33.810           318      1846     1012     187
>>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 
>>> 16:02:27.260           318      1846     1012     108
>>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 
>>> 17:32:49.367           318      1846     1012     179
>>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 
>>> 18:29:01.733           318      1846     1012     256
>>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 
>>> 20:32:53.850           318      1846     1012     239
>>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 
>>> 21:29:13.497           318      1846     1012     39
>>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 
>>> 22:32:46.503           318      1846     1012     157
>>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 
>>> 23:32:22.543           318      1846     1012     218
>>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 
>>> 00:32:30.390           318      1846     1012     248
>>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 
>>> 01:32:41.983           318      1846     1012     204
>>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 
>>> 02:32:56.983           318      1846     1012     42
>>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 
>>> 03:32:28.013           318      1846     1012     135
>>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 
>>> 04:32:55.250           318      1846     1012     743
>>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 
>>> 05:32:44.547           318      1846     1012     277
>>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 
>>> 06:32:37.340           318      1846     1012     121
>>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 
>>> 07:32:39.113           318      1846     1012     239
>>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 
>>> 09:33:01.200           318      1846     1012     512
>>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 
>>> 10:33:01.903           318      1846     1012     139
>>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 
>>> 11:32:56.560           318      1846     1012     722
>>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 
>>> 12:32:14.700            318      1846     1012     163
>>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 
>>> 13:32:59.500            318      1846     1012     284
>>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 
>>> 14:18:05.080            318      1846     1012     254
>>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 
>>> 16:32:53.437           1706     3541     1511     161
>>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 
>>> 17:32:57.950           1706     3541     1511     250
>>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 
>>> 18:32:54.337           1706     3541     1511     302
>>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 
>>> 19:32:37.950           1706     3541     1511     276
>>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 
>>> 20:32:57.433           1706     3541     1511     209
>>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 
>>> 21:30:44.540           1706     3541     1511     76
>>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 
>>> 22:33:00.603           1706     3541     1511     287
>>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 
>>> 23:59:31.607           1706     3541     1511     1877
>>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 
>>> 00:54:40.153           1706     3541     1511     1798
>>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 
>>> 01:59:37.920           1706     3541     1511     1125
>>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 
>>> 02:59:45.433           1706     3541     1511     1629
>>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 
>>> 03:59:54.920           1706     3541     1511     2462
>>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 
>>> 04:01:48.810           1706     3541     1511     70
>>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 
>>> 06:28:09.697           1726     3212     1428     46
>>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03
>>> 07:32:45.653           1726     3212     1428     251
>>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 
>>> 08:32:57.217           1726     3212     1428     284
>>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 
>>> 09:32:47.903           1726     3212     1428     227
>>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 
>>> 10:33:02.063           1726     3212     1428     285
>>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 
>>> 23:33:45.530           1726     3212     1428     1286
>>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 
>>> 00:33:08.107           1726     3212     1428     297
>>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 
>>> 01:32:48.300           1726     3212     1428     333
>>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 
>>> 02:32:48.780           1726     3212     1428     270
>>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 
>>> 03:32:56.043           1726     3212     1428     317
>>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 
>>> 04:33:14.947           1726     3212     1428     1623
>>> ================================================================================
>>>
>>> Description:
>>> 1. Column: some ID
>>> 2. Column: reference to another table
>>> 3. and 4. column: timestamp from/to of the item
>>> 5. Column: ID R
>>> 6. Column: ID L
>>> 7. Column: ID B
>>> 8. Column: Sum of components
>>>
>>> Requirement:
>>> Sum over all components (from column 8) for each combination of ID R, ID 
>>> L, ID B, but (!)
>>> rows with same keys (R,L,B) should be summed up only until the keys 
>>> change.
>>> Do not sum up the components for identical keys, if there are other keys 
>>> between them.
>>>
>>> Example result:
>>>
>>> idR       idL      idB                 SUM
>>> 1726     3212     1428                 462
>>> 318     1846     1012                 ...
>>> 1706     3541     1511                 ...
>>> 1726     3212     1428                 ...
>>>
>>>
>>> Note that the first and last entry here has the same keys
>>>
>>> Maybe you find a similar monster SQL solving such a requirement.
>>> Thanks for considering!
>>>
>>> Rainer
>>>
>>>
>>>
>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>> Howdy, Rainer.
>>>>
>>>> It's been a while, so I don't know if you are still interested in this 
>>>> problem or if you, in the meantime, found yourself a solution,
>>>> but I've tried this on a local copy of the example you provided   and 
>>>> it seems to work.
>>>>
>>>> The problem is that I suspect that if you have several thousands of 
>>>> records on your table it will become slow...
>>>>
>>>> Best,
>>>> Oliveiros
>>>>
>>>> SELECT SUM(tudo.parcela),tudo.a
>>>> FROM
>>>> (
>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>> FROM
>>>> (
>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>> FROM
>>>> yourTable se
>>>> LEFT JOIN
>>>> (
>>>> SELECT a.*
>>>> FROM yourTable a
>>>> JOIN yourTable b
>>>> ON (b.b <> a.b)
>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>
>>>> )
>>>> ) pr
>>>> ON pr.b = se.b
>>>> AND    se.c >= pr.c
>>>> GROUP BY se.a,se.b,se.c
>>>> ) fo
>>>> LEFT JOIN
>>>> (
>>>> SELECT a.*
>>>> FROM yourTable a
>>>> JOIN yourTable b
>>>> ON (b.b <> a.b)
>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>> )
>>>> ) th
>>>> ON fo.a = th.b
>>>> AND fo.b <= th.c
>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>> ) tudo
>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>
>>>>
>>>>
>>>>
>>>> To: <pgsql-sql@postgresql.org>
>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>> Subject: [SQL] grouping subsets
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>> having a table similar to
>>>>>
>>>>> | 1 | B | [2010-07-15 Do] |
>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>> |---+---+-----------------|
>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>> | 2 | C | [2010-07-18 So] |
>>>>> |---+---+-----------------|
>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>> | 1 | B | [2010-07-20 Di] |
>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>> | 1 | B | [2010-07-22 Do] |
>>>>> |---+---+-----------------|
>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>
>>>>> a simple group by gives me:
>>>>>
>>>>> | 6 | B |
>>>>> | 4 | C |
>>>>> | 3 | D |
>>>>>
>>>>>
>>>>> What I want to get is the values grouped by "subset", where a subset 
>>>>> is a set of rows with identical column until the colum changes.
>>>>> Is there a way to get
>>>>>
>>>>> | 2 | B |
>>>>> | 4 | C |
>>>>> | 4 | B |
>>>>> | 3 | D |
>>>>>
>>>>> by SQL only?
>>>>>
>>>>> - Rainer
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>> To make changes to your subscription:
>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>
>>>>
>> 



Re: grouping subsets

От
Rainer Stengele
Дата:
the table may include up to maybe 30 entries per day, average maybe 10-15
After a year this makes about 10.000 entries - maximum, average about 5000 entries.

For the problem described I have to use a Microsoft SQL database and would like to use pure SQL.
As I use postgres on my Linux servers I found this newsgroup and thought I ask here.

Thanks!
Rainer

Am 29.07.2010 15:31, schrieb Oliveiros d'Azevedo Cristina:
> Fine.
> 
> Please advice me,
> 
> How long can
> your table be? Thousands? Millions of records?
> 
> Do you really need it in pure SQL
> ?
> 
> It seems to me that it might be possible, I'm just affraid that the query would become too complex and thus slow...
> 
> Best,
> Oliveiros
> 
> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
> Cc: <pgsql-sql@postgresql.org>
> Sent: Thursday, July 29, 2010 1:10 PM
> Subject: Re: grouping subsets
> 
> 
>> No. This is by accident.
>> We have to assume that the combinations do change anytime, and many times per day.
>>
>> So
>>
>> "Or is it possible to have the same combination on one day with several sets?"
>>
>> YES!
>>
>> Rainer
>>
>>
>>
>> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>>> Yes. This is somewhat more complicated because it has more constraints.
>>> I've noticed that a given combination doesn't appear with holes on a certain day.
>>>
>>> For ex, on a daily basis, we have every three key combinations together.
>>>
>>> We dont have things like
>>> 2010-7-01  1726 3212 1428
>>> 2010-7-01  1726 3212 1428
>>> ...                318   1846 1012
>>> 2010-7-01  1726 3212 1428
>>>
>>> Can I assume that, for a certain day , the records for the same three combination are all together? There is just
oneset per day for a given combination?
 
>>>
>>> Or is it possible to have the same combination on one day with several sets?
>>>
>>> Best,
>>> Oliveiros
>>>
>>>
>>> ----- Original Message ----- From: "Rainer Stengele" <rainer.stengele@diplan.de>
>>> Newsgroups: gmane.comp.db.postgresql.sql
>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>> Cc: <>
>>> Sent: Thursday, July 29, 2010 10:41 AM
>>> Subject: Re: grouping subsets
>>>
>>>
>>>> Howdy Cristina,
>>>>
>>>> unfortunately things are more complicated. I have inserted an excerpt of the real data here:
>>>>
>>>> ================================================================================
>>>> TableID         MasterID dtBegin                           dtEnd idR idL      idB     consumption
>>>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 01:32:59.670           1726     3212
1428    279
 
>>>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 02:32:09.373           1726     3212
1428    183
 
>>>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 08:28:28.117           318      1846
1012    30
 
>>>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 10:31:34.393           318      1846
1012    487
 
>>>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 11:32:20.903           318      1846
1012    472
 
>>>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 12:31:35.730           318      1846
1012    195
 
>>>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 13:32:13.950           318      1846
1012    338
 
>>>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 14:02:57.710           318      1846
1012    105
 
>>>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 15:32:33.810           318      1846
1012    187
 
>>>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 16:02:27.260           318      1846
1012    108
 
>>>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 17:32:49.367           318      1846
1012    179
 
>>>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 18:29:01.733           318      1846
1012    256
 
>>>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 20:32:53.850           318      1846
1012    239
 
>>>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 21:29:13.497           318      1846
1012    39
 
>>>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 22:32:46.503           318      1846
1012    157
 
>>>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 23:32:22.543           318      1846
1012    218
 
>>>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 00:32:30.390           318      1846
1012    248
 
>>>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 01:32:41.983           318      1846
1012    204
 
>>>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 02:32:56.983           318      1846
1012    42
 
>>>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 03:32:28.013           318      1846
1012    135
 
>>>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 04:32:55.250           318      1846
1012    743
 
>>>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 05:32:44.547           318      1846
1012    277
 
>>>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 06:32:37.340           318      1846
1012    121
 
>>>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 07:32:39.113           318      1846
1012    239
 
>>>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 09:33:01.200           318      1846
1012    512
 
>>>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 10:33:01.903           318      1846
1012    139
 
>>>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 11:32:56.560           318      1846
1012    722
 
>>>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 12:32:14.700            318      1846
1012    163
 
>>>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 13:32:59.500            318      1846
1012    284
 
>>>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 14:18:05.080            318      1846
1012    254
 
>>>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 16:32:53.437           1706     3541
1511    161
 
>>>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 17:32:57.950           1706     3541
1511    250
 
>>>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 18:32:54.337           1706     3541
1511    302
 
>>>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 19:32:37.950           1706     3541
1511    276
 
>>>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 20:32:57.433           1706     3541
1511    209
 
>>>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 21:30:44.540           1706     3541
1511    76
 
>>>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 22:33:00.603           1706     3541
1511    287
 
>>>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 23:59:31.607           1706     3541
1511    1877
 
>>>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 00:54:40.153           1706     3541
1511    1798
 
>>>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 01:59:37.920           1706     3541
1511    1125
 
>>>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 02:59:45.433           1706     3541
1511    1629
 
>>>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 03:59:54.920           1706     3541
1511    2462
 
>>>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 04:01:48.810           1706     3541
1511    70 
>>>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 06:28:09.697           1726     3212
1428    46
 
>>>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 07:32:45.653           1726     3212
1428    251
 
>>>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 08:32:57.217           1726     3212
1428    284
 
>>>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 09:32:47.903           1726     3212
1428    227
 
>>>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 10:33:02.063           1726     3212
1428    285
 
>>>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 23:33:45.530           1726     3212
1428    1286
 
>>>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 00:33:08.107           1726     3212
1428    297
 
>>>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 01:32:48.300           1726     3212
1428    333
 
>>>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 02:32:48.780           1726     3212
1428    270
 
>>>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 03:32:56.043           1726     3212
1428    317
 
>>>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 04:33:14.947           1726     3212
1428    1623
 
>>>> ================================================================================
>>>>
>>>> Description:
>>>> 1. Column: some ID
>>>> 2. Column: reference to another table
>>>> 3. and 4. column: timestamp from/to of the item
>>>> 5. Column: ID R
>>>> 6. Column: ID L
>>>> 7. Column: ID B
>>>> 8. Column: Sum of components
>>>>
>>>> Requirement:
>>>> Sum over all components (from column 8) for each combination of ID R, ID L, ID B, but (!)
>>>> rows with same keys (R,L,B) should be summed up only until the keys change.
>>>> Do not sum up the components for identical keys, if there are other keys between them.
>>>>
>>>> Example result:
>>>>
>>>> idR       idL      idB                 SUM
>>>> 1726     3212     1428                 462
>>>> 318     1846     1012                 ...
>>>> 1706     3541     1511                 ...
>>>> 1726     3212     1428                 ...
>>>>
>>>>
>>>> Note that the first and last entry here has the same keys
>>>>
>>>> Maybe you find a similar monster SQL solving such a requirement.
>>>> Thanks for considering!
>>>>
>>>> Rainer
>>>>
>>>>
>>>>
>>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>>> Howdy, Rainer.
>>>>>
>>>>> It's been a while, so I don't know if you are still interested in this problem or if you, in the meantime, found
yourselfa solution,
 
>>>>> but I've tried this on a local copy of the example you provided   and it seems to work.
>>>>>
>>>>> The problem is that I suspect that if you have several thousands of records on your table it will become slow...
>>>>>
>>>>> Best,
>>>>> Oliveiros
>>>>>
>>>>> SELECT SUM(tudo.parcela),tudo.a
>>>>> FROM
>>>>> (
>>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>>> FROM
>>>>> (
>>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>>> FROM
>>>>> yourTable se
>>>>> LEFT JOIN
>>>>> (
>>>>> SELECT a.*
>>>>> FROM yourTable a
>>>>> JOIN yourTable b
>>>>> ON (b.b <> a.b)
>>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>>
>>>>> )
>>>>> ) pr
>>>>> ON pr.b = se.b
>>>>> AND    se.c >= pr.c
>>>>> GROUP BY se.a,se.b,se.c
>>>>> ) fo
>>>>> LEFT JOIN
>>>>> (
>>>>> SELECT a.*
>>>>> FROM yourTable a
>>>>> JOIN yourTable b
>>>>> ON (b.b <> a.b)
>>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>>> )
>>>>> ) th
>>>>> ON fo.a = th.b
>>>>> AND fo.b <= th.c
>>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>>> ) tudo
>>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> To: <pgsql-sql@postgresql.org>
>>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>>> Subject: [SQL] grouping subsets
>>>>>
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> having a table similar to
>>>>>>
>>>>>> | 1 | B | [2010-07-15 Do] |
>>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>>> |---+---+-----------------|
>>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>>> | 2 | C | [2010-07-18 So] |
>>>>>> |---+---+-----------------|
>>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>>> | 1 | B | [2010-07-20 Di] |
>>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>>> | 1 | B | [2010-07-22 Do] |
>>>>>> |---+---+-----------------|
>>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>>
>>>>>> a simple group by gives me:
>>>>>>
>>>>>> | 6 | B |
>>>>>> | 4 | C |
>>>>>> | 3 | D |
>>>>>>
>>>>>>
>>>>>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column
untilthe colum changes.
 
>>>>>> Is there a way to get
>>>>>>
>>>>>> | 2 | B |
>>>>>> | 4 | C |
>>>>>> | 4 | B |
>>>>>> | 3 | D |
>>>>>>
>>>>>> by SQL only?
>>>>>>
>>>>>> - Rainer
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>>> To make changes to your subscription:
>>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>>
>>>>>
>>>
> 
> 



Re: grouping subsets

От
"Oliveiros d'Azevedo Cristina"
Дата:
I See.

And the analysis you need to do, the sum of the rows with the same keys 
(until they change) will have to be done over all
table?
Or just over some predefined interval ?

Best,
Oliveiros
----- Original Message ----- 
From: "Rainer Stengele" <rainer.stengele@diplan.de>
Newsgroups: gmane.comp.db.postgresql.sql
To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
Cc: <pgsql-sql@postgresql.org>
Sent: Friday, July 30, 2010 10:35 AM
Subject: Re: grouping subsets


> the table may include up to maybe 30 entries per day, average maybe 10-15
> After a year this makes about 10.000 entries - maximum, average about 5000 
> entries.
>
> For the problem described I have to use a Microsoft SQL database and would 
> like to use pure SQL.
> As I use postgres on my Linux servers I found this newsgroup and thought I 
> ask here.
>
> Thanks!
> Rainer
>
> Am 29.07.2010 15:31, schrieb Oliveiros d'Azevedo Cristina:
>> Fine.
>>
>> Please advice me,
>>
>> How long can
>> your table be? Thousands? Millions of records?
>>
>> Do you really need it in pure SQL
>> ?
>>
>> It seems to me that it might be possible, I'm just affraid that the query 
>> would become too complex and thus slow...
>>
>> Best,
>> Oliveiros
>>
>> ----- Original Message ----- From: "Rainer Stengele" 
>> <rainer.stengele@diplan.de>
>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>> Cc: <pgsql-sql@postgresql.org>
>> Sent: Thursday, July 29, 2010 1:10 PM
>> Subject: Re: grouping subsets
>>
>>
>>> No. This is by accident.
>>> We have to assume that the combinations do change anytime, and many 
>>> times per day.
>>>
>>> So
>>>
>>> "Or is it possible to have the same combination on one day with several 
>>> sets?"
>>>
>>> YES!
>>>
>>> Rainer
>>>
>>>
>>>
>>> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>>>> Yes. This is somewhat more complicated because it has more constraints.
>>>> I've noticed that a given combination doesn't appear with holes on a 
>>>> certain day.
>>>>
>>>> For ex, on a daily basis, we have every three key combinations 
>>>> together.
>>>>
>>>> We dont have things like
>>>> 2010-7-01  1726 3212 1428
>>>> 2010-7-01  1726 3212 1428
>>>> ...                318   1846 1012
>>>> 2010-7-01  1726 3212 1428
>>>>
>>>> Can I assume that, for a certain day , the records for the same three 
>>>> combination are all together? There is just one set per day for a given 
>>>> combination?
>>>>
>>>> Or is it possible to have the same combination on one day with several 
>>>> sets?
>>>>
>>>> Best,
>>>> Oliveiros
>>>>
>>>>
>>>> ----- Original Message ----- From: "Rainer Stengele" 
>>>> <rainer.stengele@diplan.de>
>>>> Newsgroups: gmane.comp.db.postgresql.sql
>>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>>> Cc: <>
>>>> Sent: Thursday, July 29, 2010 10:41 AM
>>>> Subject: Re: grouping subsets
>>>>
>>>>
>>>>> Howdy Cristina,
>>>>>
>>>>> unfortunately things are more complicated. I have inserted an excerpt 
>>>>> of the real data here:
>>>>>
>>>>> ================================================================================
>>>>> TableID         MasterID dtBegin                           dtEnd idR 
>>>>> idL      idB     consumption
>>>>> 4057312           295530 2010-07-01 00:59:21.077           2010-07-01 
>>>>> 01:32:59.670           1726     3212     1428     279
>>>>> 4061043           295574 2010-07-01 01:59:31.137           2010-07-01 
>>>>> 02:32:09.373           1726     3212     1428     183
>>>>> 4083397           295838 2010-07-01 07:57:51.327           2010-07-01 
>>>>> 08:28:28.117           318      1846     1012     30
>>>>> 4090858           295920 2010-07-01 09:52:33.777           2010-07-01 
>>>>> 10:31:34.393           318      1846     1012     487
>>>>> 4094589           295961 2010-07-01 10:47:59.370           2010-07-01 
>>>>> 11:32:20.903           318      1846     1012     472
>>>>> 4098330           296013 2010-07-01 11:58:53.890           2010-07-01 
>>>>> 12:31:35.730           318      1846     1012     195
>>>>> 4102069           296058 2010-07-01 12:36:19.170           2010-07-01 
>>>>> 13:32:13.950           318      1846     1012     338
>>>>> 4105809           296102 2010-07-01 13:58:53.170           2010-07-01 
>>>>> 14:02:57.710           318      1846     1012     105
>>>>> 4109555           296150 2010-07-01 14:59:11.663           2010-07-01 
>>>>> 15:32:33.810           318      1846     1012     187
>>>>> 4113305           296194 2010-07-01 15:59:01.797           2010-07-01 
>>>>> 16:02:27.260           318      1846     1012     108
>>>>> 4117048           296238 2010-07-01 16:20:47.997           2010-07-01 
>>>>> 17:32:49.367           318      1846     1012     179
>>>>> 4120791           296282 2010-07-01 17:58:27.657           2010-07-01 
>>>>> 18:29:01.733           318      1846     1012     256
>>>>> 4128291           296370 2010-07-01 19:54:17.687           2010-07-01 
>>>>> 20:32:53.850           318      1846     1012     239
>>>>> 4132044           296413 2010-07-01 20:31:37.653           2010-07-01 
>>>>> 21:29:13.497           318      1846     1012     39
>>>>> 4135797           296458 2010-07-01 21:59:13.983           2010-07-01 
>>>>> 22:32:46.503           318      1846     1012     157
>>>>> 4139572           296506 2010-07-01 22:58:49.530           2010-07-01 
>>>>> 23:32:22.543           318      1846     1012     218
>>>>> 4142941           296554 2010-07-01 23:59:13.857           2010-07-02 
>>>>> 00:32:30.390           318      1846     1012     248
>>>>> 4146289           296598 2010-07-02 00:58:55.763           2010-07-02 
>>>>> 01:32:41.983           318      1846     1012     204
>>>>> 4149616           296642 2010-07-02 01:46:57.357           2010-07-02 
>>>>> 02:32:56.983           318      1846     1012     42
>>>>> 4152952           296686 2010-07-02 02:55:19.653           2010-07-02 
>>>>> 03:32:28.013           318      1846     1012     135
>>>>> 4156289           296730 2010-07-02 03:43:52.777           2010-07-02 
>>>>> 04:32:55.250           318      1846     1012     743
>>>>> 4159624           296774 2010-07-02 04:43:15.310           2010-07-02 
>>>>> 05:32:44.547           318      1846     1012     277
>>>>> 4162961           296817 2010-07-02 05:58:59.483           2010-07-02 
>>>>> 06:32:37.340           318      1846     1012     121
>>>>> 4166303           296862 2010-07-02 06:58:50.733           2010-07-02 
>>>>> 07:32:39.113           318      1846     1012     239
>>>>> 4172981           296950 2010-07-02 07:28:55.293           2010-07-02 
>>>>> 09:33:01.200           318      1846     1012     512
>>>>> 4176322           296993 2010-07-02 09:59:04.607           2010-07-02 
>>>>> 10:33:01.903           318      1846     1012     139
>>>>> 4179667           297038 2010-07-02 10:55:27.760           2010-07-02 
>>>>> 11:32:56.560           318      1846     1012     722
>>>>> 4183012           297082 2010-07-02 11:59:33.650           2010-07-02 
>>>>> 12:32:14.700            318      1846     1012     163
>>>>> 4186351           297126 2010-07-02 12:23:45.997           2010-07-02 
>>>>> 13:32:59.500            318      1846     1012     284
>>>>> 4189689           297169 2010-07-02 13:44:21.253           2010-07-02 
>>>>> 14:18:05.080            318      1846     1012     254
>>>>> 4196371           297258 2010-07-02 16:16:19.123           2010-07-02 
>>>>> 16:32:53.437           1706     3541     1511     161
>>>>> 4199720           297301 2010-07-02 16:59:35.127           2010-07-02 
>>>>> 17:32:57.950           1706     3541     1511     250
>>>>> 4203068           297346 2010-07-02 17:59:34.027           2010-07-02 
>>>>> 18:32:54.337           1706     3541     1511     302
>>>>> 4206413           297389 2010-07-02 18:59:28.730           2010-07-02 
>>>>> 19:32:37.950           1706     3541     1511     276
>>>>> 4209758           297434 2010-07-02 19:54:00.243           2010-07-02 
>>>>> 20:32:57.433           1706     3541     1511     209
>>>>> 4213102           297473 2010-07-02 20:49:10.963           2010-07-02 
>>>>> 21:30:44.540           1706     3541     1511     76
>>>>> 4216447           297511 2010-07-02 21:59:34.810           2010-07-02 
>>>>> 22:33:00.603           1706     3541     1511     287
>>>>> 4219818           297569 2010-07-02 22:56:52.750           2010-07-02 
>>>>> 23:59:31.607           1706     3541     1511     1877
>>>>> 4219819           297570 2010-07-02 23:59:21.577           2010-07-03 
>>>>> 00:54:40.153           1706     3541     1511     1798
>>>>> 4219821           297572 2010-07-03 00:48:03.310           2010-07-03 
>>>>> 01:59:37.920           1706     3541     1511     1125
>>>>> 4219823           297574 2010-07-03 01:51:01.057           2010-07-03 
>>>>> 02:59:45.433           1706     3541     1511     1629
>>>>> 4219820           297571 2010-07-03 02:59:29.393           2010-07-03 
>>>>> 03:59:54.920           1706     3541     1511     2462
>>>>> 4219822           297573 2010-07-03 03:59:18.663           2010-07-03 
>>>>> 04:01:48.810           1706     3541     1511     70
>>>>> 4225738           297656 2010-07-03 06:13:34.980           2010-07-03 
>>>>> 06:28:09.697           1726     3212     1428     46
>>>>> 4228694           297695 2010-07-03 06:59:15.560           2010-07-03 
>>>>> 07:32:45.653           1726     3212     1428     251
>>>>> 4231649           297733 2010-07-03 07:59:11.937           2010-07-03 
>>>>> 08:32:57.217           1726     3212     1428     284
>>>>> 4234604           297771 2010-07-03 08:57:00.357           2010-07-03 
>>>>> 09:32:47.903           1726     3212     1428     227
>>>>> 4237559           297809 2010-07-03 09:59:19.813           2010-07-03 
>>>>> 10:33:02.063           1726     3212     1428     285
>>>>> 4261156           298596 2010-07-04 22:59:09.863           2010-07-04 
>>>>> 23:33:45.530           1726     3212     1428     1286
>>>>> 4264114           298646 2010-07-04 23:59:16.967           2010-07-05 
>>>>> 00:33:08.107           1726     3212     1428     297
>>>>> 4267067           298690 2010-07-05 00:59:15.187           2010-07-05 
>>>>> 01:32:48.300           1726     3212     1428     333
>>>>> 4270023           298734 2010-07-05 01:59:02.497           2010-07-05 
>>>>> 02:32:48.780           1726     3212     1428     270
>>>>> 4272977           298778 2010-07-05 02:41:43.737           2010-07-05 
>>>>> 03:32:56.043           1726     3212     1428     317
>>>>> 4275927           298822 2010-07-05 03:59:17.027           2010-07-05 
>>>>> 04:33:14.947           1726     3212     1428     1623
>>>>> ================================================================================
>>>>>
>>>>> Description:
>>>>> 1. Column: some ID
>>>>> 2. Column: reference to another table
>>>>> 3. and 4. column: timestamp from/to of the item
>>>>> 5. Column: ID R
>>>>> 6. Column: ID L
>>>>> 7. Column: ID B
>>>>> 8. Column: Sum of components
>>>>>
>>>>> Requirement:
>>>>> Sum over all components (from column 8) for each combination of ID R, 
>>>>> ID L, ID B, but (!)
>>>>> rows with same keys (R,L,B) should be summed up only until the keys 
>>>>> change.
>>>>> Do not sum up the components for identical keys, if there are other 
>>>>> keys between them.
>>>>>
>>>>> Example result:
>>>>>
>>>>> idR       idL      idB                 SUM
>>>>> 1726     3212     1428                 462
>>>>> 318     1846     1012                 ...
>>>>> 1706     3541     1511                 ...
>>>>> 1726     3212     1428                 ...
>>>>>
>>>>>
>>>>> Note that the first and last entry here has the same keys
>>>>>
>>>>> Maybe you find a similar monster SQL solving such a requirement.
>>>>> Thanks for considering!
>>>>>
>>>>> Rainer
>>>>>
>>>>>
>>>>>
>>>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>>>> Howdy, Rainer.
>>>>>>
>>>>>> It's been a while, so I don't know if you are still interested in 
>>>>>> this problem or if you, in the meantime, found yourself a solution,
>>>>>> but I've tried this on a local copy of the example you provided   and 
>>>>>> it seems to work.
>>>>>>
>>>>>> The problem is that I suspect that if you have several thousands of 
>>>>>> records on your table it will become slow...
>>>>>>
>>>>>> Best,
>>>>>> Oliveiros
>>>>>>
>>>>>> SELECT SUM(tudo.parcela),tudo.a
>>>>>> FROM
>>>>>> (
>>>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>>>> FROM
>>>>>> (
>>>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>>>> FROM
>>>>>> yourTable se
>>>>>> LEFT JOIN
>>>>>> (
>>>>>> SELECT a.*
>>>>>> FROM yourTable a
>>>>>> JOIN yourTable b
>>>>>> ON (b.b <> a.b)
>>>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>>>
>>>>>> )
>>>>>> ) pr
>>>>>> ON pr.b = se.b
>>>>>> AND    se.c >= pr.c
>>>>>> GROUP BY se.a,se.b,se.c
>>>>>> ) fo
>>>>>> LEFT JOIN
>>>>>> (
>>>>>> SELECT a.*
>>>>>> FROM yourTable a
>>>>>> JOIN yourTable b
>>>>>> ON (b.b <> a.b)
>>>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>>>> )
>>>>>> ) th
>>>>>> ON fo.a = th.b
>>>>>> AND fo.b <= th.c
>>>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>>>> ) tudo
>>>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> To: <pgsql-sql@postgresql.org>
>>>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>>>> Subject: [SQL] grouping subsets
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> having a table similar to
>>>>>>>
>>>>>>> | 1 | B | [2010-07-15 Do] |
>>>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>>>> | 2 | C | [2010-07-18 So] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>>>> | 1 | B | [2010-07-20 Di] |
>>>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>>>> | 1 | B | [2010-07-22 Do] |
>>>>>>> |---+---+-----------------|
>>>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>>>
>>>>>>> a simple group by gives me:
>>>>>>>
>>>>>>> | 6 | B |
>>>>>>> | 4 | C |
>>>>>>> | 3 | D |
>>>>>>>
>>>>>>>
>>>>>>> What I want to get is the values grouped by "subset", where a subset 
>>>>>>> is a set of rows with identical column until the colum changes.
>>>>>>> Is there a way to get
>>>>>>>
>>>>>>> | 2 | B |
>>>>>>> | 4 | C |
>>>>>>> | 4 | B |
>>>>>>> | 3 | D |
>>>>>>>
>>>>>>> by SQL only?
>>>>>>>
>>>>>>> - Rainer
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>>>> To make changes to your subscription:
>>>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>>>
>>>>>>
>>>>
>>
>> 



Re: grouping subsets

От
Joshua Tolley
Дата:
On Thu, Jul 22, 2010 at 11:31:23AM +0000, Tim Landscheidt wrote:
> Richard Huxton <dev@archonet.com> wrote:
>
> >>> What I want to get is the values grouped by "subset", where a subset is a set of rows with identical column until
thecolum changes. 
> >>> Is there a way to get
>
> >>> | 2 | B |
> >>> | 4 | C |
> >>> | 4 | B |
> >>> | 3 | D |
>
> >>> by SQL only?
>
> >> I think, the problem is that there are 2 identical groups. I think, you
> >> can write a pl/pgsql-proc, selecting all ordered by the date-field and
> >> walking through the result to do the grouping, checking if the 2nd
> >> column is different from the previous.
>
> >> With plain SQL it's maybe possible too, but i don't know how ...
>
> > It should be do-able in 8.4 onwards, look into windowing
> > functions. In particular the lag() function:
>
> > SELECT
> >     mycode,
> >     mydate,
> >     lag(mycode) OVER (ORDER BY mydate) AS prev_code
> > FROM
> >     mytable
> > ORDER BY mydate;
>
> > It should be possible to use that as a subquery with an
> > outer query that compares mycode=prev_code to get a run
> > length.
>
> Hmmm. Can the outer query be done without using "WITH
> RECURSIVE"?

How about this:

select   a, b, c, d, sum
from (   select       a, b, c, d, new_partition,       sum(e) over (partition by partition_num)    from (       select
        a, b, c, d, e,           case when               lag(a, 1, null) over (order by d) is null or
lag(a,1, null) over (order by d) != a or               lag(b, 1, null) over (order by d) is null or
lag(b,1, null) over (order by d) != b or               lag(c, 1, null) over (order by d) is null or
lag(c,1, null) over (order by d) != c                   then nextval('a')               else currval('a')           end
aspartition_num,           case when               lag(a, 1, null) over (order by d) is null or               lag(a, 1,
null)over (order by d) != a or               lag(b, 1, null) over (order by d) is null or               lag(b, 1, null)
over(order by d) != b or               lag(c, 1, null) over (order by d) is null or               lag(c, 1, null) over
(orderby d) != c                   then 'T'::boolean               else 'f'::boolean           end as new_partition
 from foo   ) bar 
) baz
where   new_partition;

Here's my test table:
5432 josh@postgres# select * from foo;a | b | c |           d            | e
---+---+---+------------------------+---1 | 9 | 1 | 2007-01-01 00:00:05-07 | 89 | 2 | 1 | 2007-01-01 00:00:10-07 | 49 |
2| 1 | 2007-01-01 00:00:15-07 | 69 | 2 | 1 | 2007-01-01 00:00:20-07 | 26 | 5 | 7 | 2007-01-01 00:00:25-07 | 34 | 9 | 0
|2007-01-01 00:00:30-07 | 04 | 9 | 0 | 2007-01-01 00:00:35-07 | 75 | 2 | 7 | 2007-01-01 00:01:25-07 | 75 | 2 | 7 |
2007-01-0100:01:30-07 | 75 | 2 | 7 | 2007-01-01 00:01:35-07 | 95 | 2 | 7 | 2007-01-01 00:01:40-07 | 25 | 2 | 7 |
2007-01-0100:01:45-07 | 55 | 2 | 7 | 2007-01-01 00:01:50-07 | 85 | 2 | 7 | 2007-01-01 00:01:55-07 | 55 | 2 | 7 |
2007-01-0100:02:00-07 | 97 | 8 | 8 | 2007-01-01 00:02:05-07 | 77 | 8 | 8 | 2007-01-01 00:02:10-07 | 89 | 3 | 0 |
2007-01-0100:02:15-07 | 09 | 3 | 0 | 2007-01-01 00:02:20-07 | 89 | 2 | 1 | 2007-01-01 00:02:25-07 | 39 | 2 | 1 |
2007-01-0100:02:30-07 | 3 
(21 rows)

...and these results...a | b | c |           d            | sum
---+---+---+------------------------+-----1 | 9 | 1 | 2007-01-01 00:00:05-07 |   89 | 2 | 1 | 2007-01-01 00:00:10-07 |
126| 5 | 7 | 2007-01-01 00:00:25-07 |   34 | 9 | 0 | 2007-01-01 00:00:30-07 |   75 | 2 | 7 | 2007-01-01 00:01:25-07 |
527| 8 | 8 | 2007-01-01 00:02:05-07 |  159 | 3 | 0 | 2007-01-01 00:02:15-07 |   89 | 2 | 1 | 2007-01-01 00:02:25-07 |
6
(8 rows)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com

Re: grouping subsets

От
Rainer Stengele
Дата:
 The analysis will have to be done over a calendar range which the user 
selects via web interface. So - yes - normally not the whole table, but
maybe he chooses one month and thats already a lot of rows in the table ...

Best, too,
Rainer

Am 30.07.2010 17:07, schrieb Oliveiros d'Azevedo Cristina:
> I See.
>
> And the analysis you need to do, the sum of the rows with the same
> keys (until they change) will have to be done over all
> table?
> Or just over some predefined interval ?
>
> Best,
> Oliveiros
> ----- Original Message ----- From: "Rainer Stengele"
> <rainer.stengele@diplan.de>
> Newsgroups: gmane.comp.db.postgresql.sql
> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
> Cc: <pgsql-sql@postgresql.org>
> Sent: Friday, July 30, 2010 10:35 AM
> Subject: Re: grouping subsets
>
>
>> the table may include up to maybe 30 entries per day, average maybe
>> 10-15
>> After a year this makes about 10.000 entries - maximum, average about
>> 5000 entries.
>>
>> For the problem described I have to use a Microsoft SQL database and
>> would like to use pure SQL.
>> As I use postgres on my Linux servers I found this newsgroup and
>> thought I ask here.
>>
>> Thanks!
>> Rainer
>>
>> Am 29.07.2010 15:31, schrieb Oliveiros d'Azevedo Cristina:
>>> Fine.
>>>
>>> Please advice me,
>>>
>>> How long can
>>> your table be? Thousands? Millions of records?
>>>
>>> Do you really need it in pure SQL
>>> ?
>>>
>>> It seems to me that it might be possible, I'm just affraid that the
>>> query would become too complex and thus slow...
>>>
>>> Best,
>>> Oliveiros
>>>
>>> ----- Original Message ----- From: "Rainer Stengele"
>>> <rainer.stengele@diplan.de>
>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>> Cc: <pgsql-sql@postgresql.org>
>>> Sent: Thursday, July 29, 2010 1:10 PM
>>> Subject: Re: grouping subsets
>>>
>>>
>>>> No. This is by accident.
>>>> We have to assume that the combinations do change anytime, and many
>>>> times per day.
>>>>
>>>> So
>>>>
>>>> "Or is it possible to have the same combination on one day with
>>>> several sets?"
>>>>
>>>> YES!
>>>>
>>>> Rainer
>>>>
>>>>
>>>>
>>>> Am 29.07.2010 13:47, schrieb Oliveiros d'Azevedo Cristina:
>>>>> Yes. This is somewhat more complicated because it has more
>>>>> constraints.
>>>>> I've noticed that a given combination doesn't appear with holes on
>>>>> a certain day.
>>>>>
>>>>> For ex, on a daily basis, we have every three key combinations
>>>>> together.
>>>>>
>>>>> We dont have things like
>>>>> 2010-7-01  1726 3212 1428
>>>>> 2010-7-01  1726 3212 1428
>>>>> ...                318   1846 1012
>>>>> 2010-7-01  1726 3212 1428
>>>>>
>>>>> Can I assume that, for a certain day , the records for the same
>>>>> three combination are all together? There is just one set per day
>>>>> for a given combination?
>>>>>
>>>>> Or is it possible to have the same combination on one day with
>>>>> several sets?
>>>>>
>>>>> Best,
>>>>> Oliveiros
>>>>>
>>>>>
>>>>> ----- Original Message ----- From: "Rainer Stengele"
>>>>> <rainer.stengele@diplan.de>
>>>>> Newsgroups: gmane.comp.db.postgresql.sql
>>>>> To: "Oliveiros d'Azevedo Cristina" <oliveiros.cristina@marktest.pt>
>>>>> Cc: <>
>>>>> Sent: Thursday, July 29, 2010 10:41 AM
>>>>> Subject: Re: grouping subsets
>>>>>
>>>>>
>>>>>> Howdy Cristina,
>>>>>>
>>>>>> unfortunately things are more complicated. I have inserted an
>>>>>> excerpt of the real data here:
>>>>>>
>>>>>> ================================================================================
>>>>>>
>>>>>> TableID         MasterID dtBegin                           dtEnd
>>>>>> idR idL      idB     consumption
>>>>>> 4057312           295530 2010-07-01 00:59:21.077          
>>>>>> 2010-07-01 01:32:59.670           1726     3212     1428     279
>>>>>> 4061043           295574 2010-07-01 01:59:31.137          
>>>>>> 2010-07-01 02:32:09.373           1726     3212     1428     183
>>>>>> 4083397           295838 2010-07-01 07:57:51.327          
>>>>>> 2010-07-01 08:28:28.117           318      1846     1012     30
>>>>>> 4090858           295920 2010-07-01 09:52:33.777          
>>>>>> 2010-07-01 10:31:34.393           318      1846     1012     487
>>>>>> 4094589           295961 2010-07-01 10:47:59.370          
>>>>>> 2010-07-01 11:32:20.903           318      1846     1012     472
>>>>>> 4098330           296013 2010-07-01 11:58:53.890          
>>>>>> 2010-07-01 12:31:35.730           318      1846     1012     195
>>>>>> 4102069           296058 2010-07-01 12:36:19.170          
>>>>>> 2010-07-01 13:32:13.950           318      1846     1012     338
>>>>>> 4105809           296102 2010-07-01 13:58:53.170          
>>>>>> 2010-07-01 14:02:57.710           318      1846     1012     105
>>>>>> 4109555           296150 2010-07-01 14:59:11.663          
>>>>>> 2010-07-01 15:32:33.810           318      1846     1012     187
>>>>>> 4113305           296194 2010-07-01 15:59:01.797          
>>>>>> 2010-07-01 16:02:27.260           318      1846     1012     108
>>>>>> 4117048           296238 2010-07-01 16:20:47.997          
>>>>>> 2010-07-01 17:32:49.367           318      1846     1012     179
>>>>>> 4120791           296282 2010-07-01 17:58:27.657          
>>>>>> 2010-07-01 18:29:01.733           318      1846     1012     256
>>>>>> 4128291           296370 2010-07-01 19:54:17.687          
>>>>>> 2010-07-01 20:32:53.850           318      1846     1012     239
>>>>>> 4132044           296413 2010-07-01 20:31:37.653          
>>>>>> 2010-07-01 21:29:13.497           318      1846     1012     39
>>>>>> 4135797           296458 2010-07-01 21:59:13.983          
>>>>>> 2010-07-01 22:32:46.503           318      1846     1012     157
>>>>>> 4139572           296506 2010-07-01 22:58:49.530          
>>>>>> 2010-07-01 23:32:22.543           318      1846     1012     218
>>>>>> 4142941           296554 2010-07-01 23:59:13.857          
>>>>>> 2010-07-02 00:32:30.390           318      1846     1012     248
>>>>>> 4146289           296598 2010-07-02 00:58:55.763          
>>>>>> 2010-07-02 01:32:41.983           318      1846     1012     204
>>>>>> 4149616           296642 2010-07-02 01:46:57.357          
>>>>>> 2010-07-02 02:32:56.983           318      1846     1012     42
>>>>>> 4152952           296686 2010-07-02 02:55:19.653          
>>>>>> 2010-07-02 03:32:28.013           318      1846     1012     135
>>>>>> 4156289           296730 2010-07-02 03:43:52.777          
>>>>>> 2010-07-02 04:32:55.250           318      1846     1012     743
>>>>>> 4159624           296774 2010-07-02 04:43:15.310          
>>>>>> 2010-07-02 05:32:44.547           318      1846     1012     277
>>>>>> 4162961           296817 2010-07-02 05:58:59.483          
>>>>>> 2010-07-02 06:32:37.340           318      1846     1012     121
>>>>>> 4166303           296862 2010-07-02 06:58:50.733          
>>>>>> 2010-07-02 07:32:39.113           318      1846     1012     239
>>>>>> 4172981           296950 2010-07-02 07:28:55.293          
>>>>>> 2010-07-02 09:33:01.200           318      1846     1012     512
>>>>>> 4176322           296993 2010-07-02 09:59:04.607          
>>>>>> 2010-07-02 10:33:01.903           318      1846     1012     139
>>>>>> 4179667           297038 2010-07-02 10:55:27.760          
>>>>>> 2010-07-02 11:32:56.560           318      1846     1012     722
>>>>>> 4183012           297082 2010-07-02 11:59:33.650          
>>>>>> 2010-07-02 12:32:14.700            318      1846     1012     163
>>>>>> 4186351           297126 2010-07-02 12:23:45.997          
>>>>>> 2010-07-02 13:32:59.500            318      1846     1012     284
>>>>>> 4189689           297169 2010-07-02 13:44:21.253          
>>>>>> 2010-07-02 14:18:05.080            318      1846     1012     254
>>>>>> 4196371           297258 2010-07-02 16:16:19.123          
>>>>>> 2010-07-02 16:32:53.437           1706     3541     1511     161
>>>>>> 4199720           297301 2010-07-02 16:59:35.127          
>>>>>> 2010-07-02 17:32:57.950           1706     3541     1511     250
>>>>>> 4203068           297346 2010-07-02 17:59:34.027          
>>>>>> 2010-07-02 18:32:54.337           1706     3541     1511     302
>>>>>> 4206413           297389 2010-07-02 18:59:28.730          
>>>>>> 2010-07-02 19:32:37.950           1706     3541     1511     276
>>>>>> 4209758           297434 2010-07-02 19:54:00.243          
>>>>>> 2010-07-02 20:32:57.433           1706     3541     1511     209
>>>>>> 4213102           297473 2010-07-02 20:49:10.963          
>>>>>> 2010-07-02 21:30:44.540           1706     3541     1511     76
>>>>>> 4216447           297511 2010-07-02 21:59:34.810          
>>>>>> 2010-07-02 22:33:00.603           1706     3541     1511     287
>>>>>> 4219818           297569 2010-07-02 22:56:52.750          
>>>>>> 2010-07-02 23:59:31.607           1706     3541     1511     1877
>>>>>> 4219819           297570 2010-07-02 23:59:21.577          
>>>>>> 2010-07-03 00:54:40.153           1706     3541     1511     1798
>>>>>> 4219821           297572 2010-07-03 00:48:03.310          
>>>>>> 2010-07-03 01:59:37.920           1706     3541     1511     1125
>>>>>> 4219823           297574 2010-07-03 01:51:01.057          
>>>>>> 2010-07-03 02:59:45.433           1706     3541     1511     1629
>>>>>> 4219820           297571 2010-07-03 02:59:29.393          
>>>>>> 2010-07-03 03:59:54.920           1706     3541     1511     2462
>>>>>> 4219822           297573 2010-07-03 03:59:18.663          
>>>>>> 2010-07-03 04:01:48.810           1706     3541     1511     70
>>>>>> 4225738           297656 2010-07-03 06:13:34.980          
>>>>>> 2010-07-03 06:28:09.697           1726     3212     1428     46
>>>>>> 4228694           297695 2010-07-03 06:59:15.560          
>>>>>> 2010-07-03 07:32:45.653           1726     3212     1428     251
>>>>>> 4231649           297733 2010-07-03 07:59:11.937          
>>>>>> 2010-07-03 08:32:57.217           1726     3212     1428     284
>>>>>> 4234604           297771 2010-07-03 08:57:00.357          
>>>>>> 2010-07-03 09:32:47.903           1726     3212     1428     227
>>>>>> 4237559           297809 2010-07-03 09:59:19.813          
>>>>>> 2010-07-03 10:33:02.063           1726     3212     1428     285
>>>>>> 4261156           298596 2010-07-04 22:59:09.863          
>>>>>> 2010-07-04 23:33:45.530           1726     3212     1428     1286
>>>>>> 4264114           298646 2010-07-04 23:59:16.967          
>>>>>> 2010-07-05 00:33:08.107           1726     3212     1428     297
>>>>>> 4267067           298690 2010-07-05 00:59:15.187          
>>>>>> 2010-07-05 01:32:48.300           1726     3212     1428     333
>>>>>> 4270023           298734 2010-07-05 01:59:02.497          
>>>>>> 2010-07-05 02:32:48.780           1726     3212     1428     270
>>>>>> 4272977           298778 2010-07-05 02:41:43.737          
>>>>>> 2010-07-05 03:32:56.043           1726     3212     1428     317
>>>>>> 4275927           298822 2010-07-05 03:59:17.027          
>>>>>> 2010-07-05 04:33:14.947           1726     3212     1428     1623
>>>>>> ================================================================================
>>>>>>
>>>>>>
>>>>>> Description:
>>>>>> 1. Column: some ID
>>>>>> 2. Column: reference to another table
>>>>>> 3. and 4. column: timestamp from/to of the item
>>>>>> 5. Column: ID R
>>>>>> 6. Column: ID L
>>>>>> 7. Column: ID B
>>>>>> 8. Column: Sum of components
>>>>>>
>>>>>> Requirement:
>>>>>> Sum over all components (from column 8) for each combination of
>>>>>> ID R, ID L, ID B, but (!)
>>>>>> rows with same keys (R,L,B) should be summed up only until the
>>>>>> keys change.
>>>>>> Do not sum up the components for identical keys, if there are
>>>>>> other keys between them.
>>>>>>
>>>>>> Example result:
>>>>>>
>>>>>> idR       idL      idB                 SUM
>>>>>> 1726     3212     1428                 462
>>>>>> 318     1846     1012                 ...
>>>>>> 1706     3541     1511                 ...
>>>>>> 1726     3212     1428                 ...
>>>>>>
>>>>>>
>>>>>> Note that the first and last entry here has the same keys
>>>>>>
>>>>>> Maybe you find a similar monster SQL solving such a requirement.
>>>>>> Thanks for considering!
>>>>>>
>>>>>> Rainer
>>>>>>
>>>>>>
>>>>>>
>>>>>> Am 27.07.2010 12:37, schrieb Oliveiros d'Azevedo Cristina:
>>>>>>> Howdy, Rainer.
>>>>>>>
>>>>>>> It's been a while, so I don't know if you are still interested
>>>>>>> in this problem or if you, in the meantime, found yourself a
>>>>>>> solution,
>>>>>>> but I've tried this on a local copy of the example you
>>>>>>> provided   and it seems to work.
>>>>>>>
>>>>>>> The problem is that I suspect that if you have several thousands
>>>>>>> of records on your table it will become slow...
>>>>>>>
>>>>>>> Best,
>>>>>>> Oliveiros
>>>>>>>
>>>>>>> SELECT SUM(tudo.parcela),tudo.a
>>>>>>> FROM
>>>>>>> (
>>>>>>> SELECT fo.parcela,fo.a,fo.b,fo.c,MIN(th.c) as d
>>>>>>> FROM
>>>>>>> (
>>>>>>> SELECT se.a as parcela,se.b as a,se.c as b, MAX(pr.c) as c
>>>>>>> FROM
>>>>>>> yourTable se
>>>>>>> LEFT JOIN
>>>>>>> (
>>>>>>> SELECT a.*
>>>>>>> FROM yourTable a
>>>>>>> JOIN yourTable b
>>>>>>> ON (b.b <> a.b)
>>>>>>> AND ((age(a.c,b.c) = '1 day'::interval)
>>>>>>>
>>>>>>> )
>>>>>>> ) pr
>>>>>>> ON pr.b = se.b
>>>>>>> AND    se.c >= pr.c
>>>>>>> GROUP BY se.a,se.b,se.c
>>>>>>> ) fo
>>>>>>> LEFT JOIN
>>>>>>> (
>>>>>>> SELECT a.*
>>>>>>> FROM yourTable a
>>>>>>> JOIN yourTable b
>>>>>>> ON (b.b <> a.b)
>>>>>>> AND ((age(a.c,b.c) = '-1 day'::interval)
>>>>>>> )
>>>>>>> ) th
>>>>>>> ON fo.a = th.b
>>>>>>> AND fo.b <= th.c
>>>>>>> GROUP BY fo.parcela,fo.a,fo.b,fo.c
>>>>>>> ) tudo
>>>>>>> GROUP BY tudo.a,tudo.c,tudo.d
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> To: <pgsql-sql@postgresql.org>
>>>>>>> Sent: Thursday, July 22, 2010 9:09 AM
>>>>>>> Subject: [SQL] grouping subsets
>>>>>>>
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> having a table similar to
>>>>>>>>
>>>>>>>> | 1 | B | [2010-07-15 Do] |
>>>>>>>> | 1 | B | [2010-07-16 Fr] |
>>>>>>>> |---+---+-----------------|
>>>>>>>> | 2 | C | [2010-07-17 Sa] |
>>>>>>>> | 2 | C | [2010-07-18 So] |
>>>>>>>> |---+---+-----------------|
>>>>>>>> | 1 | B | [2010-07-19 Mo] |
>>>>>>>> | 1 | B | [2010-07-20 Di] |
>>>>>>>> | 1 | B | [2010-07-21 Mi] |
>>>>>>>> | 1 | B | [2010-07-22 Do] |
>>>>>>>> |---+---+-----------------|
>>>>>>>> | 3 | D | [2010-07-23 Fr] |
>>>>>>>>
>>>>>>>> a simple group by gives me:
>>>>>>>>
>>>>>>>> | 6 | B |
>>>>>>>> | 4 | C |
>>>>>>>> | 3 | D |
>>>>>>>>
>>>>>>>>
>>>>>>>> What I want to get is the values grouped by "subset", where a
>>>>>>>> subset is a set of rows with identical column until the colum
>>>>>>>> changes.
>>>>>>>> Is there a way to get
>>>>>>>>
>>>>>>>> | 2 | B |
>>>>>>>> | 4 | C |
>>>>>>>> | 4 | B |
>>>>>>>> | 3 | D |
>>>>>>>>
>>>>>>>> by SQL only?
>>>>>>>>
>>>>>>>> - Rainer
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>>>>>>>> To make changes to your subscription:
>>>>>>>> http://www.postgresql.org/mailpref/pgsql-sql
>>>>>>>
>>>>>>>
>>>>>
>>>
>>>
>