Обсуждение: Creating a query to structure results grouped by two columns, pivoting only a third colum
Hello there,
In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?
The source query is:
````
SELECT
t1.date,
t1.area,
t1.canal,
SUM(t1.peso) AS peso
FROM table1 t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
````
and source query generates a initial structure as in:
###########################
date | area | canal | peso
2021-03-01 area1 can1 45.6768
2021-03-01 area1 can2 54.6768
2021-03-01 area1 can3 87.6
2021-03-01 area2 can1 1.87
2021-03-01 area2 can2 12.7687
2021-03-01 area2 can3 965.568
2021-03-01 area3 can1 968.95
2021-03-01 area3 can2 1.6867
2021-03-01 area3 can3 8.897
…. … … ...
2021-06-01 area1 can1
2021-06-01 area1 can2
2021-06-01 area1 can3
2021-06-01 area2 can1
… … ...
2021-12-01 area1
2021-12-01 area1
2021-12-01 area1
###########################
Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".
Plus, a partial total of each area, named as "total” .
I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso
````
SELECT
t2.area,
t2.canal,
( SELECT
month,
peso_valor
FROM (
SELECT
month(t1.date) month,
t1.area,
t1.canal,
SUM(t1.peso_valor) AS peso_valor
FROM tbl_data t1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
) source_table
) PIVOT (
peso
FOR month IN (
1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec
)
) as pivot_table
ORDER BY month
) as t2.peso
FROM tbl_data t2
GROUP BY 1, 2
````
what would be the SQL implementation to achieve a structure as such ?
(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)
illustrated bellow?
Best wishes,
I
###########################
area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...
peso peso peso peso
area1 can1 45.6768 875.98 1.232 …
area1 can2 54.6768 665.8 2.43 ...
area1 can3 87.6 65.8 4.76 ...
area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...
area2 can1 1.87 … ...
area2 can2 12.7687
area2 can3 965.568
area2 total SUM(1.87+12.7687+965.568) … ...
area3 can1 968.95
area3 can2 1.6867
area3 can3 8.897
...
###########################
On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
[FROM]
###########################date | area | canal | peso2021-03-01 area1 can1 45.67682021-03-01 area1 can2 54.67682021-03-01 area1 can3 87.62021-03-01 area2 can1 1.872021-03-01 area2 can2 12.76872021-03-01 area2 can3 965.5682021-03-01 area3 can1 968.952021-03-01 area3 can2 1.68672021-03-01 area3 can3 8.897…. … … ...2021-06-01 area1 can12021-06-01 area1 can22021-06-01 area1 can32021-06-01 area2 can1… … ...2021-12-01 area12021-12-01 area12021-12-01 area1###########################
[TO]
###########################area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...peso peso peso pesoarea1 can1 45.6768 875.98 1.232 …area1 can2 54.6768 665.8 2.43 ...area1 can3 87.6 65.8 4.76 ...area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...area2 can1 1.87 … ...area2 can2 12.7687area2 can3 965.568area2 total SUM(1.87+12.7687+965.568) … ...area3 can1 968.95area3 can2 1.6867area3 can3 8.897...###########################
This post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905
Does that get you close enough?
Steve
On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:
Hello there,In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?The source query is:````SELECTt1.date,t1.area,t1.canal,SUM(t1.peso) AS pesoFROM table1 t1GROUP BY 1, 2, 3ORDER BY 1, 2, 3````and source query generates a initial structure as in:###########################date | area | canal | peso2021-03-01 area1 can1 45.67682021-03-01 area1 can2 54.67682021-03-01 area1 can3 87.62021-03-01 area2 can1 1.872021-03-01 area2 can2 12.76872021-03-01 area2 can3 965.5682021-03-01 area3 can1 968.952021-03-01 area3 can2 1.68672021-03-01 area3 can3 8.897…. … … ...2021-06-01 area1 can12021-06-01 area1 can22021-06-01 area1 can32021-06-01 area2 can1… … ...2021-12-01 area12021-12-01 area12021-12-01 area1###########################Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".Plus, a partial total of each area, named as "total” .I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso````SELECTt2.area,t2.canal,( SELECTmonth,peso_valorFROM (SELECTmonth(t1.date) month,t1.area,t1.canal,SUM(t1.peso_valor) AS peso_valorFROM tbl_data t1GROUP BY 1, 2, 3ORDER BY 1, 2, 3) source_table) PIVOT (pesoFOR month IN (1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec)) as pivot_tableORDER BY month) as t2.pesoFROM tbl_data t2GROUP BY 1, 2````what would be the SQL implementation to achieve a structure as such ?(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)illustrated bellow?Best wishes,I###########################area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...peso peso peso pesoarea1 can1 45.6768 875.98 1.232 …area1 can2 54.6768 665.8 2.43 ...area1 can3 87.6 65.8 4.76 ...area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...area2 can1 1.87 … ...area2 can2 12.7687area2 can3 965.568area2 total SUM(1.87+12.7687+965.568) … ...area3 can1 968.95area3 can2 1.6867area3 can3 8.897...###########################
Using PySpark API, the solution would be
````
df = df_table1.groupBy("area", "canal").pivot("date").sum("peso")
````
In oracle 11g, it would be sort of
````
SELECT * FROM table1 PIVOT (sum(peso) FOR date IN (
'2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01’
) P
````
Oracle has this strange way of dealing with PIVOT, without even need to group columns.
If I were to write the query, ut would be somehtinglike this (i.e. similar to Oracle but more explicit).
SELECT * FROM (
SELECT
date,
area,
sub_canal,
SUM(peso) AS peso
FROM table1
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
) t1
PIVOT (
peso FOR dat_referencia_mes IN ('2021-03-01', '2021-04-01', '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01', '2021-0-01' )
) p
another thing that I would have done is to replace the static set of dates to a dynamic one. (i.e. SELECT DISTINCT date FROM table1)
However, I’m dealing with syntax errors now, plus once I fix them I’m not sure if that will work ( I mean
On Sep 25, 2021, at 21:53, Steve Midgley <science@misuse.org> wrote:On Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:[FROM]###########################date | area | canal | peso2021-03-01 area1 can1 45.67682021-03-01 area1 can2 54.67682021-03-01 area1 can3 87.62021-03-01 area2 can1 1.872021-03-01 area2 can2 12.76872021-03-01 area2 can3 965.5682021-03-01 area3 can1 968.952021-03-01 area3 can2 1.68672021-03-01 area3 can3 8.897…. … … ...2021-06-01 area1 can12021-06-01 area1 can22021-06-01 area1 can32021-06-01 area2 can1… … ...2021-12-01 area12021-12-01 area12021-12-01 area1###########################[TO]###########################area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...peso peso peso pesoarea1 can1 45.6768 875.98 1.232 …area1 can2 54.6768 665.8 2.43 ...area1 can3 87.6 65.8 4.76 ...area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...area2 can1 1.87 … ...area2 can2 12.7687area2 can3 965.568area2 total SUM(1.87+12.7687+965.568) … ...area3 can1 968.95area3 can2 1.6867area3 can3 8.897...###########################I think you'd want to use the "crosstab" function in the module "tablefunc" https://www.postgresql.org/docs/current/tablefunc.htmlThis post goes into quite a bit of detail on how to implement this type of query: https://stackoverflow.com/questions/3002499/postgresql-crosstab-query/11751905#11751905Does that get you close enough?SteveOn Sat, Sep 25, 2021 at 9:49 AM Iuri Sampaio <iuri.sampaio@gmail.com> wrote:Hello there,In order to achieve such structure "pivoting" table and "grouping by" multiple columns. (i.e. as illustrated below), what would be the SQL implementation?The source query is:````SELECTt1.date,t1.area,t1.canal,SUM(t1.peso) AS pesoFROM table1 t1GROUP BY 1, 2, 3ORDER BY 1, 2, 3````and source query generates a initial structure as in:###########################date | area | canal | peso2021-03-01 area1 can1 45.67682021-03-01 area1 can2 54.67682021-03-01 area1 can3 87.62021-03-01 area2 can1 1.872021-03-01 area2 can2 12.76872021-03-01 area2 can3 965.5682021-03-01 area3 can1 968.952021-03-01 area3 can2 1.68672021-03-01 area3 can3 8.897…. … … ...2021-06-01 area1 can12021-06-01 area1 can22021-06-01 area1 can32021-06-01 area2 can1… … ...2021-12-01 area12021-12-01 area12021-12-01 area1###########################Then, the goal's to achieve a final structure grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso".Plus, a partial total of each area, named as "total” .I tried to write a query to support such a display, however, I got stuck at pivoting date only to the column peso````SELECTt2.area,t2.canal,( SELECTmonth,peso_valorFROM (SELECTmonth(t1.date) month,t1.area,t1.canal,SUM(t1.peso_valor) AS peso_valorFROM tbl_data t1GROUP BY 1, 2, 3ORDER BY 1, 2, 3) source_table) PIVOT (pesoFOR month IN (1 Jan, 2 Feb, 3 Mar, 4 Apr, 5 Mai, 6 Jun, 7 Jul, 8 Aug, 9 Sep, 10 Oct, 11 Nov, 12 Dec)) as pivot_tableORDER BY month) as t2.pesoFROM tbl_data t2GROUP BY 1, 2````what would be the SQL implementation to achieve a structure as such ?(.i.e. grouped by columns "area" and "canal", pivoting column “date", but only to the column "peso”.)illustrated bellow?Best wishes,I###########################area | canal | 2021-03-01 | 2021-04-01 | 2021-05-01 | 2021-06-01 | ...peso peso peso pesoarea1 can1 45.6768 875.98 1.232 …area1 can2 54.6768 665.8 2.43 ...area1 can3 87.6 65.8 4.76 ...area1 total SUM(45.6768+54.6768+87.6) SUM(875.98+665.8+65.8) SUM(1.232+2.43+4.76) ...area2 can1 1.87 … ...area2 can2 12.7687area2 can3 965.568area2 total SUM(1.87+12.7687+965.568) … ...area3 can1 968.95area3 can2 1.6867area3 can3 8.897...###########################