Обсуждение: FW: query optimization question

Поиск
Список
Период
Сортировка

FW: query optimization question

От
Дата:
No offence taken, however it is incorrect, my SQL is pretty good.  I
received no other responses...  And I later realized the solution to my
question:

(EXPERTS READ ON: If anyone can show me how to use a group by or otherwise
optimize I would be grateful)

This subquery:    SELECT  project_id, marketing_name,            (SELECT count(lots.lot_id) AS lot_count
FROMdeficiency_table AS dt, lots, deficiency_status AS ds             WHERE dt.lot_id = lots.lot_id                AND
lots.division_id= proj.division_id                AND lots.project_id = proj.project_id                AND
dt.deficiency_status_id= ds.deficiency_status_id                AND ds.is_outstanding
#PreserveSingleQuotes(variables.base_query)#           ) AS def_count,
 
Actually does return a deficiency count, where there could be more then 1
deficiency per lot.  In order to get my lot_count, (number of lots with 1 or
more deficiencies) I just needed to add a DISTINCT clause in my count()
aggregate, ie  SELECT count(DISTINCT lots.lot_id)...  I forgot one could do
that:                (SELECT count(DISTINCT lots.lot_id) AS lot_count                 FROM deficiency_table AS dt,
lots,deficiency_status AS ds                 WHERE dt.lot_id = lots.lot_id                    AND lots.division_id =
proj.division_id                   AND lots.project_id = proj.project_id                    AND dt.days_old_start_date
>=#CreateODBCDate(DateAdd("d", -
 
int(ListLast(variables.aging_breakdown_list, ",")), now() ))#                    AND dt.deficiency_status_id =
ds.deficiency_status_id                   AND ds.is_outstanding
#PreserveSingleQuotes(variables.base_query)#               ) AS
lot_count_greater_#ListLast(variables.aging_breakdown_list,",")#,
 
Note the #PreserveSingleQuotes(variables.base_query)# is dynamic code that
further selects deficiencies by various criteria, eg just for a particular
supplier.

This query is actually dynamic, if all I had to do was the above 2 clauses
then I most certainly COULD do a group by.

However, for the total deficiencies I am then splitting up the total into
aging groups, eg <30, 30-60, 60-90, and >90 days old.  The query for that
looks like the below.  But before I paste it in, I would like to optimize
it, if I could do so with a group by clause I most certainly would, but I
don't see how I can BECAUSE OF THE AGING BREAKDOWN:
SELECT  project_id, marketing_name,    (SELECT count(lots.lot_id) AS lot_count     FROM deficiency_table AS dt, lots,
deficiency_statusAS ds     WHERE dt.lot_id = lots.lot_id        AND lots.division_id = proj.division_id        AND
lots.project_id= proj.project_id        AND dt.deficiency_status_id = ds.deficiency_status_id        AND
ds.is_outstanding       AND dt.assigned_supplier_id = '101690'    ) AS def_count,
 
        (SELECT count(lots.lot_id) AS lot_count         FROM deficiency_table AS dt, lots, deficiency_status AS ds
  WHERE dt.lot_id = lots.lot_id            AND lots.division_id = proj.division_id            AND lots.project_id =
proj.project_id
            AND dt.days_old_start_date < {d '2002-10-07'}            AND dt.deficiency_status_id =
ds.deficiency_status_id           AND ds.is_outstanding        AND dt.assigned_supplier_id = '101690'        ) AS
def_count_less_30,
            (SELECT count(lots.lot_id) AS lot_count             FROM deficiency_table AS dt, lots, deficiency_status AS
ds            WHERE dt.lot_id = lots.lot_id                AND lots.division_id = proj.division_id                AND
lots.project_id= proj.project_id
 
                AND dt.days_old_start_date >= {d '2002-10-07'}                AND dt.days_old_start_date < {d
'2002-09-07'}               AND dt.deficiency_status_id = ds.deficiency_status_id                AND ds.is_outstanding
     AND dt.assigned_supplier_id = '101690'            ) AS def_count_30_60,
 
            (SELECT count(lots.lot_id) AS lot_count             FROM deficiency_table AS dt, lots, deficiency_status AS
ds            WHERE dt.lot_id = lots.lot_id                AND lots.division_id = proj.division_id                AND
lots.project_id= proj.project_id
 
                AND dt.days_old_start_date >= {d '2002-09-07'}                AND dt.days_old_start_date < {d
'2002-08-08'}               AND dt.deficiency_status_id = ds.deficiency_status_id                AND ds.is_outstanding
     AND dt.assigned_supplier_id = '101690'            ) AS def_count_60_90,
 

        (SELECT count(lots.lot_id) AS lot_count         FROM deficiency_table AS dt, lots, deficiency_status AS ds
  WHERE dt.lot_id = lots.lot_id            AND lots.division_id = proj.division_id            AND lots.project_id =
proj.project_id
            AND dt.days_old_start_date >= {d '2002-08-08'}            AND dt.deficiency_status_id =
ds.deficiency_status_id           AND ds.is_outstanding        AND dt.assigned_supplier_id = '101690'        ) AS
def_count_greater_90,
        (SELECT count(DISTINCT lots.lot_id) AS lot_count         FROM deficiency_table AS dt, lots, deficiency_status
ASds         WHERE dt.lot_id = lots.lot_id            AND lots.division_id = proj.division_id            AND
lots.project_id= proj.project_id
 
            AND dt.days_old_start_date < {d '2002-10-07'}            AND dt.deficiency_status_id =
ds.deficiency_status_id           AND ds.is_outstanding        AND dt.assigned_supplier_id = '101690'        ) AS
lot_count_less_30,
            (SELECT count(DISTINCT lots.lot_id) AS lot_count             FROM deficiency_table AS dt, lots,
deficiency_statusAS ds             WHERE dt.lot_id = lots.lot_id                AND lots.division_id = proj.division_id
              AND lots.project_id = proj.project_id
 
                AND dt.days_old_start_date >= {d '2002-10-07'}                AND dt.days_old_start_date < {d
'2002-09-07'}               AND dt.deficiency_status_id = ds.deficiency_status_id                AND ds.is_outstanding
     AND dt.assigned_supplier_id = '101690'            ) AS lot_count_30_60,
 
            (SELECT count(DISTINCT lots.lot_id) AS lot_count             FROM deficiency_table AS dt, lots,
deficiency_statusAS ds             WHERE dt.lot_id = lots.lot_id                AND lots.division_id = proj.division_id
              AND lots.project_id = proj.project_id
 
                AND dt.days_old_start_date >= {d '2002-09-07'}                AND dt.days_old_start_date < {d
'2002-08-08'}               AND dt.deficiency_status_id = ds.deficiency_status_id                AND ds.is_outstanding
     AND dt.assigned_supplier_id = '101690'            ) AS lot_count_60_90,
 

        (SELECT count(DISTINCT lots.lot_id) AS lot_count         FROM deficiency_table AS dt, lots, deficiency_status
ASds         WHERE dt.lot_id = lots.lot_id            AND lots.division_id = proj.division_id            AND
lots.project_id= proj.project_id
 
            AND dt.days_old_start_date >= {d '2002-08-08'}            AND dt.deficiency_status_id =
ds.deficiency_status_id           AND ds.is_outstanding        AND dt.assigned_supplier_id = '101690'        ) AS
lot_count_greater_90,
    (SELECT count(DISTINCT lots.lot_id) AS lot_count     FROM deficiency_table AS dt, lots, deficiency_status AS ds
WHEREdt.lot_id = lots.lot_id        AND lots.division_id = proj.division_id        AND lots.project_id =
proj.project_id       AND dt.deficiency_status_id = ds.deficiency_status_id        AND ds.is_outstanding        AND
dt.assigned_supplier_id= '101690'    ) AS lot_count
 
FROM    projects AS proj
WHERE   proj.division_id = 'GGH'AND NOT EXISTS (SELECT 1 FROM menu_group_projects WHERE menu_code = 'WA'
AND division_id = proj.division_id AND project_id = proj.project_id AND
status = 'I')
ORDER BY proj.project_id

If anyone can see a way to do a group by to do this, then I will be happy to
hear about it, because currently the resultset has to do a separate
(sequential or index) scan of the deficiencies table.  The only way I can
see to do a group by would be to break out the aging categories into
separate queries, but that wins me nothing because each query then does its
own scan...

The expected simplified output of this query looks like this:
Project    <30     30-60     >=60    lot total    <30    30-60    >=60    def total
X    1    2    1    4    5    10    5    20    (if X had 4 lots, each of 5 deficiencies)
Y    1    1    0    2    3    3    0    6    (each has eg 3 deficiencies in project Y)

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com



> -----Original Message-----
> From: ch@rodos.fzk.de [mailto:ch@rodos.fzk.de]
> Sent: Wednesday, November 06, 2002 4:54 AM
> To: terry@ashtonwoodshomes.com
> Subject: Re: [SQL] query optimization question
>
>
>
> Dear Terry,
> When I was reading the objective of your query, I expected at
> least one
> GROUP BY clause within. I do not intend to be offensive - not at all,
> but your query very much looks like you're lacking in basic SQL
> knowledge
> (did you receive any other reply?).
> The clause
>  WHERE ...
>  AND division_id = proj.division_id AND project_id =
> proj.project_id ...
>
> is leading to a JOIN of your projects table to itself.
> I'm pretty sure that's the main reason why the query is slow.
> As I understand your database table design, there are relations about
> divisions, projects, lots, and deficiencies of lots. And you are
> running a master database for all of them.
> I've tried to write two queries (see below) to retrieve the
> information
> you want (BTW I think your first subquery counts the total number of
> lots within the project but not the total number of deficiencies).
> Both queries may still run slow because three tables have to be joined
> (Please try them within the 'psql' interactive terminal first).
> Also, they may not work at all (I could not verify them as I did not
> know about your CREATE TABLE statements and did not have data to put
> in).
> I'm willing to help, so if it's not working this information would be
> very useful to me. I am no SQL guru, so I cannot see any way to put
> these
> two into one. But this looks like an interesting task, maybe we should
> put this topic to the list again as soon as we make the
> single ones run.
>
> Probably, you'll need to create several indexes to speed up.
>
> -- for each project, the total number of deficiencies
> SELECT   p.project_id, p.marketing_name, COUNT(d.lot_id) AS def_count
> FROM     projects AS p,
>          lots AS l LEFT JOIN deficiency_table AS d
> ON       ( d.lot_id = l.lot_id )
> WHERE    l.division_id = p.division_id
> AND       p.division_id = '#variables.local_division_id#'
> GROUP BY p.project_id, p.marketing_name ;
>
> -- for each project, the total number of lots with 1 or more
> deficiencies
> SELECT   p.project_id, p.marketing_name, COUNT(l.lot_id) AS
> def_lot_count
> FROM     projects AS p,
>          lots AS l LEFT JOIN deficiency_table AS d
> ON       ( d.lot_id = l.lot_id )
> WHERE    l.division_id = p.division_id
> AND       p.division_id = '#variables.local_division_id#'
> GROUP BY p.project_id, p.marketing_name HAVING COUNT(d.lot_id) > 0 ;
>
> Once again, no offence intended, but I recommend to read a book on SQL
> soon.
>
> Regards, Christoph
>
> >
> > The query below is slow because both the lots table and the
> deficiency_table
> > table have thousands of records.  Can anyone tell me how to do the
> second
> > subselect (lot_count) by some method of a join instead of a sub -
> subselect
> > OR any other method I can use to optimize this query to make it
> faster?
> >
> > The objective of the query is:  Tell me for each project, the total
> number
> > of deficiencies in the project, and the total number of
> lots with 1 or
> more
> > deficiencies in the project.
> >
> > SELECT  project_id, marketing_name,
> >   (SELECT COUNT(lots.lot_id) AS lot_count
> >    FROM deficiency_table AS dt, lots
> >    WHERE dt.lot_id = lots.lot_id
> >       AND lots.division_id = proj.division_id
> >       AND lots.project_id = proj.project_id
> >   ) AS def_count,
> >   (SELECT COUNT(lots.lot_id) AS lot_counter
> >    FROM lots
> >    WHERE  lots.division_id = proj.division_id
> >     AND lots.project_id = proj.project_id
> >    AND EXISTS (SELECT 1 FROM deficiency_table AS dt WHERE
> dt.lot_id =
> > lots.lot_id)
> >   ) AS lot_count
> > FROM    projects AS proj
> > WHERE   proj.division_id = '#variables.local_division_id#'
> >  AND NOT EXISTS (SELECT 1 FROM menu_group_projects WHERE menu_code =
> 'WA'
> > AND division_id = proj.division_id AND project_id = proj.project_id
> AND
> > status = 'I')
> > ORDER BY proj.project_id
> >
> > Thanks in advance
> >
> > Terry Fielder
> > Network Engineer
> > Great Gulf Homes / Ashton Woods Homes
> > terry@greatgulfhomes.com
> >
>



Re: FW: query optimization question

От
Richard Huxton
Дата:
On Wednesday 06 Nov 2002 2:01 pm, terry@ashtonwoodshomes.com wrote:

> However, for the total deficiencies I am then splitting up the total into
> aging groups, eg <30, 30-60, 60-90, and >90 days old.  The query for that
> looks like the below.  But before I paste it in, I would like to optimize
> it, if I could do so with a group by clause I most certainly would, but I
> don't see how I can BECAUSE OF THE AGING BREAKDOWN:

[one sub-query per age-range]
>                 AND dt.days_old_start_date < {d '2002-10-07'}

>             ) AS def_count_less_30,

>                     AND dt.days_old_start_date >= {d '2002-10-07'}
>                     AND dt.days_old_start_date < {d '2002-09-07'}

>                 ) AS def_count_30_60,

Could you not define a function age_range(date) to return the relevant range
text, then group on that text? I've used that before.

--  Richard Huxton


Re: FW: query optimization question

От
Stephan Szabo
Дата:
On Wed, 6 Nov 2002 terry@ashtonwoodshomes.com wrote:

> However, for the total deficiencies I am then splitting up the total into
> aging groups, eg <30, 30-60, 60-90, and >90 days old.  The query for that
> looks like the below.  But before I paste it in, I would like to optimize
> it, if I could do so with a group by clause I most certainly would, but I
> don't see how I can BECAUSE OF THE AGING BREAKDOWN:

Well, as a first step, I'd suggest using an age function as already
suggested and a temporary table to hold the grouped by values temporarily
and then doing the subselects against that.

Maybe something like (untested):
create temp table defs asselect agefunc(dt.days_old_start_date) as ageval, count(lots.lot_id) as lots from
deficiency_tableas dt, lots, deficiency_status as ds where dt.lot_id = lots.lot_id and
lots.dividion_id=proj.division_idand lots.project_id=proj.project_id and
dt.deficiency_status_id=ds.deficiency_status_idand ts.is_outstanding and dt.assigned_supplier_id='101690'group by
ageval;

-- same general thing for other repeated queries

select project_id, marketing_name,(select sum(lots) from defs) as def_count,(select lots from defs where ageval=0) as
def_count_less_30,(selectlots from defs where ageval=1) as def_count_30_60,...
 

Since you want 0's instead of nulls, you'd probably need to do
a coalesce for the subselects, and this will go through the
probably 5 or so line temp table rather than the presumably large
other table.

I haven't spent much thought trying to force it down into a
single query, but that seems a bit harder.



Re: FW: query optimization question

От
Дата:
Actually, come to think of it, just the implementation of re-querying a
temporary table could alone significantly improve performance, because the
temp table would:
a) have fewer records to scan on the subselects
b) not require any joins

Thanks!

Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com



> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Stephan Szabo
> Sent: Wednesday, November 06, 2002 11:22 AM
> To: terry@ashtonwoodshomes.com
> Cc: Postgresql Sql Group (E-mail)
> Subject: Re: FW: [SQL] query optimization question
>
>
> On Wed, 6 Nov 2002 terry@ashtonwoodshomes.com wrote:
>
> > However, for the total deficiencies I am then splitting up
> the total into
> > aging groups, eg <30, 30-60, 60-90, and >90 days old.  The
> query for that
> > looks like the below.  But before I paste it in, I would
> like to optimize
> > it, if I could do so with a group by clause I most
> certainly would, but I
> > don't see how I can BECAUSE OF THE AGING BREAKDOWN:
>
> Well, as a first step, I'd suggest using an age function as already
> suggested and a temporary table to hold the grouped by values
> temporarily
> and then doing the subselects against that.
>
> Maybe something like (untested):
> create temp table defs as
>  select agefunc(dt.days_old_start_date) as ageval,
>   count(lots.lot_id) as lots from
>   deficiency_table as dt, lots, deficiency_status as ds
>   where dt.lot_id = lots.lot_id
>   and lots.dividion_id=proj.division_id
>   and lots.project_id=proj.project_id
>   and dt.deficiency_status_id=ds.deficiency_status_id
>   and ts.is_outstanding
>   and dt.assigned_supplier_id='101690'
>  group by ageval;
>
> -- same general thing for other repeated queries
>
> select project_id, marketing_name,
>  (select sum(lots) from defs) as def_count,
>  (select lots from defs where ageval=0) as def_count_less_30,
>  (select lots from defs where ageval=1) as def_count_30_60,
>  ...
>
> Since you want 0's instead of nulls, you'd probably need to do
> a coalesce for the subselects, and this will go through the
> probably 5 or so line temp table rather than the presumably large
> other table.
>
> I haven't spent much thought trying to force it down into a
> single query, but that seems a bit harder.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>



Re: query optimization question

От
Masaru Sugawara
Дата:
On Wed, 6 Nov 2002 09:01:49 -0500
<terry@ashtonwoodshomes.com> wrote:

> If anyone can see a way to do a group by to do this, then I will be happy to
> hear about it, because currently the resultset has to do a separate
> (sequential or index) scan of the deficiencies table.  The only way I can
> see to do a group by would be to break out the aging categories into
> separate queries, but that wins me nothing because each query then does its
> own scan...
> 
> The expected simplified output of this query looks like this:
> Project    <30     30-60     >=60    lot total    <30    30-60    >=60    def total
> X    1    2    1    4    5    10    5    20    (if X had 4 lots, each of 5 deficiencies)
> Y    1    1    0    2    3    3    0    6    (each has eg 3 deficiencies in project Y)
> 


The following query may be one of the ways, but I cannot confirm whether
it goes well or not. 


SELECT   project_id,   marketing_name,   COUNT(lots.lot_id) AS def_count,   COUNT(CASE WHEN dt.days_old_start_date < {d
'2002-10-07'}                        THEN lots.lot_id ELSE NULL END) AS def_count_less_30,   COUNT(CASE WHEN
dt.days_old_start_date>= {d '2002-10-07'}                           AND dt.days_old_start_date < {d '2002-09-07'}
                 THEN lots.lot_id ELSE NULL END) AS def_count_30_60,   COUNT(CASE WHEN dt.days_old_start_date >= {d
'2002-09-07'}                          AND dt.days_old_start_date < {d '2002-08-08'}                         THEN
lots.lot_idELSE NULL END) AS def_count_60_90,   COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-08-08'}
          THEN lots.lot_id ELSE NULL END) AS def_count_greater_90,   COUNT(DISTINCT(CASE WHEN
      dt.days_old_start_date < {d '2002-10-07'}                         THEN lots.lot_id ELSE NULL END )) AS
lot_count_less_30, COUNT(DISTINCT(CASE WHEN                                  dt.days_old_start_date >= {d '2002-10-07'}
                        AND dt.days_old_start_date < {d '2002-09-07'}                         THEN lots.lot_id ELSE
NULLEND )) AS lot_count_30_60,  COUNT(DISTINCT(CASE WHEN                                  dt.days_old_start_date >= {d
'2002-09-07'}                         AND dt.days_old_start_date < {d '2002-08-08'}                         THEN
lots.lot_idELSE NULL END )) AS lot_count_60_90,  COUNT(DISTINCT(CASE WHEN
dt.days_old_start_date>= {d '2002-08-08'}                         THEN lots.lot_id ELSE NULL END )) AS
lot_count_greater_90, COUNT(DISTINCT lots.lot_id) AS lot_count
 
FROM  (SELECT * FROM deficiency_table              WHERE assigned_supplier_id = '101690') AS dt,  (SELECT * FROM
deficiency_status,             WHERE ds.is_outstanding) AS ds,  (SELECT * FROM projects              WHERE
proj.division_id= 'GGH') AS proj  lots
 
WHERE  dt.lot_id = lots.lot_id  AND lots.division_id = proj.division_id  AND lots.project_id = proj.project_id  AND
dt.deficiency_status_id= ds.deficiency_status_id  AND NOT EXISTS         (SELECT 1 FROM menu_group_projects
WHEREmenu_code = 'WA'                AND division_id = proj.division_id                AND project_id = proj.project_id
              AND status = 'I')
 
ORDER BY proj.project_id




Regards,
Masaru Sugawara




Re: FW: query optimization question

От
Stephan Szabo
Дата:
On Wed, 6 Nov 2002 terry@ashtonwoodshomes.com wrote:

> Actually, come to think of it, just the implementation of re-querying a
> temporary table could alone significantly improve performance, because the
> temp table would:
> a) have fewer records to scan on the subselects
> b) not require any joins

Yeah, that's what I was thinking.  However the example I gave was
bogus.  I realized that I needed to do more, then forgot before sending.

> > Maybe something like (untested):
> > create temp table defs as
> >  select agefunc(dt.days_old_start_date) as ageval,
> >   count(lots.lot_id) as lots from
> >   deficiency_table as dt, lots, deficiency_status as ds
> >   where dt.lot_id = lots.lot_id
> >   and lots.dividion_id=proj.division_id
> >   and lots.project_id=proj.project_id
> >   and dt.deficiency_status_id=ds.deficiency_status_id
> >   and ts.is_outstanding
> >   and dt.assigned_supplier_id='101690'
> >  group by ageval;

You'll almost certainly need to add projects as proj in the from clause,
proj.project_id in the select clause and group by (and possibly
division_id - I can't quite tell if that's a composite key).

> > -- same general thing for other repeated queries
> >
> > select project_id, marketing_name,
> >  (select sum(lots) from defs) as def_count,
> >  (select lots from defs where ageval=0) as def_count_less_30,

In these you'd want to limit it to the appropriate rows from defs
by project_id (and possibly division_id).



Re: query optimization question

От
Masaru Sugawara
Дата:
On Thu, 07 Nov 2002 01:44:25 +0900
I wrote  <rk73@sea.plala.or.jp> wrote:

> On Wed, 6 Nov 2002 09:01:49 -0500
> <terry@ashtonwoodshomes.com> wrote:
> 
> > If anyone can see a way to do a group by to do this, then I will be happy to
> > hear about it, because currently the resultset has to do a separate
> > (sequential or index) scan of the deficiencies table.  The only way I can
> > see to do a group by would be to break out the aging categories into
> > separate queries, but that wins me nothing because each query then does its
> > own scan...
> > 
> > The expected simplified output of this query looks like this:
> > Project    <30     30-60     >=60    lot total    <30    30-60    >=60    def total
> > X    1    2    1    4    5    10    5    20    (if X had 4 lots, each of 5 deficiencies)
> > Y    1    1    0    2    3    3    0    6    (each has eg 3 deficiencies in project Y)
> > 
> 
> 
> The following query may be one of the ways, but I cannot confirm whether
> it goes well or not. 
> 
> 

There are some misspelling in FROM clause. Now  they are fixed.

SELECT    project_id,    marketing_name,    COUNT(lots.lot_id) AS def_count,    COUNT(CASE WHEN dt.days_old_start_date
<{d '2002-10-07'}                    THEN lots.lot_id ELSE NULL END) AS def_count_less_30,    COUNT(CASE WHEN
dt.days_old_start_date>= {d '2002-10-07'}                    AND dt.days_old_start_date < {d '2002-09-07'}
     THEN lots.lot_id ELSE NULL END) AS def_count_30_60,    COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-09-07'}
                 AND dt.days_old_start_date < {d '2002-08-08'}                    THEN lots.lot_id ELSE NULL END) AS
def_count_60_90,   COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-08-08'}                    THEN lots.lot_id ELSE
NULLEND) AS def_count_greater_90,    COUNT(DISTINCT(CASE WHEN                    dt.days_old_start_date < {d
'2002-10-07'}                   THEN lots.lot_id ELSE NULL END )) AS lot_count_less_30,   COUNT(DISTINCT(CASE WHEN
             dt.days_old_start_date >= {d '2002-10-07'}                    AND dt.days_old_start_date < {d
'2002-09-07'}                   THEN lots.lot_id ELSE NULL END )) AS lot_count_30_60,   COUNT(DISTINCT(CASE WHEN
           dt.days_old_start_date >= {d '2002-09-07'}                    AND dt.days_old_start_date < {d '2002-08-08'}
                 THEN lots.lot_id ELSE NULL END )) AS lot_count_60_90,   COUNT(DISTINCT(CASE WHEN
dt.days_old_start_date>= {d '2002-08-08'}                    THEN lots.lot_id ELSE NULL END )) AS lot_count_greater_90,
 COUNT(DISTINCT lots.lot_id) AS lot_countFROM   (SELECT * FROM deficiency_table               WHERE
assigned_supplier_id= '101690') AS dt,   (SELECT * FROM deficiency_status               WHERE is_outstanding) AS ds,
(SELECT* FROM projects               WHERE division_id = 'GGH') AS proj,   lotsWHERE   dt.lot_id = lots.lot_id   AND
lots.division_id= proj.division_id   AND lots.project_id = proj.project_id   AND dt.deficiency_status_id =
ds.deficiency_status_id  AND NOT EXISTS          (SELECT 1 FROM menu_group_projects             WHERE menu_code = 'WA'
              AND division_id = proj.division_id                 AND project_id = proj.project_id                 AND
status= 'I')ORDER BY proj.project_id
 


Regards,
Masaru Sugawara




Re: query optimization question

От
Дата:
That looks really promising as a possibility, however I think you intended
to add a group by clause.


Terry Fielder
Network Engineer
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com



> -----Original Message-----
> From: Masaru Sugawara [mailto:rk73@sea.plala.or.jp]
> Sent: Wednesday, November 06, 2002 11:44 AM
> To: terry@ashtonwoodshomes.com
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] query optimization question
>
>
> On Wed, 6 Nov 2002 09:01:49 -0500
> <terry@ashtonwoodshomes.com> wrote:
>
> > If anyone can see a way to do a group by to do this, then I
> will be happy to
> > hear about it, because currently the resultset has to do a separate
> > (sequential or index) scan of the deficiencies table.  The
> only way I can
> > see to do a group by would be to break out the aging categories into
> > separate queries, but that wins me nothing because each
> query then does its
> > own scan...
> >
> > The expected simplified output of this query looks like this:
> > Project    <30     30-60     >=60    lot total    <30
> 30-60    >=60    def total
> > X    1    2    1    4    5    10    5
> 20    (if X had 4 lots, each of 5 deficiencies)
> > Y    1    1    0    2    3    3    0
> 6    (each has eg 3 deficiencies in project Y)
> >
>
>
> The following query may be one of the ways, but I cannot
> confirm whether
> it goes well or not.
>
>
> SELECT
>     project_id,
>     marketing_name,
>     COUNT(lots.lot_id) AS def_count,
>     COUNT(CASE WHEN dt.days_old_start_date < {d '2002-10-07'}
>                           THEN lots.lot_id ELSE NULL END) AS
> def_count_less_30,
>     COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-10-07'}
>                             AND dt.days_old_start_date < {d
> '2002-09-07'}
>                           THEN lots.lot_id ELSE NULL END) AS
> def_count_30_60,
>     COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-09-07'}
>                             AND dt.days_old_start_date < {d
> '2002-08-08'}
>                           THEN lots.lot_id ELSE NULL END) AS
> def_count_60_90,
>     COUNT(CASE WHEN dt.days_old_start_date >= {d '2002-08-08'}
>                           THEN lots.lot_id ELSE NULL END) AS
> def_count_greater_90,
>     COUNT(DISTINCT(CASE WHEN
>                                    dt.days_old_start_date <
> {d '2002-10-07'}
>                           THEN lots.lot_id ELSE NULL END ))
> AS lot_count_less_30,
>    COUNT(DISTINCT(CASE WHEN
>                                    dt.days_old_start_date >=
> {d '2002-10-07'}
>                            AND dt.days_old_start_date < {d
> '2002-09-07'}
>                           THEN lots.lot_id ELSE NULL END ))
> AS lot_count_30_60,
>    COUNT(DISTINCT(CASE WHEN
>                                    dt.days_old_start_date >=
> {d '2002-09-07'}
>                            AND dt.days_old_start_date < {d
> '2002-08-08'}
>                           THEN lots.lot_id ELSE NULL END ))
> AS lot_count_60_90,
>    COUNT(DISTINCT(CASE WHEN
>                                   dt.days_old_start_date >=
> {d '2002-08-08'}
>                           THEN lots.lot_id ELSE NULL END ))
> AS lot_count_greater_90,
>    COUNT(DISTINCT lots.lot_id) AS lot_count
> FROM
>    (SELECT * FROM deficiency_table
>                WHERE assigned_supplier_id = '101690') AS dt,
>    (SELECT * FROM deficiency_status,
>                WHERE ds.is_outstanding) AS ds,
>    (SELECT * FROM projects
>                WHERE proj.division_id = 'GGH') AS proj
>    lots
> WHERE
>    dt.lot_id = lots.lot_id
>    AND lots.division_id = proj.division_id
>    AND lots.project_id = proj.project_id
>    AND dt.deficiency_status_id = ds.deficiency_status_id
>    AND NOT EXISTS
>           (SELECT 1 FROM menu_group_projects
>              WHERE menu_code = 'WA'
>                  AND division_id = proj.division_id
>                  AND project_id = proj.project_id
>                  AND status = 'I')
> ORDER BY proj.project_id
>
>
>
>
> Regards,
> Masaru Sugawara
>