optimize self-join query

Поиск
Список
Период
Сортировка
От Ty Busby
Тема optimize self-join query
Дата
Msg-id BFC71945-8821-4BC9-8430-A8CACF8F3794@gmail.com
обсуждение исходный текст
Ответы Re: optimize self-join query
Re: optimize self-join query
Re: optimize self-join query
Список pgsql-sql
I have a table that stores a very large starting number called epc_start_numeric and a quantity.  I've apparently built
themost inefficient query possible for doing the job I need: find out if any records overlap.  Imagine the
epc_start_numeric+ quantity representing a block of numbers.  I need to find out if any of these blocks overlap. 

Here is the table:

CREATE TABLE ftp_epc_audit
(record_id serial NOT NULL,sent_filename text NOT NULL,pcid text NOT NULL,tsid text NOT NULL,user_sec_id text NOT
NULL,format_settext NOT NULL,format text NOT NULL,epc_start text NOT NULL,quantity integer,epc_decimal_start
numeric(29)
)
WITH OIDS;
ALTER TABLE ftp_epc_audit OWNER TO postgres;

And the query is currently this:

SELECT '', 0, a.*, '', 0, b.* FROM ftp_epc_audit_staging a,
ftp_epc_audit_staging b
WHERE a.sent_filename <> b.sent_filename
AND a.quantity > 0
AND b.quantity > 0
AND a.locked = 1
AND b.locked = 1
AND(( a.epc_decimal_start BETWEEN b.epc_decimal_start AND b.epc_decimal_start + b.quantity - 1 )
OR ( a.epc_decimal_start + a.quantity - 1 BETWEEN b.epc_decimal_start AND b.quantity - 1 )
OR ( a.epc_decimal_start + a.quantity - 1 < b.epc_decimal_start AND a.epc_decimal_start + a.quantity -1 >
b.epc_decimal_start+ b.quantity - 1 )) 

The column sent_filename is the unique value used so that a record does not find itself.  I've got an index on
sent_filename,quantity, locked and epc_decimal_start. 

The query runs fine with a very small number of records.  However, I need to process 60,000 records and this is taking
hours. There must be a fundemental flaw in either the query or my overall design because it doesn't seem like a
challengingtask.   

I've also tried a variant of this query which also takes several hours to run through 60,000 records.

SELECT * FROM ftp_epc_audit_staging a
WHERE a.quantity > 0
AND a.locked = 1
AND EXISTS ( SELECT TRUE FROM ftp_epc_audit_staging b WHERE b.locked = 1
AND b.quantity > 0 AND a.sent_filename <> b.sent_filename
AND(( a.epc_decimal_start BETWEEN b.epc_decimal_start AND b.epc_decimal_start + b.quantity - 1 )
OR ( a.epc_decimal_start + a.quantity - 1 BETWEEN b.epc_decimal_start AND b.quantity - 1 )
OR ( a.epc_decimal_start + a.quantity - 1 < b.epc_decimal_start AND a.epc_decimal_start + a.quantity -1 >
b.epc_decimal_start+ b.quantity - 1 ))) 

I would really appreciate any thoughts on this.

В списке pgsql-sql по дате отправления:

Предыдущее
От: alan
Дата:
Сообщение: how to use explain analyze
Следующее
От: Federico Dal Maso
Дата:
Сообщение: How to obtain a coalesce aggregation in a GROUP BY query?