Обсуждение: Submit query using dblink that hung the host

Поиск
Список
Период
Сортировка

Submit query using dblink that hung the host

От
Alex Lai
Дата:
My host was freeze up after submitted the following query that prevented
me to ssh to the host.
I was unable to psql and submit pg_cancel_backend.  The tables have over
20 millions rows.
Does dblink uses too much resource from the host when join large tables.
Hope someone can give me suggestion.

CREATE OR REPLACE VIEW missing_archiveset_in_mds_ops
  (filename, esdt, archiveset) AS
select * from dblink('host=ops_host port=4001 user=omiops dbname=omiops',
'select filename, esdt, archiveset from
  filemeta_archiveset join filemeta_common using(fileid)
  join file using(fileid)') as t1(filename text,esdt text,archiveset int)
where (filename, esdt, archiveset) not in (
select filename, esdt, archiveset
  from dblink('host=ops_host port=4002 user=omiops dbname=metamine',
'select filename, esdt, archiveset from
  file_archiveset join filemeta using(fileid)
  join filename using(fileid)') as t2(filename text,esdt text,archiveset
int));


--
Best regards,


Alex Lai
OMI SIPS DBA ADNET Systems , Inc.
7515 Mission Drive,
Suite A100 Lanham, MD 20706
301-352-4657 (phone)
301-352-0437 (fax)
alai@sesda2.com


Re: Submit query using dblink that hung the host

От
Merlin Moncure
Дата:
On Thu, Jun 14, 2012 at 11:15 AM, Alex Lai <mlai@sesda2.com> wrote:
> My host was freeze up after submitted the following query that prevented me
> to ssh to the host.
> I was unable to psql and submit pg_cancel_backend.  The tables have over 20
> millions rows.
> Does dblink uses too much resource from the host when join large tables.
> Hope someone can give me suggestion.
>
> CREATE OR REPLACE VIEW missing_archiveset_in_mds_ops
>  (filename, esdt, archiveset) AS
> select * from dblink('host=ops_host port=4001 user=omiops dbname=omiops',
> 'select filename, esdt, archiveset from
>  filemeta_archiveset join filemeta_common using(fileid)
>  join file using(fileid)') as t1(filename text,esdt text,archiveset int)
> where (filename, esdt, archiveset) not in (
> select filename, esdt, archiveset
>  from dblink('host=ops_host port=4002 user=omiops dbname=metamine',
> 'select filename, esdt, archiveset from
>  file_archiveset join filemeta using(fileid)
>  join filename using(fileid)') as t2(filename text,esdt text,archiveset
> int));

It would be interesting to know what exactly was the specific trigger
that brought down the server since dblink should not be allowed to do
that.  I'm assuming out of memory since libpq (used on the dblink
client side) is not memory bounded. 9.2 will include new row
processing features that should drastically reduce dblink memory
consumption and will probably prevent this from happening again.

In the meantime, restructure both dblinks to gather the data into
separate local tables (temporary if you can wing it), then create
indexes in advance of the join.

merlin

Re: Submit query using dblink that hung the host

От
Alban Hertroys
Дата:
On 14 Jun 2012, at 20:25, Merlin Moncure wrote:

>> CREATE OR REPLACE VIEW missing_archiveset_in_mds_ops
>>  (filename, esdt, archiveset) AS
>> select * from dblink('host=ops_host port=4001 user=omiops dbname=omiops',
>> 'select filename, esdt, archiveset from
>>  filemeta_archiveset join filemeta_common using(fileid)
>>  join file using(fileid)') as t1(filename text,esdt text,archiveset int)
>> where (filename, esdt, archiveset) not in (
>> select filename, esdt, archiveset
>>  from dblink('host=ops_host port=4002 user=omiops dbname=metamine',
>> 'select filename, esdt, archiveset from
>>  file_archiveset join filemeta using(fileid)
>>  join filename using(fileid)') as t2(filename text,esdt text,archiveset
>> int));
>
> In the meantime, restructure both dblinks to gather the data into
> separate local tables (temporary if you can wing it), then create
> indexes in advance of the join.


I was thinking along those lines, with the difference that I'd create a temp table with the data from the 2nd dblink
queryin the database of the first. That way you can perform the query in one database, which will only have to move the
rowsneeded for the end result. 

I also notice that you don't use any data from the local database in that query at all. Perhaps you could query the
databaseon port 4001 instead? That would seem to make more sense for this particular query. 

Alban Hertroys

--
The scale of a problem often equals the size of an ego.