Thread: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

[BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
stepya@ukr.net
Date:
The following bug has been logged on the website:

Bug reference:      14581
Logged by:          Stepan Yankevych
Email address:      stepya@ukr.net
PostgreSQL version: 9.6.2
Operating system:   RedHat
Description:

From time to time I get "invalid cache ID: 41" while running a simple query with
parallelism, for example:
select count(1) from client_order where date_id =  20170301;
It crashes with the following (see error log):

< 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
< 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
< 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
< 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
< 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
< 2017-03-07 08:57:45.312 EST [unknown] SELECT>ERROR: invalid cache ID: 41

< 2017-03-07 08:57:45.312 EST [unknown] SELECT>CONTEXT: parallel worker 
< 2017-03-07 08:57:45.312 EST [unknown] SELECT>STATEMENT: select count(1) 



--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
Tom Lane
Date:
stepya@ukr.net writes:
> From time to time I get "invalid cache ID: 41" while running a simple query with
> parallelism.

Interesting, but unless you can show us how to reproduce this, we're not
going to be able to do much about it.

            regards, tom lane



Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
Stepan Yankevych
Date:
Hi Tom.

Thanks for such a quick response.


It is quite difficult to reproduce.
The only observation: it usually crashes with parallelism only on quite big tables with inheritance.

The main table contains many partitions (inherited tables).
We query the main table with a condition on date_id = ?. In the execution plan we can see only one partition. All
subsequent runs crash as well.
Reconnecting to the DB can help, but not always.
After some time the same query can be run successfully.


Anyway, I will try to write a script to reproduce it, but I am not sure I will be lucky enough to reproduce it on a sample.

Thanks!
Best Regards,
Stepan Yankevych
Lead Software Engineer


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, March 7, 2017 18:03 PM
To: stepya@ukr.net
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker


Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
Tom Lane
Date:
Stepan Yankevych <Stepan_Yankevych@epam.com> writes:
> It is quite difficult to reproduce.
> The only observation: it usually crashes with parallelism only on quite big tables with inheritance.

If you can't extract a test case, one thing that would be quite helpful is
to get a stack trace from the point of the error.  There are only four
occurrences of

        elog(ERROR, "invalid cache ID: %d", cacheId);

and they're all in src/backend/utils/cache/syscache.c.  If you could
change those to elog(PANIC, ...) in a debug-enabled build, run till
you get the failure, and then use gdb to get a backtrace from the
ensuing core dump, that might be enough info to fix it.

            regards, tom lane
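
Editor's illustration of why raising the report level helps: the following is a minimal, self-contained C sketch, not PostgreSQL code. The names `toy_cache`, `TOY_CACHE_SIZE`, `toy_cache_id_is_valid` and `toy_lookup` are hypothetical stand-ins, and the bounds check is only assumed to resemble the one that emits this message. It shows that an ERROR-style report hands control back to the caller, whereas a PANIC-style report aborts the process, so a core dump (if enabled) preserves the call stack at the failing check.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical table size; chosen small for illustration only. */
#define TOY_CACHE_SIZE 4

/* Hypothetical cache table; slot 2 is deliberately left uninitialized. */
static const char *toy_cache[TOY_CACHE_SIZE] = {"a", "b", NULL, "d"};

/* Returns 1 when cache_id indexes an initialized slot, 0 otherwise. */
static int toy_cache_id_is_valid(int cache_id)
{
    return cache_id >= 0 &&
           cache_id < TOY_CACHE_SIZE &&
           toy_cache[cache_id] != NULL;
}

/*
 * Looks up a slot. On an invalid ID it prints the message and then either
 * returns NULL (ERROR-like: the caller keeps running and the stack is lost)
 * or calls abort() (PANIC-like: the process dies at this exact point).
 */
static const char *toy_lookup(int cache_id, int panic_on_invalid)
{
    if (!toy_cache_id_is_valid(cache_id)) {
        fprintf(stderr, "invalid cache ID: %d\n", cache_id);
        if (panic_on_invalid)
            abort();
        return NULL;
    }
    /* Invariant guaranteed by the validity check above. */
    assert(toy_cache[cache_id] != NULL);
    return toy_cache[cache_id];
}
```

Calling `toy_lookup(41, 1)` would abort the process; loading the resulting core file in gdb and running `bt` would then list the frames that led to the check, which is the information requested above.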



Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
Stepan Yankevych
Date:
Unfortunately, it is almost impossible.
We have been experiencing this error only in our PROD environment (starting from version 9.6.0).

I will think about a debug-enabled build on one of our dev environments, but I am not sure we can reproduce it there
because of the much smaller amount of data.


Best Regards,
Stepan Yankevych
Lead Software Engineer


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, March 7, 2017 20:16 PM
To: Stepan Yankevych <Stepan_Yankevych@epam.com>
Cc: stepya@ukr.net; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker


Re: BUG #14581: invalid cache ID: 41 CONTEXT: parallel worker

From
Laurenz Albe
Date:
stepya@ukr.net wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      14581
> Logged by:          Stepan Yankevych
> Email address:      stepya@ukr.net
> PostgreSQL version: 9.6.2
> Operating system:   RedHat
> Description:        
> 
> From time to time I get "invalid cache ID: 41" while running a simple query with
> parallelism, for example:
> select count(1) from client_order where date_id =  20170301;
> It crashes with the following (see error log):
> 
> < 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
> < 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
> < 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
> < 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
> < 2017-03-07 08:57:45.312 EST >ERROR: invalid cache ID: 41 
> < 2017-03-07 08:57:45.312 EST [unknown] SELECT>ERROR: invalid cache ID: 41
> 
> < 2017-03-07 08:57:45.312 EST [unknown] SELECT>CONTEXT: parallel worker 
> < 2017-03-07 08:57:45.312 EST [unknown] SELECT>STATEMENT: select count(1) 

Could it be that the oracle_fdw extension was loaded?

There was a bug reported yesterday that would explain this error:
https://github.com/laurenz/oracle_fdw/issues/215

If oracle_fdw is involved, the latest commit should fix the problem.

Yours,
Laurenz Albe