Обсуждение: PG10.1 autovac killed building extended stats

Поиск
Список
Период
Сортировка

PG10.1 autovac killed building extended stats

От
Justin Pryzby
Дата:
After adding extended/MV stats to a few of our tables a few days ago, it looks
like I wasn't been paying attention and this first crashed 2 nights ago.  Why
at 1am?  not sure.  I have an "reindex" job which runs at 1am, and an
vacuum/analyze job which runs at 2am, but I don't use cron to change
autovac/analyze thresholds..

Nov 16 01:15:42 database kernel: postmaster[16144]: segfault at 0 ip 00000000006f3e3e sp 00007ffe2c8fc650 error 4 in
postgres[400000+693000]
[...]
Nov 17 01:15:41 database kernel: postmaster[7145]: segfault at 0 ip 00000000006f3e3e sp 00007ffe2c8fc650 error 4 in
postgres[400000+693000]

[pryzbyj@database ~]$ ps -fu postgres
UID        PID  PPID  C STIME TTY          TIME CMD
postgres  1757     1  0 Nov09 ?        00:20:43 /usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data
=> no longer running Alvaro's original patch on this server, which was also
first to experience the BRIN crash..

< 2017-11-16 01:15:43.103 -04  >LOG:  server process (PID 16144) was terminated by signal 11: Segmentation fault
< 2017-11-16 01:15:43.103 -04  >DETAIL:  Failed process was running: autovacuum: ANALYZE
public.daily_enodeb_baseband_view_201711

< 2017-11-17 01:15:41.673 -04  >LOG:  server process (PID 7145) was terminated by signal 11: Segmentation fault
< 2017-11-17 01:15:41.673 -04  >DETAIL:  Failed process was running: autovacuum: ANALYZE
public.daily_enodeb_baseband_view_201711

Here's a log of my reindex job for the last two nights' crashes.  It's
suspicious that the baseband table was reindexed ~15min before autovac crashed
during its processing, but it's also unclear to me why it would matter.

[...]
|Thu Nov 16 01:02:54 -04 2017: daily_enodeb_baseband_view_201711: daily_enodeb_baseband_view_201711_unique_idx(repack
current)...
|Thu Nov 16 01:02:59 -04 2017: daily_enodeb_baseband_view_201711: pg_toast.pg_toast_691157026_index(reindex system)...
...
|Thu Nov 16 01:15:22 -04 2017: eric_hss_l2if_metrics_cum: eric_hss_l2if_metrics_cum_start_time_idx(repack
non-partitioned)...
|Thu Nov 16 01:15:27 -04 2017: eric_hss_platform_metrics: eric_hss_platform_metrics_start_time_idx(repack
non-partitioned)...
|WARNING:  terminating connection because of crash of another server process
|DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because
anotherserver process exited abnormally and possibly corrupted shared memory.
 
|HINT:  In a moment you should be able to reconnect to the database and repeat your command.
|WARNING: Error creating index "public"."index_40552734": server closed the connection unexpectedly
|        This probably means the server terminated abnormally
|        before or while processing the request.
|WARNING: Skipping index swapping for "eric_hss_platform_metrics", since no new indexes built
|WARNING: repack failed for "eric_hss_platform_metrics_start_time_idx"
|psql: FATAL:  the database system is in recovery mode
|Filesystem      Size  Used Avail Use% Mounted on
|/dev/vdb        2.6T  2.1T  459G  83% /var/lib/pgsql

[...]
|Fri Nov 17 01:01:44 -04 2017: daily_enodeb_baseband_view_201711: daily_enodeb_baseband_view_201711_unique_idx(repack
current)...
|Fri Nov 17 01:01:54 -04 2017: daily_enodeb_baseband_view_201711: pg_toast.pg_toast_691157026_index(reindex system)...
...
|Fri Nov 17 01:14:58 -04 2017: link_busy_hr: lbh_start_time_idx(repack non-partitioned)...
|Fri Nov 17 01:14:58 -04 2017: loaded_cdr_files: loaded_cdr_files_filename(repack non-partitioned)...
|WARNING:  terminating connection because of crash of another server process
|DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because
anotherserver process exited abnormally and possibly corrupted shared memory.
 
|HINT:  In a moment you should be able to reconnect to the database and repeat your command.
|WARNING: Error creating index "public"."index_30971": server closed the connection unexpectedly
|        This probably means the server terminated abnormally
|        before or while processing the request.
|WARNING: Skipping index swapping for "loaded_cdr_files", since no new indexes built
|WARNING: repack failed for "loaded_cdr_files_filename"
|psql: FATAL:  the database system is in recovery mode

Core was generated by `postgres: autovacuum worker process   gtt             '.
Program terminated with signal 11, Segmentation fault.
#0  statext_ndistinct_build (totalrows=300, numrows=300, rows=0x21df3e8, attrs=<value optimized out>, stats=0x0) at
mvdistinct.c:103
103                                     item->attrs = bms_add_member(item->attrs,

(gdb) p stats
$5 = (VacAttrStats **) 0x0
=> This is an error, no ??

(gdb) p item
$1 = (MVNDistinctItem *) 0x2090928
(gdb) p result
$6 = (MVNDistinct *) 0x2090918

(gdb) p item->attrs
$2 = (Bitmapset *) 0x0
(gdb) p rows
$3 = (HeapTuple *) 0x21df3e8
(gdb) p j
$8 = <value optimized out>
(gdb) p attrs
$4 = <value optimized out>

Let me know if there's anything more I should send.

Justin

(gdb) bt f
#0  statext_ndistinct_build (totalrows=300, numrows=300, rows=0x21df3e8, attrs=<value optimized out>, stats=0x0) at
mvdistinct.c:103      item = 0x2090928       j = <value optimized out>       combination = 0x20908c8       generator =
0x2090d08      result = 0x2090918       k = 2       itemcnt = 0       numattrs = 4       numcombs = <value optimized
out>

#1  0x00000000006f1a0e in BuildRelationExtStatistics (onerel=0x7f7895485fa0, totalrows=300, numrows=300,
rows=0x21df3e8,natts=34147512, vacattrstats=0x0) at extended_stats.c:120       t = <value optimized out>       stat =
0x2090b48      stats = <value optimized out>       lc2 = 0x2090c48       ndistinct = <value optimized out>
dependencies= <value optimized out>       pg_stext = 0x7f7895495478       lc = 0x2090cb8       stats = 0x0       cxt =
0x2054cd8      oldcxt = 0x1fd0f58       __func__ = "BuildRelationExtStatistics"
 

#2  0x000000000057aad6 in do_analyze_rel (onerel=0x7f7895485fa0, options=98, params=0x2094e90, va_cols=0x0,
acquirefunc=0x57bb20<acquire_sample_rows>, relpages=80, inh=0 '\000', in_outer_xact=0 '\000',   elevel=13) at
analyze.c:593      col_context = 0x2054cd8       old_context = 0x1fd0f58       attr_cnt = 24       tcnt = <value
optimizedout>       i = <value optimized out>       ind = <value optimized out>       Irel = 0x2074838       nindexes =
1      hasindex = <value optimized out>       vacattrstats = 0x2095868       indexdata = 0x2097810       targrows =
<valueoptimized out>       numrows = 300       totalrows = 300       totaldeadrows = 20       rows = 0x21df3e8
ru0= {tv = {tv_sec = 1510809341, tv_usec = 958864}, ru = {ru_utime = {tv_sec = 0, tv_usec = 436933}, ru_stime = {tv_sec
=0, tv_usec = 18997}, ru_maxrss = 14988, ru_ixrss = 0, ru_idrss = 0,           ru_isrss = 0, ru_minflt = 5090,
ru_majflt= 0, ru_nswap = 0, ru_inblock = 2560, ru_oublock = 240, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0,
ru_nvcsw= 163, ru_nivcsw = 26}}       starttime = 0       caller_context = 0x2055338       save_userid = <value
optimizedout>       save_sec_context = 0       save_nestlevel = 2       __func__ = "do_analyze_rel"
 

#3  0x000000000057b7a4 in analyze_rel (relid=<value optimized out>, relation=0x7ffe2c8fcfc0, options=98,
params=0x2094e90,va_cols=0x0, in_outer_xact=0 '\000', bstrategy=0x21c1358) at analyze.c:276       onerel =
0x7f7895485fa0      elevel = 13       acquirefunc = 0x57bb20 <acquire_sample_rows>       relpages = 80       __func__ =
"analyze_rel"

#4  0x00000000005e6edf in vacuum (options=98, relation=0x7ffe2c8fcfc0, relid=<value optimized out>, params=0x2094e90,
va_cols=0x0,bstrategy=<value optimized out>, isTopLevel=1 '\001') at vacuum.c:323       relid = 691157026       cur =
0x208a8e8      save_exception_stack = 0x7ffe2c8fce30       save_context_stack = 0x0       local_sigjmp_buf = {{__jmpbuf
={33904488, 2571701076356583921, 9927477, 4294967295, 200, 20, -2572437610069143055, 2571700781410898417},
__mask_was_saved= 0, __saved_mask = {__val = {0, 0, 0, 0,               34861944, 0, 140155877187248, 140160420441688,
2,140729646043152, 257015007210, 140160420441688, 8282462, 1510809341, 958742, 140160420553984}}}}       stmttype =
<valueoptimized out>       in_outer_xact = 0 '\000'       use_own_xacts = 1 '\001'       relations = 0x208a908
in_vacuum= 1 '\001'       __func__ = "vacuum"
 

#5  0x00000000006a7d22 in autovacuum_do_vac_analyze () at autovacuum.c:3096       rangevar = {type = T_Invalid,
catalogname= 0x0, schemaname = 0x213f378 "public", relname = 0x21c18b8 "daily_enodeb_baseband_view_201711", inh = 0
'\000',relpersistence = 0 '\000', alias = 0x0,          location = -1}
 
#6  do_autovacuum () at autovacuum.c:2451       save_exception_stack = 0x7ffe2c8fd110       save_context_stack = 0x0
  local_sigjmp_buf = {{__jmpbuf = {691157026, 2571696918420345329, 34163832, 4294967295, 200, 20, -2572437610046074383,
2571700883870929393},__mask_was_saved = 0, __saved_mask = {__val = {0, 0,                140729646043392,
15874306463571202545,2571702211223048689, 0, 6998306, 140160420441688, 747622240, 13239856, 13239680, 0, 6444974,
13241760,0, 0}}}}       tab = 0x2094e88       skipit = 0 '\000'       stdVacuumCostDelay = 0       stdVacuumCostLimit =
200      iter = <value optimized out>       relid = 691157026       classRel = 0x7f7895453e58       tuple = <value
optimizedout>       relScan = <value optimized out>       dbForm = <value optimized out>       table_oids = <value
optimizedout>       orphan_oids = 0x0       ctl = {num_partitions = 0, ssize = 0, dsize = 0, max_dsize = 0, ffactor =
0,keysize = 4, entrysize = 80, hash = 0, match = 0, keycopy = 0, alloc = 0, hcxt = 0x0, hctl = 0x0}
table_toast_map= 0x20a5338       cell = 0x2095020       shared = 0x206a208       dbentry = 0x206a490       bstrategy =
0x21c1358      key = {sk_flags = 0, sk_attno = 16, sk_strategy = 3, sk_subtype = 0, sk_collation = 100, sk_func =
{fn_addr= 0x750cd0 <chareq>, fn_oid = 61, fn_nargs = 2, fn_strict = 1 '\001',            fn_retset = 0 '\000', fn_stats
=2 '\002', fn_extra = 0x0, fn_mcxt = 0x1fd0378, fn_expr = 0x0}, sk_argument = 116}       pg_class_desc = 0x2093858
effective_multixact_freeze_max_age = 400000000       did_vacuum = <value optimized out>       found_concurrent_worker =
0'\000'       i = <value optimized out>       __func__ = "do_autovacuum"
 
#7  0x00000000006a84d8 in AutoVacWorkerMain (argc=<value optimized out>, argv=<value optimized out>) at
autovacuum.c:1694      dbname =
"gtt\000\000\000\000\000\325I\001\327;\000\000\000\020?\000\000\000\000\000\000E\352\200\327;\000\000\000\001\000\000\000\000\000\000\000\002\000\000\000\000\000\000\000\374$\316",
'\000'<repeats 12 times>       local_sigjmp_buf = {{__jmpbuf = {0, 2571701076356059633, 33315888, 1510809326, 16143,
1510808581,-2572437610981404175, 2571700891614532081}, __mask_was_saved = 1, __saved_mask = {__val = {
18446744066192964103,13188120, 1, 13186760, 0, 16143, 1510808581, 0, 257010229664, 1, 0, 140729646043600, 257014425560,
4,0, 4}}}}       dbid = 16400       __func__ = "AutoVacWorkerMain"
 
#8  0x00000000006a85a6 in StartAutoVacWorker () at autovacuum.c:1499       worker_pid = 0       __func__ =
"StartAutoVacWorker"
#9  0x00000000006b606a in StartAutovacuumWorker (postgres_signal_arg=<value optimized out>) at postmaster.c:5414
bn= 0x1fc5c30
 
#10 sigusr1_handler (postgres_signal_arg=<value optimized out>) at postmaster.c:5111       save_errno = 4
__func__= "sigusr1_handler"
 


Re: PG10.1 autovac killed building extended stats

От
Alvaro Herrera
Дата:
Justin Pryzby wrote:
> After adding extended/MV stats to a few of our tables a few days ago, it looks
> like I wasn't been paying attention and this first crashed 2 nights ago.  Why
> at 1am?  not sure.  I have an "reindex" job which runs at 1am, and an
> vacuum/analyze job which runs at 2am, but I don't use cron to change
> autovac/analyze thresholds..

Can you please show the definition of the table and of the extended
stats?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PG10.1 autovac killed building extended stats

От
Alvaro Herrera
Дата:
Justin Pryzby wrote:

> Core was generated by `postgres: autovacuum worker process   gtt             '.
> Program terminated with signal 11, Segmentation fault.
> #0  statext_ndistinct_build (totalrows=300, numrows=300, rows=0x21df3e8, attrs=<value optimized out>, stats=0x0) at
mvdistinct.c:103
> 103                                     item->attrs = bms_add_member(item->attrs,
> 
> (gdb) p stats
> $5 = (VacAttrStats **) 0x0
> => This is an error, no ??

Not necessarily, but then I think this previous code is busted:
       /*        * Check if we can build these stats based on the column analyzed. If        * not, report this fact
(exceptin autovacuum) and move on.        */       stats = lookup_var_attr_stats(onerel, stat->columns,
                   natts, vacattrstats);       if (!stats && !IsAutoVacuumWorkerProcess())       {
ereport(WARNING,                  (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),                    errmsg("statistics
object\"%s.%s\" could not be computed for relation \"%s.%s\"",                           stat->schema, stat->name,
                    get_namespace_name(onerel->rd_rel->relnamespace),
RelationGetRelationName(onerel)),                   errtable(onerel)));           continue;       }
 

If I recall things correctly, the "continue" should be executed
regardless of IsAutoVacuumWorkerProcess() (only the log should be
conditional on that).  I'm not sure how I arrived at the current coding,
which I added in bf2a691e02d7 -- probably just fuzzy thinking.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: PG10.1 autovac crashed building extended stats

От
Justin Pryzby
Дата:
On Fri, Nov 17, 2017 at 01:27:49PM -0300, Alvaro Herrera wrote:
> Justin Pryzby wrote:
> > After adding extended/MV stats to a few of our tables a few days ago, it looks
> > like I wasn't been paying attention and this first crashed 2 nights ago.  Why
> > at 1am?  not sure.  I have an "reindex" job which runs at 1am, and an
> > vacuum/analyze job which runs at 2am, but I don't use cron to change
> > autovac/analyze thresholds..
> 
> Can you please show the definition of the table and of the extended
> stats?

gtt=# SELECT stxrelid::regclass, stxname, stxkind FROM pg_statistic_ext ORDER BY 1;                   stxrelid
          |                          stxname                          | stxkind 
 

-------------------------------------------------+-----------------------------------------------------------+---------daily_umts_eric_cell_traffic_hs_view_201603
   | daily_umts_eric_cell_traffic_hs_view_201603_key_stats     | {d,f}daily_umts_eric_cell_traffic_hs_eul_view_201603 |
daily_umts_eric_cell_traffic_hs_eul_view_201603_key_stats| {d,f}daily_eric_umts_rnc_utrancell_view_201603       |
daily_eric_umts_rnc_utrancell_view_201603_key_stats      | {d,f}daily_umts_eric_cell_traffic_hs_view_201504     |
daily_umts_eric_cell_traffic_hs_view_201504_key_stats    | {d,f}daily_umts_eric_cell_traffic_hs_eul_view_201504 |
daily_umts_eric_cell_traffic_hs_eul_view_201504_key_stats| {d,f}daily_eric_umts_rnc_utrancell_view_201504       |
daily_eric_umts_rnc_utrancell_view_201504_key_stats      | {d,f}daily_enodeb_ncell_view_201603                  |
daily_enodeb_ncell_view_201603_key_stats                 | {d,f}daily_enodeb_ncell_view_201503                  |
daily_enodeb_ncell_view_201503_key_stats                 | {d,f}daily_enodeb_ncell_view_201502                  |
daily_enodeb_ncell_view_201502_key_stats                 | {d,f}daily_enodeb_ncell_view_201501                  |
daily_enodeb_ncell_view_201501_key_stats                 | {d,f}daily_enodeb_baseband_view_201603               |
daily_enodeb_baseband_view_201603_key_stats              | {d,f}daily_enodeb_baseband_view_201503               |
daily_enodeb_baseband_view_201503_key_stats              | {d,f}daily_enodeb_baseband_view_201502               |
daily_enodeb_baseband_view_201502_key_stats              | {d,f}daily_enodeb_baseband_view_201501               |
daily_enodeb_baseband_view_201501_key_stats              | {d,f}daily_enodeb_cell_view_201603                   |
daily_enodeb_cell_view_201603_key_stats                  | {d,f}daily_enodeb_cell_view_201502                   |
daily_enodeb_cell_view_201502_key_stats                  | {d,f}daily_enodeb_201603                             |
daily_enodeb_201603_key_stats                            | {d,f}daily_enodeb_201503                             |
daily_enodeb_201503_key_stats                            | {d,f}daily_enodeb_201502                             |
daily_enodeb_201502_key_stats                            | {d,f}daily_enodeb_201501                             |
daily_enodeb_201501_key_stats                            | {d,f}daily_enodeb_cell_view_201710                   | x
                                                   | {d,f}daily_cdr_pstn_user_201711                      |
daily_cdr_pstn_user_201711_key_stats                     | {d,f}daily_umts_eric_cell_traffic_hs_eul_view_201711 |
daily_umts_eric_cell_traffic_hs_eul_view_201711_key_stats| {d,f}daily_umts_eric_cell_traffic_hs_view_201711     |
daily_umts_eric_cell_traffic_hs_view_201711_key_stats    | {d,f}daily_eric_umts_rnc_utrancell_view_201711       |
daily_eric_umts_rnc_utrancell_view_201711_key_stats      | {d,f}daily_enodeb_baseband_view_201711               |
daily_enodeb_baseband_view_201711_key_stats              | {d,f}daily_enodeb_cell_view_201711                   |
daily_enodeb_cell_view_201711_key_stats                  | {d,f}daily_enodeb_ncell_view_201711                  |
daily_enodeb_ncell_view_201711_key_stats                 | {d,f}daily_enodeb_201711                             |
daily_enodeb_201711_key_stats                            | {d,f}
 
(29 rows)

Here's the table which was 1) reindexed (including its toast) and 2)
autovacuumed(crashed):

gtt=# \d+ daily_enodeb_baseband_view_201711                                             Table
"public.daily_enodeb_baseband_view_201711"              Column                |            Type             | Collation
|Nullable | Default | Storage  | Stats target | Description 
 

--------------------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------device_id
                          | smallint                    |           | not null |         | plain    | 400          |
site_id                             | smallint                    |           | not null |         | plain    | 400
    | start_time                           | timestamp without time zone |           | not null |         | plain    |
0           | bbprocessingresource                 | text                        |           | not null |         |
extended| 400          | interval_seconds                     | bigint                      |           |          |
    | plain    |              | interval_seconds_min                 | smallint                    |           |
 |         | plain    |              | interval_seconds_max                 | smallint                    |           |
        |         | plain    |              | nbevt                                | bigint                      |
    |          |         | plain    |              | nbevt_min                            | integer
|          |          |         | plain    |              | nbevt_max                            | integer
      |           |          |         | plain    |              | 
 
[...]

Indexes:   "daily_enodeb_baseband_view_201711_unique_idx" UNIQUE, btree (start_time, site_id, device_id,
bbprocessingresource)
Check constraints:   "daily_enodeb_baseband_view_201711_start_time_check" CHECK (start_time >= '2017-11-01
00:00:00'::timestampwithout time zone AND start_time < '2017-12-01 00:00:00'::timestamp without time zone)
 
Statistics objects:   "public"."daily_enodeb_baseband_view_201711_key_stats" (ndistinct, dependencies) ON device_id,
site_id,start_time, bbprocessingresource FROM daily_enodeb_baseband_view_201711
 
Inherits: daily_enodeb_baseband_view
Options: autovacuum_analyze_scale_factor=0.005, autovacuum_analyze_threshold=2

gtt=# SELECT * FROM pg_statistic_ext WHERE stxname='daily_enodeb_baseband_view_201711_key_stats';
stxrelid        | 691157026
stxname         | daily_enodeb_baseband_view_201711_key_stats
stxnamespace    | 2200
stxowner        | 18819
stxkeys         | 1 2 3 4
stxkind         | {d,f}
stxndistinct    | 
stxdependencies | 

Justin


Re: PG10.1 autovac killed building extended stats

От
Justin Pryzby
Дата:
On Fri, Nov 17, 2017 at 01:36:00PM -0300, Alvaro Herrera wrote:
> Justin Pryzby wrote:
> 
> > Core was generated by `postgres: autovacuum worker process   gtt             '.
> > Program terminated with signal 11, Segmentation fault.
> > #0  statext_ndistinct_build (totalrows=300, numrows=300, rows=0x21df3e8, attrs=<value optimized out>, stats=0x0) at
mvdistinct.c:103
> > 103                                     item->attrs = bms_add_member(item->attrs,
> > 
> > (gdb) p stats
> > $5 = (VacAttrStats **) 0x0
> > => This is an error, no ??
> 
> Not necessarily, but then I think this previous code is busted:

> If I recall things correctly, the "continue" should be executed
> regardless of IsAutoVacuumWorkerProcess() (only the log should be
> conditional on that).  I'm not sure how I arrived at the current coding,
> which I added in bf2a691e02d7 -- probably just fuzzy thinking.

Is it useful to run with that change ?

Justin


Re: PG10.1 autovac killed building extended stats

От
Justin Pryzby
Дата:
On Fri, Nov 17, 2017 at 03:43:52PM -0600, Justin Pryzby wrote:
> On Fri, Nov 17, 2017 at 01:36:00PM -0300, Alvaro Herrera wrote:
> > Justin Pryzby wrote:
> > 
> > > Core was generated by `postgres: autovacuum worker process   gtt             '.
> > > Program terminated with signal 11, Segmentation fault.
> > > #0  statext_ndistinct_build (totalrows=300, numrows=300, rows=0x21df3e8, attrs=<value optimized out>, stats=0x0)
atmvdistinct.c:103
 
> > > 103                                     item->attrs = bms_add_member(item->attrs,
> > > 
> > > (gdb) p stats
> > > $5 = (VacAttrStats **) 0x0
> > > => This is an error, no ??
> > 
> > Not necessarily, but then I think this previous code is busted:
> 
> > If I recall things correctly, the "continue" should be executed
> > regardless of IsAutoVacuumWorkerProcess() (only the log should be
> > conditional on that).  I'm not sure how I arrived at the current coding,
> > which I added in bf2a691e02d7 -- probably just fuzzy thinking.
> 
> Is it useful to run with that change ?

FYI: the affected server ran without crashing since Nov 18 with this change.

--- postgresql-10.1.orig/src/backend/statistics/extended_stats.c    2017-11-06 20:46:52.000000000 -0400
+++ postgresql-10.1/src/backend/statistics/extended_stats.c    2017-11-18 18:23:44.057988419 -0400
@@ -98,20 +98,21 @@
         if (!stats && !IsAutoVacuumWorkerProcess())
         {
             ereport(WARNING,
                     (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
                      errmsg("statistics object \"%s.%s\" could not be computed for relation \"%s.%s\"",
                             stat->schema, stat->name,
                             get_namespace_name(onerel->rd_rel->relnamespace),
                             RelationGetRelationName(onerel)),
                      errtable(onerel)));
-            continue;
         }
+        if (!stats)
+            continue;
 
         /* check allowed number of dimensions */
         Assert(bms_num_members(stat->columns) >= 2 &&
                bms_num_members(stat->columns) <= STATS_MAX_DIMENSIONS);
 
         /* compute statistic of each requested type */
         foreach(lc2, stat->types)
         {
             char        t = (char) lfirst_int(lc2);

Justin

Вложения

Re: PG10.1 autovac killed building extended stats

От
Alvaro Herrera
Дата:
Justin Pryzby wrote:
> On Fri, Nov 17, 2017 at 03:43:52PM -0600, Justin Pryzby wrote:
> > On Fri, Nov 17, 2017 at 01:36:00PM -0300, Alvaro Herrera wrote:

> > > If I recall things correctly, the "continue" should be executed
> > > regardless of IsAutoVacuumWorkerProcess() (only the log should be
> > > conditional on that).  I'm not sure how I arrived at the current coding,
> > > which I added in bf2a691e02d7 -- probably just fuzzy thinking.
> > 
> > Is it useful to run with that change ?
> 
> FYI: the affected server ran without crashing since Nov 18 with this change.

Many thanks for patching and verifying that it works.  I pushed the fix
now.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services