Обсуждение: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

Поиск
Список
Период
Сортировка

freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
hubert depesz lubaczewski
Дата:
hi
first of all - i know next to nothing about bsd.

friend asked me to setup replication on their machines, and i noticed
that one of cpus (2 quad xeons) is used to 100%:
pgsql  58241 99.0  0.2 22456  7432  ??  Rs   Thu10AM 1530:35.93 postgres: stats collector process    (postgres)

what might be the reason?

here are settings:
              name               |                setting
---------------------------------+---------------------------------------
 add_missing_from                | off
 allow_system_table_mods         | off
 archive_command                 | (disabled)
 archive_mode                    | off
 archive_timeout                 | 0
 array_nulls                     | on
 authentication_timeout          | 60
 autovacuum                      | on
 autovacuum_analyze_scale_factor | 0.1
 autovacuum_analyze_threshold    | 50
 autovacuum_freeze_max_age       | 200000000
 autovacuum_max_workers          | 3
 autovacuum_naptime              | 60
 autovacuum_vacuum_cost_delay    | 20
 autovacuum_vacuum_cost_limit    | -1
 autovacuum_vacuum_scale_factor  | 0.2
 autovacuum_vacuum_threshold     | 50
 backslash_quote                 | safe_encoding
 bgwriter_delay                  | 200
 bgwriter_lru_maxpages           | 100
 bgwriter_lru_multiplier         | 2
 block_size                      | 8192
 bonjour_name                    |
 check_function_bodies           | on
 checkpoint_completion_target    | 0.5
 checkpoint_segments             | 3
 checkpoint_timeout              | 300
 checkpoint_warning              | 30
 client_encoding                 | UTF8
 client_min_messages             | notice
 commit_delay                    | 0
 commit_siblings                 | 5
 config_file                     | /usr/local/pgsql/data/postgresql.conf
 constraint_exclusion            | off
 cpu_index_tuple_cost            | 0.005
 cpu_operator_cost               | 0.0025
 cpu_tuple_cost                  | 0.01
 custom_variable_classes         |
 data_directory                  | /usr/local/pgsql/data
 DateStyle                       | ISO, YMD
 db_user_namespace               | off
 deadlock_timeout                | 1000
 debug_assertions                | off
 debug_pretty_print              | off
 debug_print_parse               | off
 debug_print_plan                | off
 debug_print_rewritten           | off
 default_statistics_target       | 10
 default_tablespace              |
 default_text_search_config      | pg_catalog.simple
 default_transaction_isolation   | read committed
 default_transaction_read_only   | off
 default_with_oids               | off

 dynamic_library_path            | $libdir
 effective_cache_size            | 16384
 enable_bitmapscan               | on
 enable_hashagg                  | on
 enable_hashjoin                 | on
 enable_indexscan                | on
 enable_mergejoin                | on
 enable_nestloop                 | on
 enable_seqscan                  | on
 enable_sort                     | on
 enable_tidscan                  | on
 escape_string_warning           | on
 explain_pretty_print            | on
 external_pid_file               |
 extra_float_digits              | 0
 from_collapse_limit             | 8
 fsync                           | on
 full_page_writes                | on
 geqo                            | on
 geqo_effort                     | 5
 geqo_generations                | 0
 geqo_pool_size                  | 0
 geqo_selection_bias             | 2
 geqo_threshold                  | 12
 gin_fuzzy_search_limit          | 0
 hba_file                        | /usr/local/pgsql/data/pg_hba.conf
 ident_file                      | /usr/local/pgsql/data/pg_ident.conf
 ignore_system_indexes           | off
 integer_datetimes               | off
 join_collapse_limit             | 8
 krb_caseins_users               | off
 krb_realm                       |
 krb_server_hostname             |
 krb_server_keyfile              |
 krb_srvname                     | postgres
 lc_collate                      | C
 lc_ctype                        | pl_PL.UTF-8
 lc_messages                     | pl_PL.UTF-8
 lc_monetary                     | pl_PL.UTF-8
 lc_numeric                      | pl_PL.UTF-8
 lc_time                         | pl_PL.UTF-8
 listen_addresses                | 192.168.0.160
 local_preload_libraries         |
 log_autovacuum_min_duration     | -1
 log_checkpoints                 | off
 log_connections                 | off
 log_destination                 | syslog
 log_directory                   | pg_log
 log_disconnections              | off
 log_duration                    | off
 log_error_verbosity             | default
 log_executor_stats              | off
 log_filename                    | postgresql-%Y-%m-%d_%H%M%S.log
 log_hostname                    | off
 log_line_prefix                 |
 log_lock_waits                  | off
 log_min_duration_statement      | -1
 log_min_error_statement         | error
 log_min_messages                | notice
 log_parser_stats                | off
 log_planner_stats               | off
 log_rotation_age                | 1440
 log_rotation_size               | 10240
 log_statement                   | none
 log_statement_stats             | off
 log_temp_files                  | -1
 log_timezone                    | Poland
 log_truncate_on_rotation        | off
 logging_collector               | off
 maintenance_work_mem            | 16384
 max_connections                 | 180
 max_files_per_process           | 1000
 max_fsm_pages                   | 204800
 max_fsm_relations               | 1000
 max_function_args               | 100
 max_identifier_length           | 63
 max_index_keys                  | 32
 max_locks_per_transaction       | 64
 max_prepared_transactions       | 5
 max_stack_depth                 | 2048
 password_encryption             | on
 port                            | 5432
 post_auth_delay                 | 0
 pre_auth_delay                  | 0
 random_page_cost                | 4
 regex_flavor                    | advanced
 search_path                     | "$user",public
 seq_page_cost                   | 1
 server_encoding                 | UTF8
 server_version                  | 8.3.3
 server_version_num              | 80303
 session_replication_role        | origin
 shared_buffers                  | 4096
 shared_preload_libraries        |
 silent_mode                     | on
 sql_inheritance                 | on
 ssl                             | off
 ssl_ciphers                     | ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
 standard_conforming_strings     | off
 statement_timeout               | 0
 superuser_reserved_connections  | 3
 synchronize_seqscans            | on
 synchronous_commit              | on
 syslog_facility                 | LOCAL0
 syslog_ident                    | postgres
 tcp_keepalives_count            | 0
 tcp_keepalives_idle             | 0
 tcp_keepalives_interval         | 0
 temp_buffers                    | 1024
 temp_tablespaces                |
 TimeZone                        | Poland
 timezone_abbreviations          | Default
 trace_notify                    | off
 trace_sort                      | off
 track_activities                | on
 track_counts                    | on
 transaction_isolation           | read committed
 transaction_read_only           | off
 transform_null_equals           | off
 unix_socket_directory           |
 unix_socket_group               |
 unix_socket_permissions         | 511
 update_process_title            | off
 vacuum_cost_delay               | 0
 vacuum_cost_limit               | 200
 vacuum_cost_page_dirty          | 20
 vacuum_cost_page_hit            | 1
 vacuum_cost_page_miss           | 10
 vacuum_freeze_min_age           | 100000000
 wal_buffers                     | 8
 wal_sync_method                 | fsync
 wal_writer_delay                | 200
 work_mem                        | 1024
 xmlbinary                       | base64
 xmloption                       | content
 zero_damaged_pages              | off
(189 rows)

any ideas?

Best regards,

depesz


Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> friend asked me to setup replication on their machines, and i noticed
> that one of cpus (2 quad xeons) is used to 100%:
> pgsql  58241 99.0  0.2 22456  7432  ??  Rs   Thu10AM 1530:35.93 postgres: stats collector process    (postgres)

Hmm, we had some problems with the stats collector going nuts a couple
of versions ago; maybe you've found another way to trigger that.
Anything unusual about this DB (lots of tables, for instance)?
Can you try strace and/or gdb to figure out what the collector is doing?

            regards, tom lane

Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Bill Moran
Дата:
On Fri, 27 Jun 2008 18:04:19 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > friend asked me to setup replication on their machines, and i noticed
> > that one of cpus (2 quad xeons) is used to 100%:
> > pgsql  58241 99.0  0.2 22456  7432  ??  Rs   Thu10AM 1530:35.93 postgres: stats collector process    (postgres)
>
> Hmm, we had some problems with the stats collector going nuts a couple
> of versions ago; maybe you've found another way to trigger that.
> Anything unusual about this DB (lots of tables, for instance)?
> Can you try strace and/or gdb to figure out what the collector is doing?

Just in case you're not a FreeBSD expert, it's ktrace on FreeBSD.  strace
is the Linux equivalent.

--
Bill Moran <wmoran@collaborativefusion.com>

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
hubert depesz lubaczewski
Дата:
On Fri, Jun 27, 2008 at 06:37:45PM -0400, Bill Moran wrote:
> Just in case you're not a FreeBSD expert, it's ktrace on FreeBSD.  strace
> is the Linux equivalent.

i'm not an freebsd expert.

i ran ktrace -p <pid> - it exited immediately.
then i ran ktrace -p <pid> -f stats.ktrace.log -t\+
and it also exited immediately.
in cwd i have 2 binary files - not text ones as i would expect from
strace.

ktrace.out and stats.ktrace.log.

are these files "ok"? i mean - i can make them available, but i'm not
sure if they're ok given the fact that they don't look like text output
of strace.

Best regards,

depesz


Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Rodrigo Gonzalez
Дата:
hubert depesz lubaczewski wrote:
> On Fri, Jun 27, 2008 at 06:37:45PM -0400, Bill Moran wrote:
>> Just in case you're not a FreeBSD expert, it's ktrace on FreeBSD.  strace
>> is the Linux equivalent.
>
> i'm not an freebsd expert.
>
> i ran ktrace -p <pid> - it exited immediately.
> then i ran ktrace -p <pid> -f stats.ktrace.log -t\+
> and it also exited immediately.
> in cwd i have 2 binary files - not text ones as i would expect from
> strace.
>
> ktrace.out and stats.ktrace.log.
>
> are these files "ok"? i mean - i can make them available, but i'm not
> sure if they're ok given the fact that they don't look like text output
> of strace.
>
> Best regards,
>
> depesz
>
>
From ktrace man page....

DESCRIPTION
The ktrace utility enables kernel trace logging for the specified pro-
cesses.  Kernel trace data is logged to the file ktrace.out.  The kernel
operations that are traced include system calls, namei translations,sig-
nal processing, and I/O.

Once tracing is enabled on a process, trace data will be logged until
either the process exits or the trace point is cleared.  A traced
process can generate enormous amounts of log data quickly; It is
strongly suggested that users memorize how to disable tracing before
attempting to trace a process.  The following command is sufficient to
disable tracing on all user owned processes, and, if executed by root,
all processes:

           $ ktrace -C

So, read man page, and at least ktrace -C and read the output at
ktrace.out file.

Best regards

Rodrigo Gonzalez

Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
hubert depesz lubaczewski
Дата:
On Fri, Jun 27, 2008 at 06:04:19PM -0400, Tom Lane wrote:
> Anything unusual about this DB (lots of tables, for instance)?

sorry, missed that question - nothing unusual. couple of dbs (10-15),
around 20 tables per databases, only 1 database really used - all others
tend to be used only for testing purposes.

Best regards,

depesz


Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
hubert depesz lubaczewski
Дата:
On Sat, Jun 28, 2008 at 02:22:26AM -0300, Rodrigo Gonzalez wrote:
> So, read man page, and at least ktrace -C and read the output at
> ktrace.out file.

ok, i've read it and didn't understand. it says how to disable tracing
but it doesn't say anything about enabling tracing.

yesterday ktrace -p <pid> produced output, and today it generated empty
file.

so - apparently it's much more complicated than linux strace, and since
i generally never use bsd, i i'm not really interested in learning what
to do in this system to get trace. the only problem is that there
definitelly is some kind of problem in stats collector on bsd, but i'm
not able to help diagnose it at this moment.

Best regards,

depesz


Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Bill Moran
Дата:
On Sat, 28 Jun 2008 11:35:24 +0200
hubert depesz lubaczewski <depesz@depesz.com> wrote:

> On Sat, Jun 28, 2008 at 02:22:26AM -0300, Rodrigo Gonzalez wrote:
> > So, read man page, and at least ktrace -C and read the output at
> > ktrace.out file.
>
> ok, i've read it and didn't understand. it says how to disable tracing
> but it doesn't say anything about enabling tracing.

Use "ktrace -p [pid]" to start tracing.  Then use "ktrace -C" to stop
tracing.  Trace data is dumped in binary format to the file ktrace.out
(unless you use the -f option to specify another file)

Use the kdump utility to convert the ktrace.out file to something usable.
Something like "kdump > ktrace.txt" will probably get you what you want,
assuming your ktrace file is ktrace.out.

--
Bill Moran <wmoran@collaborativefusion.com>

Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
hubert depesz lubaczewski
Дата:
On Sun, Jun 29, 2008 at 05:13:59PM -0400, Bill Moran wrote:
> Use the kdump utility to convert the ktrace.out file to something usable.
> Something like "kdump > ktrace.txt" will probably get you what you want,
> assuming your ktrace file is ktrace.out.

ok. did it. in about 7 seconds, i got 250mb of ktrace.out.
ktrace.txt is 176 megabytes.
content of this is pretty simple:
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
 58241 postgres RET   poll -1 errno 4 Interrupted system call
 58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)

should i make the ktrace.txt available, or should i do something else to
help you ( i mean, the pg-hackers community) diagnose the issue?

Best regards,

depesz


Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Bill Moran
Дата:
On Fri, 27 Jun 2008 18:04:19 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > friend asked me to setup replication on their machines, and i noticed
> > that one of cpus (2 quad xeons) is used to 100%:
> > pgsql  58241 99.0  0.2 22456  7432  ??  Rs   Thu10AM 1530:35.93 postgres: stats collector process    (postgres)
>
> Hmm, we had some problems with the stats collector going nuts a couple
> of versions ago; maybe you've found another way to trigger that.
> Anything unusual about this DB (lots of tables, for instance)?
> Can you try strace and/or gdb to figure out what the collector is doing?

Just in case you're not a FreeBSD expert, it's ktrace on FreeBSD.  strace
is the Linux equivalent.

--
Bill Moran <wmoran@collaborativefusion.com>

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: freebsd + postgresql 8.3.3 = 100% of cpu usage on stats collector?

От
Tom Lane
Дата:
hubert depesz lubaczewski <depesz@depesz.com> writes:
>  58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
>  58241 postgres RET   poll -1 errno 4 Interrupted system call
>  58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
>  58241 postgres RET   poll -1 errno 4 Interrupted system call
>  58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
>  58241 postgres RET   poll -1 errno 4 Interrupted system call
>  58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
>  58241 postgres RET   poll -1 errno 4 Interrupted system call
>  58241 postgres CALL  poll(0x7fffffffd4e0,0x1,0x7d0)
>  58241 postgres RET   poll -1 errno 4 Interrupted system call

Hmm.  The wait-for-input loop in the stats collector assumes that
if it gets EINTR, it should just retry the poll() call immediately.
Which AFAIK is correct.  I think you're looking at a kernel or libc
bug here --- somehow the EINTR failure is recurring continuously.
It's probably time to ask about this on a FreeBSD-specific list.

            regards, tom lane