Discussion: Corrupted Data ?


Corrupted Data ?

From
Ioana Danes
Date:
Hi,

I suspect I am having a case of data corruption. Here are the details:

I am running postgres 9.4.8:

postgresql94-9.4.8-1PGDG.rhel7.x86_64
postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
postgresql94-server-9.4.8-1PGDG.rhel7.x86_64

on CentOS Linux release 7.2.1511 (Core)

This is happening in a production environment but luckily on the reporting database.
I have a cluster of 3 databases: db1 and db2 are masters that replicate between each other and also replicate to db3 (db1 <-> db2, db1 -> db3, db2 -> db3).
For replication I am using Bucardo.

The problem I am having is that one record in a table shows a wrong value in a single field:

select gameplayid, transactionid, encodedplay from mytable where transactionid in (75315811, 75315815) order by transactionid;
 
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019239 |      75315811 | mix:5,2,7
  160019237 |      75315811 | mix:5,4,8
  160019235 |      75315811 | mix:6,2,9
  160019233 |      75315811 | mix:1,9,8
  160019271 |      75315815 | mix:9,0,9
  160019269 |      75315815 | mix:9,8,9
  160019267 |      75315815 | mix:9,2,2
  160019265 |      75315815 | mix:2,2,8
  160019263 |      75315811 | backup:1,9,1
  160019261 |      75315815 | backup:2,0,9

select gameplayid, transactionid, encodedplay from mytable where transactionid in (75315815) order by transactionid;
 
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019271 |      75315815 | mix:9,0,9
  160019269 |      75315815 | mix:9,8,9
  160019267 |      75315815 | mix:9,2,2
  160019265 |      75315815 | mix:2,2,8
  160019263 |      75315811 | backup:1,9,1
  160019261 |      75315815 | backup:2,0,9

select gameplayid, transactionid, encodedplay from mytable where transactionid in (75315811) order by transactionid;
 
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019239 |      75315811 | mix:5,2,7
  160019237 |      75315811 | mix:5,4,8
  160019235 |      75315811 | mix:6,2,9
  160019233 |      75315811 | mix:1,9,8

So the record with gameplayid = 160019263 has a wrong transactionid, 75315811 instead of 75315815.
I know the correct value is 75315815 because of the following facts:
- on db1 and db2, transactionid = 75315815 for gameplayid = 160019263;
- this table gets mostly inserts; updates are very rare and only touch two other fields, not this one;
- there is another parent table that records the number of rows in this table, and it shows 4 rows for transactionid = 75315811 and 6 rows for transactionid = 75315815.

This table has an index on transactionid, and that index seems correct because the filtering and the ordering behave as if the field had the correct value...
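
(As a sanity check, forcing a sequential scan makes the WHERE filtering use the values stored in the heap rather than the index entries, so it shows which transactionid the table itself believes the row has. A rough sketch, run in a throwaway transaction since enable_seqscan is off in my config:)

begin;
set local enable_seqscan = on;
set local enable_indexscan = off;
set local enable_bitmapscan = off;
set local enable_indexonlyscan = off;
select gameplayid, transactionid, encodedplay
  from mytable
 where transactionid in (75315811, 75315815)
 order by transactionid;
rollback;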

What puzzles me is that the value that shows in this field is a real value from another record...

I only caught this issue because I have a script that runs nightly and compares the databases...

For now I have updated the field with the correct value and everything seems stable.
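
(The fix was a one-off update along these lines, with mytable standing in for the real table name:)

update mytable
   set transactionid = 75315815
 where gameplayid = 160019263
   and transactionid = 75315811;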

Postgres logs don't have any information about file corruption or any other kind of error. I also checked other logs on the system and I could not find any traces of corruption.
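
(For reference, the suspect tuple can also be examined at the storage level; a rough sketch using the pageinspect module from the contrib package I have installed, where the block number passed to get_raw_page is the first part of the ctid returned by the first query:)

-- physical location of the row: ctid is (block, offset)
select ctid, xmin, xmax, gameplayid, transactionid
  from mytable
 where gameplayid = 160019263;

-- dump the raw line pointers / tuple headers of that block
create extension if not exists pageinspect;
select * from heap_page_items(get_raw_page('mytable', 0));  -- replace 0 with the block number from the ctid above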

select name, setting from pg_settings order by 1;
                name                 |                                                     setting
-------------------------------------+-----------------------------------------------------------------------------------------------------------------
 allow_system_table_mods             | off
 application_name                    | psql
 archive_command                     | test -f /cbnDBscripts/tmp/PITR_primarydb_stop_backup || rsync --timeout=60 -atz %p stldrdb:/data01/wal_files/%f
 archive_mode                        | on
 archive_timeout                     | 60
 array_nulls                         | on
 authentication_timeout              | 60
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.1
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 200000000
 autovacuum_max_workers              | 3
 autovacuum_multixact_freeze_max_age | 400000000
 autovacuum_naptime                  | 60
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.2
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
 backslash_quote                     | safe_encoding
 bgwriter_delay                      | 200
 bgwriter_lru_maxpages               | 100
 bgwriter_lru_multiplier             | 2
 block_size                          | 8192
 bonjour                             | off
 bonjour_name                        |
 bytea_output                        | hex
 check_function_bodies               | on
 checkpoint_completion_target        | 0.5
 checkpoint_segments                 | 16
 checkpoint_timeout                  | 300
 checkpoint_warning                  | 30
 client_encoding                     | UTF8
 client_min_messages                 | error
 commit_delay                        | 0
 commit_siblings                     | 5
 config_file                         | /data01/postgres/postgresql.conf
 constraint_exclusion                | partition
 cpu_index_tuple_cost                | 0.005
 cpu_operator_cost                   | 0.0025
 cpu_tuple_cost                      | 0.01
 cursor_tuple_fraction               | 0.1
 data_checksums                      | off
 data_directory                      | /data01/postgres
 DateStyle                           | ISO, MDY
 db_user_namespace                   | off
 deadlock_timeout                    | 1000
 debug_assertions                    | off
 debug_pretty_print                  | on
 debug_print_parse                   | off
 debug_print_plan                    | off
 debug_print_rewritten               | off
 default_statistics_target           | 100
 default_tablespace                  |
 default_text_search_config          | pg_catalog.english
 default_transaction_deferrable      | off
 default_transaction_isolation       | read committed
 default_transaction_read_only       | off
 default_with_oids                   | off
 dynamic_library_path                | $libdir
 dynamic_shared_memory_type          | posix
 effective_cache_size                | 1048576
 effective_io_concurrency            | 1
 enable_bitmapscan                   | on
 enable_hashagg                      | on
 enable_hashjoin                     | on
 enable_indexonlyscan                | on
 enable_indexscan                    | on
 enable_material                     | on
 enable_mergejoin                    | on
 enable_nestloop                     | on
 enable_seqscan                      | off
 enable_sort                         | on
 enable_tidscan                      | on
 escape_string_warning               | on
 event_source                        | PostgreSQL
 exit_on_error                       | off
 external_pid_file                   |
 extra_float_digits                  | 0
 from_collapse_limit                 | 8
 fsync                               | on
 full_page_writes                    | on
 geqo                                | on
 geqo_effort                         | 5
 geqo_generations                    | 0
 geqo_pool_size                      | 0
 geqo_seed                           | 0
 geqo_selection_bias                 | 2
 geqo_threshold                      | 12
 gin_fuzzy_search_limit              | 0
 hba_file                            | /data01/postgres/pg_hba.conf
 hot_standby                         | off
 hot_standby_feedback                | off
 huge_pages                          | try
 ident_file                          | /data01/postgres/pg_ident.conf
 ignore_checksum_failure             | off
 ignore_system_indexes               | off
 integer_datetimes                   | on
 IntervalStyle                       | postgres
 join_collapse_limit                 | 8
 krb_caseins_users                   | off
 krb_server_keyfile                  | FILE:/etc/sysconfig/pgsql/krb5.keytab
 lc_collate                          | en_US.UTF-8
 lc_ctype                            | en_US.UTF-8
 lc_messages                         | en_US.UTF-8
 lc_monetary                         | en_US.UTF-8
 lc_numeric                          | en_US.UTF-8
 lc_time                             | en_US.UTF-8
 listen_addresses                    | *
 local_preload_libraries             |
 lock_timeout                        | 0
 lo_compat_privileges                | off
 log_autovacuum_min_duration         | -1
 log_checkpoints                     | off
 log_connections                     | on
 log_destination                     | csvlog
 log_directory                       | pg_log
 log_disconnections                  | on
 log_duration                        | off
 log_error_verbosity                 | default
 log_executor_stats                  | off
 log_file_mode                       | 0600
 log_filename                        | postgresql-%d.log
 logging_collector                   | on
 log_hostname                        | off
 log_line_prefix                     | %t %d %u
 log_lock_waits                      | off
 log_min_duration_statement          | 2000
 log_min_error_statement             | error
 log_min_messages                    | error
 log_parser_stats                    | off
 log_planner_stats                   | off
 log_rotation_age                    | 1440
 log_rotation_size                   | 0
 log_statement                       | none
 log_statement_stats                 | off
 log_temp_files                      | -1
 log_timezone                        | America/St_Lucia
 log_truncate_on_rotation            | on
 maintenance_work_mem                | 131072
 max_connections                     | 300
 max_files_per_process               | 1000
 max_function_args                   | 100
 max_identifier_length               | 63
 max_index_keys                      | 32
 max_locks_per_transaction           | 64
 max_pred_locks_per_transaction      | 64
 max_prepared_transactions           | 0
 max_replication_slots               | 0
 max_stack_depth                     | 2048
 max_standby_archive_delay           | 30000
 max_standby_streaming_delay         | 30000
 max_wal_senders                     | 0
 max_worker_processes                | 8
 password_encryption                 | on
 port                                | 5432
 post_auth_delay                     | 0
 pre_auth_delay                      | 0
 quote_all_identifiers               | off
 random_page_cost                    | 4
 restart_after_crash                 | on
 search_path                         | "$user",public,hstore,dblink,www_fdw
 segment_size                        | 131072
 seq_page_cost                       | 1
 server_encoding                     | UTF8
 server_version                      | 9.4.8
 server_version_num                  | 90408
 session_preload_libraries           |
 session_replication_role            | origin
 shared_buffers                      | 262144
 shared_preload_libraries            |
 sql_inheritance                     | on
 ssl                                 | off
 ssl_ca_file                         |
 ssl_cert_file                       | server.crt
 ssl_ciphers                         | HIGH:MEDIUM:+3DES:!aNULL
 ssl_crl_file                        |
 ssl_ecdh_curve                      | prime256v1
 ssl_key_file                        | server.key
 ssl_prefer_server_ciphers           | on
 ssl_renegotiation_limit             | 0
 standard_conforming_strings         | on
 statement_timeout                   | 0
 stats_temp_directory                | pg_stat_tmp
 superuser_reserved_connections      | 3
 synchronize_seqscans                | on
 synchronous_commit                  | on
 synchronous_standby_names           |
 syslog_facility                     | local0
 syslog_ident                        | postgres
 tcp_keepalives_count                | 0
 tcp_keepalives_idle                 | 0
 tcp_keepalives_interval             | 0
 temp_buffers                        | 1024
 temp_file_limit                     | -1
 temp_tablespaces                    |
 TimeZone                            | America/St_Lucia
 timezone_abbreviations              | Default
 trace_notify                        | off
 trace_recovery_messages             | log
 trace_sort                          | off
 track_activities                    | on
 track_activity_query_size           | 1024
 track_counts                        | on
 track_functions                     | none
 track_io_timing                     | off
 transaction_deferrable              | off
 transaction_isolation               | read committed
 transaction_read_only               | off
 transform_null_equals               | off
 unix_socket_directories             | /var/run/postgresql, /tmp
 unix_socket_group                   |
 unix_socket_permissions             | 0777
 update_process_title                | on
 vacuum_cost_delay                   | 0
 vacuum_cost_limit                   | 200
 vacuum_cost_page_dirty              | 20
 vacuum_cost_page_hit                | 1
 vacuum_cost_page_miss               | 10
 vacuum_defer_cleanup_age            | 0
 vacuum_freeze_min_age               | 50000000
 vacuum_freeze_table_age             | 150000000
 vacuum_multixact_freeze_min_age     | 5000000
 vacuum_multixact_freeze_table_age   | 150000000
 wal_block_size                      | 8192
 wal_buffers                         | 128
 wal_keep_segments                   | 0
 wal_level                           | hot_standby
 wal_log_hints                       | off
 wal_receiver_status_interval        | 10
 wal_receiver_timeout                | 60000
 wal_segment_size                    | 2048
 wal_sender_timeout                  | 60000
 wal_sync_method                     | fdatasync
 wal_writer_delay                    | 200
 work_mem                            | 2048
 xmlbinary                           | base64
 xmloption                           | content
 zero_damaged_pages                  | off
(239 rows)


Any suggestions, thoughts?


Re: Corrupted Data ?

From
Adrian Klaver
Date:
On 08/08/2016 09:11 AM, Ioana Danes wrote:
> Hi,
>
> I suspect I am having a case of data corruption. Here are the details:
>
> I am running postgres 9.4.8:
>
> postgresql94-9.4.8-1PGDG.rhel7.x86_64
> postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
> postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
> postgresql94-server-9.4.8-1PGDG.rhel7.x86_64
>
> on CentOS Linux release 7.2.1511 (Core)
>
> This is happening in a production environment but luckily on the
> reporting database.
> I have a cluster of 3 databases, db1 and db2 are masters and replicate
> between each other and also replicate to db3 (db1 <-> db2, db1 -> db3,
> db2 -> db3).
> For replication I am using Bucardo.

I would say this is more a question for the Bucardo list:

https://mail.endcrypt.com/mailman/listinfo/bucardo-general

I am just not seeing that replicating two masters on to a single
database is going to end well.

>
> The problem I am having is that one record in a table it shows a wrong
> value for one single field:
>
> select gameplayid, transactionid, encodedplay from mytable where
> transactionid in (75315811, 75315815) order by transactionid;
>
>  gameplayid | transactionid | encodedplay
> ------------+---------------+--------------
>   160019239 |      75315811 | mix:5,2,7
>   160019237 |      75315811 | mix:5,4,8
>   160019235 |      75315811 | mix:6,2,9
>   160019233 |      75315811 | mix:1,9,8
>   160019271 |      75315815 | mix:9,0,9
>   160019269 |      75315815 | mix:9,8,9
>   160019267 |      75315815 | mix:9,2,2
>   160019265 |      75315815 | mix:2,2,8
>   160019263 |      *75315811* | backup:1,9,1
>   160019261 |      75315815 | backup:2,0,9
>
> select gameplayid, transactionid, encodedplay from mytable where
> transactionid in (75315815) order by transactionid;
>
>  gameplayid | transactionid | encodedplay
> ------------+---------------+--------------
>   160019271 |      75315815 | mix:9,0,9
>   160019269 |      75315815 | mix:9,8,9
>   160019267 |      75315815 | mix:9,2,2
>   160019265 |      75315815 | mix:2,2,8
>   160019263 |      *75315811* | backup:1,9,1
>   160019261 |      75315815 | backup:2,0,9
>
> select gameplayid, transactionid, encodedplay from mytable where
> transactionid in (75315811) order by transactionid;
>
>  gameplayid | transactionid | encodedplay
> ------------+---------------+--------------
>   160019239 |      75315811 | mix:5,2,7
>   160019237 |      75315811 | mix:5,4,8
>   160019235 |      75315811 | mix:6,2,9
>   160019233 |      75315811 | mix:1,9,8
>
> So the record with gameplayid = 160019263 have a wrong transactionid,
> 75315811 instead of 75315815.
> The correct value is 75315815 and that I know because of the following
> facts:
> - on db1 and db2 transactionid = 75315815 for gameplayid = 160019263,
> - this table gets mostly inserts, very rare updates and only on other 2
> fields not this one.
> - there is another parent table that shows the number of records in this
> table which is 4 for transactionid =75315811  and 6 for transactionid =
> 7531581.
>
> This table has an index by transactionid and that index seem correct
> because the filtering and the ordering are fine (like the filed has the
> correct value)...
>
> What puzzles me is that the value that shows in this field is a real
> value from another record...
>
> I only caught this issue because I have a script that runs in the night
> that compares the databases ...
>
> By now I updated the field with the correct value and everything seem
> stable.
>
> Postgres logs don't have any information about file corruption or any
> other kind of error. I also checked other logs on the system and I could
> not find any traces of corruption.
>
> select name, setting from pg_settings order by 1;
> [239 rows of output snipped; identical to the settings list above]
>
>
> Any suggestions, thoughts?
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

From
Ioana Danes
Date:


On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/08/2016 09:11 AM, Ioana Danes wrote:
Hi,

I suspect I am having a case of data corruption. Here are the details:

I am running postgres 9.4.8:

postgresql94-9.4.8-1PGDG.rhel7.x86_64
postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
postgresql94-server-9.4.8-1PGDG.rhel7.x86_64

on CentOS Linux release 7.2.1511 (Core)

This is happening in a production environment but luckily on the
reporting database.
I have a cluster of 3 databases, db1 and db2 are masters and replicate
between each other and also replicate to db3 (db1 <-> db2, db1 -> db3,
db2 -> db3).
For replication I am using Bucardo.

I would say this is more a question for the Bucardo list:

https://mail.endcrypt.com/mailman/listinfo/bucardo-general

I am just not seeing that replicating two masters on to a single database is going to end well.

Only one master is active at a time; the other one is in standby. That is a topic for another discussion, but in our case it works well.

That was my first assumption too, that it is some kind of race condition or replication bug, but I quickly ruled that out because it does not explain why, when I filter the table by transactionid = 75315815, it shows one record with transactionid 75315811...

select gameplayid, transactionid, encodedplay from abrazo.matchgameplay where transactionid in (75315815) order by transactionid;;
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019271 |      75315815 | mix:9,0,9
  160019269 |      75315815 | mix:9,8,9
  160019267 |      75315815 | mix:9,2,2
  160019265 |      75315815 | mix:2,2,8
  160019263 |      75315811 | backup:1,9,1
  160019261 |      75315815 | backup:2,0,9

So I don't think it is a replication issue...

 


[remainder of quoted message snipped]

Re: Corrupted Data ?

From
Adrian Klaver
Date:
On 08/08/2016 09:28 AM, Ioana Danes wrote:
>
>
> On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 08/08/2016 09:11 AM, Ioana Danes wrote:
>
>         Hi,
>
>         I suspect I am having a case of data corruption. Here are the
>         details:
>
>         I am running postgres 9.4.8:
>
>         postgresql94-9.4.8-1PGDG.rhel7.x86_64
>         postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
>         postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
>         postgresql94-server-9.4.8-1PGDG.rhel7.x86_64
>
>         on CentOS Linux release 7.2.1511 (Core)
>
>         This is happening in a production environment but luckily on the
>         reporting database.
>         I have a cluster of 3 databases, db1 and db2 are masters and
>         replicate
>         between each other and also replicate to db3 (db1 <-> db2, db1
>         -> db3,
>         db2 -> db3).
>         For replication I am using Bucardo.
>
>
>     I would say this is more a question for the Burcardo list:
>
>     https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>     <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>
>
>     I am just not seeing that replicating two masters on to a single
>     database is going to end well.
>
>
> Only one master is active at one time the other one is in stand by that
> is a topic for another discussion but in our case that works well.
>
> That was my first assumption, that it is a kind of a race condition or a
> bug on replication but I quickly ruled that out because that does not
> explain why when I filtered the table by transactionid = 75315815 it
> shows one record with transactionid 75315811...
>
> select gameplayid, transactionid, encodedplay from abrazo.matchgameplay
> where transactionid in (75315815) order by transactionid;;
>  gameplayid | transactionid | encodedplay
> ------------+---------------+--------------
>   160019271 |      75315815 | mix:9,0,9
>   160019269 |      75315815 | mix:9,8,9
>   160019267 |      75315815 | mix:9,2,2
>   160019265 |      75315815 | mix:2,2,8
>   160019263 |      *75315811 *| backup:1,9,1
>   160019261 |      75315815 | backup:2,0,9
>
> So I don't think it is a replication issue...
>

Other than that, if I am following correctly, it is on the database (db3)
being replicated to. The only way db3 is getting its data is through
replication, is that correct? On the master databases the data is correct.


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

From
Ioana Danes
Date:


On Mon, Aug 8, 2016 at 12:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/08/2016 09:28 AM, Ioana Danes wrote:


On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 08/08/2016 09:11 AM, Ioana Danes wrote:

        Hi,

        I suspect I am having a case of data corruption. Here are the
        details:

        I am running postgres 9.4.8:

        postgresql94-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-server-9.4.8-1PGDG.rhel7.x86_64

        on CentOS Linux release 7.2.1511 (Core)

        This is happening in a production environment but luckily on the
        reporting database.
        I have a cluster of 3 databases, db1 and db2 are masters and
        replicate
        between each other and also replicate to db3 (db1 <-> db2, db1
        -> db3,
        db2 -> db3).
        For replication I am using Bucardo.


    I would say this is more a question for the Burcardo list:

    https://mail.endcrypt.com/mailman/listinfo/bucardo-general
    <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>

    I am just not seeing that replicating two masters on to a single
    database is going to end well.


Only one master is active at one time the other one is in stand by that
is a topic for another discussion but in our case that works well.

That was my first assumption, that it is a kind of a race condition or a
bug on replication but I quickly ruled that out because that does not
explain why when I filtered the table by transactionid = 75315815 it
shows one record with transactionid 75315811...

select gameplayid, transactionid, encodedplay from abrazo.matchgameplay
where transactionid in (75315815) order by transactionid;;
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019271 |      75315815 | mix:9,0,9
  160019269 |      75315815 | mix:9,8,9
  160019267 |      75315815 | mix:9,2,2
  160019265 |      75315815 | mix:2,2,8
  160019263 |      *75315811 *| backup:1,9,1
  160019261 |      75315815 | backup:2,0,9

So I don't think it is a replication issue...


Other that, if I am following correctly, it is on the database(db3) being replicated to. The only way db3 is getting its data is through replication, is that correct?. On the master databases the data is correct.

OK, let's assume that what you're saying is correct and the replication has a bug, or corruption, or whatever, and the record gets created with transactionid = 75315811. Bucardo replication is trigger-based and uses a COPY command to insert the new records into the replicated database.

Then how can I explain that my query (select gameplayid, transactionid, encodedplay from abrazo.matchgameplay where transactionid in (75315815) order by transactionid;) returns a record with transactionid 75315811?

Thanks,
ioana



--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

From
Melvin Davidson
Date:


On Mon, Aug 8, 2016 at 12:47 PM, Ioana Danes <ioanadanes@gmail.com> wrote:


On Mon, Aug 8, 2016 at 12:37 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/08/2016 09:28 AM, Ioana Danes wrote:


On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 08/08/2016 09:11 AM, Ioana Danes wrote:

        Hi,

        I suspect I am having a case of data corruption. Here are the
        details:

        I am running postgres 9.4.8:

        postgresql94-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
        postgresql94-server-9.4.8-1PGDG.rhel7.x86_64

        on CentOS Linux release 7.2.1511 (Core)

        This is happening in a production environment but luckily on the
        reporting database.
        I have a cluster of 3 databases, db1 and db2 are masters and
        replicate
        between each other and also replicate to db3 (db1 <-> db2, db1
        -> db3,
        db2 -> db3).
        For replication I am using Bucardo.


    I would say this is more a question for the Burcardo list:

    https://mail.endcrypt.com/mailman/listinfo/bucardo-general
    <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>

    I am just not seeing that replicating two masters on to a single
    database is going to end well.


Only one master is active at one time the other one is in stand by that
is a topic for another discussion but in our case that works well.

That was my first assumption, that it is a kind of a race condition or a
bug on replication but I quickly ruled that out because that does not
explain why when I filtered the table by transactionid = 75315815 it
shows one record with transactionid 75315811...

select gameplayid, transactionid, encodedplay from abrazo.matchgameplay
where transactionid in (75315815) order by transactionid;;
 gameplayid | transactionid | encodedplay
------------+---------------+--------------
  160019271 |      75315815 | mix:9,0,9
  160019269 |      75315815 | mix:9,8,9
  160019267 |      75315815 | mix:9,2,2
  160019265 |      75315815 | mix:2,2,8
  160019263 |      *75315811 *| backup:1,9,1
  160019261 |      75315815 | backup:2,0,9

So I don't think it is a replication issue...


Other that, if I am following correctly, it is on the database(db3) being replicated to. The only way db3 is getting its data is through replication, is that correct?. On the master databases the data is correct.

OK, let's assume that what you're saying is correct and the replication has a bug, or corruption or whatever that is and the record gets created with transactionid = 75315811. Bucardo replication is trigger based and it is using a copy command to insert the new records into the replicated database.

Then how can I explain that my query select gameplayid, transactionid, encodedplay from abrazo.matchgameplay where transactionid in (75315815) order by transactionid; returns me a record with transactionid 75315811???

Thanks,
ioana



--
Adrian Klaver
adrian.klaver@aklaver.com


You did not provide the full table structure, but this might possibly be a case of index corruption.
Try executing the following query, and if any indexes show up as invalid, just drop and rebuild them.

SELECT n.nspname,
       i.relname,
       i.indexrelname,
       CASE WHEN idx.indisprimary THEN 'pkey'
            WHEN idx.indisunique  THEN 'uidx'
            ELSE 'idx'
       END AS type,
       'INVALID'
  FROM pg_stat_all_indexes i
  JOIN pg_class c     ON (c.oid = i.relid)
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
  JOIN pg_index idx   ON (idx.indexrelid = i.indexrelid)
 WHERE idx.indisvalid = FALSE
 ORDER BY 1, 2;
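
If any index shows up as invalid, rebuilding it is cheap to try. A rough sketch with placeholder names (adjust to your actual table and index):

-- straight rebuild; blocks writes to the table while it runs
REINDEX INDEX mytable_transactionid_idx;

-- or build a replacement concurrently and swap it in, to avoid blocking writers
CREATE INDEX CONCURRENTLY mytable_transactionid_idx_new ON mytable (transactionid);
DROP INDEX mytable_transactionid_idx;
ALTER INDEX mytable_transactionid_idx_new RENAME TO mytable_transactionid_idx;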

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Corrupted Data ?

From
Adrian Klaver
Date:
On 08/08/2016 09:47 AM, Ioana Danes wrote:
>
>
> On Mon, Aug 8, 2016 at 12:37 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 08/08/2016 09:28 AM, Ioana Danes wrote:
>
>
>
>         On Mon, Aug 8, 2016 at 12:19 PM, Adrian Klaver
>         <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>> wrote:
>
>             On 08/08/2016 09:11 AM, Ioana Danes wrote:
>
>                 Hi,
>
>                 I suspect I am having a case of data corruption. Here
>         are the
>                 details:
>
>                 I am running postgres 9.4.8:
>
>                 postgresql94-9.4.8-1PGDG.rhel7.x86_64
>                 postgresql94-contrib-9.4.8-1PGDG.rhel7.x86_64
>                 postgresql94-libs-9.4.8-1PGDG.rhel7.x86_64
>                 postgresql94-server-9.4.8-1PGDG.rhel7.x86_64
>
>                 on CentOS Linux release 7.2.1511 (Core)
>
>                 This is happening in a production environment but
>         luckily on the
>                 reporting database.
>                 I have a cluster of 3 databases, db1 and db2 are masters and
>                 replicate
>                 between each other and also replicate to db3 (db1 <->
>         db2, db1
>                 -> db3,
>                 db2 -> db3).
>                 For replication I am using Bucardo.
>
>
>             I would say this is more a question for the Burcardo list:
>
>             https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>         <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>
>             <https://mail.endcrypt.com/mailman/listinfo/bucardo-general
>         <https://mail.endcrypt.com/mailman/listinfo/bucardo-general>>
>
>             I am just not seeing that replicating two masters on to a single
>             database is going to end well.
>
>
>         Only one master is active at one time the other one is in stand
>         by that
>         is a topic for another discussion but in our case that works well.
>
>         That was my first assumption, that it is a kind of a race
>         condition or a
>         bug on replication but I quickly ruled that out because that
>         does not
>         explain why when I filtered the table by transactionid = 75315815 it
>         shows one record with transactionid 75315811...
>
>         select gameplayid, transactionid, encodedplay from
>         abrazo.matchgameplay
>         where transactionid in (75315815) order by transactionid;;
>          gameplayid | transactionid | encodedplay
>         ------------+---------------+--------------
>           160019271 |      75315815 | mix:9,0,9
>           160019269 |      75315815 | mix:9,8,9
>           160019267 |      75315815 | mix:9,2,2
>           160019265 |      75315815 | mix:2,2,8
>           160019263 |      *75315811 *| backup:1,9,1
>           160019261 |      75315815 | backup:2,0,9
>
>         So I don't think it is a replication issue...
>
>
>     Other that, if I am following correctly, it is on the database(db3)
>     being replicated to. The only way db3 is getting its data is through
>     replication, is that correct?. On the master databases the data is
>     correct.
>
> OK, let's assume that what you're saying is correct and the replication
> has a bug, or corruption or whatever that is and the record gets created
> with transactionid = 75315811. Bucardo replication is trigger based and
> it is using a copy command to insert the new records into the replicated
> database.
>
> Then how can I explain that my query select gameplayid, transactionid,
> encodedplay from abrazo.matchgameplay where transactionid in (75315815)
> order by transactionid; returns me a record with transactionid 75315811???

Corrupted index on db3?

Might want to look in the db3 logs to see if anything pops out.

I just do not know enough about Bucardo to be of much help beyond that.
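One thing that might help separate index corruption from heap (table file) corruption is to force a sequential scan in a session on db3 and re-run the query. This is only a sketch using the standard planner settings, nothing Bucardo-specific: with index scans disabled the filter is evaluated against the values actually stored in the table, so if the suspect row stops showing up for transactionid = 75315815 (and shows up for 75315811 instead), the bad value is in the table data and the index still carries the correct one.

set enable_indexscan = off;   -- session-local planner settings
set enable_bitmapscan = off;
select gameplayid, transactionid, encodedplay
  from abrazo.matchgameplay
 where transactionid in (75315815)
 order by transactionid;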

>
> Thanks,
> ioana
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Mon, Aug 8, 2016 at 12:55 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/08/2016 09:47 AM, Ioana Danes wrote:


OK, let's assume that what you're saying is correct and the replication
has a bug, or corruption or whatever that is and the record gets created
with transactionid = 75315811. Bucardo replication is trigger based and
it is using a copy command to insert the new records into the replicated
database.

Then how can I explain that my query select gameplayid, transactionid,
encodedplay from abrazo.matchgameplay where transactionid in (75315815)
order by transactionid; returns me a record with transactionid 75315811???

Corrupted index on db3?

yes
 
Might want to look in the db3 logs to see if anything pops out.

I checked the logs, no traces of errors or corruption.
 
I just do not know enough about Bucardo to be of much help beyond that.
 

It is trigger based; it saves the ids of the inserted records in a delta table and then on sync it creates copy commands to the slave. Even if there is a bug or corruption in that process I don't see how that corrupts the index on db3...


So it is either a replication bug plus index corruption on db3, or data corruption on db3...


In response to Melvin, the query returns no rows:


SELECT n.nspname,
               i.relname,
               i.indexrelname,
               CASE WHEN idx.indisprimary
                         THEN 'pkey'
                          WHEN idx.indisunique
                          THEN 'uidx'
                          ELSE 'idx'
                  END AS type,
                  'INVALID'
  FROM pg_stat_all_indexes i
    JOIN pg_class c     ON (c.oid = i.relid)
    JOIN pg_namespace n ON (n.oid = c.relnamespace)
    JOIN pg_index idx   ON (idx.indexrelid =  i.indexrelid )
WHERE idx.indisvalid = FALSE  
ORDER BY 1, 2;

 nspname | relname | indexrelname | type | ?column?
---------+---------+--------------+------+----------
(0 rows)


Thank you for your thoughts,
ioana


Thanks,
ioana



    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/08/2016 10:06 AM, Ioana Danes wrote:
>
>
> On Mon, Aug 8, 2016 at 12:55 PM, Adrian Klaver wrote:
>
>
>     Corrupted index on db3?
>
> yes
>
>
>     Might want to look in the db3 logs to see if anything pops out.
>
> I checked the logs, no traces of errors or corruption.
>
>
>     I just do not know enough about Burcardo to be of much help beyond that.
>
>
>
> it is trigger based, it saves the ids of the inserted record in a delta
> table and then on sync it creates copy commands to the slave. Even if
> there is a bug or corruption in that process I don't see how that
> corrupts the index on db3...

It seems to do more than that:

https://bucardo.org/wiki/Bucardo/Documentation/Overview

That is why I suggested posting to the Bucardo list. Folks there will
have a better idea of what goes on under the hood.

There is also this from a previous post:

"Only one master is active at one time the other one is in stand by that
is a topic for another discussion but in our case that works well."

Have no idea how that interaction plays out.

At this point what I see is:

1) Data is entered on a master and is correct there.

2) Data is replicated to a single standby from one of two possible
sources via Bucardo and is no longer correct.

3) Now Bucardo uses Postgres to do its work so it is possible that
something in Postgres is at fault. Still the fact that the data is good
on the master but not in the standby tends to indicate that the act of
replication is the issue.

4) Exactly how that replication is accomplished is not obvious to me.



>
>
> So it is either replication bug + index corruption on db3 or data
> corruption on db3...
>
>
> In response to Melvin, the query returns no rows:
>
>
> SELECT n.nspname,
>                i.relname,
>                i.indexrelname,
>                CASE WHEN idx.indisprimary
>                          THEN 'pkey'
>                           WHEN idx.indisunique
>                           THEN 'uidx'
>                           ELSE 'idx'
>                   END AS type,
>                   'INVALID'
>   FROM pg_stat_all_indexes i
>     JOIN pg_class c     ON (c.oid = i.relid)
>     JOIN pg_namespace n ON (n.oid = c.relnamespace)
>     JOIN pg_index idx   ON (idx.indexrelid =  i.indexrelid )
> WHERE idx.indisvalid = FALSE
> ORDER BY 1, 2;
>
>  nspname | relname | indexrelname | type | ?column?
> ---------+---------+--------------+------+----------
> (0 rows)
>
>
> Thank you for your thoughts,
> ioana
>
>
>
>         Thanks,
>         ioana
>
>
>
>             --
>             Adrian Klaver
>             adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>         <mailto:adrian.klaver@aklaver.com
>         <mailto:adrian.klaver@aklaver.com>>
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:
Hello Everyone,

I have new information on this case. I also opened a post on the Bucardo list because I am still not sure what triggers this problem.

The problem happened again on the same table but on another field. A few days ago I started a fourth database called drdb, which is a PITR slave of db3.

DB1, DB2, DRDB:

-[ RECORD 1 ]-+-----------
gameplayid    | 324455566
systemuserid  | 515151
gameid        | 41
transactionid | 666556533
drawid        | 318220
....

DB3:
-[ RECORD 1 ]-+-----------
gameplayid    | 32445556
systemuserid  | 515151
gameid        | 41
transactionid | 666556533
drawid        | 318216



Here are the facts I know:

August 10 @ 11:10
    - The record was created on db1 and replicated to db2 and db3
August 11 @ 2:30
    - db1, db2 and db3 are in sync (I have a script that compares the data for all 3 dbs every night @ 2:30 am)
August 12 @ 2:30
    - db3 is out of sync because of this field (drawid)
    - drdb (which is PITRed from db3) is in sync with db1 and db2?????


Because drdb (the PITR slave of db3) is in sync with db1 and db2, and because the base backup was taken before the record in question was created, I believe the xlogs are fine and that I have some kind of data corruption on db3 in the data file for that table, which happened after August 11 @ 2:30 (since the compare script still found the dbs in sync at that point)...

Also, the index is correct on db3, as the record in question (which now shows drawid = 318216) is still retrieved if I filter by drawid = 318220.
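If it helps to pin this down to a single page, the suspect row's ctid together with pg_relation_filepath() (both standard) would show which block of which file on db3 holds the bad tuple; the same lookup could then be compared against a copy that still has the good data. Just a sketch, filtering on the transactionid value that is intact on all the databases:

select ctid, gameplayid, transactionid, drawid
  from abrazo.matchgameplay
 where transactionid = 666556533;
-- ctid is (block number, item number); the file on disk is:
select pg_relation_filepath('abrazo.matchgameplay');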

Any help is greatly appreciated,

Thank you



Re: Corrupted Data ?

От
Melvin Davidson
Дата:


On Fri, Aug 12, 2016 at 9:09 AM, Ioana Danes <ioanadanes@gmail.com> wrote:
Hello Everyone,

I have new information on this case.  I also open a post for Bucardo because I am still not sure what triggers this problem.

The problem happened again on the same table but on another field. Few days ago I started a fourth database called drdb that is a PITR slave from db3.

DB1, DB2, DRDB:

-[ RECORD 1 ]-------+--------------------------
gameplayid           | 324455566
systemuserid        | 515151
gameid                 | 41
transactionid         | 666556533
drawid                  | 318220
....

DB3:
-[ RECORD 1 ]-------+--------------------------
gameplayid           | 32445556
systemuserid        | 515151
gameid                 | 41
transactionid         | 666556533
drawid                  | 318216



Here are the facts I know:

August 10 @ 11:10
    - The record was created on db1 and replicated to db2 and db3
August 11 @ 2:30
    - db1, db2 and db3 are in sync (I have a script that compares the data for all 3 dbs every night @ 2:30 am)
August 12 @ 2:30
    - db3 is out of sync because of this field (drawid)
    - drdb (which is PITRed from db3) is in sync with db1 and db2?????


Because drdb (PITR slave from db3) is in sync with db1 and db2 and because the base backup was taken before the record in case was created, I believe that the xlogs are fine and I have a data kind of corruption on db3 on the data file for that table that happened after August 11 @ 2:30 (because the compare script found the dbs in sync)...

Also the index is correct on db3 as the record in case (with drawid = 318216) is retrieved if I filter by drawid = 318220

Any help is greatly appreciated,

Thank you




Just out of curiosity, have you tried to REINDEX any of the indexes that might be involved on db3?
Even though PostgreSQL considers them valid, it's possible that their pointers could be incorrect.

Additionally, I would compare the EXPLAIN output for the query on all 3 dbs and see if there is any difference between the three.
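For example, something along these lines (just a sketch; ix_matchgame_drawid is assumed to be the drawid index on that table, substitute whatever indexes the plans actually use):

REINDEX INDEX abrazo.ix_matchgame_drawid;   -- or REINDEX TABLE abrazo.matchgameplay; to rebuild them all
EXPLAIN SELECT * FROM abrazo.matchgameplay WHERE drawid = 318220;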

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: Corrupted Data ?

От
Francisco Olarte
Дата:
On Fri, Aug 12, 2016 at 3:09 PM, Ioana Danes <ioanadanes@gmail.com> wrote:
....
> drawid                  | 318220
...
> drawid                  | 318216

> Here are the facts I know:
>
> August 10 @ 11:10
>     - The record was created on db1 and replicated to db2 and db3
> August 11 @ 2:30
>     - db1, db2 and db3 are in sync (I have a script that compares the data
> for all 3 dbs every night @ 2:30 am)
> August 12 @ 2:30
>     - db3 is out of sync because of this field (drawid)
>     - drdb (which is PITRed from db3) is in sync with db1 and db2?????

Given that 318220 and 318216 are just one bit apart (0x4db0c / 0x4db08), and that this
repeats sporadically, have you ruled out (by page checksums or
some other mechanism) a potential disk read/write error?

> Also the index is correct on db3 as the record in case (with drawid =
> 318216) is retrieved if I filter by drawid = 318220

Especially if this happens, you may have some slightly bad disks/RAM
leading to this kind of problem.
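As a quick check of whether page checksums are even in play here (a sketch; data_checksums is a read-only parameter and will be off unless the cluster was initdb'ed with --data-checksums, and pg_controldata reports the same thing as "Data page checksum version"):

show data_checksums;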

Francisco Olarte.


Re: Corrupted Data ?

От
Ioana Danes
Дата:
Hi Melvin,

On Fri, Aug 12, 2016 at 9:36 AM, Melvin Davidson <melvin6925@gmail.com> wrote:




Just out of curiosity, have you tried to REINDEX any of the indexes that might be involved on db3?
Even though PostgreSQL considers them valid, it's possible that their pointers could be incorrect.

Additionally, I would compare the EXPLAIN query on all 3 db's and see if there is any difference between the three.


The (drawid) index is correct:

There are too many records for that draw so I will create a temp table with all the records for that draw and then retrieve the faulty one:

# create temporary table tmp_test as select * from abrazo.matchgameplay where drawid = 318220 order by datemodified;
SELECT 16611

# select * from tmp_test where drawid = 318216; -- this query should return nothing
-[ RECORD 1 ]-------+--------------------------
gameplayid           | 159329792
systemuserid        | 12797
gameid                 | 4
transactionid         | 75006684
drawid                  | 318216

# explain select * from abrazo.matchgameplay where drawid = 318220 order by datemodified;
                                              QUERY PLAN
------------------------------------------------------------------------------------------------------
 Sort  (cost=1039.41..1058.65 rows=7696 width=103)
   Sort Key: datemodified
   ->  Index Scan using ix_matchgame_drawid on matchgameplay  (cost=0.44..542.64 rows=7696 width=103)
         Index Cond: (drawid = 318220)
(4 rows)


I will probably end up rebuilding the table from the other good dbs...

The only problem I have is that the values I see in these cases don't look like corruption to me; they are valid information (for example, this one is an id of another draw in the same period from another game...)
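If the contrib modules are available, one more comparison that could settle whether this is a bit flip in the heap page would be to dump the raw page items for the block holding the bad tuple on db3 and on a copy with the good data. A sketch using pageinspect (the block number would come from the suspect row's ctid; 0 below is only a placeholder):

create extension if not exists pageinspect;   -- superuser only

select lp, lp_off, lp_len, t_xmin, t_xmax, t_ctid
  from heap_page_items(get_raw_page('abrazo.matchgameplay', 0));  -- 0 = placeholder block number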

Thanks,
ioana




Re: Corrupted Data ?

От
Francisco Olarte
Дата:
CCing to the list...

On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes <ioanadanes@gmail.com> wrote:
>> given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
>> repeats sporadically, have you ruled out ( by having page checksums or
>> other mechanism ) a potential disk read/write error ?
>>
>>
>> > Also the index is correct on db3 as the record in case (with drawid =
>> > 318216) is retrieved if I filter by drawid = 318220
>>
>> Specially if this happens, you may have some slightly bad disks/ram/
>> leading to this kind of problems.
>>
>
> Could be. I also had some issues with an rsync between db3 and drdb a week
> ago that did not complete for bigger files (> 200MB) and gave me some
> corruption messages. Then the system was rebooted and everything seemed
> fine but apparently it is not.
> I am planning to drop & create the table from a good backup and if that does
> not fix the issue then I will rebuild the server.

I would check whatever logs you can (syslog or eventlog, SMART log,
etc.) hunting for disk errors (sometimes they are reported). This
kind of problem, with programs as well tested as postgres and rsync, tends
to indicate a controller/RAM/disk going bad (in your case it could be
caused by a single bit getting flipped in a sector for the data
portion of the table, and not being propagated either because it
happened after your sync of drdb, or because drdb was synced from the WAL
and not the table, or because the page was read from the disk cache).

Francisco Olarte.


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte <folarte@peoplecall.com> wrote:
CCing to the list...

Thanks
 
On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes <ioanadanes@gmail.com> wrote:
>> given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
>> repeats sporadically, have you ruled out ( by having page checksums or
>> other mechanism ) a potential disk read/write error ?
>>
>>
>> > Also the index is correct on db3 as the record in case (with drawid =
>> > 318216) is retrieved if I filter by drawid = 318220
>>
>> Specially if this happens, you may have some slightly bad disks/ram/
>> leading to this kind of problems.
>>
>
> Could be. I also had some issues with an rsync between db3 and drdb a week
> ago that did not complete for bigger files (> 200MB) and gave me some
> corruption messages. Then the system was revbooted and everything seemed
> fine but apparently it is not.
> I am planning to drop & create the table from a good backup and if that does
> not fix the issue then I will rebuild the server.

I would check whatever logs you can ( syslog or eventlog, smart log,
etc.. ) hunting for disk errors ( sometimes they are reported ). This
kind of problems, with programs as tested as postgres and rsync, tend
to indicate controller/RAM/disk going bad ( in your case it could be
caused by a single bit getting flipped in a sector for the data
portion of the table, and not being propagated either because it
happened after your sync of drdb or because it was synced from the WAL
and not the table, or because it was read from the disk cache ).

I agree, unfortunately I did not find any clues about corruption or any anomalies in the logs.
I will work tonight to rebuild that table and see where I go from there.

Thanks,
ioana

Francisco Olarte.

Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/12/2016 08:10 AM, Ioana Danes wrote:
>
>
> On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
> <folarte@peoplecall.com <mailto:folarte@peoplecall.com>> wrote:
>
>     CCing to the list...
>
> Thanks
>
>
>     On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes <ioanadanes@gmail.com
>     <mailto:ioanadanes@gmail.com>> wrote:
>     >> given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
>     >> repeats sporadically, have you ruled out ( by having page
>     checksums or
>     >> other mechanism ) a potential disk read/write error ?
>     >>
>     >>
>     >> > Also the index is correct on db3 as the record in case (with
>     drawid =
>     >> > 318216) is retrieved if I filter by drawid = 318220
>     >>
>     >> Specially if this happens, you may have some slightly bad disks/ram/
>     >> leading to this kind of problems.
>     >>
>     >
>     > Could be. I also had some issues with an rsync between db3 and
>     drdb a week
>     > ago that did not complete for bigger files (> 200MB) and gave me some
>     > corruption messages. Then the system was revbooted and everything
>     seemed
>     > fine but apparently it is not.
>     > I am planning to drop & create the table from a good backup and if
>     that does
>     > not fix the issue then I will rebuild the server.
>
>     I would check whatever logs you can ( syslog or eventlog, smart log,
>     etc.. ) hunting for disk errors ( sometimes they are reported ). This
>     kind of problems, with programs as tested as postgres and rsync, tend
>     to indicate controller/RAM/disk going bad ( in your case it could be
>     caused by a single bit getting flipped in a sector for the data
>     portion of the table, and not being propagated either because it
>     happened after your sync of drdb or because it was synced from the WAL
>     and not the table, or because it was read from the disk cache ).
>
> I agree, unfortunately I did not find any clues about corruption or any
> anomalies in the logs.
> I will work tonight to rebuild that table and see where I go from there.

The db3 database is on a different machine from all the other databases
you set up, correct?

>
> Thanks,
> ioana
>
>     Francisco Olarte.
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/12/2016 08:10 AM, Ioana Danes wrote:


On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
<folarte@peoplecall.com <mailto:folarte@peoplecall.com>> wrote:

    CCing to the list...

Thanks


    On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes <ioanadanes@gmail.com
    <mailto:ioanadanes@gmail.com>> wrote:
    >> given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
    >> repeats sporadically, have you ruled out ( by having page
    checksums or
    >> other mechanism ) a potential disk read/write error ?
    >>
    >>
    >> > Also the index is correct on db3 as the record in case (with
    drawid =
    >> > 318216) is retrieved if I filter by drawid = 318220
    >>
    >> Specially if this happens, you may have some slightly bad disks/ram/
    >> leading to this kind of problems.
    >>
    >
    > Could be. I also had some issues with an rsync between db3 and
    drdb a week
    > ago that did not complete for bigger files (> 200MB) and gave me some
    > corruption messages. Then the system was revbooted and everything
    seemed
    > fine but apparently it is not.
    > I am planning to drop & create the table from a good backup and if
    that does
    > not fix the issue then I will rebuild the server.

    I would check whatever logs you can ( syslog or eventlog, smart log,
    etc.. ) hunting for disk errors ( sometimes they are reported ). This
    kind of problems, with programs as tested as postgres and rsync, tend
    to indicate controller/RAM/disk going bad ( in your case it could be
    caused by a single bit getting flipped in a sector for the data
    portion of the table, and not being propagated either because it
    happened after your sync of drdb or because it was synced from the WAL
    and not the table, or because it was read from the disk cache ).

I agree, unfortunately I did not find any clues about corruption or any
anomalies in the logs.
I will work tonight to rebuild that table and see where I go from there.

The db3 database is on a different machine from all the other databases you set up, correct?

Yes, they are all different VMs; the first 3 dbs are on the same cluster, but drdb is a remote machine.

Thank you
 

Thanks,
ioana

    Francisco Olarte.




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/12/2016 08:30 AM, Ioana Danes wrote:
>
>
> On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 08/12/2016 08:10 AM, Ioana Danes wrote:
>
>
>
>         On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
>         <folarte@peoplecall.com <mailto:folarte@peoplecall.com>
>         <mailto:folarte@peoplecall.com <mailto:folarte@peoplecall.com>>>
>         wrote:
>
>             CCing to the list...
>
>         Thanks
>
>
>             On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes
>         <ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>
>             <mailto:ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>>>
>         wrote:
>             >> given 318220 and 318216 are just a bit away ( 4db08/4db0c
>         ), and it
>             >> repeats sporadically, have you ruled out ( by having page
>             checksums or
>             >> other mechanism ) a potential disk read/write error ?
>             >>
>             >>
>             >> > Also the index is correct on db3 as the record in case
>         (with
>             drawid =
>             >> > 318216) is retrieved if I filter by drawid = 318220
>             >>
>             >> Specially if this happens, you may have some slightly bad
>         disks/ram/
>             >> leading to this kind of problems.
>             >>
>             >
>             > Could be. I also had some issues with an rsync between db3 and
>             drdb a week
>             > ago that did not complete for bigger files (> 200MB) and
>         gave me some
>             > corruption messages. Then the system was revbooted and
>         everything
>             seemed
>             > fine but apparently it is not.
>             > I am planning to drop & create the table from a good
>         backup and if
>             that does
>             > not fix the issue then I will rebuild the server.
>
>             I would check whatever logs you can ( syslog or eventlog,
>         smart log,
>             etc.. ) hunting for disk errors ( sometimes they are
>         reported ). This
>             kind of problems, with programs as tested as postgres and
>         rsync, tend
>             to indicate controller/RAM/disk going bad ( in your case it
>         could be
>             caused by a single bit getting flipped in a sector for the data
>             portion of the table, and not being propagated either because it
>             happened after your sync of drdb or because it was synced
>         from the WAL
>             and not the table, or because it was read from the disk cache ).
>
>         I agree, unfortunately I did not find any clues about corruption
>         or any
>         anomalies in the logs.
>         I will work tonight to rebuild that table and see where I go
>         from there.
>
>
>     The db3 database is on a different machine from all the other
>     databases you set up, correct?
>
> Yes, they are all different vms first 3 dbs are on the same cluster but
> drdb is a remote machine,

Aah, another player in the mix.

What virtualization technology are you using?

>
> Thank you
>
>
>
>         Thanks,
>         ioana
>
>             Francisco Olarte.
>
>
>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 11:34 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/12/2016 08:30 AM, Ioana Danes wrote:


On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 08/12/2016 08:10 AM, Ioana Danes wrote:



        On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
        <folarte@peoplecall.com <mailto:folarte@peoplecall.com>
        <mailto:folarte@peoplecall.com <mailto:folarte@peoplecall.com>>>
        wrote:

            CCing to the list...

        Thanks


            On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes
        <ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>
            <mailto:ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>>>
        wrote:
            >> given 318220 and 318216 are just a bit away ( 4db08/4db0c
        ), and it
            >> repeats sporadically, have you ruled out ( by having page
            checksums or
            >> other mechanism ) a potential disk read/write error ?
            >>
            >>
            >> > Also the index is correct on db3 as the record in case
        (with
            drawid =
            >> > 318216) is retrieved if I filter by drawid = 318220
            >>
            >> Specially if this happens, you may have some slightly bad
        disks/ram/
            >> leading to this kind of problems.
            >>
            >
            > Could be. I also had some issues with an rsync between db3 and
            drdb a week
            > ago that did not complete for bigger files (> 200MB) and
        gave me some
            > corruption messages. Then the system was revbooted and
        everything
            seemed
            > fine but apparently it is not.
            > I am planning to drop & create the table from a good
        backup and if
            that does
            > not fix the issue then I will rebuild the server.

            I would check whatever logs you can ( syslog or eventlog,
        smart log,
            etc.. ) hunting for disk errors ( sometimes they are
        reported ). This
            kind of problems, with programs as tested as postgres and
        rsync, tend
            to indicate controller/RAM/disk going bad ( in your case it
        could be
            caused by a single bit getting flipped in a sector for the data
            portion of the table, and not being propagated either because it
            happened after your sync of drdb or because it was synced
        from the WAL
            and not the table, or because it was read from the disk cache ).

        I agree, unfortunately I did not find any clues about corruption
        or any
        anomalies in the logs.
        I will work tonight to rebuild that table and see where I go
        from there.


    The db3 database is on a different machine from all the other
    databases you set up, correct?

Yes, they are all different vms first 3 dbs are on the same cluster but
drdb is a remote machine,

Aah, another player in the mix.

What virtualization technology are you using?

kvm

 


Thank you



        Thanks,
        ioana

            Francisco Olarte.




    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 11:44 AM, Ioana Danes <ioanadanes@gmail.com> wrote:


On Fri, Aug 12, 2016 at 11:34 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/12/2016 08:30 AM, Ioana Danes wrote:


On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver
<adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:

    On 08/12/2016 08:10 AM, Ioana Danes wrote:



        On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
        <folarte@peoplecall.com <mailto:folarte@peoplecall.com>
        <mailto:folarte@peoplecall.com <mailto:folarte@peoplecall.com>>>
        wrote:

            CCing to the list...

        Thanks


            On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes
        <ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>
            <mailto:ioanadanes@gmail.com <mailto:ioanadanes@gmail.com>>>
        wrote:
            >> given 318220 and 318216 are just a bit away ( 4db08/4db0c
        ), and it
            >> repeats sporadically, have you ruled out ( by having page
            checksums or
            >> other mechanism ) a potential disk read/write error ?
            >>
            >>
            >> > Also the index is correct on db3 as the record in case
        (with
            drawid =
            >> > 318216) is retrieved if I filter by drawid = 318220
            >>
            >> Specially if this happens, you may have some slightly bad
        disks/ram/
            >> leading to this kind of problems.
            >>
            >
            > Could be. I also had some issues with an rsync between db3 and
            drdb a week
            > ago that did not complete for bigger files (> 200MB) and
        gave me some
            > corruption messages. Then the system was revbooted and
        everything
            seemed
            > fine but apparently it is not.
            > I am planning to drop & create the table from a good
        backup and if
            that does
            > not fix the issue then I will rebuild the server.

            I would check whatever logs you can ( syslog or eventlog,
        smart log,
            etc.. ) hunting for disk errors ( sometimes they are
        reported ). This
            kind of problems, with programs as tested as postgres and
        rsync, tend
            to indicate controller/RAM/disk going bad ( in your case it
        could be
            caused by a single bit getting flipped in a sector for the data
            portion of the table, and not being propagated either because it
            happened after your sync of drdb or because it was synced
        from the WAL
            and not the table, or because it was read from the disk cache ).

        I agree, unfortunately I did not find any clues about corruption
        or any
        anomalies in the logs.
        I will work tonight to rebuild that table and see where I go
        from there.


    The db3 database is on a different machine from all the other
    databases you set up, correct?

Yes, they are all different vms first 3 dbs are on the same cluster but
drdb is a remote machine,

Aah, another player in the mix.

What virtualization technology are you using?

kvm
Sorry, I should add more info:
kernel 4.7
and the filesystem is xfs vs ext3/ext4

 


Thank you



        Thanks,
        ioana

            Francisco Olarte.




    --
    Adrian Klaver
    adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>




--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/12/2016 08:49 AM, Ioana Danes wrote:
>
>
> On Fri, Aug 12, 2016 at 11:44 AM, Ioana Danes <ioanadanes@gmail.com
> <mailto:ioanadanes@gmail.com>> wrote:
>
>

>     kvm
>
> Sorry I should add more info
> kernel 4.7
> and the filesystem is xfs vs ext3/ext4

Not following the above.

Do you mean the physical and virtual machines have different filesystems?


>
>
>
>
>
>             Thank you
>
>
>
>                     Thanks,
>                     ioana
>
>                         Francisco Olarte.
>
>
>
>
>                 --
>                 Adrian Klaver
>                 adrian.klaver@aklaver.com
>             <mailto:adrian.klaver@aklaver.com>
>             <mailto:adrian.klaver@aklaver.com
>             <mailto:adrian.klaver@aklaver.com>>
>
>
>
>
>         --
>         Adrian Klaver
>         adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 12:22 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/12/2016 08:49 AM, Ioana Danes wrote:


On Fri, Aug 12, 2016 at 11:44 AM, Ioana Danes <ioanadanes@gmail.com
<mailto:ioanadanes@gmail.com>> wrote:



    kvm

Sorry I should add more info
kernel 4.7
and the filesystem is xfs vs ext3/ext4

Not following the above.

Do you mean the physical and virtual machines have different filesystems?

Sorry, the data partition where the postgres folder with all the postgres files resides is on xfs, and the root partition is on ext4.






            Thank you



                    Thanks,
                    ioana

                        Francisco Olarte.




                --
                Adrian Klaver
                adrian.klaver@aklaver.com
            <mailto:adrian.klaver@aklaver.com>
            <mailto:adrian.klaver@aklaver.com
            <mailto:adrian.klaver@aklaver.com>>




        --
        Adrian Klaver
        adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>





--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/12/2016 08:30 AM, Ioana Danes wrote:
>
>

>
>
>     The db3 database is on a different machine from all the other
>     databases you set up, correct?
>
> Yes, they are all different vms first 3 dbs are on the same cluster but
> drdb is a remote machine,
>

This rattled around in my head and got less obvious with time:) Probably
because cluster has different meanings depending on whether you are
talking about Postgres or VMs. Right now I am imagining:

Physical machine 1                    |   Physical machine 2
                                       |
VM1     VM2       VM3                 |   VM1
db1     db2       db3                 |   drdb

where each VM has its own Postgres cluster

Is the above correct?


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Ioana Danes
Дата:


On Fri, Aug 12, 2016 at 2:50 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 08/12/2016 08:30 AM, Ioana Danes wrote:





    The db3 database is on a different machine from all the other
    databases you set up, correct?

Yes, they are all different vms first 3 dbs are on the same cluster but
drdb is a remote machine,


This rattled around in my head and got less obvious with time:) Probably because cluster has different meanings depending on whether you are talking about Postgres or VMs. Right now I am imagining:

Physical machine 1                    |   Physical machine 2
                                      |
VM1     VM2       VM3                 |   VM1
db1     db2       db3                 |   drdb

where each VM has its own Postgres cluster

Is the above correct?

Yes, correct. I should have been more clear.

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Corrupted Data ?

От
Adrian Klaver
Дата:
On 08/12/2016 11:51 AM, Ioana Danes wrote:
>
>
> On Fri, Aug 12, 2016 at 2:50 PM, Adrian Klaver
> <adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>> wrote:
>
>     On 08/12/2016 08:30 AM, Ioana Danes wrote:
>
>
>
>
>
>
>             The db3 database is on a different machine from all the other
>             databases you set up, correct?
>
>         Yes, they are all different vms first 3 dbs are on the same
>         cluster but
>         drdb is a remote machine,
>
>
>     This rattled around in my head and got less obvious with time:)
>     Probably because cluster has different meanings depending on whether
>     you are talking about Postgres or VMs. Right now I am imagining:
>
>     Physical machine 1                    |   Physical machine 2
>                                           |
>     VM1     VM2       VM3                 |   VM1
>     db1     db2       db3                 |   drdb
>
>     where each VM has its own Postgres cluster
>
>     Is the above correct?
>
> yes correct, I should  have been more clear,

Hmm.

I like Francisco's suggestion:

"Specially if this happens, you may have some slightly bad disks/ram/
leading to this kind of problems."

I am trying to reconcile that with all the physical machine 1 VMs sharing the
same RAM and physical disk, while the error only occurs on db3.

Is the VM hosting db3 set up differently from VMs 1 & 2?

>
>
>     --
>     Adrian Klaver
>     adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: Corrupted Data ?

От
Francisco Olarte
Дата:
Hi Adrian:

On Fri, Aug 12, 2016 at 9:01 PM, Adrian Klaver
<adrian.klaver@aklaver.com> wrote:
> "Specially if this happens, you may have some slightly bad disks/ram/
> leading to this kind of problems."
>
> Trying to reconcile that with all the physical machine 1 VMs sharing the
> same RAM and physical disk, but the error only occurring on db3.
>
> Is the VM hosting db3 setup different from the VMs 1 & 2?

They probably share the disk, bus, RAM and disk controllers, but they
surely do not share the disk SECTORS. "Weak" (nearly failing) tracks
can cause this kind of problem (although the disk CRC should catch
any odd number of bit errors, but with VMs in the mix who knows where
the messages could end up).

Francisco Olarte.