Обсуждение: Error after Streaming Replication

Поиск
Список
Период
Сортировка

Error after Streaming Replication

От
Walters Che Ndoh
Дата:
Hello All!

Please i need help with the below error...if someone has come across it before. 

For this one  ERROR:  requested WAL segment 000000010000089700000019 has already been removed i have come across it before and solved it by increasing to wal_keep_segments to reasonably high value and it worked for me. But this time i did same but still see the error.. so wondering what i am doing wrong.

My database is running on postgres 12 and i am wondering why i am seeing  ERROR:  could not open directory "pg_xlog/archive_status": No such file or directory and ERROR:  function pg_last_xlog_receive_location() does not exist at character 18

other infos:
database size = 213 gb
pg_wal directory size = 200gb plus
wal_keep_segments = increased from 512 to 2000. but didn't resolve error
Wal_level = replica


2021-02-18 07:01:37 MST [38376]: [2-1] LOG:  connection authorized: user=postgres database=postgres application_name=psql
2021-02-18 07:01:37 MST [38376]: [3-1] ERROR:  could not open directory "pg_xlog/archive_status": No such file or directory
2021-02-18 07:01:37 MST [38376]: [4-1] STATEMENT:  BEGIN;SET statement_timeout=30000;COMMIT;SELECT count(*) AS count FROM pg_ls_dir('pg_xlog/archive_status') WHERE pg_ls_dir ~ E'^[0-9A-F]{24}.ready$'

2021-02-18 07:03:21 MST [39035]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 07:03:21 MST [39035]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 07:03:21 MST [39035]: [4-1] LOG:  disconnection: session time: 0:00:00.289 user=replicator database= host=10.112.30.12 port=58992
2021-02-18 07:03:28 MST [39131]: [1-1] LOG:  connection received: host=[local]
2021-02-18 07:03:28 MST [39131]: [2-1] LOG:  connection authorized: user=postgres database=syncmanager
2021-02-18 07:03:28 MST [39131]: [3-1] ERROR:  function pg_last_xlog_receive_location() does not exist at character 18
2021-02-18 07:03:28 MST [39131]: [4-1] HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
2021-02-18 07:03:28 MST [39131]: [5-1] STATEMENT:  SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

2021-02-18 06:58:46 MST [37849]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:46 MST [37849]: [4-1] LOG:  disconnection: session time: 0:00:00.276 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:58:51 MST [37861]: [1-1] LOG:  connection received: host=10.112.30.4 port=58900
2021-02-18 06:58:51 MST [37861]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:58:51 MST [37861]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:51 MST [37861]: [4-1] LOG:  disconnection: session time: 0:00:00.244 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:58:56 MST [37873]: [1-1] LOG:  connection received: host=10.112.30.12 port=58962
2021-02-18 06:58:56 MST [37873]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:58:56 MST [37873]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:56 MST [37873]: [4-1] LOG:  disconnection: session time: 0:00:00.227 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:59:01 MST [37886]: [1-1] LOG:  connection received: host=10.112.30.12 port=58900
2021-02-18 06:59:01 MST [37886]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:59:01 MST [37886]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed

Any help will be appreciated

Walters
--

Re: Error after Streaming Replication

От
Walters Che Ndoh
Дата:
Just to add...all these errors are appearing on the replica

On Thu, Feb 18, 2021 at 7:32 AM Walters Che Ndoh <chendohw@gmail.com> wrote:
Hello All!

Please i need help with the below error...if someone has come across it before. 

For this one  ERROR:  requested WAL segment 000000010000089700000019 has already been removed i have come across it before and solved it by increasing to wal_keep_segments to reasonably high value and it worked for me. But this time i did same but still see the error.. so wondering what i am doing wrong.

My database is running on postgres 12 and i am wondering why i am seeing  ERROR:  could not open directory "pg_xlog/archive_status": No such file or directory and ERROR:  function pg_last_xlog_receive_location() does not exist at character 18

other infos:
database size = 213 gb
pg_wal directory size = 200gb plus
wal_keep_segments = increased from 512 to 2000. but didn't resolve error
Wal_level = replica


2021-02-18 07:01:37 MST [38376]: [2-1] LOG:  connection authorized: user=postgres database=postgres application_name=psql
2021-02-18 07:01:37 MST [38376]: [3-1] ERROR:  could not open directory "pg_xlog/archive_status": No such file or directory
2021-02-18 07:01:37 MST [38376]: [4-1] STATEMENT:  BEGIN;SET statement_timeout=30000;COMMIT;SELECT count(*) AS count FROM pg_ls_dir('pg_xlog/archive_status') WHERE pg_ls_dir ~ E'^[0-9A-F]{24}.ready$'

2021-02-18 07:03:21 MST [39035]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 07:03:21 MST [39035]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 07:03:21 MST [39035]: [4-1] LOG:  disconnection: session time: 0:00:00.289 user=replicator database= host=10.112.30.12 port=58992
2021-02-18 07:03:28 MST [39131]: [1-1] LOG:  connection received: host=[local]
2021-02-18 07:03:28 MST [39131]: [2-1] LOG:  connection authorized: user=postgres database=syncmanager
2021-02-18 07:03:28 MST [39131]: [3-1] ERROR:  function pg_last_xlog_receive_location() does not exist at character 18
2021-02-18 07:03:28 MST [39131]: [4-1] HINT:  No function matches the given name and argument types. You might need to add explicit type casts.
2021-02-18 07:03:28 MST [39131]: [5-1] STATEMENT:  SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

2021-02-18 06:58:46 MST [37849]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:46 MST [37849]: [4-1] LOG:  disconnection: session time: 0:00:00.276 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:58:51 MST [37861]: [1-1] LOG:  connection received: host=10.112.30.4 port=58900
2021-02-18 06:58:51 MST [37861]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:58:51 MST [37861]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:51 MST [37861]: [4-1] LOG:  disconnection: session time: 0:00:00.244 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:58:56 MST [37873]: [1-1] LOG:  connection received: host=10.112.30.12 port=58962
2021-02-18 06:58:56 MST [37873]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:58:56 MST [37873]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed
2021-02-18 06:58:56 MST [37873]: [4-1] LOG:  disconnection: session time: 0:00:00.227 user=replicator database= host=10.112.30.12 port=58900
2021-02-18 06:59:01 MST [37886]: [1-1] LOG:  connection received: host=10.112.30.12 port=58900
2021-02-18 06:59:01 MST [37886]: [2-1] LOG:  replication connection authorized: user=replicator application_name=data
2021-02-18 06:59:01 MST [37886]: [3-1] ERROR:  requested WAL segment 000000010000089700000019 has already been removed

Any help will be appreciated

Walters
--

Re: Error after Streaming Replication

От
Ian Lawrence Barwick
Дата:
2021年2月18日(木) 23:32 Walters Che Ndoh <chendohw@gmail.com>:
>
> Hello All!
>
> Please i need help with the below error...if someone has come across it before.
>
> For this one  ERROR:  requested WAL segment 000000010000089700000019 has already been removed i have come across it
beforeand solved it by increasing to wal_keep_segments to reasonably high value and it worked for me. But this time i
didsame but still see the error.. so wondering what i am doing wrong. 
>
> My database is running on postgres 12 and i am wondering why i am seeing  ERROR:  could not open directory
"pg_xlog/archive_status":No such file or directory and ERROR:  function pg_last_xlog_receive_location() does not exist
atcharacter 18 
>
> other infos:
> database size = 213 gb
> pg_wal directory size = 200gb plus
> wal_keep_segments = increased from 512 to 2000. but didn't resolve error
> Wal_level = replica

It's always risky to rely on "wal_keep_segments" as you'll always risk
either setting
a value which is not high enough, or a so high that you risk
exhausting disk space.

Better options would be:
- use a replication slot (which will guarantee sufficient WAL is retained,
  without you needing to guess how much; the downside is you'll need to
  monitor for inactive replication slots, as these will let WAL pile up
  indefinitely)
- some sort of WAL archival solution (e.g. pg_barman) which will enable the
  standby to retrieve WAL from there if it is not available on the primary

>
> 2021-02-18 07:01:37 MST [38376]: [2-1] LOG:  connection authorized: user=postgres database=postgres
application_name=psql
> 2021-02-18 07:01:37 MST [38376]: [3-1] ERROR:  could not open directory "pg_xlog/archive_status": No such file or
directory
> 2021-02-18 07:01:37 MST [38376]: [4-1] STATEMENT:  BEGIN;SET statement_timeout=30000;COMMIT;SELECT count(*) AS count
FROMpg_ls_dir('pg_xlog/archive_status') WHERE pg_ls_dir ~ E'^[0-9A-F]{24}.ready$' 

(...)
> 2021-02-18 07:03:28 MST [39131]: [3-1] ERROR:  function pg_last_xlog_receive_location() does not exist at character
18
> 2021-02-18 07:03:28 MST [39131]: [4-1] HINT:  No function matches the given name and argument types. You might need
toadd explicit type casts. 
> 2021-02-18 07:03:28 MST [39131]: [5-1] STATEMENT:  SELECT CASE WHEN pg_last_xlog_receive_location() =
pg_last_xlog_replay_location()THEN 0 ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS
log_delay;

This looks like some kind of monitoring script which was designed for
PostgreSQL 9.6 or earlier.


Regards

Ian Barwick



--
EnterpriseDB: https://www.enterprisedb.com



Re: Error after Streaming Replication

От
Scott Ribe
Дата:
> On Feb 18, 2021, at 8:48 AM, Ian Lawrence Barwick <barwick@gmail.com> wrote:
>
> the downside is you'll need to
>  monitor for inactive replication slots, as these will let WAL pile up
>  indefinitely

Perhaps worth noting: PG 13 has a way to limit this. (OP is running 12, so not directly applicable for now. And you
shouldmonitor regardless...) 


Re: Error after Streaming Replication

От
Holger Jakobs
Дата:
Everything (directories, functions) named ...xlog has been renamed to
...wal as of PostgreSQL 10.

--
Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012



Вложения