
Unable to switchover with farsync involved (12.2)

Stéphane MAURIZIO

Chef de service adjoint/Database Administrator (Oracle/Postgres) - Centre des technologies de l'information de l'état (CTIE)

Published: October 16, 2019

I think I have found a problem with my Data Guard configuration.

My primary is a RAC One Node database; my AWS standbys are single instances.

Here is my configuration:

DGMGRL> show configuration;

Configuration - cellar_dg_t

  Protection Mode: MaxAvailability
  Members:
  cellarmxt    - Primary database
    cellar1xt    - Physical standby database 
    cellar2xt    - Physical standby database 
    cellarmxtfs1 - Far sync instance 
      cellarmwt    - Physical standby database 
      cellar1wt    - Physical standby database 
      cellar2wt    - Physical standby database 

  Members Not Receiving Redo:
  cellarmxtfs2 - Far sync instance (alternate of cellarmxtfs1)
  cellarmwtfs2 - Far sync instance 
  cellarmwtfs1 - Far sync instance 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 49 seconds ago)

The far sync instances are configured for high availability: if cellarmxtfs1 goes down for any reason, cellarmxtfs2 takes over.
This works as expected, and I have the same setup on the AWS side (cellarmwtfs1/cellarmwtfs2).

Changes are shipped to all the standbys on the AWS side.

Now I want to switch over to cellarmwt on the AWS side.

First I ran validate database 'cellarmwt':

DGMGRL> validate database 'cellarmwt';

  Database Role:     Physical standby database
  Primary Database:  cellarmxt

  Ready for Switchover:  Yes
  Ready for Failover:    Yes (Primary Running)

  Managed by Clusterware:
    cellarmxt:  YES            
    cellarmwt:  YES

OK, that looks good.

I also checked the v$archived_log and v$log_history views on the primary and on the standby to confirm that they were in sync.

Then I ran the SQL command to be sure:

alter database switchover to 'cellarmwt' verify;
*
ERROR at line 1:
ORA-16467: switchover target is not synchronized

What? Not in sync?
There was nothing in the alert.log of the primary or the standby, nor in that of the far sync.

I decided to set LOG_ARCHIVE_TRACE to 10240 on the far sync, since the switchover goes through it.

edit far_sync 'cellarmxtfs1' set property 'LogArchiveTrace'=10240;

Why 10240?
Because v$archive_dest_status on the far sync was reporting an unresolvable gap.

So gap resolution is involved (2048), and so is MRP (8192): 2048 + 8192 = 10240.

I retried the command:
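
As a quick check of that arithmetic, here is a minimal Python sketch that splits a LOG_ARCHIVE_TRACE value into its power-of-two components. The names `KNOWN_LEVELS` and `decode_trace_level` are made up for this illustration, and the two labels are the ones used in this article, not a full list of Oracle's trace levels.

```python
# Labels taken from this article only; every other bit is deliberately left unnamed.
KNOWN_LEVELS = {2048: "gap resolution", 8192: "MRP"}


def decode_trace_level(value):
    """Split a LOG_ARCHIVE_TRACE value into its power-of-two components."""
    parts = []
    bit = 1
    while bit <= value:
        if value & bit:
            parts.append((bit, KNOWN_LEVELS.get(bit, "not named in this article")))
        bit <<= 1
    return parts


for bit, label in decode_trace_level(10240):
    print(bit, label)
```

Running it on 10240 lists exactly two components, 2048 and 8192, which matches the value set on the far sync.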

alter database switchover to 'cellarmwt' verify;
*
ERROR at line 1:
ORA-16467: switchover target is not synchronized

On the far sync I now have a nice rmi trace file.

And, surprise:

RMI startup request received from PID 19348 

kcv_recv_so_to_directconn: Received SWITCHOVER request(PING) to switchover target 1606120691 
kcv_recv_so_to_directconn: found direct connection, get gap status 
*** 2019-10-16 12:10:15.537972 1067 krsg.c 

krsg_ping: Performing PING on thread 1 
krsg_gap_ping: Setting LE entry as invalid 
krsg_check_curseq_state: Checking AL for sequence 19349 
krsg_check_connection: Establishing link for LOG_ARCHIVE_DEST_3 to standby cellarmwt 
SUCCESS: retrieved DB password file location:+DATA3/orapwCELLARMXT 
*** 2019-10-16 12:10:16.033641 3207 krsg.c 
krsg_gap_ping: Pinging LOG_ARCHIVE_DEST_3 at cellarmwt (ping iteration 1) 
stop log B-985533585.T-1.S-19349 
krsg_ping_by_dest: Target recovery incomplete 
*** 2019-10-16 12:10:16.079231 1067 krsg.c 

krsg_ping: Performing PING on thread 2 <--------------------- where does thread 2 come from? cellarmwt, cellar1wt and cellar2wt are single instances.

krsg_gap_ping: Setting LE entry as invalid 
krsg_check_curseq_state: Checking AL for sequence 31 
krsg_check_connection: Establishing link for LOG_ARCHIVE_DEST_3 to standby cellarmwt 
*** 2019-10-16 12:10:16.100581 3207 krsg.c 
krsg_gap_ping: Pinging LOG_ARCHIVE_DEST_3 at cellarmwt (ping iteration 1) 
stop log B-985533585.T-2.S-31 
*** 2019-10-16 12:10:16.184806 3540 krsg.c 
krsg_gap_ping: Gap ping detects gap for cellarmwt 
GAP - SCN range: 0x0000001626f9fac8 - 0x0000001626f9fac8 
DBID 1018469905 branch 985533585 
...Attempt to queue request 
krsg_ping_by_dest: Discovered a gap <------------------------------ This is the problem 

krsg_ping_by_dest: Target recovery incomplete 

I get the same traces with cellarmwt, cellar1wt and cellar2wt.
The far sync is looking for archived log (AL) sequence 31 of thread 2, although cellarmwt, cellar1wt and cellar2wt are single instances.
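
To spot this pattern in a longer trace file, something like the following Python sketch can summarize what the far sync pinged per thread. It only assumes the three message formats visible in the trace above; the function name `summarize_pings` and the `SAMPLE` excerpt (timestamps and unrelated lines dropped) are made up for this illustration.

```python
import re

# Excerpt copied from the trace above, with timestamps and unrelated lines dropped.
SAMPLE = """\
krsg_ping: Performing PING on thread 1
krsg_check_curseq_state: Checking AL for sequence 19349
krsg_ping_by_dest: Target recovery incomplete
krsg_ping: Performing PING on thread 2
krsg_check_curseq_state: Checking AL for sequence 31
krsg_gap_ping: Gap ping detects gap for cellarmwt
krsg_ping_by_dest: Discovered a gap
"""

PING = re.compile(r"krsg_ping: Performing PING on thread (\d+)")
SEQ = re.compile(r"krsg_check_curseq_state: Checking AL for sequence (\d+)")
GAP = re.compile(r"krsg_gap_ping: Gap ping detects gap for (\S+)")


def summarize_pings(trace_text):
    """Return {thread: {"sequence": int or None, "gap_for": str or None}}."""
    summary = {}
    current = None  # thread number of the ping block being read
    for line in trace_text.splitlines():
        match = PING.search(line)
        if match:
            current = int(match.group(1))
            summary[current] = {"sequence": None, "gap_for": None}
            continue
        if current is None:
            continue  # ignore anything before the first ping
        match = SEQ.search(line)
        if match:
            summary[current]["sequence"] = int(match.group(1))
            continue
        match = GAP.search(line)
        if match:
            summary[current]["gap_for"] = match.group(1)
    return summary


for thread, info in summarize_pings(SAMPLE).items():
    print(thread, info)
```

On this excerpt, the gap is attached to thread 2 (sequence 31) and not to thread 1, which is the point of the trace.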

My primary is RAC One Node:

SQL> select THREAD#,STATUS,ENABLED,INSTANCE,SEQUENCE#,LAST_REDO_SEQUENCE#,LAST_REDO_TIME
  2  from gv$thread
  3  ;

   THREAD# STATUS ENABLED  INSTANCE     SEQUENCE# LAST_REDO_SEQUENCE# LAST_REDO_TIME
---------- ------ -------- ----------- ---------- ------------------- -------------------
         1 OPEN   PUBLIC   CELLARMXT_1      19356               19356 16/10/2019 14:52:21
         2 CLOSED PUBLIC   CELLARMXT_2         31                  30 22/11/2018 11:00:54


On the standbys:
	 
SQL> select THREAD#,STATUS,ENABLED,INSTANCE,SEQUENCE#,LAST_REDO_SEQUENCE#,LAST_REDO_TIME
  2  from gv$thread
  3  ;

   THREAD# STATUS ENABLED  INSTANCE     SEQUENCE# LAST_REDO_SEQUENCE# LAST_REDO_TIME
---------- ------ -------- ----------- ---------- ------------------- -------------------
         1 OPEN   PUBLIC   CELLARMWT        19356               19356 14/08/2019 10:43:08
         2 CLOSED PUBLIC   CELLARMXT_2         31                  31 22/11/2018 11:00:54

Thread 2 is closed.

But the far sync checks every thread, even a closed one.

That is most likely the problem.
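
The cross-check behind that conclusion can be written down as a small Python sketch: take the thread states from the primary's gv$thread output above and the threads the far sync pinged in the trace, and list the ones that are CLOSED yet still pinged. The variable and function names are made up for this illustration, and the values are copied by hand from the outputs above.

```python
# (thread#, status, sequence#) copied from the primary's gv$thread output above.
PRIMARY_THREADS = [(1, "OPEN", 19356), (2, "CLOSED", 31)]

# thread# -> sequence the far sync checked, copied from the trace above.
PINGED = {1: 19349, 2: 31}


def closed_threads_pinged(threads, pinged):
    """Return the thread numbers that are CLOSED yet appear in the ping list."""
    return [
        thread
        for thread, status, _sequence in threads
        if status == "CLOSED" and thread in pinged
    ]


print(closed_threads_pinged(PRIMARY_THREADS, PINGED))
```

Here only thread 2 comes back, the thread left over from the second RAC One Node instance.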

Without the far sync in the loop (removed or disabled), I am able to switch over.

I have opened an SR for this issue.
