Configuring TCPS for database client connections on Flashgrid-enabled clusters

In this document we assume the following port assignments (all four ports must be different):

  • TCP SCAN Listener: 1521
  • TCP Local Listener: 1522
  • TCPS SCAN Listener: 1523
  • TCPS Local Listener: 1524

1. Add TCPS port to SCAN Proxy configuration

Do the following on all database nodes.

1.1 Modify /etc/flashgrid-scan.cfg

#scan_port: 1521
scan_port_list: [1521, 1523]

1.2 Restart SCAN Proxy service

# systemctl restart flashgrid-scan-proxy.service

2. Add iptables forwarding rule for Local Listener TCPS port

Do the following on all database nodes.

2.1 Remove immutable flag from /etc/sysconfig/iptables

# chattr -i /etc/sysconfig/iptables

2.2 In /etc/sysconfig/iptables add the following forwarding rule under similar rule for TCP port (replace X with address corresponding to the node):

-A PREROUTING -i eth0 -p tcp -m tcp --dport 1524 -j DNAT --to-destination 192.168.1.X

2.3 Add immutable flag to /etc/sysconfig/iptables

# chattr +i /etc/sysconfig/iptables

2.4 Restart iptables service

# systemctl restart iptables.service

3. Add TCPS port to Local Listener configuration

On the first node:

[grid@rac1 ~]$ srvctl modify listener -p "TCP:1522/TCPS:1524";

On each database node, modify LISTENER parameter in listener.ora under GI home (replace rac1.example.com with corresponding host name):

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac1.example.com)(PORT = 1522))
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER))
      (ADDRESS = (PROTOCOL = TCPS)(HOST =rac1.example.com)(PORT = 1524))
    )
  )

SECURE_REGISTER_LISTENER_SCAN1 = (IPC,TCPS)
SECURE_REGISTER_LISTENER_SCAN2 = (IPC,TCPS)
SECURE_REGISTER_LISTENER_SCAN3 = (IPC,TCPS)
SECURE_REGISTER_LISTENER = (IPC,TCPS)

From the first node:

[grid@rac1 ~]$ srvctl stop listener
[grid@rac1 ~]$ srvctl start listener

[grid@rac1 ~]$ srvctl config listener
Name: LISTENER
Type: Database Listener
Network: 1, Owner: grid
Home: <CRS home>
End points: TCP:1522/TCPS:1524
Listener is enabled.
Listener is individually enabled on nodes: 
Listener is individually disabled on nodes: 

[grid@rac1 ~]$ lsnrctl status
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rac1.example.com)(PORT=1522)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=rac1.example.com)(PORT=1524)))

From the second node:

[grid@rac2 ~]$ lsnrctl status
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=rac2.example.com)(PORT=1522)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=rac2.example.com)(PORT=1524)))

4. Add TCPS port to SCAN Listener configuration

On the first database node:

[grid@rac1 ~]$ srvctl stop scan_listener
[grid@rac1 ~]$ srvctl stop scan
[grid@rac1 ~]$ srvctl modify scan_listener -p TCP:1521/TCPS:1523
[grid@rac1 ~]$ srvctl start scan
[grid@rac1 ~]$ srvctl start scan_listener

[grid@rac1 ~]$ srvctl config scan_listener
SCAN Listeners for network 1:
Registration invited nodes: 
Registration invited subnets: 
Endpoints: TCP:1521/TCPS:1523
SCAN Listener LISTENER_SCAN1 exists
SCAN Listener is enabled.
SCAN Listener LISTENER_SCAN2 exists
SCAN Listener is enabled.
SCAN Listener LISTENER_SCAN3 exists
SCAN Listener is enabled.

[grid@rac1 ~]$ lsnrctl status listener_scan3
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN3)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=192.168.1.23)(PORT=1523)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.23)(PORT=1521)))
The listener supports no services
The command completed successfully

[grid@rac1 ~]$ crsctl stat res -p |grep ENDPOINTS
ENDPOINTS=TCP:1525
ENDPOINTS=TCP:1522 TCPS:1524
ENDPOINTS=TCP:1521 TCPS:1523
ENDPOINTS=TCP:1521 TCPS:1523
ENDPOINTS=TCP:1521 TCPS:1523

5. Create SSL Certificates and wallets for DB nodes and client

5.1 Create a self-signed CA (You can use your own CA and safely skip this step)

On the first database node:

[oracle@rac1 ~]$ mkdir -p /u01/app/oracle/CA
[oracle@rac1 ~]$ export CA_HOME=/u01/app/oracle/CA
[oracle@rac1 ~]$ orapki wallet create -wallet  $CA_HOME  -auto_login -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet remove -trusted_cert_all -wallet $CA_HOME -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet add -wallet $CA_HOME -self_signed -dn "CN=TEST-CA" -keysize 1024 -validity 3650 -sign_alg sha256 -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet export -wallet $CA_HOME -dn "CN=TEST-CA" -cert /u01/app/oracle/CA/testCA.cer -pwd Oracle123

[oracle@rac1 ~]$ orapki wallet display -wallet $CA_HOME -summary
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=TEST-CA
Trusted Certificates:
Subject:        CN=TEST-CA

5.2 Create Cluster Wallet and Certificate

On the first database node:

[oracle@rac1 ~]$ mkdir -p /u01/app/oracle/wallet
[oracle@rac1 ~]$ export W_HOME=/u01/app/oracle/wallet
[oracle@rac1 ~]$ orapki wallet create -wallet $W_HOME -pwd Oracle123 -auto_login -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -trusted_cert -cert /u01/app/oracle/CA/testCA.cer -pwd Oracle123

[oracle@rac1 ~]$ orapki wallet display -wallet $W_HOME -summary
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Trusted Certificates:
Subject:        CN=TEST-CA

Find out what is your scan name (from oracle or root user):

[oracle@rac1 ~]$ srvctl config scan|grep "SCAN name"
SCAN name: marihcheck-scan.example.com, Network: 1

Instead of marihcheck-scan indicate scan name returned from the previous command:

[oracle@rac1 ~]$ export W_HOME=/u01/app/oracle/wallet
[oracle@rac1 ~]$ export CA_HOME=/u01/app/oracle/CA
[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -dn "CN=marihcheck-scan" -keysize 1024 -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet export -wallet $W_HOME -dn "CN=marihcheck-scan" -request $W_HOME/marihcheck.req -pwd Oracle123
[oracle@rac1 ~]$ orapki cert create -wallet $CA_HOME -request $W_HOME/marihcheck.req -cert $W_HOME/marihcheck.cer -validity 3650 -sign_alg sha256 -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -user_cert -cert $W_HOME/marihcheck.cer -pwd Oracle123

[oracle@rac1 ~]$ orapki wallet display -wallet $W_HOME -summary
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=marihcheck-scan
Trusted Certificates:
Subject:        CN=TEST-CA

5.3 Create Client Wallet and Certificate

From the first database node:

Instead of clientvm indicate hostname of the client:

[oracle@rac1 ~]$ export W_HOME=/u01/app/oracle/wallet/client
[oracle@rac1 ~]$ orapki wallet create -wallet $W_HOME -pwd Oracle123 -auto_login -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -trusted_cert -cert /u01/app/oracle/CA/testCA.cer -pwd Oracle123

[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -dn "CN=clientvm" -keysize 1024 -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet export -wallet $W_HOME -dn "CN=clientvm" -request $W_HOME/clientvm.req -pwd Oracle123
[oracle@rac1 ~]$ orapki cert create -wallet $CA_HOME -request $W_HOME/clientvm.req -cert $W_HOME/clientvm.cer -validity 3650 -sign_alg sha256 -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet add -wallet $W_HOME -user_cert -cert $W_HOME/clientvm.cer -pwd Oracle123
[oracle@rac1 ~]$ orapki wallet display -wallet $W_HOME -summary
Oracle PKI Tool Release 19.0.0.0.0 - Production
Version 19.4.0.0.0
Copyright (c) 2004, 2019, Oracle and/or its affiliates. All rights reserved.

Requested Certificates:
User Certificates:
Subject:        CN=clientvm
Trusted Certificates:
Subject:        CN=TEST-CA

Change permission on server and client cwallet.sso file:

[oracle@rac1 wallet]$ chmod 640 /u01/app/oracle/wallet/cwallet.sso
[oracle@rac1 wallet]$ chmod 640 /u01/app/oracle/wallet/client/cwallet.sso

Create wallet location on the clientvm and copy generated client wallet from rac1 to the client:

[root@clientvm ~]# mkdir /usr/lib/oracle/19.5/client64/wallet
[root@clientvm ~]# cd /usr/lib/oracle/19.5/client64/wallet
[root@clientvm ~]# scp oracle@rac1:/u01/app/oracle/wallet/client/* .

Delete client folder from rac1:

[oracle@rac1 ~]$ rm -rf /u01/app/oracle/wallet/client/

Copy generated server wallet from rac1 to rac2:

[oracle@rac1 ~]$ scp -r /u01/app/oracle/wallet oracle@rac2:/u01/app/oracle/
cwallet.sso              100% 2445   638.3KB/s   00:00
cwallet.sso.lck          100%    0     0.0KB/s   00:00
ewallet.p12              100% 2400   540.6KB/s   00:00
ewallet.p12.lck          100%    0     0.0KB/s   00:00

6. Configure Oracle network files

6.1 Add the following lines to the server side sqlnet.ora file in RDBMS and GI homes accordingly (on all database nodes)

[oracle@rac1 ~]$ cat /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/sqlnet.ora
WALLET_LOCATION =
   (SOURCE =
     (METHOD = FILE)
     (METHOD_DATA =
       (DIRECTORY = /u01/app/oracle/wallet)
     )
   )

SSL_VERSION = 0
SQLNET.AUTHENTICATION_SERVICES = (TCPS,BEQ)
SSL_CLIENT_AUTHENTICATION = FALSE
[grid@rac1 ~]$ cat /u01/app/19.3.0/grid/network/admin/sqlnet.ora
WALLET_LOCATION =
   (SOURCE =
     (METHOD = FILE)
     (METHOD_DATA =
       (DIRECTORY = /u01/app/oracle/wallet)
     )
   )

SSL_CLIENT_AUTHENTICATION = FALSE

6.2 Add the following entries to the client side sqlnet.ora file:

[root@clientvm admin]# cat sqlnet.ora
WALLET_LOCATION =
   (SOURCE =
     (METHOD = FILE)
     (METHOD_DATA =
       (DIRECTORY = /usr/lib/oracle/19.5/client64/wallet)
     )
   )

6.3 Add the following lines to the listener.ora in GI home (/u01/app/19.3.0/grid/network/admin/listener.ora) on all database nodes

SSL_CLIENT_AUTHENTICATION = FALSE

WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA =
      (DIRECTORY = /u01/app/oracle/wallet)
    )
  )

6.4 Update NODEFQDN entry to TCPS/1524 in tnsnames.ora in RDBMS home on all database nodes

[oracle@rac1 ~]$ cat /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/tnsnames.ora

DONOTDELETE,NODEFQDN =
  (ADDRESS = (PROTOCOL = TCPS)(Host = rac1.example.com)(Port = 1524))
[oracle@rac2 ~]$ cat /u01/app/oracle/product/19.3.0/dbhome_1/network/admin/tnsnames.ora

DONOTDELETE,NODEFQDN =
  (ADDRESS = (PROTOCOL = TCPS)(Host = rac2.example.com)(Port = 1524))

6.5 Reset local_listener parameter to NODEFQND

[oracle@rac1 ~]$ export ORACLE_SID=orclcdb1
[oracle@rac1 ~]$ sqlplus / as sysdba

SQL> alter system set local_listener='NODEFQDN';

6.6 Restart listeners:

[root@rac1 ~]# srvctl stop listener; srvctl start listener
[root@rac1 ~]# srvctl stop scan_listener; srvctl start scan_listener

6.7 Add the following entries to the client side tnsnames.ora

[root@clientvm admin]# cat tnsnames.ora
TEST_SSL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCPS)(HOST = marihcheck-scan.example.com)(PORT = 1523))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orclcdb)
    )
  )

TEST_SSL1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCPS)(HOST = rac1.example.com)(PORT = 1524))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orclcdb)
    )
  )

TEST_SSL2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCPS)(HOST = rac2.example.com)(PORT = 1524))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orclcdb)
    )
  )

7. Test connections

[root@clientvm admin]# export ORACLE_HOME=/usr/lib/oracle/19.5/client64
[root@clientvm admin]# export PATH=$ORACLE_HOME/bin:$PATH
[root@clientvm admin]# export TNS_ADMIN=$ORACLE_HOME/network/admin

[root@clientvm admin]# sqlplus system/oracle@TEST_SSL

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Mar 19 11:52:33 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Sat Mar 14 2020 19:54:53 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL> SELECT SYS_CONTEXT('USERENV', 'network_protocol') FROM DUAL;

SYS_CONTEXT('USERENV','NETWORK_PROTOCOL')
--------------------------------------------------------------------------------
tcps
[root@clientvm admin]# sqlplus system/oracle@TEST_SSL1

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Mar 19 11:53:23 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Thu Mar 19 2020 11:53:20 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL>
[root@clientvm admin]# sqlplus system/oracle@TEST_SSL2

SQL*Plus: Release 19.0.0.0.0 - Production on Thu Mar 19 11:53:23 2020
Version 19.5.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Thu Mar 19 2020 11:53:20 +00:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.6.0.0.0

SQL>

Note: The same steps are described https://kb.flashgrid.io/configuring-tcps-for-client-connections#6-configure-oracle-network-files, which is also written by me.

Creating an Oracle RAC on Azure cloud with FlashGrid SkyCluster

agctl start goldengate hangs and fails

In this post I will discuss one of the reasons why golden gate startup may hang and fail on both nodes:

Problem:

[root@primrac1 tmp]# /u01/app/grid/xag/bin/agctl start goldengate gg_replicate
^[[ACRS-2672: Attempting to start 'xag.gg_replicate.goldengate' on 'primrac2'
CRS-2674: Start of 'xag.gg_replicate.goldengate' on 'primrac2' failed
CRS-2679: Attempting to clean 'xag.gg_replicate.goldengate' on 'primrac2'
CRS-2681: Clean of 'xag.gg_replicate.goldengate' on 'primrac2' succeeded
CRS-2563: Attempt to start resource 'xag.gg_replicate.goldengate' on 'primrac2' has failed. Will re-retry on 'primrac1' now.
CRS-2672: Attempting to start 'xag.gg_replicate.goldengate' on 'primrac1'
CRS-2674: Start of 'xag.gg_replicate.goldengate' on 'primrac1' failed
CRS-2679: Attempting to clean 'xag.gg_replicate.goldengate' on 'primrac1'
CRS-2681: Clean of 'xag.gg_replicate.goldengate' on 'primrac1' succeeded
CRS-2632: There are no more servers to try to place resource 'xag.gg_replicate.goldengate' on that would satisfy its placement policy
CRS-4000: Command Start failed, or completed with errors.

Troubleshooting:

Check crsd_scriptagent_oracle.trc trace file on each database node. For my environment the full path of this file is /u01/app/grid/diag/crs/primrac2/crs/trace/crsd_scriptagent_oracle.trc

I found that the following error was repeating in the trace file:

2020-03-05 23:36:30.467 : CLSDADR:3560113920: ERRORdynamic component [R2530613] mapped to [CLSDYNAM]

On the metalink found only this PRCR-1079 : Failed to start resource ora.oc4j, the problem is not the same, but glanced a sentence containing config file.. so here I realized it could be related to the config file. And I was correct.

Solution:

Delete existing GG resource:

# /u01/app/grid/xag/bin/agctl remove goldengate gg_replicate

Readd using correct config file location:
--config_home /GG_HOME/sm/etc/conf --var_home /GG_HOME/sm/var

# /u01/app/grid/xag/bin/agctl add goldengate gg_replicate --gg_home /GG_HOME/ma --service_manager --config_home /GG_HOME/sm/etc/conf --var_home /GG_HOME/sm/var --port 9001 --adminuser oggadmin --user oracle --group oinstall --filesystems ora.ggdg.acfsgg.acfs --db_services ora.primorcl.orclservice.svc --use_local_services

Start Golden Gate, relocate and check:

[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl start goldengate gg_replicate
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl status goldengate gg_replicate
Goldengate instance 'gg_replicate' is running on primrac2
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl relocate goldengate gg_replicate
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl status goldengate gg_replicate
Goldengate instance 'gg_replicate' is running on primrac1

Step by step instructions for configuring Bastion host in Azure

NFS Disk Configuration for Quorum

Before starting make sure that:

1. NFS server is accessible from DB nodes
2. nfs-utils is installed on all client and DB nodes
3. ASM is running on all nodes

NFS disk configuration steps:

1. On NFS server node (e.g. nfsserver)

# systemctl disable firewalld
# systemctl stop firewalld
# mkdir /NFS_DISKS

--Create quorum disk in nfs share:
# dd if=/dev/zero of=/NFS_DISKS/nfsdisk1 bs=1M count=1024

# chmod -R 755 /NFS_DISKS
# chown -R grid:oinstall /NFS_DISKS

# vim /etc/exports
/NFS_DISKS *(rw,async)

# systemctl enable nfs-server
# systemctl start nfs-server

2. On all DB nodes (e.g. rac1 rac2)

# mkdir /NFS_DISKS 
# chown -R grid:oinstall /NFS_DISKS
# vim /etc/fstab

nfsserver:/NFS_DISKS /NFS_DISKS nfs rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,noac,vers=3,timeo=600 0 0 0

# mount -a

--Update disk discovery string from grid user:
$ asmcmd dsset '/NFS_DISKS/*','/dev/flashgrid/*'

If this is Flashgrid-enabled cluster, then you can create diskgroup which contains NFS disk as quorum:

[grid@rac1 ~]$ flashgrid-dg create --name TEST --normal --asm-compat 19.0.0.0 --db-compat 19.0.0.0 --disks /dev/flashgrid/rac1.lun4 /dev/flashgrid/rac2.lun4 --quorum-disks /NFS_DISKS/nfsdisk1

FlashGrid 19.6.48.60391
~~~~
Querying nodes: rac1, rac2, racq …
Creating group…
Completing this operation may take long time. Please wait…
CREATE DISKGROUP "TEST" NORMAL REDUNDANCY
FAILGROUP "rac1" DISK '/dev/flashgrid/rac1.lun4' NAME "rac1$LUN4" SIZE 20480M
FAILGROUP "rac2" DISK '/dev/flashgrid/rac2.lun4' NAME "rac2$LUN4" SIZE 20480M
QUORUM FAILGROUP "QUORUM0" DISK '/NFS_DISKS/nfsdisk1' NAME "TEST_Q0"
ATTRIBUTE 'au_size' = '4M', 'compatible.asm' = '19.0.0.0', 'compatible.rdbms' = '19.0.0.0';
ALTER SYSTEM SET asm_preferred_read_failure_groups='DATA.rac1','FRA.rac1','GGDG.rac1','GRID.rac1','TEST.rac1' SID='+ASM1';
ALTER SYSTEM SET asm_preferred_read_failure_groups='DATA.rac2','FRA.rac2','GGDG.rac2','GRID.rac2','TEST.rac2' SID='+ASM2';
Success

Configure Bastion host for Azure VM

What is Azure Bastion?

The Azure Bastion is PaaS service that you create inside your virtual network. It provides secure RDP and SSH connectivity to all of the VMs in the virtual network in which it is provisioned so that your target virtual machines do not need a public IP address.

Configuration Steps:

1. Connect to the Azure portal and choose the VM to connect to.

2. From the left panel, choose Bastion under the Operations section, or click Connect from the Overview section.

3. To associate a virtual network with a Bastion, it must contain a subnet with the name AzureBastionSubnet. On the following page, click Manage subnet configuration

4. Add subnet named AzureBastionSubnet

5. After configuring subnet, click link rac1-vm – Bastion in the upper left corner. And then click Create button.

6. Connect to the VM via Azure Bastion

If you have ssh key based authentication, enter correct username and choose the right private key file.

Please note, that the private key file should be in PEM format. If you get the following error message, it means your key file is not in the correct format:

 If so, convert it using ssh-keygen (for windows PuTTYgen) :

$ ssh-keygen -p -m PEM -f my_private_key

Choose the key file again and click Connect.

opatch apply failed: /u01/app/oraInventory/ContentsXML/oui-patch.xml (Permission denied)

Problem:

In the two-node cluster. I applied patch 29342099 on the 1st node successfully. But the patch failed on the 2nd node:

[oracle@stbyrac2 29342099]$ $ORACLE_HOME/OPatch/opatch apply
...
Applying interim patch '29342099' to OH '/u01/app/oracle/product/19.3.0/dbhome_1'
Patching component oracle.rdbms, 19.0.0.0.0…
Patching component oracle.rdbms.rman, 19.0.0.0.0…
Patching component oracle.rdbms.rsf, 19.0.0.0.0…
ApplySession failed in system modification phase… 'ApplySession::apply failed: java.io.IOException: oracle.sysman.oui.patch.PatchException: 
java.io.FileNotFoundException: /u01/app/oraInventory/ContentsXML/oui-patch.xml (Permission denied)'

The patch failed in a bad phase, after failure while I was trying to check inventory on that node I was getting:

[oracle@stbyrac2 ~]$ $ORACLE_HOME/OPatch/opatch lsinventory -detail
...
Inventory load failed… OPatch cannot load inventory for the given Oracle Home.
LsInventorySession failed: Unable to create patchObject
Possible causes are:
ORACLE_HOME/inventory/oneoffs/29342099 is corrupted.
java.lang.RuntimeException: No Patch exists,Please check.

Inventory knew about patch 29342099 but $ORACLE_HOME/inventory/oneoffs/29342099 did not exist.

I decided to copy folder from 1st node (where patch was successful):

[oracle@stbyrac1 ~]$ scp -r $ORACLE_HOME/inventory/oneoffs/29342099 stbyrac2:$ORACLE_HOME/inventory/oneoffs/

After that listing inventory was successful. And patch seemed to be part of the inventory. I tried to rollback it, which failed with the same permission error `/u01/app/oraInventory/ContentsXML/oui-patch.xml (Permission denied)`

Solution:

Check permissions on oui-patch.xml file on both nodes:

[oracle@primrac1 ~]$ ls -la /u01/app/oraInventory/ContentsXML/oui-patch.xml
-rw-rw---- 1 grid oinstall 174 Oct 9 20:46 /u01/app/oraInventory/ContentsXML/oui-patch.xml
[oracle@primrac1 ~]$ ssh primrac2 ls -la /u01/app/oraInventory/ContentsXML/oui-patch.xml
-rw-r--r-- 1 grid oinstall 174 Sep 29 19:38 /u01/app/oraInventory/ContentsXML/oui-patch.xml

They are different. Modify permission of that file to 660:

[root@stbyrac2 ContentsXML]# chmod 660 oui-patch.xml 

Rollback and reapply patch:

[oracle@stbyrac2 29342099]$ $ORACLE_HOME/OPatch/opatch apply
...
Applying interim patch '29342099' to OH '/u01/app/oracle/product/19.3.0/dbhome_1'
Patching component oracle.rdbms, 19.0.0.0.0…
Patching component oracle.rdbms.rman, 19.0.0.0.0…
Patching component oracle.rdbms.rsf, 19.0.0.0.0…
Patch 29342099 successfully applied.
Log file location: /u01/app/oracle/product/19.3.0/dbhome_1/cfgtoollogs/opatch/opatch2020-01-27_13-55-44PM_1.log
OPatch succeeded.

Apache Derby: “ERROR 25502: An SQL data change is not permitted for a read-only connection, user or database.”

Problem:

During testing Oracle ZDM (Zero Downtime Migration), I’ve stopped ZDM host ungracefully. After that zdmservice was not able to start:

[zdmuser@primracq ~]$ /u01/app/zdmhome/bin/zdmservice start
...
CRS_ERROR:TCC-0004: The container was not able to start.
CRS_ERROR:One or more listeners failed to start. Full details will be found in the appropriate container log fileContext [/rhp] startup failed due to previous errors
...
Start operation could not start zdmservice.
zdmservice start failed…

Troubleshooting:

Check logs under [/rhp], in my case /u01/app/zdmbase/crsdata/primracq/rhp/logs/

[zdmuser@primracq ~]$ ll /u01/app/zdmbase/crsdata/primracq/rhp/logs/
 total 3748
-rw-r--r-- 1 zdmuser zdm   277774 Dec 17 19:19 catalina.2019-12-17.log
-rw-r--r-- 1 zdmuser zdm   186117 Dec 18 18:12 catalina.2019-12-18.log
-rw-r--r-- 1 root    root   24133 Dec 19 19:12 catalina.2019-12-19.log
-rw-r--r-- 1 root    root  541694 Dec 22 20:31 catalina.2019-12-22.log
-rw-r--r-- 1 zdmuser zdm    35666 Dec 23 12:40 catalina.2019-12-23.log
-rw-r--r-- 1 zdmuser zdm  2407571 Dec 23 12:40 catalina.out
-rw-r--r-- 1 zdmuser zdm    12541 Dec 23 12:40 derby.log
-rw-r--r-- 1 zdmuser zdm   273095 Dec 23 12:40 jwc_checker_stdout_err_0.log
-rw-r--r-- 1 zdmuser zdm      155 Dec 23 12:40 JWCStartEvent.log
-rw-r--r-- 1 zdmuser zdm     5083 Dec 17 19:19 localhost.2019-12-17.log
-rw-r--r-- 1 zdmuser zdm     5083 Dec 18 18:12 localhost.2019-12-18.log
-rw-r--r-- 1 root    root    5083 Dec 19 19:12 localhost.2019-12-19.log
-rw-r--r-- 1 root    root    5083 Dec 22 20:31 localhost.2019-12-22.log
-rw-r--r-- 1 zdmuser zdm    18708 Dec 23 12:40 localhost.2019-12-23.log

catalina.out showed the following error:

Internal Exception: java.sql.SQLException: An SQL data change is not permitted for a read-only connection, user or database.
...
Caused by: ERROR 25502: An SQL data change is not permitted for a read-only connection, user or database.

Solution:

Remove db.lck file under /u01/app/zdmbase/derbyRepo:

# rm -rf /u01/app/zdmbase/derbyRepo/db.lck

And start the service again.

Change Oracle DB name using NID

Details:

Environment: RAC
Source name: orclA
Target name: orcl

1. Make sure that you have a recoverable whole database backup.

2. In Real Application Cluster we need to set cluster_database parameter to false and mount an instance on only one node:

[oracle@primrac1 ~]$ sqlplus / as sysdba
SQL> alter system set cluster_database=false scope=spfile;

[oracle@primrac1 ~]$ srvctl stop database orclA
[oracle@primrac1 ~]$ sqlplus / as sysdba
SQL> startup mount;

3. From the 1st node run nid utility. Specify username with sysdba privilege, target database name, and SETNAME parameter to yes.

[oracle@primrac1 ~]$ . oraenv
ORACLE_SID = [oracle] ? orclA1
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@primrac1 ~]$ nid TARGET=SYS DBNAME=orcl SETNAME=YES
DBNEWID: Release 19.0.0.0.0 - Production on Mon Dec 16 16:54:36 2019
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
Password:
Connected to database ORCLA (DBID=3133348785)
Connected to server version 19.4.0
Control Files in database:
    +DATA/ORCLA/CONTROLFILE/current.290.1020357617
Change database name of database ORCLA to ORCL? (Y/[N]) => Y
Proceeding with operation
Changing database name from ORCLA to ORCL
     Control File +DATA/ORCLA/CONTROLFILE/current.290.1020357617 - modified
     Datafile +DATA/ORCLA/DATAFILE/system.283.102035744 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/sysaux.286.102035749 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/undotbs1.287.102035753 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/undotbs2.300.102035839 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/users.288.102035754 - wrote new name
     Datafile +DATA/ORCLA/TEMPFILE/temp.296.102035765 - wrote new name
     Control File +DATA/ORCLA/CONTROLFILE/current.290.1020357617 - wrote new name
     Instance shut down
 Database name changed to ORCL.
 Modify parameter file and generate a new password file before restarting.
 Succesfully changed database name.
 DBNEWID - Completed succesfully.

5. Change db_name parameter in the initialization parameter file:

[oracle@primrac1 dbs]$ . oraenv
ORACLE_SID = [orcl1] ? orclA1

[oracle@primrac1 dbs]$ sqlplus / as sysdba
SQL> startup nomount;
SQL> alter system set db_name=orcl scope=spfile;
SQL> alter system set cluster_database=true scope=spfile;
SQL> shut immediate;

6. Modify database name in srvctl:

[oracle@primrac1 dbs]$ srvctl modify database -db orclA -dbname orcl

7. Remove existing password file entry and create a new one:

[oracle@primrac1 dbs]$ srvctl modify database -db orclA -pwfile

[oracle@primrac1 dbs]$ orapwd  dbuniquename=orclA file='+DATA/ORCLA/PASSWORD/pwdorcl.ora'
Enter password for SYS:

8. Start the database using srvctl:

[oracle@primrac1 dbs]$ srvctl start database -db orclA

Depending on your needs you may also update instance names. If so, please make sure that you don’t have instance-specific parameters (e.g orclA1.instance_number), otherwise, it’s better to recreate the parameter file with the correct instance names.

I’ve shown you just a simple example of the database name change. In your specific case, there may be other things that depend on the database name.

Common usage examples of cluvfy utility

1. Discovering all available network interfaces and verifying the connectivity between the nodes in the cluster through those network interfaces:

$ cluvfy comp nodecon -n all -verbose

2. Check the reachability of specified nodes from a source node:

$ cluvfy comp nodereach -n all -verbose

3. Check the integrity of Oracle Cluster Registry (OCR) on all the specified nodes:

$ cluvfy comp ocr -n all -verbose

4. Verifying the integrity of the Oracle high availability services daemon on all nodes in the cluster:

$ cluvfy comp ohasd -n all -verbose

5. Comparing nodes:

$ cluvfy comp peer -n all -verbose

6. Verifying the software configuration on all nodes in the cluster for the Oracle Clusterware home directory:

$ cluvfy comp software -n all -verbose

7. Check the integrity of the cluster on all the nodes in the node list:

$ cluvfy comp clu -n all -verbose