DB | DBA Knowledge Base

Creating an Oracle RAC on Azure cloud with FlashGrid SkyCluster

March 17, 2020 3 Comments

Filed under Azure, FlashGrid, RAC, VideoBlog Tagged with Azure, FlashGrid, RAC, SkyCluster

agctl start goldengate hangs and fails

March 5, 2020 2 Comments

In this post I will discuss one of the reasons why golden gate startup may hang and fail on both nodes:

Problem:

[root@primrac1 tmp]# /u01/app/grid/xag/bin/agctl start goldengate gg_replicate
^[[ACRS-2672: Attempting to start 'xag.gg_replicate.goldengate' on 'primrac2'
CRS-2674: Start of 'xag.gg_replicate.goldengate' on 'primrac2' failed
CRS-2679: Attempting to clean 'xag.gg_replicate.goldengate' on 'primrac2'
CRS-2681: Clean of 'xag.gg_replicate.goldengate' on 'primrac2' succeeded
CRS-2563: Attempt to start resource 'xag.gg_replicate.goldengate' on 'primrac2' has failed. Will re-retry on 'primrac1' now.
CRS-2672: Attempting to start 'xag.gg_replicate.goldengate' on 'primrac1'
CRS-2674: Start of 'xag.gg_replicate.goldengate' on 'primrac1' failed
CRS-2679: Attempting to clean 'xag.gg_replicate.goldengate' on 'primrac1'
CRS-2681: Clean of 'xag.gg_replicate.goldengate' on 'primrac1' succeeded
CRS-2632: There are no more servers to try to place resource 'xag.gg_replicate.goldengate' on that would satisfy its placement policy
CRS-4000: Command Start failed, or completed with errors.

Troubleshooting:

Check crsd_scriptagent_oracle.trc trace file on each database node. For my environment the full path of this file is /u01/app/grid/diag/crs/primrac2/crs/trace/crsd_scriptagent_oracle.trc

I found that the following error was repeating in the trace file:

2020-03-05 23:36:30.467 : CLSDADR:3560113920: ERRORdynamic component [R2530613] mapped to [CLSDYNAM]

On the metalink found only this PRCR-1079 : Failed to start resource ora.oc4j, the problem is not the same, but glanced a sentence containing config file.. so here I realized it could be related to the config file. And I was correct.

Solution:

Delete existing GG resource:

# /u01/app/grid/xag/bin/agctl remove goldengate gg_replicate

Readd using correct config file location:
--config_home /GG_HOME/sm/etc/conf --var_home /GG_HOME/sm/var

# /u01/app/grid/xag/bin/agctl add goldengate gg_replicate --gg_home /GG_HOME/ma --service_manager --config_home /GG_HOME/sm/etc/conf --var_home /GG_HOME/sm/var --port 9001 --adminuser oggadmin --user oracle --group oinstall --filesystems ora.ggdg.acfsgg.acfs --db_services ora.primorcl.orclservice.svc --use_local_services

Start Golden Gate, relocate and check:

[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl start goldengate gg_replicate
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl status goldengate gg_replicate
Goldengate instance 'gg_replicate' is running on primrac2
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl relocate goldengate gg_replicate
[root@primrac1 ~]# /u01/app/grid/xag/bin/agctl status goldengate gg_replicate
Goldengate instance 'gg_replicate' is running on primrac1

Filed under Golden Gate Tagged with agctl, GG, GoldenGate, MA, ServiceManager, start resource, XAG, xagEnabled

NFS Disk Configuration for Quorum

February 13, 2020 Leave a comment

Before starting make sure that:

1. NFS server is accessible from DB nodes
2. nfs-utils is installed on all client and DB nodes
3. ASM is running on all nodes

NFS disk configuration steps:

1. On NFS server node (e.g. nfsserver)

# systemctl disable firewalld
# systemctl stop firewalld
# mkdir /NFS_DISKS

--Create quorum disk in nfs share:
# dd if=/dev/zero of=/NFS_DISKS/nfsdisk1 bs=1M count=1024

# chmod -R 755 /NFS_DISKS
# chown -R grid:oinstall /NFS_DISKS

# vim /etc/exports
/NFS_DISKS *(rw,async)

# systemctl enable nfs-server
# systemctl start nfs-server

2. On all DB nodes (e.g. rac1 rac2)

# mkdir /NFS_DISKS 
# chown -R grid:oinstall /NFS_DISKS
# vim /etc/fstab

nfsserver:/NFS_DISKS /NFS_DISKS nfs rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,noac,vers=3,timeo=600 0 0 0

# mount -a

--Update disk discovery string from grid user:
$ asmcmd dsset '/NFS_DISKS/*','/dev/flashgrid/*'

If this is Flashgrid-enabled cluster, then you can create diskgroup which contains NFS disk as quorum:

[grid@rac1 ~]$ flashgrid-dg create --name TEST --normal --asm-compat 19.0.0.0 --db-compat 19.0.0.0 --disks /dev/flashgrid/rac1.lun4 /dev/flashgrid/rac2.lun4 --quorum-disks /NFS_DISKS/nfsdisk1

FlashGrid 19.6.48.60391
~~~~
Querying nodes: rac1, rac2, racq …
Creating group…
Completing this operation may take long time. Please wait…
CREATE DISKGROUP "TEST" NORMAL REDUNDANCY
FAILGROUP "rac1" DISK '/dev/flashgrid/rac1.lun4' NAME "rac1$LUN4" SIZE 20480M
FAILGROUP "rac2" DISK '/dev/flashgrid/rac2.lun4' NAME "rac2$LUN4" SIZE 20480M
QUORUM FAILGROUP "QUORUM0" DISK '/NFS_DISKS/nfsdisk1' NAME "TEST_Q0"
ATTRIBUTE 'au_size' = '4M', 'compatible.asm' = '19.0.0.0', 'compatible.rdbms' = '19.0.0.0';
ALTER SYSTEM SET asm_preferred_read_failure_groups='DATA.rac1','FRA.rac1','GGDG.rac1','GRID.rac1','TEST.rac1' SID='+ASM1';
ALTER SYSTEM SET asm_preferred_read_failure_groups='DATA.rac2','FRA.rac2','GGDG.rac2','GRID.rac2','TEST.rac2' SID='+ASM2';
Success

Filed under DB Tagged with FlashGrid, NFS, Quorum

Change Oracle DB name using NID

December 23, 2019 Leave a comment

Details:

Environment: RAC
Source name: orclA
Target name: orcl

1. Make sure that you have a recoverable whole database backup.

2. In Real Application Cluster we need to set cluster_database parameter to false and mount an instance on only one node:

[oracle@primrac1 ~]$ sqlplus / as sysdba
SQL> alter system set cluster_database=false scope=spfile;

[oracle@primrac1 ~]$ srvctl stop database orclA
[oracle@primrac1 ~]$ sqlplus / as sysdba
SQL> startup mount;

3. From the 1st node run nid utility. Specify username with sysdba privilege, target database name, and SETNAME parameter to yes.

[oracle@primrac1 ~]$ . oraenv
ORACLE_SID = [oracle] ? orclA1
The Oracle base remains unchanged with value /u01/app/oracle

[oracle@primrac1 ~]$ nid TARGET=SYS DBNAME=orcl SETNAME=YES
DBNEWID: Release 19.0.0.0.0 - Production on Mon Dec 16 16:54:36 2019
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
Password:
Connected to database ORCLA (DBID=3133348785)
Connected to server version 19.4.0
Control Files in database:
    +DATA/ORCLA/CONTROLFILE/current.290.1020357617
Change database name of database ORCLA to ORCL? (Y/[N]) => Y
Proceeding with operation
Changing database name from ORCLA to ORCL
     Control File +DATA/ORCLA/CONTROLFILE/current.290.1020357617 - modified
     Datafile +DATA/ORCLA/DATAFILE/system.283.102035744 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/sysaux.286.102035749 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/undotbs1.287.102035753 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/undotbs2.300.102035839 - wrote new name
     Datafile +DATA/ORCLA/DATAFILE/users.288.102035754 - wrote new name
     Datafile +DATA/ORCLA/TEMPFILE/temp.296.102035765 - wrote new name
     Control File +DATA/ORCLA/CONTROLFILE/current.290.1020357617 - wrote new name
     Instance shut down
 Database name changed to ORCL.
 Modify parameter file and generate a new password file before restarting.
 Succesfully changed database name.
 DBNEWID - Completed succesfully.

5. Change db_name parameter in the initialization parameter file:

[oracle@primrac1 dbs]$ . oraenv
ORACLE_SID = [orcl1] ? orclA1

[oracle@primrac1 dbs]$ sqlplus / as sysdba
SQL> startup nomount;
SQL> alter system set db_name=orcl scope=spfile;
SQL> alter system set cluster_database=true scope=spfile;
SQL> shut immediate;

6. Modify database name in srvctl:

[oracle@primrac1 dbs]$ srvctl modify database -db orclA -dbname orcl

7. Remove existing password file entry and create a new one:

[oracle@primrac1 dbs]$ srvctl modify database -db orclA -pwfile

[oracle@primrac1 dbs]$ orapwd  dbuniquename=orclA file='+DATA/ORCLA/PASSWORD/pwdorcl.ora'
Enter password for SYS:

8. Start the database using srvctl:

[oracle@primrac1 dbs]$ srvctl start database -db orclA

Depending on your needs you may also update instance names. If so, please make sure that you don’t have instance-specific parameters (e.g orclA1.instance_number), otherwise, it’s better to recreate the parameter file with the correct instance names.

I’ve shown you just a simple example of the database name change. In your specific case, there may be other things that depend on the database name.

Filed under 19c, DB Tagged with 19c, change database name, db_name, nid

Patch 28729234 requires component(s) that are not installed in OracleHome. These not-installed components are oracle.crs:11.2.0.4.0

October 28, 2019 1 Comment

Problem:

Applying OCW patch 28729234 on RDBMS home failed with the following error:

[oracle@rac1 28729234]$ $ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME

Oracle Interim Patch Installer version 11.2.0.3.20
...
UtilSession failed: Patch 28729234 requires component(s) that are not installed in OracleHome. These not-installed components are oracle.crs:11.2.0.4.0

Reason:

The error message is confusing, because the real reason is in the patch itself. There is a duplication of folders and files. Custom/server directory under 28729234, also contains 28729234 directory.

Solution:

As a workaround run opatch from inner 28729234 directory:

[oracle@rac1 ~]$ cd /home/oracle/28813878/28729234/custom/server/28729234

Generate emocmrsp file:

[oracle@rac1 28729234]$ /u01/app/oracle/product/11.2.0/dbhome_1/OPatch/ocm/bin/emocmrsp

OCM Installation Response Generator 10.3.7.0.0 - Production
Copyright (c) 2005, 2012, Oracle and/or its affiliates.  All rights reserved.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: 
You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y
The OCM configuration response file (ocm.rsp) was successfully created

Apply patch using the following command:

[oracle@rac1 28729234]$ $ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -ocmrf ./ocm.rsp 
...
Start OOP by Prereq process.
Launch OOP…
Oracle Interim Patch Installer version 11.2.0.3.20
Copyright (c) 2019, Oracle Corporation.  All rights reserved.
...
Applying interim patch '28729234' to OH '/u01/app/oracle/product/11.2.0/dbhome_1'
Patching component oracle.rdbms, 11.2.0.4.0…
Patch 28729234 successfully applied.

Checking that OCW is applied:

[oracle@rac1 28729234]$ $ORACLE_HOME/OPatch/opatch lspatches
28729234;OCW Patch Set Update : 11.2.0.4.190115 (28729234)
28729262;Database Patch Set Update : 11.2.0.4.190115 (28729262)

Filed under 11g, OPatch Tagged with 11.2.0.4, 11g, 28729234, OCW, oracle.crs, RDBMS

ACFS Input/output error on OS kernel 3.10.0-957

October 14, 2019 Leave a comment

Problem:

My ACFS volume has intermittent problems, when I run ls command I get ls: cannot access sm: Input/output error and user/group ownership contains question marks.

[root@primrac1 GG_HOME]# ll
ls: cannot access ma: Input/output error
ls: cannot access sm: Input/output error
ls: cannot access deploy: Input/output error
total 64
d????????? ? ? ? ? ? deploy
drwx------ 2 oracle oinstall 65536 Sep 28 21:05 lost+found
d????????? ? ? ? ? ? ma
d????????? ? ? ? ? ? sm

Reason:

Doc ID 2561145.1 mentions that kernel 3.10.0-957 changed behaviour of d_splice_alias interface which is used by ACFS driver.

My kernel is also the same:

[root@primrac1 GG_HOME]# uname -r
3.10.0-957.21.3.el7.x86_64

Solution:

Download and apply patch 29963428 on GI home.

[root@primrac1 29963428]# /u01/app/19.3.0/grid/OPatch/opatchauto apply -oh /u01/app/19.3.0/grid

Filed under ASM, DB, RAC Tagged with 29963428, 3.10.0-957, ?, ACFS, Input/output error

JDBC 11g: SQLException(“Io exception: Connection reset”)

October 7, 2019 Leave a comment

Problem:

Connection using 11g ojdbc was very slow and most of the time was failing with Connection reset error after 60s (default inbound connection timeout). Database alert log contained WARNING: inbound connection timed out (ORA-3136) errors.

Reason:

Oracle 11g JDBC drivers use random numbers during authentication. Those random numbers are generated by OS using /dev/random and if there is faulty/slow hardware or not too much activity on the system this generation can be slow, which causes slowness during jdbc connection.

Solution:

Instead of /dev/random indicate non-blocking /dev/urandom as java command line argument:

# java -Djava.security.egd=file:/dev/urandom -cp ojdbc8.jar:. JDBCTest "stbyrac-scan.example.com"

Filed under 11g, Java, JDBC Tagged with 11g, Connection reset, java, JDBC, ojdbc, ORA-3136, random, urandom

Simple java code to test connection to Oracle database

September 30, 2019 4 Comments

Assuming that client computer does not have Oracle client installed.

1. Download necessary version of ojdbc jar file from Oracle. The latest version for now is ojdbc8.jar

2. Install java development tools:

# yum install java-devel * -y

3. Create a sample java code, which:
– connects to the database
– selects 1 from dual
– disconnects

# cat JDBCTest.java

import java.sql.*;
 class JDBCTest{
     public static void main(String args[]) throws SQLException {
         Connection con = null;
         try{
             Class.forName("oracle.jdbc.driver.OracleDriver");
             String dbURL = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=" + args[0] + ")(PORT=1521))(CONNECT_DATA=(SERVER = DEDICATED)(SERVICE_NAME=orclgg)))";
             System.out.println("jdbcurl=" + dbURL);
             String strUserID = "system";
             String strPassword = "Oracle123";
             con=DriverManager.getConnection(dbURL,strUserID,strPassword);
             System.out.println("Connected to the database.");
             Statement stmt=con.createStatement();
             System.out.println("Executing query");
             ResultSet rs=stmt.executeQuery("SELECT 1 FROM DUAL");
             while(rs.next())
                 System.out.println(rs.getInt("1"));
             con.close();
         }catch(Exception e){ System.out.println(e);}
         finally {
             con.close();
         }
 }
 }

4. Compile java code and check that *.class file was generated:

# javac JDBCTest.java

# ll  JDBCTest.*
 -rw-r--r-- 1 root root 1836 Sep 27 11:45 JDBCTest.class
 -rw-r--r-- 1 root root  925 Sep 27 11:45 JDBCTest.java

5. Run code:

# java -Djava.security.egd=file:/dev/../dev/urandom -cp ojdbc8.jar:. JDBCTest "stbyrac-scan.example.com"

jdbcurl=jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=stbyrac-scan.example.com)(PORT=1521))(CONNECT_DATA=(SERVER = DEDICATED)(SERVICE_NAME=orclgg)))

Connected to the database.
Executing query…
1

Please note that -Djava.security.egd=file:/dev/../dev/urandom parameter is required to have a stable connection. I will discuss the importance of this parameter in the next post.

Filed under DB, JDBC Tagged with java, JDBC, ojdbc, random, urandom

OPATCHAUTO-72115: Please complete the previous apply session across all nodes to perform apply session

September 23, 2019 Leave a comment

Problem:

My opatchauto outofplace patching failed on GI home. I was able to cleanup cloned GI home and information about it in inventory.xml, but after running opatchauto again I was getting the following error:

[root@rac1 29708703]# $ORACLE_HOME/OPatch/opatchauto apply -oh $ORACLE_HOME -outofplace
OPatchauto session is initiated at Sun Aug 18 20:40:43 2019
System initialization log file is /u01/app/18.3.0/grid/cfgtoollogs/opatchautodb/systemconfig2019-08-18_08-40-46PM.log.
Session log file is /u01/app/18.3.0/grid/cfgtoollogs/opatchauto/opatchauto2019-08-18_08-42-20PM.log
The id for this session is Z1CP
OPATCHAUTO-72115: Out of place patching apply session cannot be performed.
OPATCHAUTO-72115: Previous apply session  is not completed on node rac1.
OPATCHAUTO-72115: Please complete the previous apply  session across all nodes to perform apply session.
OPatchAuto failed.

Solution:

Clear checkpoint files from the previous session :

[root@rac1 29708703]# cd /u01/app/18.3.0/grid/.opatchauto_storage/rac1
[root@rac1 rac1]# ls
oopsessioninfo.ser
[root@rac1 rac1]# rm -rf oopsessioninfo.ser

Rerun opatchauto apply again.

Filed under 18c, opatchauto Tagged with 18.3, 18c, opatchauto, OPATCHAUTO-72115

Patch Planner to check and request conflict patches

September 9, 2019 Leave a comment

Problem:

Recently, I was applying p29963428_194000ACFSRU_Linux-x86-64.zip on top of 19.4 GI home and got the following error:

==Following patches FAILED in analysis for apply:
 Patch: /u01/swtmp/29963428/29963428
 Log: /u01/app/19.3.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2019-08-07_10-07-56AM_1.log
 Reason: Failed during Analysis: CheckConflictAgainstOracleHome Failed, [ Prerequisite Status: FAILED, Prerequisite output: 
 Summary of Conflict Analysis:
 There are no patches that can be applied now.
 Following patches have conflicts. Please contact Oracle Support and get the merged patch of the patches : 
 29851014, 29963428
 Conflicts/Supersets for each patch are:
 Patch : 29963428
 Bug Conflict with 29851014 Conflicting bugs are: 29039918, 27494830, 29338628, 29031452, 29264772, 29760083, 28855761 ... 
 After fixing the cause of failure Run opatchauto resume

Solution:

Oracle MOS note ID 1317012.1 describes steps how to check such conflicts and request conflict/merged patches in previous:

1. Run lsinventory from the target home:

[grid@rac1 ~]$ /u01/app/19.3.0/grid/OPatch/opatch lsinventory > GI_lsinventory.txt

2. Logon to support.oracle.com -> Click the “Patch and Updates” tab -> Enter the patch number you want to apply:

2. Click Analyze with OPatch…

3. Attach GI_lsinventory.txt file created in the first step and click “Analyze for Conflict”:

4. Wait for a while and you will see the result. According to it, patch 29963428 conflicts with my current patches:

From the same screen I can “Request Patch”.

5. After clicking “Request Patch” button I got the following error:

Click “See Details”:

The message actually means that fix for the same bug is already included in currently installed 19.4.0.0.190716ACFSRU.

So I don’t have to apply 29963428 patch. I wanted to share the steps with you , because the mentioned tool is really useful.

Filed under 19c, DB, OPatch, opatchauto, Patching Tagged with 19c, 29963428, ACFS, ID 1317012.1, Patch Planner

← Older posts

Newer posts →

DBA Knowledge Base

Creating an Oracle RAC on Azure cloud with FlashGrid SkyCluster

agctl start goldengate hangs and fails

NFS Disk Configuration for Quorum

Change Oracle DB name using NID

Patch 28729234 requires component(s) that are not installed in OracleHome. These not-installed components are oracle.crs:11.2.0.4.0

ACFS Input/output error on OS kernel 3.10.0-957

Problem:

Reason:

Solution:

JDBC 11g: SQLException(“Io exception: Connection reset”)

Simple java code to test connection to Oracle database

OPATCHAUTO-72115: Please complete the previous apply session across all nodes to perform apply session

Patch Planner to check and request conflict patches

Database Administrator Jobs

Recent Posts

Archives

Categories

Meta

Subscribe to Blog via Email