Move Oracle datafiles from filesystem to ASM using RMAN

1. Connect to the database using RMAN and move database into mount mode:

$ rman target /
RMAN> shutdown immediate
RMAN> startup mount

2. Get information about target datafiles and tempfiles:

RMAN> report schema;

using target database control file instead of recovery catalog
Report of database schema for database with db_unique_name PRSH01

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    1020     SYSTEM               YES     /u01/app/oracle/oradata/PRSH01/system01.dbf
3    490      SYSAUX               NO      /u01/app/oracle/oradata/PRSH01/sysaux01.dbf
4    850      UNDOTBS1             YES     /u01/app/oracle/oradata/PRSH01/undotbs01.dbf
7    5        USERS                NO      /u01/app/oracle/oradata/PRSH01/users01.dbf

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    114      TEMP                 32767       /u01/app/oracle/oradata/PRSH01/temp01.dbf

3. Copy datafiles from filesystem to ASM diskgroup, in our case to +DATA:

RMAN> copy datafile 1 to '+DATA';
RMAN> copy datafile 3 to '+DATA';
RMAN> copy datafile 4 to '+DATA';
RMAN> copy datafile 7 to '+DATA';
RMAN> switch datafile 1 to copy;
RMAN> switch datafile 3 to copy;
RMAN> switch datafile 4 to copy;
RMAN> switch datafile 7 to copy;

4. Place tempfile on +DATA:

RMAN> run{
2> set newname for tempfile 1 to '+DATA';
3> switch tempfile 1;
4> }

executing command: SET NEWNAME
renamed tempfile 1 to +DATA in control file

5. Double check that all files have a new name

RMAN> report schema;

Report of database schema for database with db_unique_name PRSH01

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    1020     SYSTEM               ***     +DATA/PRSH01/DATAFILE/system.256.1122674191
3    570      SYSAUX               ***     +DATA/PRSH01/DATAFILE/sysaux.257.1122674225
4    910      UNDOTBS1             ***     +DATA/PRSH01/DATAFILE/undotbs1.258.1122674235
7    5        USERS                ***     +DATA/PRSH01/DATAFILE/users.259.1122674253

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    114      TEMP                 32767       +DATA

6. Open database:

RMAN> alter database open;

7. Check new file names. Note tempfile name will also be generated:

RMAN> report schema;

Report of database schema for database with db_unique_name PRSH01

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    1020     SYSTEM               YES     +DATA/PRSH01/DATAFILE/system.256.1122674191
3    570      SYSAUX               NO      +DATA/PRSH01/DATAFILE/sysaux.257.1122674225
4    910      UNDOTBS1             YES     +DATA/PRSH01/DATAFILE/undotbs1.258.1122674235
7    5        USERS                NO      +DATA/PRSH01/DATAFILE/users.259.1122674253

List of Temporary Files
=======================
File Size(MB) Tablespace           Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1    114      TEMP                 32767       +DATA/PRSH01/TEMPFILE/temp.260.1122674673
Advertisement

DBCA Fails With INS-00001 and DBT-00007

Problem:

While trying to run dbca from Oracle user I received the following error:

Cause - No additional information available.  
Action - Refer to the logs or contact Oracle Support Services  

Summary  
- [DBT-00007] User does not have the appropiate write privileges.  
- [DBT-00007] User does not have the appropiate write privileges.

Screenshot:

Solution:

Make sure the directory has the correct owner and permissions, otherwise set:

# chown oracle:oinstall $ORACLE_BASE/cfgtoollogs/dbca
# chmod 750 $ORACLE_BASE/cfgtoollogs/dbca

Rerun dbca.

Postfix: flush emails from the mail queue

Problem:

The customer was getting a lot system alerts even after disabling it. There were a lot of emails in the queue:

# mailq
-Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
DB13ED78461   21597 Thu May 26 05:39:26  flashgrid@localhost.localdomain
(delivery temporarily suspended: connect to us-smtp-inbound-2.meme.com[~]:25: Connection timed out)
                                         alertdb@test.com

DA31AD9C4B5   21503 Thu May 26 10:32:16  flashgrid@localhost.localdomain
(delivery temporarily suspended: connect to us-smtp-inbound-2.meme.com[~]:25: Connection timed out)
                                         alertdb@test.com

DA701D9AC16   21601 Thu May 26 07:01:06  flashgrid@localhost.localdomain
(delivery temporarily suspended: connect to us-smtp-inbound-1.meme.com[~]:25: Connection timed out)
                                         alertdb@test.com

DA3B9D9C488   21503 Thu May 26 09:31:36  flashgrid@localhost.localdomain
(connect to us-smtp-inbound-1.meme.com[~]:25: Connection timed out)
                                         alertdb@test.com

[...]

Solution:

  • To remove all mail from all the queues ( hold, incoming, active and deferred ) , run :
# postsuper -d ALL
  • To remove all mails in the deferred queue only, run :
# postsuper -d ALL deferred

CRS-2549: Resource ‘ora.asmgroup’ cannot be placed on ‘rac1’ as it is not a valid candidate as per the placement policy

Problem:

After failed JDK patching on the 1st node, we tried troubleshooting and saw that ASM was not able to start:

# su - grid
$ sqlplus / as sysasm
SQL> startup nomount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ORA-39511: Start of CRS resource for instance '223' failed with error:[CRS-2549: Resource 'ora.asmgroup' cannot be placed on 'rac1' as it is not a valid candidate as per the placement policy
CRS-0223: Resource 'ora.asm' has placement error.
clsr_start_resource:260 status:223
clsrapi_start_asm:start_asmdbs status:223

Reason:

Prepatch modified RESOURCE_USE_ENABLED=0 for rac1 node:

[grid@rac1 ~]$ crsctl stat server -f

NAME=rac1
MEMORY_SIZE=63465
CPU_COUNT=8
CPU_CLOCK_RATE=2499
CPU_HYPERTHREADING=1
CPU_EQUIVALENCY=1000
DEPLOYMENT=other
CONFIGURED_CSS_ROLE=hub
RESOURCE_USE_ENABLED=0
SERVER_LABEL=
PHYSICAL_HOSTNAME=
CSS_CRITICAL=no
CSS_CRITICAL_TOTAL=0
RESOURCE_TOTAL=0
SITE_NAME=stsfilive
STATE=ONLINE
ACTIVE_POOLS=Free
STATE_DETAILS=
ACTIVE_CSS_ROLE=hub

NAME=rac2
MEMORY_SIZE=63465
CPU_COUNT=8
CPU_CLOCK_RATE=2499
CPU_HYPERTHREADING=1
CPU_EQUIVALENCY=1000
DEPLOYMENT=other
CONFIGURED_CSS_ROLE=hub
RESOURCE_USE_ENABLED=1
….

Solution:

Connect to the failing node and run:

[root@rac1 ~]# crsctl set resource use 1

Start ASM.

Flashgrid: Strict local was NOT disabled due to dependency errors!

Problem:

# flashgrid-cluster strict-read-local-disable
~~~~~~~~~~~~~~~~
Unable to revert 'ora.orcl.db': saved data doesn't match current dependencies!

Expected 'hard(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:uniform:ora.DATA2.dg, uniform:global:ora.GRID.dg, uniform:global:ora.OCR.dg, uniform:global:ora.MGMT.dg, uniform:global:ora.OGG.dg, global:uniform:ora.DATA1.dg) pullup(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:ora.GRID.dg, global:ora.OCR.dg, global:ora.MGMT.dg, global:ora.OGG.dg, global:ora.DATA2.dg, global:ora.DATA1.dg) weak(type:ora.listener.type, global:type:ora.scan_listener.type, uniform:ora.ons, global:ora.gns, global:uniform:ora.ARCH.dg)', found 'hard(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:uniform:ora.DATA2.dg, uniform:global:ora.GRID.dg, uniform:global:ora.OCR.dg, uniform:global:ora.MGMT.dg, uniform:global:ora.OGG.dg, global:uniform:ora.ARCH.dg, global:uniform:ora.DATA1.dg) pullup(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:ora.GRID.dg, global:ora.OCR.dg, global:ora.MGMT.dg, global:ora.OGG.dg, global:ora.DATA2.dg, global:ora.ARCH.dg, global:ora.DATA1.dg) weak(type:ora.listener.type, global:type:ora.scan_listener.type, uniform:ora.ons, global:ora.gns)'

Strict local was NOT disabled due to dependency errors!

The dependency attribute for the database resource is different than expected.

Solution:

Copy text after Expected until found sections, and modify dependencies manually. This process does not require any downtime and is safe to run. Please note, that your dependencies should be different, so instead of copying the text below use the output from the previous command to modify accordingly:

# crsctl modify resource ora.orcl.db -attr "START_DEPENDENCIES='hard(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:uniform:ora.DATA2.dg, uniform:global:ora.GRID.dg, uniform:global:ora.OCR.dg, uniform:global:ora.MGMT.dg, uniform:global:ora.OGG.dg, global:uniform:ora.DATA1.dg) pullup(fg.OGG.DisksReady, fg.OCR.DisksReady, fg.MGMT.DisksReady, fg.GRID.DisksReady, fg.DATA2.DisksReady, fg.DATA1.DisksReady, fg.ARCH.DisksReady, global:ora.GRID.dg, global:ora.OCR.dg, global:ora.MGMT.dg, global:ora.OGG.dg, global:ora.DATA2.dg, global:ora.DATA1.dg) weak(type:ora.listener.type, global:type:ora.scan_listener.type, uniform:ora.ons, global:ora.gns, global:uniform:ora.ARCH.dg)'" -unsupported

Disable read local:

# flashgrid-cluster strict-read-local-disable

I would not recommend disabling it, to reenable please run # flashgrid-cluster strict-read-local-enable

The reason we were disabling it was because of testing and we’ve found that the resource had a dependency error that was causing problems. So in case you encounter the same, you know how to solve it.

Part 2: ora.storage fails to start, ORA-01017

Problem:

One of our customers changed ASM password file by mistake and regarding other actions, we are not sure. After node restart, they encountered ora.storage startup issue on the second node.

CRS-2672: Attempting to start 'ora.storage' on 'orcl02'
ORA-01017: invalid username/password; logon denied
CRS-5055: unable to connect to an ASM instance because no ASM instance is running in the cluster
CRS-2883: Resource 'ora.storage' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
CRS-4000: Command Start failed, or completed with errors.

I have followed my blog post to recover ASM passwordfile and add CRSUSER__ASM_001. The CRS started successfully on the first node but it still was not able to start on the second.

Reason:

When we checked password for CRSUSER__ASM_001 on both nodes, we got different results:

[grid@orcl01 ~]$ crsctl get credmaint -path ASM/Self/0b5330fe4bdf6f6ebffb09beab078d6e -credtype userpass -id 0 -attr passwd -local 
zSZDts1PQx8v7gRrdmH1EjIpSBsAt
[grid@orcl02 ~]$ crsctl get credmaint -path ASM/Self/0b5330fe4bdf6f6ebffb09beab078d6e -credtype userpass -id 0 -attr passwd -local 
rHgulYGfY17Uxbb9Tbd9VF3yr2Kvr

Which is not normal and they must be the same. This was the reason CRS was not able to start on the second node, because ASM passwordfile for CRSUSER__ASM_001 had value zSZDts1PQx8v7gRrdmH1EjIpSBsAt

Solution:

Verify and fix the credentials:

If you are not able to set up root ssh passwordless connectivity, you can run the following command as grid. Note in that case you will get “credfix: could not delete crs credentials for jxrucJl3”, this is because the command was not run as root and old credentials were not deleted. But new credentials are successfully created.

[grid@orcl01 ~]$ asmcmd --nocp credverify
credverify: More than one credential in password file, please run 'credfix' to fix the credentials.
​
[grid@orcl01 ~]$ asmcmd --nocp credfix
credfix: Credentials for JXRUCJL3 not in password file, trying next credential.
op=addcrscreds wrap=/tmp/creds0.xml
credfix: Creating new credentials, no valid credentials in OCR.
credfix: New user CRSUSER__ASM_004 created.
op=credimport wrap=/tmp/creds0.xml olr=true force=true
credfix: OLR for orcl01 has been fixed if credentials were created incorrectly.
credfix: Starting SSH session on node orcl02.
credfix: OLR for orcl02 has been fixed if credentials were created incorrectly. Exiting SSH session.
op=delcrscreds crs_user=jxrucJl3
ASMCMD-8202: internal error:
credfix: could not delete crs credentials for jxrucJl3

It is recommended to setup passwordless ssh connectivity for root user and then run credfix as root to have clean configuration without old entries:

[root@rac1 ~]# asmcmd --nocp credfix
..

Installing linux.x64_193000_client gets java.library.path system variable is missing or invalid

Problem:

While installing linux.x64_193000_client.zip on Linux, getting the following error:

Preparing to launch Oracle Universal Installer from /tmp/OraInstall2022-06-15_07-48-19AM. Please wait ...[oracle@mariclient client]$ Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/u01/tmp
The java.library.path system variable is missing or invalid. Please set java.library.path with a correct value and retry the operation.
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class oracle.sysman.oii.oiip.oiipg.OiipgPropertyLoader
	at oracle.sysman.oii.oiip.oiipg.OiipgBootstrap.isCleanMachine(OiipgBootstrap.java:484)
	at oracle.sysman.oii.oiii.OiiiInstallAreaControl.isCleanMachine(OiiiInstallAreaControl.java:3796)

Solution:

Install the following rpm and retry:

# yum install libnsl.x86_64

Terminal window xterm is not displayed in VNC

Problem:

After installing/starting the tiger VNC server, and connecting using VNC Viewer terminal window is not displayed.

# yum install tigervnc-server

Solution:

Install xterm:

# yum install xterm

Kill old VNC process and start again:

$ vncserver -kill :1
$ vncserver -geometry 1024x1024

Reconnect using VNC viewer, terminal should be displayed automatically.

RMAN backup on NFS v3 takes too much time

Problem:

I was trying to test RMAN backup on Azure blob storage with NFS v3 access. I have noticed a huge delay even when backing up a small controlfile.

The same backup on disk took 7sec, but on NFS 15min. While watching IO on NFS mountpoint using nfsiostat I saw that initial 13min there was no IO at all and actual backup was taken only at the last minute.

Another interesting thing while watching netstat output was the following:

[root@rac1 dbbackup2]# netstat -na|grep 10.0.0.
tcp 0 0 10.0.0.5:875 10.0.0.16:2048 ESTABLISHED
tcp 0 1 10.0.0.5:54997 10.0.0.16:2049 SYN_SENT

You can also turn debug on before running backup in RMAN and check the output:

RMAN> RUN
{
  ALLOCATE CHANNEL disk1 DEVICE TYPE DISK FORMAT '/dataload/%U';
  debug on;
  BACKUP current controlfile;
  debug off;
}

Solution:

Disable dNFS.

$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk dnfs_off

Retry RMAN backup, in my case it took only 1sec:

...
piece handle=/dbbackup/0q0rruhd_26_1_1 tag=TAG20220425T202309 comment=NONE
channel disk1: backup set complete, elapsed time: 00:00:01
Finished backup at 25-APR-22

Monitor NFS mount point IO performance

Problem:

I am configuring RMAN backup of my Oracle database and redirecting backup sets to Azure Blob Storage with NFSv3 access (quite a new feature for this time and needs to be tested). But I don’t know what will be the writing performance for this type of storage.

Solution:

One of the useful tools is nfsiostat, we will test it in this blog post.

1. After mounting Azure Blob Storage to my database node as /dbbackup mount point, instead of triggering RMAN backup I’d prefer dd command at this time:

$  dd if=/dev/zero of=/dbbackup/myfile oflag=direct bs=1M count=512000

2. Run nfsiostat with interval 1sec and monitor values:

[oracle@rac1 data]$ nfsiostat 1

marirmanstorage.blob.core.windows.net:/marirmanstorage/dbbackup mounted on /dbbackup:

   op/s	   rpc bklog
   0.11	   0.00
read:      ops/s    kB/s      kB/op.    retrans	 avg RTT (ms)	avg exe (ms)
	   0.000    0.000     0.000     0 (0.0%) 0.000	        0.000
write:     ops/s    kB/s      kB/op     retrans	 avg RTT (ms)	avg exe (ms)
	   18.000   18437.977 1024.332  0 (0.0%) 55.500	        55.611

For information, interval specifies the amount of time in seconds between each report. The first report contains statistics for the time since each file system was mounted. Each subsequent report contains statistics collected during the interval since the previous report

3. Cancel dd operation, you will also get information about the speed :

[oracle@rac1 dbbackup]$  dd if=/dev/zero of=/dbbackup/myfile oflag=direct bs=1M count=512000
^C
124+0 records in
124+0 records out
130023424 bytes (130 MB) copied, 6.85939 s, 19.0 MB/s

More information about nfsiostat can of course be found using man nfsiostat.