‘udev’ rules continuously being re-applied resulted in ASM NVMe disks going offline

Environment:

Linux Server release 7.2
kernel-3.10.0-514.26.2.el7.x86_64

Problem:

When an Oracle process opens a device for writing and then closes it, the inotify watch that udev keeps on the block device causes a change event to be synthesized, and every udev rule with ACTION=="add|change" runs against that device again. This behavior causes ASM NVMe disks to go offline:

Thu Jul 09 16:33:16 2020
WARNING: Disk 18 (rac1$disk1) in group 2 mode 0x7f is now being offlined

Fri Jul 10 10:04:34 2020
WARNING: Disk 19 (rac1$disk5) in group 2 mode 0x7f is now being offlined

Fri Jul 10 13:45:45 2020
WARNING: Disk 15 (rac1$disk8) in group 2 mode 0x7f is now being offlined
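
To observe these synthesized change events while the workload is running, you can watch the udev event stream on the affected node (a diagnostic step only; the subsystem filter below assumes the ASM disks are plain block devices):

# udevadm monitor --udev --property --subsystem-match=block

Each close-after-write by an Oracle process should show up as an event with ACTION=change for the corresponding nvme device.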

Solution:

To suppress these false-positive change events, disable the inotify watch for the devices used by Oracle ASM using the following steps:

1. Create the /etc/udev/rules.d/96-nvme-nowatch.rules file with just one line in it:
ACTION=="add|change", KERNEL=="nvme*", OPTIONS:="nowatch"

2. After creating the file, run the following to activate the rule:

# udevadm control --reload-rules
# udevadm trigger --type=devices --action=change

The above commands reload the complete udev configuration and re-trigger all udev rules. On a busy production system this could disrupt ongoing operations and the applications running on the server, so run them during a scheduled maintenance window only.
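
If you want to limit the re-trigger to the NVMe devices instead of every device, udevadm trigger accepts match options; a sketch (the device name pattern is an example, check the man page of your udev version for the supported match options):

# udevadm trigger --type=devices --action=change --sysname-match='nvme*'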

Source: https://access.redhat.com/solutions/1465913 + our experience with customers.

Query the cluster active patchlevel in the OCR backup

To identify an OCR backup whose cluster active patch level matches the current one, run the following:

Identify cluster active patch level:

[root@rac1 ~]# crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [19.0.0.0.0]. 
The cluster upgrade state is [NORMAL]. 
The cluster active patch level is [2701864972].

If the backup is located on a diskgroup:

[root@rac1 ~]# ocrdump -stdout -keyname SYSTEM.version.activeversion.patchlevel -backupfile +GRID:/marirac/OCRBACKUP/backup_20200425_171539.ocr.266.1038676539
04/26/2020 12:28:20
+GRID:/marirac/OCRBACKUP/backup_20200425_171539.ocr.266.1038676539
/u01/app/19.3.0/grid/bin/ocrdump.bin -stdout -keyname SYSTEM.version.activeversion.patchlevel -backupfile +GRID:/marirac/OCRBACKUP/backup_20200425_171539.ocr.266.1038676539
[SYSTEM.version.activeversion.patchlevel]
UB4 (10) : 2701864972
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : root, GROUP_NAME : root}

If the backup is located on a filesystem:

[root@rac1 ~]# ocrdump -stdout -keyname SYSTEM.version.activeversion.patchlevel -backupfile /tmp/backup_07_13_47_25
04/26/2020 12:26:57
/tmp/backup_07_13_47_25
/u01/app/19.3.0/grid/bin/ocrdump.bin -stdout -keyname SYSTEM.version.activeversion.patchlevel -backupfile /tmp/backup_07_13_47_25
[SYSTEM.version.activeversion.patchlevel]
UB4 (10) : 2701864972
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : root, GROUP_NAME : root}
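
If you have several candidate backups, the same check can be scripted in one pass (a sketch; the backup paths below are just the two examples from above, substitute the ones reported by ocrconfig -showbackup):

for f in +GRID:/marirac/OCRBACKUP/backup_20200425_171539.ocr.266.1038676539 \
         /tmp/backup_07_13_47_25; do
  echo "== $f"
  ocrdump -stdout -keyname SYSTEM.version.activeversion.patchlevel -backupfile "$f" | grep -A1 patchlevel
done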

ORA-15137: The ASM cluster is in rolling patch state, CRS does not start

Problem:

The customer was not able to start the cluster: the ASM disks were offline and could not be brought online because the cluster was in the ROLLING PATCH state.

SQL> ALTER DISKGROUP "GRID" ONLINE REGULAR DISK "RAC2$XVDN"
2020-04-23T17:28:10.851065+08:00
ORA-15032: not all alterations performed
ORA-15137: The ASM cluster is in rolling patch state.

Troubleshooting:

1. Check the cluster activeversion:
[root@rac1 ~]# crsctl query crs activeversion -f
Oracle Clusterware active version on the cluster is [12.2.0.1.0].
The cluster upgrade state is [ROLLING PATCH]. The cluster active patch level is [2250904419].

2. Check softwarepatch of each node:

[root@rac1 ~]# crsctl query crs softwarepatch
Oracle Clusterware patch level on node rac1 is [2269302628]

[root@rac2 ~]# crsctl query crs softwarepatch
Oracle Clusterware patch level on node rac2 is [2269302628]

[root@rac3 ~]# crsctl query crs softwarepatch
Oracle Clusterware patch level on node rac3 is [2269302628]

[root@rac4 ~]# crsctl query crs softwarepatch
Oracle Clusterware patch level on node rac4 is [3074559134]

As we can see, the software patch level on rac4 is different from the others. In addition, the active patch level [2250904419] is older; this is normal in ROLLING mode, and once the rolling operation is finished this value in the OCR is updated as well.
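
The per-node check above can also be run from a single node (a rough sketch; passwordless ssh as root between the nodes and the 12.2 grid home path shown are both assumptions):

for n in rac1 rac2 rac3 rac4; do
  ssh root@$n '/u01/app/12.2.0/grid/bin/crsctl query crs softwarepatch'
done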

3. The output of the 2nd step shows that additional troubleshooting of the installed patches is necessary, so compare the patches on each node:

Inventory showed the same output on all 4 nodes:

$ $GRID_HOME/OPatch/opatch lspatches
30593149;Database Jan 2020 Release Update : 12.2.0.1.200114 (30593149)
30591794;TOMCAT RELEASE UPDATE 12.2.0.1.0(ID:RELEASE) (30591794)
30586063;ACFS JAN 2020 RELEASE UPDATE 12.2.0.1.200114 (30586063)
30585969;OCW JAN 2020 RELEASE UPDATE 12.2.0.1.200114 (30585969)
26839277;DBWLM RELEASE UPDATE 12.2.0.1.0(ID:170913) (26839277)

But after checking kfod op=patches we found that 4 additional patches existed on the rac4 node only. The difference was not visible in the inventory because those patches were inactive, left over from an old RU (they were not rolled back when the superset patches were applied, they were only made inactive).

The output from rac1, rac2 and rac3:

$ $GRID_HOME/bin/kfod op=patches
List of Patches
26710464
26737232
26839277
...
<extracted>

The output from rac4:

$ $GRID_HOME/bin/kfod op=patches
---------------
List of Patches
===============
26710464
26737232
26839277
...
<extracted>
28566910  <<< Bug 28566910: TOMCAT RELEASE UPDATE 12.2.0.1.0
29757449  <<< Bug 29757449: DATABASE JUL 2019 RELEASE UPDATE 12.2.0.1.190716
29770040  <<< Bug 29770040: OCW JUL 2019 RELEASE UPDATE 12.2.0.1.190716
29770090  <<< Bug 29770090: ACFS JUL 2019 RELEASE UPDATE 12.2.0.1.190716

To roll back inactive patches you need to use patchgen; it is not possible to do it with opatch rollback -id.

As root:

# $GRID_HOME/crs/install/rootcrs.sh -prepatch

As grid:

$ $GRID_HOME/bin/patchgen commit -rb 28566910
$ $GRID_HOME/bin/patchgen commit -rb 29757449
$ $GRID_HOME/bin/patchgen commit -rb 29770040
$ $GRID_HOME/bin/patchgen commit -rb 29770090

Validate the patch level on rac4 again; it should now be 2269302628:

$ $GRID_HOME/bin/kfod op=PATCHLVL
-------------------
Current Patch level
===================
2269302628

As root:

# $GRID_HOME/crs/install/rootcrs.sh -postpatch

4. If CRS were up on at least one node, we would be able to leave the rolling patch state using crsctl stop rollingpatch (although, if CRS is up, -postpatch should end the rolling state as well). But in our case CRS is down and ASM is not able to access the OCR file (this was caused by another problem, which I consider a bug, but will not cover here, otherwise the post would become too long). We need extra steps to start CRS.

The solution I used here was to restore the OCR from an old backup. I needed an OCR that was not in ROLLING mode, i.e. a NORMAL OCR backup, which in my case was located on the filesystem. I knew that after the restore the OCR active version would be lower than the current software version on the nodes, but this is solvable (we will cover that step as well).

a) Stop CRS on all nodes.
b) On one of the nodes, start the cluster in exclusive mode and restore the OCR backup:

# crsctl stop crs -f 
# crsctl start crs -excl -nocrs
# ocrconfig -restore /tmp/OCR/OCR_BACKUP_FILE
# crsctl stop crs -f

c) If you check the software patch level here, it will not be 2269302628 but something old (because the OCR is old), so we need to correct it on each node:

# crsctl start crs -wait
# clscfg -patch
# crsctl query crs softwarepatch

The software patch level will be 2269302628 on all nodes.

d) Note that even after correcting the software patch level, crsctl query crs activeversion -f still shows something old, which is normal. To correct it, run the following on the last node only:

# crsctl stop rollingpatch
# crsctl query crs activeversion -f

e) If you added custom services or configuration to CRS after the OCR backup was taken, you need to add them again manually. For example, if you had added a service with srvctl add service, you need to re-add it. That is why it is highly recommended to back up the OCR manually, using ocrconfig -manualbackup, before and after patching or any custom configuration change, as shown below.
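
For example, a manual backup taken right before patching, and the list of manual backups afterwards (ocrconfig must be run as root):

# ocrconfig -manualbackup
# ocrconfig -showbackup manual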

ACFS Input/output error on OS kernel 3.10.0-957

Problem:

My ACFS volume has intermittent problems: when I run the ls command I get "ls: cannot access sm: Input/output error" and the user/group ownership contains question marks.

[root@primrac1 GG_HOME]# ll
ls: cannot access ma: Input/output error
ls: cannot access sm: Input/output error
ls: cannot access deploy: Input/output error
total 64
d????????? ? ? ? ? ? deploy
drwx------ 2 oracle oinstall 65536 Sep 28 21:05 lost+found
d????????? ? ? ? ? ? ma
d????????? ? ? ? ? ? sm

Reason:

Doc ID 2561145.1 mentions that kernel 3.10.0-957 changed the behaviour of the d_splice_alias interface, which is used by the ACFS driver.

My kernel matches:

[root@primrac1 GG_HOME]# uname -r
3.10.0-957.21.3.el7.x86_64

Solution:

Download and apply patch 29963428 to the GI home.

[root@primrac1 29963428]# /u01/app/19.3.0/grid/OPatch/opatchauto apply -oh /u01/app/19.3.0/grid
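
Before and after applying the patch, you can sanity-check the ACFS driver state from the GI home (the path below is the one used in this environment):

# /u01/app/19.3.0/grid/bin/acfsdriverstate installed
# /u01/app/19.3.0/grid/bin/acfsdriverstate loaded
# /u01/app/19.3.0/grid/bin/acfsdriverstate version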

Resize ASM disks in AWS (FlashGrid-enabled cluster)

1. Connect to the AWS console: https://console.aws.amazon.com
2. On the left side, under the ELASTIC BLOCK STORE section, choose Volumes.
3. Choose the necessary disk -> click the Actions button -> choose Modify Volume -> change the Size.
   Please note that all data disks (not the quorum disk) in the same diskgroup must be increased, otherwise ASM will not let you have different-sized disks.

Choose the other data disks and repeat the same steps.

4. Run the following on the database nodes as the root user:

# for i in /sys/block/*/device/rescan; do echo 1 > $i; done

5. Check that the disks now show the correct sizes:

# flashgrid-node

6. Connect to the ASM instance from any database node and run:

[grid@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 19.0.0.0.0 - Production on Fri Aug 23 10:17:50 2019
Version 19.4.0.0.0
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> alter diskgroup GRID resize all; 
Diskgroup altered.
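
To confirm that ASM picked up the new sizes, you can compare the OS-reported size with the size ASM uses for each disk (a quick check from the same sysasm session; the GRID diskgroup name is the one used above):

SQL> select g.name group_name, d.name disk_name, d.os_mb, d.total_mb
       from v$asm_disk d join v$asm_diskgroup g on g.group_number = d.group_number
      where g.name = 'GRID';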

PRVF-6402 : Core file name pattern is not same on all the nodes

Problem:

Oracle 18c GI configuration prerequisite checks failed with the following error:

PRVF-6402 : Core file name pattern is not same on all the nodes. Found core filename pattern "core" on nodes "primrac1". Found core filename pattern "core.%p" on nodes "primrac2".  
- Cause:  The core file name pattern is not same on all the nodes.  
- Action:  Ensure that the mechanism for core file naming works consistently on all the nodes. Typically for Linux, the elements to look into are the contents of two files /proc/sys/kernel/core_pattern or /proc/sys/kernel/core_uses_pid. Refer OS vendor documentation for platforms AIX, HP-UX, and Solaris.

Comparing parameter values on both nodes:

[root@primrac1 ~]# cat /proc/sys/kernel/core_uses_pid
0
[root@primrac2 ~]# cat /proc/sys/kernel/core_uses_pid
1 

[root@primrac1 ~]# sysctl -a|grep core_uses_pid
kernel.core_uses_pid = 0

[root@primrac2 ~]# sysctl -a|grep core_uses_pid
kernel.core_uses_pid = 1

The strange thing was that this parameter was not defined explicitly in the sysctl.conf file, yet it still had different default values on the two nodes:

[root@primrac1 ~]# cat /etc/sysctl.conf |grep core_uses_pid
[root@primrac2 ~]# cat /etc/sysctl.conf |grep core_uses_pid 
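
Both core-dump related settings mentioned in the PRVF-6402 message can be compared across the nodes in one go (a sketch; it assumes passwordless ssh as root between the nodes):

for n in primrac1 primrac2; do
  echo "== $n"
  ssh root@$n 'cat /proc/sys/kernel/core_pattern /proc/sys/kernel/core_uses_pid'
done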

Solution:

I set the parameter to 1 explicitly in sysctl.conf on both nodes:

[root@primrac1 ~]# cat /etc/sysctl.conf |grep core_uses_pid
kernel.core_uses_pid=1 

[root@primrac2 ~]# cat /etc/sysctl.conf |grep core_uses_pid
kernel.core_uses_pid=1

[root@primrac1 ~]# sysctl -p 
[root@primrac2 ~]# sysctl -p

[root@primrac1 ~]# sysctl -a|grep core_uses_pid 
kernel.core_uses_pid = 1

[root@primrac2 ~]# sysctl -a|grep core_uses_pid 
kernel.core_uses_pid = 1

I pressed the Check Again button and the GI configuration prerequisite check succeeded.

UDEV rules for configuring ASM disks

Problem:

During my previous installations I used the following udev rule on multipath devices:

KERNEL=="dm-[0-9]*", BUS=="scsi", PROGRAM=="/sbin/scsi_id -g -u -d /dev/$parent", RESULT=="360050768028200a9a40000000000001c", NAME="oracleasm/asm-disk1", OWNER="oracle", GROUP="asmadmin", MODE="0660"

To identify the exact disk I used the PROGRAM option. The rule above runs scsi_id against the /dev/dm-* devices, and if any of them returns the expected ID, for example:

# scsi_id -gud /dev/dm-3
360050768028200a9a40000000000001c 

then the device name is changed to /dev/oracleasm/asm-disk1, the owner and group to the values given in the rule, and the permissions to 0660.

But on my new servers the same udev rule was not working anymore. (Of course, this needs more investigation, but our time is valuable and never enough; if we know another solution that works and is acceptable, let's just use it.)

Solution:

I used the udevadm command to identify other properties of these devices and wrote a new udev rule (to see all properties, just remove the grep):

# udevadm info --query=property --name /dev/mapper/asm1 | grep DM_UUID
DM_UUID=mpath-360050768028200a9a40000000000001c

The new udev rule looks like this:

# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
ENV{DM_UUID}=="mpath-360050768028200a9a40000000000001c",  SUBSYSTEM=="block", NAME="oracleasm/asm-disk1", OWNER="grid", GROUP="asmadmin", MODE="0660"

Trigger udev rules:

# udevadm trigger

Verify that the name, owner, group and permissions have changed:

# ll /dev/oracleasm/
total 0
brw-rw---- 1 grid asmadmin 253, 3 Jul 17 17:33 asm-disk1
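
If you have more than one multipath device to present to ASM, the rule file can be generated in a loop instead of typing each line by hand (a sketch, run as root; the /dev/mapper/asm1..asm3 aliases and the disk numbering are examples, adjust them to your environment):

for i in 1 2 3; do
  uuid=$(udevadm info --query=property --name /dev/mapper/asm$i | awk -F= '/^DM_UUID=/{print $2}')
  printf 'ENV{DM_UUID}=="%s", SUBSYSTEM=="block", NAME="oracleasm/asm-disk%s", OWNER="grid", GROUP="asmadmin", MODE="0660"\n' "$uuid" "$i"
done >> /etc/udev/rules.d/99-oracle-asmdevices.rules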

Detach diskgroup from 12c GI and attach to 19c GI

Task:

We have two separate Real Application Clusters, one 12c and another 19c. We decided to migrate the data from 12c to 19c by simply detaching all ASM disks from the source and attaching them to the destination.

Steps:

1. Connect to the 12c GI as the grid user and dismount the FRA diskgroup on all nodes:

[grid@rac1 ~]$ sqlplus  / as sysasm
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter diskgroup FRA dismount;
Diskgroup altered. 
[grid@rac2 ~]$ sqlplus  / as sysasm
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter diskgroup FRA dismount;
Diskgroup altered.

You can also use srvctl to stop the diskgroup on all nodes with one command, as sketched below.
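
A minimal example, run as the grid user on the 12c cluster (FRA is the diskgroup used above):

$ srvctl stop diskgroup -diskgroup FRA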

2. Detach disks belonging to the specific diskgroup from 12c cluster and attach to 19c cluster.

3. After the ASM disks are visible on the 19c cluster, connect as sysasm via the grid user and mount the diskgroup:

# Check that there is no FRA resource registered with CRS:

[root@rac1 ~]# crsctl status res -t |grep FRA

# Mount the diskgroup on all nodes

[grid@rac1 ~]$ sqlplus / as sysasm
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter diskgroup FRA mount;
Diskgroup altered.
[grid@rac2 ~]$ sqlplus / as sysasm
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter diskgroup FRA mount;
Diskgroup altered.

# FRA diskgroup resource will automatically be registered with CRS:

[root@rac1 ~]# crsctl status res -t |grep FRA
ora.FRA.dg(ora.asmgroup)

And the data will be there…

What is Flex ASM and how to check if it is enabled?

In versions prior to 12c, an ASM instance had to run on each node of the cluster. If ASM was not able to start, the database instance located on the same node was not able to come up either. There was a hard dependency between the database and ASM instances.

With Oracle Flex ASM, database instances are able to connect to a remote ASM instance over a network connection (the ASM network). If an ASM instance fails, the database instances it serves reconnect to another ASM instance on another node.

You can check whether you are using this feature with the following command:

[grid@rac1 ~]$ asmcmd
ASMCMD> showclustermode
ASM cluster : Flex mode enabled
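
You can also check how the ASM resource is configured and where the ASM instances are currently running (srvctl from the grid environment; the output is not shown here because it varies per cluster):

$ srvctl config asm
$ srvctl status asm -detail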

 

ORA-17635: failure in obtaining physical sector size for ‘+DATA’

Action:

I was trying to create an spfile on an ASM diskgroup from the standby database:

SYS @ shcat > create spfile='+DATA' from pfile='/tmp/initshcat_stby.ora';
create spfile='+DATA' from pfile='/tmp/initshcat_stby.ora'
*
ERROR at line 1:
ORA-17635: failure in obtaining physical sector size for '+DATA'
ORA-12547: TNS:lost contact
ORA-12547: TNS:lost contact

Troubleshooting:

I checked the sector size and it was OK:

SQL> select name,sector_size from v$asm_diskgroup;

NAME       SECTOR_SIZE
---------- ------------
DATA       512

I also checked whether the oracle account was able to see the ASM diskgroups in DBCA, and it was not: the diskgroup list was empty.

Causes:

There are several possible causes:

1) The file permissions on the <Grid_home>/bin/oracle executable are not set properly.

2) The oracle user is not a part of the asmdba group.

Solution: 

1)  Change permissions:

[root@stbycat ~]# chmod 6751 /u01/app/18.3.0/grid/bin/oracle

2) Add oracle to the asmdba group:

[root@stbycat ~]# usermod -g oinstall -G oper,dba,asmdba oracle
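
If the 2nd cause applies, you can verify the new membership right away (note that already-running oracle processes keep their old groups until they are restarted):

# id oracle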

In my case it was the 1st cause.

My permissions:

$ ll /u01/app/18.3.0/grid/bin/oracle
-rwxr-x--x 1 grid oinstall 413844056 Nov 4 09:14 /u01/app/18.3.0/grid/bin/oracle

Must be:

$ ll /u01/app/18.3.0/grid/bin/oracle
-rwsr-s--x 1 grid oinstall 413844056 Nov 4 08:45 /u01/app/18.3.0/grid/bin/oracle