ORA-15477: cannot communicate with the volume driver (DBD ERROR: OCIStmtExecute)

Problem:

I had a GI Standalone installation, which I deconfigured and then reconfigured as a one-node RAC; that part was successful. I then tried to create an ACFS volume, which failed with ORA-15477:

[root@host1 dbs]# asmcmd volcreate -G OGG -s 10G ACFSGG
ORA-15032: not all alterations performed
ORA-15477: cannot communicate with the volume driver (DBD ERROR: OCIStmtExecute)

Reason:

It seems the ACFS/ADVM modules are not loaded:

[root@host1 dbs]# lsmod | grep oracle
oracleacfs           5921415  0
oracleadvm           1236257  0
oracleoks             750688  2 oracleacfs,oracleadvm

Solution:

I will first share two possible solutions that helped others but not me, and then a third one that did help me:

1. Start the modules manually and make sure they are enabled:
# acfsload start
# acfsload enable

Check that the modules are loaded using lsmod | grep oracle, then retry the volume creation.
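The lsmod check can also be scripted. Below is a minimal sketch, not an Oracle tool: missing_oracle_modules is a helper name I made up, and it assumes the three driver names shown in the lsmod listing earlier (oracleoks, oracleadvm, oracleacfs). It reads lsmod output from stdin and prints the drivers that are absent.

```shell
#!/usr/bin/env bash
# Sketch: report which of the three Oracle drivers are absent from lsmod output.
# missing_oracle_modules is a hypothetical helper name, not part of Grid Infrastructure.
missing_oracle_modules() {
    local lsmod_output missing="" mod
    lsmod_output=$(cat)   # read the lsmod output from stdin
    for mod in oracleoks oracleadvm oracleacfs; do
        # The module name is the first column; the "used by" column is ignored on purpose.
        if ! printf '%s\n' "$lsmod_output" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            missing="$missing $mod"
        fi
    done
    # Print the missing modules separated by spaces (an empty line means none are missing).
    printf '%s\n' "${missing# }"
}
```

On a real host it would be called as lsmod | missing_oracle_modules; an empty line in the output means all three drivers are loaded.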

2. Reinstall the ACFS/ADVM modules manually:

[root@host1 dbs]# acfsroot install
ACFS-9300: ADVM/ACFS distribution files found.
ACFS-9314: Removing previous ADVM/ACFS installation.
depmod: ERROR: fstatat(6, uds.ko): No such file or directory
depmod: ERROR: fstatat(6, kvdo.ko): No such file or directory
ACFS-9315: Previous ADVM/ACFS components successfully removed.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9307: Installing requested ADVM/ACFS software.
ACFS-9294: updating file /etc/sysconfig/oracledrivers.conf
ACFS-9308: Loading installed ADVM/ACFS drivers.
ACFS-9321: Creating udev for ADVM/ACFS.
ACFS-9323: Creating module dependencies - this may take some time.
depmod: ERROR: fstatat(6, uds.ko): No such file or directory
depmod: ERROR: fstatat(6, kvdo.ko): No such file or directory
ACFS-9390: The command 'echo '/lib/modules/3.10.0-862.el7.x86_64/extra/usm/oracleadvm.ko
/lib/modules/3.10.0-862.el7.x86_64/extra/usm/oracleoks.ko
/lib/modules/3.10.0-862.el7.x86_64/extra/usm/oracleacfs.ko
' | /sbin/weak-modules --no-initramfs --add-modules 3.10.0-1127.18.2.el7.x86_64 2>&1 |' returned unexpected output that may be important for system configuration:
depmod: ERROR: fstatat(6, kvdo.ko): No such file or directory

depmod: ERROR: fstatat(6, uds.ko): No such file or directory

depmod: ERROR: fstatat(6, uds.ko): No such file or directory

depmod: ERROR: fstatat(6, kvdo.ko): No such file or directory

ACFS-9154: Loading 'oracleoks.ko' driver.
ACFS-9154: Loading 'oracleadvm.ko' driver.
ACFS-9154: Loading 'oracleacfs.ko' driver.
ACFS-9327: Verifying ADVM/ACFS devices.
ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
ACFS-9156: Detecting control device '/dev/ofsctl'.
ACFS-9309: ADVM/ACFS installation correctness verified.

Retry volume creation.

If none of the above helps, try the third solution (I did not find it anywhere on the internet; it was my own idea):

3. Rebuild initramfs

[root@host1 ~]# cp -p /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
[root@host1 ~]# dracut -f
[root@host1 ~]# reboot

After the restart, you should be able to create the volume.

ACFS-05913: unable to contact the standby node stbyrac1

Problem:

I was trying to set up ACFS replication, where one of the steps is to validate the keys using acfsutil. The validation failed with the ACFS-05913 error:

[root@rac1 .ssh]# acfsutil repl info -c -u oggrepl stbyrac1 stbyrac2 /GG
acfsutil repl info: ACFS-05913: unable to contact the standby node stbyrac1
acfsutil repl info: ACFS-05913: unable to contact the standby node stbyrac2

Cause: 

An attempt to use the ping utility to contact a standby node failed.

Solution:

Enable ICMP traffic between these nodes and retry validation:

[root@rac1 .ssh]# acfsutil repl info -c -u oggrepl stbyrac1 stbyrac2 /GG
A valid 'ssh' connection was detected for standby node stbyrac1 as user oggrepl.
A valid 'ssh' connection was detected for standby node stbyrac2 as user oggrepl.
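Before rerunning the validation, you can check ICMP reachability yourself. This is only a sketch: unreachable_nodes is a hypothetical helper name, the -W timeout flag assumes the Linux iputils ping, and it tests ICMP only from the node where you run it, so it does not replace the acfsutil check.

```shell
#!/usr/bin/env bash
# Sketch: print every standby node that does not answer a single ICMP echo request.
# unreachable_nodes is a hypothetical helper name, not part of acfsutil.
unreachable_nodes() {
    local node
    for node in "$@"; do
        # -c 1: send one echo request; -W 2: wait at most 2 seconds for a reply
        # (flags as in Linux iputils ping; other ping implementations may differ).
        if ! ping -c 1 -W 2 "$node" >/dev/null 2>&1; then
            printf '%s\n' "$node"
        fi
    done
}
```

With the node names from this post, the call would be unreachable_nodes stbyrac1 stbyrac2; no output means both nodes answered.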

srvctl start filesystem hangs

The title of this post is general: there can be many reasons why srvctl start filesystem hangs. This post covers only one of them.

Problem:

I created an ACFS volume and added it to srvctl:

$ asmcmd volcreate -G OGG -s 10G ACFSGG
# srvctl add filesystem -device /dev/asm/acfsgg-11 -path /GG_HOME -volume acfsgg -diskgroup OGG -user oracle -fstype ACFS

then tried to start the filesystem using:

# srvctl start filesystem -device /dev/asm/acfsgg-11

This command hung.

Troubleshooting:

I checked the logs in the trace folder under the GI base, but could not find any clue. Even worse, stopping the filesystem also hung.

But let’s stop here: the file I should have checked really was there, but I missed it and looked at the wrong files. The file that shows the relevant error is named mount_<process id>.trc and is located in the trace folder. So instead of mounting the filesystem manually to see the error, you can just open mount_<process id>.trc and see the reason there.

Then I tried mounting the filesystem manually, without srvctl:
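To avoid opening the trace files one by one, a grep over them could look like the sketch below. Two assumptions here: acfs_errors_in_traces is a helper name I made up, and I assume the trace file contains the error in the usual ACFS-nnnnn form. The trace directory is passed as an argument because its exact path depends on your GI base.

```shell
#!/usr/bin/env bash
# Sketch: list ACFS-nnnnn messages found in mount_<process id>.trc files.
# acfs_errors_in_traces is a hypothetical helper; pass your own trace directory.
acfs_errors_in_traces() {
    local trace_dir="$1"
    # -H prints the file name in front of every matching line;
    # stderr is silenced in case no mount_*.trc file exists yet.
    grep -H 'ACFS-[0-9]' "$trace_dir"/mount_*.trc 2>/dev/null
}
```

The function returns a non-zero status when nothing matches, so it can also be used in an if condition.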

[root@stbyrac1 trace]# /bin/mount -t acfs  /dev/asm/acfsgg-11 /GG_HOME
mount.acfs: ACFS-03037: not an ACFS file system

I saw the error, which explained what was happening: my volume was not formatted with the ACFS filesystem. Somehow I had missed that step on the standby cluster, so it was just human error, but srvctl should at least have said so instead of hanging and putting the information only in a trace file.

Solution:

Format the ACFS volume:

[root@stbyrac1 trace]# mkfs -t acfs /dev/asm/acfsgg-11
mkfs.acfs: version                   = 19.0.0.0.0
mkfs.acfs: on-disk version           = 46.0
mkfs.acfs: volume                    = /dev/asm/acfsgg-11
mkfs.acfs: volume size               = 10737418240  (  10.00 GB )
mkfs.acfs: Format complete.

Because the start and stop operations hang, you need to mount the filesystem manually on all database nodes:

[root@stbyrac1 ~]# /bin/mount -t acfs  /dev/asm/acfsgg-11 /GG_HOME
[root@stbyrac1 ~]# /bin/mount -t acfs  /dev/asm/acfsgg-11 /GG_HOME

Now try to stop and start the filesystem to make sure srvctl can do its job without any manual intervention:

[root@stbyrac1 ~]# srvctl stop filesystem -device /dev/asm/acfsgg-11
[root@stbyrac1 ~]# srvctl start filesystem -device /dev/asm/acfsgg-11
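After the srvctl start returns, it is worth confirming that the filesystem is really mounted. A small sketch, assuming the mount shows up in /proc/mounts with the filesystem type acfs (the same type passed to mount -t above); is_acfs_mounted is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a sample file.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if MOUNTPOINT is listed as an acfs mount.
# Usage: is_acfs_mounted MOUNTPOINT [MOUNTS_FILE]
# is_acfs_mounted is a hypothetical helper; MOUNTS_FILE defaults to /proc/mounts.
is_acfs_mounted() {
    # /proc/mounts columns: device, mount point, filesystem type, options, ...
    awk -v mp="$1" '$2 == mp && $3 == "acfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}
```

For the filesystem in this post the check would be is_acfs_mounted /GG_HOME && echo mounted.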

How to identify whether the OS is Oracle Linux or RHEL?

There are several ways to identify this. I will suggest one of them: rpm -qf, which reports the package a file belongs to:

Oracle Linux:

# rpm -qf /etc/redhat-release
oraclelinux-release-7.8-1.0.7.el7.x86_64

RHEL:

# rpm -qf /etc/redhat-release
redhat-release-server-7.8-2.el7.x86_64
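If you need this decision inside a script, the package name can be matched by its prefix. This is a sketch based only on the two outputs above: classify_release_package is a name I made up, and other releases or distributions may ship differently named packages, in which case it prints unknown.

```shell
#!/usr/bin/env bash
# Sketch: map the package that owns /etc/redhat-release to a distribution name.
# classify_release_package is a hypothetical helper; prefixes are taken from the two outputs above.
classify_release_package() {
    case "$1" in
        oraclelinux-release*) echo "Oracle Linux" ;;
        redhat-release*)      echo "RHEL" ;;
        *)                    echo "unknown" ;;
    esac
}
```

On a live system it would be called as classify_release_package "$(rpm -qf /etc/redhat-release)".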