sshd: /etc/ssh/sshd_config: Permission denied

Problem:

sshd and chronyd services on the database server were in a failed state and not able to start because of the permission problem on their configuration files. Permissions on these files were correct and services should have been able to start, so there was something else… let’s dig into the details.

# systemctl status sshd
 â sshd.service - OpenSSH server daemon
    Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: enabled)
    Active: activating (auto-restart) (Result: exit-code) since Tue 2019-07-09 12:21:49 UTC; 32s ago
      Docs: man:sshd(8)
            man:sshd_config(5)
   Process: 124026 ExecStart=/usr/sbin/sshd -D $OPTIONS (code=exited, status=1/FAILURE)
Main PID: 124026 (code=exited, status=1/FAILURE)
Jul 09 12:21:49 node03 systemd[1]: Failed to start OpenSSH server daemon.
Jul 09 12:21:49 node03 systemd[1]: Unit sshd.service entered failed state.
Jul 09 12:21:49 node03 systemd[1]: sshd.service failed

`journalctl -xe` shows:

-- Unit sshd.service has begun starting up.
Jul 09 12:26:03 node03 sshd[129121]: /etc/ssh/sshd_config: Permission denied
Jul 09 12:26:03 node03 systemd[1]: sshd.service: main process exited, code=exited, status=1/FAILURE
Jul 09 12:26:03 node03 systemd[1]: Failed to start OpenSSH server daemon.
-- Subject: Unit sshd.service has failed

The same problem was happening with chronyd service. It was claiming about /etc/chrony.conf file. Incorrect time on database servers can cause node evictions.

Reason:

If permissions on these files are correct, we can think about SELinux, let’s check:

# getenforce 
Enforcing

Solution:

Disable SELinux and reboot the server:

# vim /etc/selinux/config
SELINUX=disabled

# reboot

Summary:

I consider SELinux as a non-desirable service on the database servers. But I appreciate opinion of my colleages/friends and I want to share it with you.

SELinux can be enabled with the correct config in RHEL 4,5,6 – “Starting with Oracle Database 11g Release 2 (11.2), the Security Enhanced Linux (SELinux) feature is supported for Oracle Linux 4, Oracle Linux 5, Oracle Linux 6, Red Hat Enterprise Linux 4, Red Hat Enterprise Linux 5, and Red Hat Enterprise Linux 6.
https://docs.oracle.com/cd/E11882_01/install.112/e47689/pre_install.htm#LADBI1092

SELinux is a good security tool and usually I only disable it as a last resort or if the software doesn’t support it.

Detach diskgroup from 12c GI and attach to 19c GI

Task:

We have two separate Real Application Clusters, one 12c and another 19c. We decided to migrate data from 12c to 19c by simply detaching all ASM disks from the source and attaching to the destination.

Steps:

1. Connect to the 12c GI via grid user and dismount FRA diskgroup on all nodes:

[grid@rac1 ~]$ sqlplus  / as sysasm
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter diskgroup FRA dismount;
Diskgroup altered. 
[grid@rac2 ~]$ sqlplus  / as sysasm
Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
SQL> alter diskgroup FRA dismount;
Diskgroup altered.

You can also use srvctl to stop the diskgroup with one command.

2. Detach disks belonging to the specific diskgroup from 12c cluster and attach to 19c cluster.

3. After ASM disks are visible on 19c cluster, connect as sysasm via grid user and mount the diskgroup:

# Check that there is no FRA resource registered with CRS:

[root@rac1 ~]# crsctl status res -t |grep FRA

# Mount the diskgroup on all nodes

[grid@rac1 ~]$ sqlplus / as sysasm
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter diskgroup FRA mount;
Diskgroup altered.
[grid@rac2 ~]$ sqlplus / as sysasm
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> alter diskgroup FRA mount;
Diskgroup altered.

# FRA diskgroup resource will automatically be registered with CRS:

[root@rac1 ~]# crsctl status res -t |grep FRA
ora.FRA.dg(ora.asmgroup)

And data will be there…

Upgrading/Installing TFA with OSWatcher

The whole process is very simple and straightforward.
Post seems big but most of the content is a command output.

1. Download TFA Collector – TFA with Database Support Tools Bundle from Doc ID 1513912.1

2. Place downloaded zip file on rac1 and unzip it:

# cd /u01/app/sw
# ll
…
-rw-r--r-- 1 root root      264751391 Apr 25 19:18 TFA-LINUX_v19.2.1

# unzip TFA-LINUX_v19.2.1 

3. Install TFA:

[root@rac1 sw]# ./installTFA-LINUX 

TFA Installation Log will be written to File : /tmp/tfa_install_21556_2019_06_03-10_39_10.log
Starting TFA installation
 TFA Version: 192100 Build Date: 201904251105
 TFA HOME : /u01/app/12.2.0/grid/tfa/rac1/tfa_home
 Installed Build Version: 184100 Build Date: 201902262137
 TFA is already installed. Upgrading TFA
 TFA Upgrade Log : /u01/app/12.2.0/grid/tfa/rac1/tfapatch.log
 TFA will be upgraded on : 
 rac1
 rac2
 Do you want to continue with TFA Upgrade ? [Y|N] [Y]: Y
 Checking for ssh equivalency in rac2
 Node rac2 is not configured for ssh user equivalency
 SSH is not configured on these nodes : 
 rac2
 Do you want to configure SSH on these nodes ? [Y|N] [Y]: N
 Patching remote nodes using TFA Installer /u01/app/sw/installTFA-LINUX…
 Copying TFA Installer to rac2…
 lost connection
 Starting TFA Installer on rac2…
 Upgrading TFA on rac1 :
 Stopping TFA Support Tools…
 Shutting down TFA for Patching…
 Shutting down TFA
 Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
 Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
 Successfully shutdown TFA..
 No Berkeley DB upgrade required
 Copying TFA Certificates…
 Starting TFA in rac1…
 Starting TFA..
 Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
 Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
 Waiting up to 100 seconds for TFA to be started..
 . . . . . 
 Successfully started TFA Process..
 . . . . . 
 TFA Started and listening for commands
 Enabling Access for Non-root Users on rac1…
 Connection refused!rac2
 RemoteUtil : Connection refused!rac2
 .------------------------------------------------------------.
 | Host | TFA Version | TFA Build ID         | Upgrade Status |
 +------+-------------+----------------------+----------------+
 | rac1 |  19.2.1.0.0 | 19210020190425110550 | UPGRADED       |
 | rac2 | -           | -                    | NOT UPGRADED   |
 '------+-------------+----------------------+----------------'

The reason why it did not upgrade on rac2, is that I did not have ssh equivalency between nodes for root user.

I could enable ssh passwordless authentication and TFA would be upgraded in one step, but because of the security I will not enable it and just manually install TFA on the second node:

4. Copy installation file to rac2 and install:

[root@rac2 ~]# mkdir -p /u01/app/sw/
[root@rac2 ~]# chmod -R 777 /u01/app/sw/
[root@rac2 ~]# su - oracle
[oracle@rac2 ~]$ cd /u01/app/sw/  
[oracle@rac2 sw]$ scp rac1:/u01/app/sw/installTFA-LINUX .
installTFA-LINUX                           100%  254MB 114.9MB/s   00:02 
[root@rac2 sw]# ./installTFA-LINUX 
 TFA Installation Log will be written to File : /tmp/tfa_install_15370_2019_06_03-10_50_05.log
 Starting TFA installation
 TFA Version: 192100 Build Date: 201904251105
 TFA HOME : /u01/app/12.2.0/grid/tfa/rac2/tfa_home
 Installed Build Version: 184100 Build Date: 201902262137
 TFA is already installed. Upgrading TFA
 TFA Upgrade Log : /u01/app/12.2.0/grid/tfa/rac2/tfapatch.log
 TFA-00002 Oracle Trace File Analyzer (TFA) is not running
 TFA-00002 Oracle Trace File Analyzer (TFA) is not running
 Unable to determine the status of TFA in other nodes.
 TFA will be upgraded on Node rac2:
 Do you want to continue with TFA Upgrade ? [Y|N] [Y]: 
 Upgrading TFA on rac2 :
 Stopping TFA Support Tools…
 Shutting down TFA for Patching…
 Shutting down TFA
 Removed symlink /etc/systemd/system/multi-user.target.wants/oracle-tfa.service.
 Removed symlink /etc/systemd/system/graphical.target.wants/oracle-tfa.service.
 . . . . . 
 . . . 
 Successfully shutdown TFA..
 No Berkeley DB upgrade required
 Copying TFA Certificates…
 Starting TFA in rac2…
 Starting TFA..
 Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
 Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
 Waiting up to 100 seconds for TFA to be started..
 . . . . . 
 Successfully started TFA Process..
 . . . . . 
 TFA Started and listening for commands
 Enabling Access for Non-root Users on rac2…
 .------------------------------------------------------------.
 | Host | TFA Version | TFA Build ID         | Upgrade Status |
 +------+-------------+----------------------+----------------+
 | rac2 |  19.2.1.0.0 | 19210020190425110550 | UPGRADED       |
 | rac1 |  19.2.1.0.0 | 19210020190425110550 | UPGRADED       |
 '------+-------------+----------------------+----------------'

5. Stop and Start TFA on rac1 and rac2:

# tfactl stop
Sending stoptfa
Success
Stopping TFA from the Command Line
Nothing to do !
LCM is not running
TFA is running  - Will wait 5 seconds (up to 3 times)  
TFA-00104 Cannot establish connection with TFA Server. Please check TFA Certificates
Killing TFA running with pid 16627
. . . 
Successfully stopped TFA..

# tfactl start
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
Successfully started TFA Process..

# ps -ef|grep OSWatcher
root     14528     1  0 May30 ?        00:44:11 /bin/sh ./OSWatcher.sh
root     14850 14528  0 May30 ?        00:00:58 /bin/sh ./OSWatcherFM.sh 48 /home/fg/oswbb/archive

6. Check OSWatcher repository location:

# ll /home/fg/oswbb/archive

drwxr-xr-x 2 root root  202 May 30 22:00 oswcpuinfo
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswifconfig
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswiostat
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswmeminfo
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswmpstat
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswnetstat
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswnfsiostat
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswpidstat
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswpidstatd
drwxr-xr-x 2 root root    6 May 30 21:56 oswprvtnet
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswps
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswslabinfo
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswtop
drwxr-xr-x 2 root root 4096 Jun  3 11:00 oswvmstat
drwxr-xr-x 2 root root    6 May 30 21:56 oswxentop

JDBC THIN CONNECTIONS OVER SCAN AND NAT FAIL WITH ORA-12516

Problem:

JDBC 11g thin clients are not able to connect to the database over SCAN and NAT. JDBC OCI and SQLPLUS connections work fine. JDBC connections to local listener work fine, only connections to SCAN listener fail.

For visibility, I will use sql developer with 11g JDBC thin driver. Actually, the problem happens on all applications using that version of JDBC.

Cause:

SQL*Plus and OCI clients use a redirect count to inform the SCAN listener about a redirected connection, but JDBC 11g and below use a lower version of NS , which does not keep a redirect count and connections are rejected.

Solution:

There are several solutions:

1.  Upgrade to JDBC 12C
2. Apply Patch 17284368 to the JDBC Thin Driver

The first option may not be acceptable for you, because it requires some changes on application side. Let’s describe how to apply patch to the JDBC thin driver:

1. Download and unzip patch file on database node(s):

$ unzip p17284368_112040_Generic.zip

2. Apply patch on db node(s):

$ cd 17284368/ 
$ $ORACLE_HOME/OPatch/opatch apply

Oracle Interim Patch Installer version 11.2.0.3.20
Copyright (c) 2019, Oracle Corporation. All rights reserved.
Oracle Home : /u01/app/oracle/product/11.2.0/dbhome_1
Central Inventory : /u01/app/oraInventory
from : /u01/app/oracle/product/11.2.0/dbhome_1/oraInst.loc
OPatch version : 11.2.0.3.20
OUI version : 11.2.0.4.0
Log file location : /u01/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2019-02-22_20-26-57PM_1.log
Verifying environment and performing prerequisite checks…
OPatch continues with these patches: 17284368
Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.
Problem with accessing response file of "/u01/app/oracle/product/11.2.0/dbhome_1/ccr/bin/setupCCR".
Backing up files…
Applying interim patch '17284368' to OH '/u01/app/oracle/product/11.2.0/dbhome_1'
Patching component oracle.dbjava.ic, 11.2.0.4.0…
Patching component oracle.dbjava.jdbc, 11.2.0.4.0…
Patch 17284368 successfully applied.
Log file location: /u01/app/oracle/product/11.2.0/dbhome_1/cfgtoollogs/opatch/opatch2019-02-22_20-26-57PM_1.log
OPatch succeeded.

3. Copy patched $ORACLE_HOME/jdbc/lib/ojdbc5.jar to client side.

4. Restart the application:

Instead of patching database side and then copying patched ojdbc to client side, you can manually patch ojdbc thin driver without touching database.

Please consider that, manual method is for testing purposes only and is explained in the next post.

GGSCI: error while loading shared libraries: libnnz18.so: cannot open shared object file: No such file or directory

Problem:

Oracle GoldenGate Software Command Interface (GGSCI) 18c raises error:

$ /GG_HOME/home_1/ggsci
/GG_HOME/home_1/ggsci: error while loading shared libraries: libnnz18.so: cannot open shared object file: No such file or directory

Cause: 

GoldenGate put all necessary shared objects (.so files) under its own home directory, but several shared objects may not be found there. They should be borrowed from RDBMS or GI home.

Solution:

# find / -name libnnz18.so
/u01/app/18.3.0/grid/lib/libnnz18.so
/u01/app/18.3.0/grid/inventory/Scripts/ext/lib/libnnz18.so

Create symbolic link of the specidifed library in the Golden Gate home:

$ cd /GG_HOME/home_1
$ ln -s /u01/app/18.3.0/grid/lib/libnnz18.so libnnz18.so

Rerun ggsci.

 

MGTCA-1176/MGTCA-1162 : An error occurred while marking the Cluster Manifest File as expired.

If you are installing “Oracle Member Cluster for Oracle Database” and during the installation GIRM configuration assistant fails with the following error:

MGTCA-1176 : An error occurred while marking the Cluster Manifest File as expired.
MGTCA-1162 : failed to add a property to the provided Cluster Manifest File

Just give the following permission to the manifest file , to let the installer make changes there:

chmod 777 manifest.xml

There is no useful info about that on the internet and metalink! I guessed it myself.  That’s why posting that simple solution here.

DNS/NIS name service prereq failed

I was configuring an Oracle Member cluster for Databases. This type of cluster was introduced in 12.2. It has a lot of benefits. To see more information and installation steps for this type of cluster visit Oracle doc site.

Visually it looks like the following:

Domain_Cluster_arch

So during the installtion prerequisite check was complaining about “DNS/NIS name service”

DNS_NIS_Error

If you press details of the error you get a huge message. I won’t paste it’s content here, because it is something like bla bla… bla. Coud not find any meaningfull information there and lost too much time.

I seached on the Internet , still not found any good info.  Then started trying everything that came to my mind and one of them solved it.

So my entries about scan looked like the following:

Scan

Added domain name at the end:

Scan_correct

And my check looks like the following:

Prereq_OK

Please note that there may be several cases when this type of failure appeares. I showed you one of them.

Good Luck!