Installing linux.x64_193000_client gets java.library.path system variable is missing or invalid

Problem:

While installing linux.x64_193000_client.zip on Linux, getting the following error:

Preparing to launch Oracle Universal Installer from /tmp/OraInstall2022-06-15_07-48-19AM. Please wait ...[oracle@mariclient client]$ Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/u01/tmp
The java.library.path system variable is missing or invalid. Please set java.library.path with a correct value and retry the operation.
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class oracle.sysman.oii.oiip.oiipg.OiipgPropertyLoader
	at oracle.sysman.oii.oiip.oiipg.OiipgBootstrap.isCleanMachine(OiipgBootstrap.java:484)
	at oracle.sysman.oii.oiii.OiiiInstallAreaControl.isCleanMachine(OiiiInstallAreaControl.java:3796)

Solution:

Install the following rpm and retry:

# yum install libnsl.x86_64

Oracle 12c relink resets oracle group to oinstall, while 19c sets asmadmin

Problem:

In a mixed environment when you have 19c GI, but several versions of Oracle RDBMS home (19c, 12c), initially during 12c database restore you may encounter the following error:

ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-15040: diskgroup is incomplete

There are several reasons for the above error and one of them is when the $ORACLE_HOME/bin/oracle binary under rdbms home does not have group asmadmin. After identifying it, you try to change permissions to oracle:asmadmin and but after a while, you may notice that the group is reset to oinstall. It can happen after patching or after running relink manually. So let’s see test case:

[oracle@rac1 lib]$ ll /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle asmadmin 323613256 Feb 19 17:00 /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle

Run relink:

[oracle@rac1 lib]$ make -f ins_rdbms.mk ioracle

Check permissions again:

[oracle@rac1 lib]$ ll /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle
-rwsr-s--x 1 oracle oinstall 323613256 Feb 19 17:03 /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle

If you do the same for 19c home, group is still asmadmin.

Reason:

ins_rdbms.mk script is different in 19c and 12c, comparing ioracle section:

12c:

ioracle: preinstall $(ORACLE)
        -$(NOT_EXIST) $(ORACLE_HOME)/bin/oracle ||\
           mv -f $(ORACLE_HOME)/bin/oracle $(ORACLE_HOME)/bin/oracleO
        -mv $(ORACLE_HOME)/rdbms/lib/oracle $(ORACLE_HOME)/bin/oracle
        -chmod 6751 $(ORACLE_HOME)/bin/oracle

19c:

ioracle: preinstall $(ORACLE)
        -$(RMF) $(ORACLE_HOME)/bin/oracle
        -mv $(ORACLE_HOME)/rdbms/lib/oracle $(ORACLE_HOME)/bin/oracle
        -chmod 6751 $(ORACLE_HOME)/bin/oracle

        -(if [ ! -f $(ORACLE_HOME)/bin/crsd.bin ]; then \
            getcrshome="$(ORACLE_HOME)/srvm/admin/getcrshome" ; \
            if [ -f "$$getcrshome" ]; then \
                crshome="`$$getcrshome`"; \
                if [ -n "$$crshome" ]; then \
                    if [ $$crshome != $(ORACLE_HOME) ]; then \
                        oracle="$(ORACLE_HOME)/bin/oracle"; \
                        $$crshome/bin/setasmgidwrap oracle_binary_path=$$oracle; \
                    fi \
                fi \
            fi \
        fi\
        );

As you see ioracle section in 12c does not contain the last IF statement which sets oracle binary group to asmadmin (using setasmgidwrap)

Solution:

Unfortunately, I cannot recommend adding that if statement in 12c ins_rdbms.mk although I did it in my test environment and works as expected.
After running make -f ins_rdbms.mk ioracle for 12c the group is set to asmadmin.

It is better to ask Oracle about this change and do only after you get an approval from them.

The only non-harmful recommendation from my side would be to have permission change script and run after every patch or relink:

# chown oracle:asmadmin  /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle
‚Äč# chmod 6751  /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle
# ll   /u01/app/oracle/product/12.1.0/dbhome_1/bin/oracle

If you still decide to add IF statement as I did, please make sure there is TAB instead of spaces before IF clause. Otherwise, you will get the following error and you can easily understand which line should be corrected.

[oracle@rac1 lib]$ make -f ins_rdbms.mk ioracle
ins_rdbms.mk:120: *** missing separator (did you mean TAB instead of 8 spaces?).  Stop.

The home is not clean. This home cannot be used since there was a failed OPatch execution in this home. Use a different home to proceed

Problem:

While installing Oracle software using applyRU:

$ /u01/app/oracle/product/19.3.0/dbhome_1/runInstaller -silent -applyRU /u01/swtmp/32895426/32904851/ -responseFile /opt/rsp/db_19.3_EE_swonlyinstall.rsp -ignorePrereqFailure -waitforcompletion

Got an error:

ERROR: The home is not clean. This home cannot be used since there was a failed OPatch execution in this home. Use a different home to proceed.

Solution:

If a previous attempt to install Oracle software using applyRU or applyOneOffs option failed, the home becomes unusable. Before retrying the process, you need to clean up the home first:

$ rm -rf /u01/app/oracle/product/19.3.0/dbhome_1/*

$ unzip -o /u01/swtmp/LINUX.X64_193000_db_home.zip -d /u01/app/oracle/product/19.3.0/dbhome_1

As a best practice I always have a new Opatch, so need to unzip it again:

$ unzip -o /u01/swtmp/p6880880_190000_Linux-x86-64.zip -d /u01/app/oracle/product/19.3.0/dbhome_1

Rerun the command:

$ /u01/app/oracle/product/19.3.0/dbhome_1/runInstaller -silent -applyRU /u01/swtmp/32895426/32904851/ -responseFile /opt/rsp/db_19.3_EE_swonlyinstall.rsp -ignorePrereqFailure -waitforcompletion
Preparing the home to patch...
Applying the patch /u01/swtmp/32895426/32904851/...

After GI upgrade flashgrid-cluster shows wrong diskgroup status

Problem:

I have upgraded Oracle GI from 12c to 19c. ASM is up and diskgroups are mounted, but flashgrid-cluster still shows that diskgroups are not mounted on any node:

[root@rac1 ~]# flashgrid-cluster
FlashGrid 21.8.67.66809 #707efdc10a212421dbd3737ef8e693ae14c15964
License: Active, Marketplace
Licensee: Flashgrid Inc.
Support plan: 24x7
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
FlashGrid running: OK
Clocks check: OK
Configuration check: OK
Network check: OK

Querying nodes: rac1, rac2, racq ...

Cluster Name: mariGIMR
Cluster status: Warning
-------------------------------------------------------------
Node  Status   ASM_Node  Storage_Node  Quorum_Node  Failgroup
-------------------------------------------------------------
rac1  Warning  Yes       Yes           No           RAC1
rac2  Warning  Yes       Yes           No           RAC2
racq  Warning  No        No            Yes          RACQ
-------------------------------------------------------------
-----------------------------------------------------------------------------
GroupName  Status  Mounted  Type  TotalMiB  FreeMiB  OfflineDisks  LostDisks  Resync  ReadLocal  Vote
-----------------------------------------------------------------------------
GRID     Warning  No  N/A   N/A   N/A    N/A    N/A  N/A     N/A        N/A
-----------------------------------------------------------------------------

Solution:

After upgrade, you should restart flashgrid_asm service on all nodes, it does not require downtime:

[root@rac1 ~]# systemctl restart flashgrid_asm
[root@rac2 ~]# systemctl restart flashgrid_asm

Check the status again:

...
-----------------------------------------------------------------------------
GroupName  Status  Mounted   Type    TotalMiB  FreeMiB  OfflineDisks  LostDisks  Resync  ReadLocal  Vote
-----------------------------------------------------------------------------
GRID    Good    AllNodes  NORMAL  204800    154336   0   0  No Enabled    3/3
-----------------------------------------------------------------------------

Reduce high CPU usage by TFA

Problem:

Cluster nodes experienced high CPU usage, after investigation one of the top CPU consumers on the server has been found to be a TFA process (2nd place):

 # Fri Feb 19 17:44:01 2021
AllCPU  OneCPU  PID     User    PR      NI      STime   RSS     Name
--------------------------------------------------------------------------------
11.75%  94.02%  23895   root    20      0       17:43   87M     ora_m001_ORCL2
1.42%   11.39%  2468    root    20      0       Feb02   736M    /opt/oracle.ahf/jre/bin/java -server -Xms256m -Xmx512m -Djava.awt.headless=true -Ddisable.checkForUpdate=true -XX:HeapDumpPath=/u01/app/oracle.ahf/data/rac02/diag/tfa -XX:ParallelGCThreads=5 oracle.rat.tfa.TFAMain /opt/oracle.ahf/tfa

Workaround:

In newer version of TFA, you can set CPU resource limit.

tfactl setresourcelimit 
 [-tool tool_name] 
 [-resource resource_type] 
 [-value value]

To limit TFA to a maximum of 50% of a single CPU, run the following:

# tfactl setresourcelimit -value 0.5

For more information, please check TFA official documentation.

If you don’t have newer version of TFA, you need to upgrade it first.

asmcmd does not return error text for root user

Problem:

While running asmcmd commands via root user, error messages are not displayed. But if we run the same command via grid – it returns messages. In the following example, ORACLE_HOME is set to RDBMS home (instead of GI) and we are trying to list the file which does not exist:

[root@rac1 ~]# echo $ORACLE_HOME
/u01/app/oracle/product/19.3.0/dbhome_1

[root@rac1 ~]# echo $PATH
/u01/app/19.3.0/grid/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin

[root@rac1 ~]# asmcmd ls +demodg/orcl/tempfile/TEMP.263.10617958
<<<<<<Nothing is displayed here

When a file does not exist, ls command should display ASMCMD-8002: entry 'TEMP.263.10617958' does not exist in directory '+demodg/orcl/tempfile/' error. But in our example, it does not return anything.

In the following example, if we try to copy a file, we see a message about copying a file but actually, the file is not copied because grid user does not have permission under /u01:

[root@rac1 ~]# asmcmd cp +demodg/orcl/tempfile/TEMP.263.1061795851 /u01
copying +demodg/orcl/tempfile/TEMP.263.1061795851 -> /u01/TEMP.263.1061795851

If we run the same command via grid, we get understandable error message:

[grid@rac1 ~]$  asmcmd cp +demodg/orcl/tempfile/TEMP.263.1061795851 /u01
 ASMCMD-9463: operation failed due to lack of write permissions

Reason:

Environment variables are not set correctly. ORACLE_HOME should be pointing to GI home.

Solution:

[root@rac1 ~]# export ORACLE_HOME=/u01/app/19.3.0/grid

[root@rac1 ~]# asmcmd ls +demodg/orcl/tempfile/TEMP.263.10617958
ASMCMD-8002: entry 'TEMP.263.10617958' does not exist in directory '+demodg/orcl/tempfile/'

Upgrade the database time zone file using the DBMS_DST package in 19c

After upgrading database from 12c to 19c, you may need to upgrade database time zone file version. This step is not always mandatory, but it is recommended by pre-upgrade checker.

Useful info about its necessity from Oracle site: https://oracle-base.com/articles/misc/update-database-time-zone-file

From Oracle 11gR2 onward, new time zone files are shipped with upgrades and patches, but they are not automatically applied to the database.

Applying a change to the database time zone file not only affects the way new data is handled, but potentially alters data stored in TIMESTAMP WITH TIME ZONE columns, so you need to consider the impact of this before upgrading the time zone file.

Remember, if you only deal with dates in your country, and your country has not altered its time zone or daylight saving time policy, this upgrade may not be necessary.”

Now let’s do time zone file upgrade:

1. Check current settings:

SQL> SELECT * FROM v$timezone_file;

FILENAME		VERSION     CON_ID
-------------------- ---------- ----------
timezlrg_26.dat 	     26 	 0

2. Startup database in upgrade mode:

SQL> shutdown immediate;
SQL> startup upgrade;

3. Start upgrade window:

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
  l_tz_version PLS_INTEGER;
BEGIN
  l_tz_version := DBMS_DST.get_latest_timezone_version;

  DBMS_OUTPUT.put_line('l_tz_version=' || l_tz_version);
  DBMS_DST.begin_upgrade(l_tz_version);
END;
/


l_tz_version=32
An upgrade window has been successfully started.

PL/SQL procedure successfully completed.

4. Check primary and secondary time zone versions:

SQL> SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME; 

PROPERTY_NAME		 VALUE
------------------------ ----------
DST_PRIMARY_TT_VERSION	 32
DST_SECONDARY_TT_VERSION 26
DST_UPGRADE_STATE	 UPGRADE

5. Startup database in normal mode:

SQL> shut immediate;
SQL> startup;

6. Do the upgrade:

SQL> SET SERVEROUTPUT ON
SQL> DECLARE
  l_failures   PLS_INTEGER;
BEGIN
  DBMS_DST.upgrade_database(l_failures);
  DBMS_OUTPUT.put_line('DBMS_DST.upgrade_database : l_failures=' || l_failures);
  DBMS_DST.end_upgrade(l_failures);
  DBMS_OUTPUT.put_line('DBMS_DST.end_upgrade : l_failures=' || l_failures);
END;
/


Table list: "GSMADMIN_INTERNAL"."AQ$_CHANGE_LOG_QUEUE_TABLE_S"
Number of failures: 0
Table list: "GSMADMIN_INTERNAL"."AQ$_CHANGE_LOG_QUEUE_TABLE_L"
Number of failures: 0
Table list: "MDSYS"."SDO_DIAG_MESSAGES_TABLE"
Number of failures: 0
Table list: "DVSYS"."SIMULATION_LOG$"
Number of failures: 0
Table list: "DVSYS"."AUDIT_TRAIL$"
Number of failures: 0
DBMS_DST.upgrade_database : l_failures=0
An upgrade window has been successfully ended.
DBMS_DST.end_upgrade : l_failures=0

PL/SQL procedure successfully completed.

7. Check new settings:

SQL> SELECT * FROM v$timezone_file;

FILENAME		VERSION     CON_ID
-------------------- ---------- ----------
timezlrg_32.dat 	     32 	 0


SQL> COLUMN property_name FORMAT A30
SQL> COLUMN property_value FORMAT A20

SQL> SELECT property_name, property_value
FROM   database_properties
WHERE  property_name LIKE 'DST_%'
ORDER BY property_name;

PROPERTY_NAME		       PROPERTY_VALUE
------------------------------ --------------------
DST_PRIMARY_TT_VERSION	       32
DST_SECONDARY_TT_VERSION       0
DST_UPGRADE_STATE	       NONE

ORA-56920: a prepare or upgrade window or an on-demand or datapump-job loading of a secondary time zone data file is in an active state

Problem:

During DTS upgrade, while following the steps mentioned here, got the following error:

SQL> DECLARE
  l_tz_version PLS_INTEGER;
BEGIN
  l_tz_version := DBMS_DST.get_latest_timezone_version;
  DBMS_OUTPUT.put_line('l_tz_version=' || l_tz_version);
  DBMS_DST.begin_upgrade(l_tz_version);
END;
/ 

DECLARE
*
ERROR at line 1:
ORA-56920: a prepare or upgrade window or an on-demand or datapump-job loading
of a secondary time zone data file is in an active state
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 79
ORA-06512: at "SYS.DBMS_DST", line 1208
ORA-06512: at line 6

Solution:

The error simply means that this step is already done and you can continue with other steps, make sure that DST_UPGRADE_STATE column in the following query shows UPGRADE value.

SQL> SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

PROPERTY_NAME		 VALUE
------------------------ ----------
DST_PRIMARY_TT_VERSION	 32
DST_SECONDARY_TT_VERSION 26
DST_UPGRADE_STATE	 UPGRADE

there is no need to do anything, continue with the next DTS upgrade steps.

PRCD-1229 : An attempt to access configuration of database orcl was rejected because its version 12.2.0.1.0 differs from the program version 19.0.0.0.0

Problem:

After a manual database upgrade from 12c to 19c, I was not able to start database instance using srvctl.

[oracle@rac1 ~]$ srvctl start instance -db orcl -n rac1

PRCD-1027 : Failed to retrieve database orcl
PRCD-1229 : An attempt to access configuration of database orcl was rejected because its version 12.2.0.1.0 differs from the program version 19.0.0.0.0. Instead run the program from /u01/app/oracle/product/12.2.0/dbhome_1.

Solution:

Use srvctl upgrade from new home:

$ /u01/app/oracle/product/19.3.0/dbhome_1/bin/srvctl upgrade database -d orcl -oraclehome /u01/app/oracle/product/19.3.0/dbhome_1

Try to start again:

$ srvctl start instance -db orcl -n rac1

ORA-39701: database must be mounted EXCLUSIVE for UPGRADE or DOWNGRADE

Problem:

In cluster environment, I was not able to start database in upgrade mode:

SQL> startup upgrade

ORACLE instance started.
Total System Global Area 1996486272 bytes
Fixed Size		    8898176 bytes
Variable Size		  704643072 bytes
Database Buffers	 1275068416 bytes
Redo Buffers		    7876608 bytes
Database mounted.
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-39701: database must be mounted EXCLUSIVE for UPGRADE or DOWNGRADE

The following also did not work:

SQL> startup mount exclusive

ORACLE instance started.
Total System Global Area 1996486272 bytes
Fixed Size		    8898176 bytes
Variable Size		  704643072 bytes
Database Buffers	 1275068416 bytes
Redo Buffers		    7876608 bytes
Database mounted.

SQL> alter database open upgrade;
alter database open upgrade
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-39701: database must be mounted EXCLUSIVE for UPGRADE or DOWNGRADE

Solution:

Change parameter cluster_database to FALSE, and startup in upgrade mode:

$ sqlplus / as sysdba

SQL> startup nomount;

SQL> alter system set cluster_database=FALSE scope=spfile sid='*';

SQL> shutdown immediate;

SQL> startup upgrade

ORACLE instance started.
Total System Global Area 1996486272 bytes
Fixed Size		    8898176 bytes
Variable Size		  704643072 bytes
Database Buffers	 1275068416 bytes
Redo Buffers		    7876608 bytes
Database mounted.
Database opened.

After finishing your work, don’t forget to return cluster_database parameter to TRUE and restart your database:

SQL> alter system set cluster_database=TRUE scope=spfile sid='*';