Gmail blocks emails from Postfix client on Linux

Problem:

I want to send email notifications to my Gmail account from a Linux server using the Postfix client. The mails are not received, and /var/log/maillog is full of the following error messages:

Aug 18 17:24:29 rac1 postfix/smtp[17580]: connect to gmail-smtp-in.l.google.com[74.125.69.27]:25: Connection timed out
Aug 18 17:24:29 rac1 postfix/smtp[17580]: connect to gmail-smtp-in.l.google.com[2607:f8b0:4001:c0d::1a]:25: Network is unreachable
Aug 18 17:24:59 rac1 postfix/smtp[17580]: connect to alt1.gmail-smtp-in.l.google.com[173.194.77.27]:25: Connection timed out
Aug 18 17:25:29 rac1 postfix/smtp[17580]: connect to alt2.gmail-smtp-in.l.google.com[173.194.219.27]:25: Connection timed out
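All of these failures happen while opening the TCP connection to port 25, before any SMTP dialogue starts. That is often a sign that outbound port 25 is filtered somewhere on the path. To confirm the pattern on a busy log, the connection errors can be tallied per port and reason. This is a minimal sketch that assumes the log line format shown above; the function name summarize_smtp_connect_errors is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: count Postfix "connect to ..." failures per port and reason.
# Reads maillog lines on stdin and assumes the line format shown above, e.g.
#   ... postfix/smtp[17580]: connect to host[addr]:25: Connection timed out
summarize_smtp_connect_errors() {
    # Keep only the port and the error text, then count identical pairs.
    sed -n 's|.*postfix/smtp\[[0-9]*]: connect to [^ ]*]:\([0-9]*\): \(.*\)$|port \1: \2|p' |
        sort | uniq -c | sed 's/^ *//'
}
```

Usage: summarize_smtp_connect_errors < /var/log/maillog. For the four lines above it reports three "Connection timed out" entries and one "Network is unreachable" entry, all on port 25.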

Solution:

Configure Postfix to relay outgoing mail through smtp.gmail.com on port 587, authenticating with a Gmail App Password.

1. Confirm that the myhostname parameter is configured with your server’s FQDN:

# grep ^myhostname /etc/postfix/main.cf
myhostname = rac1.example.com

2. Generate an App Password for Postfix:

In your Google Account, click App passwords -> Select app dropdown -> choose Other (custom name) -> enter “Postfix” -> click GENERATE.

The app password for Postfix is shown in a yellow box. Copy and save it; it replaces the generated_password_goes_here placeholder in the next step.

3. Enter the SMTP host, username, and password in /etc/postfix/sasl_passwd:

# cat /etc/postfix/sasl_passwd
smtp.gmail.com your_username@gmail.com:generated_password_goes_here

4. Create the hash database file:

# postmap /etc/postfix/sasl_passwd

5. Configure the Postfix Relay Server:

# grep ^relayhost /etc/postfix/main.cf
relayhost = [smtp.gmail.com]:587

6. To enable authentication, add the following parameters to /etc/postfix/main.cf:

smtp_sasl_auth_enable = yes
smtpd_tls_auth_only = yes
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_sasl_security_options = noanonymous
smtp_tls_security_level = encrypt

7. Reload the Postfix service:

# systemctl reload postfix

8. To send a test email, I use the FlashGrid tool:

[root@rac1 ~]# flashgrid-node test-alerts
FlashGrid 21.2.24.58935 #bb6005e9d66650d1996184c38d2fb8a2a78420a8
License: Active, Marketplace
Licensee: Flashgrid Inc.
Support plan: 24x7
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Test alerts were sent

The alert is now received.
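Two parts of the procedure above are easy to get wrong. The file from step 3 stores a password in plain text, so it should not be readable by other users. A parameter from steps 5 and 6 can also be silently missing from main.cf. The following is a minimal sketch of two helpers for these cases; the function names write_sasl_passwd and missing_relay_settings are hypothetical, and the second one only checks the client-side (smtp_ and relayhost) parameters:

```shell
#!/bin/sh
# Hypothetical helper: write a Postfix SASL password map readable only by its owner.
# Arguments: target file, relay host key, username, password.
write_sasl_passwd() {
    file=$1 relay=$2 user=$3 pass=$4
    # Create the file under umask 077 so it is never world-readable, even briefly.
    (umask 077 && printf '%s %s:%s\n' "$relay" "$user" "$pass" > "$file") || return 1
    # Also tighten permissions in case the file already existed.
    chmod 600 "$file"
}

# Hypothetical helper: print each client-side parameter from steps 5 and 6
# that has no "name = value" line in the given main.cf-style file.
missing_relay_settings() {
    file=$1
    for key in relayhost smtp_sasl_auth_enable smtp_sasl_password_maps \
               smtp_sasl_security_options smtp_tls_security_level; do
        grep -q "^$key[[:space:]]*=" "$file" || echo "$key"
    done
}
```

As root, this could be used as write_sasl_passwd /etc/postfix/sasl_passwd smtp.gmail.com your_username@gmail.com generated_password_goes_here, followed by postmap as in step 4. Running missing_relay_settings /etc/postfix/main.cf prints nothing when all five parameters are present.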

Flashgrid CHM and Basic Troubleshooting Part 3

In this tutorial, we gracefully stop one node in a 2-node cluster using the flashgrid-node stop command.

The command is useful when you are doing maintenance on one node, e.g. upgrading the OS or the kernel.

ora.evmd and ora.mdnsd fail to start when http_proxy is set to https://

Problem:

After setting http_proxy to an https URL (export http_proxy=https://test) and then stopping and starting CRS, I got the following error:

CRS-2883: Resource 'ora.evmd' failed during Clusterware stack start.
CRS-4406: Oracle High Availability Services synchronous start failed.
CRS-41053: checking Oracle Grid Infrastructure for file permission issues
PRVG-2031 : Owner of file "/u01/app/19.3.0/grid/bin/CommonSetup.pm" did not match the expected value on node "rac1". [Expected = "root(0)" ; Found = "grid(3002)"]
....
PRVG-2031 : Owner of file "/u01/app/19.3.0/grid/lib/libnl19.a" did not match the expected value on node "rac1". [Expected = "root(0)" ; Found = "grid(3002)"]
CRS-4000: Command Start failed, or completed with errors.

Even after unsetting http_proxy, CRS could be neither started nor stopped:

[root@rac1 ~]# crsctl start crs -wait
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.

[root@rac1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2679: Attempting to clean 'ora.mdnsd' on 'rac1'
CRS-2679: Attempting to clean 'ora.gpnpd' on 'rac1'
CRS-2679: Attempting to clean 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'rac1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2680: Clean of 'ora.evmd' on 'rac1' failed
CRS-2680: Clean of 'ora.gpnpd' on 'rac1' failed
CRS-2680: Clean of 'ora.mdnsd' on 'rac1' failed
CRS-2799: Failed to shut down resource 'ora.evmd' on 'rac1'
CRS-2799: Failed to shut down resource 'ora.gpnpd' on 'rac1'
CRS-2799: Failed to shut down resource 'ora.mdnsd' on 'rac1'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors

So an https value in the http_proxy variable left CRS in a state where it could not even be stopped.

Solution:

The solution is simple: find the processes that were started during the previous attempt and kill them (be careful not to kill anything that was not started from the GI home):

[root@rac1 ~]# ps -ef|grep d.bin
root      1817     1  0 05:12 ?        00:00:01 /opt/flashgrid/bin/flashgrid_aio_srv
root      1821     1  0 05:12 ?        00:00:06 /opt/flashgrid/bin/flashgrid_target_srv
root      1824     1  0 05:12 ?        00:00:13 /opt/flashgrid/bin/flashgrid_initiator_srv
grid      1832     1  0 05:12 ?        00:00:04 /opt/flashgrid/bin/flashgrid_asm_srv
root      1845     1  0 05:12 ?        00:00:06 /opt/flashgrid/bin/flashgrid_cluster_srv
root      1879     1  0 05:12 ?        00:00:02 /opt/flashgrid/bin/flashgrid_iamback
root      1881     1  0 05:12 ?        00:00:00 /opt/flashgrid/bin/flashgrid_diskwatch
root      1884     1  0 05:12 ?        00:00:00 /opt/flashgrid/bin/flashgrid_reconstruct
root     10228 13775  0 05:43 pts/0    00:00:00 grep --color=auto d.bin
root     20305     1  2 05:16 ?        00:00:33 /u01/app/19.3.0/grid/bin/ohasd.bin reboot _ORA_BLOCKING_STACK_LOCALE=AMERICAN_AMERICA.US7ASCII
root     20631     1  0 05:16 ?        00:00:05 /u01/app/19.3.0/grid/bin/orarootagent.bin

[root@rac1 ~]# kill -9 20305 20631

[root@rac1 ~]# ps -ef|grep d.bin
root      1817     1  0 05:12 ?        00:00:01 /opt/flashgrid/bin/flashgrid_aio_srv
root      1821     1  0 05:12 ?        00:00:06 /opt/flashgrid/bin/flashgrid_target_srv
root      1824     1  0 05:12 ?        00:00:13 /opt/flashgrid/bin/flashgrid_initiator_srv
grid      1832     1  0 05:12 ?        00:00:04 /opt/flashgrid/bin/flashgrid_asm_srv
root      1845     1  0 05:12 ?        00:00:06 /opt/flashgrid/bin/flashgrid_cluster_srv
root      1879     1  0 05:12 ?        00:00:02 /opt/flashgrid/bin/flashgrid_iamback
root      1881     1  0 05:12 ?        00:00:00 /opt/flashgrid/bin/flashgrid_diskwatch
root      1884     1  0 05:12 ?        00:00:00 /opt/flashgrid/bin/flashgrid_reconstruct
root     10296 13775  0 05:43 pts/0    00:00:00 grep --color=auto d.bin
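The pattern d.bin also matches every path that contains d/bin, because the unescaped dot matches any character. That is why the FlashGrid services show up in the listing above. Filtering on the GI home path is a safer way to build the kill list. This is a minimal sketch; the function name gi_pids is hypothetical, and the GI home is the one from this example:

```shell
#!/bin/sh
# Hypothetical helper: print PIDs of processes whose executable lives under the GI home.
# Reads "PID COMMAND ARGS..." lines on stdin, e.g. from: ps -e -o pid= -o args=
# Argument: the GI home directory (no trailing slash).
gi_pids() {
    gi_home=$1
    awk -v home="$gi_home/bin/" '
        # $2 is the executable path; keep the PID only if that path
        # starts with the bin directory of the GI home.
        index($2, home) == 1 { print $1 }
    '
}
```

Usage: ps -e -o pid= -o args= | gi_pids /u01/app/19.3.0/grid. Review the printed PIDs before passing them to kill.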

Make sure http_proxy is either unset or set to an http:// value instead of an https:// one:

[root@rac1 ~]# unset http_proxy

[root@rac1 ~]# echo $http_proxy

Or

[root@rac1 ~]# export http_proxy=http://test
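To avoid repeating the problem, the variable can be checked before every CRS start. This is a minimal sketch of such a pre-flight check; the function name proxy_value_ok is hypothetical, and it accepts only the two cases that worked in this example (an empty value or an http:// value):

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the given proxy value is empty
# or starts with http://, the two cases that worked in this example.
proxy_value_ok() {
    case $1 in
        ''|http://*) return 0 ;;
        *)           return 1 ;;
    esac
}
```

Usage: proxy_value_ok "${http_proxy:-}" || echo "fix http_proxy before starting CRS".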

Try to start CRS now:

[root@rac1 ~]# crsctl start crs -wait
CRS-4123: Starting Oracle High Availability Services-managed resources
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crf' on 'rac1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.crf' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.storage' on 'rac1'
CRS-2676: Start of 'ora.storage' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
CRS-6017: Processing resource auto-start for servers: rac1
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2672: Attempting to start 'ora.chad' on 'rac1'
CRS-2672: Attempting to start 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'rac2'
CRS-2677: Stop of 'ora.scan1.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.scan1.vip' on 'rac1'
CRS-2676: Start of 'ora.chad' on 'rac1' succeeded
CRS-2676: Start of 'ora.scan1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN1.lsnr' on 'rac1'
CRS-2676: Start of 'ora.LISTENER_SCAN1.lsnr' on 'rac1' succeeded
CRS-2676: Start of 'ora.ons' on 'rac1' succeeded
CRS-6016: Resource auto-start has completed for server rac1
CRS-6024: Completed start of Oracle Cluster Ready Services-managed resources
CRS-4123: Oracle High Availability Services has been started.

DPI-1030: unable to get or set error structure for thread local storage

Problem:

The flashgrid-cluster command showed that the disk groups were not mounted, although they were successfully mounted on all nodes.

Reason:

GI was upgraded, and FlashGrid was not able to reconnect to ASM.

Solution:

Restart the flashgrid_asm service. Note that this does not cause any downtime and is safe to run during business hours:

# systemctl restart flashgrid_asm.service