On Tue, 2 Aug 2022 07:25:04 +0000 Andrea Verni <Andrea.Verni@u-blox.com> wrote:
> This fix seems to have been working with version 5.3. Since we upgraded to 5.4 we have stopped receiving emails, and we are not sure whether the destination email address is blacklisted or something else is going wrong. Even after restarting the alertengine process or the entire NAV VM, we no longer get any messages in /var/log/nav/alertengine.
That sounds pretty strange. Are you able to verify that the alertengine process is actually running on your system?
If you stop the background process and run `alertengine.py --test`, is there any output to the terminal?
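Something along these lines should tell you (a sketch; I'm assuming the daemon is managed through NAV's own `nav` process control command on your install):

    nav status                # check whether alertengine is listed as running
    nav stop alertengine      # stop the background daemon
    alertengine.py --test     # run once in the foreground and watch the terminal output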
> The last entries in the rotated alertengine log say something like:
> 2022-06-23 09:01:50,572 [WARNING] [nav.alertengine.alertaddress.send] Not sending alert 15496 xxx.yyy@domain.com as handler Email is blacklisted: [Errno 110] Connection timed out
Well, before it stopped logging, it seems your problem was still connection trouble with your SMTP server; if that goes undetected, you may end up with a large backlog of unsent alerts from NAV.
Have you configured NAV with an external mail relay server? If you are having repeated issues with talking to this SMTP server, I would suggest that you consider installing an MTA on your NAV server, and configuring NAV to use localhost as its SMTP server. The local MTA should at least queue outgoing mail and retry when the external server isn't responding.
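If you go that route, pointing NAV at the local MTA should just be a matter of adjusting the mail settings. Something like this (a sketch; if memory serves, these option names are read from nav.conf, but verify against the configuration reference for your NAV version):

    # in nav.conf - send all NAV mail through the local MTA
    EMAIL_HOST=localhost
    EMAIL_PORT=25

Then restart NAV's processes so the change takes effect.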
> The postgres alertqmsg table contains 19101 messages - not sure if this is supposed to be empty once emails are sent or not:
It should be empty when there are no more notifications to send out (whether the notifications go via e-mail, SMS, Jabber, Slack etc. is irrelevant).
> nav=# select count(*) from alertqmsg;
>  count
> -------
>  19101
> (1 row)
> While the alertq table contains 9550 rows.
That is a pretty hefty amount of unprocessed alerts. However, this doesn't necessarily mean that there are 9550 e-mails waiting in a queue: each of these 9550 alerts must still match some user's alert profile to result in an actual e-mail or other notification. (Of course, if you are subscribed to *everything*, you're in for a ride.)
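If you want a feel for what is actually sitting in the queue before deciding what to do with it, a quick breakdown by event type can help (a sketch; this assumes the stock NAV schema, where alertq has an eventtypeid column):

    nav=# SELECT eventtypeid, count(*) FROM alertq GROUP BY eventtypeid ORDER BY count(*) DESC;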
> What would be the best way forward here? Is there any way to clean up old messages that have not been sent and get the alertengine working again?
There is a way, but it's not necessarily user friendly.
You could simply issue the SQL statement `DELETE FROM alertq;`. This would normally completely empty the queue, no questions asked.
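If you'd rather not drop everything, you could delete only the oldest entries, and wrap the statement in a transaction so you can sanity-check the result before it becomes permanent (a sketch; it assumes the stock schema, where alertq has a time column, and that alertqmsg rows are cascade-deleted along with their parent alerts - verify both on your install):

    BEGIN;
    DELETE FROM alertq WHERE time < now() - interval '30 days';
    SELECT count(*) FROM alertq;  -- sanity check
    COMMIT;                       -- or ROLLBACK; if the numbers look wrong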
(I thought I remembered there was an easy command line switch for alertengine to truncate the queue, but it seems I mixed it up with the SMS daemon, which has such a thing specifically for the SMS message queue)
> Thanks to anyone who can help me 😊
Sorry I couldn't help you sooner, I've been away on vacation for most of August.