I used the time periods already defined in NAV; I didn't make my own. I have added G01 to all of them, and these time periods are in my active profile.
Where do I see if all my NAV backend processes are running?
pping.log says that 3 hosts are currently down. This is not the case: only 1 IP device is currently down, and the NAV GUI shows this correctly. I'm not sure which other two devices NAV thinks are down.
Regarding eventEngine, that is a folder with three log files:
One is called eventEngine-stderr.log.2008-11-03-1309.log and the output is:
    Device not found, trying DB update: 52
    Device not found, trying DB update: 54
    Device not found, trying DB update: 170
Another one is called eventEngine.stderr.log and contains:
    Device not found, trying DB update: 171
The last log file is called eventEngine.stdout.log and is empty.
alertengine.log contains:
    PID: /var/lib/nav/run/alertengine.pid
    Fri Oct 31 14:03:42 2008 alertEngine Log-3-printlog: Level not defined: Engine shutdownConstruct Got signal :TERM:! nice shutdown.
    PID: /var/lib/nav/run/alertengine.pid
    PID: /var/lib/nav/run/alertengine.pid
If you can see any errors here, please get back to me and tell me how to correct them.
Regards, Lene Maria
Morten Brekkevold wrote:
On Mon, 3 Nov 2008 15:54:04 +0100 (CET) lene.myhre@item.ntnu.no wrote:
I can't get NAV to send alerts when IP devices are down, neither by e-mail nor by SMS. I have added G01: All alerts to the profile, and my account is allowed to receive SMS. Gammu is also installed. What am I doing wrong? I also tried adding my own filter group with different filters, but nothing worked.
Is your G01 subscription in an active time period of your active profile? Are all your NAV backend processes running (nav status)?
When an IP device goes down, three things happen in NAV:
1. pping receives no ping response from the device, and posts a boxState start event on NAV's event queue.
2. eventEngine picks up the boxState event from the queue, and tries to figure out what to do with it. Typically, it will wait for one minute to see if pping can re-establish contact with the device (this is to prevent alert spamming when the network or device is flapping). After that minute has passed, it will post a boxDownWarning alert to NAV's alert queue. After three more minutes have passed without any contact with the device, eventEngine will post a boxDown alert to NAV's alert queue, and the device's down status is registered permanently in NAV's history.
3. AlertEngine receives the boxDownWarning on the alert queue, interprets the alert profiles of the individual users and decides who will receive the alert, and in what medium (email, sms). Later, it receives the boxDown alert and does the same.
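To make the timing concrete, here is a minimal Python sketch of the delays described above. It is not eventEngine's actual code: the constant and function names are made up for illustration, the delays are the typical values mentioned above, and it assumes the device stays unreachable the whole time.

```python
from datetime import timedelta

# Typical delays as described above; the names are hypothetical.
WARNING_DELAY = timedelta(minutes=1)               # grace period before boxDownWarning
DOWN_DELAY = WARNING_DELAY + timedelta(minutes=3)  # boxDown follows three minutes later


def alerts_posted(elapsed):
    """List the alerts posted so far for a device that has been
    unreachable for `elapsed` (a timedelta) without interruption."""
    alerts = []
    if elapsed >= WARNING_DELAY:
        alerts.append("boxDownWarning")
    if elapsed >= DOWN_DELAY:
        alerts.append("boxDown")
    return alerts


if __name__ == "__main__":
    for minutes in (0.5, 2, 5):
        print(minutes, "min down:", alerts_posted(timedelta(minutes=minutes)))
```

In other words, if a device has been down for less than about a minute, no alert should be expected yet, and the boxDown alert should only appear after roughly four minutes.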
If you are having problems getting alerts through, you should monitor the log files of these three processes. First pping.log, to confirm that it cannot ping the device. Then eventEngine.log, to confirm that eventEngine receives the event and dispatches an alert. Then finally alertengine.log, to confirm that the AlertEngine receives the alert and correctly finds that you should receive a copy of it via email or SMS.
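As a rough aid for that check, the following Python sketch searches the three logs, in pipeline order, for lines mentioning one device. The log directory /var/log/nav is an assumption and may differ on your installation, and the function names are hypothetical. If eventEngine's log is a folder of several files rather than a single file, the script reports that and you would point it at the files inside instead.

```python
from pathlib import Path

# Assumed log location; adjust to match your NAV installation.
LOG_DIR = Path("/var/log/nav")
# Checked in pipeline order: pping, then eventEngine, then alertengine.
LOG_FILES = ["pping.log", "eventEngine.log", "alertengine.log"]


def matching_lines(lines, needle):
    """Return the lines that mention `needle` (e.g. an IP address or device name)."""
    return [line.rstrip("\n") for line in lines if needle in line]


def check_logs(needle, log_dir=LOG_DIR):
    """Print how often `needle` appears in each log, with the last few hits."""
    for name in LOG_FILES:
        path = Path(log_dir) / name
        if not path.is_file():
            print(f"{name}: no regular file at {path}")
            continue
        with path.open(errors="replace") as handle:
            hits = matching_lines(handle, needle)
        print(f"{name}: {len(hits)} matching line(s)")
        for line in hits[-5:]:
            print("    " + line)


if __name__ == "__main__":
    check_logs("10.0.0.1")  # placeholder address; use the device that is down
```

The first log in the list that shows no trace of the device is the most likely place where the event or alert was lost.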