Asbjørn Prøis wrote:
- Is the alertq table suppose to just increase and increase?
No.
$this->{log}->printlog("Alert","delete",$Log::debugging,"deleted alertqid=$this->{id}");
#$this->{dbh}->do("delete from alertq where alertqid=$this->{id}");
Yes, this is a good guess. I think Arne commented out this line for debugging, and no one remembered to remove the hash sign before the release of 3.0.0. This was fixed on March 24th, and will be part of the 3.0.1 release.
- Some alerts are missing sysname in the alerts. It shows
netbox.deviceid instead of netbox.sysname. This is very confusing for the IT guys at our faculties who use NAV to receive alerts regarding equipment at their location.
As Kristian suggested - change alertmsg.conf. Please share your changes with us, I don't think anyone is interested in device numbers in their alerts ;-)
- DNS mismatch alerts are "wrong". I.e :
xxyyzz.uio.no does not match xxyyzz.uio.no
It should say "'hostname from switch' does not match DNSname", I guess.
This is known, and has been fixed. It was a two-part problem:
1. DNS and sysname comparisons were case sensitive, leading to erroneous dnsMismatch events in some cases where people use mixed case to name their devices.
2. The error message references the DNS name of the device twice, making the alert very confusing ("X does not match X").
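To illustrate the two parts of the fix, here is a minimal sketch in Python. It is not the actual NAV code; the function names and the message wording are my own assumptions, and it only shows the idea of comparing names case-insensitively and putting both names in the alert text.

```python
from typing import Optional


def normalize_hostname(name: str) -> str:
    """Lower-case a host name and drop whitespace and a trailing root dot."""
    return name.strip().rstrip(".").lower()


def dns_mismatch_message(sysname: str, dnsname: str) -> Optional[str]:
    """Return an alert text if the two names differ, or None if they match.

    Hypothetical helper: the comparison ignores character case, and the
    message names the sysname and the DNS name separately.
    """
    if normalize_hostname(sysname) == normalize_hostname(dnsname):
        return None
    return f"sysname '{sysname}' does not match DNS name '{dnsname}'"
```

With this, a device reporting XXYYZZ.uio.no with DNS name xxyyzz.uio.no gives no alert, while a device that still reports an old domain in its sysname gives a message showing both names.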
- Why are alerts like the example in '3.' coming? We get them from
a lot of devices every 6 hours. These are more or less only Cisco Catalyst devices, if that's of importance.
One of two reasons, I guess. One is what Kristian suggested: that you may have registered the device sysname and dnsname using different character cases.
The other might be that the reported sysnames actually do not match the DNS names of the devices. This happened a lot at NTNU when all devices were moved from the ntnu.no to the nettel.ntnu.no domain - the device configurations still kept the old sysname.ntnu.no names. Whether they fixed this or just suppressed the dnsMismatch messages, I don't know.
By the way, I've made a Perl daemon that checks that all of NAV's internal and external ones are running, and reports by mail and/or SMS if something is wrong. Version two will try to restart dead processes as well. If anybody wants it, just drop me an email.
It sounds interesting, but I have two suggestions/questions:
Does it recognise that a NAV service may have been stopped or disabled on purpose by the administrator? Restarting such a service could have adverse effects...
If the daemon were implemented in Python, it could benefit from being able to directly use NAV's startstop API in the nav.startstop Python module. The startstop API is also up for some improvements for either the 3.1 or 3.2 release. One idea is to have a NAV service status page in the web interface, but this would require the Apache server to be able to act as the navcron user.
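On the first question, a rough sketch of the decision I have in mind, in Python. It deliberately does not use the nav.startstop API; the function and the three input lists are hypothetical, and how the daemon learns which services are running or disabled is left open.

```python
def services_to_restart(expected, running, disabled):
    """Return the services that should be running but are not.

    expected: names of all services the watchdog knows about
    running:  names of services currently found running
    disabled: names of services the administrator stopped on purpose

    Services in `disabled` are never returned, so the watchdog will not
    restart something that was switched off intentionally.
    """
    return sorted(set(expected) - set(running) - set(disabled))


# Placeholder service names, for illustration only.
dead = services_to_restart(
    expected=["svc_a", "svc_b", "svc_c"],
    running=["svc_a"],
    disabled=["svc_c"],
)
print(dead)  # ['svc_b']
```

The point is only that the list of intentionally disabled services has to come from somewhere the administrator controls, and that it is consulted before any restart is attempted.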