Thanks for the kind reply!
Right now all our active switches are registered in NAV. There are about 130 2510-48 units and a couple of 2600-8, 2610, 2626, 2650, 2810 and 2824. The core switches are two 5406 and a single 5412. Traffic data is collected from the switches correctly, and statistics are presented fine via Cricket. Alerts, like firmware upgrade and cold start, are all picked up by Alert Manager as they should be.
Topology right now is totally missing, but luckily I caught a hint of topology, with a single 2824 in the middle and one 2626, one 2650 and a 2510 as the neighbors, before my colleague, sorry to say, updated to the latest firmware. The magic surely was that the 2824 had a complete CDP and LLDP implementation in the old firmware, while all the new HP firmware has only a crippled receive-only CDP implementation. The missing topology is rather crucial, since we can be flooded by SMS and mails in case a box at the top fails.
MAC information is collected, so it is possible to find the MAC address belonging to an IP address, but the MAC is not associated with any switch port, so we cannot trace MACs or use Arnold.
Service monitoring is rather important for us, since our server people need a better indication of a functioning server than a simple ping. Needless to say, a process can fail while the server is still pingable. Right now we are monitoring DNS, HTTP, IMAP, SMB, SMTP with success. The only service we are missing, for the time being, is RADIUS.
If I can help the developers with any tests, I would be more than happy to do it!
Thanks again! Istvan
Morten Brekkevold morten.brekkevold@uninett.no 10/15/09 11:14 AM >>>
On Fri, 02 Oct 2009 12:12:41 +0200 "Istvan Bernath" ibe@life.ku.dk wrote:
Sorry about the inconvenience!
I've been on the sidelines in the past 3 years, receiving all messages from the nav-users list and running a maintained and updated test installation all that time.
To my pleasure this time it will be official, and the Faculty of Life Sciences, Copenhagen University gives NAV a chance as the global monitoring tool of all the new HP 2510 switches our net is now upgraded to.
Music to my ears. Although I'm curious as to how NAV will perform on an all HP network. We've had some problems with HP in the past. Not all of them are completely behind us, but they hopefully will be in NAV 3.6.
Monitoring of several services will also be crucial.
NAV's service monitoring capabilities have been somewhat downplayed by UNINETT, due to lack of resources. We've left maintenance of those systems in the capable hands of the University of Tromsø. Hopefully, their developers can answer your questions, as I don't know radius or the service monitor all that well. I've prodded one of their guys to look at your servicemon question.
I do, though, have problems monitoring the RADIUS service, and both topology and MAC behind a port do not work with the new HP ProCurve switches.
Your topology problems are intriguing. Are you experiencing a total lack of topology information, or just plain wrong topology? I've had reports of the latter from other HP-heavy installations.
Are you able to search any MAC data for any of your switches?
I hope somebody can help me rectify the last problems.
It might take some Q&A, but hopefully we can be of assistance :)
ibe@life.ku.dk said:
Right now we are monitoring DNS, HTTP, IMAP, SMB, SMTP with success. The only service we are missing, for the time being, is RADIUS.
Perhaps you are missing a couple of (undocumented?) dependencies? The Radius service monitor depends on 'freeradius' and 'pyrad'. Our experience is also that /usr/local/etc/raddb/ must be readable for all (default on FreeBSD is 700)
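The permission point above can be checked quickly. This is only a minimal sketch using the Python standard library; the function name `readable_by_all` is made up for illustration, and the path is simply the one mentioned above (adjust it to wherever your raddb directory lives).

```python
import os
import stat


def readable_by_all(path):
    """Return True if 'others' may read path (and enter it, if it is a directory)."""
    mode = os.stat(path).st_mode
    needed = stat.S_IROTH
    if stat.S_ISDIR(mode):
        # Listing a directory needs read, using files inside it needs execute.
        needed |= stat.S_IXOTH
    return (mode & needed) == needed


if __name__ == "__main__":
    raddb = "/usr/local/etc/raddb"  # path taken from the message above
    if os.path.exists(raddb):
        print(raddb, "readable by all:", readable_by_all(raddb))
    else:
        print(raddb, "does not exist on this host")
```

A directory with mode 700 makes this return False, while 755 makes it return True.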
--Ingeborg
On Thu, 15 Oct 2009 13:06:20 +0200 "Istvan Bernath" ibe@life.ku.dk wrote:
Topology right now is totally missing, but luckily I caught a hint of topology, with a single 2824 in the middle and one 2626, one 2650 and a 2510 as the neighbors, before my colleague, sorry to say, updated to the latest firmware. The magic surely was that the 2824 had a complete CDP and LLDP implementation in the old firmware, while all the new HP firmware has only a crippled receive-only CDP implementation.
The switches' forwarding tables are collected by the getBoksMacs program (nav start mactrace). Do you see any error messages in your getBoksMacs.log file? It might also help to make sure that getBoksMacs is logging at debug level (level 6, see nav.conf).
NAV builds a table of neighbor candidates. It would be nice to know some statistics from this table. Could you access the PostgreSQL database (sudo -u postgres psql nav) and show the output of the following queries:
SELECT COUNT(*) FROM swp_netbox;
SELECT COUNT(DISTINCT netboxid) FROM swp_netbox;
Right now we are monitoring DNS, HTTP, IMAP, SMB, SMTP with success. The only service we are missing, for the time being, is RADIUS.
Thinking about it now, I remember discovering a while back that there are some complicated issues surrounding the RADIUS service checker.
I remember discussing this with the original author of the service checker plugin. He also happens to be the Debian maintainer for the python-pyrad package.
The RADIUS service checker uses python-pyrad to talk to a RADIUS server. pyrad is dependent on a RADIUS service dictionary to have any meaningful conversation with a RADIUS server. Unfortunately, a default dictionary isn't provided with the python-pyrad package.
Dictionaries are usually provided by your radius server. I see the freeradius Debian package provides a multitude of dictionary files, linked together by a single one in /etc/freeradius/dictionary .
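Since the top-level dictionary only links the others together, it can be worth confirming that every linked file is actually present. The following is a rough sketch, assuming the FreeRADIUS convention of `$INCLUDE <file>` lines with relative paths resolved against the including file's directory; the function name `resolve_includes` is hypothetical and not part of NAV or pyrad.

```python
import os


def resolve_includes(path, seen=None):
    """Return (found, missing) lists of dictionary files reachable from path."""
    seen = set() if seen is None else seen
    found, missing = [], []
    path = os.path.abspath(path)
    if path in seen:  # guard against include loops
        return found, missing
    seen.add(path)
    if not os.path.isfile(path):
        return found, [path]
    found.append(path)
    with open(path, errors="replace") as handle:
        for line in handle:
            # Drop comments, then look for "$INCLUDE <file>".
            fields = line.split("#", 1)[0].split()
            if len(fields) >= 2 and fields[0] == "$INCLUDE":
                child = os.path.join(os.path.dirname(path), fields[1])
                sub_found, sub_missing = resolve_includes(child, seen)
                found.extend(sub_found)
                missing.extend(sub_missing)
    return found, missing


if __name__ == "__main__":
    found, missing = resolve_includes("/etc/freeradius/dictionary")
    print(len(found), "dictionary files found;", len(missing), "missing")
    for name in missing:
        print("missing:", name)
```

An unreadable file raises a PermissionError here rather than being reported as missing, which at least makes a permission problem visible.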
The location of a dictionary file needs to be specified in the options when setting up the RADIUS service checker. I see the plugin itself documents the following:
Arguments:
----------
hostname  : Accessible from self.getAddress() as pure FQDN hostname
port      : Remote udp-port where radius authentication is living. Port 1812 is default for authentication.
username  : A valid radius-username
password  : Clear-text password associated with the username above.
identifier: Each "client-source" connects to radius with a given identity and secret.
rad_secret: Password associated with 'identifier'
dictionary: Path to filename which holds the dictionary for this radius-daemon. The default-dictionary can be used, or a specific dictionary for a specific implementation of the radius-server.
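To test the same set of options outside NAV, something along these lines might help. It is a sketch only, not the NAV plugin's own code: it assumes a recent pyrad on Python 3 (where the shared secret is passed as bytes), and the helper names `missing_options` and `radius_login_ok` are invented for this example. The option names are the ones from the plugin documentation above.

```python
REQUIRED = ("hostname", "port", "username", "password",
            "identifier", "rad_secret", "dictionary")


def missing_options(options):
    """Return the documented option names that are absent or empty."""
    return [name for name in REQUIRED if not options.get(name)]


def radius_login_ok(options):
    """Send one Access-Request and report whether the server accepted it."""
    # Imported here so the option check above works without pyrad installed.
    from pyrad.client import Client
    from pyrad.dictionary import Dictionary
    import pyrad.packet

    client = Client(server=options["hostname"],
                    authport=int(options["port"]),
                    secret=options["rad_secret"].encode(),
                    dict=Dictionary(options["dictionary"]))
    request = client.CreateAuthPacket(code=pyrad.packet.AccessRequest,
                                      User_Name=options["username"],
                                      NAS_Identifier=options["identifier"])
    request["User-Password"] = request.PwCrypt(options["password"])
    reply = client.SendPacket(request)  # raises pyrad.client.Timeout if no answer
    return reply.code == pyrad.packet.AccessAccept
```

Calling `missing_options()` first gives a quick hint when, for instance, the dictionary path was left out; `radius_login_ok()` then needs a reachable RADIUS server and a dictionary file that pyrad can parse.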
I hope this information will help you to get it working.