Morten,
Thanks for the quick reply.
I'll have to look up who is responsible for maintaining the FreeBSD port of nav and see if I can figure out where things stand - maybe I can lend a hand (or at least some time here). I know that on FreeBSD v7.x the port is listed as "broken" (i.e., it won't compile for some reason). I prefer to use the standard ports install, just for 'sanity' reasons, but I may pull the latest source and go through it. I keep the "ports" hierarchy up to date (weekly automation, plus manual updates each day that I'm working on something, to make sure that any patches/updates are included). Unfortunately, I don't know Java or Python - just Perl and PHP. :-(
RE: import - superb! - Thanks!
I started looking through some of our cron catches and found the following errors (numerous instances of each):
-----
cricket/collect-subtrees normal: Could not read /usr/local/cricket/subtree-sets file
-----
getBoksMacs.sh: createConnection ClassNotFoundException error: org.postgresql.Driver
-----
I would think that the driver is in the jar, which is set for all user profiles and in the "navcron" crontab (includes postgresql.jar):
CLASSPATH=/usr/local/nav/lib/java/ConfigParser.jar:/usr/local/nav/lib/java/Database.jar:/usr/local/nav/lib/java/Event.jar:/usr/local/nav/lib/java/Logger.jar:/usr/local/nav/lib/java/NetboxInfo.jar:/usr/local/nav/lib/java/SimpleSnmp.jar:/usr/local/nav/lib/java/Util.jar:/usr/local/share/java/classes/postgresql.jar:/usr/local/share/java/classes/snmp.jar
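One way to test that assumption is to open every CLASSPATH entry and look for the driver class directly. Below is a minimal Python 3 sketch of that check. The function name find_class_in_classpath is made up for illustration, and it only inspects the CLASSPATH of the environment it runs in, so it would need to run as navcron (or with the crontab's CLASSPATH exported) to say anything about the cron job.

```python
import os
import zipfile


def find_class_in_classpath(classpath, class_name):
    """Return (hits, missing).

    hits:    classpath entries (jars or directories) that contain class_name
    missing: classpath entries that do not exist on disk
    """
    # org.postgresql.Driver -> org/postgresql/Driver.class
    entry_name = class_name.replace(".", "/") + ".class"
    hits, missing = [], []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if not os.path.exists(entry):
            missing.append(entry)
        elif os.path.isdir(entry):
            # A directory entry holds loose .class files.
            if os.path.isfile(os.path.join(entry, *entry_name.split("/"))):
                hits.append(entry)
        elif zipfile.is_zipfile(entry):
            # A jar is a zip archive; look the class up by its internal path.
            with zipfile.ZipFile(entry) as jar:
                if entry_name in jar.namelist():
                    hits.append(entry)
    return hits, missing


if __name__ == "__main__":
    hits, missing = find_class_in_classpath(
        os.environ.get("CLASSPATH", ""), "org.postgresql.Driver"
    )
    print("found in:", hits if hits else "nowhere")
    for path in missing:
        print("missing on disk:", path)
```

If the class shows up in postgresql.jar here but the cron job still fails, the likely suspect is that the job is not actually seeing this CLASSPATH (for example, a wrapper script resetting it), though that is a guess rather than something the output above proves.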
From what I read on the website, it would appear that the configuration of cricket is done by nav? As for the log file, this is the strange part: the file isn't even being created. The permissions are such that navcron (user navcron, group nav) owns the directory hierarchy, so that shouldn't be an issue. Hence I've been a bit blind in troubleshooting the whole thing. Here's the 'ls -lR' output on the logging path for nav (I deleted a bunch of lines from the networkDiscovery path - no need to chew up useless bits):
-------------------------------------------
$$> ls -lR
total 3296
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 arnold
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 eventEngine
-rw-r--r--  1 navcron  nav     6774 Apr 25 12:56 getBoksMacs.log
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 getDeviceData
-rw-r--r--  1 navcron  nav        0 Apr 24 15:30 maintengine.log
drwxrwsr-x  2 navcron  nav     3072 Apr 25 13:05 networkDiscovery
-rw-rw-r--  1 navcron  nav  3025984 Apr 25 13:05 pping.log
-rw-r--r--  1 navcron  nav   239382 Apr 25 13:05 smsd.log
-rw-r--r--  1 navcron  nav    46020 Apr 25 13:05 thresholdMon.log

./arnold:
total 0

./eventEngine:
total 0

./getDeviceData:
total 0

./networkDiscovery:
total 58
-rw-r--r--  1 navcron  nav  69 Apr 25 12:38 networkDiscovery-stderr.log
-rw-r--r--  1 navcron  nav  69 Apr 25 10:38 networkDiscovery-stderr.log.2008-04-25-1135.log
-rw-r--r--  1 navcron  nav  69 Apr 25 11:35 networkDiscovery-stderr.log.2008-04-25-1138.log
-rw-r--r--  1 navcron  nav  69 Apr 25 11:38 networkDiscovery-stderr.log.2008-04-25-1235.log
-rw-r--r--  1 navcron  nav  69 Apr 25 12:35 networkDiscovery-stderr.log.2008-04-25-1238.log
-rw-r--r--  1 navcron  nav  38 Apr 25 12:35 networkDiscovery-topology.html
-rw-r--r--  1 navcron  nav  38 Apr 25 12:38 networkDiscovery-vlan.html
-------------------------------------------
BTW - pping starts immediately each and every time. Servicemon doesn't. Right now I'm running a simple "while true" shell statement and it's gone through 30+ iterations - the output remains the same:
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping thresholdMon
Failed: alertengine eventengine getDeviceData servicemon smsd
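For reference, the retry loop can be made to stop on its own once nothing is reported as failed. The following is a rough Python 3 sketch, not anything shipped with nav. It assumes "nav start" prints exactly the "Starting: ... Failed: ..." format shown above, and the names parse_nav_start and retry_nav_start are hypothetical.

```python
import subprocess
import time


def parse_nav_start(output):
    """Split 'Starting: a b Failed: c d' text into (started, failed) lists."""
    result = {"Starting": [], "Failed": []}
    current = None
    for token in output.split():
        label = token.rstrip(":")
        if token.endswith(":") and label in result:
            current = label  # switch to the list named by this label
        elif current is not None:
            result[current].append(token)
    return result["Starting"], result["Failed"]


def retry_nav_start(max_tries=30, delay=5):
    """Run 'nav start' until no service is listed as failed.

    Returns the attempt number that succeeded, or None after max_tries.
    Caveat: empty or unexpected output also parses as "nothing failed".
    """
    for attempt in range(1, max_tries + 1):
        proc = subprocess.run(
            ["nav", "start"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # assumption: either stream may carry the report
            universal_newlines=True,
        )
        _, failed = parse_nav_start(proc.stdout)
        if not failed:
            return attempt
        time.sleep(delay)
    return None
```

Given that 30+ iterations produced the same failures here, retrying alone is clearly not enough in this case; the parser is mainly useful for logging which services fail on each pass.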
From the look of the alertengine.cfg, all the log levels are enabled. Is there a way I can get more detailed information out of these to determine what is happening?
Thanks!
-----Original Message-----
From: Morten Brekkevold [mailto:morten.brekkevold@uninett.no]
Sent: Friday, April 25, 2008 2:34 AM
To: nav-users@uninett.no
Cc: scorpion7@iqonline.net
Subject: Re: Services
On Thu, 24 Apr 2008 21:48:30 +0200 (CEST) scorpion7@iqonline.net wrote:
location, etc.) it appears that some services still aren't starting:
$$> nav start
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping smsd thresholdMon
Failed: alertengine eventengine getDeviceData servicemon
I've already done some research and had to hand-tool some .py files, as there were errors for "import sre as re" and python stated that it was deprecated, i.e., it should be "import re" (sre is now re).
That old skeleton... I've just committed a changeset to my local repository to fix those imports for good.
I'm not seeing any log data (even with debug=6 in nav.conf) for the failed services. Does anyone have information that may lead to this being more successful?
The debug setting in nav.conf only applies to the Java subsystems (as I think the comment in nav.conf says). This applies to eventengine and getDeviceData. If these two fail on startup, they might not get the chance to log normally to (eventEngine|getDeviceData).log, but their stderr output will be placed in log/eventEngine/eventEngine-stderr.log and log/getDeviceData/getDeviceData-stderr.log. Those files should tell you why they won't start.
The alertEngine loglevel setting is in etc/alertengine.cfg. It has two log files, alertengine.log and alertengine.err. The latter logs whatever Perl exceptions and errors occur during alertEngine's runtime.
servicemon (and pping) are sort of notorious for having startup problems. We don't know exactly what's going on with them yet, but usually it helps to just keep "nav start"-ing them a couple of times, until they stick ;-)
BTW - we utilize Apache 2.x with AD authentication for internal resources. I've seen some errors with python and the web pages as well - is there an expectation that it would be Apache 1.x instead?
No, we absolutely recommend Apache 2.x. I think you'll only get an _old_ mod_python to work with 1.x.
Thanks!
You're welcome!
--
Best regards,
Morten Brekkevold
UNINETT