Good Evenin',
Being new to nav..... Installation is on FreeBSD v6.x. Performed initial install from the 'ports' collection. After following instructions on basic configuration (included setup of Postgres, setting up a couple devices, room, location, etc.) it appears that some services still aren't starting:
$$> nav start
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping smsd thresholdMon
Failed: alertengine eventengine getDeviceData servicemon
I've already done some research and had to hand-tool some .py files as there were errors for "import sre as re" and python stated that it was deprecated, ie: it should be "import re" (sre is now re)
I'm not seeing any log data (even with debug=6 in nav.conf) for the failed services. Does anyone have information that may lead to this being more successful? BTW - we utilize Apache 2.x with AD authentication for internal resources. I've seen some errors with python and the web pages as well - is there an expectation that it would be Apache 1.x instead?
Here are some specifics:
OS: FreeBSD v6.x (Microsoft/BSD shop)
Java: Diablo JDK-1.5.0
Tomcat: 6.0.14
Cricket: 1.0.5_4
nav: 3.2.2
Python: 2.5.2
Apache: 2.2.8
ant: 1.7.0
Thanks!
On Thu, 24 Apr 2008 21:48:30 +0200 (CEST) scorpion7@iqonline.net wrote:
location, etc.) it appears that some services still aren't starting:
$$> nav start
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping smsd thresholdMon
Failed: alertengine eventengine getDeviceData servicemon
I've already done some research and had to hand-tool some .py files as there were errors for "import sre as re" and python stated that it was deprecated, ie: it should be "import re" (sre is now re)
That old skeleton... I've just committed a changeset to my local repository to fix those imports for good.
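For anyone patching an older install by hand in the meantime, the same fix can be scripted. The following is a minimal sketch, not part of NAV: the helper name fix_sre_import is made up for illustration, and it only rewrites lines that consist solely of "import sre as re", leaving everything else untouched.

```python
import re

# Matches a whole line that is exactly "import sre as re", allowing
# leading indentation and trailing spaces or tabs.
_SRE_IMPORT = re.compile(r"^([ \t]*)import\s+sre\s+as\s+re[ \t]*$", re.MULTILINE)


def fix_sre_import(source):
    """Return source text with 'import sre as re' lines replaced by 'import re'.

    Hypothetical helper for illustration; the indentation of the original
    line is preserved.
    """
    return _SRE_IMPORT.sub(r"\1import re", source)
```

Read a .py file, pass its contents through fix_sre_import, and write the result back; keeping a backup copy first would be prudent.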
I'm not seeing any log data (even with debug=6 in nav.conf) for the failed services. Does anyone have information that may lead to this being more successful?
The debug setting in nav.conf only applies to the Java subsystems (as I think the comment in nav.conf says). This applies to eventengine and getDeviceData. If these two fail on startup, they might not get the chance to log normally to (eventEngine|getDeviceData).log, but their stderr output will be placed in log/eventEngine/eventEngine-stderr.log and log/getDeviceData/getDeviceData-stderr.log. Those files should tell you why they won't start.
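A quick way to look at those stderr files is sketched below. This is only an illustration: log_root is an assumption (point it at wherever your install keeps the NAV log directory), and tail_stderr_logs is a hypothetical helper, not a NAV tool. It reports None for a service whose stderr file does not exist, which is itself a useful hint.

```python
import os


def tail_stderr_logs(log_root, services=("eventEngine", "getDeviceData"), lines=20):
    """Return the last `lines` lines of each <service>-stderr.log under log_root.

    The layout <log_root>/<service>/<service>-stderr.log follows the paths
    mentioned above; log_root itself is an assumption about the install.
    A missing file is reported as None.
    """
    report = {}
    for service in services:
        path = os.path.join(log_root, service, service + "-stderr.log")
        if not os.path.exists(path):
            report[service] = None
            continue
        with open(path, errors="replace") as handle:
            report[service] = handle.readlines()[-lines:]
    return report
```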
The alertEngine loglevel setting is in etc/alertengine.cfg. It has two log files, alertengine.log and alertengine.err. The latter logs any Perl exceptions and errors that occur during alertEngine's runtime.
servicemon (and pping) are sort of notorious for having startup problems. We don't know exactly what's going on with them yet, but usually it helps to just keep "nav start"-ing them a couple of times, until they stick ;-)
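The "keep starting it until it sticks" workaround can be bounded rather than run in an endless loop. The sketch below is an assumption-laden illustration, not NAV code: it parses the "Starting:" / "Failed:" output format shown earlier in this thread, assumes the nav command is on the PATH, and uses made-up helper names (parse_nav_start, run_nav_start, start_until_running).

```python
import subprocess
import time


def parse_nav_start(output):
    """Split 'nav start' output into (started, failed) lists of service names."""
    sections = {"Starting": [], "Failed": []}
    current = None
    for token in output.split():
        label = token.rstrip(":")
        if token.endswith(":") and label in sections:
            current = label          # a new "Starting:" or "Failed:" section
        elif current is not None:
            sections[current].append(token)
    return sections["Starting"], sections["Failed"]


def run_nav_start(service):
    """Run 'nav start <service>' and return its output (assumes nav is on PATH)."""
    result = subprocess.run(["nav", "start", service],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout


def start_until_running(service, run_nav=run_nav_start, attempts=10,
                        delay=2.0, sleep=time.sleep):
    """Retry until the service is no longer listed under 'Failed:'.

    Returns the attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        _, failed = parse_nav_start(run_nav(service))
        if service not in failed:
            return attempt
        sleep(delay)
    return None
```

Calling start_until_running("servicemon") gives up after ten tries instead of looping forever, which makes it easier to tell an intermittent failure from a persistent one.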
BTW - we utilize Apache 2.x with AD authentication for internal resources. I've seen some errors with python and the web pages as well - is there an expectation that it would be Apache 1.x instead?
No, we absolutely recommend Apache 2.x. I think you'll only get an _old_ mod_python to work with 1.x.
Thanks!
You're welcome!
Morten,
Thanks for the quick reply.
I'll have to look up who is responsible for maintaining the port for nav (for FreeBSD) and see if I can't figure out where things stand - maybe I can lend a hand (or at least some time here). I know on FreeBSD v7.x the port is listed as "broken" (ie: won't compile for some reason). I prefer to utilize the standard ports install, just for 'sanity' reasons, but I may pull the latest source and go through it. I keep the "ports" hierarchy up to date (weekly automation and manual updates each day that I'm working on something to make sure that any patches/updates are included). Unfortunately, I don't know Java or python - just perl and php. :-(
RE: import - superb! - Thanks!
I started looking through some of our cron catches and found the following errors (numerous instances of each):
-----
cricket/collect-subtrees normal: Could not read /usr/local/cricket/subtree-sets file
-----
getBoksMacs.sh: createConnection ClassNotFoundException error: org.postgresql.Driver
-----
I would think that the driver is in the jar, which is set for all user profiles and in the "navcron" crontab (includes postgresql.jar):
CLASSPATH=/usr/local/nav/lib/java/ConfigParser.jar:/usr/local/nav/lib/java/Database.jar:/usr/local/nav/lib/java/Event.jar:/usr/local/nav/lib/java/Logger.jar:/usr/local/nav/lib/java/NetboxInfo.jar:/usr/local/nav/lib/java/SimpleSnmp.jar:/usr/local/nav/lib/java/Util.jar:/usr/local/share/java/classes/postgresql.jar:/usr/local/share/java/classes/snmp.jar
From what I read on the website, it would appear that configuration of cricket is done by nav??? In terms of the log file - this is the strange part - the file isn't even being created?! The perms are such that navcron (user navcron - group nav) owns the directory hierarchy (so shouldn't be an issue). Hence the reason I've been a bit blind in troubleshooting the whole thing. Here's the 'ls -lR' output on the logging path for nav (I deleted a bunch of lines from the networkDiscovery path - no need to chew up useless bits): -------------------------------------------
$$> ls -lR
total 3296
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 arnold
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 eventEngine
-rw-r--r--  1 navcron  nav     6774 Apr 25 12:56 getBoksMacs.log
drwxrwsr-x  2 navcron  nav      512 Apr 23 03:31 getDeviceData
-rw-r--r--  1 navcron  nav        0 Apr 24 15:30 maintengine.log
drwxrwsr-x  2 navcron  nav     3072 Apr 25 13:05 networkDiscovery
-rw-rw-r--  1 navcron  nav  3025984 Apr 25 13:05 pping.log
-rw-r--r--  1 navcron  nav   239382 Apr 25 13:05 smsd.log
-rw-r--r--  1 navcron  nav    46020 Apr 25 13:05 thresholdMon.log

./arnold:
total 0

./eventEngine:
total 0

./getDeviceData:
total 0

./networkDiscovery:
total 58
-rw-r--r--  1 navcron  nav  69 Apr 25 12:38 networkDiscovery-stderr.log
-rw-r--r--  1 navcron  nav  69 Apr 25 10:38 networkDiscovery-stderr.log.2008-04-25-1135.log
-rw-r--r--  1 navcron  nav  69 Apr 25 11:35 networkDiscovery-stderr.log.2008-04-25-1138.log
-rw-r--r--  1 navcron  nav  69 Apr 25 11:38 networkDiscovery-stderr.log.2008-04-25-1235.log
-rw-r--r--  1 navcron  nav  69 Apr 25 12:35 networkDiscovery-stderr.log.2008-04-25-1238.log
-rw-r--r--  1 navcron  nav  38 Apr 25 12:35 networkDiscovery-topology.html
-rw-r--r--  1 navcron  nav  38 Apr 25 12:38 networkDiscovery-vlan.html
-------------------------------------------
BTW - pping starts immediately each and every time. Servicemon doesn't. Right now I'm running a simple "while true" shell statement and it's gone through about 30+ iterations - the output remains the same:
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping thresholdMon
Failed: alertengine eventengine getDeviceData servicemon smsd
From the look of the alertengine.cfg all the log levels are enabled. Is there a way I can get more detailed information out of these to determine what is happening?
Thanks!
On Fri, 25 Apr 2008 13:23:53 -0400 "Scorpion7" scorpion7@iqonline.net wrote:
I'll have to look up who is responsible for maintaining the port for nav (for FreeBSD) and see if I can't figure out where things stand - maybe I can lend a hand (or at least some time here). I know on FreeBSD v7.x the port is listed as "broken" (ie: won't compile for some reason).
That would be great :) The FreeBSD port is sponsored by the University of Tromsø (and I think maybe the University of Bergen, both of which are FreeBSD proponents), but I'm not exactly sure who maintains the port (not being a FreeBSD guy myself). It's a shame no-one has contacted us if there is a build problem with the port.
I started looking through some of our cron catches and found the following errors (numerous instances of each):
cricket/collect-subtrees normal: Could not read /usr/local/cricket/subtree-sets file
Does that file exist at all, or is it just that the permissions are wrong?
getBoksMacs.sh: createConnection ClassNotFoundException error: org.postgresql.Driver
I would think that the driver is in the jar, which is set for all user profiles and in the "navcron" crontab (includes postgresql.jar):
CLASSPATH=/usr/local/nav/lib/java/ConfigParser.jar:/usr/local/nav/lib/java/Database.jar:/usr/local/nav/lib/java/Event.jar:/usr/local/nav/lib/java/Logger.jar:/usr/local/nav/lib/java/NetboxInfo.jar:/usr/local/nav/lib/java/SimpleSnmp.jar:/usr/local/nav/lib/java/Util.jar:/usr/local/share/java/classes/postgresql.jar:/usr/local/share/java/classes/snmp.jar
Hm, which version of NAV is in the FreeBSD port? If your Java VM supports the java.ext.dirs option, you shouldn't need to set such a big classpath. The getBoksMacs.sh script will set the option '-Djava.ext.dirs=/usr/local/nav/lib/java' when calling the java executable, which will make sure all the jar files from /usr/local/nav/lib/java/ will be loaded.
Typically, we symlink the postgresql.jar and snmp.jar into the /usr/local/nav/lib/java/ directory to make it all work without classpath fiddling. Then Tomcat will also need an identical JVM option in its startup script. This is configurable on the platforms I work with, not sure about the FreeBSD port.
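The symlinking step can be sketched as follows. This is an illustration only: link_jars is a hypothetical helper, and the paths in the commented usage are the ones quoted in this thread, so adjust them to your own install. Existing files in the target directory are left alone.

```python
import os


def link_jars(jar_paths, ext_dir):
    """Symlink each jar into ext_dir unless an entry of that name already exists.

    Returns the list of symlinks that were created.
    """
    created = []
    for jar in jar_paths:
        target = os.path.join(ext_dir, os.path.basename(jar))
        if not os.path.lexists(target):   # do not clobber existing files or links
            os.symlink(jar, target)
            created.append(target)
    return created


# Example call, using the paths from this thread (run as a user with
# write access to the NAV lib directory):
# link_jars(["/usr/local/share/java/classes/postgresql.jar",
#            "/usr/local/share/java/classes/snmp.jar"],
#           "/usr/local/nav/lib/java")
```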
From what I read on the website, it would appear that configuration of cricket is done by nav???
The Cricket config tree is automatically built by NAV on a nightly (cronjob) basis, yes. The subtree-sets and cricket-conf.pl files are not touched by NAV and may need manual configuration. NAV supplies its own initial subtree-sets file to match the configuration tree it generates.
BTW - pping starts immediately each and every time. Servicemon doesn't.
That's expected. It's an intermittent problem for both daemons.
Starting: cricket iptrace logengine mactrace maintengine networkDiscovery pping thresholdMon
Failed: alertengine eventengine getDeviceData servicemon smsd
From the look of the alertengine.cfg all the log levels are enabled. Is there a way I get more detailed information out of these to determine what is happening?
Try executing the alertengine init script directly (as root) and see if any errors crop up in the terminal:
/usr/local/nav/etc/init.d/alertengine start
You didn't mention what you found in getDeviceData-stderr.log and eventEngine-stderr.log?
Here is what happens when building NAV 3.2.2 on FreeBSD v7.x via the "ports" collection:
---------------------------------------------------
Building Logger ...
(cd Logger && /usr/local/bin/ant ) && touch Logger
Buildfile: build.xml

init:
    [mkdir] Created dir: /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/build

compile:
    [javac] Compiling 1 source file to /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/build
    [javac] /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/src/no/ntnu/nav/logger/Log.java:33: package no.ntnu.nav.ConfigParser does not exist
    [javac] import no.ntnu.nav.ConfigParser.*;
    [javac] ^
    [javac] /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/src/no/ntnu/nav/logger/Log.java:34: cannot find symbol
    [javac] symbol : class Path
    [javac] location: package no.ntnu.nav
    [javac] import no.ntnu.nav.Path;
    [javac] ^
    [javac] /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/src/no/ntnu/nav/logger/Log.java:69: cannot find symbol
    [javac] symbol : variable Path
    [javac] location: class no.ntnu.nav.logger.Log
    [javac] private static final String navConfigFile = (Path.sysconfdir + "/nav.conf").replace('/', File.separatorChar);
    [javac] ^
    [javac] /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/src/no/ntnu/nav/logger/Log.java:88: cannot find symbol
    [javac] symbol : class ConfigParser
    [javac] location: class no.ntnu.nav.logger.Log
    [javac] ConfigParser navCp;
    [javac] ^
    [javac] /usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/src/no/ntnu/nav/logger/Log.java:90: cannot find symbol
    [javac] symbol : class ConfigParser
    [javac] location: class no.ntnu.nav.logger.Log
    [javac] navCp = new ConfigParser(navConfigFile);
    [javac] ^
    [javac] 5 errors

BUILD FAILED
/usr/ports/net-mgmt/nav/work/nav-3.2.2/src/Logger/build.xml:30: Compile failed; see the compiler error output for details.

Total time: 3 seconds
gmake[1]: *** [Logger] Error 1
gmake[1]: Leaving directory `/usr/ports/net-mgmt/nav/work/nav-3.2.2/src'
gmake: *** [all] Error 1
*** Error code 2
---------------------------------------------------------
BTW - How would one normally get the compiler error output? I can include the output with some insight on how one normally gets the output.
---------------------------------------------------------
Java:
java version "1.5.0_14-p8"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_14-p8-root_19_apr_2008_00_01)
Java HotSpot(TM) Client VM (build 1.5.0_14-p8-root_19_apr_2008_00_01, mixed mode)

Lines ~30 in build.xml:
<target name="compile" depends="init" description="compile the source " >
  <!-- Compile the java code from ${src} into ${build} -->
  <javac srcdir="${src}" destdir="${build}" source="1.4" target="1.4" debug="${debug}"/>
</target>
On Fri, 2 May 2008 15:12:59 -0400 "Scorpion7" scorpion7@iqonline.net wrote:
Here is what happens when building NAV 3.2.2 on FreeBSD v7.x via the "ports" collection:
[snip]
BTW - How would one normally get the compiler error output? I can include the output with some insight on how one normally gets the output.
You already pasted the compiler error output, it's all under the "compile:" section.
It says it can't find a couple of the NAV libraries, which is strange. It suggests the CLASSPATH hasn't been set correctly when building the Logger library, yet the build process should set the CLASSPATH to include all the build directories under src/ .
If you cd to /usr/ports/net-mgmt/nav/work/nav-3.2.2/src and type "make debug", what output do you get? (It should tell you what CLASSPATH is used when building the Java sources.)
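To see what such a CLASSPATH would be expected to contain, the idea can be sketched outside of make. This is an assumption-based illustration, not the logic of NAV's actual Makefile: it assumes each component under src/ compiles into a "build" subdirectory (as src/Logger/build does in the output above), and build_classpath is a made-up name.

```python
import os


def build_classpath(src_root):
    """Join every <src_root>/<component>/build directory into a classpath string.

    Components without a build directory (not yet compiled) are skipped,
    so an empty result means nothing has been built yet.
    """
    parts = []
    for name in sorted(os.listdir(src_root)):
        build_dir = os.path.join(src_root, name, "build")
        if os.path.isdir(build_dir):
            parts.append(build_dir)
    return os.pathsep.join(parts)
```

If the Logger compile runs before ConfigParser has a build directory, or with an empty CLASSPATH, javac would report exactly the "package does not exist" and "cannot find symbol" errors shown above.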
When doing the "make debug":
--------------
make debug
"Makefile", line 44: Need an operator
make: fatal errors encountered -- cannot continue
--------------
If I comment out that line (line #44): export -> #export
--------------
make debug
Classpath:
INSTALL_TARGETS:
--------------
Thoughts?
On Fri, 9 May 2008 15:50:49 -0400 "Scorpion7" scorpion7@iqonline.net wrote:
When doing the "make debug":
make debug
"Makefile", line 44: Need an operator
make: fatal errors encountered -- cannot continue
As Anders Nordby commented, it appears you should be using gmake on FreeBSD, since the Makefiles were written for GNU make.
(I think Anders' post was rejected by the list, though, as he was posting from a non-subscribed address)
(see below)
-----Original Message-----
That would be great :) The FreeBSD port is sponsored by the University of Tromsø (and I think maybe the University of Bergen, both of which are FreeBSD proponents), but I'm not exactly sure who maintains the port (not being a FreeBSD guy myself). It's a shame no-one has contacted us if there is a build problem with the port.
I'll go ahead and start the compile on 7.x and I can email you the output so that you can see what's happening (along with version, etc)
I started looking through some of our cron catches and found the following errors (numerous instances of each):
cricket/collect-subtrees normal: Could not read /usr/local/cricket/subtree-sets file
Does that file exist at all, or is it just that the permissions are wrong?
It exists as: /usr/local/cricket/cricket-1.0.5/subtree-sets
Again - Cricket was installed via the standard ports collection. I believe that I have this resolved as well.
Hm, which version of NAV is in the FreeBSD port? If your Java VM supports the java.ext.dirs option, you shouldn't need to set such a big classpath. The getBoksMacs.sh script will set the option '-Djava.ext.dirs=/usr/local/nav/lib/java' when calling the java executable, which will make sure all the jar files from /usr/local/nav/lib/java/ will be loaded.
NAV v3.2.2
Unfortunately, I know just enough to be 'dangerous' with java. I've added the symlinks and updated tomcat appropriately. Tried restarting everything - same errors.
Try executing the alertengine init script directly (as root) and see if any errors crop up in the terminal:
/usr/local/nav/etc/init.d/alertengine start
There appeared to be a couple of errors that I fixed (minor permission issues). However, now upon starting alertengine it just appears to stall (see notes below).
You didn't mention what you found in getDeviceData-stderr.log and eventEngine-stderr.log?
Those files weren't being created. Now they are. ;-)
It appears now that I can start each of the services (located in init.d) by hand, but not via 'nav start' or 'nav start (service)'.
When doing a 'truss' on the execution of the program there are tons of 'ERR' lines - it appears that some files aren't found and others are only found after numerous attempts at various locations. All the services are now running (pping took about 6 rounds of 'nav start pping' and now it's running as well). Here are the last few lines from running truss (truss -f -a -s 64 nav stop alertengine) against 'nav stop alertengine':
49669: close(3) = 0 (0x0)
49669: setpriority(0x0,0x0,0x0) = 0 (0x0)
49669: chdir("/usr/local/nav") = 0 (0x0)
Without looking at forked processes:
geteuid() = 0 (0x0)
sigaction(SIGINT,{ SIG_IGN SA_SIGINFO ss_t },0x0) = 0 (0x0)
sigaction(SIGQUIT,{ SIG_IGN SA_SIGINFO ss_t },0x0) = 0 (0x0)
sigprocmask(SIG_BLOCK,SIGCHLD,0x0) = 0 (0x0)
sigprocmask(SIG_SETMASK,0x0,SIGCHLD) = 0 (0x0)
sigprocmask(SIG_SETMASK,SIGHUP|SIGINT|SIGQUIT|SIGILL|SIGTRAP|SIGABRT|SIGEMT|SIGFPE|SIGKILL|SIGBUS|SIGSEGV|SIGSYS|SIGPIPE|SIGALRM|SIGTERM|SIGURG|SIGSTOP|SIGTSTP|SIGCONT|SIGCHLD|SIGTTIN|SIGTTOU|SIGIO|SIGXCPU|SIGXFSZ|SIGVTALRM|SIGPROF|SIGWINCH|SIGINFO|SIGUSR1|SIGUSR2,SIGCHLD) = 0 (0x0)
fork() = 52919 (0xceb7)
sigprocmask(SIG_SETMASK,SIGCHLD,0x0) = 0 (0x0)
---
Notably, the lines previous to the above are open("/usr/local/nav/etc/cron.d/(various files)"), followed by an fstat and then a read. When it hits this point, the process and any associated forks appear to stall. I let the process run for a couple of minutes just to see if perhaps it was waiting for a return from another process (perhaps communication with another process?)
On Fri, 2 May 2008 15:21:50 -0400 "Scorpion7" scorpion7@iqonline.net wrote:
Does that file exist at all, or is it just that the permissions are wrong?
It exists as: /usr/local/cricket/cricket-1.0.5/subtree-sets
Again - Cricket was installed via the standard ports collection. I believe that I have this resolved as well.
Yet, if cricket says it can't read /usr/local/cricket/subtree-sets, then it either doesn't exist, or the permissions are wrong. Does the port setup run the cricket collector as the cricket user, or the navcron user?
Also, don't forget that NAV comes with a copy of the subtree-sets config file that is proper for the way NAV configures Cricket - you should use this as your starting point for /usr/local/cricket/cricket/subtree-sets . I don't know if the doc for the port mentions that factoid.
NAV v3.2.2
Ouch, that's old. Really hope someone updates that port.
You didn't mention what you found in getDeviceData-stderr.log and eventEngine-stderr.log?
Those files weren't being created. Now they are. ;-)
But getDeviceData and eventEngine are starting? Or failing? What are the errors in those files?
It appears now that I can start each of the services (located in init.d) by hand, but not via 'nav start' or 'nav start (service)'.
It appears that when doing a 'truss' on the execution of the program there are tons of 'ERR' lines - appears that some files aren't found and others just take numerous attempts at various locations to find.
I have no idea what "truss" is, but I'm assuming it's a FreeBSD tool that does something akin to strace on Linux?
Your truss output doesn't really give me any clues.
Notably, the lines previous to the above are open("/usr/local/nav/etc/cron.d/(various files)"), followed by an fstat and then a read. When it hits this point, the process and any associated forks appear to stall. I let the process run for a couple of minutes just to see if perhaps it was waiting for a return from another process (perhaps communication with another process?)
The files in /usr/local/nav/etc/cron.d/ are cron snippets that the nav command inserts into (or removes from) the navcron user's crontab. The nav command will read a file from /usr/local/nav/etc/cron.d/, then attempt to call the system crontab command (with the "-u navcron" option) to read the user's existing crontab and to write a new one. Could it be that the crontab command is hanging for some reason?
Besides, you say that "nav start alertengine" seems to stall, and this command will not modify the crontab, since alertengine isn't a cronjob but a daemon. It will however look at the files in the cron.d directory, and look at the navcron user's crontab with "crontab -u navcron -l".
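One way to test the "crontab is hanging" theory in isolation is to run the same listing command with a timeout. The sketch below is an illustration, not what the nav command does internally: read_crontab is a hypothetical helper, the 10-second timeout is an arbitrary choice, and the runner parameter exists only so the function can be exercised without a real crontab binary.

```python
import subprocess


def read_crontab(user="navcron", timeout=10, runner=subprocess.run):
    """Return the output of 'crontab -u <user> -l', or None if it hangs.

    A None result after `timeout` seconds suggests the crontab command
    itself is the part that stalls. Needs sufficient privileges to read
    another user's crontab.
    """
    try:
        result = runner(["crontab", "-u", user, "-l"],
                        capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return result.stdout
```

If this returns promptly when run as root, the stall is more likely somewhere else in the nav command than in crontab.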
If you could send a full truss output (with forks) of the "nav start alertengine", then maybe that would help (upload it to a pastebin or send it in a private e-mail so as to not bother the entire mailing list with it).