Hi,
On 19.06.13 12:54, "Morten Brekkevold" <morten.brekkevold@uninett.no> wrote:

> On Tue, 18 Jun 2013 10:25:25 +0000 Mischa Diehm <mischa.diehm@unibas.ch> wrote:

>> Hi,

> Hi Mischa!

>> first of all welcome to the list!

> Uh, thank you? Welcome yourself :)

>> A problem we have had for a while is that our system is permanently under a lot of load (mostly CPU-bound) and we haven't really found a way to reduce the pressure. The hardware we use is an HP blade (ProLiant BL460c G6) with:

>> At the moment we have 1460 active devices (mainly Cisco switches). Around 30 or 40 are OVERDUE in: https://urz-nav/report/lastupdated
> Are all jobs overdue for these devices, or just some of the jobs?

As far as I can see it's mainly inventory and topo jobs which are overdue.

> Does NAV consider the devices to be reachable and responding to SNMP requests? Does the ipdevpoll log indicate that the jobs are failing due to errors, or just that they are delayed or time out?

Devices are marked up and snmp_status = ok. I don't understand what you mean by the last sentence.
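If you mean looking for error messages in the log, I guess something like this would show it (the log path is an assumption on my part; adjust to wherever your NAV install writes its logs):

  # Look for errors/timeouts in the ipdevpoll log -- path is assumed, not verified
  grep -iE 'error|fail|timeout' /var/log/nav/ipdevpoll.log | tail -n 50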
>> So my question is: do you have any good experience with hardware systems that actually deal with this number of devices, or is there any tuning possibility (without losing functionality) we could try to reduce the pressure on the system?
> At the moment, the closest I have access to is a system monitoring 882 devices, but it still isn't in full production mode (meaning, they still have more devices to add). The load number of the system varies wildly with which collection jobs are running at any given moment. They might be seen as high numbers, but the system has 4 cores (with hyperthreading enabled), so the load average is mostly less than the number of cores.
Yes indeed. The load average on our machine is OK, but there are recurring peaks with very high load.
> This system is an HP DL360p Gen8 server, with 12GB RAM and 4x600GB SAS 10K drives mounted in a hardware RAID 1+0 configuration.

> We will very soon be migrating PostgreSQL off this server and onto a dedicated server with identical specifications, specifically to alleviate some of the load issues we are experiencing.
OK. So we will think about doing the same thing if that is what it takes to get back to more sane load levels.
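If I understand correctly, pointing NAV at a remote database would then mostly be a matter of editing db.conf, roughly like this (a sketch only; the hostname is made up and the option names are from memory, so please check against your own db.conf):

  # /etc/nav/db.conf (sketch; hostname is an example)
  dbhost=pgsql.example.org
  dbport=5432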
> ipdevpoll is currently running in its "experimental" multiprocess mode on this system, which means each of the configured jobs in `ipdevpoll.conf` gets its own dedicated process (which improves things on multicore systems). This can be achieved on a more permanent basis by adding the "-m" switch to the ipdevpoll command in the `/etc/nav/init.d/ipdevpoll` script.
Adding the -m switch doesn't seem so easy here. The problem is the way daemon() is written and how su - xxx -c $CMD is invoked. How do I add the -m switch so that it is actually used (root uses bash on this system)?

Debug output when starting via the init script:

  Starting ipdevpoll:
  + daemon 'su - navcron -c /usr/lib/nav/ipdevpolld -m'
  + su - navcron -c /usr/lib/nav/ipdevpolld -m

This way the -m never reaches ipdevpolld... I couldn't find a way to integrate -m without changing the NAV code.
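Looking at the trace, my understanding is that -m ends up as an extra argument to su instead of being part of the -c command string, so ipdevpolld never sees it. For it to take effect, the whole command line would presumably have to reach su as a single quoted -c argument, something like this (just a sketch of the target invocation, not of how to coax daemon() into producing it):

  # Target invocation (sketch): ipdevpolld and its -m flag passed as one -c string
  su - navcron -c '/usr/lib/nav/ipdevpolld -m'

Whether the init script's daemon() helper can be made to preserve that inner quoting without touching the code is exactly what I can't figure out.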
> We will be using this system for testing performance optimizations to ipdevpoll once we migrate PostgreSQL to a dedicated server. I can post our findings here once we get there, but that probably won't be until August, as I'll be offline most of July.
That would still be very much appreciated.
Cheers, Mischa
> --
> Morten Brekkevold
> UNINETT