Greetings,
I am looking for assistance in determining appropriate NAV VM system parameters for managing 800-1000 network devices across two data centers and 30 sites. Most of the network devices are Cisco, and all sites are connected by high-speed, low-latency links. Please let me know if any additional information would be helpful.
Thanks for your help,
Ken...
On Tue, 26 Mar 2019 07:12:58 +0100 ken.livesey@subzero.com wrote:
Greetings,
I am looking for assistance in determining appropriate NAV VM system parameters for managing 800-1000 network devices across two data centers and 30 sites. Most of the network devices are Cisco, and all sites are connected by high-speed, low-latency links. Please let me know if any additional information would be helpful.
Hi Ken,
once you get to that size, I would normally recommend splitting up your NAV installation into either two or three servers - which I think would work equally well for VMs, as you will have the flexibility of provisioning the resources where they are needed.
Some thoughts:
1. Split lines: The NAV, PostgreSQL and Graphite components are easily split into separate servers. If you are already up and running on one server, the simplest and most effective split is to move PostgreSQL to a dedicated server.
2. All our new servers are provisioned with 32GB of RAM these days, but they are normally intended to run all the components on a single server. If splitting, 16GB ought to be more than enough per server. You could even get away with less. If separating Graphite to a server of its own, it would probably need the least amount of RAM.
3. Storage I/O performance: It becomes crucial to have very high I/O throughput, especially for Graphite, which is really write-heavy. We _always_ deploy Graphite storage on SSD. It's not always necessary, but with the number of nodes you'll be monitoring, it will be.
4. Storage space: Compared to the smallest drives we can buy today, NAV uses very little storage. Some numbers: An installation that has monitored ~500 nodes for several years now consumes approx. 47G for Graphite storage, and 40G for PostgreSQL storage. Same numbers for an installation monitoring ~800 nodes: PostgreSQL: 152G, Graphite: 67G. Graphite's usage is normally fixed over time for a given set of metrics, and grows linearly with the number of metrics. PostgreSQL grows over time, as events are logged, but normally you would run navclean regularly to throw out old ARP/CAM records (often depending on privacy laws in your locale), which take up most of the space.
5. Assuming you've provisioned NAV with multiple cores, configure ipdevpoll to run in multiprocess mode - one process per core: https://nav.uninett.no/doc/latest/reference/ipdevpoll.html#multiprocess-mode
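To make the storage numbers in point 4 a bit more concrete, here is a rough back-of-the-envelope sketch that scales the two Graphite figures above linearly to a target node count. It is only an illustration: Graphite usage really follows the number of metrics (which depends on port counts and the like), not the number of nodes, and the function name is made up for this example.

```python
def estimate_graphite_gb(target_nodes, ref_nodes, ref_gb):
    """Scale a reference installation's Graphite usage linearly by node count.

    Assumption: metrics per node are roughly similar between the reference
    installation and the target one, which may well not hold in practice.
    """
    return target_nodes * ref_gb / ref_nodes


# Reference points quoted in this thread: (nodes, Graphite storage in GB)
REFERENCES = [(500, 47), (800, 67)]

for nodes in (800, 1000):
    estimates = [estimate_graphite_gb(nodes, n, gb) for n, gb in REFERENCES]
    print(f"{nodes} nodes: roughly {min(estimates):.0f}-{max(estimates):.0f} GB")
```

Either way the result stays well below the size of any SSD you are likely to buy, so I/O throughput rather than capacity is the thing to plan for.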
morten.brekkevold@uninett.no said:
- Storage space
Some numbers from us:
We monitor 3000 devices; 900 of these are GSW, SW, GW or EDGE.
We run on 3 (physical) servers: NAV, PostgreSQL, Graphite. The most important piece to run on a separate host is PostgreSQL. Run separately, NAV and PostgreSQL use fewer resources than when they run on the same host. (I think they kept stepping on each other's toes...)
Data usage:
PostgreSQL: 14 G (we prune ARP/CAM after 1 year)
Graphite: 62 G
--Ingeborg
On Thu, 28 Mar 2019 10:00:58 +0100 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
Data usage:
PostgreSQL: 14 G (we prune ARP/CAM after 1 year)
Graphite: 62 G
Thanks for those numbers, Ingeborg.
I might add that on our installations, we also automatically prune Graphite data by deleting *.wsp files that have not been modified in 365 days or so.
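For anyone wanting to do something similar, a minimal sketch of that kind of pruning could look like the following. This is not the script we actually run; the function names and the storage path in the usage comment are assumptions for illustration, so check where your own Graphite installation keeps its whisper files. It defaults to a dry run that only lists candidates.

```python
import os
import time
from pathlib import Path


def find_stale_wsp(root, max_age_days=365, now=None):
    """Return the *.wsp files under root not modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        path
        for path in Path(root).rglob("*.wsp")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def prune_stale_wsp(root, max_age_days=365, dry_run=True, now=None):
    """Delete stale *.wsp files; with dry_run=True, only report them."""
    stale = find_stale_wsp(root, max_age_days, now)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale


# Example usage (the path is an assumption, adjust to your setup):
#   for path in prune_stale_wsp("/var/lib/graphite/whisper", dry_run=True):
#       print(path)
```

Run it with dry_run=True first and look over the list before letting it delete anything, since a metric that has merely been idle for a year looks exactly the same as one belonging to a decommissioned device.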