Hello Morten,
Apologies for the late reply; the flu season is terrible this year.
I have verified that the 5min jobs are running at the correct intervals and that we're using SNMP v2c everywhere. I checked everything that was suggested in the "Debugging gaps in graphs" article, and I'm looking for advice on how to proceed further.
Best regards Karl
------------------------------------------------------------
*From:* Morten Brekkevold [mailto:morten.brekkevold@uninett.no]
*Sent:* Tuesday, Feb 20, 2018 8:49 AM CET
*To:* Karl Gerhard
*Cc:* nav-users@uninett.no
*Subject:* Is that a healthy carbon cache?
On Mon, 19 Feb 2018 18:25:55 +0100 Karl Gerhard karl_gerh@gmx.at wrote:
Picture e7: graphite-carbon 0.9.15-1, nav 4.8.2-1stretch (debian stretch)
Picture jy: graphite-carbon 0.9.15-1, nav 4.8.2-2stretch (debian stretch)
The negative cache size thing seems to be a known problem in that version: https://github.com/graphite-project/carbon/issues/420 - but I'm not clear on whether the issue is more serious than faulty reports...
We don't have any SNMP timeouts in our logs, ipdevpoll reports no errors, and strangely enough this issue affects only some interfaces:
* On one device we have an AE consisting of 4 interfaces, and all 4 interfaces have graphs without gaps, but the graph for the AE interface itself is mostly gaps. The problems are not limited to AE interfaces though, that would be too easy.
* On another device we have interfaces with 1000BASE-T (=copper), and some of them have no gaps at all while others are mostly gaps.
Did you verify that the 5minstats jobs for these devices are running at the correct intervals, as suggested by the guide?
Also, are you sure you have configured NAV (in SeedDB) to use SNMP v2c and not SNMP v1 on these devices (just want to rule out the use of 32-bit counters, which would really make things bad on high-speed interfaces)?
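To put a number on the 32-bit counter concern, here is a rough sketch of how quickly a 32-bit octet counter can wrap. It assumes the worst case of sustained traffic at full line rate and a 300-second polling interval (the 5-minute job); the function name and the list of link speeds are only illustrative and are not taken from NAV.

```python
def seconds_until_wrap(link_bits_per_second: float, counter_bits: int = 32) -> float:
    """Seconds for an octet counter to wrap at sustained full line rate."""
    bytes_per_second = link_bits_per_second / 8
    return 2 ** counter_bits / bytes_per_second


POLL_INTERVAL = 300  # seconds; assumed to match the 5-minute stats job

# Illustrative link speeds (assumption, not from the thread)
for name, speed in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
    wrap = seconds_until_wrap(speed)
    print(f"{name}: 32-bit counter can wrap every {wrap:.1f} s "
          f"({POLL_INTERVAL / wrap:.1f} possible wraps per poll)")
```

If a counter can wrap more than once between two polls, the collector has no way of telling how many wraps occurred, so the computed rate is unreliable. The 64-bit counters available with SNMP v2c avoid this in practice.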