We observe a WS-C93XX (a Catalyst 93xx switch stack from Cisco) where the switch port information is missing for some MAC addresses.
All jobs have run recently, and there is no error message about the topo job in ipdevpoll.log.
Any ideas?
--Ingeborg
On Mon, 21 Jan 2019 14:30:28 +0100 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
We observe a WS-C93XX (a Catalyst 93xx switch stack from Cisco) where the switch port information is missing for some MAC addresses.
Sorry, you kind of lost me here. Are you saying there are CAM records from this switch stack available in Machine Tracker, but that these records have no interface names attached to them?
If so, can you confirm that NAV does not list any nameless interfaces on these switches (using e.g. the interfaces report)?
On Mon, 21 Jan 2019 14:30:28 +0100 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
We observe a WS-C93XX (a Catalyst 93xx switch stack from Cisco) where the switch port information is missing for some MAC addresses.
Sorry, you kind of lost me here. Are you saying there are CAM records from this switch stack available in Machine Tracker, but that these records have no interface names attached to them?
Yep.
If so, can you confirm that NAV does not list any nameless interfaces on these switches (using e.g. the interfaces report)?
Example lines from the database:
nav=> select netboxid,ifindex,port,mac,start_time,end_time from cam where netboxid='5802';
 netboxid | ifindex |   port   |        mac        |         start_time         |          end_time
----------+---------+----------+-------------------+----------------------------+----------------------------
     5802 |     279 |          | 6c:4b:90:1d:7d:a1 | 2019-01-18 10:13:00.577658 | infinity
We know that the device with MAC 6c:4b:90:1d:7d:a1 is connected to Gi3/0/15
nav=> select interfaceid,netboxid,moduleid,ifindex,ifname,ifdescr,iftype from interface where netboxid='5802' and interfaceid='2266190';
 interfaceid | netboxid | moduleid | ifindex |  ifname  |        ifdescr        | iftype
-------------+----------+----------+---------+----------+-----------------------+--------
     2266190 |     5802 |          |     279 | Gi3/0/15 | GigabitEthernet3/0/15 |      6
The box has no nameless interfaces.
--Ingeborg
On Tue, 22 Jan 2019 11:42:34 +0100 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
Example lines from the database:
nav=> select netboxid,ifindex,port,mac,start_time,end_time from cam where netboxid='5802';
 netboxid | ifindex |   port   |        mac        |         start_time         |          end_time
----------+---------+----------+-------------------+----------------------------+----------------------------
     5802 |     279 |          | 6c:4b:90:1d:7d:a1 | 2019-01-18 10:13:00.577658 | infinity
We know that the device with MAC 6c:4b:90:1d:7d:a1 is connected to Gi3/0/15
nav=> select interfaceid,netboxid,moduleid,ifindex,ifname,ifdescr,iftype from interface where netboxid='5802' and interfaceid='2266190';
 interfaceid | netboxid | moduleid | ifindex |  ifname  |        ifdescr        | iftype
-------------+----------+----------+---------+----------+-----------------------+--------
     2266190 |     5802 |          |     279 | Gi3/0/15 | GigabitEthernet3/0/15 |      6
The box has no nameless interfaces.
I'm looking at the code, and the only immediate reason I can see how this could happen is if the cam collection ran before the interface had been collected by the inventory job (i.e. `cam.port` is set from `interface.ifname` or `interface.ifdescr`, if present, at the time the cam record is created, and is never updated after its creation).
It might also happen if the switch reindexed its interfaces between runs of the inventory and cam jobs (so that the cam records referred to ifindexes that inventory had not yet collected).
When was this WS-C93XX first entered into NAV, and did it reboot at any time near 2019-01-18 10:13:00?
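One way to check these two explanations against the database is to join the port-less cam rows to the interface table on netboxid and ifindex. This is only a sketch: it relies solely on the columns that appear in the queries earlier in this thread, and 5802 is the netbox id from that example.

```sql
-- List cam records without a port name for one netbox, together with
-- whatever interface currently carries the same ifindex (if any).
-- Assumption: 5802 is the netbox id from the example above.
SELECT cam.ifindex,
       cam.mac,
       cam.start_time,
       cam.end_time,
       interface.ifname,
       interface.ifdescr
FROM cam
LEFT JOIN interface
       ON interface.netboxid = cam.netboxid
      AND interface.ifindex = cam.ifindex
WHERE cam.netboxid = 5802
  AND COALESCE(cam.port, '') = ''
ORDER BY cam.start_time;
```

A row where ifname comes back NULL means no interface currently exists for that ifindex. A row where ifname is filled in suggests the interface record was simply absent at the moment the cam record was created.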
morten.brekkevold@uninett.no said:
I'm looking at the code, and the only immediate reason I can see how this could happen is if the cam collection ran before the interface had been collected by the inventory job (i.e. `cam.port` is set from `interface.ifname` or `interface.ifdescr`, if present, at the time the cam record is created, and is never updated after its creation).
It might also happen if the switch reindexed its interfaces between runs of the inventory and cam jobs (so that the cam records referred to ifindexes that inventory had not yet collected).
When was this WS-C93XX first entered into NAV, and did it reboot at any time near 2019-01-18 10:13:00?
Ah, I guess this happened because this switch had a case of "Found multiple matching interfaces for Interface" (a failure we see quite often) and I was rummaging around in the database and deleting interfaces around this time.
Then the next question is how do I fix this (short of deleting everything and beginning from scratch...)? Delete from cam table where port is empty?
--Ingeborg
On Wed, 23 Jan 2019 11:53:25 +0100 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
(i.e. `cam.port` is set from `interface.ifname` or `interface.ifdescr`, if present, at the time the cam record is created, and is never updated after its creation).
Ah, I guess this happened because this switch had a case of "Found multiple matching interfaces for Interface" (a failure we see quite often)
What? That sounds like a much more intriguing issue...
and I was rummaging around in the database and deleting interfaces around this time.
A-ha!
Then the next question is how do I fix this (short of deleting everything and beginning from scratch...)? Delete from cam table where port is empty?
Unless you find it OK to lose history logs, I wouldn't recommend it :)
But you could potentially backfill those records with the current interface name, if you so cared (and if you trust the ifindexes have not been renumbered since the records were created). You should also take care to only update records from the problematic device or records from within a certain time period.
Something like this, perhaps:
UPDATE cam
SET port=ifname
FROM interface
WHERE interface.netboxid = cam.netboxid
  AND interface.ifindex = cam.ifindex
  AND COALESCE(port, '') = ''
  AND cam.netboxid = <misbehaving_netbox_id>;
And I would definitely recommend doing it inside a transaction, so you can verify your results before committing (i.e. start with `BEGIN;` and only issue `END;` when you are certain you did the right thing).
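Put together, the whole session might look roughly like this. It is a sketch under assumptions: 5802 is the netbox id from the example earlier in the thread, and the start_time cutoff is purely illustrative and should be replaced with the period in which the interfaces were actually missing.

```sql
BEGIN;

-- Preview which records would be backfilled, and with which names.
SELECT cam.ifindex, cam.mac, cam.start_time, interface.ifname
FROM cam
JOIN interface
  ON interface.netboxid = cam.netboxid
 AND interface.ifindex = cam.ifindex
WHERE COALESCE(cam.port, '') = ''
  AND cam.netboxid = 5802                        -- assumed netbox id
  AND cam.start_time >= '2019-01-18 00:00:00';   -- illustrative cutoff

-- Backfill the empty port names from the current interface names.
UPDATE cam
SET port = interface.ifname
FROM interface
WHERE interface.netboxid = cam.netboxid
  AND interface.ifindex = cam.ifindex
  AND COALESCE(cam.port, '') = ''
  AND cam.netboxid = 5802
  AND cam.start_time >= '2019-01-18 00:00:00';

-- Inspect the result before deciding.
SELECT ifindex, port, mac, start_time, end_time
FROM cam
WHERE netboxid = 5802
  AND start_time >= '2019-01-18 00:00:00'
ORDER BY start_time;

-- If the result looks right:
--   COMMIT;    (END; is equivalent in PostgreSQL)
-- If anything looks wrong:
--   ROLLBACK;
```

The final COMMIT or ROLLBACK is left commented out on purpose, so that pasting the whole block leaves the transaction open until the output has been inspected.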