Hello,
I've checked my ipdevpoll logs and found this:
2019-01-28 17:37:28,698 [ERROR jobs.jobhandler] [statuscheck PAL-SW-R1-13.res.iogs] Caught exception during save. Last manager = DefaultManager(<class 'nav.ipdevpoll.shadows.POEPort'>, 'ContainerRepository'(...)). Last model = <class 'nav.ipdevpoll.shadows.POEPort'>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/jobs.py", line 442, in _perform_save
    manager.save()
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/storage.py", line 87, in save
    obj.save(self.containers)
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/storage.py", line 476, in save
    obj.save()
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 589, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 617, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 698, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/lib/python2.7/dist-packages/django/db/models/base.py", line 731, in _do_insert
    using=using, raw=raw)
  File "/usr/lib/python2.7/dist-packages/django/db/models/manager.py", line 92, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/db/models/query.py", line 921, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 921, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python2.7/dist-packages/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python2.7/dist-packages/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/usr/lib/python2.7/dist-packages/django/db/backends/utils.py", line 65, in execute
    return self.cursor.execute(sql, params)
IntegrityError: ERREUR: une valeur NULL viole la contrainte NOT NULL de la colonne « classification »
DETAIL: La ligne en échec contient (599034, 49, 66, null, 34, t, 2, 4, null)
2019-01-28 17:37:28,701 [ERROR jobs.jobhandler] [statuscheck PAL-SW-R1-13.res.iogs] Job 'statuscheck' for PAL-SW-R1-13.res.iogs aborted: Job aborted due to save failure (cause=IntegrityError('ERREUR: une valeur NULL viole la contrainte NOT NULL de la colonne \xc2\xab classification \xc2\xbb\nDETAIL: La ligne en \xc3\xa9chec contient (599034, 49, 66, null, 34, t, 2, 4, null)\n',))
2019-01-28 17:37:28,703 [INFO schedule.netboxjobscheduler] [statuscheck PAL-SW-R1-13.res.iogs] statuscheck for PAL-SW-R1-13.res.iogs failed in 0:00:14.603465. next run in 0:04:59.999955.
It seems ipdevpoll doesn't like the PoE on my Dell N2048P because of a NULL value in the "classification" column. Do you have any idea why?
Regards
On Mon, 28 Jan 2019 17:47:54 +0100 Vinsonnaud Ludovic ludovic.vinsonnaud@institutoptique.fr wrote:
It seems ipdevpoll doesn't like the PoE on my Dell N2048P because of a NULL value in the "classification" column. Do you have any idea why?
I have no experience with the POWER-ETHERNET-MIB itself, as I was not directly involved in the development of the PoE support (and interestingly enough, the dev who did signed off this list right after you posted your question :-D ).
However, the information comes from the `pethPsePortTable`, and the classification value comes explicitly from the `pethPsePortPowerClassifications` column.
NAV's model does not allow the classification field to be empty, so it would be interesting to see what your Dell switch reports in this column. Are you able to pull the full contents of POWER-ETHERNET-MIB::pethPsePortTable using NET-SNMP command line tools?
If there is a port with no value in this column, is the port powered at the moment? The MIB mentions that the classification value is only valid as long as there is a powered device connected to the port - so NAV's model requirement to have a value in this might not be valid.
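As a rough illustration of what to look for in such a walk, here is a minimal Python sketch that scans saved `snmpwalk` output for ports that appear in the table but have no `pethPsePortPowerClassifications` row. The sample lines and the helper name `ports_without_classification` are made up for illustration; they are not data from this thread and not part of NAV.

```python
import re

# Hypothetical excerpt of `snmpwalk` output for POWER-ETHERNET-MIB::pethPsePortTable.
# The values are invented for illustration; they are not the poster's data.
SAMPLE = """\
POWER-ETHERNET-MIB::pethPsePortAdminEnable.1.60 = INTEGER: true(1)
POWER-ETHERNET-MIB::pethPsePortAdminEnable.1.61 = INTEGER: true(1)
POWER-ETHERNET-MIB::pethPsePortDetectionStatus.1.60 = INTEGER: searching(2)
POWER-ETHERNET-MIB::pethPsePortDetectionStatus.1.61 = INTEGER: deliveringPower(3)
POWER-ETHERNET-MIB::pethPsePortPowerClassifications.1.61 = INTEGER: class0(1)
"""

# Matches "<MIB>::<column>.<group>.<port> = <value>"
LINE = re.compile(r"^POWER-ETHERNET-MIB::(\w+)\.(\d+)\.(\d+) = (.*)$")


def ports_without_classification(walk_output):
    """Return sorted (group, port) indexes that occur in the table
    but have no pethPsePortPowerClassifications row."""
    seen = set()
    classified = set()
    for line in walk_output.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a table row
        column = match.group(1)
        index = (int(match.group(2)), int(match.group(3)))
        seen.add(index)
        if column == "pethPsePortPowerClassifications":
            classified.add(index)
    return sorted(seen - classified)


print(ports_without_classification(SAMPLE))
```

Any index printed by this sketch is a port for which the device returned other PoE columns but no classification, which is the situation that would leave the field empty.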
So I used this site to find the OID: http://cric.grenoble.cnrs.fr/Administrateurs/Outils/MIBS/?module=POWER-ETHER...
You can see the result of the snmpwalk in the attached file.
Regards,
Ludovic Vinsonnaud - Network Engineer based in Bordeaux, office F108 (IOA, Rue François Mitterrand, 33400 Talence)
Institut Optique Graduate School (https://www.institutoptique.fr), 2 Avenue Augustin Fresnel - 91127 PALAISEAU Cedex, Tel. +33 5 57 01 71 52 - Mob. +33 6 08 08 41 05
On Tue, 29 Jan 2019 17:37:40 +0100 Vinsonnaud Ludovic ludovic.vinsonnaud@institutoptique.fr wrote:
So I used this site to find the OID: http://cric.grenoble.cnrs.fr/Administrateurs/Outils/MIBS/?module=POWER-ETHER...
You can see the result of the snmpwalk in the attached file.
Well, it appears the switch is only reporting the power classification of a single port, identified as 1.61 (group 1, index 61).
I'm not too familiar with the MIB, but there seems to be no definite way of mapping this port index to an ifIndex, as used by other MIBs. On most vendors, the port index seems to coincide with the ifIndex (except on Cisco, where we need to get a mapping from a proprietary MIB). Not sure how Dell switches treat this.
Does your switch have an interface with ifIndex 61, and is there something unique about this interface?
Hello,
After some searching, I found something. In fact, only one port (Gi2/0/7) is consuming energy in this stack of two N2048Ps, so I tried to understand why it is known as 61.

After many snmpwalks, I found the answer. If you look at the trailing index, you will see that when it reaches 48, it jumps to 55. So the first switch runs from 1 to 48 and the second from 55 to 102, 48 ports each. With this numbering, interface Gi2/0/7 corresponds to 61, so that is the answer.

But a gap remains. The only reason I can see is the add-on cards: you can add a 4-port 1G card or a 2-port 10G card, which means that a total of 6 (4 + 2) ifIndexes, from 49 to 54, would be reserved for these ports.

Do you think I should set "power inline never" on the interfaces that are not consuming power to solve the problem?
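The index arithmetic described above can be sketched in a few lines of Python. This assumes the numbering observed on this particular stack (48 ports per unit, followed by 6 unused indexes) holds for every unit; that is an inference from the walk, not documented Dell behaviour, and the function name is hypothetical.

```python
PORTS_PER_UNIT = 48     # copper ports on one N2048P, as observed in the walk
RESERVED_PER_UNIT = 6   # indexes 49-54, apparently skipped (assumed reserved for add-on ports)


def stack_port_index(unit, port):
    """Map a stack position GiU/0/P to the index seen in the walk.

    Assumes every unit occupies a block of 48 + 6 = 54 consecutive indexes.
    """
    if unit < 1:
        raise ValueError("unit numbers start at 1")
    if not 1 <= port <= PORTS_PER_UNIT:
        raise ValueError("port must be between 1 and %d" % PORTS_PER_UNIT)
    stride = PORTS_PER_UNIT + RESERVED_PER_UNIT
    return (unit - 1) * stride + port


print(stack_port_index(2, 7))  # Gi2/0/7 -> 61
```

With these assumptions, unit 1 covers 1 to 48 and unit 2 covers 55 to 102, matching the ranges seen in the walk.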
Regards,
Ludovic Vinsonnaud
On Mon, 4 Feb 2019 14:59:26 +0100 Vinsonnaud Ludovic ludovic.vinsonnaud@institutoptique.fr wrote:
Hello,
After some searching, I found something. In fact, only one port (Gi2/0/7) is consuming energy in this stack of two N2048Ps, so I tried to understand why it is known as 61.
It doesn't really matter _why_ it has that id, just whether there was something special about that port (once identified). ifIndexes are assigned at the whim of the OS running on the switch.
I interpret the MIB to say that the power classification column doesn't need to contain a valid value for unpowered ports. NAV's data model says differently, which is likely an effect of which actual equipment was available for testing the code by the original developer (which probably reported a classification for all ports, regardless of their status).
I'd suggest NAV's model is wrong, and that the classification value should be optional. You're welcome to post a bug report to that effect at https://github.com/UNINETT/nav/issues/new :-)
Do you think I should "power inline never" the interfaces not consuming power to solve the problem ?
I have no reason to think that would solve anything in this case. Since the device is PoE enabled, NAV will try to get the PoE information from it, and will fail when processing any non-powered port.
Hello, I have NAV 5.7.1 and I have the same problem. Is there any workaround to resolve it?
The status check of the switches that have this problem is shown as failed.
Regards, Carles P. UPF.EDU
On Thu, 23 Nov 2023 13:34:37 -0000 (2 weeks, 6 days, 23 hours ago) carles.perarnau@upf.edu wrote:
I have NAV 5.7.1 and I have the same problem. Is there any workaround to resolve it?
The status check of the switches that have this problem is shown as failed.
What errors are reported for these devices in `ipdevpoll.log`?
Hello Morten,
Let me try to explain. We have several switch stacks with N2048P and N2048 units.
For example, this one:
Unit 2 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
Unit 9 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
*Unit 4 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048, software: 6.5.4.4)*
Unit 8 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
Unit 3 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
Unit 1 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
Unit 6 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
Unit 5 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048P, software: 6.5.4.4)
*Unit 7 (serial: XXXXXXXXXXXXXXXXXXXXXXX, model: N2048, software: 6.5.4.4)*
When this switch is defined in NAV, it produces the following message in the ipdevpoll log file:
Last manager = DefaultManager(<class 'nav.ipdevpoll.shadows.POEPort'>, 'ContainerRepository'(...)). Last model = <class 'nav.ipdevpoll.shadows.POEPort'>
Traceback (most recent call last):
  File "/opt/venvs/nav/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
psycopg2.errors.NotNullViolation: null value in column "classification" of relation "poeport" violates not-null constraint
DETAIL: Failing row contains (88341, 134, 127, null, 100, t, 2, 4, null).
When it occurs, the statuscheck of the switch is red.
I tried removing the constraint from the database definition, but the statuscheck was still red.
I resolved the problem by adding the following trigger to the database. It replaces NULL values of the classification attribute with zero before a row is inserted into or updated in the poeport table:
CREATE OR REPLACE FUNCTION actualitzar_classification() RETURNS TRIGGER AS $$
BEGIN
    -- Check whether the attribute is NULL and change it to zero
    IF NEW.classification IS NULL THEN
        NEW.classification := 0;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Add the trigger before the insert or update action
CREATE TRIGGER actualitzar_classification_trigger
    BEFORE INSERT OR UPDATE ON poeport
    FOR EACH ROW EXECUTE FUNCTION actualitzar_classification();
Now the statuscheck of the switch is green and NAV shows the PoE information of the N2048P switches.
Please let me know if you need more information.
Best regards,
P.S. Our current NAV version is 5.8.3, and the problem persists if the trigger is not defined.
Carles Perarnau i Sabés, Unitat d'Infraestructures i Seguretat TIC - Servei d'Informàtica, Universitat Pompeu Fabra, Barcelona
On Fri, 15 Dec 2023 09:11:13 +0100 (4 hours, 16 minutes, 14 seconds ago) PERARNAU SABÉS, Carles carles.perarnau@upf.edu wrote:
I tried removing the constraint from the database definition, but the statuscheck was still red.
I resolved the problem by adding the following trigger to the database. It replaces NULL values of the classification attribute with zero before a row is inserted into or updated in the poeport table:
Wow, that's an interesting and creative workaround :)
It doesn't look like anyone ever bothered to report this in our bugtracker, as I suggested back in 2019.
I think that maybe the best way to go about this would be to simply support the case that some devices just don't want to report a power classification for some of their PoE ports. Adding a power classification of 0 = 'unknown' to the `POEPort` model should do the trick (and updating the ipdevpoll `poe` plugin to use 0 as a default value if `pethPsePortPowerClassifications` doesn't contain a value).
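To make the idea concrete, here is a minimal Python sketch of the "default to 0 = unknown" behaviour. It is not NAV's actual `poe` plugin code: the dictionary layout, the constant `UNKNOWN_CLASSIFICATION`, and the function name are all hypothetical stand-ins for whatever the plugin really uses.

```python
UNKNOWN_CLASSIFICATION = 0  # proposed meaning: the device reported no classification


def classification_or_unknown(port_row):
    """Return the reported power classification, or 0 when the device
    did not supply pethPsePortPowerClassifications for this port.

    `port_row` is a hypothetical dict of column name -> value for one
    row of pethPsePortTable.
    """
    value = port_row.get("pethPsePortPowerClassifications")
    return UNKNOWN_CLASSIFICATION if value is None else value


# A powered port that reports a class, and an unpowered one that does not.
print(classification_or_unknown({"pethPsePortPowerClassifications": 4}))
print(classification_or_unknown({"pethPsePortDetectionStatus": 2}))
```

Applying the default in the collector rather than in the database would give the same stored result as the trigger workaround, without needing a schema-side hook.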
Perfect. Please let me know when you apply this solution for testing.
Best,
Carles Perarnau i Sabés
On Mon, 18 Dec 2023 08:45:18 +0100 (9 weeks, 6 hours, 43 seconds ago) PERARNAU SABÉS, Carles carles.perarnau@upf.edu wrote:
Perfect. Please let me know when you apply this solution for testing.
Since no one seems to care enough to actually write up the enhancement request, I threw something together at https://github.com/Uninett/nav/issues/2812 myself. If you subscribe to that GitHub issue, you should automatically be notified when things are being worked on.
Hello,
Today I've got other errors (attached to this mail).
They seem to affect the Dell N2048P, N4064 and Cisco 2960.
Any ideas?
Regards,
Ludovic Vinsonnaud
On Tue, 29 Jan 2019 17:38:37 +0100 Vinsonnaud Ludovic ludovic.vinsonnaud@institutoptique.fr wrote:
Today I've got other errors (attached to this mail).
They seem to affect the Dell N2048P, N4064 and Cisco 2960.
  File "/usr/lib/python2.7/dist-packages/twisted/internet/epollreactor.py", line 158, in _remove
    del selectables[fd]
exceptions.KeyError: 22
I don't think this affects any specific device type. This appears to be a bug with connection handling in the Twisted framework - and once it occurs, it tends to bug up for the lifetime of the process. I've seen it multiple times, but unfortunately, it seems the bug has not been closed by upstream yet [1], even though a pull request containing a solution was opened two years ago.
Restarting the ipdevpoll process tends to resolve it.
[1] https://github.com/twisted/twisted/pull/594
Well, it seems a restart changes nothing. Maybe when I have time I will recreate the VM from scratch to see if it happens again.
Regards,
Ludovic Vinsonnaud
On Mon, 4 Feb 2019 15:20:16 +0100 Vinsonnaud Ludovic ludovic.vinsonnaud@institutoptique.fr wrote:
Well, it seems a restart changes nothing. Maybe when I have time I will recreate the VM from scratch to see if it happens again.
No, that would be way overkill. Please just try more ipdevpoll restarts :-) I've subscribed to the issue at Github so I'll be made aware of any progress in producing a solution.
Ah yes, you're right: another restart and everything is back in place. Thanks.
Regards,
Ludovic Vinsonnaud