/ipdevinfo contains a button named "DNS" which shows all the IP-addresses connected to the device and their hostnames.
I have one device which is correctly registered in DNS. Even so, the DNS info says "No addresses found".
Why? Where to start debugging?
ipdevpolld -J dns -n <ip-address> runs as expected.
--Ingeborg
On Tue, 09 May 2023 15:05:17 +0200 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
/ipdevinfo contains a button named "DNS" which shows all the IP-addresses connected to the device and their hostnames.
I have one device which is correctly registered in DNS. Even so, the DNS info says "No addresses found".
Why? Where to start debugging?
The DNS lookup works like this:
1. A forward lookup for AAAA and A records for the "Full sysname" value is performed.
2. For each of the found IP addresses (if any), a reverse lookup is performed.
The code normally uses the settings of `/etc/resolv.conf` to discover which resolver to query.
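The two steps above can be sketched roughly as follows. This is only an illustration using Python's standard `socket` module, not NAV's actual code (the web UI uses an asynchronous resolver). The function names `forward_lookup`, `reverse_lookup` and `dns_info` are made up for the example, and note that `socket.getaddrinfo` goes through the system resolver, which may also consult `/etc/hosts`.

```python
import socket
from typing import Callable, Dict, List, Optional


def forward_lookup(name: str) -> List[str]:
    """Return the unique A/AAAA addresses for name via the system resolver."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []  # the name does not resolve
    return sorted({info[4][0] for info in infos})


def reverse_lookup(address: str) -> Optional[str]:
    """Return the PTR name for address, or None if there is none."""
    try:
        return socket.gethostbyaddr(address)[0]
    except (socket.herror, socket.gaierror):
        return None


def dns_info(
    sysname: str,
    forward: Callable[[str], List[str]] = forward_lookup,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
) -> Dict[str, Optional[str]]:
    """Step 1: forward lookup of sysname. Step 2: reverse lookup per address."""
    return {address: reverse(address) for address in forward(sysname)}
```

Calling `dns_info("some-device.example.org")` returns a mapping from each address found to its reverse name. An empty mapping is the situation where the forward lookup produced nothing to show.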
Does your NAV web server log any errors when you click the "DNS" button?
ipdevpolld -J dns -n <ip-address> runs as expected.
The dns plugin in ipdevpoll doesn't have the same task as the DNS button in the user interface: It only performs a PTR lookup for the configured IP address of an IP Device and sets the device's sysname attribute from that.
The DNS lookup works like this:
1. A forward lookup for AAAA and A records for the "Full sysname" value is performed.
2. For each of the found IP addresses (if any), a reverse lookup is performed.
The code normally uses the settings of `/etc/resolv.conf` to discover which resolver to query.
Does this happen in real time, or is there some internal caching involved? I was able to spot DNS lookups for some devices, but not for all the ones I tried. I could be mistaken, though, since it is a busy NAV server with a lot of DNS traffic.
When I do DNS lookups from the command line against the resolvers in /etc/resolv.conf, both the A and PTR RRs are correct.
The DNS-info is correct for the majority of devices, but for a handful it returns "No addresses found".
Is there some case sensitivity involved? After making sure that both A and PTR RR had the same case plus a reload of apache, I am getting the expected output.
Does your NAV web server log any errors when you click the "DNS" button?
No errors.
--Ingeborg
On Wed, 10 May 2023 09:56:08 +0200 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
Does this happen real time or is there some internal caching involved?
It's real-time *with* caching. I.e. there's a response cache per web UI process. If there is no cache hit, the DNS queries are sent out and the corresponding responses cached.
But, the cache is an LRU cache with 128 slots (the default value of the cache implementation), which might not be a good idea for this purpose. This means that once the cache contains 128 host lookups, the least recently used cache entry is evicted. If you only ever use the host information button for the same handful of devices, the DNS responses will basically be cached indefinitely, or until the process is restarted, whichever comes first.
Process restarts are also entirely deployment-specific. We typically deploy with a configuration that kills and restarts each worker process after it has handled a certain number of requests.
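The eviction behaviour described above can be illustrated with a small sketch. It assumes the cache behaves like Python's `functools.lru_cache`, whose default `maxsize` is 128; whether NAV uses exactly this decorator is an assumption here, and `cached_lookup` is a hypothetical stand-in for the real DNS lookup.

```python
from functools import lru_cache

queries_sent = []  # records every lookup that misses the cache


@lru_cache(maxsize=128)  # 128 is also the default maxsize of functools.lru_cache
def cached_lookup(hostname: str) -> str:
    """Hypothetical stand-in for a real DNS lookup."""
    queries_sent.append(hostname)
    return f"response for {hostname}"


# Fill all 128 slots with distinct hosts.
for i in range(128):
    cached_lookup(f"host{i}.example.org")

cached_lookup("host0.example.org")  # cache hit: host0 becomes most recently used
cached_lookup("extra.example.org")  # miss: evicts host1, now the least recently used
cached_lookup("host0.example.org")  # still served from the cache
cached_lookup("host1.example.org")  # was evicted, so it is looked up again

print(cached_lookup.cache_info())
```

As long as fewer than 128 distinct hosts are looked up in a given process, nothing is ever evicted, which matches the "cached indefinitely, or until the process is restarted" behaviour.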
Is there some case sensitivity involved? After making sure that both A and PTR RR had the same case plus a reload of apache, I am getting the expected output.
Well, the cache key is the hostname/sysname as known by NAV, which ultimately comes from the PTR record for the device's IP address. If you change the casing of the PTR record in DNS, you also effectively invalidate the ipdevinfo DNS cache for that host.
This cache is only in memory, right? Why do I still get this error after apache is restarted? (On a different device than the one I was originally debugging)
--Ingeborg
On Wed, 10 May 2023 15:11:41 +0200 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
Well, the cache key is the hostname/sysname as known by NAV, which ultimately comes from the PTR record for the device's IP address. If you change the casing of the PTR record in DNS, you also effectively invalidate the ipdevinfo DNS cache for that host.
I suspect that there is some kind of case sensitive comparison somewhere besides the cache.
All the troublesome devices I have found have a sysname like "roomb1000.domain" while the DNS RRs are "roomB1000.domain", both A and PTR.
Interesting. So you're actually saying the sysname registered for this device in NAV has the wrong casing too? I do see that if only the casing of the PTR record changes, that change will not actually be reflected in NAV: the ipdevpoll DNS plugin does a case insensitive match between the retrieved DNS record and the existing sysname value of a device before it decides whether to update the database.
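A minimal sketch of that kind of check, to show why a case-only change never reaches the database. The helper name `needs_sysname_update` is hypothetical and this is not the plugin's actual code:

```python
def needs_sysname_update(current_sysname: str, ptr_name: str) -> bool:
    """Return True only if the names differ by more than letter case."""
    return current_sysname.lower() != ptr_name.lower()


# A case-only change in the PTR record is treated as "no change".
print(needs_sysname_update("roomb1000.domain", "roomB1000.domain"))
# A genuinely different name would trigger an update.
print(needs_sysname_update("roomb1000.domain", "roomb2000.domain"))
```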
However, testing the innards of the async DNS resolver code in NAV (which is used by the web UI) I found this line, which actually seems to be your problem:
https://github.com/Uninett/nav/blob/4c9e477db003d25ab89391234a6cb80514cafc72...
The records that are returned in a DNS response are matched *case sensitively* against the hostname from the query. So when the casing of the DNS result doesn't match the query, the result record is just ignored.
Bug report time! :)
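To illustrate the mismatch, here is a small sketch. It is not the code from the linked line; the record layout and both function names are made up for the example. DNS names are meant to be compared case insensitively (see RFC 4343), so one possible way to avoid the problem is to normalise both sides before comparing.

```python
from typing import List, Tuple

Record = Tuple[str, str]  # (owner name in the response, address)


def match_case_sensitive(query: str, records: List[Record]) -> List[str]:
    """Keep only records whose name equals the query exactly (the buggy way)."""
    return [address for name, address in records if name == query]


def match_case_insensitive(query: str, records: List[Record]) -> List[str]:
    """Keep records whose name equals the query when letter case is ignored."""
    wanted = query.lower()
    return [address for name, address in records if name.lower() == wanted]


# NAV knows the device as "roomb1000.domain", DNS answers with "roomB1000.domain".
answer = [("roomB1000.domain", "192.0.2.10")]

print(match_case_sensitive("roomb1000.domain", answer))    # record is ignored
print(match_case_insensitive("roomb1000.domain", answer))  # record is kept
```

With the exact comparison the result list is empty, which corresponds to the "No addresses found" message.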
morten.brekkevold@sikt.no said:
The records that are returned in a DNS response are matched *case sensitively* against the hostname from the query. So when the casing of the DNS result doesn't match the query, the result record is just ignored.
Bug report time! :)
Thank you! I am not crazy after all. Case in DNS might change from time to time.
Should you or I report the bug?
--Ingeborg
On Thu, 11 May 2023 12:15:08 +0200 Ingeborg Hellemo ingeborg.hellemo@uit.no wrote:
Thank you! I am not crazy after all. Case in DNS might change from time to time.
Should you or I report the bug?
I posted this, feel free to add more context if you want: https://github.com/Uninett/nav/issues/2615
morten.brekkevold@sikt.no said:
Well, the cache key is the hostname/sysname as known by NAV, which ultimately comes from the PTR record for the device's IP address. If you change the casing of the PTR record in DNS, you also effectively invalidate the ipdevinfo DNS cache for that host.
I suspect that there is some kind of case sensitive comparison somewhere besides the cache.
All the troublesome devices I have found have a sysname like "roomb1000.domain" while the DNS RRs are "roomB1000.domain", both A and PTR.
--Ingeborg