looking for hostname router identifier validation
I legit guffawed.
On 19-04-29 13 h 13, Eric Kuhnke wrote:
> I would caution against putting much faith in the validity of
> geolocation or site ID by reverse DNS PTR records. There are a vast
> number of unmaintained, ancient, stale, erroneous or wildly wrong PTR
> records out there. I can name at least a half dozen ISPs that have
> absorbed other ASes, some of which had themselves acquired other ASes
> earlier in their history, forming a turducken of obsolete PTR records,
> some under ISP domain names last in use in the year 2002.
>
>
>
> On Mon, Apr 29, 2019 at 6:15 AM Matthew Luckie <mjl at luckie.org.nz> wrote:
>
> Hi NANOG,
>
> To support Internet topology analysis efforts, I have been working on
> an algorithm to automatically detect router names inside hostnames
> (PTR records) for router interfaces, and build regular expressions
> (regexes) to extract them. By "router name" inside the hostname, I
> mean a substring, or set of non-contiguous substrings, that is common
> among interfaces on a router. For example, suppose we had the
> following three routers in the savvis.net domain suffix, each with
> two interfaces:
>
> das1-v3005.nj2.savvis.net
> das1-v3006.nj2.savvis.net
>
> das1-v3005.oc2.savvis.net
> das1-v3007.oc2.savvis.net
>
> das2-v3009.nj2.savvis.net
> das2-v3012.nj2.savvis.net
>
> We might infer the router names are das1|nj2, das1|oc2, and das2|nj2,
> respectively, and captured by the regex:
> ^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$
>
> After much refinement based on smaller sets of ground truth, I'm
> asking for broader feedback from operators. I've placed a webpage at
> https://www.caida.org/~mjl/rnc/ that shows the inferences my algorithm
> made for 2523 domains. If you operate one of the domains in that
> list, I would appreciate it if you could comment (private is probably
> better but public is fine with me) on whether the regex my algorithm
> inferred represents your naming intent. In the first instance, I am
> most interested in feedback for the suffix / date combinations for
> suffixes that are colored green, i.e. appear to be reasonable.
>
> Each suffix / date combination links to a page that contains the
> naming convention and corresponding inferences. The colored part of
> each hostname is the inferred router name. The green hostnames appear
> to be correct, at least as far as the algorithm determined. Some
> suffixes have errors due to either stale hostnames or incorrect
> training data, and those hostnames are colored red or orange.
>
> If anyone is interested in the sets of hostnames the algorithm may
> have inferred as 'stale' for their network, I can provide that
> information. For some operators the stale names were an oversight,
> and they were grateful to learn about them.
>
> Thanks,
>
> Matthew
>
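
For anyone who wants to try the savvis.net regex quoted above against
their own PTR data, here is a minimal Python sketch. It only applies the
regex from the quoted message to the six example hostnames and groups
them by the extracted name. The helper names (router_name,
group_by_router) are made up for illustration and are not part of
Matthew's tooling, and as Eric notes, a match says nothing about whether
the PTR record is still accurate.

```python
import re
from collections import defaultdict

# Regex copied from the quoted message; the two capture groups are the
# device part (e.g. "das1") and the site part (e.g. "nj2").
ROUTER_RE = re.compile(r"^([a-z]+\d+)-[^\.]+\.([a-z]+\d+)\.savvis\.net$")

# The six example interface hostnames from the quoted message.
HOSTNAMES = [
    "das1-v3005.nj2.savvis.net",
    "das1-v3006.nj2.savvis.net",
    "das1-v3005.oc2.savvis.net",
    "das1-v3007.oc2.savvis.net",
    "das2-v3009.nj2.savvis.net",
    "das2-v3012.nj2.savvis.net",
]


def router_name(hostname):
    """Return the inferred router name (e.g. 'das1|nj2'), or None if no match."""
    match = ROUTER_RE.match(hostname)
    if match is None:
        return None
    # Join the non-contiguous substrings with '|' as in the original post.
    return "|".join(match.groups())


def group_by_router(hostnames):
    """Map each inferred router name to the hostnames that produced it."""
    routers = defaultdict(list)
    for hostname in hostnames:
        name = router_name(hostname)
        if name is not None:
            routers[name].append(hostname)
    return dict(routers)


if __name__ == "__main__":
    for name, interfaces in sorted(group_by_router(HOSTNAMES).items()):
        print(name, interfaces)
```

Hostnames that do not match the convention are silently skipped here; in
practice those are probably the ones worth a second look.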