[ih] inter-network communication history
On 11/9/19 4:23 AM, John Day wrote:
> As they say, Jack, ignorance is bliss! ;-) Were you doing configuration with it? Or was it just monitoring?
As I recall, configuration wasn't a big deal.  Nodes were typically
routers with Ethernets facing toward the users at the site and several
interfaces the other way for long-haul circuits.  Our approach was to
collect all the appropriate equipment for the next site in our
California lab, configure it and test it out on the live network, and
then ship it all to wherever it was to go.  So, for example, New Zealand
might have actually been in California at first, but when it got to NZ
it worked the same.
IIRC, there was lots of stuff that could be configured and tweaked in
the routers.  There was even a little documentation on what some of
those "virtual knobs" affected.  There was essentially nothing on why
you might want to set some knob to any particular position, what
information you needed to make such decisions, or how to predict
results.  Anything could happen.  So there was a strong incentive never
to change the default configuration parameters after the site equipment
left our lab.
I don't remember any concerns about database performance.  But we only
had a hundred or so boxes out in our net.  Perhaps the Network
Management vendors had visions of customers with thousands of their
boxes so we didn't see the same problems.  Also, we only collected the
specific data from sources like SNMP that we expected we could actually
use.  We thought our network was pretty big for the time, spanning 5
continents and thousands of users and computers.  The database we had
worked fine for that.  Compared to other situations, like processing
credit card or bank transactions, it didn't seem like a big load.  I
think it all went into a Sparc.  But there were bigger machines around
if we needed one.
The vendor-supplied tools did provide some monitoring.  E.g., it was
fairly easy to see problems like a dead router or line, and pick up the
phone to call the right TelCo or local site tech to reboot the box.
With alternate routing, often the users didn't even notice.  Just like
in the ARPANET... (Yay packet switching!)
To make things extra interesting, that was the era of "multi-protocol
routers", since TCP hadn't won the network wars quite yet.  Our
corporate product charter was to provide software that ran on any
computer, over any kind of network.  So our net carried not only TCP/IP,
but also other stuff - e.g., DECNet, AppleTalk, SPX/IPX, and maybe one
or two I don't remember.  SNA/LU6.2 anyone...?  Banyan Vines?
Most of our more challenging "network management" work involved fault
isolation and diagnosis, plus trend analysis and planning.
A typical problem would start with an urgent call from some user who was
having trouble doing something.  It might be "The network is way too
slow.  It's broken." or "I can't get my quarterly report to go in."
Often the vendor system would show that all routers were up and running
fine, and all lines were up.  But from the User's perspective, the
network was broken.
Figuring out what was happening was where the ad-hoc tools came in.
Sometimes it was User Malfunction, but often there was a real issue in
the network that just didn't appear in any obvious way to the
operators.  But the Users saw it.
"You say the Network is running fine.....but it doesn't work!"
To delve into Users' problems, we needed to go beyond just looking at
the routers and circuits.  Part of the problem might be in the Host
computers where TCP lived, or in the Application, e.g., email.
We ran the main data center in addition to the network.? There wasn't
anyone else for us to point the finger at.
We used simple shell scripts and common Unix programs to gather
SNMP-available data and stuff it into the database, parsed as much as we
could into appropriate tables with useful columns like Time, Router#,
ReportType, etc.  That provided data about how the routers saw the
network world, capturing status and behavior over whatever period of
time we ran the collector.
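In today's terms, that collection loop would look roughly like the
sketch below.  This is just an illustration: net-snmp's snmpget and
sqlite3 stand in for whatever tools we actually ran back then, and the
host names, community string, OID, and table layout are all made up.

  #!/bin/sh
  # Minimal SNMP poller feeding a table keyed by time and router.
  DB=netmgmt.db
  COMMUNITY=public
  ROUTERS="router-nz.example.com router-uk.example.com"

  # One table with the sort of columns mentioned above.
  sqlite3 "$DB" 'CREATE TABLE IF NOT EXISTS samples
      (time TEXT, router TEXT, reporttype TEXT, value TEXT);'

  while true; do
      NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
      for R in $ROUTERS; do
          # ifInOctets on interface 1, as an example counter to track.
          VAL=$(snmpget -v1 -c "$COMMUNITY" -Ovq "$R" IF-MIB::ifInOctets.1)
          sqlite3 "$DB" "INSERT INTO samples VALUES
              ('$NOW', '$R', 'ifInOctets.1', '$VAL');"
      done
      sleep 300   # poll every five minutes
  done

Run it for however long you want the capture window, and every row lands
in one table you can query by time, router, or report type.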
Following the "Standard Node" approach, wherever we placed a network
node we also made sure to have some well-understood machine on the User
side that we could use remotely from the NOC.  Typically it would be
some kind of Unix workstation, attached to the site's Ethernet close to
the router.  Today, I'd probably just velcro a Raspberry Pi to the router.
I used to call this an Anchor Host, since it provided a stable,
well-understood (by us at the NOC) machine out in the network.  This
was really just copying the ARPANET approach from the early 70s, where a
"Fake Host" inside the IMP could be used to do network management things
like generate test traffic or snoop on regular network traffic.  We
couldn't change the router code to add a Fake Host, but we could put a
Real Host next to it.
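The sort of thing we'd run on an Anchor Host from the NOC was dead
simple.  Something like the lines below, where rsh and the host names
are purely illustrative (and ping's options varied by Unix flavor):

  # From the NOC, ask the Anchor Host at the remote site to probe the
  # path back to headquarters the way a user there would see it.
  rsh anchor-nz 'ping -c 5 mailhost.hq.example.com'
  rsh anchor-nz 'traceroute mailhost.hq.example.com'

If those come back clean, the trouble is more likely up in the Host or
the Application than in the routers and circuits.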