while i won't try to pass my advice off here as sage, i have a
considerable amount of experience working in very large networks w/
carriers (disclaimer: i work for a very large network equipment
vendor, likely the one you're discussing based on the model
numbers). i would agree that your assessment here is dead on: it's
a process thing.

bear in mind, vendors come up with default configurations that
attempt to address the widest swath of deployment scenarios. while i
wouldn't advocate some of the less reasoned responses for dealing
w/people who attempt to be "helpful", putting some process in place
and configuring the upstream equipment in a manner that protects
against this situation will make your life a lot easier going
forward.

On Jan 27, 2006, at 11:26 PM, Torleiv Ringer wrote:
> Hello,
>
> Here is something that I must ask for guidance from the more
> experienced
> administrators. I must say up front that I am not a networking
> specialist. My forte is in our "custom" system.
>
> We have a situation at work where someone misconfigured a switch
> and it
> caused some major failures that were hard to trace. We have a unique
> environment where we route custom packets through a brand-new Cisco
> 4500. We
> have new Cisco 3560 switches that distribute links to each of two
> rooms that
> have "custom" digital equipment. This is a new setup for us, and is
> mission-critical. We have moved from an analog network to this new
> digital UDP based system. In total we have about 14 3560s, and
> plugged into
> these are about 200 other "custom" switches that are vendor-specific.
>
> The point of contention is that someone thought they were doing the
> right thing and jumped in where they were not asked to by installing a
> new 3560.
>
> *) This person set the switch up "hot" (on the network)
> *) They used two uplink ports, intending on ganging them togther
> *) They did not properly set the ports into a channel-group
>
> This made the 3560 seem like a router and flooded all of our custom
> switches with so much traffic that the devices could not effectively
> talk to each other. This would be sorta OK in a TCP environment,
> but we
> have a UDP based system that relies heavily on very low latency.
>
> So here is my question:
>
> I am being pushed by the higher-ups to come up with a software
> solution
> for this problem, which I feel is a process problem. The process
> should
> be to NOT SET THE SWITCH UP ON THE F**KING NETWORK! And to have
> another
> person verify the setup prior to bringing up a new piece of
> equipment on
> the network that is mission-critical. Beyond that the person just went
> and did it without coordinating with anyone.
>
i would go a step further and put any existing infrastructure gear
into a port-shut mode. i know you can lock these network elements
down to prevent someone from merely attaching gear to the network and
interoperating with it. further, you can turn TACACS on and use
privilege levels to allow folks to view things w/o necessarily giving
them the keys to the kingdom on the network. additionally, when
people know that command logging and accounting are turned on, they're
much less likely to do, well, dumb stuff.

depending on your configuration and situation, i would recommend
going through your configurations and using the 'shutdown' command on
unused ports, so that coordinated action is required to bring
anything online in the network. it's a little more work, but it
usually results in very stable configurations and in people not glibly
plugging things into the network. in short, there are some very
reasonable hardening guidelines that you can follow to put a little
structure around things and prevent meltdowns caused by cockpit error.
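
if it helps with the "going through your configurations" part, here is a
rough python sketch of that audit. it is only an illustration, not a
vendor tool: the function name, the ports_in_use set and the sample
config are all made up, and it assumes the saved config looks like an
IOS-style running-config where the commands under each "interface" line
are indented by a space.

```python
import re

# Matches the first line of an interface block, e.g. "interface GigabitEthernet0/1".
INTERFACE_RE = re.compile(r"^interface\s+(\S+)")


def unused_ports_left_enabled(config_text, ports_in_use):
    """Return interfaces that are not in ports_in_use and have no 'shutdown' line.

    Hypothetical helper: ports_in_use is whatever list of ports you have
    agreed (through your own process) are allowed to be live.
    """
    offenders = []
    current = None    # name of the interface block being read, if any
    is_shut = False   # whether that block contains a bare 'shutdown' line

    def close_block():
        # Record the block just finished if it is enabled but not expected to be.
        if current is not None and not is_shut and current not in ports_in_use:
            offenders.append(current)

    # The trailing "!" guarantees the last block in the file gets closed.
    for line in config_text.splitlines() + ["!"]:
        match = INTERFACE_RE.match(line)
        if match or not line.startswith(" "):
            # Any non-indented line ends the previous interface block.
            close_block()
            current = match.group(1) if match else None
            is_shut = False
        elif line.strip() == "shutdown":
            is_shut = True
    return offenders


# Made-up sample config, for illustration only.
SAMPLE = """\
interface GigabitEthernet0/1
 description uplink to core
 switchport mode trunk
!
interface GigabitEthernet0/2
 shutdown
!
interface GigabitEthernet0/3
 switchport mode access
!
"""

if __name__ == "__main__":
    print(unused_ports_left_enabled(SAMPLE, {"GigabitEthernet0/1"}))
    # prints ['GigabitEthernet0/3']
```

anything it reports is a port that would come up the moment someone
plugs into it, so either shut it or add it to the list of ports you
expect to be live.
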
> Should I bow to the pressure and force our vendor to "fix" their
> software to be able to function in an abnormal network setup? This
> would
> allow certain folks to save face while straining our relationship with
> out vendor.
without more details on what your operating environment is here and
what the objectives are, i would say that this is difficult and not
likely to be a successful undertaking. vendors don't lightly change
default configurations and in the scenario you've outlined, it
doesn't sound like you're doing anything that's
> Or
>
> Should I instill a process such that this would never happen again and
> put the lock-down on people who configure devices in/on this network?
> This involves disallowing the people who are supposed to be the
> networking specialists from configuring the "custom" network.
>
> Or
>
> Is there a Cisco configuration that can be used to disallow "unknown"
> routers on the VLAN? This seems unlikely to me.
there are some features that you can use to require that network
elements authenticate themselves onto the network, and some common
best practices that accomplish something very similar. these don't
usually work at the VLAN level (802.1X, for example, works at the
port level), but you can do some simple things to prevent total
topology meltdown in a purely switched network.
> It's one or the other at this point, as we have lost a lot of
> credibility in this situation, and we must move forward with
> implementation. This is the second time now that a misconfigured
> switch
> has been setup hot on the "custom" network.
>
> Has anyone had a similar situation?
>
> Thanks in advance for your sage advice.
>
>
> p.s. No, at this point I cannot divulge what the "custom" is.
{ snipped - misc. signatures }
--
steve ulrich sulrich at botwerks.org
PGP: 8D0B 0EE9 E700 A6CF ABA7 AE5F 4FD4 07C9 133B FAFC