Proprietary Cometh before the Standard

Driving home the other night, I was listening to the latest episode of the podcast “Coffee with Thomas”. This episode had our host, Thomas Jones, interviewing Steve Chambers of ViewYonder (who also has a history at some great vendors!). During the interview, Steve made the following comment:

It amazes me that people criticise Cisco for not being standardised on things that are brand new, as if the standards bodies are innovators. That is not their job. They follow up after things have been invented.

This statement took me aback at first, and I was about to write it off as a case of protecting your own team, but the further I drove (I have a very long commute – about 100km each way) the more I thought about this statement and the evidence – both historical and current.

As network engineers we often love to get into religious arguments about which technology is better, and almost invariably a comparison will be made between a vendor-proprietary solution and an open-standards solution. Whether that is EIGRP vs OSPF, HSRP vs VRRP, or FabricPath vs TRILL, they all seem to come down to a core need:

I have a need right now, and the standards bodies will take years to agree. Do I innovate and press forward with what I can do now, or do I wait?

When HSRP was introduced the market had a need for redundancy of Layer 3 gateway devices. A vendor saw a problem, thought of a solution, and implemented it without waiting for the rest of the market to catch up. And I firmly believe that this is in the best interest of the market. Once HSRP was recognised as a viable solution, other vendors and standards bodies worked on bringing an industry-standard protocol to market. Admittedly, Cisco could have done many things to improve HSRP (proper load balancing is just one idea, though they brought similar options into GLBP), but they brought a solution to market when it was needed.
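To make the gateway-redundancy idea concrete, here is a rough, purely illustrative Python sketch of the kind of election a first-hop redundancy protocol performs: routers sharing a virtual gateway address exchange hellos, the highest priority becomes active, and a standby takes over when the hellos stop. The names and timers here are my own invention, not Cisco’s implementation.

```python
import time
from dataclasses import dataclass, field

@dataclass
class GatewayRouter:
    """Illustrative first-hop-redundancy participant (not real HSRP)."""
    name: str
    priority: int                 # higher priority wins the election
    last_hello: float = field(default_factory=time.monotonic)

def elect_active(routers, hold_time=10.0):
    """Pick the active gateway: ignore peers whose hellos have timed out,
    then take the highest priority (name as a tie-breaker)."""
    now = time.monotonic()
    alive = [r for r in routers if now - r.last_hello <= hold_time]
    return max(alive, key=lambda r: (r.priority, r.name)) if alive else None

# R1 owns the virtual gateway while it keeps sending hellos; if it goes
# quiet, R2 takes over and hosts keep using the same default gateway.
routers = [GatewayRouter("R1", priority=110), GatewayRouter("R2", priority=100)]
print(elect_active(routers).name)   # -> R1
```

Because hosts only ever see that one virtual gateway address, plain HSRP gives you redundancy but not per-host load balancing – which is exactly the gap GLBP later tried to fill.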

Zoom forward to today. The market is seeing a push towards large virtualised domains (dare I say “Cloud”?), and as a side effect we (currently) need to implement these solutions using large Layer 2 domains. As has always been the case with Layer 2, this comes with the “nightmares of spanning tree” that wake so many admins at 2am in a cold sweat.

A problem exists now, and a solution is needed now. As is usual with standards bodies, there is much debate about how to move forward, as everyone has their own opinion. In fact, which standards body do we even look to? The IETF with TRILL, or the IEEE with SPB? Life is slow in the land of standards, and while the boffins are battling it out there is a marketplace looking to roll out a solution to meet a need in their own environment.

So who is tackling these problems today? Well, as in the past, some vendors are claiming “There is no standard to move forward with”, or “We will support it when there is a standard”. This is not innovation – this is doing the same as everyone else. I’m honestly not sure how I feel about a tech company that will only do exactly what everyone else is doing. Maybe that is fine in the sub-$50 SMB switch and router market – but not from somebody I have to justify budgets and ROI for, and stake part of my career on!

Cisco’s FabricPath and OTV solutions may not be standards compliant. They do appear to have taken several of the nice features of the proposed standards and added some other features that will improve the Nexus platform overall, but they are not backwards (forwards?) compatible with either of the proposed standards.

Is this a bad thing? They have gone to market with a solution to a problem we have now. From the accounts I have heard, the technology appears to do what it promised (even if it arrived behind schedule). With the standards “almost ratified”, should they have waited so that their competitors could go to market with a solution at the same time?

The main thing I would like to hear from Cisco is that they will also support the standards-compliant version whenever it finally gets ratified.

I’m sitting in a hotel room in Hong Kong right now, with a whole city to explore, so I will leave it here for now.

As always, thoughts, flames and comments welcome.


DCB: How to Engineer your way out of a poor architecture decision!

I recently gave a presentation to the New Zealand Network Operators Group (NZNOG) 2011 conference on “Data Centre 3.0”. During my research over the last 8 months, coupled with the fact-checking I did while creating the slides, I kept asking myself:

“Would we need all these protocols if we, as an industry, had made better technology implementation decisions?”

I understand the background and requirements for some of the different technology proposals, particularly Layer 2 multi-path and the various Data Centre Bridging (DCB) QoS standards, but I can’t help but feel that we are trying to bring features of the higher-layer protocols down into Layer 2.

Back when I started studying networking (probably around mid-2001, when I first obtained my CCNA), the CCNA curriculum was quite clear on the OSI model and how each layer had a very particular purpose, with clear functional definitions:

  • Layer 2 – Communication of hosts on the same media segment
  • Layer 3 – End-to-End addressing and communication
  • Layer 4 – Connection-oriented traffic via TCP (including Window Scaling), or connectionless traffic via UDP
  • Layers 5 to 7 – Application Layer with added session control and possible tracking of per-flow statistics

From early on we had options for end-to-end communication, options for scaling traffic based on network conditions (e.g. TCP Window Scaling), and various iterations of Layer 3 Quality of Service (TOS, DSCP, etc.).
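As a small illustration of the point that these knobs already exist above Layer 2, here is a Python sketch (my example, not anything from the CCNA material) that marks a socket’s traffic with a DSCP value via the IP TOS byte; window scaling, retransmission and ordering are all handled by the TCP stack with no application involvement. The DSCP value and addresses are illustrative only.

```python
import socket

# DSCP EF (46) sits in the top six bits of the TOS byte, so the value
# actually written to the header is 46 << 2 = 0xB8.
DSCP_EF = 46

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# IP_TOS is available on Linux and most Unix-like stacks.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF << 2)

# Once connected, TCP window scaling and retransmission come "for free"
# from Layer 4 – nothing below Layer 3 had to be taught about QoS or loss.
# sock.connect(("192.0.2.10", 5000))    # 192.0.2.0/24 is a documentation range
# sock.sendall(b"payload")
```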

As the popularity of Ethernet switching (Ivan: You know I mean bridging!) continued to grow, and with the majority of Layer 2 networks standardising on Ethernet as the de facto Layer 2 standard, we started to see individual Layer 2 domains span larger and larger areas. No longer were these simply a series of hosts on a shared bus segment (e.g. 10Base2), or even a simple hub-and-spoke segment on a single hub/bridge (e.g. 10BaseT), but rather a large interconnected mesh of bridges spreading across floors, buildings and campuses.

Now we needed a way of classifying traffic based on priorities that would be consistent across these large Layer 2 domains. This was addressed in the 802.1p standard, which allowed priority classification on 802.1Q trunk links – but did nothing for access ports.
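For context on where that priority actually lives: 802.1p is just the three-bit Priority Code Point carried inside the 802.1Q tag, so it can only exist where a VLAN tag exists – which is exactly why untagged access ports miss out. A small sketch of how the 16-bit Tag Control Information field is assembled (values illustrative):

```python
def build_vlan_tci(pcp: int, dei: int, vlan_id: int) -> int:
    """Assemble the 16-bit 802.1Q Tag Control Information field:
    3-bit priority (802.1p PCP), 1 CFI/drop-eligible bit, 12-bit VLAN ID."""
    assert 0 <= pcp <= 7 and dei in (0, 1) and 0 <= vlan_id <= 4095
    return (pcp << 13) | (dei << 12) | vlan_id

# Example: priority 5 (often used for voice), not drop-eligible, VLAN 100.
tci = build_vlan_tci(pcp=5, dei=0, vlan_id=100)
print(f"TCI = 0x{tci:04x}")   # -> TCI = 0xa064
```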

Various proposals have been put forward in an effort to address the need for end-to-end QoS control of Ethernet traffic. One of the driving forces behind this is the requirement for “lossless Ethernet” in converged storage networks.

The history of SCSI, FibreChannel and FCoE is documented elsewhere, but needless to say some bright spark decided the best solution would be to embed SCSI commands directly into Layer 2 (plus some L2 headers of course), without building in any error or packet-loss checking. Had they chosen instead to use an IP-based protocol, they could easily have used the functions already existing in TCP/IP to detect these problems; instead, boffins and propeller-heads are now busily creating an array of standards to try and combat the fear of dropped packets in storage networks. All of this adds up to new hardware, new chips, and more places for things to break!
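To make the contrast concrete, here is a rough Python sketch of the two approaches – the field values and addresses are made up, and this is nowhere near a working initiator. FCoE puts a Fibre Channel frame straight into an Ethernet frame (Ethertype 0x8906), so nothing in the frame itself can recover from a drop; an IP-based transport such as iSCSI hands the same payload to a TCP socket and lets Layer 4 detect and retransmit lost segments.

```python
import socket
import struct

def fcoe_style_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """Sketch of the FCoE idea: an FC frame carried directly in Ethernet
    (Ethertype 0x8906). If the frame is dropped, nothing here notices –
    hence the push for "lossless Ethernet" underneath it."""
    ETH_P_FCOE = 0x8906
    return dst_mac + src_mac + struct.pack("!H", ETH_P_FCOE) + fc_frame

def tcp_style_send(host: str, port: int, scsi_payload: bytes) -> None:
    """Sketch of the IP-based alternative: the TCP stack already provides
    loss detection, retransmission and in-order delivery."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(scsi_payload)

frame = fcoe_style_frame(b"\x01\x02\x03\x04\x05\x06",
                         b"\x0a\x0b\x0c\x0d\x0e\x0f",
                         b"illustrative FC frame bytes")
# tcp_style_send("192.0.2.20", 3260, b"illustrative SCSI command")   # 3260 = iSCSI
```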

On top of this, we have the wonderful phenomenon of “Virtualisation”. With the poor architecture choice of a single vendor (and those that copied them), we now have an army of SysAdmins shouting the mantra of Layer 2 Data Centre Interconnect. Not only do we need to have multiple locations for redundancy, but they must be in the same Layer 2 segment for this design to work correctly!

Traditional (and sensible) network design would put each of these locations into separate IP subnets, and utilise IP routing for clear separation of the distinct networks. Now vendors of network equipment – including load balancers and security devices – are scrambling to re-architect their products to support this new design paradigm.

Greg Ferro and I were chatting a while back about all the things people are trying to tack onto “Ethernet” – QoS, OAM, end-to-end communication, etc. – and this question came up:

How far do you go before it stops being Ethernet?

Why is it that we are continually making a rod for our own backs? When do we stop trying to extend protocols with functions they were not designed for, especially when we already have solutions available to us elsewhere?

I’m not sure where this is all headed, but with the growth of Layer 2 networks spanning geographic locations, fuelled by the growth of virtualisation and converged storage networking, are we treading down a well-worn path to failure? What costs will there be for organisations when they need to re-evaluate the designs currently considered “Best Practice” by certain vendors?

As always, your thoughts (and flames) are welcome 🙂
