**** Please note that these are my own thoughts and observations and should not in any way be taken to be the opinion(s) of my employer. Additionally, this is a rather long post, so please bear with me. I promise not to waste your time by babbling incessantly about non relevant things.
Finally! After many hours spent sifting through vendor websites and reading various documents, I have finished my comparison. If there’s one thing I came away with in this process, it’s that some vendors are better than others at providing specifics regarding their platforms. By far, Juniper was the best at providing in depth documentation on their hardware and software. Although Cisco has a ton of information out there about the Nexus 7000, I found that a lot of it was more on the architecture/design side and less on the actual specifics of the platform itself. Some vendors still hide documentation behind a login that only works with a valid support contract. In my opinion, that’s not a good thing. I think most people research products before they decide to buy, so why hide things that are going to cause roadblocks for people like myself trying to do some initial research? I’ve read MANY brochures, white papers, data sheets, third party “independent” tests(meaning a vendor paid for a canned report that gives a big thumbs up to their product), and other marketing documents in the past couple of weeks. I did not actively seek out conversations with sales people in regards to these products. I did have a couple of conversations around these products and not all the people I talked to were straight sales people. Some were very technical. However, I wanted to go off the things that the websites were advertising. Once the list is narrowed down to 2 or 3 platforms, the REAL work begins with an even deeper dive into the platforms.
I wish I could display the whole thing on this website and have it look pretty. Unfortunately, I don’t know how to do that and make it look nice. Remember, I get paid for networking stuff and not my web skills! In consideration of that, I have attached a PDF file of my comparison chart. I have the original in Excel format, but WordPress wouldn’t allow me to upload it. If you want a copy, I can certainly e-mail it to you. You can send me your e-mail address via a direct message in Twitter. I can be found here.
What IS included in the spreadsheet.
I would love to say that I did all of this work for the benefit of my fellow network engineers, but I would be lying if I said that. I built this out of a specific need that my employer has or will have in the coming months/years. Due to that, some of the features that were important to me may not be important to you. If you find yourself wondering why I included it, just chalk it up to it being something that I considered a
requirement. Having said that, it would be selfish not to share this information with you, so take it for what it’s worth.
When it comes to the actual numbers of things like fan trays and power supplies, I tend to build out the chassis to the full amount it will hold. If it can take 8 power supplies, I will probably use 8. Same with fabric
modules. I like to plan with the belief that I will fully populate the chassis at some point, so I want to have enough power, throughput, and cooling on board to handle any new blades. All chassis examined have the
ability to run on less than the maximum number of power supplies.
When it comes to throughput rates, you have to distinguish between full duplex numbers and half duplex numbers. They don’t always specify which is which, so you have to dig through a lot of documentation to figure out what they are really saying. Thankfully marketing people tend to favor the larger numbers so more often than not, the number given is full duplex. In the case of slot bandwidth, I used the half duplex speed. The backplane numbers are all full duplex.
What IS NOT included in the spreadsheet and why.
If I were to include every single thing these switches support, the spreadsheet would be 10 times bigger than it already is. There are quite a few things that I consider to be basic requirements. These basic things
were left out of the sheet to avoid cluttering it up with things you probably already know. For example, does the switch support IPv6? This should be a resounding yes. If it doesn’t, why in the world would I even
consider it? The same can be said with routing protocols. They all should support OSPFv2 and RIPv2 at a minimum. Most, if not all support IS-IS and BGP as well. It is also worth pointing out that I may not even need this switch to run layer 3. I am looking for 10Gig aggregation and am not necessarily concerned about anything other than layer 2. All of these switches also support QoS. Perhaps they do things a little differently
between each switch, but the basics are still the basics and I don’t really need a billion different options when it comes to QoS. That may change in a few years, but for now, I am not looking at running anything
other than non-storage traffic over these switches.
I think you see my point by now. I could go on and on about what isn’t included. If it is something well known like SSH for management purposes, I don’t need to include it in the list. It’s a given.
Special note on the TOR(Top of Rack) fabric extension.
While I primarily need 10Gig aggregation, another bonus is the ability to have 1Gig copper aggregation as well. However, I don’t want it all coming back to the chassis itself. The Nexus 7010 has the ability through the Nexus 5000’s(of which I already own several) to attach Nexus 2000 series fabric extenders that function as top of rack switches(although it’s not REALLY a switch). This is a nice bonus feature as I can aggregate a lot of copper connections back to 1 chassis without all the spaghetti wiring that is commonly seen in 6500’s and 4500’s. In the case of Brocade and Force10, they actually have the TOR extensions as nothing more than MRJ-21 patch panels. With 1 cable(which is the width of a pencil) per 6 copper ports, the amount of wiring coming back to the chassis is reduced tremendously.
Additionally, there is no power consumption at the top of the rack like there is with the Nexus 2000’s and it is a direct link to the top of rack connections unlike the Nexus model where I have an intermediate 5000 series switch in between.
One final note. The HP/H3C A12508 is listed on the HP site as the A12508, but when you click into the actual product page, it is listed as the S12508. These terms can be mixed and matched and mean the same chassis. I have chosen to use A12508 as the model number as much as possible in this post, but my previous post that mentioned the various switches used the letter “S” instead of “A”.
I plan on posting a few more thoughts on this process as it pertains to specific platforms. I was awed by several of the platforms, not just by the hardware itself, but by the approach the company is taking to the data center in general. Any of these platforms will do the job I need them to do. Some will do that job a lot better than others. As for cost, I have only seen numbers on a few of the platforms. That’s something that is important, but not the most important. You can read my previous post on this for more clarification on what my thought process is.
Remember that I am not claiming to be an expert in regards to any of these platforms. I have done many hours of research on them, but there is a chance that some information in this PDF file will be wrong. If you see any glaring errors, please let me know. I promise you won’t hurt my feelings. If anything is marked “Unknown”, rest assured that I looked at every possible piece of literature on the website that I could reasonably find. If you managed to read this far in the post, the file is below. Enjoy!
*****Update – The Juniper 8200 series does support multi-chassis link aggregation. It just requires another piece to make it work. The XRE200 External Routing Engine gives the 8200 this capability. Thanks to Abner Germanow from Juniper for clarifying that!
I have an increasing need for 10Gig connectivity. Although I may have enough ports today, I have to plan for the future. While I can easily buy some more Nexus 5000 series switches, I would rather have a more capable platform. As a heavy user of Cisco hardware, the logical choice was to use the Nexus 7000 series line. It is a platform that I can grow into over time. I don’t need the big 7018, so the 7010 will suffice. My company has a great relationship with Cisco and our sales rep and local engineer are top notch. No hard selling on their part so the relationship is, in my opinion, a very good one.
Having said that, I also have to point out that I have an obligation to my company to ensure the best product is selected. It would be irresponsible of me to make a technical decision of six digit magnitude and have it come up short in features. I need to make sure the product we select is the best fit for our particular needs. That doesn’t mean the Nexus 7010 is the wrong device. For all I know it will be the best thing for us. Of course, I still have to do my due diligence.
Over the past several weeks, I have been looking over some of the competition. Granted, I still have to spend a lot more time looking at Nexus 7010 competitors, so I am nowhere near done. I’ve been really busy with other things, so I haven’t been able to dedicate as much time as I thought I would to figuring this out. What I have done so far is narrow down a list of vendors and the appropriate product that can compete with the Nexus 7010. Here’s a short list of the features I am looking to compare:
1. 10Gig port count across the entire chassis.
2. 10Gig port/blade/module oversubscription rates. (Some products may not have this issue.)
3. Size of chassis.
4. Power consumption.
5. Layer 2 features(STP, TRILL, proprietary)
6. Layer 3 features(Standard based protocols, proprietary protocols)
7. Cost(Not the main driver, but it is a factor to consider after the technical merits.)
8. Product age(Is it a new platform, or has it been around for more than a year or two?)
9. Focus of the company
10. Size of the company
11. Support structure of the company
12. Code updates(Is there a defined release cycle?)
13. Availability of documentation from the vendor.
14. Connectivity options other than 10Gig(1Gig copper ports or some type of TOR integration aka Nexus 2000’s?)
Obviously there are going to be other things to consider. I also was very vague on the L2 and L3 feature requirements. That was on purpose. As I go through this process, I will be able to elaborate more on the particular L2/3 features that are needed vs those that are available.
Here’s the models I am comparing:
Cisco Nexus 7010
Brocade NetIron MLX 16
HP S12508 – This was recently changed from the S9512E as it was recommended by someone from HP that it was a better comparison to the Nexus 7010.
It is pretty hard if not downright impossible to find competing platforms that have exactly the same specs. I tried to find the closest match in terms of 10Gig port capability since that is the main driver behind this project.
More posts to come soon on this. I am still trying to decide if I want to do a post on each platform individually or do a few posts focusing on certain features that they all have in common. Any thoughts on this are appreciated.
It seems that if you read enough technical stuff, you are bound to find things you either disagree with, or know to be untrue. At least I think they are untrue until I validate my thoughts with the applicable standard or bounce my thoughts off of others within the IT community. I understand that it is VERY difficult to write books, white papers, articles, etc with a technical focus and have them turn out well. A lot of editing has to take place. The target audience has to be considered as well. In short, it is a lot different than writing a fiction novel or short story as when it comes to technical stuff, there’s a bunch of people out there second guessing every sentence you write and diagram you create. With fiction writing, the entire story is in someone’s head and accuracy is not a factor. Let me also state that I have tremendous respect for most technical authors.
I have been doing a lot of reading on switching lately. I’m lazily making my way to the CCIE R/S lab attempt so I have been reading the 3560 command reference guide. It’s the Good Will Hunting approach. In addition to that, I decided to supplement my lunch today with some reading of End-to-End QoS Network Design. So, now that I am thinking about switching and QoS, I came home after work and started in on my so far unused copy of Cisco LAN Switching Configuration Handbook. There’s a QoS chapter. Hooray! The best of both worlds.
I get to the end of the third page in the chapter and run into a problem. That’s page 223 if you have the printed version handy. I came across this tidbit:
Classes 1 through 4 are termed the Assured Forwarding (AF) service levels. Higher class numbers indicate higher-priority traffic. Each class or AF service level has three drop precedence categories:
* Low (1)
* Medium (2)
* High (3)
Traffic in the AF classes can be dropped, with the most likelihood of dropping in the Low category and the least in the High category. In other words, service level AF class 4 with drop precedence 3 is delivered before AF class 4 with drop precedence 1, which is delivered before AF class 3 with drop precedence 3, and so on.
Did you spot any errors? At first, I thought I was reading it all wrong. Maybe I misinterpreted what the authors were saying. AF classes have different priority levels? Hmmmm. I thought they were the same. Then of course, there’s the issue of the book stating that the higher the drop precedence, the less likely it will get dropped. In other words, the book is saying that AF43 will get dropped less than AF41 along with a repeat of the logic stated earlier that AF41 comes before AF33 in terms of priority. At this point, I am really starting to get confused. That can’t be right. I’ve never heard anything like that before.
Maybe it was an error that was fixed in the errata that Cisco Press usually posts on their site. I checked, and there were no corrections issued for that particular book on the Cisco Press site. I also went over to Amazon and read through the reviews people had written. A lot of times, people will list errors in their book reviews, so it is a good source of information when determining whether or not to buy the book. There was nothing there that alluded to this potential error.
The next step was to check the RFC along with the End-to-End QoS book I am reading as well. RFC 2597 is entitled Assured Forwarding PHB Group. Or, AF per hop behavior group. Remember that in a DiffServ environment, QoS is done on a per class basis. There is no end to end guarantee for an individual flow so to speak. That’s what IntServ is for. One might even say that a per hop behavior is indicative of each hop being able to do whatever they want to traffic based on the class it is in. The router could care less about the entire flow. It looks at DSCP markings and makes a decision based on that. Maybe that’s being too generic, but that is the way that I understand DSCP, PHB, etc. The RFC said the following:
Section 1 Paragraph 3
Within each AF class IP packets are marked (again by the customer or
the provider DS domain) with one of three possible drop precedence
values. In case of congestion, the drop precedence of a packet
determines the relative importance of the packet within the AF class.
A congested DS node tries to protect packets with a lower drop
precedence value from being lost by preferably discarding packets
with a higher drop precedence value.
Section 2 Paragraphs 1 and 2
Assured Forwarding (AF) PHB group provides forwarding of IP packets
in N independent AF classes. Within each AF class, an IP packet is
assigned one of M different levels of drop precedence. An IP packet
that belongs to an AF class i and has drop precedence j is marked
with the AF codepoint AFij, where 1 <= i <= N and 1 <= j <= M.
Currently, four classes (N=4) with three levels of drop precedence in
each class (M=3) are defined for general use. More AF classes or
levels of drop precedence MAY be defined for local use.
A DS node SHOULD implement all four general use AF classes. Packets
in one AF class MUST be forwarded independently from packets in
another AF class, i.e., a DS node MUST NOT aggregate two or more AF
Okay, so the RFC is fairly clear that the lower the drop precedence, the safer the traffic should be. Routers and switches should drop AFx3 before AFx2, and drop AFx2 before AFx1. Additionally, we learn that the 4 AF classes are general use. There is no hierarchy. AF4x is no different than AF2x. We may give AF4x more bandwidth in a given QoS policy during periods of congestion, but we can also do the same for AF3x, AF2x, or AF1x traffic if we want to.
Looking for the same type of information in the End-to-End QoS book yields the following taken from chapter 3, Classification and Marking Tools, in the “Marking Tools” section:
DSCP values can be expressed in numeric form or by special keyword names, called per-hop behaviors (PHB). Three defined classes of DSCP PHBs exist: Best-Effort (BE or DSCP 0), Assured Forwarding (AFxy), and Expedited Forwarding (EF). In addition to these three defined PHBs, Class-Selector (CSx) codepoints have been defined to be backward compatible with IP Precedence (in other words, CS1 through CS7 are identical to IP Precedence values 1 through 7). The RFCs describing these PHBs are 2547, 2597, and 3246.
RFC 2597 defines four Assured Forwarding classes, denoted by the letters AF followed by two digits. The first digit denotes the AF class and can range from 1 through 4. (Incidentally, these values correspond to the three most significant bits of the codepoint, or the IPP value that the codepoint falls under.) The second digit refers to the level of drop preference within each AF class and can range from 1 (lowest drop preference) to 3 (highest drop preference). For example, during periods of congestion (on an RFC 2597–compliant node), AF33 would be dropped more often (statistically) than AF32, which, in turn, would be dropped more often (statistically) than AF31. Figure 3-7 shows the Assured Forwarding PHB encoding scheme.
More validation that I was correct in my initial hunch. Even though I was fairly certain the LAN Switching Configuration Guide book was wrong, I wanted to double-check with perhaps one of the greatest tools available. Twitter. I got responses from 3 different people, which is saying something considering it was about 10PM CST. Thanks to Steve Shaw, Steve Rossen, and Andrew vonNagy for their assistance in validating my assumptions. Steve Shaw pointed me to the exact paragraph in RFC 2597 and Andrew mentioned 1 of the 3 new QoS videos that Kevin Wallace created and posted on YouTube which discuss the AF class and drop precedence. Check it out. It’s a fantastic video.
Now, you might be wondering why I am making such a big deal out of this. Is it really important? In this case, yes. Understanding the AF class and drop precedence is vital to one’s understanding of DSCP as a whole. If you get this wrong, it could bite you down the road when designing a QoS policy for a large network. It can bite you when troubleshooting a QoS problem. Some things have to be right every single time they are put in print. QoS is not some super-easy technology that can be mastered in a half-day. Auto-QoS is a great functionality on hardware. I used it this past weekend when configuring 3750 POE switches for use with Cisco phones. However, typing a command and understanding what the ramifications are of that command are 2 different things.
Thank goodness for multiple sources!
Think about something you know a fair amount about. It can be anything in the realm of networking. Now imagine yourself explaining it to someone. Not just anyone. Someone who has a decent grasp on it, but maybe not all of the particulars. Can you explain it to them on the fly without stammering and stuttering your way through it?
I am a Twitter addict. I use it primarily for IT related stuff. There are plenty of valuable links and comments that show up on a given day. Amazing things. Things I never thought about. Comments that come from people who’s books I have read. Comments that come from 4 and 5 time CCIE’s. Comments that come from people who’s podcasts I listen to every week driving to and from work. In short, it is almost as if you know them on some weird Internet non-stalker type level.
Today I saw and even somewhat participated in a discussion about EIGRP. That got me thinking. I like EIGRP. I think it’s neat as far as routing protocols go. It doesn’t have the whole “standards” thing going for it like OSPF or IS-IS. It doesn’t run the Internet like BGP. There aren’t very many books written about it. The CLI options are a lot smaller when compared to OSPF and BGP. The list goes on and on. The more I thought about it, the more I realized I don’t have the complete understanding of it that I wish I did.
Replace EIGRP with about 20 or 30 other networking technologies/protocols and I can make the same argument. I may know all the little acronyms or terms that go along with that technology or protocol, but can I break it down and explain it to someone who sort of understands it and just needs the finer points? Isn’t that what separates the really good engineers from the average ones?
Back to EIGRP though. I understand metric calculation. I understand K values. I understand several other things about EIGRP that go beyond the CCNP level and possibly approaching, maybe even exceeding, CCIE level. I am not bragging. I’ve just put in the hours from an “academic” standpoint, which translates to reading a lot of books, design guides, whitepapers, etc about EIGRP. However, I find myself struggling to come up with all of the arguments for why EIGRP is a hybrid routing protocol compared to a distance vector protocol and vice versa. There are people out there who swear it is one or the other. That should be a relatively simple thing to discern. It makes me think I really don’t understand EIGRP as well as I think I do. Granted, you can NEVER know it all about anything in the IT field, but we still have to try. We read questions on forums from people just starting out with something like EIGRP and think: “How could you not know that? Everyone knows that K1 is bandwidth and K3 is delay.” Maybe we pass by a CCNA book at the bookstore and chuckle at how trivial the description is of EIGRP. “What? You don’t even mention stub routers or how to avoid SIA conditions?” Admit it. You do it. If you don’t, then you are truly the example of a good engineer.
What to do about this? Well, I should study more. I should study and lab so much that when a CCIE walks up to me and says: “How does EIGRP do this?”, I can answer them in a fair amount of detail and even break out the whiteboard and draw it out. Or, crank out a config in a few minutes. Imagine if you knew the protocol or technology so well that you could just spew forth tons of factual information about it? Imagine if you could sit down with a blank piece of paper and fill it up on both sides with information about something like DWDM, 802.11n, PPP, or HSRP. What would that be like? Not just know from an academic standpoint, but be able to apply it to real world scenarios. There is tremendous value in that.
Just something to think about. Imagine having to teach cooking to Emeril. Or martial arts to Chuck Norris. Or basketball to Michael Jordan. Would you want to know your stuff? You betcha. Think about the things you deal with in the networking world and apply the same philosophy to it.
When I begin to understand something well enough to teach it to people that understand it as well and not have them laugh me out of the room, I will be at the level I want to be at. Impossible to do with all things network related, but definitely achievable to do with a dozen or so things. Perhaps the hardest part of it is dedicating the time to achieve that level of proficiency.
I’m going to revisit EIGRP over the next couple of weeks and try to increase my level of understanding even more. Then, I will read someone’s blog post or Twitter comment and realize how little I actually know and go back and do it all over again. Frustrating? Sure, but I will take that any day over a job where you can learn it all in a couple of months. Happy learning!