Days 3 and 4 did not disappoint! I don’t know if I stated this in the earlier posts, but the days basically consisted of lecture in the morning and labs after lunch. I REALLY, REALLY enjoyed the lecture portion. Again, I have to state that the instructor was fairly knowledgeable in regards to ACE, so he was able to actually teach instead of regurgitate a slide deck like other classes I have been in. That makes all the difference in the world. As for the labs, I guess they do some good if you have not had much experience with the ACE CLI. We did not do any labs using the built in GUI or ANM. The problem I have with labs is that they are a very canned and controlled environment. You end up just going through the motions without actually soaking up what it is that you are doing. Ideally, the labs would need to be tailored to your environment to have the greatest effect. This of course, is not realistic. Having said that, I am sure there are some people who get something out of it. My opinion was shared by others in the class in regards to the effectiveness of the labs, so I am not the only one who feels this way. However, the effectiveness of the lecture portion completely overshadowed any shortcomings of the lab portion.
In the interest of brevity, I am going to touch on the things I thought were the most interesting, but I don’t want this post to be so long it requires a coffee break to finish.
Route Health Injection – On a simplistic level, RHI allows the ACE to inject a host route into the network. You would use this to advertise the VIP (virtual IP) that clients use to connect to a server farm. If the server farm is not available due to any number of issues, the host route can be automatically removed from the route table and no longer advertised. The alternative is to simply advertise the VIPs as part of a regular subnet advertisement like you do with any other VLAN or subnet. Again, I am simplifying this and need to point out that this is NOT something that is specific to Cisco ACE. Other vendors implement similar technologies.
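To make the RHI idea a little more concrete, here is a minimal Python sketch of the decision it describes: advertise a /32 host route for a VIP only while its server farm is healthy. The function name, the health dictionary, and the addresses are all made up for illustration. This is not how ACE implements it internally, just the logic as I understand it.

```python
import ipaddress

def advertised_routes(vips, healthy):
    """Return /32 host routes only for VIPs whose server farm is healthy.

    `vips` is a list of VIP addresses as strings. `healthy` maps a VIP to
    True/False. Both are hypothetical inputs, not anything ACE exposes.
    """
    routes = []
    for vip in vips:
        if healthy.get(vip, False):  # unknown health is treated as down
            routes.append(str(ipaddress.ip_network(f"{vip}/32")))
    return routes

# Example: the second server farm is down, so its host route is withdrawn.
print(advertised_routes(
    ["10.1.1.10", "10.1.1.20"],
    {"10.1.1.10": True, "10.1.1.20": False},
))
```

With a plain subnet advertisement, both VIPs would stay reachable in the routing table whether or not anything behind them was alive.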
KeepAlive-Appliance Protocol (KAL-AP) – There are a few variations of the Cisco ACE, and one of those is the Global Site Selector (GSS). Its purpose is simply to provide higher level load balancing between data centers. Basically, it is a load balancer of load balancers. By using KAL-AP, the GSS can query VIPs at multiple data centers and determine which one is the best fit to send traffic to.
There are a couple of things that the ACE 4710 appliance does that the ACE module cannot. I asked the question as to why this is the case and was told that the ACE appliance has a different architecture than the module. It has certain functionality that might come to the module at some point, but for now is restricted to the appliance. These extra functions really revolve around the ACE appliance being able to cache certain HTTP objects and speeding up the process of delivering a web page to an end user. A fair amount of detail on this can be found here.
It sure seems as if I cut back on the information from days 3 and 4 when compared to 1 and 2. I did. Although there were plenty of interesting things covered in the past 2 days of class, a lot of those things would take a while to explain and draw out via diagrams. That’s also assuming that I actually understand these things well enough to explain them in depth.
That brings me to a more philosophical point in regards to the type of niche product that Cisco ACE is. While it would be great if you knew the CLI on ACE backwards and forwards, it really isn’t necessary. What is necessary is an understanding of what a platform like ACE is capable of. I sat in a meeting today in which some developers wanted ACE to perform health checks on a server outside of a load balance pool and use the results of that query to determine whether or not servers should be removed from a load balance pool. Basically, they wanted to do something that ACE is not really designed to do. Spending 4 days in a classroom learning all about ACE gave me the information needed to have a productive meeting with these developers today. I was able to answer their questions and give better guidance than I would have a couple of weeks ago. I don’t know all the commands for ACE. I will still have to use the configuration guides to look things up now and again. The important thing is that I understand the capabilities and limitations of the ACE load balancer a lot better today than I did prior to taking the ACE class. My main goal is to know what it can and cannot do in order to design anything requiring load balancing properly. To me that is more important than memorizing commands.
Day 2 of ACE boot camp did not disappoint! Another full day of lecture and labs. We covered the following topics:
Modular Policy CLI
Managing the ACE Appliance and Service Module
Security Features
Layer 4 Load Balancing
Health Monitoring
I’ll cover some general things about each topic and go into additional details on the points I thought were interesting.
Modular Policy CLI – ACE classifies which traffic it will load-balance based on policy maps, which are comprised of class maps. If this sounds a lot like how you build QoS policies on IOS based routers, it is. The big difference is that ACE is far more restrictive in what those policies contain.
Managing the ACE Appliance and Service Module – Like most Cisco devices, ACE can be managed in a number of different ways: Telnet, SSH, HTTPS, and SNMP. You can even use the XML API if you want. With SNMP, versions 1 and 2 cannot understand contexts. SNMP version 3 can. In order for SNMP versions 1 and 2 to work with contexts, you have to use the community string format of “community@context” where “community” is the community name and “context” is the name of the virtual context. When the GET, SET, or whatever SNMP action you choose hits the ACE, the “@context” portion is understood and passed along to the appropriate context.
Security Features – There are a ton of different ways to restrict traffic entering and leaving the ACE. Most of the time you will be focused on traffic entering the ACE. As with applying ACLs to interfaces on switches and routers, very rarely will you see access lists applied in the outbound direction. That feature is there in case you have some special need to use it.
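The “community@context” format is easy to picture with a few lines of Python. This is only a sketch of the string handling. The function names are mine, and returning None when no context is given is my own placeholder choice rather than documented ACE behavior.

```python
def build_community(community, context):
    """Build the SNMP v1/v2c community string in community@context form."""
    return f"{community}@{context}"

def split_community(raw):
    """Split a received community string into (community, context).

    Returns None for the context when no '@' is present. What the device
    does in that case is not covered here, so None is just a placeholder.
    """
    community, sep, context = raw.partition("@")
    return community, (context if sep else None)

# Example with a made-up community name and context name.
print(build_community("public", "web-context"))
print(split_community("public@web-context"))
```

Your NMS would send the combined string, and the receiving side would peel off the context name to decide which virtual context answers the query.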
An interesting capability that the access lists have in ACE is the ability to use object groups to identify which traffic to permit or deny. If you have ever worked on the PIX, ASA, or FWSM, you will be familiar with object groups. They make traffic identification much easier not to mention the simplification of the ACE configuration itself.
The much more granular security options were of great interest to me. Take something like IP fragmentation and reassembly. You can specify the max number of fragments allowed from one packet. If it exceeds the number you specify, you can just drop the traffic. Many other options exist with regards to the packet stream itself. You can restrict which flags are allowed to be set. If violations occur, not only can you drop the traffic, but you could actually clear the flag itself and then send the traffic through the ACE. While most options are configurable, there are some rules that are always enforced. For example, the source IP of a traffic flow can never equal the destination IP.
Layer 4 Load Balancing – This is exactly what it sounds like. Load balancing based on TCP/UDP flows. I think the neatest part about this particular topic was the fact that you can actually load balance traffic across multiple firewalls and have the return traffic come back through the same firewall. This of course requires an ACE on both sides of the firewall, but with the ability for the ACE module to have up to 250 virtual contexts, it doesn’t have to be 2 separate physical ACE modules. The same module can host both contexts that live on either side of the firewall. It is fairly clever how they make this work. Essentially, when traffic comes from one firewall into the ACE, it remembers the MAC address of the sending firewall and places that connection in a state table. When traffic comes back through the ACE, it already knows which firewall to send the traffic to based on that state table. I’m not sure I would want to use an ACE module for load balancing through firewalls, but there are plenty of customers out there that are already doing it or could see the benefit in doing something like that.
Health Monitoring – If there’s one thing the ACE seems to have a fairly large number of options on, it’s the health monitoring or probes. All the major protocols have specific probes on the ACE that are used to check the health of the back end or “real” servers. This is way beyond the load balancer simply pinging the server to make sure it is up and running. Let’s say you used the HTTP probe. Instead of just trying a simple ping to check a back end server’s status, the HTTP probe can actually go out and make an HTTP connection to the server or serverfarm. That’s a far more intelligent way to query server status. Based on the probe results, any number of things can be done to the various serverfarms and servers ACE may be providing services for. They may be taken out of active status, have their priority reduced, etc.
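Here is a rough Python sketch of the state table trick described above, as I understand it: remember which firewall MAC a connection arrived from, then use that entry to steer the return traffic. The class name, the 5-tuple layout, and the addresses are all assumptions for illustration, not the actual ACE data structures.

```python
from typing import Dict, Optional, Tuple

# src_ip, src_port, dst_ip, dst_port, protocol (hypothetical flow key)
Flow = Tuple[str, int, str, int, str]

def reverse(flow: Flow) -> Flow:
    """Swap source and destination to get the return direction of a flow."""
    src_ip, src_port, dst_ip, dst_port, proto = flow
    return (dst_ip, dst_port, src_ip, src_port, proto)

class FirewallStateTable:
    def __init__(self) -> None:
        self._next_hop: Dict[Flow, str] = {}

    def learn(self, flow: Flow, firewall_mac: str) -> None:
        """Record the firewall a flow arrived from, keyed by its return direction."""
        self._next_hop[reverse(flow)] = firewall_mac

    def firewall_for_return(self, flow: Flow) -> Optional[str]:
        """Look up which firewall the return traffic should be sent back through."""
        return self._next_hop.get(flow)

# Example: a client connection arrives via one firewall, the reply goes back the same way.
table = FirewallStateTable()
inbound = ("198.51.100.7", 40000, "10.1.1.10", 80, "tcp")
table.learn(inbound, "00:11:22:33:44:55")
print(table.firewall_for_return(reverse(inbound)))
```

Keying the table on the reversed flow means the return packet can be looked up directly without any extra work at forwarding time.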
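To show the difference between a ping and an HTTP probe, here is a small Python sketch using only the standard library. It makes a real HTTP GET and only calls the server healthy if the expected status code comes back. The function name, default path, and timeout are my own choices, and actual ACE probes have far more knobs than this.

```python
import http.client

def http_probe(host, port=80, path="/", expected_status=200, timeout=2.0):
    """Return True only if an HTTP GET to host:port returns the expected status.

    A server that answers ping but has a dead web service fails this check,
    which is the whole point of probing at the application layer.
    """
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            return conn.getresponse().status == expected_status
        finally:
            conn.close()
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply all count as failed.
        return False
```

A load balancer would run something like this on an interval and take the real server out of rotation after some number of consecutive failures.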
There’s a LOT more to this stuff. This was only day 2 of 4! More to come.
First off, let me point out that this is not a boot camp with a certification in mind. It’s a 4 day course given by Firefly Communications. Although I booked the course through Global Knowledge, I was told that they typically outsource their data center courses to Firefly. Works for me. As long as it is quality training, I don’t care if you outsource it to Elbonia. I am assuming they use the term “boot camp” because it is an end to end ACE class taught in just 4 days.
Which brings me to my first point. My company was able to use Cisco Learning Credits to pay for this class. At 30 credits, that translates to $3,000 US for 4 days’ worth of training. Sitting in the class, I couldn’t help but notice people doing regular work while the instructor was going through his lecture. I realize most places are understaffed. Outages happen. Fires have to get put out. However, $3,000 for 4 days to me is a big deal. If you send your employees off to training that is critical/applicable for their job, LET THEM TRAIN! Leave them alone while they are there. Of course, that’s a 2 way street in that some employees need to learn to let go as well. The company will function without them for a few days. You can turn off “martyr” and “hero” mode for a couple of days. I am checking e-mail at night, but not being obsessive about it. I have very capable co-workers who can do anything and everything without my help.
Now, on to the actual class. Let me begin by commenting on the quality of instruction. I’ve been to plenty of poor classes in which someone was trying to shovel test material down your throat the whole time. I’ve also sat in several classes where the instructor was obviously out of their league and could not field questions from the crowd that weren’t covered on the vendor approved slide deck. That is simply not the case with Firefly. My instructor is very competent and when he hits the limit of his knowledge, he indicates that. So far, I think I have only seen 1 time out of the dozen or so questions he was hit with today in which that was the case. I guess that is what $3,000 a seat gets you.
It seems as if there is a fairly decent mix of people in this class. About a dozen or so in attendance. A fair amount of them are actually using the ACE 4710 appliance which I thought was rather interesting. Of course, most are using the standard ACE module. There are varying levels of experience with ACE as well. I was under the impression that I would be here mainly for the second half of the class, as I felt comfortable with the basics. Of course, just when you go and get comfortable, you realize how little you know. I learned a LOT today. Mainly, it was about things I never really bothered to dig into. You see, like most people, we probably only dig into the features we absolutely need right now. Maybe we plan on coming back and covering everything else at a later time, but I think that happens far less than we’d like it to. Some of the things we covered today that I was horribly deficient on were:
Resource Management – If you use multiple contexts, RM can prevent a single context from taking over the entire resources of the module. I don’t use this as it is currently not a concern, but good to know if things change!
ACE 4710 appliance – I don’t use it and never have. However, it does do a few things the module does not, mainly centered around application acceleration. We have not covered that exhaustively yet, but I will take good notes when we do.
There were other things covered in which I was glad to get a decent refresher. The main one being TCP sequence numbers. They are always a bit confusing to me if I don’t study them on a fairly regular basis. Although you weren’t there with me in class today, you can read this post by Jeremy Stretch which talks about TCP sequencing. He even uses nice graphics!
We ended the day doing a pretty simple lab in which we created some contexts and messed around with resource management to see if we could oversubscribe the module in terms of CPU, memory, etc in regards to other contexts. Overall, it was a really good first day. I am eagerly anticipating what tomorrow will be like. It is also good to be taught by someone who actually helped develop the slide deck the course is taught from. He was able to add funny little details about how he created this drawing or that. It’s always nice to have someone teach who has a great sense of humor. So far, I give the Firefly ACE boot camp 2 thumbs up!
I am hoping to get a wee bit more technical in the following posts regarding ACE boot camp as the remaining days will REALLY focus on load balancing. Who knows? I might even post a graphic or two! Shocking isn’t it?
It’s not that I don’t have anything to say! People who know me know that I very rarely shut up for more than a few minutes. It’s just that I have been fairly busy lately. A lot of different things have been eating into my time and writing things for a network blog takes a lot of time and effort. I have a 4 day Cisco ACE class next week in which I will be out of town, so I hope to get several posts done at night when I am sitting in the hotel. You don’t actually think I will be going out at night do you? Hmmmm…..a week away from the office and a training day that ends at 4:30pm. That leaves me all sorts of time for the following:
1. Catch up on the billion or so web pages I have bookmarked.
2. Get some things written for the blog that revolve around possible competitors to the Nexus 7000. With HP, Arista, Brocade, Force10, and Juniper selling competing products, there’s a lot of data to sift through. I honestly have no idea who will come out on top. It might just be the Nexus 7000!
3. Comment on my experience with the ACE class I will be taking with Global Knowledge. I’ve spent the last several days at work focused on ACE, so I am very interested in filling in the gaps of my knowledge regarding this interesting product.
4. Read up a little more on the Cisco/EMC/VMware vBlock concept. I went to a presentation today about that and am intrigued to say the least.
5. Write about the concept of baselining your in-house applications. This would be focused on knowing what the normal TCP/UDP operations look like from a packet capture standpoint.
I try and keep a running list in Evernote of the things I would like to write about. The list continues to grow, but the time it takes to transform just one of those ideas into a somewhat coherent post just hasn’t been there.
I hope to have some new content up early next week. The last thing I want is to end up abandoning this blog and waste all my time playing mindless games on my iPad, although I do enjoy doing that a few times a week.
It seems that if you read enough technical stuff, you are bound to find things you either disagree with, or know to be untrue. At least I think they are untrue until I validate my thoughts with the applicable standard or bounce my thoughts off of others within the IT community. I understand that it is VERY difficult to write books, white papers, articles, etc with a technical focus and have them turn out well. A lot of editing has to take place. The target audience has to be considered as well. In short, it is a lot different than writing a fiction novel or short story as when it comes to technical stuff, there’s a bunch of people out there second guessing every sentence you write and diagram you create. With fiction writing, the entire story is in someone’s head and accuracy is not a factor. Let me also state that I have tremendous respect for most technical authors.
I have been doing a lot of reading on switching lately. I’m lazily making my way to the CCIE R/S lab attempt so I have been reading the 3560 command reference guide. It’s the Good Will Hunting approach. In addition to that, I decided to supplement my lunch today with some reading of End-to-End QoS Network Design. So, now that I am thinking about switching and QoS, I came home after work and started in on my so far unused copy of Cisco LAN Switching Configuration Handbook. There’s a QoS chapter. Hooray! The best of both worlds.
I get to the end of the third page in the chapter and run into a problem. That’s page 223 if you have the printed version handy. I came across this tidbit:
Classes 1 through 4 are termed the Assured Forwarding (AF) service levels. Higher class numbers indicate higher-priority traffic. Each class or AF service level has three drop precedence categories:
* Low (1)
* Medium (2)
* High (3)
Traffic in the AF classes can be dropped, with the most likelihood of dropping in the Low category and the least in the High category. In other words, service level AF class 4 with drop precedence 3 is delivered before AF class 4 with drop precedence 1, which is delivered before AF class 3 with drop precedence 3, and so on.
Did you spot any errors? At first, I thought I was reading it all wrong. Maybe I misinterpreted what the authors were saying. AF classes have different priority levels? Hmmmm. I thought they were the same. Then of course, there’s the issue of the book stating that the higher the drop precedence, the less likely it will get dropped. In other words, the book is saying that AF43 will get dropped less than AF41 along with a repeat of the logic stated earlier that AF41 comes before AF33 in terms of priority. At this point, I am really starting to get confused. That can’t be right. I’ve never heard anything like that before.
Maybe it was an error that was fixed in the errata that Cisco Press usually posts on their site. I checked, and there were no corrections issued for that particular book on the Cisco Press site. I also went over to Amazon and read through the reviews people had written. A lot of times, people will list errors in their book reviews, so it is a good source of information when determining whether or not to buy the book. There was nothing there that alluded to this potential error.
The next step was to check the RFC along with the End-to-End QoS book I am reading as well. RFC 2597 is entitled Assured Forwarding PHB Group. Or, AF per hop behavior group. Remember that in a DiffServ environment, QoS is done on a per class basis. There is no end to end guarantee for an individual flow so to speak. That’s what IntServ is for. One might even say that a per hop behavior is indicative of each hop being able to do whatever it wants to traffic based on the class it is in. The router couldn’t care less about the entire flow. It looks at DSCP markings and makes a decision based on that. Maybe that’s being too generic, but that is the way that I understand DSCP, PHB, etc. The RFC said the following:
Section 1 Paragraph 3
Within each AF class IP packets are marked (again by the customer or
the provider DS domain) with one of three possible drop precedence
values. In case of congestion, the drop precedence of a packet
determines the relative importance of the packet within the AF class.
A congested DS node tries to protect packets with a lower drop
precedence value from being lost by preferably discarding packets
with a higher drop precedence value.
Section 2 Paragraphs 1 and 2
Assured Forwarding (AF) PHB group provides forwarding of IP packets
in N independent AF classes. Within each AF class, an IP packet is
assigned one of M different levels of drop precedence. An IP packet
that belongs to an AF class i and has drop precedence j is marked
with the AF codepoint AFij, where 1 <= i <= N and 1 <= j <= M.
Currently, four classes (N=4) with three levels of drop precedence in
each class (M=3) are defined for general use. More AF classes or
levels of drop precedence MAY be defined for local use.
A DS node SHOULD implement all four general use AF classes. Packets
in one AF class MUST be forwarded independently from packets in
another AF class, i.e., a DS node MUST NOT aggregate two or more AF
classes together.
Okay, so the RFC is fairly clear that the lower the drop precedence, the safer the traffic should be. Routers and switches should drop AFx3 before AFx2, and drop AFx2 before AFx1. Additionally, we learn that the 4 AF classes are general use. There is no hierarchy. AF4x is no different than AF2x. We may give AF4x more bandwidth in a given QoS policy during periods of congestion, but we can also do the same for AF3x, AF2x, or AF1x traffic if we want to.
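Here is a tiny Python sketch of that reading of the RFC: within each AF class, the higher drop precedence goes first, and the classes are kept separate with no ranking between them. The function names are mine, and real routers drop statistically (WRED and friends) rather than in a strict order, so treat this as the ordering of preference only.

```python
def parse_af(name):
    """Parse a codepoint name like 'AF43' into (class, drop_precedence)."""
    if len(name) != 4 or not name.upper().startswith("AF"):
        raise ValueError(f"not an AF codepoint: {name!r}")
    af_class, drop = int(name[2]), int(name[3])
    if not (1 <= af_class <= 4 and 1 <= drop <= 3):
        raise ValueError(f"out of range for RFC 2597: {name!r}")
    return af_class, drop

def drop_order(names):
    """Group codepoints by AF class, each list ordered first-dropped to last-dropped.

    Classes are kept independent. Nothing here ranks AF4x against AF3x.
    """
    by_class = {}
    for name in names:
        af_class, drop = parse_af(name)
        by_class.setdefault(af_class, []).append((drop, name))
    return {
        af_class: [name for _, name in sorted(items, reverse=True)]
        for af_class, items in by_class.items()
    }

print(drop_order(["AF41", "AF43", "AF31", "AF42"]))
```

Under congestion in class 4, AF43 is the first candidate to be dropped and AF41 is the last, which is the opposite of what the switching handbook said.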
Looking for the same type of information in the End-to-End QoS book yields the following taken from chapter 3, Classification and Marking Tools, in the “Marking Tools” section:
DSCP values can be expressed in numeric form or by special keyword names, called per-hop behaviors (PHB). Three defined classes of DSCP PHBs exist: Best-Effort (BE or DSCP 0), Assured Forwarding (AFxy), and Expedited Forwarding (EF). In addition to these three defined PHBs, Class-Selector (CSx) codepoints have been defined to be backward compatible with IP Precedence (in other words, CS1 through CS7 are identical to IP Precedence values 1 through 7). The RFCs describing these PHBs are 2547, 2597, and 3246.
RFC 2597 defines four Assured Forwarding classes, denoted by the letters AF followed by two digits. The first digit denotes the AF class and can range from 1 through 4. (Incidentally, these values correspond to the three most significant bits of the codepoint, or the IPP value that the codepoint falls under.) The second digit refers to the level of drop preference within each AF class and can range from 1 (lowest drop preference) to 3 (highest drop preference). For example, during periods of congestion (on an RFC 2597–compliant node), AF33 would be dropped more often (statistically) than AF32, which, in turn, would be dropped more often (statistically) than AF31. Figure 3-7 shows the Assured Forwarding PHB encoding scheme.
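The bit layout is easier to see with the arithmetic written out. In the 6-bit DSCP field, the AF class sits in the top three bits and the drop precedence in the next two, so the decimal value works out to 8 × class + 2 × drop precedence. Below is a short Python sketch with function names of my own choosing.

```python
def af_to_dscp(af_class, drop_precedence):
    """Return the decimal DSCP value for AF<class><drop_precedence>."""
    if not (1 <= af_class <= 4 and 1 <= drop_precedence <= 3):
        raise ValueError("RFC 2597 defines classes 1-4 and drop precedences 1-3")
    # Class in the three most significant bits, drop precedence in the next two.
    return (af_class << 3) | (drop_precedence << 1)

def ip_precedence(dscp):
    """The three most significant DSCP bits, i.e. the matching IP Precedence."""
    return dscp >> 3

for af_class, drop in [(1, 1), (3, 3), (4, 1), (4, 3)]:
    dscp = af_to_dscp(af_class, drop)
    print(f"AF{af_class}{drop}: DSCP {dscp} = {dscp:06b}, IPP {ip_precedence(dscp)}")
```

AF41 comes out as 34 and AF43 as 38, so a bigger DSCP number inside a class means a higher drop preference, not better treatment.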
More validation that I was correct in my initial hunch. Even though I was fairly certain the Cisco LAN Switching Configuration Handbook was wrong, I wanted to double-check with perhaps one of the greatest tools available. Twitter. I got responses from 3 different people, which is saying something considering it was about 10PM CST. Thanks to Steve Shaw, Steve Rossen, and Andrew vonNagy for their assistance in validating my assumptions. Steve Shaw pointed me to the exact paragraph in RFC 2597 and Andrew mentioned 1 of the 3 new QoS videos that Kevin Wallace created and posted on YouTube which discuss the AF class and drop precedence. Check it out. It’s a fantastic video.
Now, you might be wondering why I am making such a big deal out of this. Is it really important? In this case, yes. Understanding the AF class and drop precedence is vital to one’s understanding of DSCP as a whole. If you get this wrong, it could bite you down the road when designing a QoS policy for a large network. It can bite you when troubleshooting a QoS problem. Some things have to be right every single time they are put in print. QoS is not some super-easy technology that can be mastered in a half-day. Auto-QoS is a great functionality on hardware. I used it this past weekend when configuring 3750 POE switches for use with Cisco phones. However, typing a command and understanding what the ramifications are of that command are 2 different things.
Thank goodness for multiple sources!
If you have been around IT for more than 5 minutes, you have probably been involved in a technology dispute. You have come across the person who loathes any company but one. Or, they hate one company more than any other. Perhaps they hate certain protocols or technologies because they are slightly proprietary. You get the point.
These people are everywhere. Perhaps you are one. I have been one at times. Maybe even right now. With the sheer amount of things your average networking professional is required to know, it is often easier to take refuge in the arms of a select few vendors. In a previous post, I asked the question regarding whether or not we can stay vendor neutral. I think we can, but it takes some concerted effort on our part to do so.
I don’t want to re-hash that old post, so I will move on to the point I want to make in this post. When you think about the companies you buy from, (By that I mean the actual hardware/software producer and not the reseller.) why do you buy from them? Surely you are not using only price to justify your selection are you? What are the technical reasons you buy from certain vendors? Can you name any of them? How about if I give you a competing product? Can you tell me why your choice is better than the competition?
About a month ago, I bought an iPad. I went into the Apple store and stood in line to buy my iPad. As I was standing there, a young couple was looking at a MacBook, or iMac, or whatever and asked the sales guy why they should buy a Mac. I was actually impressed with how the lady asked the question. She said: “We are looking to get a new computer and I want you to tell me why I should buy a Mac. They cost a lot more than an HP or Dell system.” Obviously someone who is open to different technology, but wants to make the right purchase. She had “accountant” written all over her. The reply from the salesman really took me by surprise. He said: “You buy a Mac for several things. First, you don’t have to worry about any viruses. Second, it is a lot more secure than any Windows machine. Third, you don’t have to worry about it crashing on you. Fourth, it costs more because it is a much higher quality product.”
I didn’t stick around long enough to hear if he closed the sale or not. I was too enamored with my ability to con my wife into letting me spend $499 on a device that will waste even more of my time with meaningless games and YouTube videos. As I heard him say those things to that couple, I was thinking how incredibly naive and wrong they were. The Apple computing platforms have been relatively unharmed by large amounts of viruses and security issues because their market share has always been in single digits and wasn’t worth the criminal/hacker community’s time and effort. If 90% or more people are using Windows boxes, why would you spend time on less than 10% of the computer population? In the past couple of years, Apple has made huge gains in the consumer market. Huge. You’ll see an increasing number of exploits head Apple’s way as their market share increases. My opinion. I could be wrong, and if I am, call me out on it. As for Apple having to deal with OS or app crashes? Nah. That would never happen right? Perhaps the only thing he said that I would possibly agree with is that it costs more because it is higher quality. After using my iPad for a month, I must say that it is a VERY polished system. I love the way it works, but I do have plenty of apps that crash. Safari included.
Whew! Enough talk about Apple. I mentioned that story just to make a point. Sometimes we delude ourselves into believing that one product/company is better than another based on hearsay, groupthink, or our own positive experience with that product/technology/protocol. Perhaps it is all we’ve ever known and thus come to the conclusion that it is the best. Or maybe that guy was just trying to make a sale and counted on the ignorance of the consumer. I don’t know. I doubt I will make another trip to the Apple store unless they are the only ones selling Apple TV. What can I say? I’m becoming a convert/fanboy/zombie when it comes to Apple.
Here’s an exercise for you. Don’t worry. It’s purely a mental one. Act as if you were a first time visitor to your company data center, computer room, closet, or wherever you hide your network gear. Ask about the various products you bought and why you chose them over a competing product. If you run a Cisco ASA firewall, why did you pick that over CheckPoint, Juniper NetScreen, WatchGuard, or SonicWall? Why did you choose that Juniper router over Cisco, Vyatta, Brocade, or Adtran? It’s a good exercise because it forces you to confront the real reasons you buy from certain vendors. You see, you can be a fan of a product or a company and buy continually from them without ever really considering why you do it in the first place. At some point, someone who knows a fair amount about that particular product space might ask you to defend your selection. You better have a better answer than cost or the plethora of free lunches you get from the vendor. If you have no idea what the criteria are for determining the best choice, then you might be in over your head. Don’t worry though. Most people won’t notice as long as the free lunches keep rolling in.
In closing, can you be a technology bigot? Not if you want to be a professional. Every company has flaws and every company will produce bad technology from time to time. Being open to all solutions will keep you from buying the bad technology or using the wrong protocol. Your job as a corporate drone like myself is not to convert everyone to a particular product/technology to where they shut out reason and refuse to consider alternatives. Your job is to find the right product for your particular situation. Let the facts behind your decision speak for themselves. Tell people why you chose a particular product or technology from technical merits alone and you’ll find most people will accept that. Tell people that only a moron would pick something else and you’ll end up with a lot fewer friends. You better hope the vendor you buy from wants to buy you lunch all the time because no one else will.
****EDIT: I should probably make the point that I am only focusing on technical merits of hardware/technology first. There are other very valid reasons to buy or not buy certain products such as ease of use or familiarity by existing staff, ability to procure said equipment, or size and scope of project. If you have a fairly nailed down requirements list for some remote sites and need to deploy equipment there, then I wouldn’t advocate going through a full blown product selection procedure every single time. My point is simply that before any of those things are considered, the product must meet the technical requirements of the job at hand. After determining that, then you can consider the support structure, cost, etc. If the cost is too much, your requirements will have to change.