ACE Boot Camp – Days 3 and 4

September 28, 2010 Leave a comment

Days 3 and 4 did not disappoint! I don’t know if I stated this in the earlier posts, but the days basically consisted of lecture in the morning and labs after lunch. I REALLY, REALLY enjoyed the lecture portion. Again, I have to state that the instructor was fairly knowledgeable in regards to ACE, so he was able to actually teach instead of regurgitate a slide deck like other classes I have been in. That makes all the difference in the world. As for the labs, I guess they do some good if you have not had much experience with the ACE CLI. We did not do any labs using the built in GUI or ANM. The problem I have with labs is that they are a very canned and controlled environment. You end up just going through the motions without actually soaking up what it is that you are doing. Ideally, the labs would need to be tailored to your environment to have the greatest effect. This of course, is not realistic. Having said that, I am sure there are some people who get something out of it. My opinion was shared by others in the class in regards to the effectiveness of the labs, so I am not the only one who feels this way. However, the effectiveness of the lecture portion completely overshadowed any shortcomings of the lab portion.

In the interest of brevity, I am going to touch on the things I thought were the most interesting, but I don’t want this post to be so long it requires a coffee break to finish.

Route Health Injection – On a simplistic level, RHI allows the ACE to inject a host route into the network. You would use this to advertise the VIP(virtual IP) that clients use to connect to a server farm. If the server farm is not available due any number of issues, the host route can be automatically removed from the route table and not advertised. The alternative is to simply advertise the VIP’s as part of a regular subnet advertisement like you do with any other VLAN or subnet. Again, I am simplifying this and need to point out that this is NOT something that is specific to Cisco ACE. Other vendors implement similar technologies.

KeepAlive-Appliance Protocol(KAL-AP) – There’s a few variations of the Cisco ACE, and one of those is the Global Site Selector(GSS). Its purpose is simply to provide higher level load balancing between data centers. Basically, it is a load balancer of load balancers. By using KAL-AP, the GSS can query VIP’s at multiple data centers and determine which one is the best fit to send traffic to.

There are a couple of things that the ACE 4710 appliance does that the ACE module cannot. I asked the question as to why this is the case and was told that the ACE appliance has different architecture than the module. It has certain functionality that might come to the module at some point, but for now is restricted to the appliance. These extra functions really revolve around the ACE appliance being able to cache certain HTTP objects and speeding up the process of delivering a web page to an end user. A fair amount of detail on this can be found here.

It sure seems as if I cut back on the information from days 3 and 4 when compared to 1 and 2. I did. Although there were plenty of interesting things covered in the past 2 days of class, a lot of those things would take a while to explain and draw out via diagrams. That’s also assuming that I actually understand these things well enough to explain them in depth.

That brings to me to a more philosophical point in regards to the type of niche product that Cisco ACE is. While it would be great if you knew the CLI on ACE backwards and forwards, it really isn’t necessary. What is necessary is an understanding of what a platform like ACE is capable of. I sat in a meeting today in which some developers wanted ACE to perform health checks on a server outside of a load balance pool and use the results of that query to determine whether or not servers should be removed from a load balance pool. Basically, they wanted to do something that ACE is not really designed to do. Spending 4 days in a classroom learning all about ACE gave me the information needed to have a productive meeting with these developers today. I was able to answer their questions and give better guidance than I would have a couple of weeks ago. I don’t know all the commands for ACE. I will still have to use the configuration guides to look things up now and again. The important thing is that I understand the capabilities and limitations of the ACE load balancer a lot better today than I did prior to taking the ACE class. My main goal is to know what it can and cannot do in order to design anything requiring load balancing properly. To me that is more important than memorizing commands.

Categories: ace, cisco, learning, load balancing Tags: , ,

ACE Boot Camp – Day 2

September 24, 2010 Comments off

Day 2 of ACE boot camp did not disappoint! Another full day of lecture and labs. We covered the following topics:

Modular Policy CLI
Managing the ACE Appliance and Service Module
Security Features
Layer 4 Load Balancing
Health Monitoring

I’ll cover some general things about each topic and go into additional details on the points I thought were interesting.

Modular Policy CLI – ACE classifies which traffic it will load-balance based on policy maps, which are comprised of class maps. If this sounds a lot like how you build QoS policies on IOS based routers, it is. The big difference is that ACE is far more restrictive in what those policies contain.

Managing the ACE Appliance and Service Module – Like most Cisco devices, ACE can be managed in a number of different ways. Telnet, SSH, HTTPS, and SNMP. You can even use the XML API if you want. With SNMP, versions 1 and 2 cannot understand contexts. SNMP version 3 can. In order for SNMP version 1 and 2 to work with contexts, you have to use the community string format of “community@context” where “community” is the community name and “context” is the name of the virtual context. When the GET, SET, or whatever SNMP action you choose hits the ACE, the “@context” portion is understood and passed along to the appropriate context.

Security Features – There are a ton of different ways to restrict traffic entering and leaving the ACE. Most of the time you will be focused on traffic entering the ACE. As with applying ACL’s to interfaces on switches and routers, very rarely will you see access lists applied in the outbound direction. That feature is there in case you have some special need to use it.

An interesting capability that the access lists have in ACE is the ability to use object groups to identify which traffic to permit or deny. If you have ever worked on the PIX, ASA, or FWSM, you will be familiar with object groups. They make traffic identification much easier not to mention the simplification of the ACE configuration itself.

The much more granular security options were of great interest to me. Take something like IP fragmentation and reassembly. You can specify the max number of fragments allowed from one packet. If it exceeds the number you specify, you can just drop the traffic. Many other options exist with regards to the packet stream itself. You can enforce certain flags from being set. If violations occur, not only can you drop the traffic, but you could actually reset the flag itself and then send the traffic through the ACE. While most options are configurable, there are some rules that are always enforced. For example, the source IP of a traffic flow can never equal the destination IP.

Layer 4 Load Balancing – This is exactly what it sounds like. Load balancing based on TCP/UDP flows. I think the neatest part about this particular topic was the fact that you can actually load balance traffic across multiple firewalls and have the return traffic come back through the same firewall. This of course requires an ACE on both sides of the firewall, but withe ability for the ACE module to have up to 250 virtual contexts, it doesn’t have to be 2 separate physical ACE modules. The same module can host both contexts that live on either side of the firewall. It is fairly clever how they make this work. Essentially, when traffic comes from one firewall into the ACE, it remembers the MAC address of the sending firewall and places that connection in a state table. When traffic comes back through the ACE, it already knows which firewall to send the traffic to based on that state table. I’m not sure I would want to use an ACE module for load balancing through firewalls, but there are plenty of customers out there that are already doing it or could see the benefit in doing something like that.

Health Monitoring – If there’s one thing the ACE seems to have a fairly large amount of options on, it’s the health monitoring or probes. All the major protocols have specific probes on the ACE that are used to check the health of the back end or “real” servers. This is way beyond the load balancer simply pinging the server to make sure it is up and running. Let’s say you used the HTTP probe. Instead of just trying a simple ping to check a back end servers’ status, the HTTP probe can actually go out and make an HTTP connection to the server or serverfarm. That’s a far more intelligent way to query server status. Based on the probe results, any number of things can be done to the various serverfarms and servers ACE may be providing services for. They may be taken out of active status, have their priority reduced, etc.

There’s a LOT more to this stuff. This was only day 2 of 4! More to come.

ACE Boot Camp – Day 1

September 21, 2010 Leave a comment

First off, let me point out that this is not a boot camp with a certification in mind. It’s a 4 day course given by Firefly Communications. Although I booked the course through Global Knowledge, I was told that they typically outsource their data center courses to Firefly. Works for me. As long as it is quality training, I don’t care if you outsource it to Elbonia. I am assuming they use the term “boot camp” because it is an end to end ACE class taught in just 4 days.

Which brings me to my first point. My company was able to use Cisco Learning Credits to pay for this class. At 30 credits, that translates to $3,000 US dollars for 4 days worth of training. Sitting in the class, I couldn’t help but notice people doing regular work while the instructor was going through his lecture. I realize most places are understaffed. Outages happen. Fires have to get put out. However, $3,000 for 4 days to me is a big deal. If you send your employees off to training that is critical/applicable for their job, LET THEM TRAIN! Leave them alone while they are there. Of course, that’s a 2 way street in that some employees need to learn to let go as well. The company will function without them for a few days. You can turn off “martyr” and “hero” mode for a couple of days. I am checking e-mail at night, but not being obsessive about it. I have very capable co-workers who can do anything and everything without my help.

Now, on to the actual class. Let me begin by commenting on the quality of instruction. I’ve been to plenty of poor classes in which someone was trying to shovel test material down your throat the whole time. I’ve also sat in several classes where the instructor was obviously out of their league and could not field questions from the crowd that weren’t covered on the vendor approved slide deck. That is simply not the case with Firefly. My instructor is very competent and when he hits the limit of his knowledge, he indicates that. So far, I think I have only seen 1 time out of the dozen or so questions he was hit with today in which that was the case. I guess that is what $3,000 a seat gets you.

It seems as if there is a fairly decent mix of people in this class. About a dozen or so in attendance. A fair amount of them are actually using the ACE 4710 appliance which I thought was rather interesting. Of course, most are using the standard ACE module. There are varying levels of experience with ACE as well. I was under the impression that I would be here mainly for the second half of the class, as I felt comfortable with the basics. Of course, just when you go and get comfortable, you realize how little you know. I learned a LOT today. Mainly, it was about things I never really bothered to dig into. You see, like most people, we probably only dig into the features we absolutely need right now. Maybe we plan on coming back and covering everything else at a later time, but I think that happens far less than we’d like it to. Some of the things we covered today that I was horribly deficient on were:

Resource Management – If you use multiple contexts, RM can prevent a single context from taking over the entire resources of the module. I don’t use this as it is currently not a concern, but good to know if things change!

HTTP Message Structure – 3 fields make up each HTTP message: Start/Request line(includes the METHOD), Header fields, and Body(which is optional)

ACE 4710 appliance – I don’t use it and never have. However, it does do a few things the module does not mainly centered around application acceleration. We have not covered that exhaustively yet, but I will take good notes when we do.

There were other things covered in which I was glad to get a decent refresher. The main one being TCP sequence numbers. They are always a bit confusing to me if I don’t study them on a fairly regular basis. Although you weren’t there with me in class today, you can read this post by Jeremy Stretch which talks about TCP sequencing. He even uses nice graphics!

We ended the day doing a pretty simple lab in which we created some contexts and messed around with resource management to see if we could oversubscribe the module in terms of CPU, memory, etc in regards to other contexts. Overall, it was a really good first day. I am eagerly anticipating what tomorrow will be like. It is also good to be taught by someone who actually helped develop the slide deck the course is taught from. He was able to add funny little details about how he created this drawing or that. It’s always nice to have someone teach who has a great sense of humor. So far, I give the Firefly ACE boot camp 2 thumbs up!

I am hoping to get a wee bit more technical in the following posts regarding ACE boot camp as the remaining days will REALLY focus on load balancing. Who knows? I might even post a graphic or two! Shocking isn’t it?

Busy, Busy, Busy!

September 14, 2010 2 comments

It’s not that I don’t have anything to say! People who know me know that I very rarely shut up for more than a few minutes. It’s just that I have been fairly busy lately. A lot of different things have been eating into my time and writing things for a network blog take a lot of time and effort. I have a 4 day Cisco ACE class next week in which I will be out of town, so I hope to get several posts done at night when I am sitting in the hotel. You don’t actually think I will be going out at night do you? Hmmmm…..a week away from the office and a training day that ends at 4:30pm. That leaves me all sorts of time for the following:

1. Catch up on the billion or so web pages I have bookmarked.
2. Get some things written for the blog that revolve around possible competitors to the Nexus 7000. With HP, Arista, Brocade, Force10, and Juniper selling competing products, there’s a lot of data to sift through. I honestly have no idea who will come out on top. It might just be the Nexus 7000!
3. Comment on my experience with the ACE class I will be taking with Global Knowledge. I’ve spent the last several days at work focused on ACE, so I am very interested in filling in the gaps of my knowledge regarding this interesting product.
4. Read up a little more on the Cisco/EMC/VMware vBlock concept. I went to a presentation today about that and am intrigued to say the least.
5. Write about the concept of baselining your in-house applications. This would be focused on knowing what the normal TCP/UDP operations look like from a packet capture standpoint.

I try and keep a running list in Evernote of the things I would like to write about. The list continues to grow, but the time it takes to transform just one of those ideas into a somewhat coherent post just hasn’t been there.

I hope to have some new content up early next week. The last thing I want is to end up abandoning this blog and waste all my time playing mindless games on my iPad, although I do enjoy doing that a few times a week.

Technical Book Annoyances

September 3, 2010 Leave a comment

It seems that if you read enough technical stuff, you are bound to find things you either disagree with, or know to be untrue. At least I think they are untrue until I validate my thoughts with the applicable standard or bounce my thoughts off of others within the IT community. I understand that it is VERY difficult to write books, white papers, articles, etc with a technical focus and have them turn out well. A lot of editing has to take place. The target audience has to be considered as well. In short, it is a lot different than writing a fiction novel or short story as when it comes to technical stuff, there’s a bunch of people out there second guessing every sentence you write and diagram you create. With fiction writing, the entire story is in someone’s head and accuracy is not a factor. Let me also state that I have tremendous respect for most technical authors.

However………

I have been doing a lot of reading on switching lately. I’m lazily making my way to the CCIE R/S lab attempt so I have been reading the 3560 command reference guide. It’s the Good Will Hunting approach. In addition to that, I decided to supplement my lunch today with some reading of End-to-End QoS Network Design. So, now that I am thinking about switching and QoS, I came home after work and started in on my so far unused copy of Cisco LAN Switching Configuration Handbook. There’s a QoS chapter. Hooray! The best of both worlds.

I get to the end of the third page in the chapter and run into a problem. That’s page 223 if you have the printed version handy. I came across this tidbit:

Classes 1 through 4 are termed the Assured Forwarding (AF) service levels. Higher class numbers indicate higher-priority traffic. Each class or AF service level has three drop precedence categories:

* Low (1)
* Medium (2)
* High (3)

Traffic in the AF classes can be dropped, with the most likelihood of dropping in the Low category and the least in the High category. In other words, service level AF class 4 with drop precedence 3 is delivered before AF class 4 with drop precedence 1, which is delivered before AF class 3 with drop precedence 3, and so on.

Did you spot any errors? At first, I thought I was reading it all wrong. Maybe I misinterpreted what the authors were saying. AF classes have different priority levels? Hmmmm. I thought they were the same. Then of course, there’s the issue of the book stating that the higher the drop precedence, the less likely it will get dropped. In other words, the book is saying that AF43 will get dropped less than AF41 along with a repeat of the logic stated earlier that AF41 comes before AF33 in terms of priority. At this point, I am really starting to get confused. That can’t be right. I’ve never heard anything like that before.

Maybe it was an error that was fixed in the errata that Cisco Press usually posts on their site. I checked, and there were no corrections issued for that particular book on the Cisco Press site. I also went over to Amazon and read through the reviews people had written. A lot of times, people will list errors in their book reviews, so it is a good source of information when determining whether or not to buy the book. There was nothing there that alluded to this potential error.

The next step was to check the RFC along with the End-to-End QoS book I am reading as well. RFC 2597 is entitled Assured Forwarding PHB Group. Or, AF per hop behavior group. Remember that in a DiffServ environment, QoS is done on a per class basis. There is no end to end guarantee for an individual flow so to speak. That’s what IntServ is for. One might even say that a per hop behavior is indicative of each hop being able to do whatever they want to traffic based on the class it is in. The router could care less about the entire flow. It looks at DSCP markings and makes a decision based on that. Maybe that’s being too generic, but that is the way that I understand DSCP, PHB, etc. The RFC said the following:

Section 1 Paragraph 3

Within each AF class IP packets are marked (again by the customer or
the provider DS domain) with one of three possible drop precedence
values. In case of congestion, the drop precedence of a packet
determines the relative importance of the packet within the AF class.
A congested DS node tries to protect packets with a lower drop
precedence value from being lost by preferably discarding packets
with a higher drop precedence value.

Section 2 Paragraphs 1 and 2

Assured Forwarding (AF) PHB group provides forwarding of IP packets
in N independent AF classes. Within each AF class, an IP packet is
assigned one of M different levels of drop precedence. An IP packet
that belongs to an AF class i and has drop precedence j is marked
with the AF codepoint AFij, where 1 <= i <= N and 1 <= j <= M.
Currently, four classes (N=4) with three levels of drop precedence in
each class (M=3) are defined for general use. More AF classes or
levels of drop precedence MAY be defined for local use.

A DS node SHOULD implement all four general use AF classes. Packets
in one AF class MUST be forwarded independently from packets in
another AF class, i.e., a DS node MUST NOT aggregate two or more AF
classes together.

Okay, so the RFC is fairly clear that the lower the drop precedence, the safer the traffic should be. Routers and switches should drop AFx3 before AFx2, and drop AFx2 before AFx1. Additionally, we learn that the 4 AF classes are general use. There is no hierarchy. AF4x is no different than AF2x. We may give AF4x more bandwidth in a given QoS policy during periods of congestion, but we can also do the same for AF3x, AF2x, or AF1x traffic if we want to.

Looking for the same type of information in the End-to-End QoS book yields the following taken from chapter 3, Classification and Marking Tools, in the “Marking Tools” section:

DSCP values can be expressed in numeric form or by special keyword names, called per-hop behaviors (PHB). Three defined classes of DSCP PHBs exist: Best-Effort (BE or DSCP 0), Assured Forwarding (AFxy), and Expedited Forwarding (EF). In addition to these three defined PHBs, Class-Selector (CSx) codepoints have been defined to be backward compatible with IP Precedence (in other words, CS1 through CS7 are identical to IP Precedence values 1 through 7). The RFCs describing these PHBs are 2547, 2597, and 3246.

RFC 2597 defines four Assured Forwarding classes, denoted by the letters AF followed by two digits. The first digit denotes the AF class and can range from 1 through 4. (Incidentally, these values correspond to the three most significant bits of the codepoint, or the IPP value that the codepoint falls under.) The second digit refers to the level of drop preference within each AF class and can range from 1 (lowest drop preference) to 3 (highest drop preference). For example, during periods of congestion (on an RFC 2597–compliant node), AF33 would be dropped more often (statistically) than AF32, which, in turn, would be dropped more often (statistically) than AF31. Figure 3-7 shows the Assured Forwarding PHB encoding scheme.

More validation that I was correct in my initial hunch. Even though I was fairly certain the LAN Switching Configuration Guide book was wrong, I wanted to double-check with perhaps one of the greatest tools available. Twitter. I got responses from 3 different people, which is saying something considering it was about 10PM CST. Thanks to Steve Shaw, Steve Rossen, and Andrew vonNagy for their assistance in validating my assumptions. Steve Shaw pointed me to the exact paragraph in RFC 2597 and Andrew mentioned 1 of the 3 new QoS videos that Kevin Wallace created and posted on YouTube which discuss the AF class and drop precedence. Check it out. It’s a fantastic video.

Now, you might be wondering why I am making such a big deal out of this. Is it really important? In this case, yes. Understanding the AF class and drop precedence is vital to one’s understanding of DSCP as a whole. If you get this wrong, it could bite you down the road when designing a QoS policy for a large network. It can bite you when troubleshooting a QoS problem. Some things have to be right every single time they are put in print. QoS is not some super-easy technology that can be mastered in a half-day. Auto-QoS is a great functionality on hardware. I used it this past weekend when configuring 3750 POE switches for use with Cisco phones. However, typing a command and understanding what the ramifications are of that command are 2 different things.

Thank goodness for multiple sources!

Categories: cisco, documentation, qos, routing, switching Tags:

Are You A Technology Bigot?

September 1, 2010 3 comments

If you have been around IT for more than 5 minutes, you have probably been involved in a technology dispute. You have come across the person who loathes any company but one. Or, they hate one company more than any other. Perhaps they hate certain protocols or technologies because they are slightly proprietary. You get the point.

These people are everywhere. Perhaps you are one. I have been one at times. Maybe even right now. With the sheer amount of things your average networking professional is required to know, it is often easier to take refuge in the arms of a select few vendors. In a previous post, I asked the question regarding whether or not we can stay vendor neutral. I think we can, but it takes some concerted effort on our part to do so.

I don’t want to re-hash that old post, so I will move on to the point I want to make in this post. When you think about the companies you buy from, (By that I mean the actual hardware/software producer and not the reseller.) why do you buy from them? Surely you are not using only price to justify your selection are you? What are the technical reasons you buy from certain vendors? Can you name any of them? How about if I give you a competing product? Can you tell me why your choice is better than the competition?

About a month ago, I bought an iPad. I went into the Apple store and stood in line to buy my iPad. As I was standing there, a young couple was looking at a Macbook, or iMac, or whatever and asked the sales guy why they should buy a Mac. I was actually impressed with how the lady asked the question. She said: “We are looking to get a new computer and I want you to tell me why I should buy a Mac. They cost a lot more than an HP or Dell system.” Obviously someone who is open to different technology, but wants to make the right purchase. She had “accountant” written all over her. The reply from the sales man really took my by surprise. He said: “You buy a Mac for several things. First, you don’t have to worry about any viruses. Second, it is a lot more secure than any Windows machine. Third, you don’t have to worry about it crashing on you. Fourth, it costs more because it is a much higher quality product.”

I didn’t stick around long enough to hear if he closed the sale or not. I was too enamored with my ability to con my wife into letting me spend $499 on a device that will waste even more of my time with meaningless games and YouTube videos. As I heard him say those things to that couple, I was thinking how incredibly naive and wrong they were. The Apple computing platforms have been relatively unharmed by large amounts of viruses and security issues because their market share has always been in single digits and wasn’t worth the criminal/hacker community’s time and effort. If 90% or more people are using Windows boxes, why would you spend time on less than 10% of the computer population? In the past couple of years, Apple has made huge gains in the consumer market. Huge. You’ll see an increasing number of exploits head Apple’s way as their market share increases. My opinion. I could be wrong, and if I am, call me out on it. As for Apple having to deal with OS or app crashes? Nah. That would never happen right? Perhaps the only thing he said that I would possibly agree with is that it costs more because it is higher quality. After using my iPad for a month, I must say that it is a VERY polished system. I love the way it works, but I do have plenty of apps that crash. Safari included.

Whew! Enough talk about Apple. I mentioned that story just to make a point. Sometimes we delude ourselves into believing that one product/company is better than another based on hearsay, groupthink, or own positive experience with that product/technology/protocol. Perhaps it is all we’ve ever known and thus come to the conclusion that it is the best. Or maybe that guy was just trying to make a sale and counted on the ignorance of the consumer. I don’t know. I doubt I will make another trip to the Apple store unless they are the only ones selling Apple TV. What can I say? I’m becoming a convert/fanboy/zombie when it comes to Apple.

Here’s an exercise for you. Don’t worry. It’s purely a mental one. Act as if you were a first time visitor to your company data center, computer room, closet, or wherever you hide your network gear. Ask about the various products you bought and why you chose them over a competing product. If you run a Cisco ASA firewall, why did you pick that over CheckPoint, Juniper NetScreen, WatchGuard, or SonicWall? Why did you choose that Juniper router over Cisco, Vyatta, Brocade, or Adtran? It’s a good exercise because it forces you to confront the real reasons you buy from certain vendors. You see, you can be a fan of a product or a company and buy continually from them without ever really considering why you do it in the first place. At some point, someone who knows a fair amount about that particular product space might ask you to defend your selection. You better have a better answer than cost or the plethora of free lunches you get from the vendor. If you have no idea what the criteria is for determining the best choice, then you might be in over your head. Don’t worry though. Most people won’t notice as long as the free lunches keep rolling in.

In closing, can you be a technology bigot? Not if you want to be a professional. Every company has flaws and every company will produce bad technology from time to time. Being open to all solutions will keep you from buying the bad technology or using the wrong protocol. Your job as a corporate drone like myself is not to convert everyone to a particular product/technology to where they shut out reason and refuse to consider alternatives. Your job is to find the right product for your particular situation. Let the facts behind your decision speak for themselves. Tell people why you chose a particular product or technology from technical merits alone and you’ll find most people will accept that. Tell people that only a moron would pick something else and you’ll end up with a lot fewer friends. You better hope the vendor you buy from wants to buy you lunch all the time because no one else will.

****EDIT: I should probably make the point that I am only focusing on technical merits of hardware/technology first. There are other very valid reasons to buy or not buy certain products such as ease of use or familiarity by existing staff, ability to procure said equipment, or size and scope of project. If you have a fairly nailed down requirements list for some remote sites and need to deploy equipment there, then I wouldn’t advocate going through a full blown product selection procedure every single time. My point is simply that before any of those things are considered, the product must meet the technical requirements of the job at hand. After determining that, then you can consider the support structure, cost, etc. If the cost is too much, your requirements will have to change.

Thanks to Scott and Jon for their thoughts on the matter.

Categories: vendors Tags: ,

It’s Game Day! Are YOU Ready?

It’s late August here in the United States. That means one thing for a lot of people. Football is starting. No offense rest of the world. Your football is my soccer, although I tend to side with you that my soccer should be called football. How often does one kick an American football? A LOT less than we touch the ball with our hands. I’m getting off on a “semantics” tangent though. It is the one sport that predominately resides within North America. Yes, I am acknowledging you too Canada!

Many athletes at all age levels have been practicing for several months and are ready to get started with the football season. Many a Saturday afternoon, Sunday afternoon, and Monday night will be spent watching people knock each other over to carry a piece of pig skin across some lines on the ground and celebrate by dancing as gracefully as one can when covered by all that protective gear. Millions of people will watch all the way up to early next year when the championships are decided by as little as 1 point. For the most part, there are no do-overs. All of you “instant replay” fans just bite your tongue and let me carry this analogy as far as I can. When the game is over, it is over. There are no series of games like baseball, hockey, and basketball have. You have one shot at glory. Miss it, and you’ll have to wait until next season.

There’s a German proverb which says: “To aim is not enough. You must hit!”

I get paid for things that go bump in the night. Whether that thing happens to be a router failing, or a circuit deciding it no longer likes my 1’s and 0’s, my job is to fix it and fix it fast.

I do come to work during the day. I go to meetings and look at configurations of various hardware. I build network diagrams and dispense or seek advice on a number of different things. I participate in the important philosophical discussions like whether or not Anakin Skywalker was a better Jedi than Luke or Yoda(In my opinion, Anakin (aka Darth Vader) was the better Jedi and was robbed of his destiny by his meddling child and his rebel scum friends). I put in change requests for maintenance that must be performed. I can plow through the day to day stuff without hardly any interaction from management. Of course, they care about the quality of the work and if I used these stencils in my Visio diagrams, they might object. However, my overall existence in the day to day network operations life is rather calm.

In essence, I do the things that need to be done during the day, but my REAL job comes in spurts. Kind of like football(From now on, when I say football, I mean American football.) players. My game time comes at odd hours much like the police officer or fire fighter. When trouble happens, I need to perform. I need to be able to ask the right questions and formulate a short list of what the possible problems are. I need to be able to troubleshoot in a logical fashion either working up/down the OSI model or grabbing a packet capture and examining the session flows. When it is my equipment or systems that are at fault, I have to get in there and make the big play. I need a touchdown each and every time. I can’t drop a pass or fumble the ball. I get paid for results and rest assured my management is watching. They have to. All it takes is for someone much higher up on the food chain to ask why they pay the salaries of network people who can’t seem to fix the network. Then, I am out on the street forced to sell my services to the highest bidder, who hopefully doesn’t play Dungeons and Dragons or World of Warcraft with any of my now former co-workers/managers. I would have used a sport like tennis or basketball, but since I am in IT, the odds of that happening are much less than a bunch of technical geeks sitting around in Viking helmets and leather tunics taking part in the raid of an Ogre village on World of Warcraft over a shared broadband connection in an obscure apartment complex deep in suburbia while guzzling Red Bulls and listening to angry death metal music. By the way, for all of you D&D geeks who are shaking your heads in disgust at my mention of Ogre villages, I get it. I saw Shrek. I know they are solitary creatures, but I needed an effective illustration. If I used Elf village, the visual would have been less powerful.

Am I saying that we can’t make mistakes? Well, that depends. There are some places in which you can’t. Ever. Most places will allow mistakes. We’re all human and mistakes will happen. Of course, with enough attention to detail those mistakes can be minimized significantly. What I AM saying is that you need to be able to perform when a crisis hits. Your entire career at a particular company may come to a screeching halt over just a few minutes of doing the wrong thing. It won’t matter how long you have been with company X if your performance is so poor that company X starts bleeding millions of dollars due to an outage that you can’t fix. Problems are going to happen. Outages are going to happen. If your company expects you to fix them, you better fix them. Now I know that some people get in over their heads. It may be the company’s fault for placing an unrealistic demand on you, or it may be your fault for misrepresenting your capabilities. If your company is expecting you to fix and support issues with F5 load balancers and you have never so much as looked at an F5 load balancer, you better let someone know and get up to speed as fast as you can. After all, your job typically is whatever your company says it is. Don’t like that? Tough. Go somewhere else. Life isn’t fair. Sometimes you are the one in the room everyone is counting on to fix the problem, even if it isn’t your equipment that is causing the problem.

In the interest of brevity, let me close with some thoughts on how to ensure your performance is top notch.

1. Know what the scope of your job is. – This may seem a bit simplistic, but you need to be on the same page as management when it comes to your responsibilities. You cannot rely on someone else to tell you if that piece of network gear buried in some rack in a data center is your responsibility. You are going to have to find that out yourself and it needs to happen before the problem occurs. Hopefully your co-workers who have been there longer than you have a good grasp on what things belong to you. For example, if your boss expects you to take care of the wireless network, you better do it or have it handed off to someone else who can take care of it when a problem arises.

2. Develop your skills around your responsibilities. – I’m not advocating you abandon any sort of professional development that is not DIRECTLY related to your job. However, a BIG part of getting a pay check from a company is directly tied to being able to do your job as defined by the company. Good managers won’t load you up with things you are not able to do unless you have managed to con your way into a job by being a bit liberal with your resume. If you are stuck with something you are relatively new to, do the best you can and make sure your management KNOWS you are doing the best you can. Read books, configuration guides, white papers, and other technical documentation. Attend a training class. A career in IT is all about adaptation. None of us are working with the same hardware/software we were 10 years ago. If you are, odds are you either work for the government or a REALLY cheap company. Perhaps there are one or two things that have had a ten year plus shelf life, but for the most part, technology changes so fast that a decade is a lifetime in IT.

3. Be prepared. – Expect the unexpected. Think about different failure scenarios and design the network to remediate any single points of failure. If need be, have some block time purchased with an external consultant or VAR that has considerable experience with your specific hardware/software platforms. Carry maintenance contracts on all your hardware/software that is critical.

4. Raise any red flags early on. – If there are issues you know are going to be a problem, let someone know as soon as possible. Document those issues. Fix those issues. Even if the company says no due to budgetary reasons or some technical issue, at least you have done your homework and tried to make these issues known. If a problem does occur, nobody can come back to you and say that you should have known about this, or that it was your fault, etc. Additionally, it may work out to your benefit as management typically appreciates people who just want to make things better and do what is right for the stability of the network.

5. Stay calm during the outage/problem. – Remember that in a lot of people’s eyes, it is always the network that is at fault. Don’t let that get you down. Stay focused and work on the problem at hand. Ask as many questions as needed to get an idea of what the scope of the problem is. Don’t be afraid to ask very basic questions. One of the best ones to ask is “What changed?” or “When did the problem start?”. Maintain professionalism at all times. I get upset when I am on a conference call and someone won’t stop moaning about why it’s not their fault long enough for me to ask a question or answer one. However, it’s rather immature and unprofessional for me to lash out at them in anger even if I know it’s not my issue. There ARE times when I have had to tell someone to stop talking so that I could either answer a question or ask one of someone else on the call. I hate having to do that, but sometimes in the interest of getting it fixed YOU HAVE TO. If you are dealing with an issue where people are congregating around your desk watching over your shoulder, try and tune them out. You can’t always tell them to get lost or to leave you alone. You have to learn to work under pressure, but if you have taken item number 2 to heart, you should be able to minimize the time these people are hovering near your desk.

6. Be humble. – If people know that you don’t know it all, they tend to cut you a little more slack. If you are condescending and treat people like garbage because they don’t know the difference between a “routed” protocol and a “routing” protocol, they will be very unforgiving of your mistakes. Remember, there is no way possible you can know it all. There are people out there who know far more than you do. Sometimes they are in the same room as you. If you save the day and score a touchdown, good job. You don’t have to do the happy dance in front of everyone if you figure out what caused that routing loop. Your actions will speak for themselves. On the other hand, if you storm into the room demanding people shut up and watch you perform, you better get it right. If you don’t, your stock just went down and at some point, you’ll be looking for work elsewhere.

Ask any athlete how hard they have to work in order to get to their peak performance level and you’ll no doubt hear a recurring answer. You will find that it took a lot of time and effort to get there. There are no short cuts. When the wide receiver catches the ball and runs 80 yards to the end zone for a touch down, you can bet he ran sprints hundreds of times in the months prior. When the quarterback throws the ball for 50 yards and drops it right on the chest of the wide receiver, you can bet he threw that same pass hundreds of times in the months prior. When the defensive end wraps his arms around the running back and slams him to the ground, you can bet he practiced on a tackling dummy hundreds of times in the months prior. The examples go on and on. Peak performance takes time and effort. You practice and refine your skills for what is usually a short performance. Sometimes the performance extends over a couple of days or weeks, but generally issues get diagnosed and resolved in a relatively short time. How you prepare will determine the outcome. If you take shortcuts, expect poor results. If you put in the effort to perform well, good things will come your way. Granted, you probably won’t get a multi-million dollar contract with company X, but how many football players do you know who understand cool stuff like policy routing and VRF’s? Oh, and being able to fix problems on the network quickly leaves you more time to play World of Warcraft.

Categories: efficiency, learning

Don’t Just Collect. Consume.

August 19, 2010 14 comments

I have a bit of a problem when it comes to information. I tend to resemble someone on the TV show Hoarders. I have loads of PDF files on my laptop. Some are on my iPad. Some are on my desktop PC. I even have some on a little flash drive I carry around in my pocket. Of course, I have plenty of books. Just for networking related stuff, I have a pile at home as well as a good size collection at work. Then there are the URL’s. Every day I save all of the valuable URL’s I have discovered from Twitter and RSS feeds and put them in their own little folder with the date as the name under my bookmarks in Firefox. If I follow you on Twitter and you post a link, odds are I have looked at it and bookmarked it if it is something that pertains to my interests. If I read your blog, and odds are I do, I will bookmark various posts of yours and at some point go back and reference them. You see, I don’t always have time to read everything during the day. Additionally, if it is a post like this, or this, I will have to go back and read it all when I have a considerable amount of free time.

Therein lies the problem. I never seem to have time to go back and sift through every thing like I had planned. Well, that’s not entirely true. I have the time. I just get caught up in all the new links that are posted on Twitter every day and wind up spending study time skimming new blog posts or digging through websites. There’s a lot of good info out there that people are sharing. I suppose I could limit my intake to just routing and switching, but what fun would that be? Besides, I don’t want to be ignorant of the other things that are out there. After all, it wasn’t that long ago that I had absolutely nothing to do with voice, storage, wireless, and security. Times are changing, and changing fast.

There’s just so much out there that needs to be absorbed. Just when I think I have a handle on most of the Cisco product line, they go and release UCS, and the Nexus 1000V, and the ASR1000’s, and Clean Air. It never ends. There is always a new technology or some new hardware to read up on.

The realization I have come to is that there is no use in collecting information if you are not going to use it. All of those PDF’s, books, and URLs will do me no good if I never use them. At the same time, if I stop keeping up with what is current, I will fall behind and be of less help to my employer. I won’t be able to effectively design anything because I won’t be aware of what the possibilities are.

One of two things has to happen. The first option is that I can really narrow down the focus to just the things that directly pertain to my job. That will alleviate some of the information I have been hoarding. The second option is to start dedicating a bigger portion of my day to information consumption. I think option two is the best one as I can’t see myself ignoring products and technologies that I am not using today due to the fact that I may be using them tomorrow. Besides, it’s more fun when you have a wide range of technologies to keep up with as opposed to a handful.

I don’t know how everyone else handles their technical knowledge maintenance. If you happen to have a tried and true method of keeping up with all things networking, I would love to hear about it.

Categories: efficiency, learning, vendors

Drowning in Features

August 12, 2010 3 comments

Have you ever bought a car without all the bells and whistles? You end up with some blank buttons in your dashboard. You’re not really sure what they are for, but there’s that little voice in the back of your mind telling you that you should have bought that feature. Of course, you can drive the car for years and never need that button. Or, you can flip through the driver’s manual and see just what it is that button does on the fully loaded model you didn’t buy.

Perhaps you are a student of automobiles and wouldn’t dream of buying a new car without knowing all the possible options or features. You make sure you buy exactly what you need. Nothing more. Nothing less.

What about features that you never knew existed? My mother drove a 1994 Mazda 626 for about 7 or 8 years. It was a pretty nice car, but it had a feature that I have not seen in any other car. The center vent could oscillate back and forth between the driver and passenger seat. There was a button on the center console labeled “Swing”. Push the button, and the vents “swing” back and forth. Leave it off and the air blows in the direction you have the vents turned. Before I saw this, I had no idea such a thing existed. After I saw it, I looked for it in every car I drove or rode in. Not long after my mother bought her 626, I bought a Mazda Protegè. Sadly, I did not have the “Swing” button as an option on my car. Although I drove that car for a good 8 years or so, I never forgot about the “Swing” button letdown and felt as if my car was inferior. My mother moved on and bought a Mazda Millenia. That was the step up from the 626. The flagship car of Mazda, much like the Toyota Avalon or Chrysler 300. Sadly, the Millenia lacked the “Swing” feature in the AC vents, but it did have blue colored gauges at night on the dashboard. Now the “Swing” feature didn’t seem as cool next to the blue colored gauges and dials. One of my friends had recently bought a Volkswagen Jetta around the same time. He had blue colored gauges and dials as well. Of course, his CD changer was in the trunk or boot, and I was not a fan of that feature at all.

Now that I have exhausted my knowledge of automobiles, let me relate this to what you and I do for a living(or at least I assume you do the same thing as me). Features come and go. Some are neat and have a practical purpose. Others are just there. Eye-candy. Nothing more. Sometimes what we need is not the same as what we want. Sometimes we don’t want something until we find out it exists. Ahem, iPad anyone? Now before any Apple fan-boys or fan-girls jump down my throat, I must admit that I own one. I bought one recently and have decided that if my house were burning down and I had to choose one item to take with me in addition to my wife and kids, it would probably be my iPad. 🙂 Having said that, I was perfectly fine living with a laptop and desktop PC at home prior to the iPad’s debut. Once it was marketed to me, and I must say it was marketed rather well, I needed one. Not wanted. NEEDED.

I’m getting away from what I wanted to focus on and that was features, versus an entirely new product, but you get the point. There are a lot of neat little things out there that one vendor does over another. However, I wonder if those particular features are REALLY something we need. Do I REALLY need something like OTV? Some people will say yes. Others will say no. I would say it depends. What were you doing prior to OTV? Although my main focus is on network hardware and software, the same holds true for features in software and hardware outside of the network space. In the case of security, sometimes features can actually end up being vulnerabilities or additional entry points that you have to lock down.

So what is my point in all of this? Well, I am not going to give you answers because to be quite honest, I don’t have them. Remember, this is a blog about network therapy. A big part of therapy is simply stating the problem or concerns. Here’s what I think. If you are spending a lot of money on something, make sure you need what you are buying. Not want. Need. Yes it takes time to go through everything, but that’s what you get paid for. Don’t buy a Lamborghini if a Kia will suffice. If you need the Lamborghini, make your case and get it. Don’t settle for the Kia. If you absolutely have to settle for the lesser due to decisions made above your pay grade, then put in writing your concerns about why the Kia is not sufficient and move on. I know I said I had no answers, but I do have some suggestions. It’s better than a kick in the head, and it’s free, so take it for what it’s worth.

1. Only buy what you need or will need in the near future. You’ll want to consider future requirements as well (ie expansion, features needed down the road). It is often hard to predict the future, but do the best you can.

2. Careful consideration of how to spend company dollars will ultimately reflect good things about you or your particular group. You don’t want to be known as a money wasting group or person.

3. Careful attention to features will help you navigate the difficult waters of vendor selection. This is one of the harder things to master. If you know what you need and are relatively aware of what the major vendors are doing, product selection along with the right feature set becomes a bit easier. For example, check out this article by Greg over at etherealmind.com. If you ONLY live in the Cisco 3750 world of stackable switches, you might miss the fact that Juniper can do the same thing, but extend the logical switch over a LOT longer distances. This is but one example. There are many more like this out there. Go find them and buy them, but only if you must. 😉

Thankfully, most network hardware comes with a ton of features by default. It’s usually the higher end stuff that we talk ourselves into buying and don’t necessarily need. I’m looking at you VSS. I’m not saying there isn’t a place for it. I use it and think it is some pretty cool stuff, but I wonder if the expense is worth the benefit sometimes.

If you are a consultant, ignore everything I just said(or “wrote” if you want to nitpick). You make your living off of selling services and equipment. You are exempt. However, if you are the reason a 10 user network of office workers have dual 6513’s with Sup720’s, ACE, FWSM, and WiSM, you should be ashamed of yourself. In that case, I can simply quote Jesus: “Go and sin no more.“.

Categories: vendors Tags: ,

You’re lying. Well, at least until we see the packet capture.

August 10, 2010 4 comments

Maybe you’ve experienced this before. You are minding your own business without a care in the world when all of the sudden the phone rings.

You: Hello?
Them: There’s a problem and we think the network is the cause. Can you check it?
You: Check what? The network?
Them: Yes.
You: Which part? What am I looking for?
Them: Any sort of problem.
(Fast forward an hour or so later)
You: Well, I ran a packet capture on the switch port connecting to system XYZ. I see a bunch of TCP resets coming from your server.
Them: Okay. We’ll take a look.
(Fast forward another half hour or so)
Them: It looks like we found the problem. Process blah-blah-blah was failing due to a dependency on process ha-ha-ha. We reset the services and everything is working again. Thanks for your help.
You: Okay. Not a problem.
(Back to life as before)

Sound familiar? If you have been in networking for more than a couple of years, this should invoke all kinds of warm and fuzzy memories. Meals were missed. Plans were canceled. Sleep was lost. All in the name of defending the network’s honor. Oh yes. This is the part about a career in networking that is conveniently left out of the brochure you are given before signing your life over to Cisco/Juniper/Citrix/Aruba/Nortel/F5/Brocade/Alcatel/etc.

I have seen more than my fair share of these incidents. With the exception of a brief stint in consulting and about 2 years doing things in the US military that you’ll never do anywhere else, I have lived my entire IT existence in the “corporate” setting. By that I mean chained to a desk looking over logs and configurations. Slaving away on the same network for years on end. Getting to know the lay of the land in the same way one knows all the sounds an old car or house makes. In short, after you work on a certain network long enough, you can see into the guts of it like Neo can with the Matrix.

If you are like me, you have a certain affinity towards your network. Sure, it may need some help with cabling or a cleaner route table, but you work with what you have. You make changes as you can. You replace hardware as the budget allows. You care for it like a farmer does his corn fields. Is this creeping you out yet? Well it shouldn’t. There are plenty of people out there who love their networks even to the point of showing them off to the world.

Here’s the problem with being a networking engineer/administrator/architect/designer/janitor. You have to understand everyone else’s piece of the pie, but not too many people have to understand yours. Fair? No, it isn’t, but as an officer I once worked for in the military told me: “That’s a burden you have to bear.” He was right, even if I didn’t like hearing it. That is not to say that all other entities within IT or greater corporate America are completely clueless when it comes to networks. Quite the contrary. There are plenty of systems people who understand networks very well. You can give them an IP with a classless subnet mask and they don’t even bat an eye because they know exactly what you mean when you say it’s a slash 26 network. However, when it comes to “applications” people, my experience has been that they only have to know their piece of the pie and can conveniently blame the network when a problem arises. I know what you’re thinking. Did he just paint all applications people with a broad brush? Yes. Yes I did. Of course, if you happen to be an applications person, I meant everyone else. Not you. 😉

That brings me to the title of this mini-rant/post. You can plead your case before everyone telling them that it probably isn’t the network, but they’re not going to believe you. Why? A lack of understanding or a lack of visibility into your world. You see, the network is just a big murky box to them. Maybe if they had access to some monitoring platforms they could be swayed, but unless your monitoring package can go down to the transaction level like Compuware’s Vantage product, you’re still going to have some explaining to do. However, in a way that I cannot begin to explain, people tend to believe packet captures. Don’t ask me why. I can tell you until I am blue in the face that the switches and routers on the network for the most part could care less what your payload is and you won’t believe me. You may not even understand TCP, UDP, and the rest of the acronym soup being tossed around, but for some reason, Wireshark or tcpdump results are more credible than Steven Hawking discussing time travel. If you want some good laughs around things like this, follow this guy on Twitter. He seems to deal with this on a regular basis and has some hilarious tweets to show for it.

Let me end this post with the following suggestions:

1. Get familiar with interpreting packet captures. Wireshark is the most well known packet capture utility for Windows boxes out there. There’s even a good book out there that covers everything in detail. You’ll also need to know about TCP and how it works. There are other protocols like UDP and ICMP that will be good to know, but TCP is by far the most useful protocol to know and understand when dealing with packet captures. For some good info on TCP, see here.

2. Don’t be afraid to run a packet capture early on in the troubleshooting process. I am finding that this tends to solve the problem when all other methods fail.

3. Don’t EVER, and I stress EVER, state emphatically that there is no way possible that the network is at fault. 99 times out of 100 you may be right. Get it wrong 1 time, and everyone will be gunning for you. There’s always the possibility that the network is at fault. Even when everything you know is telling you that it isn’t the network, if you don’t have a packet capture to back it up, you’re wasting your time.

4. Educate your co-workers about the network, or networking in general. Try to do this without condescension. Nobody wants to listen to Nick Burns tell them how stupid they are. The more people know, the less likely they are to hurl unsubstantiated accusations your way that you are manipulating traffic to break their application. It makes every organization a lot stronger when education is provided from the various departments. Please understand that although you and I might get excited when talking about routing protocols, not everyone else will. Oh how I wish my wife and I could have the EIGRP vs OSPF discussion, but it’s just not going to happen. Some people are not going to want to know a whole lot about the network, so try and figure out how much they really want to know and tailor the education to that level.

If nothing else, looking at a bunch of packet captures will help you appreciate what is going on behind the scenes every time you read an e-mail message or look at a website. Although other people might not appreciate it, I find that it helps my wife fall asleep faster when I talk about the various TCP flags and why they are used in data transmissions. At least she will never blame the network. 🙂

Categories: efficiency, learning Tags: