2014 and moving to the national team at WWT

In 2014 I became more involved with the national team as well as managing some of the data center redesigns we had in flight. Since my role is a pre-sales architect, we would get a design to the HLD and BoM phases, then professional services would take over the day-to-day implementation and migration phases. Of course, since it's our design, we need to remain plugged in to implementations in flight as well as new opportunities. So as I was traveling around the southeast helping account managers close new business, I was also supporting existing implementations in flight. Now things really started to get hectic: traveling and meeting customers by day and managing projects in the hotel at night, with lots of travel across the South as well as St. Louis and the west coast.

I was working on my VCP5-DCV cert, plus I started working with the national team as Cisco had just released the alpha code for ACI (before the first public ACI code). We had some amazing ties into the Cisco Insieme BU at the time, and had gotten the first two NEXUS 9508s off the assembly line back in Nov 2013. With this new hardware we were able to install the first ACI code, and we worked with the BU as they started releasing the beta code to partners. Once ACI was introduced, the national team created an ACI class to help customers understand ACI. I worked with them and got trained so I too could give the class as an overlay for the southeast. I was also working with the Cisco folks on UCS Director now that Cloupia had been purchased by Cisco. I was familiar with Cloupia and had worked with it before, but now we were pushing it and I had to be an expert, so I went to some bootcamps and training.

In the Oct/Nov 2014 time frame I was approached by the national team to join them and focus exclusively on NEXUS, ACI, and data center switching technologies for the entire country. Now it's really going to get crazy.

Exploits 2013-2014

So after my post a few days back and cleaning up the blog design, I started to think: what the heck have I been doing for 5+ years and why haven't I been writing? Around that time is also when I transitioned roles from being a pre-sales engineer attached to a couple of account managers to a regional overlay for the southeast team focusing on datacenter technologies (NEXUS, UCS, storage, storage switching, and virtualization). This also meant a lot more travel; before this we would be within a few hours of a customer and rarely spent the night in a hotel. Moving to this new role meant covering accounts in Tennessee, Virginia, Georgia, and Florida, as well as some of their other locations.

At this point I also started working a lot with what we called our National Team, who were considered our “experts” in a particular technology. I worked on being able to give our UCS class, our NEXUS class, and our RAD demo so we could offload some of these tasks to local engineers as we needed to scale.

Some of the highlights for this year were four major data center redesigns I did from Catalyst 6500 to NEXUS, as well as major LAN and VoIP upgrades for other customers' networks, plus multiple NEXUS and UCS classes and RAD demos to help out the national team. Customers ranged from several major banks in the North Carolina area to a home improvement company and a television network in Knoxville. There were also Cloupia, FlexPod, NEXUS, and ACI bootcamps given by the OEMs to bring us up to speed. I also participated in several beta tests for the ASAv and the NEXUS 1000V.

Looking back at my calendar, emails, and designs from that year, it's no wonder I had no time. However, this was only the beginning.

NEXUS 7K and Nortel vPC and SMLT interoperability

We have some customers that are interested in migrating from the Nortel 8600 line over to the NEXUS switches. These are very large clients with mission-critical networks, so they cannot be taken down and the migration process must be done during small windows with minimal outages. The basic idea will be as follows:

Stand up the entire new infrastructure and connect it to the Nortels. The question was how we would set it up: using just spanning tree, using SMLT, or using vPC? In doing a lot of research we found that when you turn on SMLT it disables spanning tree on the SMLT ports. How would this affect spanning tree on the 7K? Our original proposal to these customers was to use SMLT from the 8600s and regular spanning tree on the 7Ks as edge switches.

The next step, after the new switches were in, was to migrate the edge switches over to the NEXUS 7Ks via vPCs and use the 7Ks as distribution switches during this process. All L3 traffic will still be switched on the 8600s. One client is also replacing their closet switches and going to 3750X stacks, so that will be fairly easy. The other client has many Nortel edge switches and these will remain in place; we will simply migrate the port channels and place them in a vPC (a rough sketch of the vPC configuration on the 7Ks is below).
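
Purely as an illustration, here is roughly what the vPC configuration on each 7K could look like for one migrated edge-switch port channel. The domain ID, VLANs, addressing, and interface numbers are all invented for the example; the real values come from the HLD.

! On each NEXUS 7K (example values only)
feature lacp
feature vpc

vpc domain 10
  peer-keepalive destination 10.1.1.2 source 10.1.1.1

interface port-channel1
  description vPC peer-link (physical members omitted for brevity)
  switchport
  switchport mode trunk
  vpc peer-link

interface port-channel20
  description To migrated edge switch
  switchport
  switchport mode trunk
  switchport trunk allowed vlan 10,20,30
  vpc 20

interface Ethernet1/20
  switchport
  switchport mode trunk
  channel-group 20 mode active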

Once the edge and closet switches have been migrated, we will move the datacenter servers onto the new 5Ks and 2248s. Again, this is all layer 2; all layer 3 switching will still be done on the 8600s.

The next step was to migrate the layer 3 switching from the 8600s down to the 7Ks. This would be done by creating point-to-point SVIs and using them to form OSPF neighbor adjacencies so the routes on the 8600s would be learned by the 7Ks. Once full layer 3 routing was in place, we would add the layer 3 interface of each subnet to the VRRP groups, and once VRRP was up and running we would migrate each subnet from the 8600s to the 7Ks simply by changing the VRRP priority, at which point the default gateway for that subnet would move to the 7Ks. A sketch of the 7K side of this is below.
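
Again purely as an illustration (VLAN IDs, addressing, OSPF process/area, and VRRP group numbers are invented for the example), the 7K side of that migration could look something like this:

! NEXUS 7K - example only
feature interface-vlan
feature ospf
feature vrrp

! Point-to-point SVI toward the 8600 for OSPF peering
interface Vlan999
  description P-P routing link to Nortel 8600
  no shutdown
  ip address 192.168.255.1/30
  ip router ospf 1 area 0.0.0.0

router ospf 1
  router-id 192.168.255.1

! Example user subnet; the 8600 keeps the higher VRRP priority until cutover
interface Vlan20
  no shutdown
  ip address 10.20.0.3/24
  vrrp 20
    address 10.20.0.1
    priority 90
    no shutdown
! At cutover, raise this priority above the 8600's and the virtual gateway moves to the 7K.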

NEXUS 7K vPC and 3750X stacks: vPC failure on reboot

I am configuring a customer's datacenter infrastructure, and during failover testing of the 3750 stacks I get a very unusual error. When the stack fails, it appears to segment the stack, and the NEXUS vPC sees this as coming from two separate switches. Now the weird thing was that the port channel stays up; however, the vPC stays down. Once the switch comes back up all is well, but in a failure or power-loss scenario on the port-channel members this would sever all connectivity for clients on the stack.

2011 Mar 25 23:16:41.277 ODFL-N7010A %ETHPORT-3-IF_ERROR_VLANS_SUSPENDED: VLANs 1,8,12,16,20,28,32,36,40,44,48,52,56,60,90-91,98-100,108,112,116,120,137-138,152,200,224,232,252-255,4088-4090,4093 on Interface port-channel8 are being suspended. (Reason: vpc port channel mis-config due to vpc links in the 2 switches

I had followed the vPC config guide and set up all parameters as indicated: the 7K is set up as LACP active and the 3750 stack as LACP passive, with spanning-tree guard root on as well. A quick call to TAC indicated it's a known issue with LACP and the stacks; setting the port channels to mode on with no LACP and removing the root guard did the trick (a sketch of the change is below).
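
For reference, the workaround amounts to something like the following. Interface and port-channel numbers are just examples, not the customer's actual config, so adjust to taste:

! NEXUS 7K side - static port channel, no LACP, root guard removed
interface Ethernet1/8
  switchport
  switchport mode trunk
  channel-group 8 mode on

interface port-channel8
  switchport
  switchport mode trunk
  no spanning-tree guard root
  vpc 8

! 3750X stack side - static port channel as well
interface range GigabitEthernet1/0/49 - 50
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 8 mode on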

Cisco UCS quick start guide pt 4

 

Now that we have the UCS system racked, powered up, and connected to the upstream switch (please use 10 Gb only; see previous posts on why) and SAN, we can configure the two 6100 fabric interconnects (FIs). Everything should be connected similarly to Fig 1. One thing not shown here, and probably the most important, is the L1 and L2 ports on each FI. They must be connected to the corresponding ports on the other FI (L1 to L1, L2 to L2). These two ports are used for keepalives and to determine which FI is the primary. We will configure the primary FI, then save and reload. Once it is back up, we will perform a minimal configuration on the secondary FI and it will sync up with the primary. Each fabric interconnect will have an upstream 10/100/1000 connection for out-of-band management; connect mgmt0 from each fabric interconnect to an upstream switch for connectivity to the fabric interconnects and blades.

Here are some things we need to have before we power on the primary UCS system and go through the startup wizard. Before beginning the initial configuration of the primary fabric interconnect you will need the following information:

  • System name (keep in mind the system name is automatically appended with -A and -B)
  • Password for the admin account
  • Three static IP addresses: one for the management port on each interconnect and one for the cluster virtual IP address used by Cisco UCS Manager
  • Subnet mask for the three static IP addresses
  • Default gateway IP address
  • DNS server IP address (optional)
  • Domain name for the system (optional)

First interconnect configuration

  • Connect to the console port.
  • Power on the interconnect device. You will see the power on self test messages as the interconnect boots.
  • When the unconfigured system boots, it prompts you for the setup method to be used. Enter console to continue the initial setup using the console CLI.
  • Enter setup to continue as an initial system setup.
  • Enter y to confirm that you want to continue the initial setup.

Second fabric interconnect configuration

  • Connect to the console port.
  • Power on the interconnect device. You will see the power on self test messages as the interconnect boots.
  • When the unconfigured system boots, it prompts you for the setup method to be used. Enter console to continue the initial setup using the console CLI.
  • Enter y to add the subordinate interconnect to the cluster. The cluster is built automatically, and the primary cluster member has been detected.
  • Enter the admin password of the peer interconnect.
  • Enter the IP address for the management port on the subordinate interconnect switch.
  • Review the setup summary and enter yes to save and apply the settings, or enter no to go through the setup wizard again to change some of the settings.

UCS MANAGER INITIAL CONFIGURATION  

  • With the SAN and LAN upstream switches configured we will start configuring the UCS environment using the UCS Manager. To use the UCSM interface you will need to have a Java runtime environment loaded on your system.
  • To begin, point your web browser to the cluster IP address you configured on the fabric interconnects. On the Welcome screen, select Launch to download and run UCS Manager. When the logon screen appears, enter the admin account and password you configured during the initial configuration of the fabric interconnects.

At this point UCS is up and running, but there is still a lot more work to do. The chassis needs to be discovered by the FIs, so we will configure this now.

Hardware configuration (Equipment tab)  

We will start the configuration by performing some of the general hardware configurations, such as defining the server ports, defining the LAN uplink ports, and configuring the chassis discovery policy. From the Equipment tab, navigate to Policies and configure these under Global Policies. Please keep in mind the port licensing structure when you configure the server and uplink ports: by default only the first 8 ports on the 6120, the first 16 ports on the 6140, and the Ethernet ports on the 4×4 and 6-port 10 Gb expansion modules are licensed. So if you didn't get the port licenses, the ports are basically dead.

From the Equipment tab we will perform the following tasks so we end up with a fully functional, ready-to-configure UCS system (a CLI sketch of the same steps follows these procedures).

  • DEFINE SERVER PORTS
  • You can only configure server ports on the fixed port module. Expansion modules do not include server ports.

    This task describes only one method of configuring ports. You can also configure ports from a right-click menu, from the General tab for the port, or in the LAN Uplinks Manager.

    Procedure


    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   In the Equipment tab, expand Fabric Interconnects > Fabric Interconnect_Name > Fixed Module > Unconfigured Ports .
    Step 3   Click one or more ports under the Unconfigured Ports node.
    Step 4   Drag the selected port or ports and drop them in the Server Ports node. The port or ports are configured as server ports, removed from the list of unconfigured ports, and added to the Server Ports node.
  • DEFINE UPLINK PORTS
  • Configuring Uplink Ethernet Ports

    You can configure uplink Ethernet ports on either the fixed module or an expansion module.

    This task describes only one method of configuring uplink Ethernet ports. You can also configure uplink Ethernet ports from a right-click menu or from the General tab for the port.

    Procedure


    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   On the Equipment tab, expand Equipment > Fabric Interconnects > Fabric_Interconnect_Name.
    Step 3   Depending upon the location of the ports you want to configure, expand one of the following:

    • Fixed Module
    • Expansion Module
    Step 4   Click one or more of the ports under the Unconfigured Ports node.
    Step 5   Drag the selected port or ports and drop them in the Uplink Ethernet Ports node. The port or ports are configured as uplink Ethernet ports, removed from the list of unconfigured ports, and added to the Uplink Ethernet Ports node.
  • CONFIGURE CHASSIS DISCOVERY POLICY
  • Configuring the Chassis Discovery Policy

    Procedure

    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   On the Equipment tab, click the Equipment node.
    Step 3   In the Work pane, click the Policies tab.
    Step 4   Click the Global Policies subtab.
    Step 5   In the Chassis Discovery Policy area, choose the number of links to be used by the chassis from the Action drop-down list.
    Step 6   In the Power Policy area, click one of the following radio buttons in the Redundancy field:

    • non-redundant—All installed power supplies are turned on and the load is evenly balanced. Only smaller configurations (requiring less than 2500W) can be powered by a single power supply.
    • n+1—The total number of power supplies to satisfy non-redundancy, plus one additional power supply for redundancy, are turned on and equally share the power load for the chassis. If any additional power supplies are installed, Cisco UCS Manager sets them to a “turned-off” state.
    • grid—Two power sources are turned on, or the chassis requires greater than N+1 redundancy. If one source fails (which causes a loss of power to one or two power supplies), the surviving power supplies on the other power circuit continue to provide power to the chassis.
    Step 7   Click Save Changes
  • CONFIGURE POWER POLICY
  • Configuring the Power Policy

    Power Policy

    The power policy is a global policy that specifies the redundancy for power supplies in all chassis in the Cisco UCS instance. This policy is also known as the PSU policy.

    For more information about power supply redundancy, see Cisco UCS 5108 Server Chassis Hardware Installation Guide.

    Configuring the Power Policy

    Procedure

    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   On the Equipment tab, click the Equipment node.
    Step 3   In the Work pane, click the Policies tab.
    Step 4   Click the Global Policies subtab.
    Step 5   In the Power Policy area, click one of the following radio buttons in the Redundancy field:

    • non-redundant—All installed power supplies are turned on and the load is evenly balanced. Only smaller configurations (requiring less than 2500W) can be powered by a single power supply.
    • n+1—The total number of power supplies to satisfy non-redundancy, plus one additional power supply for redundancy, are turned on and equally share the power load for the chassis. If any additional power supplies are installed, Cisco UCS Manager sets them to a “turned-off” state.
    • grid—Two power sources are turned on, or the chassis requires greater than N+1 redundancy. If one source fails (which causes a loss of power to one or two power supplies), the surviving power supplies on the other power circuit continue to provide power to the chassis.

    For more information about power supply redundancy, see Cisco UCS 5108 Server Chassis Hardware Installation Guide.

    Step 6   Click Save Changes.
  • ACKNOWLEDGE CHASSIS(ES)
  • Acknowledging a Chassis

    Perform the following procedure if you increase or decrease the number of links that connect the chassis to the fabric interconnect. Acknowledging the chassis ensures that Cisco UCS Manager is aware of the change in the number of links and that traffic flows along all available links.

    Procedure


    Step 1   In the Navigation pane, click the Equipment tab.
    Step 2   On the Equipment tab, expand Equipment > Chassis.
    Step 3   Choose the chassis that you want to acknowledge.
    Step 4   In the Work pane, click the General tab.
    Step 5   In the Actions area, click Acknowledge Chassis.
    Step 6   If Cisco UCS Manager displays a confirmation dialog box, click Yes. Cisco UCS Manager disconnects the chassis and then rebuilds the connections between the chassis and the fabric interconnect or fabric interconnects in the system.

    At this point you should go into UCSM and verify that all chassis as well as all blades, power supplies, fans, etc. are good and there are no alarms or events.
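
For anyone who prefers the command line, some of the same tasks can also be done from the UCSM CLI. Treat this as an illustrative sketch only (the slot/port and chassis numbers are made up, and the chassis discovery and power policies are easiest in the GUI); double-check the exact commands against the Cisco UCS CLI Configuration Guide for your release.

Define a server port on fabric A (slot 1, port 1):

UCS-A# scope eth-server
UCS-A /eth-server # scope fabric a
UCS-A /eth-server/fabric # create interface 1 1
UCS-A /eth-server/fabric/interface* # commit-buffer

Define an uplink Ethernet port on fabric A (slot 1, port 17):

UCS-A# scope eth-uplink
UCS-A /eth-uplink # scope fabric a
UCS-A /eth-uplink/fabric # create interface 1 17
UCS-A /eth-uplink/fabric/interface* # commit-buffer

Acknowledge chassis 1 after changing the number of links:

UCS-A# acknowledge chassis 1
UCS-A* # commit-buffer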

Cisco UCS quick start guide pt 1

Our company has been working with a large storage vendor that purchased UCS and, with Cisco Advanced Services (AS), helped them do the initial deployments of their UCS systems. They are one of the biggest UCS deployments in the world, with over 900 blades and growing. My job as a pre-sales engineer has been to educate various departments on UCS, what it is, and what technological advancements it brings. I was recently asked to craft a quick start guide that would give the users who have purchased UCS some understanding of what goes into deploying it. I am documenting the major steps involved in a basic UCS deployment and trying to put them in layman's terms so people who haven't had the exposure will hopefully understand.

Cisco UCS quick start guide pt 2

At this point I am going to assume that the reader has had some kind of exposure to a presentation on UCS, so we will take a brief look back. A UCS system is a blade chassis environment composed of the following:

  • The 6100 series fabric interconnect, either a 20-port 6120XP or a 40-port 6140XP, with one or two expansion slots available for Fibre Channel or 10 Gb connectivity. The 6100s provide the connectivity from the chassis to the local LAN and SAN as well as the management functions of the UCS system.
  • The 2104XP fabric extender, which connects the servers to the upstream 6100 series switches. This runs FCoE for LAN/SAN connectivity.
  • The 5108 chassis, the sheet metal box which houses the fans, power supplies, blades, and fabric extenders.
  • The B200 series blades, which run two 5500 or 5600 series processors in a half-width form factor with up to 96 GB of memory. They support two hard drives in a RAID 0 or 1 configuration as well as a mezzanine card for LAN/SAN connectivity.
  • The B250 blades, which support two 5500 or 5600 processors in a full-width form factor with up to 384 GB of RAM. They support two hard drives in a RAID 0 or 1 configuration as well as two mezzanine cards for LAN/SAN connectivity.
  • The B440 blades, which support four 7500 series processors in a full-width form factor with up to 256 GB of RAM. They support two hard drives in a RAID 0 or 1 configuration as well as two mezzanine cards for LAN/SAN connectivity.
  • UCS Manager (UCSM), a GUI-driven configuration tool. This gives you a single pane of glass that you use to configure all aspects of UCS.

Cisco UCS quick start guide pt 3

Planning a UCS installation

Here are some things to think about before a final order is even placed. Please see the 5108 installation guide: http://www.cisco.com/en/US/docs/unified_computing/ucs/hw/chassis/install/ucs5108_install.html

  • LAN connectivity. UCS requires 10 Gb connectivity to the upstream LAN. While 1 Gb is now supported in the newest Aptos code, it is not recommended for anything more than a test or DR environment. One issue you can see: if server A is communicating with server B, they are on the same VLAN, and they are pinned to the same 6120, their packets get switched locally across the 10 Gb connectivity of that fabric interconnect. However, if server A and server B are pinned to different fabric interconnects, the traffic has to travel up the 1 Gb link to the upstream switch and then back down the 1 Gb link to the second fabric interconnect. This can lead to somewhat unpredictable network performance between servers in the same chassis. A pair of NEXUS 5010s would be the ideal solution, while other 10 Gb-capable switches such as the 3750 or 2960 could be used. If 1 Gb must be used in a transitory period, make sure this is understood and pin servers that must communicate with each other using heavy I/O to the same fabric interconnect. Finally, the performance of 10 Gb vs 1 Gb, even in an EtherChannel, cannot compare due to the 64b/66b encoding of 10 Gb Ethernet vs the 8b/10b encoding of 1 Gb Ethernet (10 Gb puts 64 bits of data on the wire for every 66 bits transmitted, an efficiency of roughly 97%, while 1 Gb puts 8 bits of data on the wire for every 10 bits transmitted, an efficiency of 80%).
  • The number of uplinks to your upstream switches also needs to be considered. Typically an uplink from each 6100 to each redundant upstream switch would be used. Plan the oversubscription of your LAN uplinks to be in the neighborhood of the industry-recommended 4-12:1.
  • SAN connectivity can be 1/2/4/8 Gb depending on the upstream SAN switch. UCS must be connected to a SAN switch capable of running NPIV mode. Also, if you are planning on using VSANs, each FC uplink can be associated with a VSAN, so if you had 6 VSANs you would need 6 FC connections to your SAN switch.
  • 6100 connectivity to the chassis. Currently Cisco offers 1-, 3-, and 5-meter SFP+ cables, and a 7-meter is supposed to be released. These lengths will dictate how you rack your UCS system, so plan accordingly. You could use fiber SFPs, but that would be extremely expensive, especially when the copper SFP+ cables are about $180.
  • Rack space. These things are downright heavy. A single UCS 5108 chassis is 6RU and weighs in at 250 lbs with everything in it. A 6120 is 1RU and weighs in at 35 lbs, so 2RU and 70 lbs for two 6120s. They fit in a standard 19-inch rack and need 32 inches of depth.
  • Power. A single 5108 chassis in grid redundancy mode requires four 20A 200V circuits, and they should be on separate busways. The 6120s require two 15A circuits and can run at 100-240V.
  • Which network switch will the system connect to?
  • Are the required optics available on the customer's upstream network equipment to provide short-range 10GE to the CA system?
  • Are cables available to connect the CA Fabric Controller to the customer's upstream 10GE network ports? What types and lengths of cable? Cable counts?
  • What is the expected port licensing; does the customer have the licensing info and keys?
  • If a SAN will be connected, are the appropriate components available to connect (expansion modules, optics, cables, FC switches, etc.)? What type of FC-SW is being used? Number of connections per fabric? VSAN requirements?