VMware SRM Testing on Cisco UCS with Routing

I work for a Cisco/EMC/VMware VAR named Varrow and we do a fair amount of VMware SRM projects.

One of the challenges we face in SRM failover testing is routing between VMs that are brought up at the recovery site in a test bubble.  Not all of our SRM customers need this level of testing; many just need to verify that the VMs boot and can access storage.

For the few that need more extensive testing and must route between VMs on different VLANs, we have come up with a simple solution.

This solution will work in any VMware environment, but when the customer has Cisco UCS at the recovery site there are additional benefits to be realized.

The solution utilizes a free virtual router appliance from Vyatta, which can be downloaded from the VMware Virtual Appliance Marketplace – http://www.vmware.com/appliances/directory/va/383813

The advantage of having Cisco UCS at the recovery site is that you can easily create a new vNIC, and the way layer 2 switching works within UCS allows you to route between VMs across multiple ESX hosts.

For non-UCS environments it will not be possible to route between VMs on different ESX hosts without some additional hardware: a pNIC and an L2 switch.

To test this out in our lab here are the steps I followed:

  1. Created 3 new test VLANs that only exist in UCS. It is important that these VLANs do not exist on your northbound layer 2 switch.
    image
  2. Created a new vNIC template in UCS Manager named vmnic8-srm-b and added it to my ESXi Service Profile Template. This vNIC is configured to use Fabric B as primary, with failover enabled so that if B is down it will fail over to A. I normally configure 2 vNICs per VMware vSwitch and let VMware handle the failover, but for this solution I needed a vSwitch with only 1 uplink so that routing between VMs across multiple ESX hosts could be achieved.
    image
  3. After a reboot of my UCS-hosted ESXi host, the new vmnic8 was present
    image
  4. Created a new vSwitch and uplinked vmnic8 to it.
  5. Created 3 new VM port groups on the new vSwitch, one for each test VLAN. (Both steps can also be scripted; see the esxcfg-vswitch sketch after the Vyatta configuration below.)
    image
  6. Imported the Vyatta OVF into vCenter and placed the 3 default vNICs into each of the new port groups.
    image
  7. Powered on the Vyatta VM and logged into the console as root with the default password of vyatta.
  8. Configured the 3 Ethernet interfaces using these commands

configure
set interfaces ethernet eth0 address 10.120.10.254/24
set interfaces ethernet eth0 description "VLAN-120-SRM-TEST"
set interfaces ethernet eth1 address 10.130.17.253/24
set interfaces ethernet eth1 description "VLAN-117-SRM-TEST"
set interfaces ethernet eth2 address 10.13.7.245/24
set interfaces ethernet eth2 description "VLAN-107-SRM-TEST"
commit
save
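
For reference, steps 4 and 5 can also be done from the ESXi console instead of the vSphere Client. This is a minimal sketch; the vSwitch name is my own placeholder, and the VLAN IDs (120, 117, 107) are inferred from the interface descriptions above:

esxcfg-vswitch -a vSwitch2                           # create the new vSwitch (step 4)
esxcfg-vswitch -L vmnic8 vSwitch2                    # uplink vmnic8 to it
esxcfg-vswitch -A VLAN-120-SRM-TEST vSwitch2         # one port group per test VLAN (step 5)
esxcfg-vswitch -v 120 -p VLAN-120-SRM-TEST vSwitch2  # tag the port group with its VLAN ID
esxcfg-vswitch -A VLAN-117-SRM-TEST vSwitch2
esxcfg-vswitch -v 117 -p VLAN-117-SRM-TEST vSwitch2
esxcfg-vswitch -A VLAN-107-SRM-TEST vSwitch2
esxcfg-vswitch -v 107 -p VLAN-107-SRM-TEST vSwitch2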

After the interface configuration I issued these commands to verify configuration and routing

image
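
The commands themselves are the standard Vyatta operational-mode checks, along these lines (a sketch, not the exact output shown above):

show interfaces
show ip route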

I was then able to ping between my 2 test VMs across ESXi hosts, with each test VM's default gateway pointed at the Vyatta interface on its VLAN.

image

Cisco UCS 1.4 Post Upgrade Warnings

After upgrading our UCS lab to 1.4 all of my Service Profiles and Service Profile Templates were in a warning state with blue boxes around them. There wasn’t an outage and all of our blades were still functioning.

The warnings were due to the way Cisco changed the Serial over LAN Policy. My Service Profile Template had the Serial over LAN Policy set to <Not Set>, but in 1.4 that isn't a valid option. I changed the Serial over LAN Policy to “No Serial over LAN Policy” and all of the warnings disappeared.

image

Cisco UCS Firmware 1.4

I have to say, now that I have had a chance to implement Cisco UCS firmware 1.4 and look at the new features, I am blown away. Cisco should have made this version 2.0.

Here is a list of my favorite new features included in 1.4:

SAN Port Channeling:

This allows you to bundle the FC connections on a 6120 so that there is 1 logical uplink to the northbound FC switch. Port channels provide faster convergence when there is a link failure because a server vHBA doesn't have to get re-pinned to another uplink.

image
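
For reference, port channels can be built from the UCSM GUI or CLI. Here is a rough sketch of what the CLI session looks like; the port channel ID and member slot/port numbers are made up for illustration, and the exact scope names may vary by version:

UCS-A# scope fc-uplink
UCS-A /fc-uplink # scope fabric a
UCS-A /fc-uplink/fabric # create port-channel 1
UCS-A /fc-uplink/fabric/port-channel* # create member-port 1 1
UCS-A /fc-uplink/fabric/port-channel/member-port* # commit-buffer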

Maintenance Policies:

Remember when making a change to a Service Profile rebooted the server without asking? Or, even worse, when a change to an updating Service Profile Template caused all of the Service Profiles bound to that template to reboot?
Maintenance Policies prevent these unplanned reboots by forcing the user to acknowledge the reboot, or you can even schedule it to happen after hours.

image

Policy Usage Reporting:

Ever wondered which Service Profiles and Service Profile Templates were using one of the many UCS policies? Wonder no more: there is now a Show Policy Usage report on every policy in UCSM. I wish there were a similar feature for templates; I am guessing that will be included in a future update.

image

Enhanced Active Directory Support:

You no longer have to extend the Active Directory schema and you can map UCS roles to Active Directory groups.

Multiple Authentication Sources:

Pre-1.4 you could only have one authentication source at a time. Now you can create multiple authentication domains and log in against any of them.

image

Local File System Download Option:

A remote SCP or SFTP server is no longer required for downloading new firmware or saving backups; you can now use the local file system.

image

There are lots of other new features in 1.4 that I didn’t mention here, these were just some of my favorites.

Sample script for automating the installation of ESXi 4.1 on Cisco UCS with UDA

Thanks to Mike Laverick, the Ultimate Deployment Appliance now supports ESXi 4.1 – http://www.rtfm-ed.co.uk/vmware-content/ultimate-da/

I developed a script for automating the installation of ESXi 4.1 on Cisco UCS booting from SAN. The UDA template I set up uses a subtemplate for the host names, management IPs and vMotion IPs.

Here is my configuration:

  1. 6 GB Boot LUN hosted on an EMC CX4
  2. Cisco UCS B200-M1 blades
  3. Cisco VIC (Palo) adapters
  4. These vNICs

image

And these vHBAs

image

 

Here is the subtemplate I am using

SUBTEMPLATE;IPADDR;HOSTNAME;VMOTIONIP
UCSESX1;10.150.15.11;ucsesx1;10.150.12.11
UCSESX2;10.150.15.12;ucsesx2;10.150.12.12
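
UDA substitutes these subtemplate fields into the script at deploy time. For UCSESX1, for example, the network line in the script below would expand to something like this (illustrative):

network --bootproto=static --ip=10.150.15.11 --gateway=10.150.15.254 --nameserver=10.150.7.3 --netmask=255.255.255.0 --hostname=ucsesx1.domain.com --addvmportgroup=0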

Here is the script I am using

accepteula
rootpw --iscrypted $1$NvqID7HA$mkw26HiBQgbso6jk1jX014
clearpart --alldrives --overwritevmfs
autopart --firstdisk=fnic --overwritevmfs
reboot
install url http://[UDA_IPADDR]/[OS]/[FLAVOR]
network --bootproto=static --ip=[IPADDR] --gateway=10.150.15.254 --nameserver=10.150.7.3 --netmask=255.255.255.0 --hostname=[HOSTNAME].domain.com --addvmportgroup=0

%firstboot --unsupported --interpreter=busybox

#Set DNS
vim-cmd hostsvc/net/dns_set --ip-addresses=10.150.7.3,10.150.7.2

#Add pNIC vmnic1 to vSwitch0
esxcfg-vswitch -L vmnic1 vSwitch0

#Add new vSwitch for vMotion
esxcfg-vswitch -a vSwitch1

#Add vMotion Portgroup to vSwitch1
esxcfg-vswitch -A vMotion vSwitch1

#Add pNIC vmnic2 to vSwitch1
esxcfg-vswitch -L vmnic2 vSwitch1

#Add pNIC vmnic3 to vSwitch1
esxcfg-vswitch -L vmnic3 vSwitch1

#Assign ip address to vMotion vmk1
esxcfg-vmknic -a -i [VMOTIONIP] -n 255.255.255.0 -p vMotion

#Assign VLAN to vMotion PortGroup
esxcfg-vswitch -v 12 -p vMotion vSwitch1

#Enable CDP listen and advertise
esxcfg-vswitch -B both vSwitch0
esxcfg-vswitch -B both vSwitch1

sleep 5

#Enable vMotion to vmk1
vim-cmd hostsvc/vmotion/vnic_set vmk1

#Set NIC order policy for vMotion port groups
vim-cmd hostsvc/net/vswitch_setpolicy --nicorderpolicy-active=vmnic3 --nicorderpolicy-standby=vmnic2 vSwitch1

#enable TechSupportModes
vim-cmd hostsvc/enable_remote_tsm
vim-cmd hostsvc/start_remote_tsm
vim-cmd hostsvc/enable_local_tsm
vim-cmd hostsvc/start_local_tsm
vim-cmd hostsvc/net/refresh

# NTP time config
echo restrict default kod nomodify notrap noquery nopeer > /etc/ntp.conf
echo restrict 127.0.0.1 >> /etc/ntp.conf
echo server 0.vmware.pool.ntp.org >> /etc/ntp.conf
echo server 2.vmware.pool.ntp.org >> /etc/ntp.conf
echo driftfile /var/lib/ntp/drift >> /etc/ntp.conf
/sbin/chkconfig --level 345 ntpd on
/etc/init.d/ntpd stop
/etc/init.d/ntpd start

#One final reboot
reboot
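
Once the host comes back up, the result can be sanity-checked from Tech Support Mode (which the script enables) with a couple of standard commands:

esxcfg-vswitch -l
esxcfg-vmknic -l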

The four additional vNICs will be used as uplinks to the Cisco Nexus 1000v dvSwitch. 

image

Cisco UCS vNIC/vHBA Placement policies

I recently have had a few people ask me what Cisco UCS adapter placement policies are used for and how/when to use them. This post will hopefully answer those questions and give a few examples.

First I will start with the Cisco definition of what vNIC/vHBA placement policies are. This definition was copied from the Cisco UCS Manager GUI Configuration guide – Cisco UCS GUI Configuration Guide

vNIC/vHBA placement policies are used to assign vNICs or vHBAs to the physical adapters on a server. Each vNIC/vHBA placement policy contains two virtual network interface connections (vCons) that are virtual representations of the physical adapters. When a vNIC/vHBA placement policy is assigned to a service profile, and the service profile is associated to a server, the vCons in the vNIC/vHBA placement policy are assigned to the physical adapters. For servers with only one adapter, both vCons are assigned to the adapter; for servers with two adapters, one vCon is assigned to each adapter.

You can assign vNICs or vHBAs to either of the two vCons, and they are then assigned to the physical adapters based on the vCon assignment during server association. Additionally, vCons use the following selection preference criteria to assign vHBAs and vNICs:

All

The vCon is used for vNICs or vHBAs assigned to it, vNICs or vHBAs not assigned to either vCon, and dynamic vNICs or vHBAs.

Assigned-Only

The vCon is reserved for only vNICs or vHBAs assigned to it.

Exclude-Dynamic

The vCon is not used for dynamic vNICs or vHBAs.

Exclude-Unassigned

The vCon is not used for vNICs or vHBAs not assigned to the vCon. The vCon is used for dynamic vNICs and vHBAs.

For servers with two adapters, if you do not include a vNIC/vHBA placement policy in a service profile, or you do not configure vCons for a service profile, Cisco UCS equally distributes the vNICs and vHBAs between the two adapters.

Usage Scenarios

Half-width Blades (B200-Mx)

If you have half-width blades (B200-Mx) then you will only ever have a single mezzanine card, and therefore only vCon1. In this case you would only use a vNIC/vHBA placement policy in these two scenarios:

  1. In a VN-Link in hardware configuration where you are attaching a Dynamic vNIC Connection Policy to the service profile. In this scenario a vNIC/vHBA Placement Policy is required so that the dynamic vNICs get assigned after the non-dynamic vNICs/vHBAs. This guarantees that the ESX vmnics and HBAs are at the top of the PCI numbering and that the dynamic vNICs aren't intermixed with them. Here is a screen shot of this configuration; anything not assigned (the dynamic vNICs) is placed below the assigned vNICs/vHBAs.

    image

  2. To force the PCI numbering of the NICs/HBAs as seen by the operating system. If you wanted to make sure the HBAs were enumerated before the NICs, or vice versa, you could do that with a placement policy.

 

Full-width Blades (B250-Mx, B440-M1)

  1. Use a placement policy to evenly distribute vNICs and vHBAs across 2 mezzanine cards (vCon1 and vCon2). Here is a screen shot of this configuration:

    image
  2. You have two different types of mezzanine cards: the Cisco UCS VIC M81KR (aka Palo) and the Cisco CNA M71KR (aka Menlo). Let's say for compatibility reasons you want all vNICs on the Palo and all vHBAs on the Menlo. In this scenario you would create a placement policy to configure this assignment. Here is a screen shot of this configuration:

    image

  3. In a VN-Link in hardware configuration where you are attaching a Dynamic vNIC Connection Policy to the service profile and you want all of the Dynamic vNICs on one adapter and the regular vNICs/vHBAs on the other.

    image
  4. You have two different types of mezzanine cards (the Palo and the Menlo, as above) and you are configuring VN-Link in hardware. Again, for compatibility reasons you want all vNICs on the Palo and all vHBAs on the Menlo, so you would create a placement policy to configure this assignment. Here is a screen shot of this configuration:

    image

 

It is important to note that only the Cisco UCS VIC M81KR (aka Palo) allows you to have more than 2 vNICs/vHBAs per adapter, and it is the only card that allows for VN-Link in hardware, where up to 54 Dynamic vNICs can be dynamically assigned to VMs that are configured to be part of the UCSM-managed Distributed Virtual Switch. – VN-Link in Cisco UCS

image

image

Cisco UCS Ethernet Frame Flows

In this post I am going to walk through the northbound frame flows in the Cisco UCS system. Hopefully this information will help you better understand UCS networking.

In other posts I will walk through the southbound and maybe east-west frame flows. I am breaking this topic into multiple posts to make it easier for the reader to get through and understand.

I learned most of the information in this post from the Cisco Live 2010 session “Network Redundancy and Load Balancing Designs for UCS Blade Servers” presented by Sean McGee of Cisco. Some of the diagrams in this post were also taken from that same session.

Acronyms used in this post:

CNA – Converged Network Adapters on the mezzanine card

FEX – Fabric Extender 2104XP that lives in the chassis. There are usually two of these per chassis, one for Fabric A and one for Fabric B.

FI – Fabric Interconnect, the 6120s/6140s. Usually deployed as a pair in an HA configuration.

IOM – I/O Module, another name for the FEX.

First let's take a look at the physical components within the UCS system. Below is a logical and physical diagram of the components that make up UCS.

image 

 

Now let's take a look at northbound frame flows. The following diagram lists the decision points as the frame travels from the OS up through and out of UCS to the LAN.

image

 

First we will start with the northbound traffic leaving the Operating System.

  1. The OS creates the frame – this happens the same way in UCS as it does on any other hardware.
  2. The OS then has to decide which PCIe Ethernet interface to forward the frame out of. If there are multiple PCIe Ethernet interfaces, the OS NIC teaming software decides which interface to use based on a hashing algorithm or a round-robin mechanism. In the case of VMware and the VIC (Palo) this could be from 1 to 58 choices; realistically you will probably have fewer than 10.
  3. If there is just 1 PCIe interface and Fabric Failover is enabled then it is the UCS Menlo ASIC that makes the frame forwarding decision.

After the frame has entered the physical mezzanine CNA, another decision must be made as to which of the 2 physical CNA ports is used to send the frame to the FEX. This depends on how the vNIC was configured in UCSM. When configuring a vNIC as part of a Service Profile in UCSM you must choose which fabric to place the vNIC on (A or B) and whether or not Fabric Failover is enabled. The screen shot below shows these configuration options.

image
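
The same choice can also be made from the UCSM CLI. Here is a sketch using a hypothetical service profile named ESX-1; setting the fabric to a-b means Fabric A is primary with failover to B (treat the exact scope names as illustrative):

UCS-A# scope org /
UCS-A /org # scope service-profile ESX-1
UCS-A /org/service-profile # scope vnic eth0
UCS-A /org/service-profile/vnic # set fabric a-b
UCS-A /org/service-profile/vnic* # commit-buffer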

With this configuration Fabric A will be used unless it is down, in which case Fabric B will be used because Fabric Failover is enabled. For the purposes of this post we will assume the frame exits CNA port 1, which is physically pinned on the chassis mid-plane to FEX 1 (the Fabric Extender in the back of the chassis). FEX 1 is the left-hand FEX as you are looking at the back of the chassis; see the diagram below.

image

 

Each FEX has 8 10GE KR back-end mid-plane ports (traces) and 4 front-end 10GE SFP+ ports. The picture below shows the 8 back-end ports and the 4 front-end ports on a FEX. The 8 back-end ports are not physically visible as the picture suggests; it is simply a good logical representation of the ports.

image

The back-end FEX port that the frame goes across depends on which blade this particular OS is running on; in our example we will say blade 3. There is a fixed pinning of blade CNA ports to FEX back-end ports. For blade 3, CNA port 1, the back-end FEX port will be port 3, or Eth1/1/3.

Eth X/Y/Z where

  • X = chassis number
  • Y = mezzanine card number or CNA number
  • Z = FEX (IOM) port number

Below is a screen shot of the output of the command “show platform software redwood sts”. This command must be executed from the fex-1# context. To get there from the UCSM CLI prompt, first type “connect local-mgmt” and then “connect iom 1”, where the 1 is the chassis number.
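
Put together, the session looks like this:

ucs-A# connect local-mgmt
ucs-A(local-mgmt)# connect iom 1
fex-1# show platform software redwood sts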

image

 

Once the frame is in the FEX it has to be forwarded to the Fabric Interconnect across one of the 4 front-end 10GE SFP+ ports. The port the frame goes across depends on how many FEX to Fabric Interconnect cables are connected and which blade it originally came from. There is a blade to IOM uplink pinning that happens based on the number of uplinked SFP+ ports. This pinning is not user controllable.

Here is a breakdown and diagram showing how the pinning works with the three different uplink options:

  • 1 IOM-FI cable: every blade's CNA port goes out the one cable, 8:1 oversubscription
  • 2 IOM-FI cables: slots 1,3,5,7 are pinned to one uplink and slots 2,4,6,8 to the other, 4:1 oversubscription
  • 4 IOM-FI cables: slots 1,5 out uplink 1; slots 2,6 out uplink 2; slots 3,7 out uplink 3; slots 4,8 out uplink 4, 2:1 oversubscription

(The ratios follow from 8 half-width blades sharing the 10GE uplinks: 8 blades per cable, then 4, then 2.)

image

 

For our example there are 4 IOM-FI uplinks.

So to summarize the path the frame has traveled up to this point:

  1. The OS generated the frame.
  2. The OS chose the PCIe Ethernet interface using its NIC teaming software. In our case there was only one PCIe Ethernet vNIC assigned to the Service Profile on this blade.
  3. The frame was forwarded to CNA port 1 because we set Fabric A as the preferred fabric. The Fabric Failover ASIC chose CNA port 1 because Fabric A is up.
  4. CNA port 1 is pinned to FEX back-end port 3 (Eth1/1/3) because this is blade 3.
  5. Uplink IOM SFP+ port 3 was used to forward the frame to Fabric Interconnect A, because blade 3 is pinned to uplink 3 when there are 4 IOM-FI uplinks connected.

The next step is for Fabric Interconnect A to forward the frame to a northbound L2 LAN switch. Before this can happen the FI has to determine which northbound Ethernet port to send it out of. The decision process is as follows:

  1. Are there any LAN Pin Groups? If manual pinning is configured for this blade's vNIC to go out a specific uplink or Port Channel, then that port or Port Channel is used to forward the frame. If it is a Port Channel, the specific member interface used is chosen by the Port Channel hashing algorithm.
  2. Are Port Channels in use? If so, the specific member interface used is again chosen by the Port Channel hashing algorithm.

If there are no LAN Pin Groups or Port Channels then automatic pinning is used in a round-robin fashion. If there are 8 blades in the chassis and 4 northbound Ethernet uplinks then there will be 2 blades pinned to each uplink. This works in much the same way as the default VMware vSwitch teaming, if you are familiar with that.

So I think that about covers the northbound Ethernet frame flows. As I stated earlier, I will post the corresponding southbound and east-west flows in follow-on posts.

Cisco UCS Enhancements in Firmware 1.3

Cisco UCS firmware 1.3 was released last week. There are lots of new features in this latest update and Dave Alexander (ucs_dave) outlined most of those in his blog post – http://www.unifiedcomputingblog.com/?p=151

What I will be covering here are the features that Dave didn't cover and that weren't mentioned in the release notes for 1.3 – http://www.cisco.com/en/US/docs/unified_computing/ucs/release/notes/ucs_22863.html

The feature I was most excited about was the advanced BIOS settings control from the Service Profile.

I updated our UCS lab this morning, and on the first login to UCSM after I finished the update I was surprised to see that the Fabric Interconnects now list their current cluster role in their names. Before, you had to drill down into each one to see this.

image

The next change I noticed was that the BMC had been renamed to CIMC (Cisco Integrated Management Controller). I personally like the new name better because I think it makes more sense than BMC.

image

I was then looking around in the Admin tab and noticed a new section labeled Capability Catalog. The UCSM GUI Configuration Guide for 1.3 states this about the catalog:

“The capability catalog is a set of tunable parameters, strings, and rules. Cisco UCS Manager uses the catalog to update the display and configurability of components such as newly qualified DIMMs and disk drives for servers.”

The catalog also has an Update Catalog function that can be used to refresh it as Cisco publishes updates.

image

On the VM tab there is a new “Configure VMware Integration” wizard to assist in configuring VN-Link in hardware. (this requires the VIC adapter in the server blades and VEM installed on the ESX hosts)

image

Here are a few screen shots of the new BIOS settings that can be configured. These new settings are implemented via a BIOS Policy that is then tied to a Service Profile or Service Profile Template.

image

image

image

I am sure there may be a few more hidden gems in here but these are the ones I have noticed so far.

Cisco UCS Server Pools: Configuration

In my previous post I discussed Cisco UCS Server Pools and use cases. In this post I will be walking through the configuration of Server Pools, Server Pool Policies and Server Pool Policy Qualifications.

As stated in my previous post, one of the use cases for Server Pools is the server farm model, where you have varying amounts of RAM and CPUs in your blades and you want a way to efficiently deploy a farm of web servers, ESX hosts and database servers without having to specifically choose a server blade.

In this configuration example I will walk through the setup of an auto-populating Server Pool using Server Pool Policies and Server Pool Policy Qualifications.

The first step is to create a Server Pool. To do this go to the Servers tab, Pools.

image 

 

Right-click on Server Pools to create a new pool

image

On the add server page temporarily add a blade and then remove it. If you don’t do this the Finish button stays grayed out. This is one of the UCSM nuances.

image

The next step is to create a Server Pool Policy Qualification

image

I am going to create a web server qualification that uses a Memory Qualification for 8 GB RAM

image

Now to tie it all together with a Server Pool Policy

image

To use this new auto-populating Server Pool, associate a Service Profile Template with it so that when you deploy new Service Profiles from the template you don't have to select a blade; one will be automatically selected if it is available in the associated pool.

image

 

One important note is that if your blades have already been discovered you will need to re-acknowledge them before they will be added to a server pool.

Another important note is that if you re-acknowledge a blade it will reboot it without asking so you probably only want to re-acknowledge blades that are not associated with a Service Profile.

Cisco UCS Server Pools: Use Cases

This is a two-part blog post on Cisco UCS Server Pools. This first post will focus on what Server Pools are and their use cases, and the next post will focus on configuration.

Cisco UCS Server pools are one of the more mysterious features of UCS. I say mysterious because in my opinion the usage and configuration of them is not very intuitive. Let’s start with Cisco’s definition of a Server Pool, this definition was copied from the Cisco UCS Manager GUI Configuration Guide, Release 1.2(1) found here – http://www.cisco.com/en/US/products/ps10281/products_installation_and_configuration_guides_list.html

“A server pool contains a set of servers. These servers typically share the same characteristics. Those characteristics can be their location in the chassis, or an attribute such as server type, amount of memory, local storage, type of CPU, or local drive configuration.”

You are probably thinking that doesn't sound very mysterious, and you are right; it doesn't, until you start digging into the use cases and configuration of Server Pools.

Here are some characteristics of a Server Pool:

  • Can be manually populated or auto-populated
    • Manual is where you select the blades that you want to be part of a given pool.
    • Auto-populated works by creating Server Pool Qualifications and Server Pool Policies where you define the specific hardware characteristics and based on that the blades are placed into one pool or another.
  • A blade can be in multiple pools at the same time – For example, let's say you have a multi-tenancy UCS environment and multiple Server Pools contain the same blades. The blades are not actually tied to a specific organization in UCS, so when a user deploys a new service profile and uses a Server Pool for the assignment, they will grab the next available server in their pool. That blade could also have been in any other Server Pool in any other UCS organization whose qualification it met; it works on a first come, first served model. Once a blade is associated with a Service Profile it is no longer available to any other pool.
  • A service profile template or service profile can be associated with a Server Pool for rapid blade deployments
  • A blade can be located in any chassis that is managed by the UCS cluster

Server Pool Use Cases

Server Farm Model – In this model you have an environment made up of varying applications that have varying hardware requirements.

For example:

  1. Web server farm that needs at least 8 GB RAM and 1 quad core CPU.
  2. Database server farm that needs at least 32 GB RAM and 2 quad core CPUs.
  3. An ESX server farm that needs at least 48 GB RAM, 2 quad core CPUs and a Cisco UCS VIC M81KR adapter.

Rapid Deployments – This would be a new UCS implementation where you want to deploy new blades as quickly and efficiently as possible without having to select specific blades to associate Service Profiles with.

Cisco UCS Palo and EMC PowerPath VE Incompatibility

******UPDATE****** There is a new VMware driver out that corrects this incompatibility. You can download it here – http://downloads.vmware.com/d/details/esx40_cisco_cna_v110110a/ZHcqYmRwKmViZHdlZQ

I came across a very unexpected incompatibility this week between the Cisco UCS VIC M81KR (Palo) and PowerPath VE.

I was implementing Cisco Nexus 1000v and EMC PowerPath VE on Cisco UCS blades with the new Cisco UCS VIC M81KR Virtual Interface Card (Palo). We did the Nexus 1000v implementation first and that went flawlessly. Being able to present 4 10G vNICs to the UCS blade with Palo makes for a very easy and trouble-free Nexus 1000v install because you don't have to put the ESX Service Console in the Nexus 1000v vNetwork Distributed Switch.

After the Nexus 1000v was complete we moved on to PowerPath VE. This environment was already using PowerPath VE on its other UCS blades, but those have the Menlo mezzanine cards with the QLogic HBA chip set. We were expecting this piece of the project to be the easiest because with PowerPath VE you just install it on each ESX host and license it; there is zero configuration with PowerPath VE on ESX.

So we downloaded the latest PowerPath VE build from Powerlink (5.4 SP1). We then configured an internal vCenter Update Manager patch repository so that we could deploy PowerPath VE with VUM. After we deployed PowerPath VE to the first host we noticed in the vSphere Client that the LUNs were still owned by NMP. At first I thought maybe it was because it wasn't licensed yet, but then I remembered from the other PowerPath VE installs I have done that PowerPath should already have claimed the SAN LUNs.

I SSHed into the host and looked at the vmkwarning log file and sure enough there were lots of these warnings and errors.

WARNING: ScsiClaimrule: 709: Path vmhba2:C0:T1:L20 is claimed by plugin NMP, but current claimrule number 250 indicates that it should be claimed by plugin PowerPath.

vmkernel: 0:00:00:50.369 cpu8:4242)ALERT: PowerPath: EmcpEsxLogEvent: Error:emcp:MpxEsxPathClaim: MpxRecognize failed

It took us a few minutes, but then we realized it was probably an incompatibility between Palo and PowerPath VE. We opened both a Cisco TAC case and an EMC support case on the issue, and sure enough there is an incompatibility between the current ESX Palo driver and PowerPath VE. Cisco TAC provided a beta updated fnic ESX driver for us to test but said that it wasn't production ready.

We tested the new driver and it fixed the issue: PowerPath VE was then able to claim the SAN LUNs. Since the driver is beta and not fully tested by VMware, we are going to hold off using it. Cisco didn't give us a date for when the driver would be released. I imagine that once VMware gives it their blessing they will post it to the vCenter Update Manager repository and it can be installed from there. Cisco may even have it out sooner as a single driver download from their UCS downloads page.

Since both the UCS Palo and PowerPath VE are part of the vBlock, I am very surprised this wasn't already tested by Cisco, VMware and EMC. Oh well, I know Cisco will have this fixed soon, so it isn't that big of a deal.

How to Enable CDP On Cisco UCS vNICs

If you are familiar with managing VMware ESX 3.5/4.x in an environment that includes Cisco LAN switches then you have probably used “CDP listen state” that is enabled by default on an ESX install. To view this information in vCenter select an ESX host, go to the configuration tab in the right pane, select the Networking link and then click on the little blue call out box next to a vmnic that is uplinked to a vSwitch. A pop-up window opens displaying the CDP information. The information can be invaluable when troubleshooting networking issues. You can determine which switch/switch port the NIC is plugged into, native VLAN and other useful information. This is also a great way to verify that your vSwitch uplinks are going to 2 different physical switches (if you have that option).

 

image

As I stated earlier, the default CDP configuration on an ESX vSwitch is the listen-only state. I have found that network engineers find it very useful if you configure CDP to advertise as well. When you enable this on a vSwitch, the network engineer can issue the “show cdp neighbors” command from the IOS command line and see which switch ports each ESX vmnic is plugged into. This can also be very useful when you and the network engineer are troubleshooting network issues with ESX.

image

To configure CDP to advertise as well, run this command from the ESX console or from an SSH session:

esxcfg-vswitch -B both vSwitch0

To check the current state of the CDP configuration, run this command:

esxcfg-vswitch -b vSwitch0

Note – you must enable CDP on all vSwitches if you want to see every vmnic from the switch side.
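
If you have several vSwitches, a quick loop from the console covers them all. A sketch; adjust the vSwitch list to match your environment:

for vs in vSwitch0 vSwitch1 vSwitch2; do esxcfg-vswitch -B both $vs; done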

If you are using a VMware vNetwork Distributed Switch then you can configure the CDP state from the vCenter GUI. To do this, go to Edit Settings on the dvSwitch and then go to Advanced.

image

Ok, now to the point of configuring all of this on Cisco UCS blades.

By default the vNICs in Cisco UCS have CDP listen and advertise turned off. You can see this from an ESX host that is running on a UCS blade by clicking on the little blue call-out box. When the pop-up opens it states that Cisco Discovery Protocol is not available.

image

To enable CDP, the first thing you must do is create a new Network Control Policy. To do this go to the LAN tab in UCSM, expand Policies, and right-click Network Control Policies to create a new policy. Name it something like “Enable-CDP” and select the option to enable CDP.

 

image

The next step is to apply the new policy to the ESX vNICs. If you are using updating vNIC templates, then all you need to do is go to each vNIC template for your ESX vNICs and select the new policy from the Network Control Policy drop-down. If you are not using vNIC templates but you are using an updating Service Profile Template, then you can enable it there. If you are using one-off Service Profiles or a non-updating Service Profile Template, then you must go to every Service Profile and enable the new policy on every vNIC.

 

image

Now when you click the call-out box you should see the CDP information coming from the Fabric Interconnect that you are plugged into.

image

Cisco UCS VN-Link Hardware Implementation Walk Through

Here is another video blog on Cisco UCS. In this video I walk through the implementation of VMware VN-Link in hardware on the Cisco UCS VIC (Palo) adapter.

http://www.screencast.com/t/YWU3ZGVmYzAt

Command line output of the ESX configuration

esxupdate query
----Bulletin ID---- -----Installed----- ------------Summary-------------
ESX400-Update01     2010-03-25T22:03:33 VMware ESX 4.0 Complete Update 1
VEM400-200912272-BG 2010-03-25T22:35:03 Cisco Nexus 1000V  4.0(4)SV1(2)

vem version
Package vssnet-esx4.1.0-00000-release
Version 4.0.4.1.2.0.80-1.9.179
Build 179
Date Wed Dec 9 08:13:45 PST 2009

vem status

VEM modules are loaded

Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0       32          4           32                1500    vmnic0,vmnic1
DVS Name       Num Ports   Used Ports  Configured Ports  Uplinks
UCS-dvSwitch0  256         230         256               vmnic3,vmnic2

VEM Agent (vemdpa) is running

esxcfg-vswitch -l
Switch Name    Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0       32          4           32                1500    vmnic0,vmnic1

PortGroup Name      VLAN ID  Used Ports  Uplinks
Service Console     0        1           vmnic0,vmnic1

DVS Name       Num Ports   Used Ports  Configured Ports  Uplinks
UCS-dvSwitch0  256         230         256               vmnic3,vmnic2

DVPort ID           In Use      Client
1628                1           vmnic2
1629                1           vmnic3
1630                0
1631                0

esxcfg-nics -l
Name    PCI      Driver      Link Speed     Duplex MAC Address       MTU    Description
vmnic0  08:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:a9 1500   Cisco Systems Inc 10G Ethernet NIC
vmnic1  09:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:ba 1500   Cisco Systems Inc 10G Ethernet NIC
vmnic2  0a:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:aa 1500   Cisco Systems Inc 10G Ethernet NIC
vmnic3  0b:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:bb 1500   Cisco Systems Inc 10G Ethernet NIC

lspci | egrep -i "cisco.*pass"
0c:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
0d:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
0e:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
0f:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
10:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
11:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
12:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
13:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
14:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
15:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
16:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
17:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
18:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
19:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1a:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1b:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1c:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1d:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1e:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
1f:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
20:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
21:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
22:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)
23:00.0 Ethernet controller: Cisco Systems Inc 10G Ethernet pass-thru NIC (rev a2)

Cisco UCS vNIC Failover

When you create a vNIC in UCSM with Cisco UCS you have the option to pin that vNIC to Fabric A or Fabric B and an option to Enable Failover. Most of the servers we deploy on UCS blades are ESX 4 and with ESX we always create two vNICs, one on Fabric A and one on Fabric B and then let ESX handle the NIC teaming and failover. With the new Palo adapter we create 4 vNICs (eth0-eth3), assign two for SC/vMotion in a Standard vSwitch and two for VM networking in a Distributed vSwitch or Nexus 1000v.

image

I was curious about how UCS-level vNIC failover worked, so I built a Windows 2008 R2 blade. In the Service Profile I presented only one vNIC to it and checked the Enable Failover option. I was wondering if Windows would see two NICs or one, because someone else had told me that Windows will see two NICs but will automatically fail over between them if the uplink on Fabric A goes down.

After I loaded the Cisco enic Palo drivers in Windows, it saw only one NIC. Here is a screen shot of Device Manager and the Network Connections window.

image

To test the failover I first SSHed into our Cisco 3750E switch to see which 10GbE uplink the MAC address was coming across; it was using Te1/0/1.

Cisco3750#show mac address-table | include 0025.b50c.95ab
1    0025.b50c.95ab    DYNAMIC     Te1/0/1

I then started an extended ping to the server and shut down switch port Te1/0/1. Failover happened almost immediately, with only one ping missed. I then checked where the MAC address was, and it was on the other uplink, for Fabric B.

Cisco3750#show mac address-table | include 0025.b50c.95ab
1    0025.b50c.95ab    DYNAMIC     Te1/0/2
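
For reference, failing and restoring the path was just a standard IOS shut/no shut on the uplink (a sketch):

Cisco3750# configure terminal
Cisco3750(config)# interface Te1/0/1
Cisco3750(config-if)# shutdown
! run the extended ping test, then restore the port
Cisco3750(config-if)# no shutdown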

Wow!! That was cool: NIC failover without having to mess with NIC teaming drivers in Windows. This is a much cleaner solution than using the old Intel PROSet, HP or Broadcom tools to create a third virtual NIC, and hardware-based failover has to be more efficient as well.

I then wanted to see what would happen if I brought switch port Te1/0/1 back online. To my surprise it switched back over to Fabric A almost immediately after I re-enabled the switch port, and this time 0 pings were missed.

I checked the Windows event log and didn't see any event where Windows detected a loss of network connectivity. It is really cool that Windows never knew anything about the uplink failure.

Very cool stuff.

Presenting 4 vNICs to VMware ESXi 4 with the Cisco UCS VIC (Palo) Adapter

We got our own Cisco UCS lab at the office last week, complete with two 6120s, two server blades and the new Cisco VIC (Palo) adapter.

Here is a diagram I copied from the Chassis 1 -> Hybrid Display tab in UCS Manager.

image

What is not in the diagram are two Cisco MDS 9124s and a Cisco 3750E with two Ten Gigabit uplinks.

After we got our first ESXi 4.1 blade up and booting from a CX4-120 LUN, I was itching to present more than two 10GbE adapters to the ESXi host.

When I initially looked at how to add more than two vNICs to a Service Profile I couldn't figure out how to do it. I was thinking there was some new configuration screen somewhere that you had to go to and enable the additional vNICs. I was also unable to find any good documentation on how to do it, so I posted a question to the Unified Computing Cisco Support Community – https://supportforums.cisco.com/community/netpro/data-center/unified-computing?view=discussions&start=0

If you haven’t checked out this community and you are interested in Cisco UCS you should definitely browse through it. There are some good tips in there.

This is a very active community and the two times I posted a question it was answered within 12 hours.

I posted a question on how to configure the new VIC (Palo) adapter, and to my surprise it is a lot easier to configure than I initially thought.

All I had to do was add two more vNICs to my Service Profile Template. I don't know why I didn't just try that first.

I went into my updating Service Profile Template and added eth2 on Fabric A and eth3 on Fabric B.

Now be careful: when you add vNICs or modify the storage of a Service Profile or an updating Service Profile Template, it will power cycle the blades that are bound to it. I don't know if there is a way to change this behavior, but I think it is dangerous.

After my ESXi 4 server rebooted I first checked the vmnic list using the esxcfg-nics -l command. Here is the output:

~ # esxcfg-nics -l
Name    PCI           Driver      Link Speed     Duplex MAC Address       MTU    Description
vmnic0  0000:08:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:af 1500   Cisco Systems Inc VIC Ethernet NIC
vmnic1  0000:09:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:bf 1500   Cisco Systems Inc VIC Ethernet NIC
vmnic2  0000:0a:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:ae 1500   Cisco Systems Inc VIC Ethernet NIC
vmnic3  0000:0b:00.00 enic        Up   10000Mbps Full   00:25:b5:0c:95:be 1500   Cisco Systems Inc VIC Ethernet NIC

AWESOME!!!

Next I logged into my vSphere Client, checked the Network Adapters and added a new vSwitch for VM traffic.

image

image

Now I can keep my Management Network (Service Console) on a standard vSwitch and put my VM networks in a Distributed vSwitch or in the Nexus 1000v without worrying about losing access because of a configuration error.