NetApp Tech OnTap

End-to-End Quality of Service:

Cisco, VMware, and NetApp Team to Enhance Multi-Tenant Environments

Building shared infrastructure has always been something of a challenge. If you look at a typical corporate data center design you find that important applications either have their own dedicated infrastructure or that shared elements have been overengineered to far exceed requirements. Either approach underutilizes resources and wastes your IT budget.

The problem is that no one really knows how infrastructure components such as servers, networks, and storage will behave as additional load is added. Will a resource become a bottleneck, decreasing the performance of an important application unexpectedly? If so, how can you quickly identify the source of such bottlenecks?

The current interest in cloud computing has made understanding all aspects of multi-tenant environments—infrastructures in which all resources are shared—even more critical. In fact, many companies hesitate to build cloud infrastructure or contract for cloud services because of fears about security and quality of service (QoS).

Cisco has teamed with VMware and NetApp to design and test a secure, multi-tenant cloud architecture that can deliver on what we see as four pillars of secure multi-tenancy:

  • Secure separation. One tenant must not be able to access another tenant's virtual machine (VM), network, or storage resources under any circumstance. Each tenant must be securely separated.
  • Service assurance. Compute, network, and storage performance must be isolated and guaranteed during normal operations as well as when failures have occurred or certain tenants are generating abnormal loads.
  • Availability. The infrastructure must ensure that required compute, network, and storage resources remain available in the face of possible failures.
  • Management. The ability to rapidly provision, manage, and monitor all resources is essential.

In this article I describe the unique architecture the three companies have designed to address these pillars of multi-tenancy. I go on to discuss our efforts around the second pillar—service assurance—in more detail.

A recently released design guide provides full details of a Cisco validated design that uses technology from all three companies to address all four pillars described above. A companion article in this issue of Tech OnTap describes one element of the architecture, NetApp® MultiStore®, in more detail.

Architecture Overview

A block-level overview of the architecture is shown in Figure 1. At all layers, key software and hardware components are designed to provide security, quality of service, availability, and ease of management.

Figure 1) End-to-end block diagram.

Compute Layer
At the compute layer, VMware® vSphere™ and vCenter™ Server software provide a robust server virtualization environment that allows server resources to be dynamically allocated to multiple guest operating systems running within virtual machines.

VMware vShield Zones provides security within the compute layer. This is a centrally managed, stateful, distributed virtual firewall bundled with vSphere 4.0 that takes advantage of ESX host proximity and virtual network visibility to create security zones. vShield Zones integrates into VMware vCenter and leverages virtual inventory information, such as vNICs, port groups, clusters, and VLANs, to simplify firewall rule management and trust zone provisioning. This new way of creating security policies follows VMs with VMotion™ and is completely transparent to IP address changes and network renumbering.

The Cisco Unified Computing System™ (UCS) is a next-generation data center platform that unites compute, server network access, storage access, and virtualization into a cohesive system. UCS integrates a low-latency, lossless 10-Gigabit Ethernet network fabric with enterprise-class, x86-architecture servers. The system is an integrated, scalable, multichassis platform in which all resources participate in a unified management domain.

Network Layer
The network layer provides secure network connectivity between the compute layer and the storage layer as well as connections to external networks and clients. Key components include:

  • Cisco Nexus 7000, which provides Ethernet (LAN) connectivity to external networks
  • Cisco Nexus 5000, which interfaces with both FC storage and the Cisco Nexus 7000
  • Cisco Nexus 1000V, a software switch that runs within the VMware kernel to deliver Cisco VN-Link services for tight integration between the server and network environment, allowing policies to move with a virtual machine during live migration
  • Cisco MDS 9124, a Fibre Channel switch that provides SAN connectivity to allow SAN boot for VMware ESX running on UCS

Storage Layer
The storage layer consists of NetApp unified storage systems capable of simultaneously providing SAN connectivity (for SAN boot) and NFS connectivity for the running VMware environment. NetApp storage can also meet the specialized storage needs of any running application. Running the VMware environment over Ethernet provides a greatly simplified management environment that reduces costs.

NetApp MultiStore software provides a level of security and isolation for shared storage comparable to physically isolated storage arrays. MultiStore lets you create multiple completely isolated logical partitions on a single storage system, so you can share storage without compromising privacy. Individual storage containers can be migrated independently and transparently between storage systems.

Tenant Provisioning
When a tenant is provisioned using this architecture, the resulting environment is equipped with:

  • One or more virtual machines or vApps
  • One or more virtual storage controllers (vFiler units)
  • One or more VLANs to interconnect and access these resources

Together, these entities form a logical partition. The tenant cannot violate the boundaries of this partition. In addition to security we also want to be sure that activities happening in one tenant partition do not interfere indirectly with activities in another tenant partition.
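The idea of a tenant partition can be sketched as a small data structure. This is only an illustration, assuming a hypothetical `TenantPartition` record and a hypothetical `find_overlaps` check; neither name comes from the Cisco, VMware, or NetApp products, and the resource names are invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TenantPartition:
    """Hypothetical record of the resources provisioned for one tenant."""
    name: str
    vms: set[str] = field(default_factory=set)      # virtual machines or vApps
    vfilers: set[str] = field(default_factory=set)  # virtual storage controllers
    vlans: set[int] = field(default_factory=set)    # VLAN IDs


def find_overlaps(partitions: list[TenantPartition]) -> list[tuple[str, str, str]]:
    """Return (tenant_a, tenant_b, resource) for each resource claimed by two tenants."""
    overlaps = []
    for i, a in enumerate(partitions):
        for b in partitions[i + 1:]:
            shared = (
                {f"vm:{x}" for x in a.vms & b.vms}
                | {f"vfiler:{x}" for x in a.vfilers & b.vfilers}
                | {f"vlan:{x}" for x in a.vlans & b.vlans}
            )
            overlaps.extend((a.name, b.name, r) for r in sorted(shared))
    return overlaps


tenant_a = TenantPartition("tenant-a", {"vm-a1", "vm-a2"}, {"vfiler-a"}, {101, 102})
tenant_b = TenantPartition("tenant-b", {"vm-b1"}, {"vfiler-b"}, {201})
# Deliberate mistake for the example: tenant-c reuses VLAN 201.
tenant_c = TenantPartition("tenant-c", {"vm-c1"}, {"vfiler-c"}, {201})

clean = find_overlaps([tenant_a, tenant_b])
conflicts = find_overlaps([tenant_a, tenant_b, tenant_c])
```

A provisioning workflow could run a check like this before committing a new tenant, rejecting any request whose VMs, vFiler units, or VLANs already belong to another partition.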

End-to-End QoS

Very few projects tackle end-to-end quality of service. In most cases, a QoS mechanism is enabled in one layer in the hope that downstream or upstream layers will also be throttled as a result. Unfortunately, different applications have different characteristics—some may be compute intensive, some network intensive, and others I/O intensive. Simply limiting I/O does little or nothing to control the CPU utilization of a CPU-intensive application. It’s impossible to fully guarantee QoS without appropriate mechanisms at all three layers. Our team set out to design such a system.

Companies such as Amazon, Google, and others have built multi-tenant or “cloud” offerings using proprietary software that took years and hundreds of developers to create in house. Our approach was to use commercially available technology from Cisco, NetApp, and VMware to achieve similar results.

One design principle we applied in all layers is that when resources are not being utilized, high-value applications should be allowed to utilize those available resources if desired. This can allow an application to respond to an unforeseen circumstance. However, when contention occurs, all tenants must be guaranteed the level of service they have contracted for.
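This principle can be sketched as a two-step allocation. The function below is a simplified model, not the mechanism used by any of the products; the `allocate` name, the tenant fields, and the numbers are assumptions, and it assumes the guarantees add up to no more than the total capacity.

```python
def allocate(capacity, tenants):
    """Grant each tenant its guarantee first, then lend out whatever is left.

    tenants: list of dicts with keys name, guarantee, demand, priority
    (a lower priority number means a more important tenant).
    Assumes sum(guarantee) <= capacity.
    """
    # Step 1: every tenant receives its demand, up to the contracted guarantee.
    grants = {t["name"]: min(t["demand"], t["guarantee"]) for t in tenants}
    spare = capacity - sum(grants.values())

    # Step 2: unused capacity is lent to tenants that want more,
    # starting with the most important tenant.
    for t in sorted(tenants, key=lambda t: t["priority"]):
        extra = min(spare, t["demand"] - grants[t["name"]])
        grants[t["name"]] += extra
        spare -= extra
    return grants


# Quiet period: gold wants more than its guarantee and the others are idle.
idle = allocate(100, [
    {"name": "gold", "guarantee": 50, "demand": 80, "priority": 0},
    {"name": "silver", "guarantee": 30, "demand": 10, "priority": 1},
    {"name": "bronze", "guarantee": 20, "demand": 5, "priority": 2},
])

# Contention: every tenant wants more than its guarantee.
busy = allocate(100, [
    {"name": "gold", "guarantee": 50, "demand": 80, "priority": 0},
    {"name": "silver", "guarantee": 30, "demand": 60, "priority": 1},
    {"name": "bronze", "guarantee": 20, "demand": 40, "priority": 2},
])
```

In the quiet case gold bursts to 80 units because silver and bronze leave capacity unused; under contention every tenant falls back to exactly its contracted level.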

Another design principle is to set the class of service as close to the application as possible, map that value into a policy definition, and make sure that the policy is applied uniformly across all layers in accordance with the unique qualities of each layer.

We used three mechanisms in each layer to help deliver QoS:

Table 1) QoS mechanisms.

Compute:
  • Expandable Reservations
  • Dynamic Resource Scheduler
  • UCS QoS System Classes for Resource Reservation and Limit

Network:
  • QoS: Queuing
  • QoS: Bandwidth Control
  • QoS: Rate Limiting

Storage:
  • FlexShare®
  • Storage Reservations
  • Thin Provisioning


Compute Layer
At the server-virtualization level, VMware vSphere provides many capabilities to ensure fair use, especially of CPU and memory resources. A vSphere resource pool is a logical abstraction for flexible management of resources. Resource pools can be nested into hierarchies that partition the available CPU and memory. By correctly configuring resource pool attributes for reservations, limits, shares, and expandable reservations, you can achieve very fine-grained control and grant priority to one tenant over another in situations in which resources are in contention.
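To make the effect of an expandable reservation concrete, here is a toy model of nested resource pools. It is a simplification written for this article, not the vSphere API: the `ResourcePool` class, its method names, and the MHz figures are all assumptions, and it models only CPU reservations, not limits or shares.

```python
class ResourcePool:
    """Toy model: a pool hands out reservations from its own reserved capacity
    and, if marked expandable, may borrow the shortfall from its parent."""

    def __init__(self, name, reservation_mhz, expandable=False, parent=None):
        self.name = name
        self.reservation_mhz = reservation_mhz
        self.expandable = expandable
        self.parent = parent
        self.used_mhz = 0  # capacity already handed to children
        # A child pool's own reservation is carved out of its parent.
        if parent is not None and not parent.reserve(reservation_mhz):
            raise ValueError(f"parent cannot back the reservation for {name}")

    def unreserved(self):
        return self.reservation_mhz - self.used_mhz

    def can_reserve(self, mhz):
        shortfall = mhz - self.unreserved()
        if shortfall <= 0:
            return True
        return (self.expandable and self.parent is not None
                and self.parent.can_reserve(shortfall))

    def reserve(self, mhz):
        if not self.can_reserve(mhz):
            return False
        local = min(mhz, self.unreserved())
        self.used_mhz += local
        if mhz > local:
            self.parent.reserve(mhz - local)  # borrow the rest from the parent
        return True


cluster = ResourcePool("cluster", 10_000)
gold = ResourcePool("gold", 4_000, expandable=True, parent=cluster)
bronze = ResourcePool("bronze", 2_000, expandable=False, parent=cluster)

gold_results = [gold.reserve(3_000), gold.reserve(2_000)]        # second call borrows 1,000 MHz
bronze_results = [bronze.reserve(1_500), bronze.reserve(1_000)]  # second call is refused
```

In this model the gold tenant can power on a VM that exceeds its own pool because its reservation is expandable, while the bronze tenant is held to the capacity it was given.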

VMware Distributed Resource Scheduler (DRS) allows you to create clusters containing multiple VMware servers. It continuously monitors utilization across resource pools and intelligently allocates available resources among virtual machines. DRS can be fully automated at the cluster level so infrastructure and tenant virtual machine loads are evenly load balanced across all of the ESX servers in a cluster.

At the hardware level, Cisco UCS uses Data Center Ethernet (DCE) to handle all traffic inside a Cisco UCS system. This industry-standard enhancement to Ethernet divides the bandwidth of the Ethernet pipe into eight virtual lanes. System classes determine how the DCE bandwidth in these virtual lanes is allocated across the entire Cisco UCS system. Each system class reserves a specific segment of the bandwidth for a specific type of traffic. This provides a level of traffic management, even in an oversubscribed system.
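The arithmetic behind per-class bandwidth reservation can be sketched as follows. The class names and weights are hypothetical values chosen for the example, not settings from the validated design, and `guaranteed_bandwidth` is an illustrative helper rather than a UCS command.

```python
LINK_GBPS = 10.0  # 10-Gigabit Ethernet fabric
MAX_LANES = 8     # the link is divided into eight virtual lanes


def guaranteed_bandwidth(weights, link_gbps=LINK_GBPS):
    """Convert relative class weights into a guaranteed share of the link."""
    if len(weights) > MAX_LANES:
        raise ValueError("more traffic classes than virtual lanes")
    total = sum(weights.values())
    return {name: link_gbps * weight / total for name, weight in weights.items()}


# Hypothetical weights; a real deployment would choose its own.
shares = guaranteed_bandwidth({
    "fc-storage": 5,
    "nfs-storage": 5,
    "tenant-gold": 4,
    "tenant-silver": 3,
    "vmotion": 2,
    "management": 1,
})
```

With these weights the gold tenant class is guaranteed 2 Gbps of the 10 Gbps link when the link is congested; the guarantee only matters under contention, since idle bandwidth can still be used by other classes.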

Network Layer
At the network level, traffic is segmented according to the Class of Service (CoS) already assigned by the Nexus 1000V and honored or policed by the UCS system. There are two distinct methods to provide steady-state performance protection:

  • Queuing allows networking devices to schedule packet delivery based on classification criteria. Because the devices can tell which packets should be delivered preferentially, important applications see better response times when oversubscription occurs. Queuing occurs only when the assigned bandwidth is fully utilized by all service classes.
  • Bandwidth control gives network devices an appropriate number of buffers per queue so that certain classes of traffic do not overutilize bandwidth. This gives the other queues a fair chance to serve the remaining classes. Bandwidth control goes hand in hand with queuing: queuing determines which packets are delivered first, while bandwidth control determines how much data can be sent per queue.
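The interplay of the two methods can be illustrated with a small weighted round-robin scheduler. This is a generic textbook technique used here as a stand-in; it is not the algorithm implemented in the Nexus switches, and the class names, weights, and queue depths are assumptions.

```python
from collections import deque


class ClassQueue:
    """One queue per traffic class, with a bounded buffer (bandwidth control)."""

    def __init__(self, name, weight, max_depth):
        self.name = name
        self.weight = weight        # packets served per scheduling round
        self.max_depth = max_depth  # buffers available to this class
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet):
        if len(self.packets) >= self.max_depth:
            self.dropped += 1       # buffer full: the excess is dropped
            return False
        self.packets.append(packet)
        return True


def schedule(queues, slots):
    """Weighted round robin (queuing): return the order packets leave the port."""
    sent = []
    while len(sent) < slots and any(q.packets for q in queues):
        for q in queues:
            for _ in range(q.weight):
                if q.packets and len(sent) < slots:
                    sent.append(q.packets.popleft())
    return sent


gold = ClassQueue("gold", weight=3, max_depth=10)
bronze = ClassQueue("bronze", weight=1, max_depth=4)
for i in range(6):
    gold.enqueue(f"g{i}")
    bronze.enqueue(f"b{i}")  # the last two bronze packets exceed the buffer

order = schedule([gold, bronze], slots=8)
```

Gold packets leave three at a time for every bronze packet, and bronze loses the two packets that did not fit in its smaller buffer, so a burst of low-priority traffic cannot crowd out the higher class.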

A set of policy controls can be enabled so that any unpredictable change in traffic pattern is treated either softly, by allowing applications to burst above the service commitment for some time, or with a hard policy that drops the excess or caps the rate of transmission. This capability can also be used to define a service level such that noncritical services are kept at a certain traffic level, or the lowest service-level traffic is capped so that it cannot impact higher-end tenant services.

Policing and rate limiting are used to define such protection levels. These tools are applied as close to the edge of the network as possible to stop excess traffic from entering the network. In this design, the Nexus 1000V is used for the policing and rate-limiting function for three types of traffic:

  • VMotion. VMware traditionally recommends a dedicated Gigabit interface for VMotion traffic. In our design, VMotion traffic is carried on a dedicated, nonroutable VMkernel port. The traffic for VMotion from each blade server is limited to 1Gbps to reflect the traditional environment. This limit can be raised or lowered based on requirements, but should not be configured such that the resulting traffic rate impacts more critical traffic.
  • Differentiated transactional and storage services. In a multi-tenant design, various methods are employed to generate differentiated services. For example, a "priority" queue is used for the most critical services and "no-drop" is used for traffic that cannot be dropped but can sustain some delay. Rate limiting is used for fixed-rate services, in which each application class or service is capped at a certain level.
  • Management. The management VLAN is enabled with rate limiting to cap the traffic at 1Gbps.
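A rate limit such as the 1Gbps cap on VMotion or management traffic is commonly modeled as a token bucket. The sketch below shows that general technique only; it is not the Nexus 1000V implementation, and the `TokenBucket` class, the burst size, and the frame size are assumptions for the example.

```python
class TokenBucket:
    """Hard policer: a frame passes only if enough tokens (bits) are available."""

    def __init__(self, rate_bps, burst_bits):
        self.rate_bps = rate_bps      # long-term rate cap
        self.burst_bits = burst_bits  # how far traffic may burst above the rate
        self.tokens = burst_bits
        self.last_time = 0.0

    def allow(self, now, frame_bits):
        # Refill tokens for the elapsed time, never beyond the burst size.
        elapsed = now - self.last_time
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last_time = now
        if frame_bits <= self.tokens:
            self.tokens -= frame_bits
            return True
        return False  # excess traffic is dropped at the edge


FRAME_BITS = 12_000  # one 1,500-byte frame
vmotion_limit = TokenBucket(rate_bps=1_000_000_000, burst_bits=2 * FRAME_BITS)

decisions = [
    vmotion_limit.allow(0.0, FRAME_BITS),    # within the burst allowance
    vmotion_limit.allow(0.0, FRAME_BITS),    # uses the rest of the burst
    vmotion_limit.allow(0.0, FRAME_BITS),    # bucket empty: dropped
    vmotion_limit.allow(0.001, FRAME_BITS),  # refilled after 1 ms
]
```

A larger burst size gives the softer treatment described earlier, letting an application exceed its commitment briefly, while a small burst size behaves like a strict cap.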

Storage Layer
As described above, NetApp MultiStore software provides secure isolation for multi-tenant environments. (MultiStore is described in more detail in a companion article in this issue.)

In the storage layer, delivering QoS is a function of controlling storage system cache and CPU utilization as well as ensuring that workloads are spread across an adequate number of spindles. NetApp developed FlexShare to control workload prioritization. FlexShare allows you to tune three independent parameters for each storage volume or each vFiler unit in a MultiStore configuration so you can prioritize one tenant partition over another. (FlexShare is described in more detail in a previous Tech OnTap article.) Both MultiStore and FlexShare have been available for the NetApp Data ONTAP® operating environment for many years.

NetApp thin provisioning provides tenants with a level of "storage on demand." Raw capacity is treated as a shared resource and is only consumed as needed. When deploying thin-provisioned resources in a multi-tenant configuration, you should set policies for volume autogrow, Snapshot™ autodelete, and fractional reserve. Volume autogrow allows a volume to grow in defined increments up to a predefined threshold. Snapshot autodelete is an automated method for deleting the oldest Snapshot copies when a volume is nearly full. Fractional reserve allows the percentage of space reservation to be modified based on the importance of the associated data.

When using these features concurrently, important tenants can be given priority to grow a volume as needed with space reserved from the shared pool. Conversely, lower-level tenants require additional administrator intervention to accommodate requests for additional storage.
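The way autogrow and Snapshot autodelete interact can be sketched with a small simulation. This is a simplified model rather than Data ONTAP behavior: the `ThinVolume` class and its fields are hypothetical, it assumes the volume tries to grow before deleting Snapshot copies, and it leaves fractional reserve out entirely.

```python
class ThinVolume:
    """Toy volume: grow in fixed increments first, then delete oldest Snapshot copies."""

    def __init__(self, size_gb, max_size_gb, grow_increment_gb):
        self.size_gb = size_gb
        self.max_size_gb = max_size_gb            # autogrow threshold
        self.grow_increment_gb = grow_increment_gb
        self.data_gb = 0
        self.snapshots_gb = []                    # oldest copy first

    def used_gb(self):
        return self.data_gb + sum(self.snapshots_gb)

    def write(self, gb):
        needed = self.used_gb() + gb
        # Autogrow: extend the volume in increments, up to the threshold.
        while needed > self.size_gb and self.size_gb < self.max_size_gb:
            self.size_gb = min(self.max_size_gb, self.size_gb + self.grow_increment_gb)
        # Autodelete: free space by removing the oldest Snapshot copies.
        while needed > self.size_gb and self.snapshots_gb:
            needed -= self.snapshots_gb.pop(0)
        if needed > self.size_gb:
            return False  # the request needs administrator intervention
        self.data_gb += gb
        return True


vol = ThinVolume(size_gb=100, max_size_gb=140, grow_increment_gb=20)
first = vol.write(60)
vol.snapshots_gb = [15, 10]  # two Snapshot copies now hold 25 GB
second = vol.write(30)       # succeeds after one autogrow step
third = vol.write(40)        # grows to the threshold, then deletes the oldest copy
fourth = vol.write(20)       # cannot be satisfied
```

In a model like this, an important tenant would be given a higher `max_size_gb` so that its volumes keep growing from the shared pool, while a lower-level tenant reaches the point of refusal sooner. Note that the failed fourth write still removes the last Snapshot copy while trying to free space.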


Cisco, VMware, and NetApp have teamed to define and test a secure, multi-tenant cloud architecture capable of delivering not just the necessary security, but also quality of service, availability, and advanced management.

This article introduced our end-to-end approach to QoS. You can read more about QoS or the other pillars of multi-tenancy in our recently released design guide, which describes the elements of the architecture in detail along with recommendations for correct configuration.


Chris Naddeo
Technical Marketing Engineer for UCS
Cisco Systems

Chris joined Cisco to focus on customer enablement and the design of optimal storage architectures for Cisco’s Unified Computing System. He has an extensive storage background, including one year spent at NetApp as a Consulting Systems Engineer for Oracle and Data ONTAP GX as well as nine years at Veritas, where he served as a product manager for Veritas storage software.