NetApp Tech OnTap
     

A 9,000-Seat Virtual Desktop Deployment
with Rapid Cloning

Desktop virtualization is enabling companies to radically change the way they conduct business. I recently worked with a business process outsourcing company that is leveraging virtualization technology to eliminate the need for bricks-and-mortar call centers. The IT team is developing a technology strategy that will allow thousands of call center agents to work from home, so the company can stop the continued procurement of expensive real estate and gain full access to a worldwide workforce.

Having agents work from home presents some unique challenges in terms of providing access to the latest data, ensuring security, and supporting the agents’ working environments. Employing desktop virtualization to enable up to 9,000 agents to work from almost any wired location represents a new approach, and I believe elements of this solution are appealing for a wide variety of situations, not just call center support:

  • VMware® virtual desktop infrastructure provides each agent with a secure, tailored workspace that includes appropriate support tools.
  • NetApp® FlexClone® technology enables the rapid and space-efficient provisioning of agent desktops.
  • NetApp SnapMirror® software allows active desktop environments to be replicated between the company’s three data centers so that a data center outage will not result in data loss or keep agents offline for an extended period.
  • IP telephony extends the customer phone connection to the agent’s home.

In this article I describe the methods the company uses to connect and support users anywhere in the world while creating a highly scalable infrastructure that can survive a site failure with minimal disruption to ongoing operations. I start by looking at the technology evaluation process, describe the details of the technology deployment, and discuss results so far as well as work remaining.

Deciding on the Right Technologies

Approximately six months ago, when I first met with the company’s IT team to discuss storage, they had already made a number of technology decisions involving virtual desktop software and networking. Because they wanted to create a tailored desktop environment for each product they supported and have each type of desktop be a clonable element, they settled on a combination of Citrix to provide the desktop images and VMware to host them. At the time, this was the only way they could get all the capabilities they needed.

They had also decided on iSCSI rather than Fibre Channel for back-end storage. The company had made a large commitment to IP telephony and had a global IP network. As a result, the IT team was sold on IP for its flexibility and wanted to stick with it if possible.

When the team saw the NetApp presentation, three technologies caught their interest:

  • Multiprotocol support. Although the company was confident about the decision to go with iSCSI, the IT team recognized that NetApp’s proven ability to support any protocol would give them the ability to change if needed, and the benefits of NFS in virtual environments also intrigued them.
  • FlexClone. The IT team immediately recognized that NetApp FlexClone technology might provide them the ability to clone desktop environments and eliminate the need for the Citrix/VMware combination they had previously settled on. The NetApp Rapid Cloning Utility (RCU) version 1.0 provided the necessary integration between NetApp Data ONTAP® and VMware.
  • Deduplication. While NetApp FlexClone would save the company from having to maintain 9,000 discrete copies of the desktop operating environment, the IT team also knew from experience that agents accumulated large amounts of unstructured data in the course of their work. NetApp deduplication was immediately seen as a way to reduce that storage requirement. For instance, if all 9,000 agents store copies of a 1GB file set, that alone would require 9TB of storage.
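
To make the deduplication arithmetic concrete, here is a minimal Python sketch. The 9,000 agents and the 1GB file set come from the example above. The `unique_fraction` parameter (the share of each agent's copy that differs from everyone else's) is a hypothetical input, not a measured figure, and actual savings depend on the data.

```python
AGENTS = 9_000        # from the article
FILE_SET_GB = 1.0     # from the article
GB_PER_TB = 1_000     # decimal units, matching the article's 9,000GB = 9TB


def raw_capacity_gb(agents: int, file_set_gb: float) -> float:
    """Capacity needed when every agent's copy is stored separately."""
    return agents * file_set_gb


def deduplicated_capacity_gb(agents: int, file_set_gb: float,
                             unique_fraction: float) -> float:
    """Capacity when identical blocks are stored once.

    unique_fraction is a hypothetical parameter: the portion of each
    agent's copy that is not shared with any other agent.
    """
    shared = file_set_gb * (1.0 - unique_fraction)    # stored a single time
    unique = agents * file_set_gb * unique_fraction   # stored per agent
    return shared + unique


if __name__ == "__main__":
    raw = raw_capacity_gb(AGENTS, FILE_SET_GB)
    print(f"Without deduplication: {raw / GB_PER_TB:.1f} TB")
    for fraction in (0.0, 0.05, 0.25):
        deduped = deduplicated_capacity_gb(AGENTS, FILE_SET_GB, fraction)
        print(f"unique_fraction={fraction:.2f}: {deduped:,.2f} GB")
```

With fully identical copies (`unique_fraction=0.0`) the 9TB requirement collapses to a single 1GB copy. Any per-agent differences grow the footprint linearly with the number of agents.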

Additional conversations helped the IT team recognize the value of another set of NetApp technologies, including:

  • Performance Acceleration Module (PAM). The company already knew from initial tests with other vendors’ storage that booting hundreds of virtual desktops at once places a heavy demand on disk subsystems. The ability to meet the I/O demands of these “boot storms” is required to quickly bring a large number of desktops for call center agents online. The NetApp Performance Acceleration Module alleviates the impact of a boot storm by caching the blocks that make up the boot image in memory as soon as the first virtual desktop is booted, so that the remaining boot requests are largely satisfied from cache.
  • SnapManager® for Virtual Infrastructure (SMVI). While most virtual desktops in this environment will be nonpersistent and, therefore, will not require backup (a new desktop will be created for each agent at the start of each work session), SMVI provides the company a way to automate the protection of the persistent desktops required by more senior support agents or other users who need access to the same desktop on an ongoing basis for continuing work.
  • SnapMirror. When I initially met with the company, the IT team hadn’t yet designed a disaster recovery plan for its virtual desktop environment. NetApp SnapMirror software provides a means to regularly replicate the active virtual desktop environment from one data center to another.
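
The boot storm behavior described in the PAM item above can be illustrated with a toy model. This is only a conceptual sketch, not a model of how PAM is implemented. It assumes every desktop reads the same set of boot blocks (because all desktops are clones of one image), that the cache is large enough to hold all of them, and that nothing is evicted. The function name and block counts are hypothetical.

```python
def simulate_boot_storm(desktops: int, boot_blocks: int) -> tuple[int, int]:
    """Count disk reads and cache hits when identical desktops boot.

    Assumes a shared boot image and a cache with no eviction.
    Returns (disk_reads, cache_hits).
    """
    cache: set[int] = set()
    disk_reads = 0
    cache_hits = 0
    for _ in range(desktops):
        for block in range(boot_blocks):
            if block in cache:
                cache_hits += 1      # served from memory
            else:
                disk_reads += 1      # first touch goes to disk
                cache.add(block)     # later boots find it in cache
    return disk_reads, cache_hits


if __name__ == "__main__":
    reads, hits = simulate_boot_storm(desktops=100, boot_blocks=1_000)
    total = reads + hits
    print(f"disk reads: {reads}, cache hits: {hits} "
          f"({hits / total:.0%} of boot reads served from cache)")
```

In this idealized model only the first desktop's boot touches disk. Real boot times also depend on writes, CPU, and network, so the read counts here should not be read as a prediction of the speedup.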

This unique set of NetApp capabilities solved many of the problems that the company faced, allowing the IT team to rethink its earlier decisions when architecting the final design.

Deployment Details

The final architecture for the virtual desktop environment consists of five main elements:

  • “Pods,” each capable of supporting up to 2,400 users and consisting of:
    • 2 HP C-Class Blade Enclosures with 32 eight-core blades, each with 32GB of memory
    • A clustered NetApp FAS3160 storage system with 20TB of FC disk storage and 32GB of memory in 2 Performance Acceleration Modules
    • 4 Catalyst 3120 Blade Switches
    • 2 Catalyst 4948 Switches
  • VMware Virtual Desktop Manager (VDM) and VMware ESX
  • NetApp RCU version 1.0 on each NetApp storage system to provide cloning
  • NetApp SnapManager for Virtual Infrastructure for automated backups and rapid, application-consistent recovery
  • NetApp SnapMirror software for remote site replication

By deploying two such pods in each of its three data centers, the company will be able to support 9,000 users with just two operational data centers. SnapMirror will be used to replicate active desktop environments at each data center to another site to provide full redundancy and recoverability should a data center go offline.
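
The sizing claim can be checked with a few lines of Python. The pod capacity, pods per data center, number of data centers, and the 9,000-user target are taken from the article. The function names are illustrative only.

```python
USERS_PER_POD = 2_400   # from the article
PODS_PER_SITE = 2       # from the article
SITES = 3               # from the article
TARGET_USERS = 9_000    # from the article


def capacity(sites_online: int) -> int:
    """Maximum users supported by the pods in the sites that are online."""
    return sites_online * PODS_PER_SITE * USERS_PER_POD


def meets_target(failed_sites: int) -> bool:
    """True if the remaining sites can still carry the full user load."""
    return capacity(SITES - failed_sites) >= TARGET_USERS


if __name__ == "__main__":
    for failed in range(SITES):
        online = SITES - failed
        print(f"{online} site(s) online: {capacity(online):,} users, "
              f"meets target: {meets_target(failed)}")
```

Two operational data centers provide 9,600 seats, which covers the 9,000-user target with some headroom. A single remaining data center (4,800 seats) would not.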


Figure 1) The rapid cloning infrastructure used to create up to 9,000 virtual desktops.

Cloning for rapid virtual desktop provisioning. The cloning process is used to create copies of a golden desktop image with the appropriate tools and other resources for each project the company supports. It uses NetApp RCU to clone instances of each environment based on the number of agents contracted to work on a project. RCU automatically creates an import file that is placed in the VMware Virtual Desktop Manager so that all virtual desktops are registered with the VDM and ready for use. The virtual desktop is cloned as a nonpersistent virtual image. This setting causes any changes made to the desktop during the agent’s session to be removed once the agent logs out of the virtual desktop session. The nonpersistent state ensures that the support environment is standard and eliminates the need for ongoing support of the customized desktop environment.
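
The nonpersistent behavior can be pictured as a read-only golden image with a per-session overlay that is thrown away at logout. The Python sketch below is a conceptual analogy only, not how VMware or RCU implements nonpersistent desktops. The class name, method names, and image contents are all hypothetical.

```python
from collections import ChainMap

# Hypothetical golden image contents, for illustration only.
GOLDEN_IMAGE = {"os": "base build", "support_tool": "v1"}


class NonPersistentDesktop:
    """Toy model: session changes land in an overlay discarded at logout."""

    def __init__(self, golden: dict):
        self._golden = golden
        self._session = None

    def login(self) -> None:
        # ChainMap sends writes to the first mapping (the empty overlay)
        # and falls back to the golden image for reads.
        self._session = ChainMap({}, self._golden)

    def write(self, key: str, value: str) -> None:
        self._session[key] = value

    def read(self, key: str) -> str:
        return self._session[key]

    def logout(self) -> None:
        self._session = None  # overlay, and every change in it, is dropped


if __name__ == "__main__":
    desktop = NonPersistentDesktop(GOLDEN_IMAGE)
    desktop.login()
    desktop.write("support_tool", "agent-modified")
    print(desktop.read("support_tool"))   # the agent sees the change
    desktop.logout()
    desktop.login()
    print(desktop.read("support_tool"))   # back to the golden value
```

Because the golden image is never written to, every new session starts from the same standard environment, which is the property the article relies on.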

NFS replaces iSCSI. As we began architecting the final layout, the decision to use iSCSI was reexamined. Ultimately, the IT team opted for NFS over iSCSI for the following reasons:

  • Number of virtual machines per LUN or volume. Based on server virtualization results from NetApp and other storage vendors, 16 to 25 server VMs per LUN are typical in server virtualization deployments. With limited data on sizing the virtual desktop-to-LUN ratio, we estimated it would be on the order of 30 to 75 per LUN. By comparison, a previous server virtualization project I was involved in supported 200 server VMs per NFS volume.
  • Impact on ESX server. A recent NetApp Multiprotocol Performance Test with VMware shows that NFS has less impact on ESX CPUs than iSCSI. Less CPU consumption translates to more virtual machines per server.
  • Provisioning. With thousands of virtual machines, we wanted provisioning to be as simple as possible and wanted to spend as little time as possible on storage performance management. Based on discussions with numerous NetApp/VMware reference accounts, the IT team was satisfied that NFS was the best choice given its combination of performance, scalability, and ease of management.
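
To show what the VM-per-datastore ratios above mean for management overhead, this sketch counts the datastores needed for one 2,400-user pod. The 30 to 75 desktops per LUN estimate and the 200 VMs per NFS volume figure come from the article. Note that the 200 figure was observed for server VMs in an earlier project, so applying it to desktops here is an assumption made for comparison only.

```python
import math

DESKTOPS_PER_POD = 2_400  # from the article


def datastores_needed(desktops: int, vms_per_datastore: int) -> int:
    """Number of LUNs or NFS volumes needed to hold the given desktops."""
    return math.ceil(desktops / vms_per_datastore)


if __name__ == "__main__":
    scenarios = {
        "iSCSI, 30 desktops per LUN (low estimate)": 30,
        "iSCSI, 75 desktops per LUN (high estimate)": 75,
        "NFS, 200 VMs per volume (assumed to carry over)": 200,
    }
    for label, ratio in scenarios.items():
        count = datastores_needed(DESKTOPS_PER_POD, ratio)
        print(f"{label}: {count} datastores per pod")
```

Under these assumptions a pod needs between 32 and 80 LUNs with iSCSI but only 12 NFS volumes, which is the kind of reduction in objects to provision and monitor that the team was after.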

Results So Far and Work Remaining

Currently, the company has one pod deployed in full production and a second in the process of being deployed. The IT team is very pleased with the results thus far:

  • Cloning process. The cloning process presently relies on two NetApp utilities. First, the read-few-write-many (rfwm) utility efficiently creates copies of golden virtual machines within the golden datastore. NetApp FlexClone then creates multiple space-efficient copies of the golden datastore. The time this process takes varies based on the number of virtual machines in the datastore and the number of volume-level copies of that datastore that are required. The availability of file-level FlexClone technology in Data ONTAP 7.3.1 is expected to provide even better cloning times, and its implementation is planned.
  • Boot performance. PAM provides a 4x performance improvement versus booting without cache in place.
  • Desktop performance. So far, virtual desktops have performed very well. Most users say the virtual desktop feels faster than running a local OS.
  • Security. Supplying home agents with a traditional desktop fully loaded with potentially sensitive information that has to be protected via a local firewall is a security nightmare. Because virtual desktops run within the company’s data center, this security risk is eliminated. Agents access desktops via virtual private network connections (VPNs) that ensure security. Virtual desktops can even include secure, direct access to the client company. For instance, an agent providing software support might have direct access to the software company’s internal product resources, troubleshooting databases, and problem reports. Because these secured network connections exist between corporate data centers with appropriate security, the risk is much lower than having remote agents access those resources directly from home.
  • Ability to support existing call centers. Although the primary goal of the project was to support remote agents, the company is making the transition to using the same approach to support existing call centers. Using virtual desktops greatly simplifies the process of supporting call center agents and helps ensure uniformity of service. All agents have the same tools and resources, and it’s easy to keep everyone up to date.
  • Disaster recovery for desktops. In environments such as this one, employee desktops are mission critical, but it’s difficult and sometimes impossible to provide any reasonable level of disaster recovery for large numbers of widely dispersed physical systems. Desktop virtualization makes it easier to protect this critical resource. If an agent’s display hardware fails, he or she can quickly regain full desktop access using alternate hardware. Running virtual desktops are themselves protected by replication to a peer data center.
  • No remote hardware. Agents working from home are required to have a broadband Internet connection and a computer capable of running Internet Explorer, so the company doesn’t have to supply and support hardware resources for its remote agents.

Over the next 18 months, the IT team plans to focus on deploying the remaining pods and enabling disaster recovery between data centers. Once the project is complete, the impact on the company’s business—in fact, the impact on the way the company does business—will be substantial.

Conclusion

Today’s economic realities require businesses to gain efficiency by thinking differently. The company profiled in this case study rethought its entire approach to the call center business, resulting in an infrastructure that will cost far less to maintain while providing important new capabilities and a platform for ubiquitous growth. Desktop virtualization combined with unique NetApp storage capabilities made this deployment a reality.


Trey Layton
Systems Engineer
NetApp

Trey has been working with NetApp since 2006, specializing in the design of next-generation data centers using VMware. His wealth of experience with networking and virtualization makes him uniquely well suited to the current evolution to network storage. With over 18 years of IT experience, Trey began his career in the United States Army at USCENTCOM supporting U.S. Special Operations groups operating in the Middle East. He has also held key network consulting and systems engineering positions at Eastman Kodak, GE, and Cisco Systems.

 