Case Study: PeakColo Accelerates Cloud with Data ONTAP 8 Cluster-Mode
At PeakColo, we specialize in providing turnkey cloud infrastructure designed specifically for value-added resellers (VARs) and managed service providers (MSPs), including such industry leaders as Sayers, BitRefinery, Data Fortress, Parsec Data Management, and Lewan.
Our premier WhiteCloud infrastructure as a service (IaaS) offering lets a VAR or MSP become a fully branded VMware vCloud® services provider in a matter of hours without the upfront capital investment that would otherwise be required.
Over the last three years, we've experienced 100% growth every year. What fuels this growth, and what makes our services attractive to our customers, is the tremendous performance, availability, and feature set that we deliver.
By building our infrastructure on NetApp® Data ONTAP® 8 Cluster-Mode, we've been able to:
In this article I'll describe how we built an agile IT infrastructure (compute, networks, and storage) that addresses our unique challenges and explain how that infrastructure delivers tremendous advantages to our customers and to us. I'll also talk about new technologies such as Flash Pool, Infinite Volume, and parallel Network File System (pNFS) that we're excited about deploying in the future. I think you'll find that you can benefit from many of the same technologies we use, whether you deploy them in your own data centers or contract for them in the cloud.
For a typical IT infrastructure, planning downtime for hardware maintenance can be extremely difficult; when you have hundreds of tenants sharing the same infrastructure, it becomes impossible. In the first two iterations of our infrastructure, before NetApp and especially Data ONTAP 8 were introduced, our SANs were purely a physical construct. Even though we had multiple storage systems, a problem that required maintenance on a particular storage system inevitably affected multiple customers. That made achieving a 100% SLA impossible.
From the standpoint of provisioning, a customer might ask for a new chunk of storage (for example, 0.5PB or 1PB), and we had no way to bring that online without disrupting the customer's ongoing operations.
We were also limited in our ability to share a SAN among multiple customers. iSCSI was our protocol of choice, but we couldn't use it in a secure, multi-tenant fashion. We had to control access to physical hosts, which limited the types of service we could offer and also limited the value we could pass on to our reseller partners. They themselves needed the ability to support multi-tenancy for their own customers and a way to ensure compliance with regulations such as HIPAA and standards such as PCI.
Finally, it's the nature of our business that we never have more than limited visibility into the demands of the various workloads running on our infrastructure at any given time. We really needed an infrastructure that could elastically adapt to spikes in demand from different workloads (for example, virtual desktop infrastructure (VDI) boot storms) and that would also allow us to load balance workloads to accommodate longer term trends.
Design of Our Cloud Architecture
We currently have five Type-II SSAE 16/SOC 1 data centers: four in the United States and one in the United Kingdom. Our cloud architecture is designed throughout with bandwidth and redundancy as top priorities. Key components include:
Figure 1) Overview of PeakColo architecture.
All our data centers use a blended Internet access approach. We load balance across the top 16 Internet carriers to increase flexibility, reduce costs, and deliver the best possible performance and reliability. We typically achieve latencies of 40 milliseconds or less in the continental United States.
We use carrier-class Brocade VDX and CER networking components in our networks:
The needs of a cloud provider routinely exceed the limits of enterprise-class hardware. We chose Brocade because it gives us great modularity and scalability and a roadmap for the future.
The recent acquisition of Nicira by VMware underscores the important role that software-defined networking (SDN) is likely to play in the future. Brocade's commitment to the OpenFlow protocol gives us confidence that we'll be able to fully leverage SDN as the standard matures.
PeakColo extends the idea of SDN a step further with our patent-pending Layer 2 process, which we use to cross-connect customers' Layer 2 resources into our cloud environment. Typical use cases are hybrid cloud deployments in which an enterprise wants to keep its existing firewall, AS400, legacy storage and tape, or other physical resources, but also leverage cloud components and services from PeakColo.
I discuss the network technologies we use in more detail in a recent interview.
On the server side, we use our own Open Compute Platform servers, each with dual 10 Gigabit Ethernet (10GbE) network interface cards (NICs). The NICs are active-active and carry both user and data traffic. We use cross-fabric link aggregation (LAG) to provide load balancing and eliminate single points of failure. In total, we have about 2,500 such servers in our five data centers right now.
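As a rough illustration of how an active-active link aggregation group can both spread load and survive a link failure, the sketch below hashes each flow onto one of the links that is currently up. It is a generic model, not PeakColo's or Brocade's configuration; the link names, the flow tuple, and the `pick_link` helper are all hypothetical.

```python
import zlib


def pick_link(flow, links):
    """Pick an active link for a flow by hashing its identifying tuple.

    flow  -- hypothetical (src_ip, dst_ip, src_port, dst_port) tuple
    links -- dict of link name -> True if the link is currently up
    """
    active = sorted(name for name, up in links.items() if up)
    if not active:
        raise RuntimeError("no active links in the LAG")
    # A stable hash keeps every packet of one flow on the same link.
    key = "|".join(str(part) for part in flow).encode()
    return active[zlib.crc32(key) % len(active)]


# Two NICs, one per fabric (names are made up for the example).
links = {"nic0-fabricA": True, "nic1-fabricB": True}
flow = ("10.0.0.5", "10.0.1.9", 49152, 3260)

first = pick_link(flow, links)
# Simulate losing the chosen link: the flow moves to the survivor.
links[first] = False
second = pick_link(flow, links)
```

With both links up, different flows hash to different links, which is where the load balancing comes from. When one link fails, every flow lands on the remaining link, so no single NIC or fabric is a point of failure.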
We chose Open Compute because it lets us source servers from multiple vendors to fulfill our procurement cycles and not get locked into a single vendor. We are able to purchase servers with the same defined set of components, parts, and drivers and have them shipped preinstalled with VMware®. Open Compute platforms are supported by VMware and everybody else, so we know that when we deploy new servers, they are going to work just like the existing ones without any surprises.
For storage, PeakColo uses NetApp FAS3240 and NetApp FAS3270 systems exclusively. These are configured in clusters of four nodes using Data ONTAP 8 Cluster-Mode. We currently have two NetApp clusters deployed and will be deploying two more very soon. If you aren't familiar with Cluster-Mode, you can read more about it in an article from last month's issue of Tech OnTap®. There is also an article on Cluster-Mode block performance in this issue, as well as a recent article on Cluster-Mode NAS performance and scaling.
We chose Data ONTAP 8 Cluster-Mode because no other storage technology out there came close to delivering the same level of scalability, flexibility, performance, and features. NetApp marketing describes this architecture as intelligent, immortal, and infinite. While those sound like pretty bold claims, the technology delivers for us. It's intelligent in terms of storage efficiency, and near infinite in terms of scaling. Support for nondisruptive operations allows us to meet 100% SLAs, and we think it will also let us get much more life out of each storage system we deploy. I'll talk more about this later.
Each of our customers maps to a separate Vserver on a storage cluster; this is the key to our multi-tenant environment and enables many of our key capabilities. A Vserver is a secure, virtualized storage container that includes its own administration, security, IP addresses, and namespace. A Vserver can include volumes on multiple nodes in the cluster and is not tied to any particular node. We can move Vservers as necessary to do maintenance or rebalance load without disrupting workloads running on those Vservers.
Figure 2) Data ONTAP 8 Cluster-Mode uses Vservers to provide multi-tenancy and enable nondisruptive operations (NDO).
Each of our clusters has a mix of SSD, SAS, and SATA disks, and all nodes have Flash Cache. Our customers contract for the amounts and types of storage they need from each tier. We use six 10GbE connections per storage system (including redundant cluster interconnects) to deliver the necessary connectivity and throughput.
Figure 3) NetApp connectivity in the PeakColo architecture uses six 10GbE connections per storage system.
As part of its vCloud initiative, VMware established the VMware Service Provider Program (VSPP) as a framework to allow service providers like us to consume and offer VMware virtualization solutions in a way that aligns with our business models. We are a premier-level VSPP partner.
Our WhiteCloud service can deliver a fully branded and dedicated solution based on vCloud Director to our customers. We can also provide other virtualization platforms such as Hyper-V™ and Citrix XenServer or a mix of physical and virtual servers. We are able to do this because all servers (physical and virtual) connect back to a Vserver and a dedicated VLAN on a NetApp cluster, providing the necessary multi-tenancy support. Because of the performance we deliver, we have many customers that use us to support VDI solutions such as XenDesktop.
Operational Advantages for PeakColo
Using Data ONTAP 8 Cluster-Mode as the foundation of our architecture helped us address our infrastructure challenges and gives us significant operational advantages.
The ability to perform maintenance on our NetApp clusters without disrupting users is critical to delivering on our 100% SLA. Maintenance activities such as firmware and software upgrades and hardware upgrades and replacements can be performed by first moving active Vservers off a cluster node so that customers are not disrupted. When all nodes require upgrading, this is done in a round-robin fashion. For storage provisioning, we can bring new storage online without disruption and transparently migrate a customer's data to that new storage.
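The round-robin procedure can be sketched as a simple loop: evacuate a node, upgrade it, move on to the next. The model below is a deliberate simplification and not NetApp tooling. It assumes each Vserver lives on exactly one node (a real Vserver can span nodes), and the `rolling_upgrade` function, tenant names, and version strings are hypothetical.

```python
def rolling_upgrade(placement, nodes, new_version, versions):
    """Upgrade nodes one at a time, evacuating Vservers first.

    placement -- dict of Vserver name -> node currently hosting it
    nodes     -- list of node names in the cluster
    versions  -- dict of node name -> software version (updated in place)
    Returns a log of (vserver, from_node, to_node) moves.
    """
    moves = []
    for node in nodes:
        others = [n for n in nodes if n != node]
        if not others:
            raise ValueError("need at least two nodes to evacuate one")
        for vserver in sorted(v for v, n in placement.items() if n == node):
            # Send each Vserver to the least-populated of the other nodes.
            counts = {n: 0 for n in others}
            for host in placement.values():
                if host in counts:
                    counts[host] += 1
            target = min(others, key=lambda n: (counts[n], n))
            placement[vserver] = target
            moves.append((vserver, node, target))
        # The node is now empty, so upgrading it touches no tenant.
        assert node not in placement.values()
        versions[node] = new_version
    return moves


nodes = ["node1", "node2", "node3", "node4"]
placement = {"tenantA": "node1", "tenantB": "node2",
             "tenantC": "node3", "tenantD": "node4"}
versions = {n: "old" for n in nodes}
moves = rolling_upgrade(placement, nodes, "new", versions)
```

Every tenant stays placed on some node throughout, and each node is empty at the moment it is upgraded. The sketch does not move Vservers back afterward; a separate rebalancing pass would handle that.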
The same ability to move active Vservers also provides a convenient way of doing load balancing. OnCommand® System Manager makes it easy for our administrators to see what's happening across all Vservers in order to make load-balancing decisions.
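To make the load-balancing decision concrete, here is a minimal greedy sketch that proposes one Vserver move from the busiest node to the least busy one. It is only an illustration of the reasoning an administrator might apply, not a feature of OnCommand System Manager; the `suggest_move` function, the load figures, and the assumption of one node per Vserver are all hypothetical.

```python
def suggest_move(load_by_vserver, placement, nodes):
    """Suggest one Vserver move from the busiest node to the idlest node.

    load_by_vserver -- dict of Vserver name -> load (hypothetical IOPS)
    placement       -- dict of Vserver name -> node name
    nodes           -- list of all node names, including empty ones
    Returns (vserver, source, target), or None if no move helps.
    """
    node_load = {n: 0 for n in nodes}
    for vserver, node in placement.items():
        node_load[node] += load_by_vserver[vserver]
    source = max(nodes, key=lambda n: node_load[n])
    target = min(nodes, key=lambda n: node_load[n])
    gap = node_load[source] - node_load[target]
    # Moving a Vserver with load L changes the gap between these two
    # nodes to |gap - 2L|, so only loads with 0 < L < gap narrow it.
    candidates = [v for v, n in placement.items()
                  if n == source and 0 < load_by_vserver[v] < gap]
    if not candidates:
        return None
    best = min(candidates,
               key=lambda v: (abs(gap - 2 * load_by_vserver[v]), v))
    return best, source, target


nodes = ["node1", "node2", "node3"]
placement = {"tenantA": "node1", "tenantB": "node1",
             "tenantC": "node2", "tenantD": "node3"}
load = {"tenantA": 500, "tenantB": 300, "tenantC": 200, "tenantD": 100}
suggestion = suggest_move(load, placement, nodes)
```

In this example node1 carries 800 units against 100 on node3. Moving tenantB (300) leaves the two nodes at 500 and 400, which is closer than moving tenantA would achieve.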
Multi-Tenancy, Feature Pass-Through, and Delegated Management
Because multi-tenancy is built into Cluster-Mode through the ability to create Vservers, we are able to better share infrastructure between our customers for greater infrastructure efficiency without sacrificing customer isolation. In addition, we can delegate the management of a Vserver to a customer (if they want) and pass control of the full NetApp feature set (including deduplication, compression, thin provisioning, backup, replication, and more) through to them.
Since our customers are service providers themselves, this is important. We can deliver true IaaS to them, giving them full management control over their infrastructure, instead of just managed IaaS where we retain control over most of the bells and whistles.
Many of our customers are NetApp VARs themselves, so they already know how to manage NetApp storage and understand the value of the NetApp feature set. For new customers, we provide some pretty intensive training to make sure they understand how and when to take advantage of NetApp features. We can see that deduplication is enabled on all NetApp volumes (except for a few volumes containing geospatial data, where compression was a bigger space saver) for a total space savings of about 70%. Those space savings translate to significant cost savings for PeakColo and our customers.
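The arithmetic behind a savings figure like that is straightforward. The short sketch below converts between logical data and physical capacity for a given savings ratio. The 0.70 ratio is the approximate figure quoted above; the function names and the 10TB and 100TB inputs are only illustrative.

```python
def physical_needed(logical_tb, savings_ratio):
    """Physical capacity (TB) needed to hold logical_tb of data."""
    if not 0 <= savings_ratio < 1:
        raise ValueError("savings_ratio must be in [0, 1)")
    return logical_tb * (1 - savings_ratio)


def logical_supported(physical_tb, savings_ratio):
    """Logical data (TB) that fits on physical_tb of capacity."""
    if not 0 <= savings_ratio < 1:
        raise ValueError("savings_ratio must be in [0, 1)")
    return physical_tb / (1 - savings_ratio)


# About 70% savings: 10TB of data occupies roughly 3TB of disk,
# and 100TB of disk holds roughly 333TB of data.
small = physical_needed(10, 0.70)
large = logical_supported(100, 0.70)
```

Put another way, at roughly 70% savings each physical terabyte holds a little over three terabytes of customer data.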
Preserving Existing Investments
Most scale-out storage uses specialized building blocks. Another great thing about Cluster-Mode is that it uses the same building blocks as Data ONTAP 7 and Data ONTAP 8, 7-Mode. We already had a number of systems running 7-Mode, which could be repurposed and used in our Cluster-Mode clusters. We accomplished this by migrating the data from the 7-Mode systems to an existing cluster using tools such as VMware Storage vMotion and then joining the hardware to the cluster. This means that if you're not ready for NetApp clustering today, you can start with NetApp 7-Mode and convert to Cluster-Mode when you need it.
Keeping Storage Longer
As a service provider, we want to get the longest possible life and the maximum utilization from our infrastructure investments. Customer performance requirements, however, traditionally drive a fairly rapid upgrade cycle in which you have to replace storage systems every two to three years.
Cluster-Mode will allow us to hold onto storage hardware longer. NetApp clusters don't have to be built from identical building blocks; cluster nodes can be heterogeneous. This means we'll be able to add the latest generation of storage nodes to our existing clusters as we need them. Then we can migrate Vservers that need the highest performance to the new nodes, while retaining the older hardware in the cluster as another tier of storage that we can offer our customers. We anticipate holding onto storage systems five to seven years.
Advantages for PeakColo Customers
Using our cloud infrastructure, PeakColo can create a fully branded vCloud Director solution with a virtual SAN and 10TB to 500TB of storage in just four to eight hours. We think that gives a customer a lot of market value in a very short time. Some of our VAR and MSP customers contract for a single Vserver, which they share among their own customers. Others that have customers that need, for instance, HIPAA or PCI compliance might have a Vserver for each individual customer. Service providers that want to offer disaster recovery services contract for Vservers in multiple PeakColo data centers.
Possibly the biggest advantage that we deliver to our customers is the full value of NetApp cluster performance combined with the NetApp feature set. Even for a customer with a modest requirement (say, just 10TB of storage), we are able to deliver dramatic I/O performance as well as deduplication and other storage efficiency technologies that actually reduce the amount of storage for which they have to pay. They also get the full advantages of NetApp Snapshot™ and all the other NetApp data management and data protection features. It's as if they purchased a multinode NetApp cluster of their own.
Our ability to deliver a high level of I/O performance is a key differentiator for us. We've never lost a prospective customer after they do testing or a proof of concept with us; the clear performance advantage we provide just blows people away.
Data ONTAP 8.1.1 includes several new technologies that we're excited about and currently investigating for future deployment.
Infinite Volume and pNFS
NetApp Infinite Volume technology provides a compound volume in which data is distributed across multiple constituent volumes spread across all the nodes of the cluster. This dramatically increases the throughput that can be delivered from a single volume. We think when you combine the capabilities of Cluster-Mode with the pNFS capability of NFS version 4.1, it has the potential to be a market changer, allowing us to drastically increase speed while dramatically increasing usability in comparison to specialized, parallelized file systems such as GPFS and Lustre. We'll be able to pass that value down to our existing VARs and MSPs that serve science, engineering, and other big data markets such as Hadoop. Advantages include:
We're already using pNFS with one customer and are investigating ways we can combine it with Infinite Volume in the future to create new high-performance services.
As a service provider, we are able to exercise almost no control over the workloads that run on our infrastructure. Problems such as misaligned volumes or VMs, boot and login storms, and similar events can occur without warning, so any technology that allows our infrastructure to better adapt to these kinds of unexpected events is welcome.
We see NetApp Flash Pool technology as a potentially important tool to help us better accommodate these unexpected events and also as another value-add that will give us the ability to create and offer new tiers of storage. Flash Pool is part of the NetApp virtual storage tiering (VST) technology, which adapts automatically to keep hot datasets on high-performance storage. It allows you to create aggregates of NetApp disks that combine traditional disk drives with SSDs. Random write and read data is automatically cached on SSDs to accelerate performance.
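The general idea of keeping hot data on a small, fast tier can be illustrated with a least-recently-used (LRU) read cache in front of a slower store. This is a generic caching sketch and says nothing about how Flash Pool is actually implemented. It models reads only, and the `ReadCache` class, block IDs, and capacity are hypothetical.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU cache standing in for an SSD tier in front of slow disks."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing        # dict: block id -> data (the disk tier)
        self.blocks = OrderedDict()   # cached blocks, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.backing[block_id]           # slow path: read from disk
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the coldest block
        return data


# A toy "boot storm": 50 desktops read the same four boot blocks.
disk = {i: f"block-{i}" for i in range(100)}
cache = ReadCache(capacity=8, backing=disk)
for _ in range(50):
    for block in (0, 1, 2, 3):
        cache.read(block)
```

Only the first read of each boot block goes to the slow tier; the remaining reads are served from the cache. That is why a small amount of flash can absorb a spike that would otherwise hit the disks.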
We're currently testing the performance of Flash Pools that combine high-capacity SATA disks with SSDs, and we hope to offer them as a new tier of storage to our customers in the future.
For PeakColo, continued growth means continuing to deliver the performance, scalability, and capabilities that our VAR and MSP customers need. We're confident that we've chosen the best technology partners to sustain our growth and accommodate important industry trends such as the adoption of flash-based storage, software-defined networking, and big data. NetApp Data ONTAP 8 Cluster-Mode is delivering the capabilities we need to succeed now and to meet our future challenges. The flexibility of Cluster-Mode is invaluable to us as a cloud service provider, allowing us to be much more agile than our competitors.