NetApp Tech OnTap
     

Virtualizing Microsoft SharePoint

Microsoft® SharePoint® is an integrated suite of services that provides comprehensive content management, enterprise search, and other capabilities to improve business collaboration. The latest version, SharePoint 2010, adds a host of new features, including business analytics capabilities with PowerPivot, making the software even more versatile.

SharePoint is one of the fastest-growing products in Microsoft's history. According to a recent IDG survey, 62% of CIOs identified SharePoint as a critical component of their technology portfolios, and more than half reported that SharePoint challenges—particularly storage-related challenges—are affecting their businesses. Today, half of companies with more than 1,000 users are experiencing a SharePoint data growth rate of 41% or higher* annually. Rapid growth results in a significant physical footprint that eats up resources and makes management and data protection more difficult.

Virtualization can solve many of these issues if approached correctly. In this article I examine how you can virtualize your SharePoint environment (either SharePoint 2007 or SharePoint 2010) using Microsoft Hyper-V™ and NetApp® storage. Virtualization can significantly shrink your overall physical footprint, saving you electricity, cooling, and physical space while also simplifying management. When you need to grow your SharePoint environment, you can quickly add resources to existing virtual machines or add more virtual machines. Virtualization also gives you more and better options for data protection, availability, and disaster recovery.

Components of SharePoint

SharePoint is a multitier application that uses roles to scale each of the tiers independently. IT teams can use any number of physical servers to support these SharePoint roles; the collection of servers is referred to as a “farm.” You can choose to run certain SharePoint roles independently and combine others on the same physical server. However, Microsoft best practices typically advise running a single role per server, and many IT departments run SharePoint roles on independent physical servers to avoid potential performance bottlenecks when those roles need to scale.

Figure 1) Three-tiered SharePoint environment.

The Web tier consists of one or more stateless Web servers referred to as Web front-end servers, or WFEs. The WFE servers handle incoming requests and route them to the correct server in the application tier. The WFEs can be load-balanced, and—based on your scalability requirements—additional servers can be added. It’s not unheard of to have 20 or more of these, which contributes greatly to physical sprawl in a SharePoint environment built around physical servers.

The application tier runs SharePoint administration Web sites, end user Web sites, and shared service providers. (SharePoint Web sites and shared service providers are typically run on separate physical servers.) Administrative sites are special SharePoint sites that allow administrators to set up and configure sites for end users. SharePoint 2010 adds an additional server role to the application tier for PowerPivot.

The database tier provides all the back-end database services needed by the application tier. SharePoint Server relies on SQL Server® databases to store configuration, administrative, site content, and search data. A SharePoint installation will have a configuration database that contains information such as global configuration data (for example, information on the Web servers in the installation and server settings). SharePoint also maintains all site content in SQL Server databases—for example, documents managed in SharePoint document libraries are stored in databases instead of the Windows® file system. Other databases store information used by SharePoint search services (indexes, for instance) and for features that are unique to a particular SharePoint installation. Typically, a single system running SQL Server provides the back end for SharePoint, but these databases can also be spread across multiple physical servers in large installations.

Proliferation of servers in each tier is what contributes to the sprawl in a physical environment. Virtualizing and consolidating both servers and storage significantly reduce the number of physical servers required for a SharePoint environment and simplify all aspects of management while increasing server utilization.

Planning Deployment on NetApp and Hyper-V

In terms of Hyper-V, the simplest way to convert from physical to virtual is to replace each physical server with a virtual machine using a tool such as Microsoft System Center Virtual Machine Manager. Be careful to arrange things so that the failure of a single physical server has as little impact as possible on SharePoint. In other words, while it might be possible to put all the virtual machines used by SharePoint on a single physical server, it’s not recommended. Distribute SharePoint functions across your available servers for resiliency and performance. Microsoft provides more guidelines for virtualizing SharePoint on Hyper-V in this TechNet article. NetApp also provides detailed guidance for SharePoint environments in its recently released best practices guide. While this guide focuses on physical deployments, the best practices also apply to virtual environments.
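The advice to spread SharePoint functions across servers can be illustrated with a minimal Python sketch. It is not a Hyper-V or System Center Virtual Machine Manager API; the function names, VM names, and host names are hypothetical, and the round-robin rule is only one simple way to keep virtual machines of the same role on different hosts.

```python
from collections import defaultdict
from itertools import cycle


def place_vms(vms, hosts):
    """Assign VMs to hosts round-robin, grouped by role.

    vms   -- list of (vm_name, role) tuples (hypothetical names)
    hosts -- list of Hyper-V host names (hypothetical names)

    VMs of one role are assigned consecutively, so they land on
    different hosts as long as the role has no more VMs than there
    are hosts.
    """
    by_role = defaultdict(list)
    for name, role in vms:
        by_role[role].append(name)

    placement = defaultdict(list)
    host_cycle = cycle(hosts)
    for role in sorted(by_role):
        for name in by_role[role]:
            placement[next(host_cycle)].append(name)
    return dict(placement)


def roles_wiped_out(placement, vms, failed_host):
    """Return the roles that have no surviving VM if failed_host goes down."""
    role_of = dict(vms)
    surviving = {
        role_of[name]
        for host, names in placement.items()
        if host != failed_host
        for name in names
    }
    return sorted(set(role_of.values()) - surviving)


if __name__ == "__main__":
    farm = [("wfe1", "web"), ("wfe2", "web"),
            ("app1", "app"), ("app2", "app"),
            ("sql1", "db")]
    layout = place_vms(farm, ["hv1", "hv2"])
    for host in sorted(layout):
        print(host, layout[host], "loses:", roles_wiped_out(layout, farm, host))
```

In this example the Web and application roles survive the loss of either host, while the single SQL Server virtual machine remains a single point of failure on whichever host runs it. A check like this makes such gaps visible before deployment.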

From a NetApp perspective, there are several important considerations:

  • Sizing your environment
  • Eliminating duplication
  • Laying out your data

Sizing
When it comes to sizing your environment for capacity and performance, you want to avoid both oversizing and undersizing. NetApp uses two approaches to size your environment. The first is based on the total number of users and the amount of storage you need to provide per user. The second requires that you determine the number of documents you have, their average size, and their number of versions, and then estimate those values at several points in the future. The second approach gives you a more accurate prediction, assuming you can provide the current information and have some idea of your growth rate.
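The two sizing approaches can be sketched as simple arithmetic. This is an illustration only, not NetApp's sizing tool: the function names and input figures are hypothetical, each version is assumed to be stored as a full copy, and growth is assumed to compound annually at a fixed rate. Database overhead, indexes, and performance sizing are not modeled.

```python
def size_by_users(users, gb_per_user):
    """Approach 1: capacity in GB from user count and per-user allocation."""
    return users * gb_per_user


def size_by_documents(doc_count, avg_doc_mb, avg_versions, annual_growth, years):
    """Approach 2: projected capacity in GB for year 0 through `years`.

    Assumes every version is a full copy of the document and that total
    content grows by `annual_growth` (0.41 means 41%) each year.
    """
    base_gb = doc_count * avg_doc_mb * avg_versions / 1024
    return [base_gb * (1 + annual_growth) ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Hypothetical inputs: 2,000 users at 0.5 GB each.
    print("By users (GB):", size_by_users(2000, 0.5))

    # Hypothetical inputs: 1,000,000 documents, 1 MB average, 3 versions,
    # projected over 3 years at 41% annual growth.
    projection = size_by_documents(1_000_000, 1.0, 3, 0.41, 3)
    for year, gb in enumerate(projection):
        print(f"Year {year}: {gb:,.0f} GB")
```

The document-based estimate depends heavily on the growth rate, which is why the second approach calls for estimating values at several points in the future rather than sizing only for current content.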

Eliminating Duplication
In any SharePoint environment there is significant duplication in the form of operating system files, application binaries, and so on installed on each server. A unique benefit associated with virtualizing SharePoint with NetApp storage is the ability to eliminate duplicate data on primary storage and recover that space. Using NetApp FlexClone®, you can create template virtual machines with the appropriate software and then clone each template as many times as necessary for each virtual machine of that type. This process is extremely fast and space-efficient since the entire template does not need to be copied. Only the differences between the various clones are stored on disk.

For virtual machines that have already been deployed using standard provisioning methods, NetApp deduplication can recover much of the duplicate storage, provided that the virtual machines share the same LUN or volume.
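A rough, idealized model shows why storing only the differences between clones saves space. The sketch below is not how FlexClone or deduplication account for space internally. It assumes that every virtual machine of a given type shares an identical template image and adds a fixed amount of unique data, and all names and figures are hypothetical.

```python
def full_copy_gb(vm_count, template_gb, unique_gb_per_vm):
    """Space used when every VM carries its own full copy of the template."""
    return vm_count * (template_gb + unique_gb_per_vm)


def cloned_gb(vm_count, template_gb, unique_gb_per_vm):
    """Space used when the template is stored once and each clone
    stores only its own differences."""
    return template_gb + vm_count * unique_gb_per_vm


def savings_fraction(vm_count, template_gb, unique_gb_per_vm):
    """Fraction of space saved by cloning relative to full copies."""
    full = full_copy_gb(vm_count, template_gb, unique_gb_per_vm)
    cloned = cloned_gb(vm_count, template_gb, unique_gb_per_vm)
    return 1 - cloned / full


if __name__ == "__main__":
    # Hypothetical farm: 10 Web front-end VMs from a 20 GB template,
    # each accumulating 2 GB of unique data.
    print("Full copies (GB):", full_copy_gb(10, 20, 2))
    print("Clones (GB):     ", cloned_gb(10, 20, 2))
    print(f"Savings:          {savings_fraction(10, 20, 2):.0%}")
```

The model treats the savings as an upper bound: the more unique data each virtual machine accumulates relative to the shared template, the smaller the benefit.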

Layout
The final consideration has to do with how to best lay out the data for data protection and disaster recovery. NetApp recommends that all SharePoint data from both the application tier and the database tier be stored in LUNs separate from the virtual machine operating system and applications. (This is similar to the layout that was described in a recent Tech OnTap article that discussed virtualizing Microsoft applications with VMware®.) This approach will allow you to take best advantage of the NetApp SnapManager® tools to protect your SharePoint data.

For SharePoint environments running on Hyper-V there are three SnapManager tools you will use:

  • SnapManager for Hyper-V. Installed on each Hyper-V server. Provides consistent backups and replication for Hyper-V virtual machines.
  • SnapManager for Microsoft Office SharePoint Server. An agent is installed on each virtual machine associated with SharePoint to coordinate consistent backups and replication.
  • SnapManager for SQL Server. Installed on each SQL Server virtual machine to provide consistent backup and replication for SQL Server. (SnapManager for SQL Server is under the control of SnapManager for SharePoint and included as part of that solution.)

Data Protection and DR for SharePoint
SharePoint is commonly used for project management and collaboration and is becoming increasingly popular to automate customer service, research and development, and other departmental-level processes. Any disruption to your SharePoint environment might delay product launches or leave customers waiting. According to ESG, approximately one-third of planned SharePoint users are deploying it across the entire organization, meaning that—similar to Exchange—almost everyone will be affected if an outage occurs*. These factors make SharePoint data protection and disaster recovery increasingly important.

The SnapManager tools mentioned in the previous section can provide both backup and replication for your virtualized SharePoint environment. SnapManager for Hyper-V protects the virtual machines themselves. SnapManager can perform regular virtual machine backups using NetApp Snapshot™ technology for minimum disruption and almost instantaneous recovery. Using SnapManager for Hyper-V to replicate your virtual machines to a secondary site allows you to quickly restart them should a disaster occur at the primary site.

SnapManager for Microsoft Office SharePoint Server (SMMOSS) coordinates backup and replication for your entire SharePoint environment.

Figure 2) SnapManager for Microsoft Office SharePoint Server (SMMOSS).

The SMMOSS Manager is responsible for providing central backup/restore management by utilizing the services of the Control and Member Agents installed in your SharePoint environment. It also provides the central graphical user interface (GUI) to initiate backup and restore tasks for SharePoint Web applications.

The SMMOSS Media Server generates and stores various artifacts related to a SharePoint Web application’s backup set. This includes backup set indexes and backup set metadata.

The SMMOSS Control Agent runs as a service on each SharePoint Web front-end server and is responsible for discovering the SharePoint Web apps that run on that WFE. It is also responsible for initiating backup and restore tasks for the Web apps on its WFE server, which it does with the help of Member Agents.

The SMMOSS Member Agent on each SQL Server actually performs the backup or restore task by using SnapManager for SQL Server (SMSQL)–based commands. SMSQL is needed because only SMSQL can back up or restore SQL Server databases, and SharePoint Web apps use a special SQL Server database (the content database) to store all their contents.

The SMMOSS Member Agent on the SharePoint Index Server performs the backup or restore of the SharePoint search database and index files. (Index files can be backed up only if they reside on a NetApp LUN.)

Because SnapManager backups use NetApp Snapshot technology, they occur in a matter of seconds. This means they can be performed frequently without disruption. Once SharePoint backups are created, they can be easily replicated to your secondary site. SnapManager makes it easy to establish a replication schedule.

Replicating both your virtual machines and your SharePoint data to a secondary site gives you everything you need to recover your SharePoint environment in the face of a disaster. (This process can either be performed manually or scripted.)

Figure 3) Disaster recovery in a joint Microsoft and NetApp environment.

This approach offers significant advantages in a virtualized environment versus disaster recovery in a physical environment:

  • It eliminates the need for complicated, server-based disaster recovery software.
  • With physical servers, you have to configure your DR environment in advance with essentially the same servers or incur the downtime that would result from rebuilding your environment from scratch on bare metal. With virtual machines (assuming your virtual machine data has been replicated), you can restart the necessary SharePoint virtual machines on any Hyper-V capable server in minutes. You can also provide fewer Hyper-V servers in your secondary environment and accept lower performance from SharePoint should a disaster occur. (If an extended disaster occurs, you can add servers and live migrate running virtual machines as needed.)
  • NetApp solutions can reduce your overall storage requirement. NetApp FlexClone and deduplication technology eliminate duplication in both your primary and secondary storage environments. Many sites find that they can offset the cost of a secondary environment as a result of these savings.

Conclusion

Virtualizing your SharePoint environment can eliminate many of the associated costs. By eliminating servers, shrinking the physical footprint, and increasing utilization, you save money on power, cooling, physical space, and maintenance costs. Management also becomes simpler, and provisioning time for new servers drops from days to hours or even minutes.

Adding NetApp storage to your virtualized SharePoint environment multiplies these benefits. NetApp eliminates the duplication that is inherent in virtual environments with its FlexClone and deduplication technologies while providing improved data protection and disaster recovery to better protect critical SharePoint resources.

* ESG Research Report, Microsoft SharePoint Adoption, Market Drivers, & IT Impact, March 2009.

John Parker
Reference Architect
NetApp

John is responsible for developing storage directions and best practices for Microsoft SQL Server and SharePoint in conjunction with NetApp storage. He has a long-time interest understanding how knowledge management can enhance the performance of an organization. John’s specialties include IT systems architecture and database server architecture.

 