Monday, January 7, 2008

Scaling SharePoint 2007 Storage Architecture in a brief!

Well.. Happy anniversary to Sharepoint 2007!

Another day and another thought! Time to discuss the gotchas for the administrators!!
Great challenge to any administrator is to Plan the available resources such that the resources are used for optimum solutions and overhead! And I am not an exceptional. So lets walk through the strategies that turns the effective planning in the reality!!

Sharepoint and content database

All the bulk data in all the site collections that comprise the portal instance is stored in the SharePoint Content databases. So SharePoint content databases grow very rapidly when used in file share replacement/document imaging system scenarios. This gives the headsup to consider the growth and overhead factors when determining how much content will eventually be stored in a given content database. You must be able to ensure that one site collection cannot use so many resources that other site collections can no longer function.

Windows SharePoint Services allows you to specify quota for site collections, so that you can manage your site and server resources. When you are planning the taxonomy and information architecture of your sites, a good amount of effort and thought should be put into Site Quotas.

Site Quotas are a key component for SharePoint Administrators to be able to properly manage and administer data restrictions on your site collections. SharePoint stores site content within a SQL Server database. Therefore, if you want to optimize SharePoint storage architecture, you have to focus on SQL Server. fortunately, Microsoft provides several Performance Monitors you can use with SQL Server. Great!
The content/data in the Recycle Bin is also included in the Quota space (it’s part of the site).

Site Collection Quotas and Quota templates are managed in Central Administration Microsoft suggests that the optimal maximum content database size should be 50GB. They alsorecommend that for most SharePoint implementations, the solution should not include any contentdatabases larger than 100GB. This is commonly referred to as the “100GB content database sizelimitation”. In fact, this is not a true limitation but rather a recommendation. SQL Server databases have beenscaling far beyond 100GB for years now.

Practically speaking, the recommendation is based primarilyon two significant factors:

1. Service Level Agreement (SLA) requirements for a given organization may dictate that backupoperations for the SharePoint databases must be executable in a limited amount of time. Thesize of the content databases will have a direct impact on how long it takes to execute thatbackup.

2. The storage subsystem must be robust enough to handle the disk I/O requirements of theSharePoint solution that it serves.

As long as a given organization is able to mitigate these two considerations, then the contentdatabases can be allowed to grow. Real world implementations have seen successful SharePointdeployments that have implemented database sizes of 100GB, 150GB, 200GB, 250GB, 300GB,350GB and 400GB. In order to mitigate the SLA requirements, a solution might employ a high speed disk-to-disk backupsolution, database mirroring, or database log shipping to a disaster recovery site.

Some tips and tricks to mitigate:

1. Add tempdb(space for the intermediate data) data files such that the number of data files equals the number of CPU corespresent in the SQL Server machine. Each data file (primary .mdf and additional .ndf(s)) shouldbe equal in size.
The size of each data file can be calculated using this formula:


2. once the raw file storage quantities are determined, the following formula may be used to estimate databaseoverhead:

Database Overhead Formula:
Low: 1.2 * [Raw Storage Size] = ContentDB Size
High: 1.5 * [Raw Storage Size] = ContentDB Size

3. Use the following formula to calculate how much disk space needed for the search database:

GB of disk space required = Total_Content_Size (in GB) x File_Size_Modifier x 4
where File_Size_Modifier is a number in the following range, based on the average size of the files inyour corpus:

• 1.0 if your content consists of very small files (average file size = 1KB).
• 0.12 if your content consists of moderate files (average file size = 10KB).
• 0.05 if your content consists of large files (average file size 100KB or larger)

4. Place the content database file(s) on RAID 10 logical unit(s) if at all possible.

5. Place the search database log file on its own logical unit.

Some SQL gotchas:

Move tempdb
After executing this script and restarting SQL Server, the tempdb data and log files willbe initialized in the new location.

USE master
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = ‘[New data folderpath]\tempdb.mdf')
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = ‘[New log folderpath]\templog.ldf')
(Don't forget to Restart SQL Server)