Overview: This post aims to provide guidelines for building SharePoint 2013 Search farms. There are 6 Search components (labelled C1-C6 below) and 4 database types (labelled DB1-DB4). Index partitions are a big factor in search planning.
Example: Throughout this post I provide an example of a 60 million item search farm with redundancy/High Availability (HA).
Index partitions: Add 1 index partition per 10 million items. This is the MS recommendation, but the right number really depends on IOPS and how query is used. A twinned partition (partition column) is needed for HA, and it will also improve query time over a single partition.
Example: Assuming a max of 10 million items per partition, an HA farm for 60 million items requires 6 partitions.
Index component (C1): 2 index components for each partition.
Example: 12 index components.
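The partition and index component arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb in this post (10 million items per partition, 2 index components per partition for HA); the function and constant names are my own and are not part of any SharePoint API.

```python
# Rule-of-thumb figures taken from this post; adjust for your own IOPS and query load.
ITEMS_PER_PARTITION = 10_000_000
INDEX_COMPONENTS_PER_PARTITION = 2  # twinned for HA


def index_sizing(total_items):
    """Return (partitions, index_components) for a given item count."""
    # Integer ceiling division: a partially filled partition still counts as one.
    partitions = max(1, -(-total_items // ITEMS_PER_PARTITION))
    return partitions, partitions * INDEX_COMPONENTS_PER_PARTITION


partitions, index_components = index_sizing(60_000_000)
print(partitions, index_components)  # 6 12
```

For the 60 million item example this gives 6 partitions and 12 index components, matching the figures above.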
Query component (C2): Use 2 query processing components for HA/redundancy, and add a further 2 query processing components once the farm grows past 80 million items.
Example: 2 Query components.
Crawl database (DB1): Use 1 crawl database per 20 million items. This is probably the most commonly overlooked item in search farms. The crawl database contains tracking and historical information about the crawled items, such as the last crawl id, crawl time and crawl history. The crawl component feeds into the crawl database. With medium usage each crawl database should stay under 100GB. Add another crawl database before you reach 20 million items or 100GB in a single database.
Example: 3 crawl databases at 20 million items each allows for a search farm containing 60 million items.
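As a rough sketch, the two crawl database limits (20 million items and 100GB per database) can be combined by taking whichever limit asks for more databases. The function name and the estimated_total_gb parameter are hypothetical, and the size estimate is something you would have to supply from your own farm.

```python
# Limits quoted in this post for a single crawl database.
ITEMS_PER_CRAWL_DB = 20_000_000
MAX_CRAWL_DB_GB = 100


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def crawl_databases(total_items, estimated_total_gb=0):
    """Crawl databases needed to stay under both the item and the size limit."""
    by_items = ceil_div(total_items, ITEMS_PER_CRAWL_DB)
    by_size = ceil_div(estimated_total_gb, MAX_CRAWL_DB_GB)
    return max(1, by_items, by_size)


print(crawl_databases(60_000_000))       # 3
print(crawl_databases(60_000_000, 450))  # 5, the size limit dominates here
```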
Link database (DB2): Use 1 link database per 60 million items. I believe 1 link database will handle up to 100 million items.
Example: 1 link database.
Analytics reporting database (DB3): Add 1 search analytics reporting database for each 500,000 unique items viewed each day, or for every 10-20 million total items. This is the heavy search database. Add a new database to keep each Analytics reporting database under roughly 250GB.
Example: Start with 1 and grow as needed.
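To get a feel for how far "grow as needed" might go, the two rules above can be turned into a low/high planning range. This is a sketch under my own assumptions: I treat the "or" as "whichever rule asks for more databases", and I use both ends of the 10-20 million range. You would still start with 1 database and add more as the 250GB limit approaches.

```python
UNIQUE_VIEWS_PER_DB = 500_000        # unique items viewed per day, per database
ITEMS_PER_DB_RANGE = (20_000_000, 10_000_000)  # optimistic and conservative ends


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def analytics_db_range(unique_items_viewed_per_day, total_items):
    """Return (low, high) estimates of analytics reporting databases to plan for."""
    by_views = ceil_div(unique_items_viewed_per_day, UNIQUE_VIEWS_PER_DB)
    estimates = [
        max(1, by_views, ceil_div(total_items, items_per_db))
        for items_per_db in ITEMS_PER_DB_RANGE
    ]
    return min(estimates), max(estimates)


# 60 million items with 400,000 unique items viewed per day (made-up usage figure).
print(analytics_db_range(400_000, 60_000_000))  # (3, 6)
```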
Analytics Processing Component (C3):
Content Processing Component (C4): Processes crawled items and moves the item data to the index component. Its function is to parse documents, perform property mapping and entity extraction, perform language processing, and ultimately turn crawled items into indexed items.
Example: 4 Content Processing components.
Admin component (C5): Use 1 search administration component, or 2 for redundancy/HA. This applies to all farm sizes.
Example: 2 Admin components.
Admin database (DB4): Low usage. Even in big farms you only need 1 database, and it should stay well under 100GB.
Example: 1 Admin database.
Crawl component (C6): The crawl component crawls content sources and delivers crawled items, including metadata, to the Content Processing component. In SP2013 you don't specify the relationship between the crawl database and the crawl component; the crawl component will distribute to all available crawl databases. The 3 types of crawl available in SP2013 are Full, Incremental and Continuous (Continuous only works for SP2013 content). Schema changes still require a full crawl to pick up the change in SP2013. Crawl does not do as much analysis as it did in SP2010, so it is a much lighter and faster process.
Example: 2 Crawl components allow for HA and improved performance.
Database Hardware: For the example use 8 CPUs and 16GB of RAM. Disk size depends on content, but it is smaller than in SP2010.
Placing components on VMs for the example:
Group your search roles onto servers:
- Index & Query Processing
- Analytics & Content Processing
- Crawl, Content processing & Search Admin
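The grouping above can be written down as a simple mapping and checked so that no component is left without a server. This is only a planning sketch in Python; the group and component names are labels taken from this post, not identifiers from SharePoint itself.

```python
# The six search components (C1-C6) described in this post.
ALL_COMPONENTS = {
    "Index", "Query Processing", "Analytics Processing",
    "Content Processing", "Crawl", "Search Admin",
}

# Server role groups from the list above.
SERVER_GROUPS = {
    "Index & Query Processing": {"Index", "Query Processing"},
    "Analytics & Content Processing": {"Analytics Processing", "Content Processing"},
    "Crawl, Content Processing & Search Admin": {
        "Crawl", "Content Processing", "Search Admin",
    },
}


def unplaced_components(groups):
    """Components that no server group hosts."""
    placed = set().union(*groups.values())
    return ALL_COMPONENTS - placed


def hosts_of(component, groups):
    """Names of the server groups that host a given component."""
    return sorted(name for name, comps in groups.items() if component in comps)


print(unplaced_components(SERVER_GROUPS))             # set()
print(hosts_of("Content Processing", SERVER_GROUPS))  # two groups host it
```

Content Processing shows up in two groups, which is how the example reaches more content processing components than any single group would provide.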
More Info:
Troubleshooting Crawl