What a long title for a post, right? And with the current trend toward scripted installs, why would anyone in their right mind manually configure anything in SharePoint? The truth of the matter is that I didn’t plan on configuring the Search Topology manually. In fact, this manual configuration was done as part of my poorly scripted configuration. In my experience, real lessons rarely come from planning. The real lessons come from life not going quite as planned…
SharePoint 2010 is great, but what are the Top 5 enhancements or new features that are available in SharePoint 2010? Everyone who works with SharePoint will have their own Top 5 lists.
One of the key potential uses of Search Server 2010 Express is to provide a great search engine for your existing public facing website. I work with a lot of different associations that run a lot of different CMS platforms. While I’m a huge fan of utilizing the CMS capabilities of SharePoint 2010 for a variety of reasons, there isn’t a single platform that is right for everyone. There isn’t a single auto make and model for everyone, and there isn’t a single pair of shoes that will work for everyone, so why would the CMS industry be any different? However, a powerful search IS relevant to everyone (pun intended!).
In Part 1, we walked through a generic install. Once you have Search Server 2010 Express up and running, it is extremely simple to configure a new content source. If you are jumping directly from the vanilla install, you should see a screen that will link you directly to the Search Administration page.
If you are jumping straight into Central Admin, you can reach the Search Administration page from Application Management: click Manage Service Applications, and then click Search Service Application. While the concept of Service Applications is beyond the scope of this particular post, know that in larger environments (such as SharePoint 2010) you can run multiple Search Service Applications.
In the left nav, under Crawling, click Content Sources. You will be taken to the Manage Content Sources page. You can use this page to add, edit, or delete content sources, and to manage crawls.
Before we go any further, what is a Content Source? For that matter, what is Content? In the context of Microsoft SharePoint and Search Servers, Content is any item that can be indexed. This can be HTML, a Web page, a Microsoft Office Word document, a text file, a PDF file, business data, or even an e-mail message. Content lives somewhere, such as a Web site, a file share, a Notes database, a SQL database, or a SharePoint site. A Content Source specifies the settings that define what content should be indexed and on what schedule it should be crawled.
You should notice on the Manage Content Sources page that there is at least one Content Source already defined: Local SharePoint sites. Because we used the wizard to manage the install in Part 1, all local SharePoint sites are already defined as a Content Source.
In order to create a new Content Source (such as our external site), click New Content Source at the top. You will see the Add Content Source page:
Content Source Name – A title that you are giving as a reference to manage this Content Source.
Content Source Type – The type of content that you will be crawling. This is an important setting because it tells the crawler not only what type of content is located there, but also how to communicate with the Content Source. For example, communicating with a File Share uses a completely different protocol than communicating with a web site. The default types of Content Sources supported are listed here. Note that I said ‘default’. You can work with vendors or write your own custom interface to crawl and index content types not supported out of the box. Also note that if you select a different type, the Crawl Settings change to show the details specific to that type of Content Source.
Start Addresses – the URLs from which the search system should start crawling. For SharePoint sites and Web sites, these are traditional URLs. For File Shares, these will be UNC paths that are accessible from the server. You can supply more than one Start Address for a Content Source. If, for example, I wanted a single Content Source to manage the various SusQtech websites that I am crawling, I could add http://www.susqtech.com/, http://www.sharepointacademy.org, http://www.sharepointconference.org, and http://www.thesug.org. I can then manage all of these URLs as a single Content Source. I could also opt to create multiple Content Sources so that I can manage each of the crawl schedules and details independently.
Crawl Settings – used to specify the behavior of crawling for this Content Source.
Crawl Schedules – used to schedule the crawls for this Content Source. This allows you to configure 2 different crawl schedules: full and incremental. Why would you ever want an incremental instead of a full? Incremental crawls are supposed to only crawl content modified since the last crawl and thus take less bandwidth, server memory, and CPU cycles. I typically configure these schedules with a Full crawl on the off hours on the weekend and Incremental crawls every night during the week. Keep in mind that you may need more frequent incremental crawls – such as every hour for your public facing website if you are continuously adding new content.
Content Source Priority – normal or high. The crawler will prioritize ‘high’ items when you have multiple content sources that must be crawled.
Start Full Crawl – a checkbox to start a full crawl immediately.
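To make the fields above a little more concrete, here is a minimal Python sketch of how they relate to each other. This is purely illustrative and is not a SharePoint or Search Server API: the ContentSource class, the "web" and "file_share" type names, the items_to_crawl and crawl_order helpers, and the \\fileserver\docs share are all hypothetical names invented for this example. It checks that Start Addresses match the Content Source Type, shows the idea behind full versus incremental crawls, and orders sources by priority.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional
from urllib.parse import urlparse


@dataclass
class ContentSource:
    """Hypothetical model of the Add Content Source form (not a real API)."""
    name: str
    source_type: str  # "web" or "file_share" in this sketch
    start_addresses: List[str] = field(default_factory=list)
    high_priority: bool = False

    def invalid_addresses(self) -> List[str]:
        """Return start addresses whose form does not fit the source type."""
        return [a for a in self.start_addresses if not self._matches(a)]

    def _matches(self, address: str) -> bool:
        if self.source_type == "web":
            parsed = urlparse(address)
            return parsed.scheme in ("http", "https") and bool(parsed.netloc)
        if self.source_type == "file_share":
            # A UNC path looks like \\server\share
            if not address.startswith("\\\\"):
                return False
            parts = address[2:].split("\\")
            return len(parts) >= 2 and all(parts[:2])
        return False


def items_to_crawl(modified: Dict[str, datetime],
                   last_crawl: Optional[datetime],
                   full: bool) -> List[str]:
    """Full crawl: every item. Incremental: only items changed since last crawl."""
    if full or last_crawl is None:
        return sorted(modified)
    return sorted(k for k, ts in modified.items() if ts > last_crawl)


def crawl_order(sources: List[ContentSource]) -> List[ContentSource]:
    """High-priority sources first; the stable sort keeps the rest in order."""
    return sorted(sources, key=lambda s: not s.high_priority)
```

For example, a "web" Content Source that accidentally includes a UNC path would report that path from invalid_addresses(), and an incremental run only returns the items whose modified time is later than the last crawl. The real crawler's change detection is more involved than a timestamp comparison; this only illustrates why incremental crawls are cheaper.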
I am a huge fan of free software. I think that there are tons of great free software packages available: WordPress, Ubuntu, Microsoft SharePoint Foundation, Microsoft SQL Server 2008 R2 Express, Picasa, TeraCopy and many more. When most people hear FREE software, they hear Open Source. However, not all free software is open source. One of the absolute greatest free software packages available for you to use today is Microsoft Search Server 2010 Express. Any association, nonprofit, charity, school or company can use this software to significantly improve their search capabilities. Whether you are searching the S:\ drive internally or your existing public-facing website externally, Microsoft Search Server 2010 Express brings a lot of great capabilities to the table – for free.
This is just a series of screen caps of the vanilla install environment. Keep in mind the requirements for Search Server 2010 Express are similar to those of Search Server 2010 and SharePoint 2010: a 64-bit edition of Windows Server 2008 Standard, Enterprise, Datacenter, or Web Server with Service Pack 2 (SP2), or 64-bit Windows Server 2008 R2 (various flavors).
I’ve installed this on 64-bit Windows Server 2008 R2.
After downloading and launching the executable, you should see:
There are various links in the splash screen, but basically under Install, you can let the wizard install all of the prerequisites needed (including IIS). If you prefer, you can manually download and install all of the prereqs to ensure they are installed exactly how you want. See http://technet.microsoft.com/en-us/library/bb905370.aspx for details.
Even if you install the prereqs manually, it’s still a pretty good idea to run the wizard to validate your environment. The wizard will check that everything is right as rain before installing.
After accepting the Ts & Cs, the prereq wizard will run through…
Complete! The prerequisites are now installed (or validated) and you can run the actual Search Server 2010 Express install.
There are two modes for installing Search Server 2010 Express:
I’ve selected Stand-alone which should go through and configure absolutely everything I need to have Search Server 2010 Express running.
The actual install goes fairly quickly. The installation provides an opportunity to run the Configuration Wizard immediately or not.
With Search Server 2010 Express in Stand-alone mode, the configuration wizard is pretty straightforward.
Simple dialog making sure you know that some services may be restarted.
After clicking Yes, sit back and relax for a bit. I have a pretty fast virtual environment and the configuration screens take about ten minutes.
And…
After the configuration wizard finishes, you should automatically be taken to the Central Administration screen with a few steps listing how to begin configuration of your specific search implementation.
You should also notice that you now have some new Administrative shortcuts installed, namely a folder with shortcuts to the SharePoint 2010 Central Administration, SharePoint 2010 Products Configuration Wizard, and the SharePoint 2010 Management Shell and a second folder called Microsoft SQL Server 2008. What? SharePoint 2010? SQL Server 2008? But I thought I installed Search Server 2010 Express. Several Microsoft products utilize SharePoint as the interface for the applications. SharePoint uses SQL Server. SharePoint Foundation is free and provides a great user interface experience, security components, an application development framework, a deployment framework, and so much more. SQL Server 2008 Express is free and is, well, SQL Server – arguably one of the strongest database management platforms on the market. Once any developer learns to leverage the SharePoint framework, the time and effort required to write a web based application can be significantly shortened. After all, that’s what frameworks and APIs provide – the ability to leverage existing ‘stuff’ and not having to write everything from scratch every time.
Get more info directly from Microsoft.
Marketing site: http://www.microsoft.com/enterprisesearch/searchserverexpress/en/us/default.aspx
Download site: http://www.microsoft.com/enterprisesearch/searchserverexpress/en/us/download.aspx
TechNet site: http://technet.microsoft.com/en-us/enterprisesearch/ee263912.aspx#tab=1
While I wrote a very generic overview of search technology as a favor in a blog post today, it appears that I’m not alone in thinking about the history of search today. Xeni pointed out two other History of Search items today as well. It appears that my text heavy approach to the topic has been trumped by these great infographics!
Infographic by the PPC Blog.com
Infographic by WordStream Internet Marketing