<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>StoverEffect &#187; Metadata</title>
	<atom:link href="http://stovereffect.com/tag/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://stovereffect.com</link>
	<description>John Stover. Entrepreneur. Consultant. Author. Speaker. Mentor. Strategist. Expert.</description>
	<lastBuildDate>Mon, 27 Jun 2011 18:00:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>What is in your Website Search?</title>
		<link>http://stovereffect.com/2010/08/03/what-is-in-your-website-search/</link>
		<comments>http://stovereffect.com/2010/08/03/what-is-in-your-website-search/#comments</comments>
		<pubDate>Tue, 03 Aug 2010 17:40:10 +0000</pubDate>
		<dc:creator>John Stover</dc:creator>
				<category><![CDATA[Gadgets]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Licensing]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Search]]></category>

		<guid isPermaLink="false">http://stovereffect.com/2010/08/03/what-is-in-your-website-search/</guid>
		<description><![CDATA[Original Photo by JohnStover There are literally hundreds and thousands of ‘search engines’ out there. Some of these search engines are for finding stuff on the Internet, like Google, Bing and Yahoo. Some search engines are more specialized, like the search box you see on a single web site that searches only that single website. [...]]]></description>
			<content:encoded><![CDATA[<div style="margin-bottom: 10px; float: right; margin-left: 10px"><a title="photo sharing" href="http://www.flickr.com/photos/stovereffect/4791662667/"><img style="border-bottom: #000000 2px solid; border-left: #000000 2px solid; border-top: #000000 2px solid; border-right: #000000 2px solid" alt="" src="http://farm5.static.flickr.com/4095/4791662667_c696ef8dbf_m.jpg" /></a>    <br /><span style="margin-top: 0px; text-align: center; font-size: 0.9em">
<p align="center"><font size="1">Original Photo by </font><a href="http://www.flickr.com/people/stovereffect/"><font size="1">JohnStover</font></a></p>
</span></div>
<p>There are literally hundreds, if not thousands, of ‘search engines’ out there. Some are for finding things across the Internet, like Google, Bing and Yahoo. Others are more specialized, like the search box on a single website that searches only that website. Search is an incredibly complex topic, and an astounding number of factors contribute to finding the one important piece of content you are looking for. Frankly, Google has spoiled all of us. I expect to find exactly what I’m looking for among the millions of pages all over the Internet by simply typing a single word into a single little box. If I don’t find what I want on the first page of results, I might change my search a little or add a word or two, but I won’t keep trying for long.</p>
<p>The Internet contains <strong>at least 27.5 billion pages</strong> as of Tuesday, 3 August 2010, according to <a href="http://www.worldwidewebsize.com">http://www.worldwidewebsize.com</a>. Not only do I expect to find exactly what I want on the Internet, but when I use the search on your website, I get EXTREMELY frustrated if it doesn’t find exactly what I want, when I want it. How is this possible? I know what I want is on your website somewhere. Figure out what I want and show it to me! And please do it in under a second if it’s not too much trouble!</p>
<p>In the beginning, search was simple. Search was based on keyword matching. If I typed in a keyword, the ‘search engine’ scanned the content and found instances of that word and showed me hyperlinks with those results. I could search for ‘blog’ and the search would show me any page that had the word ‘blog’ in it. That was perfect! It’s all anyone needed. Then websites started to grow in complexity. Soon, each website had thousands of pages. If I did a simple keyword search, I would get hundreds of results. This wasn’t useful anymore. Search had to get better.</p>
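<p>To make that concrete, here is a minimal sketch of keyword matching in Python. The pages and URLs are made up for illustration, and the match is a plain case-insensitive substring test, which is roughly what those early searches did.</p>

```python
# Hypothetical sample pages; the URLs and text are invented for this example.
PAGES = {
    "/about": "About this blog and its author",
    "/posts/sharepoint": "A blog post about SharePoint search",
    "/contact": "Contact the author by email",
}

def keyword_search(pages, keyword):
    """Return the URLs of pages whose text contains the keyword.

    This scans every page on every search, which is why it stops
    being practical once a site grows to thousands of pages.
    """
    needle = keyword.lower()
    return sorted(url for url, text in pages.items() if needle in text.lower())

print(keyword_search(PAGES, "blog"))
```

<p>Every page containing the word comes back, with nothing to say which result matters most.</p>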
<p>Then search made major improvements. Boolean search operators were introduced. I could search for “SharePoint AND WordPress”. I could search for “SharePoint NOT WordPress”. I had some control over exactly what I was searching for. I also got search result sorting. I could sort all of the results to see the most recently created pages at the top. After all, if the page was newer, then it clearly was more relevant, right?</p>
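<p>A rough Python sketch of those two improvements follows. The page list, dates and helper names are all hypothetical; real engines parse a full query language rather than exposing one function per operator.</p>

```python
from datetime import date

# Hypothetical pages for illustration: (url, text, date created).
PAGES = [
    ("/compare", "SharePoint and WordPress compared", date(2010, 5, 1)),
    ("/basics", "SharePoint search basics", date(2010, 7, 12)),
    ("/themes", "WordPress themes", date(2010, 8, 3)),
]

def has_word(page, word):
    """True if the page text contains the word (case-insensitive, whole word)."""
    return word.lower() in page[1].lower().split()

def search_and(pages, first, second):
    """Pages that contain both words."""
    return [p for p in pages if has_word(p, first) and has_word(p, second)]

def search_not(pages, wanted, unwanted):
    """Pages that contain the first word but not the second."""
    return [p for p in pages if has_word(p, wanted) and not has_word(p, unwanted)]

def newest_first(results):
    """Sort results so the most recently created page comes first."""
    return sorted(results, key=lambda p: p[2], reverse=True)

def urls(results):
    return [p[0] for p in results]

print(urls(search_and(PAGES, "SharePoint", "WordPress")))
print(urls(search_not(PAGES, "SharePoint", "WordPress")))
print(urls(newest_first(p for p in PAGES if has_word(p, "SharePoint"))))
```

<p>Notice that sorting by date says nothing about how well a page answers the query; it only assumes that newer is better.</p>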
<p>That statement introduces a very important topic: RELEVANCE. Relevance denotes how well the results meet the need of the user searching; see the all-knowing Wikipedia for more details at <a href="http://en.wikipedia.org/wiki/Relevance_(information_retrieval)">http://en.wikipedia.org/wiki/Relevance_(information_retrieval)</a>. Relevance is determined by the search algorithm. That’s right; a computer programmer wrote a mathematical formula that uses the available information to determine the relevance of the content to your search word. In reality, that algorithm was written by a very large team of programmers, analysts, mathematicians, executives and many others. And the search is getting more complicated and far better every day.</p>
<p>Most modern search engines are composed of two primary components: the INDEX and the QUERY. The index is just like the index at the back of a book. Rather than scanning all of the content in real time for every search, the search engine builds one big index of the content ahead of time and looks words up in it, which is much faster. Furthermore, the index can be optimized for the type(s) of searches being performed. Each search engine indexes only what it is responsible for. Your website search is responsible for searching your website. Facebook search searches Facebook: the profiles, comments, photos, tags, etc. Google and Bing try to search everything: your website, my website, her website, their website. Your website search should search ALL of your content: web pages, HTML, PDF files, Word docs, PowerPoint files, Excel files, images, comments. The index should include ALL of your content.</p>
<p>So how is the index built? Indexes are usually built by a <a href="http://en.wikipedia.org/wiki/Web_crawler">Web crawler</a>, a type of automated software that follows all of the links and reads all of the content on your site. The indexer uses a word breaker to split the content into individual words. In English, many characters break words apart: spaces, hyphens, periods, colons, semicolons and exclamation points all separate words. With multilingual content the story gets even more complicated, because other languages don’t use the same characters. So the crawler goes through all of the content and builds this enormous index for use in queries. The index contains the words, counts, metadata, information about where the words were found, information about the pages, information about the documents, titles, cached portions of pages and much more.</p>
<p>When a user enters a query, the search engine uses its algorithm to provide the most relevant information possible. What determines relevancy? There are many factors that should determine relevancy…</p>
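<p>The book-index idea can be sketched in a few lines of Python. This is a toy inverted index over made-up documents, not how any particular product stores its index: each word maps to the set of documents that contain it, so a query becomes a dictionary lookup instead of a scan.</p>

```python
from collections import defaultdict

# Hypothetical documents keyed by an invented file name or path.
DOCUMENTS = {
    "home.html": "welcome to our association website",
    "agenda.pdf": "annual conference agenda and schedule",
    "signup.html": "sign up for the annual conference",
}

def build_index(documents):
    """Build the index once: map every word to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def query(index, word):
    """Answer a query with a single lookup in the prebuilt index."""
    return sorted(index.get(word.lower(), set()))

INDEX = build_index(DOCUMENTS)
print(query(INDEX, "conference"))
```

<p>The cost of reading the content is paid once at indexing time, which is what makes sub-second answers possible at query time.</p>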
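<p>Here is a simplified Python sketch of a word breaker for English, assuming only the separators listed above count as breaks. It also records where each word was found, which is one of the pieces of information an index keeps. The function names and the sample page are hypothetical.</p>

```python
import re

# Characters that separate words in this sketch: whitespace, hyphens,
# periods, colons, semicolons and exclamation points.
WORD_BREAKERS = re.compile(r"[\s\-.:;!]+")

def break_words(text):
    """Split text into lowercase words, dropping empty pieces."""
    return [w.lower() for w in WORD_BREAKERS.split(text) if w]

def index_page(url, text, index):
    """Record every word of a page together with its position on that page."""
    for position, word in enumerate(break_words(text)):
        index.setdefault(word, []).append((url, position))

index = {}
index_page("/post", "Stop! Multi-lingual content: it gets complicated.", index)
print(break_words("Stop! Multi-lingual content: it gets complicated."))
print(index["content"])
```

<p>A word breaker like this would fail on languages that do not separate words with these characters, which is exactly why multilingual indexing is harder.</p>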
<ul>
<li><strong>Content Type</strong>. What type of content is the word found on? PowerPoint files typically have fewer words. If your keyword is one of the 20 words on a slide, that file is likely more relevant than a Word document or web page that has 2000 words.</li>
<li><strong>Location</strong>. If your keyword is found on the homepage or main landing page, it is likely more relevant than if it is found on a page 30 nodes away through some obscure navigation.</li>
<li><strong>Popularity and linking</strong>.&#160; How popular is the page? How many other pages and documents link to the page? How frequently is the page visited?</li>
<li><strong>Analytics</strong>.&#160; How frequently is the page visited with similar queries? If 50 other people searched for the same keyword(s) you searched for, which pages did they eventually go to?</li>
<li><strong>Words</strong>. How many times is the keyword on the page?&#160; How many</li>
<li><strong>Metadata</strong>. Is your keyword in the metadata or just the main content area? Is your keyword in the page title?</li>
<li><strong>Language Detection</strong>. Is my browser set to Spanish? Should documents in Spanish show up with a higher ranking in the search results?</li>
<li><strong>Variants</strong> (Word Stemming). What if I search for the word “Flying”? Should the search engine also search for Fly and Flew and Flown? What if it’s a different language? Should the search engine be aware of other word variations?</li>
<li><strong>Human Influence</strong>. What about best bets, synonyms and keyword mapping? If someone is on the Association site and searches for the word <em>Meeting</em>, do you want to artificially influence the search results to show ‘Sign up for the Annual Conference’ as the first result?&#160; I bet the conference organizers do!</li>
</ul>
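<p>To show how a few of these factors might combine into one ranking, here is an illustrative Python sketch. The weights, the variant table and the sample pages are assumptions made up for this example, not the formula of any real search engine. It covers keyword density (Content Type and Words), a title match (Metadata), distance from the homepage (Location), inbound links (Popularity) and a hand-written stand-in for word stemming (Variants).</p>

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    title: str
    body: str
    depth: int          # clicks away from the homepage
    inbound_links: int  # how many other pages link here

# Hand-written variant table standing in for real word stemming.
VARIANTS = {"flying": {"flying", "fly", "flew", "flown"}}

def score(page, keyword):
    """Combine several relevance factors into one number (weights are arbitrary)."""
    terms = VARIANTS.get(keyword.lower(), {keyword.lower()})
    body_words = page.body.lower().split()
    in_title = bool(terms.intersection(page.title.lower().split()))
    hits = sum(1 for w in body_words if w in terms)
    if hits == 0 and not in_title:
        return 0.0  # the keyword does not appear at all
    density = hits / max(len(body_words), 1)  # few words, one hit: more relevant
    title_bonus = 2.0 if in_title else 0.0
    location = 1.0 / (1 + page.depth)         # closer to the homepage scores higher
    popularity = 0.1 * page.inbound_links
    return density + title_bonus + location + popularity

def rank(pages, keyword):
    """Return the URLs of matching pages, most relevant first."""
    scored = [(score(p, keyword), p.url) for p in pages]
    return [url for s, url in sorted(scored, reverse=True) if s]

PAGES = [
    Page("/news/old", "Archive", "he flew home", depth=4, inbound_links=0),
    Page("/flights", "Flying lessons", "learn to fly with us", depth=1, inbound_links=5),
    Page("/contact", "Contact", "email us", depth=1, inbound_links=9),
]
print(rank(PAGES, "Flying"))
```

<p>The popular contact page never shows up because it does not mention the keyword, while the page with the keyword in its title outranks the one buried deep in the archive.</p>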
<p>As you can see, the effectiveness of a search engine depends on its ability to determine relevance and then use that relevance to rank the search results. Modern search engines are available both built into and completely independent of your website content management technology. WordPress, for example, has a built-in search that is pretty simple (and thus largely ineffective).&#160; It’s great for finding a keyword, but I would hardly call it a search engine.&#160; Both Microsoft and Google provide real search solutions.&#160; They have solutions for you at every level: your desktop, your enterprise, your website, and the Internet.&#160; We are focusing primarily on your website and to a lesser extent your enterprise. The <a href="http://www.google.com/enterprise/search/gsa.html">Google Search Appliance</a> is a great solution with excellent relevancy that can be customized for your particular website’s needs. The Google Search Appliance and Google Mini require annual maintenance fees.</p>
<p>Microsoft provides a free search solution for your website and for the enterprise. That’s right; Microsoft provides enterprise-level search capabilities for <b>FREE</b>. <a href="http://www.microsoft.com/enterprisesearch/en/us/search-server-express.aspx">Microsoft Search Server 2010 Express</a> provides the search capabilities described in this overview for FREE. While this solution may not be the perfect fit for every website, I think it is at least worth evaluating. You can download the software for free, install it, and configure it in a matter of minutes. If it works for you, implementing it with your website is as simple as replacing the search box.</p>
<p>    <br clear="all" /></p>
]]></content:encoded>
			<wfw:commentRss>http://stovereffect.com/2010/08/03/what-is-in-your-website-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SharePoint Folders vs Metadata&#8230; the ultimate battle!</title>
		<link>http://stovereffect.com/2010/07/12/metadata-versus-folders/</link>
		<comments>http://stovereffect.com/2010/07/12/metadata-versus-folders/#comments</comments>
		<pubDate>Tue, 13 Jul 2010 01:40:18 +0000</pubDate>
		<dc:creator>John Stover</dc:creator>
				<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[SharePoint 2010]]></category>

		<guid isPermaLink="false">http://stovereffect.com/?p=157</guid>
		<description><![CDATA[SharePoint has extremely robust content storage capabilities.  Being such a robust framework, there are no “wrong ways” to use SharePoint.  Sure, I’ve seen SharePoint poorly implemented, but that actually speaks to the capabilities of the platform.  Due to the feature rich toolset, there are literally hundreds of ways to configure and use SharePoint.  Some are [...]]]></description>
			<content:encoded><![CDATA[<p>SharePoint has extremely robust content storage capabilities. Being such a robust framework, there are no “wrong ways” to use SharePoint. Sure, I’ve seen SharePoint poorly implemented, but that actually speaks to the capabilities of the platform. Due to the feature-rich toolset, there are literally hundreds of ways to configure and use SharePoint. Some are great, some not so good. That is a primary reason behind the concept of best practices. Unfortunately, best practices are generally taken as the only way to do something in a technology platform, but in reality these best practices are usually just prescriptive guidance based upon experience, usability, functionality, and performance.</p>
<p>So what is the best practice for document libraries with regard to folders? Do you use folders or not? Here is my prescriptive guidance…</p>
<p><img style="display: block; float: none; margin-left: auto; margin-right: auto;" src="http://upload.wikimedia.org/wikipedia/en/thumb/4/4b/Godzilla-megalon-us.jpg/200px-Godzilla-megalon-us.jpg" alt="" /></p>
<p>Using folders is such a great concept that the idea largely hasn’t changed since the advent of paper. In fact, even <a href="http://en.wikipedia.org/wiki/Multics" target="_blank">Multics</a> utilized the concept of folders in the 1960s. The idea of using folders is simple: store related content items close together to make them easier to find when you need them. In fact, I use folders all the time at home. I have an entire file folder cabinet that I use to store papers in their relevant folders. I have folders for bills, folders for tax info, folders for warranty information, etc. I use these folders out of necessity because the content that I store in them is physical, not digital.</p>
<p>Folders have persevered through nearly all versions of computing, from websites to mobile devices. Does it make sense to keep doing something just because that’s the way we’ve always done it? Folders may be easy to understand and explain, but are they really the best use of technology?</p>
<p>I don’t think so.  I think folders are an antiquated way of storing and retrieving content, and I’m not alone in this.  Google agrees with me.  Yes, the multi-billion dollar organization has a singular hive mind – and this massive mind agrees with me.  Don’t believe me?  Gmail doesn’t have folders.  Gmail has labels.</p>
<p>Labels, tags, keywords and metadata are terms that people use interchangeably. Labels can be applied to any piece of content to help describe the content item. Most things you purchase have labels: food, clothing, autos, computers, and even mobile devices. They all come with attached labels. Labels can also be attached to content. For example, if I upload a video of my child swimming and title it “John’s kids at the beach”, you have no idea from the title alone that it is a video about a 7 year old child learning to swim to a floating dock. This is where adding labels to describe the video can help. I will likely add labels with my child’s name, and then some very specific labels, such as Learning, Dock, Ocean City, MD, Swimming, etc. This enables me to go back and find videos at a later date based on a variety of criteria. I could easily find all videos with that particular child. I could easily find all videos marked as Ocean City. I could easily find all videos that were specifically about Summer 2010. These labels will also help other people locate the information that they are seeking.</p>
<p>Can you do this with folders?  What folders would you create?  If I create a folder for each child, then there is no way to group by activities.  If I create a folder for each type of activity, then there is no way to group by child.  A major difference between folders and labels is that each piece of content can only exist in a single folder but can be marked with many labels.</p>
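<p>The difference is easy to see as data structures. In this small Python sketch the file names and labels are made up: the folder model stores exactly one folder per video, while the label model stores a set of labels per video.</p>

```python
# Folder model: each video lives in exactly one folder.
folder_of = {
    "dock_swim.mp4": "Swimming",
    "sandcastle.mp4": "Ocean City",
}

# Label model: each video carries as many labels as I care to add.
labels_of = {
    "dock_swim.mp4": {"Swimming", "Learning", "Dock", "Ocean City, MD"},
    "sandcastle.mp4": {"Beach", "Ocean City, MD"},
}

def in_folder(folder_of, folder):
    """Videos stored in the given folder."""
    return sorted(name for name, f in folder_of.items() if f == folder)

def with_label(labels_of, label):
    """Videos marked with the given label."""
    return sorted(name for name, labels in labels_of.items() if label in labels)

# The swimming video was filmed in Ocean City, but the folder model cannot say so.
print(in_folder(folder_of, "Ocean City"))
print(with_label(labels_of, "Ocean City, MD"))
print(with_label(labels_of, "Swimming"))
```

<p>With folders I have to pick one way of grouping up front; with labels the same video shows up under every grouping that applies to it.</p>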
<p>SharePoint supports both folders and labels (though in SharePoint labels are called metadata and columns).  So which should you use?  I think the answer is clear: use metadata.  Though they are definitely not mutually exclusive, here are some other good reasons to use metadata INSTEAD of folders.</p>
<ol>
<li>Metadata can be used to create views.  Sure, views can be created within a folder as well.  But views cannot span 2 folders.</li>
<li>Metadata can be a required property.  In SharePoint, you cannot dictate which folder items get stored in.  You can dictate that uploaded content will be classified by as many properties as you see fit.</li>
<li>Folders do not give you ‘counts’ of how many items they contain until you open them.  With metadata, you can easily see counts in grouping, views, etc.</li>
<li>Any single content item can have as many pieces of metadata as you wish, thus being shown in as many views.  However, content cannot exist in multiple folders.</li>
<li>In SharePoint, folders make unnecessarily long and complicated URLs (and don’t forget the URL length is still limited).</li>
<li>Updating a single column to change the metadata of an item is easy.  Moving content from one folder to another requires more thought.</li>
<li>Navigating through a folder hierarchy is only efficient for the people who know the entire folder hierarchy.  After about 2 weeks, this is no one.</li>
<li>SharePoint 2010 allows you to modify the navigation to leverage metadata and content types.  You do not have to utilize the giant collapsible tree of folders that is inherent within Windows Explorer.</li>
</ol>
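<p>A few of these points (required properties, views and counts) can be illustrated with plain Python. To be clear, this is not SharePoint code or its API; the column names and items are hypothetical and the sketch only models the idea of items described by columns.</p>

```python
from collections import Counter

REQUIRED_COLUMNS = ("Department", "DocType")  # hypothetical required columns

def validate(item):
    """Reject an upload that is missing any required metadata."""
    missing = [c for c in REQUIRED_COLUMNS if not item.get(c)]
    if missing:
        raise ValueError("missing required metadata: " + ", ".join(missing))
    return item

def view(items, **filters):
    """A 'view' is just a filter over metadata columns."""
    return [i for i in items if all(i.get(k) == v for k, v in filters.items())]

def counts_by(items, column):
    """Item counts per value of a column, without opening anything."""
    return Counter(i[column] for i in items)

LIBRARY = [
    validate({"Name": "budget.xlsx", "Department": "Finance", "DocType": "Report"}),
    validate({"Name": "forecast.xlsx", "Department": "Finance", "DocType": "Plan"}),
    validate({"Name": "roadmap.pptx", "Department": "IT", "DocType": "Plan"}),
]

print([i["Name"] for i in view(LIBRARY, DocType="Plan")])
print(counts_by(LIBRARY, "Department"))
```

<p>The same three items can be viewed by department or by document type, and the counts come straight from the metadata.</p>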
<p>Of course, you will still run into folders in SharePoint. In fact, SharePoint 2010 has many new enhancements around using folders. Plus, folders are comfortable. Some people will mention view limits in SharePoint as a reason for folders. SharePoint 2010 throttling makes this argument go away. Some people will still stand by organization. Other people will say that security is a reason to use folders. While it’s true that you can put security on a folder (and thus the items within the folder), managing security at the subfolder level is both time-consuming and a management headache. It is much easier to manage security at the library/list/site level, as typical best practices would prescribe. I mean, you have item-level security too, but who wants to manage security at the item level? This is an exception and not the rule.</p>
<p>Am I saying that I avoid folders where possible?  Yes.  Am I saying that there is no place for folders?  No.  Folders can still be an effective tool if used correctly.  Are folders and metadata mutually exclusive?  Of course not!  Even if you elect to use folders, you should still use an effective metadata structure.</p>
<p>Please wield this powerful folder weapon wisely&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://stovereffect.com/2010/07/12/metadata-versus-folders/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

