Folders bad, metadata good!

Or is it meta-data? META-data? MetaData? Never mind.

Dustin Miller put together a nice summary of his experiences during his training sessions about organizing information and advocating the use of meta-data (meta data? ARGH!) over folders. I totally 100% agree with him and hope others follow this advice. Here are a few other reasons why you should just say no to folders.

  • There is a path limitation in SharePoint (or maybe IIS) of 260 characters in total. As you start creating the folder structure from hell, you'll find this gets used up very quickly and you end up staring at yet another cryptic SharePoint error message (basically "Something bad has happened") with no indication of what the problem is. Better yet, documents just vanish into the ether and you have no idea that they're really still there, tucked away and taking up precious space in your SQL Server, but you're unable to access them.
  • Ever need something and go looking for it? If you already know where it is, the complex organization system did nothing for you. If you don't, search might turn it up, but the vast majority of carbon-based units out there can't figure out how to search for something, so that's a bit of a waste. Folders only let you look for things the way the person who put the stuff there was thinking. If I had a brain fart and filed the asset records for 1997 under Financials -> 1997 -> Assets, would you think to look there or in Assets -> By Year -> 1997? Again, you need to know the organization structure to navigate it. Using folders for organization just compounds this, as we end up with deeply nested folders upon folders and nobody can find anything (even the people that put it there).
  • Folders are one-dimensional. Think of the Yellow Pages. I can open it up and look for a pizza place by looking in Pizza, Restaurants, or maybe even Dining. I don't need to know the section that I want to look in; I can find it in various ways. Everyone is wired differently and will look for things the way they were brought up. What if I was the owner of a parts company and decided to start organizing my inventory using SharePoint? Would I put my "X89 Widget" under "Aircraft -> DC9 -> Parts" or in "Parts -> By Aircraft -> Large"? I can only organize things one way with folders, and if I need to slice information up differently, I either end up with links all over the place (which will easily break and become out of date) or multiple copies of the same thing because the Finance department looks for things by Asset Number and the Parts guys look for it by Part Number. Metadata is the way to please everyone.
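
To make the "one-dimensional" point concrete, here is a minimal Python sketch of the parts example. It is not SharePoint code; the `Document` class, the `find` helper, the field names and the sample values are all made up for illustration. Each document is stored once and carries its metadata with it, so Finance and the Parts guys can each slice the same inventory their own way.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A document stored once, described by metadata instead of a folder path."""
    name: str
    metadata: dict


def find(documents, **criteria):
    """Return names of documents whose metadata matches every given criterion."""
    return [
        doc.name
        for doc in documents
        if all(doc.metadata.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical inventory; the field names and values are invented for this sketch.
inventory = [
    Document("X89 Widget spec.doc",
             {"aircraft": "DC9", "part_number": "X89", "asset_number": "A-1001"}),
    Document("X90 Bracket spec.doc",
             {"aircraft": "DC9", "part_number": "X90", "asset_number": "A-1002"}),
    Document("Q12 Valve spec.doc",
             {"aircraft": "B737", "part_number": "Q12", "asset_number": "A-2001"}),
]

by_part = find(inventory, part_number="X89")        # how the Parts guys look
by_asset = find(inventory, asset_number="A-1001")   # how Finance looks
by_aircraft = find(inventory, aircraft="DC9")       # a third slice, no copies or links
```

With folders, only one of those three lookups could be the "real" location; the other two would need links or duplicate copies.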

I understand the desire for folders. After all, growing up in the DOS and Windows world, it's all about folders. You create subfolders to organize information. You're used to it. You covet it. You rub the lotion on its skin and it places the lotion in the basket. Hmmm. Anyways, yes, it's natural to feel this is the way to organize, and since we've been doing it all our virtual lives, why change? I think the key thing is thinking about how you organize your information, and metadata is critical to the SharePoint concept. With metadata you can effectively create customized search arguments that permit you to organize information dynamically, and even use search criteria from one document library to retrieve information from another.

Put another way, you can forgo the traditional hierarchical folders in organizing your document libraries. Instead, you can create metadata lookups that can not only be used as organizational keys for documents in one library, but can also be used as search arguments to locate documents in other libraries. In this way, you can create searchable document pools with effectively dynamic organization, not only searchable but re-organizable without any physical manipulation of the documents themselves.
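
As a rough illustration of that idea (plain Python, not the SharePoint object model), the sketch below treats two libraries as flat lists of items. The library names, fields, file names and the `related` and `group_by` helpers are hypothetical. A metadata value from an item in one library is used as the search argument against the other, and the same library is regrouped on the fly without moving a single file.

```python
# Two hypothetical flat libraries (no folders); each item is a dict of metadata.
contracts = [
    {"name": "Acme master agreement.doc", "project": "123", "department": "Engineering"},
    {"name": "Globex NDA.doc", "project": "456", "department": "Legal"},
]
drawings = [
    {"name": "Pump housing.dwg", "project": "123", "year": 1997},
    {"name": "Valve body.dwg", "project": "123", "year": 1998},
    {"name": "Lobby layout.dwg", "project": "456", "year": 1997},
]


def related(source_item, target_library, key):
    """Use one item's metadata value as the search argument against another library."""
    wanted = source_item[key]
    return [item["name"] for item in target_library if item.get(key) == wanted]


def group_by(library, key):
    """Reorganize a library on the fly; nothing is moved or copied."""
    groups = {}
    for item in library:
        groups.setdefault(item.get(key), []).append(item["name"])
    return groups


# Drawings that belong to the same project as the first contract.
acme_drawings = related(contracts[0], drawings, "project")

# The same drawings, regrouped by year instead of by project.
drawings_by_year = group_by(drawings, "year")
```

Swapping `"year"` for `"project"` in the last call gives a different organization of the same pool, which is the part folders can't do.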

Using metadata gives you property search (with SharePoint Portal Server). During the indexing process, the IFILTERs, which extract the text out of the documents, put property information into special property buckets that are kept separate in the index so they can be searched separately. This allows you to set properties in your Office documents such as department, project number, author, keywords, etc., and then search on those fields individually.

You can use the search engine in SharePoint to search for documents where the department is engineering and the project is 123. Where a full text document search for engineering and 123 may find hundreds of entries because the word engineering and the number sequence 123 appear in many documents, a search via properties may yield the 10 or so documents that are truly relevant to your search.
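
The difference is easy to see in a toy model. The Python sketch below is not how the SharePoint index is built and does not use any SharePoint API; the three documents, their text and their property values are invented. It only shows why matching on separate property fields is more precise than matching the same words anywhere in the body.

```python
# Hypothetical corpus: each document has body text plus a separate set of properties.
documents = [
    {"name": "Bridge load study.doc",
     "text": "engineering review of span 123 for the county",
     "properties": {"department": "engineering", "project": "123"}},
    {"name": "Cafeteria memo.doc",
     "text": "engineering staff in room 123 asked for more coffee",
     "properties": {"department": "facilities", "project": "900"}},
    {"name": "Budget notes.doc",
     "text": "invoice 123 was charged to engineering by mistake",
     "properties": {"department": "finance", "project": "555"}},
]


def full_text_search(docs, *terms):
    """Match any document whose body contains every term, wherever it appears."""
    return [
        doc["name"]
        for doc in docs
        if all(term.lower() in doc["text"].lower().split() for term in terms)
    ]


def property_search(docs, **wanted):
    """Match only documents whose property fields equal the requested values."""
    return [
        doc["name"]
        for doc in docs
        if all(doc["properties"].get(key) == value for key, value in wanted.items())
    ]


full_text_hits = full_text_search(documents, "engineering", "123")
property_hits = property_search(documents, department="engineering", project="123")
```

Here `full_text_hits` contains all three documents, because each one happens to mention both words, while `property_hits` contains only the bridge study.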

Properties are what most people believe they are creating when they create a new field in a document library. That's not actually true. The meta data fields in a document library don't have anything to do with properties directly. During the edit process, however, Office performs a little sleight of hand. It takes the information you enter in the meta data fields for the document library and makes corresponding custom properties in the document. The net effect is that, although you've only created fields in a document library, your documents now have custom properties.

These custom properties are picked up by the indexing process (more specifically, the IFILTER for Office documents) and they are placed into the search index. You can then use those properties by making them available via the advanced search page in SharePoint. This also means that non-Office documents don't share the same relationship between fields in the document library and the properties of the document itself. So if you're trying to develop a searching mechanism for documents like TIF documents or PDFs, you'll find that setting up a meta data field for those document libraries won't allow you to search for those documents directly via their properties. You'll still be able to use the fields to organize the information, though.

Bottom line, get your head out of the sand and stop trying to mimic what is "traditional", as it's not going to give you the best bang for your buck. Use SharePoint and leverage what's there, as it will do the heavy lifting for you; you just have to tell it to get started.

Sources:
Harnessing Properties in SharePoint Search

8 Comments

  • Props for working my favorite line from Silence of the Lambs into your post.

  • I'm not sure we should throw away folders just because we are adding meta data about the contents of the folders.

    Folders within folders matches with the concept that everything goes in something. If I don't put the files in a cabinet, then they are all over everywhere.

    No one wants to be faced with one location that has 40000 files when the meta data system stumbles.

    Meta data can be a great enhancement for working with files in folders, but it is not a replacement.

  • Andrew: I agree that having a ton of files in a single folder would be an issue so maybe a compromise of using folders to organize information from a technical perspective (much like optimizing databases) would be appropriate, but not have that relate to the organization of the information. The problem with that is of course people will fall back into their old habits and you're back to square one where information gets organized via folders. Treating them like containers would be a good thing, but adding metadata to folders just gets you back to a bad place IMHO.

  • I agree 100 percent, but it does not work with the "Upload Multiple Files..." feature. So I have to "fall back" to the folder structure.

  • (To flo katze) By all means upload your files with Upload Multiple Files, but before you do that, set the fields you will be using for Views, like Category, to good default values.

    Then open the Document Library in Datasheet View; sort so you get all the new files bunched (using that default value perhaps) and do block changes to the files that need new Category etc. values.

    You don't have to "fall back to the folder structure"; that's just being lazy.

  • Theory vs reality? While it may make sense in theory, how do you convince 4,000 prospective users of this when they're not enthused about SharePoint to begin with because it's so foreign to them? The reality is that users generally don't embrace change, making it hard to deploy SharePoint in its simplest form to begin with.

  • I generally advocate the use of metadata, but I'm wondering whether folders may sometimes be advisable for large libraries, for two reasons:

    -- Microsoft itself says the enumeration of items does not work well beyond 1 or 2 thousand items, and yet says a single library can handle up to 1 million files if you use 1,000 folders.

    -- "Group By" views can give the same (or better) effect as folders if the number of documents is a few hundred, but if you try to do a grouped view on 3,000 files, you'll find that the "Item Limit" set in the library gets in the way. That is, SharePoint returns only as many headings as are associated with the first 100 files, or however many you have specified. If you increase the item limit to 3,000, it takes much longer to draw the page than it would with folders.

    Bottom line: Metadata is definitely the preferred approach, but I'm afraid the current version of SharePoint may sometimes force you to use folders as well.
