Folders bad, metadata good!

Or is it meta-data? META-data? MetaData? Never mind.

Dustin Miller put together a nice summary of his experiences during his training sessions about organizing information and advocating the use of meta-data (meta data? ARGH!) over folders. I totally 100% agree with him and hope others follow this advice. Here are a few other reasons why you should just say no to folders.

  • There is a path limitation in SharePoint (or maybe IIS) of 260 characters in total. As you start creating the folder structure from hell, you'll find this gets used up very quickly and you end up staring at yet another cryptic SharePoint error message (basically "Something bad has happened") with no indication of what the problem is. Better yet, documents just vanish into the ether and you have no idea that they're really still there, tucked away and taking up precious space in your SQL Server, but you're unable to access them.
  • Ever need something and try to go look for it? If you already know where it is, that kind of defeats the purpose of creating a complex organization system. If you don't, search might turn it up, but the vast majority of carbon-based units out there can't figure out how to search for something, so that's a bit of a waste. Folders only allow you to look for things the way the person who put the stuff there thought about it. If I had a brain fart and filed the asset records for 1997 under Financials -> 1997 -> Assets, would you think to look there or in Assets -> By Year -> 1997? Again, you need to know the organization structure to navigate it. Using folders for organization just compounds this as we end up with deeply nested folders upon folders and nobody can find anything (even the people that put it there).
  • Folders are one-dimensional. Think of the Yellow Pages. I can open it up and look for a pizza place by looking in Pizza, Restaurants, or maybe even Dining. I don't need to know the section that I want to look in; I can find it in various ways. Everyone is wired differently and will look for things the way they were brought up. What if I was the owner of a parts company and decided to start organizing my inventory using SharePoint? Would I put my "X89 Widget" under "Aircraft -> DC9 -> Parts" or in "Parts -> By Aircraft -> Large"? I can only organize things one way with folders, and if I need to slice information up differently, I either end up with links all over the place (which will easily break and become out of date) or multiple copies of the same thing because the Finance department looks for things by Asset Number and the Parts guys look for it by Part Number. Metadata is the way to please everyone.
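
To make the one-dimensional point concrete, here is a minimal Python sketch. It is not SharePoint code, and the field names and sample records (including the second aircraft type and the asset numbers) are made up for illustration. Each document is stored once with a few metadata fields, and any field can become the "folder" you browse by.

```python
from collections import defaultdict

# Hypothetical inventory records: each document lives in exactly one place
# and carries metadata fields instead of a folder path.
documents = [
    {"name": "X89 Widget spec", "aircraft": "DC9", "part_number": "X89", "asset_number": "A-1001"},
    {"name": "X90 Bracket spec", "aircraft": "DC9", "part_number": "X90", "asset_number": "A-1002"},
    {"name": "Z12 Seal spec", "aircraft": "B737", "part_number": "Z12", "asset_number": "A-1003"},
]


def group_by(docs, field):
    """Build a view of the same documents grouped by any metadata field."""
    view = defaultdict(list)
    for doc in docs:
        view[doc[field]].append(doc["name"])
    return dict(view)


# The Parts guys and the Finance department each get their own "folders"
# without anyone copying or linking a single document.
by_aircraft = group_by(documents, "aircraft")
by_asset = group_by(documents, "asset_number")
```

The same list feeds both views, so there is nothing to keep in sync when someone wants a third way of slicing the inventory.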

I understand the desire for folders. After all, growing up in the DOS and Windows world it's all about folders. You create subfolders to organize information. You're used to it. You covet it. You rub the lotion on its skin and it places the lotion in the basket. Hmmm. Anyways, yes, it's natural to feel this is the way to organize, and since we've been doing it all our virtual lives, why change? I think the key thing is thinking about how you organize your information, and metadata is critical to the SharePoint concept. With metadata you can effectively create customized search arguments that permit you to organize information dynamically, and even use search criteria from one document library to retrieve information from another.

Put another way, you can forego the traditional hierarchical folders in organizing your document libraries. Instead, you can create metadata lookups that can not only be used as organizational keys for documents in one library, but can be used as search arguments to locate documents in other libraries. In this way, you can create searchable document pools with effectively dynamic organization, not only searchable but re-organizable without any physical manipulation of the documents themselves.
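
As a rough illustration of using one library's metadata as a search argument against another, here is a small Python sketch. The library names, file names, and the `project_id` field are all hypothetical, and plain lists stand in for document libraries; this is only a model of the idea, not how SharePoint implements lookups.

```python
# Two hypothetical document libraries that share a "project_id" lookup field.
specs_library = [
    {"name": "Load calc.doc", "project_id": 123},
    {"name": "Pump spec.doc", "project_id": 124},
]
contracts_library = [
    {"name": "Retrofit contract.doc", "project_id": 123},
    {"name": "Upgrade contract.doc", "project_id": 124},
    {"name": "Retrofit change order.doc", "project_id": 123},
]


def related_documents(source_doc, target_library, key="project_id"):
    """Use a metadata value from one document to find documents in another library."""
    return [doc["name"] for doc in target_library if doc[key] == source_doc[key]]


# Starting from a spec, pull every contract for the same project.
related = related_documents(specs_library[0], contracts_library)
```

Nothing was moved or copied to get that result; the shared field does the organizing.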

Using metadata gives you property search (with SharePoint Portal Server). During the indexing process, the IFILTERs, which extract the text out of the documents, put property information into special property buckets that are kept separate in the index so they can be searched separately. This allows you to set properties in your Office documents such as department, project number, author, keywords, etc., and then have the ability to search on those fields individually.

You can use the search engine in SharePoint to search for documents where the department is engineering and the project is 123. Where a full-text document search for engineering and 123 may find hundreds of entries because the word engineering and the number sequence 123 appear in many documents, a search via properties may yield the 10 or so documents that are truly relevant to your search.
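
The difference is easy to see in a toy example. The Python sketch below is not the SharePoint search engine; the sample documents and the two search functions are invented purely to contrast matching on text with matching on separately stored properties.

```python
# Hypothetical indexed documents: free text plus a separate property bucket.
documents = [
    {"text": "Engineering review for project 123",
     "properties": {"department": "engineering", "project": "123"}},
    {"text": "Invoice 123 sent to engineering",
     "properties": {"department": "finance", "project": "456"}},
    {"text": "Engineering newsletter, issue 123",
     "properties": {"department": "communications", "project": ""}},
]


def full_text_search(docs, *terms):
    """Match any document whose text contains every term, wherever it appears."""
    return [d for d in docs if all(t.lower() in d["text"].lower() for t in terms)]


def property_search(docs, **criteria):
    """Match only documents whose properties equal the requested values."""
    return [d for d in docs
            if all(d["properties"].get(k) == v for k, v in criteria.items())]


full_hits = full_text_search(documents, "engineering", "123")
prop_hits = property_search(documents, department="engineering", project="123")
```

In this sample, the full-text search matches all three documents, while the property search returns only the one that actually belongs to the engineering department and project 123.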

Properties are what most people believe they are creating when they create a new field in a document library. That's not actually true. The meta data fields in a document library don't have anything to do with properties directly. During the edit process, however, Office performs a little sleight of hand. It takes the information you enter in the meta data fields for the document library and makes corresponding custom properties in the document. The net effect is that, although you've only created fields in a document library, your documents now have custom properties.

These custom properties are picked up by the indexing process (more specifically, the IFILTER for Office documents) and they are placed into the search index. You can then use those properties by making them available via the advanced search page in SharePoint. This also means that non-Office documents don't share the same relationship between fields in the document library and the properties of the document itself. So if you're trying to develop a searching mechanism for documents like TIF documents or PDFs, you'll find that setting up a meta data field for those document libraries won't allow you to search for those documents directly via their properties. You'll still be able to organize the information with those fields, though.
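
Here is a toy Python model of the behavior just described. It only mirrors the rule of thumb from this post (library fields become searchable properties for Office documents and not for other file types); the extension list and the function name are assumptions for illustration and say nothing about SharePoint's actual internals.

```python
import os

# Assumption for illustration: treat these extensions as "Office documents".
OFFICE_EXTENSIONS = {".doc", ".xls", ".ppt"}


def searchable_properties(filename, library_fields):
    """Return the properties that end up in the search index in this toy model.

    Office documents get the library fields copied in as custom properties;
    other file types (TIF, PDF, ...) keep their fields in the library only.
    """
    extension = os.path.splitext(filename)[1].lower()
    if extension in OFFICE_EXTENSIONS:
        return dict(library_fields)
    return {}


fields = {"department": "engineering", "project": "123"}
office_result = searchable_properties("plan.doc", fields)
tif_result = searchable_properties("scan.tif", fields)
```

In the model, `plan.doc` comes back with both fields available to property search, while `scan.tif` comes back empty even though the library still holds the same field values for it.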

Bottom line, get your head out of the sand and stop trying to mimic what is "traditional" as it's not going to give you the best bang for your buck. Use SharePoint and leverage what's there, as it will do the heavy lifting for you; you just have to tell it to get started.

Sources:
Harnessing Properties in SharePoint Search

Published Tuesday, January 03, 2006 6:50 PM by Bil Simser

Comments

# re: Folders bad, metadata good!

Tuesday, January 03, 2006 9:36 PM by JasonF
Props for working my favorite line from Silence of the Lambs into your post.

# Mart Muller's Sharepoint Weblog

Wednesday, January 04, 2006 8:32 AM by TrackBack
Mart Muller's Sharepoint Weblog

# Angus Logan's Portals Blog

Wednesday, January 04, 2006 8:33 AM by TrackBack
Angus Logan's Portals Blog

# re: Folders bad, metadata good!

Wednesday, January 04, 2006 1:06 PM by AndrewSeven
I'm not sure we should throw away folders just because we are adding meta data about the contents of the folders.

Folders within folders match the concept that everything goes in something. If I don't put the files in a cabinet, then they are all over everywhere.

No one wants to be faced with one location that has 40000 files when the meta data system stumbles.

Meta data can be a great enhancement for working with files in folders, but it is not a replacement.

# re: Folders bad, metadata good!

Wednesday, January 04, 2006 1:54 PM by Dustin Miller
AndrewSeven: It absolutely is a replacement for folders. You can make it look and feel just like folders, but preserve the richness of having real, useful metadata about the documents, through the use of additional fields.

You tell me: What can you do with folders that you can't do with custom fields and, perhaps, a group-by header in a view of a document library?

# re: Folders bad, metadata good!

Wednesday, January 04, 2006 3:58 PM by Bil Simser
Andrew: I agree that having a ton of files in a single folder would be an issue so maybe a compromise of using folders to organize information from a technical perspective (much like optimizing databases) would be appropriate, but not have that relate to the organization of the information. The problem with that is of course people will fall back into their old habits and you're back to square one where information gets organized via folders. Treating them like containers would be a good thing, but adding metadata to folders just gets you back to a bad place IMHO.

# The Dean's Office

Wednesday, January 04, 2006 4:00 PM by TrackBack
The Dean's Office

# re: Folders bad, metadata good!

Thursday, January 05, 2006 3:25 AM by flo katze
I agree 100 percent, but it doesn't work with the "Upload Multiple Files..." feature. So I have to "fall back" to the folder structure.

# re: Folders bad, metadata good!

Thursday, January 05, 2006 4:46 AM by MikeWalshHelsinki
(To flo katze) By all means upload your files with Upload Multiple Files, but before you do that, set the fields you will be using for Views (like Category) to good default values.

Then open the Document Library in Datasheet View; sort so you get all the new files bunched (using that default value perhaps) and do block changes to the files that need new Category etc. values.

You don't have to "fall back to the folder structure"; that's just being lazy.

# re: Folders bad, metadata good!

Thursday, January 05, 2006 8:55 AM by wdempl
Theory vs reality? While it may make sense in theory, how do you convince 4,000 prospective users of this when they're not enthused about SharePoint to begin with because it's so foreign to them? The reality is that users generally don't embrace change, making it hard to deploy SharePoint in its simplest form to begin with.

# Breaking Point Blog

Thursday, January 05, 2006 10:20 AM by TrackBack
Breaking Point Blog

# re: Folders bad, metadata good!

Thursday, January 05, 2006 8:07 PM by Gene Kraybill
I generally advocate the use of metadata, but I'm wondering whether folders may sometimes be advisable for large libraries, for two reasons:
-- Microsoft itself says the enumeration of items does not work well beyond 1 or 2 thousand items, and yet says a single library can handle up to 1 million files if you use 1,000 folders.
-- "Group By" views can give the same (or better) effect as folders if the number of documents is a few hundred, but if you try to do a grouped view on 3,000 files, you'll find that the "Item Limit" set in the library gets in the way. That is, SharePoint returns only as many headings as are associated with the first 100 files, or however many you have specified. If you increase the item limit to 3,000, it takes much longer to draw the page than it would with folders.

Bottom line: Metadata is definitely the preferred approach, but I'm afraid the current version of SharePoint may sometimes force you to use folders as well.

# re: Folders bad, metadata good!

Friday, January 06, 2006 5:29 AM by Thomas Hjorth Biilmann
Well, that this ("folders are bad - use metadata") can be such a big issue at all, is a pity. I understand the key point that folders are not metadata, but also agree on the issue Gene points out above re. size limitations in SharePoint.

Another point is that using metadata over folders also makes it irrelevant for a user to look for a copy/move function (which is not there out of the box). If a file must be "moved", give it a new category value, or whatever group by field you are using. Of course, moving a file from one doc lib to another is still a pain.

I think using a file system metaphor for document libraries could be truly useful, since most users will know how this works, but without third-party products or own customization, the metaphor is awkward in SharePoint.

Just an idea: what if folders _were_ metadata? Say I could add custom fields to a folder, and these would apply to all documents in that folder? I know Christmas is over, but that would be a great feature, and even nicer if such metadata were searchable. I could then search for a category set on a folder and find the folder plus all files... oh well, I'm dreaming now.

Does anyone know about the coming "v3" - are there any changes underway relating to this? E.g. improved folder/file handling or the ability to search any file based on custom metadata, not just those that map to the file format, allowing the IFilter to pick it up?

# Goodiebag 06

Saturday, January 07, 2006 7:50 PM by Michael Ekegren
[Pingback/trackback]

# Document Libraries and Subfolders: Why it's a bad idea

Tuesday, June 19, 2007 8:16 PM by dustin

I'm often asked during a class questions like, “how should I do this,” or, “how