Extended Blog Conversation

Here's what I came up with after the blogging birds of a feather session last night:

The main issue with making an easy-to-read, easy-to-filter, easy-to-search blogging tool that uses the cool new search features of WinFS in Longhorn is the metadata.

Point 1: Metadata needs to be standardized.

I submit that the entire model of posting to individual aggregate sites, or worse, individual sites, should be discarded.  Instead, a server, which I'll call MAPDS (Metadata and Posting Distribution Server), should decide where to crosspost.  The sites that receive the posts (waiting for a push from the central MAPDS) are specialized nodes with a registry entry on the MAPDS; i.e., my site has a URL, a title, and a list of metadata categories that I'd like to show up on my site.

When a blogger posts to their account on MAPDS, the RSS/ATOM payload (hint: here's a way for ATOM to make significant advances) also includes metadata about the post, so the server can figure out which nodes should display it.
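To make the routing idea concrete, here is a minimal sketch of how a MAPDS might match a post's metadata against its node registry. Everything in it is an assumption for illustration: the `Node` and `Post` classes, the `route` function, the example URLs, and the rule that any overlap between a post's metadata and a node's category list is enough to push the post there.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A registered site: URL, title, and the categories it wants to show."""
    url: str
    title: str
    categories: set[str] = field(default_factory=set)


@dataclass
class Post:
    """A post submitted to the central server with poster-suggested metadata."""
    author: str
    title: str
    metadata: set[str] = field(default_factory=set)


def route(post: Post, registry: list[Node]) -> list[Node]:
    """Return every node whose category list overlaps the post's metadata.

    Hypothetical matching rule: a single shared category is enough.
    """
    return [node for node in registry if node.categories & post.metadata]


# Hypothetical registry and post, for illustration only.
registry = [
    Node("http://example.com/dotnet", "All Things .NET", {"dotnet", "winfs"}),
    Node("http://example.com/cooking", "Cooking Journal", {"recipes"}),
]
post = Post("alice", "WinFS search ideas", {"winfs", "blogging"})
targets = [node.url for node in route(post, registry)]
```

The author writes once and posts once; the server decides which nodes receive the push.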

Point 2: Metadata needs to be trusted / Metadata needs to be error-correcting

On a rendered post, once the initial post is up with its poster-suggested metadata, users reading the site may add, approve, or recommend for deletion a specific bit of metadata.  Once a piece of metadata's moderated score falls to a server-set low threshold, it is "deleted" from the page, although in reality it is only hidden: future "adds" simply adjust the "deleted" item's score.
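A rough sketch of that moderation scheme, under my own assumptions: each add or approval is worth +1, each deletion recommendation is worth -1, and a tag is hidden once its score is at or below the threshold. The class name, method names, and the threshold value of -2 are all hypothetical.

```python
DELETE_THRESHOLD = -2  # hypothetical server-set value


class ModeratedMetadata:
    """Tracks a moderation score for each metadata tag on one post."""

    def __init__(self, suggested, threshold=DELETE_THRESHOLD):
        self.threshold = threshold
        # Poster-suggested tags start at a neutral score of 0.
        self.scores = {tag: 0 for tag in suggested}

    def add(self, tag):
        """Add or approve a tag; a hidden tag just gets its score raised."""
        self.scores[tag] = self.scores.get(tag, 0) + 1

    def recommend_deletion(self, tag):
        """Vote against a tag that is already known for this post."""
        if tag in self.scores:
            self.scores[tag] -= 1

    def visible(self):
        """Tags shown on the page: those scoring above the threshold."""
        return sorted(t for t, s in self.scores.items() if s > self.threshold)


meta = ModeratedMetadata(["winfs", "cooking"])
meta.recommend_deletion("cooking")
meta.recommend_deletion("cooking")
after_votes = meta.visible()    # "cooking" is hidden, not erased

for _ in range(3):
    meta.add("cooking")
after_readds = meta.visible()   # enough adds bring it back
```

Because nothing is ever truly removed, a tag that was voted down unfairly can recover once enough readers add it back.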

Point 3: When performed on a large scale, moderation detects and FIXES "lies" in metadata.

Now here's where .NET comes in: on the client side, you can have a WinFS data store with the ColumnTypes as the metadata.  So the final USER aggregator can sort by anything.

Point 4: When a user first uses an aggregator, it should still be able to search intelligently.  Under repeat usage, it should get smarter.

On the client tool side, higher ranking is also given to moderation by blogs you have subscribed to. It's similar to "trusting" content by that blogger (sorry, I'm not sure who suggested that, but this one was not my idea). A smaller increase in metadata rank is also given to moderation by blogs on the blogroll of the feeds that you add.
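Here is one way the client-side weighting could look, as a sketch only. The three weight constants, the function names, and the shape of the `votes` and `blogrolls` data are all assumptions; the post only says that subscribed blogs count for more and blogs on their blogrolls count for somewhat less.

```python
# Hypothetical weights; the post does not specify actual values.
SUBSCRIBED_WEIGHT = 3.0   # moderation by a blog you subscribe to
BLOGROLL_WEIGHT = 1.5     # moderation by a blog on a subscribed feed's blogroll
DEFAULT_WEIGHT = 1.0      # moderation by anyone else


def moderator_weight(moderator, subscriptions, blogrolls):
    """How much one moderator's vote counts for this particular user."""
    if moderator in subscriptions:
        return SUBSCRIBED_WEIGHT
    if any(moderator in blogrolls.get(feed, ()) for feed in subscriptions):
        return BLOGROLL_WEIGHT
    return DEFAULT_WEIGHT


def tag_rank(votes, subscriptions, blogrolls):
    """Sum weighted votes for one tag; each vote is (moderator, +1 or -1)."""
    return sum(
        vote * moderator_weight(moderator, subscriptions, blogrolls)
        for moderator, vote in votes
    )


subscriptions = {"alice"}
blogrolls = {"alice": {"bob"}}   # alice's blogroll lists bob
votes = [("alice", 1), ("bob", 1), ("carol", -1)]
rank = tag_rank(votes, subscriptions, blogrolls)   # 3.0 + 1.5 - 1.0
```

The same set of votes produces a different rank for every user, since the weights depend on that user's own subscription list.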

Point 5: The more sites the user adds to their list, the "smarter" the search will get.

This is the beautiful part - it's not necessarily because the metadata is more correct, but it's more likely to be harmonious with the user's way of thinking.  It's adaptive metadata for your profile in your OPML!

Personally, with the popularity of blogs, I believe this app/server could be the “killer WinFS app” Gord Mangione was talking about during this morning's PDC keynote. Please feel free to provide feedback; I hope to have a demonstration app that implements some of these ideas on my personal server.  Clemens Vasters is on the right track with crossblogging, but I really think this is the next step.

4 Comments

  • Although I agree with every point *heading*, I disagree with the implied architecture. Everything you describe is what RDF does, without a central server.

  • Very true with the RDF; that would alleviate the need to extend ATOM/RSS to include the metadata. As far as the central server, it's not a mandatory key point in the architecture, but it would help to create journal sites that are highly specialized, and would also help authors to spread their content farther and get a broader audience. Currently with cross-blogging, it's write-once, post-multiple. With this implementation, it's write-once, post-once. Benefits: single entry with multiple access points. Drawbacks: single entry with single point of failure.



    I don't know if I said it, but I'm pretty skeptical about whether this model will work. I'm going to try it anyway and write a doc on the good & bad. Thanks for the feedback!

  • In order to fix the metadata, you will need some machine-based classifier to tag the content appropriately. Trusting humans will always lead to misuse.

  • Very true... there's IP address, but that could always be spoofed as well. After the tremendous traffic I've been getting on my MONAD post, though, I should be busy for a while covering that technology.
