Vista stores metadata INSIDE the object itself?

First, read the article at C|Net. According to Gartner analysts, Windows Vista stores meta-data, which is entered by the user and used for search, inside the object the meta-data belongs to. This of course causes problems if the user sends the object, for example a Word document or an image, to another person, as the meta-data is still inside it.

I simply can't believe this. If you embed meta-data usage inside your OS, why on earth would you store the meta-data inside the actual file it belongs to? It opens up all kinds of problems, like the one brought forward by Gartner. The obvious, lame and unworkable Pandora's box solution is to offer some kind of tool for removing the meta-data, which no one will remember to use every time, or to do a lot of post-processing to make sure the meta-data is stripped off.

IMHO this is a very bad design flaw, which can lead to a lot of problems in the coming years if Microsoft keeps the meta-data inside the objects. Oh, and before I forget: if you now think "Oh, but I never enter meta-data", that might be true, but programs will add it for you. That's even worse: you're not aware there was meta-data inside the file you just sent to that customer, but unfortunately, there was...

9 Comments

  • Is it possible that it stores the meta-data using streams?

    That way it is in the same file, but it is not copied by default for web/email, etc.



    Besides, I can certainly see reasons why I would often want to send my meta-data along.



    It's also interesting to know _what_ metadata they are storing. Tagging information is something that the OS handles, change tracking is in the realm of Word.

  • I was very sceptical too when this news came out. Nice to see that you think so as well. I'd like to see one of the "internals" people explain and motivate this choice.



  • Metadata is stored in alternate NTFS streams (there is a great alternate NTFS streams explorer available over at codeproject.com), but applications like Word/Excel/... tend to also save some of the metadata inside the files: author, locale, ...

  • OLE Compound storage uses NTFS alternate streams as a mechanism to store this stuff iff the file type itself doesn't do it. Word documents (and all Office docs) store things such as title, page count etc. directly inside the document (i.e. they are native OLE compound storage types). This is why the values persist even when you email the file or copy it to a CD.

  • I have a feeling there's more to this than meets the eye. Since Windows is in control of the filesystem, who's to say they don't simply strip the additional stream before the file is "exported" to anywhere other than another location in the filesystem? Or who's to say they don't encrypt the data with the machine key or something of that nature?



    I find it extremely hard to believe that this hasn't been a concern or consideration of MS's, and once again some silly CNET journalist who knows 1/4 of the story is commenting without doing his/her homework.

  • James: I thought of that too, but the file system doesn't know what the destination of the data will be. So the file system has to be told when to strip the meta-data off, which makes the approach prone to mistakes.

  • If it is stored in alternate NTFS streams, it is probably only copied with the document if it is copied to another location on NTFS, not when it is copied to a non-NTFS file system or attached to an e-mail.

  • FWIW, the NTFS metadata storage techniques you guys are talking about have been in Windows since W2K. And the OLE metadata since before Windows 95.



    Whether Vista uses the same techniques or not, the only "news" here is that people might actually start using this metadata now that it will be so prominently featured in the new UI.





  • Chris, streams have been part of NTFS since its inception (the original purpose was to be able to serve files to Macs).



    I know of at least one virus that used this method to hide itself. And it's _very_ useful in certain instances.
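
The distinction the commenters draw, between metadata embedded in the file's own bytes and metadata kept in a separate named stream, can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyFile, copy_stream_aware and attach_to_email are hypothetical names invented for the illustration, the "[author=alice]" prefix is a made-up stand-in for embedded properties, and nothing here calls the real NTFS or OLE APIs or claims to show what Vista actually does.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyFile:
    """Hypothetical stand-in for a file: a main stream plus named side streams."""
    main: bytes  # default data stream, the bytes an application normally reads
    streams: Dict[str, bytes] = field(default_factory=dict)  # named side streams


def copy_stream_aware(f: ToyFile) -> ToyFile:
    """Models a copy between two stream-aware volumes: every stream travels."""
    return ToyFile(f.main, dict(f.streams))


def attach_to_email(f: ToyFile) -> ToyFile:
    """Models a transfer that reads only the main stream: side streams are dropped."""
    return ToyFile(f.main)


def demo() -> Dict[str, bool]:
    # Metadata embedded in the content itself (made-up format for illustration).
    embedded = ToyFile(main=b"[author=alice]Quarterly report")
    # The same metadata kept beside the content in a named stream.
    side = ToyFile(main=b"Quarterly report", streams={"tags": b"author=alice"})

    return {
        "embedded_leaks_by_mail": b"author=alice" in attach_to_email(embedded).main,
        "side_stream_leaks_by_mail": bool(attach_to_email(side).streams),
        "side_stream_survives_copy": copy_stream_aware(side).streams == side.streams,
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value}")
```

In this model the embedded metadata goes wherever the main bytes go, which is the leak the post worries about, while the side stream only survives transfers that know about streams, which is the behaviour the comments expect from alternate NTFS streams.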
