Archives

Archives / 2008 / April
  • Overview of FeedSync support in the Microsoft Sync Framework

    What is FeedSync?

    If you have not heard about "FeedSync" yet, it is the official name of an open and interoperable standard for representing synchronization metadata in an XML format. Microsoft formerly called this specification "SSE" (Simple Sharing Extensions) and renamed it "FeedSync" prior to the first release version.

    Because the "FeedSync" model is based entirely on XML and is quite simple to represent, the specification is a good candidate for guaranteeing interoperability with other platforms.

    The complete specification is based on two fundamental aspects:

    1. A way to represent the "FeedSync" model as XML extensions that can be easily attached to any existing document. The most common use nowadays is to include them in a syndication feed such as RSS or Atom to keep synchronization metadata about each available item. Many people have started providing parsers for this specification as part of their syndication libraries; some examples are Simple Sharing Extensions for .NET, Rome for Java, and Argotic (.NET).

    Let's take a quick look at the metadata attached to some items in an RSS feed.

    <feed xmlns:sx='http://feedsync.org/2007/feedsync'>
      <title>Hello World</title>
      <link>http://weblogs.asp.net/cibrax</link>
      <description>this is my feed</description>
      <sx:sharing since='2007-05-11T19:33:27Z' until='2007-05-14T19:33:27Z'>
        <sx:related link='http://kzu/full' type='complete' />
      </sx:sharing>
      <item>
        <title>Foo Title</title>
        <description>Foo Description</description>
        <Foo Title='Foo' />
        <sx:sync id='d92d64f5-c006-4086-a35a-c5195448d3ad' updates='2' deleted='false' noconflicts='false'>
          <sx:history sequence='2' when='2007-05-18T19:33:27Z' by='Cibrax' />
          <sx:history sequence='1' when='2007-05-14T19:33:27Z' by='JohnFoo' />
        </sx:sync>
      </item>
    </feed>

    First of all, the specification adds optional feed-level metadata through the "sharing" element, which carries aggregated information about related feeds or sources. (The since and until attributes of that element represent the lower and upper bounds of the items contained in the feed.)

    Secondly, it includes synchronization metadata at item level with the "sync" element. This element contains the following information:

    a. An identifier for the item (the id attribute). This identifier must be unique across all the synchronization peers.

    b. A history of updates that records who changed the item and the date/time of each change. As you can see, this information is quite simple; it says nothing about what the changes were (we will see later that this information is actually not important for the merging algorithm).

    c. Flags that indicate whether the item was deleted (deleted) and whether conflict preservation is turned off for it (noconflicts). A conflict arises when the same item version or sequence number is modified in more than one replica at the same time.

    The rest of the information not mentioned here is part of the RSS specification itself and has nothing to do with "FeedSync".
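
    To make the item-level metadata more concrete, here is a minimal sketch that reads the "sync" element of the first item with LINQ to XML. The class and method names (SyncMetadataReader, ReadFirstItem) are made up for this illustration, and the code only assumes the attributes shown in the sample feed above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SyncMetadataReader
{
    // Namespace URI taken from the sample feed; everything else here is illustrative.
    private static readonly XNamespace Sx = "http://feedsync.org/2007/feedsync";

    // Reads the sync metadata of the first item found in the feed.
    public static (string Id, int Updates, bool Deleted, string LastChangedBy) ReadFirstItem(string feedXml)
    {
        XElement sync = XDocument.Parse(feedXml).Descendants(Sx + "sync").First();

        // The history entry with the highest sequence describes the latest update.
        XElement latest = sync.Elements(Sx + "history")
            .OrderByDescending(h => (int)h.Attribute("sequence"))
            .First();

        return ((string)sync.Attribute("id"),
                (int)sync.Attribute("updates"),
                (bool)sync.Attribute("deleted"),
                (string)latest.Attribute("by"));
    }
}
```

    Note that the payload of the item (title, description, custom elements) is never touched; only the metadata in the FeedSync namespace is read.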

    2. An algorithm that uses the synchronization metadata to merge items across several peers, for instance to perform a bidirectional synchronization of local information between two peers. The algorithm also uses the metadata to detect conflicts during the merge operation.

    For example, suppose we have the following information in two different replicas:

    REPLICA A          REPLICA B
    ItemA-Sequence1    ItemC-Sequence1
    ItemB-Sequence1

    After the first synchronization, the resulting information will be:

    REPLICA A          REPLICA B
    ItemA-Sequence1    ItemC-Sequence1
    ItemB-Sequence1    ItemA-Sequence1
    ItemC-Sequence1    ItemB-Sequence1

    Now, let's say replica B changes items A and B (the sequence or version is incremented):

    REPLICA A          REPLICA B
    ItemA-Sequence1    ItemC-Sequence1
    ItemB-Sequence1    ItemA-Sequence2
    ItemC-Sequence1    ItemB-Sequence2

    If they synchronize again, the resulting information will be:

    REPLICA A          REPLICA B
    ItemA-Sequence2    ItemC-Sequence1
    ItemB-Sequence2    ItemA-Sequence2
    ItemC-Sequence1    ItemB-Sequence2

    Now suppose both replicas change the same item, for instance ItemC, before synchronizing. At the moment of synchronizing, both copies of ItemC will have the same sequence, and therefore a conflict will exist. (The algorithm to detect conflicts is more complex than this; I am just giving an example.)

    REPLICA A          REPLICA B
    ItemA-Sequence2    ItemC-Sequence2
    ItemB-Sequence2    ItemA-Sequence2
    ItemC-Sequence2    ItemB-Sequence2
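
    The comparison behind these tables can be sketched in a few lines. This is a deliberately simplified illustration, not the full merge algorithm from the specification: it picks a winner by update count, then by the time of the latest change, then by the "by" value, and it flags a conflict when both replicas reached the same update count through different edits. The types and names (ItemVersion, FeedSyncMergeSketch) are hypothetical.

```csharp
using System;

// Hypothetical in-memory view of one item's sync metadata.
public sealed class ItemVersion
{
    public string Id { get; }
    public int Updates { get; }      // value of the updates attribute
    public DateTime When { get; }    // "when" of the latest history entry
    public string By { get; }        // "by" of the latest history entry

    public ItemVersion(string id, int updates, DateTime when, string by)
    {
        Id = id;
        Updates = updates;
        When = when;
        By = by;
    }
}

public static class FeedSyncMergeSketch
{
    // More updates wins; ties are broken by the later change, then by the "by" value.
    public static ItemVersion PickWinner(ItemVersion local, ItemVersion incoming)
    {
        int order = local.Updates.CompareTo(incoming.Updates);
        if (order == 0) order = local.When.CompareTo(incoming.When);
        if (order == 0) order = string.CompareOrdinal(local.By, incoming.By);
        return order >= 0 ? local : incoming;
    }

    // Simplified check: same update count, but the latest edits differ.
    public static bool IsConflict(ItemVersion local, ItemVersion incoming)
    {
        return local.Updates == incoming.Updates
            && (local.When != incoming.When || local.By != incoming.By);
    }
}
```

    With the last table, the two copies of ItemC both carry sequence 2 but were edited by different replicas, so IsConflict returns true; for ItemA at sequence 1 versus sequence 2, the newer copy simply wins without a conflict.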

    With these two aspects in place, a developer who wants to synchronize information with other applications only has to represent that information as a FeedSync feed (a normal feed that includes "FeedSync" metadata) and expose it somewhere. Another application can later combine that feed with its own feed using the merging algorithm, which results in a full bidirectional synchronization between the two applications.

    As you can see, both aspects complement each other; a bidirectional synchronization cannot be done with just one of them.

    How to expose your information in a FeedSync feed with the Sync Framework

    1. The first thing you have to do is determine which information you want to expose in the final feed. Let's say that you have several copies of a database with customer information spread across several locations (different company branches, perhaps), and you want to keep all those copies synchronized. You will have to expose that information as part of a feed, so people in other locations can synchronize their data against that feed. (This example assumes that one central feed is used, but you can actually have multiple feeds that synchronize with each other in a kind of tree structure. This is one of the good things about FeedSync: no central replica is required.)

    If you want to represent the customer's full name, address and phone number as an entry in an RSS feed, the entry will look as follows:

    <item>
      <title>John Doe</title>
      <description>John Doe's Information</description>
      <Customer>
        <FullName>John Doe</FullName>
        <Address>Test Address</Address>
        <PhoneNumber>xxx-xxxx-xxxx</PhoneNumber>
      </Customer>
    </item>

    This mapping between the real data and an XML payload (the RSS entry data) can be done in the Sync Framework by writing a concrete implementation of the "FeedItemConverter" class. The implementation of this class knows how to map custom data to an XML payload and vice versa.

    public abstract class FeedItemConverter
    {
      protected FeedItemConverter();
      public abstract string ConvertItemDataToXmlText(object itemData); //Method to convert the actual item data into XML
      public abstract object ConvertXmlToItemData(string itemXml); //Method to convert the XML data into the item data
    }

    Personally, I do not like the signature of those methods much. A string (itemXml) could be anything, not just XML, and that cannot be enforced by the current API design. I would prefer to have something like this:

    public abstract class FeedItemConverter
    {
      protected FeedItemConverter();
      public abstract void WriteItemDataToXml(object itemData, XmlWriter xmlWriter); //Method to convert the actual item data into XML
      public abstract object ReadItemDataFromXml(XmlReader xmlReader); //Method to convert the XML data into the item data
    }

    For the customer sample above, you would implement a "CustomerFeedItemConverter" that converts a customer instance into its XML representation and vice versa.
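
    A minimal sketch of such a converter could look like the following. It relies only on the two abstract methods listed above and on LINQ to XML; the Customer class is hypothetical, and returning just the Customer element (without title and description) is an assumption made for illustration, not something the Sync Framework documentation is quoted on here.

```csharp
using System.Xml.Linq;
using Microsoft.Synchronization.FeedSync;

// Hypothetical data class for the customer example.
public class Customer
{
    public string FullName { get; set; }
    public string Address { get; set; }
    public string PhoneNumber { get; set; }
}

public class CustomerFeedItemConverter : FeedItemConverter
{
    // Maps a Customer instance to the XML payload shown in the entry above.
    public override string ConvertItemDataToXmlText(object itemData)
    {
        var customer = (Customer)itemData;
        return new XElement("Customer",
            new XElement("FullName", customer.FullName),
            new XElement("Address", customer.Address),
            new XElement("PhoneNumber", customer.PhoneNumber)).ToString();
    }

    // Maps the XML payload back to a Customer instance.
    public override object ConvertXmlToItemData(string itemXml)
    {
        XElement element = XElement.Parse(itemXml);
        return new Customer
        {
            FullName = (string)element.Element("FullName"),
            Address = (string)element.Element("Address"),
            PhoneNumber = (string)element.Element("PhoneNumber")
        };
    }
}
```

    Because XElement does the escaping and parsing, the converter at least guarantees that what it produces is well-formed XML, which partially addresses the concern about plain strings.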

    2. Secondly, you have to provide a mapping between your internal identifiers and the identifiers used in the synchronization metadata. From the MSDN documentation:

    "A FeedIdConverter can convert replica IDs and item IDs from the flexible-length format of the provider to strings, and vice versa. Also, the ID converter must be able to generate a replica ID for an anonymous change. The FeedSync history for a change contains three potential attributes: sequence, when, and by. The by attribute represents the replica that made the change, but it is not required by the FeedSync schema and so might be absent. If a change does not include a by value, a replica ID must be generated for the change by combining the sequence and when values"

    public abstract class FeedIdConverter
    {
      protected FeedIdConverter();
      public abstract SyncIdFormatGroup IdFormats { get; }
      public abstract string ConvertItemIdToString(SyncId itemId);
      public abstract string ConvertReplicaIdToString(SyncId replicaId);
      public abstract SyncId ConvertStringToItemId(string value);
      public abstract SyncId ConvertStringToReplicaId(string value);
      public abstract SyncId GenerateAnonymousReplicaId(string when, uint sequence);
    }

    Since the implementation of this converter will be the same most of the time, I think it would be a good idea to provide a default implementation as part of the Sync Framework (it could offer extensibility points through virtual methods).
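
    As a rough idea of what such a default could look like, here is a sketch that assumes both item and replica IDs are 16-byte GUIDs. The class name GuidFeedIdConverter is made up, and deriving the anonymous replica ID from an MD5 hash of the when and sequence values is just one possible way to "combine" them, not the approach prescribed by the framework.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using Microsoft.Synchronization;
using Microsoft.Synchronization.FeedSync;

// Hypothetical default converter: assumes fixed-length GUID identifiers.
public class GuidFeedIdConverter : FeedIdConverter
{
    private readonly SyncIdFormatGroup idFormats;

    public GuidFeedIdConverter()
    {
        idFormats = new SyncIdFormatGroup();
        idFormats.ItemIdFormat.IsVariableLength = false;
        idFormats.ItemIdFormat.Length = 16;
        idFormats.ReplicaIdFormat.IsVariableLength = false;
        idFormats.ReplicaIdFormat.Length = 16;
    }

    public override SyncIdFormatGroup IdFormats
    {
        get { return idFormats; }
    }

    public override string ConvertItemIdToString(SyncId itemId)
    {
        return itemId.GetGuidId().ToString();
    }

    public override string ConvertReplicaIdToString(SyncId replicaId)
    {
        return replicaId.GetGuidId().ToString();
    }

    public override SyncId ConvertStringToItemId(string value)
    {
        return new SyncId(new Guid(value));
    }

    public override SyncId ConvertStringToReplicaId(string value)
    {
        return new SyncId(new Guid(value));
    }

    // Assumption: hash "when|sequence" into 16 bytes to get a stable GUID.
    public override SyncId GenerateAnonymousReplicaId(string when, uint sequence)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(when + "|" + sequence));
            return new SyncId(new Guid(hash));
        }
    }
}
```

    Since the overrides are not sealed, a subclass can still replace any single conversion, which is the kind of extensibility point mentioned above.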

    3. Once you have the item and id converters, they should be enough to start consuming or publishing FeedSync feeds with your custom information. The framework comes with two classes for that purpose, FeedConsumer and FeedProducer.

    public class FeedProducer
    {
      public FeedProducer(KnowledgeSyncProvider storeProvider, FeedIdConverter idConverter, FeedItemConverter itemConverter);
      public FeedIdConverter IdConverter { get; set; }
      public EndpointState IncrementalFeedBaseline { get; set; }
      public FeedItemConverter ItemConverter { get; set; }
      public KnowledgeSyncProvider StoreProvider { get; set; }
      public void ProduceFeed(Stream feedStream);
    }

    public class FeedConsumer
    {
      public FeedConsumer(KnowledgeSyncProvider storeProvider, FeedIdConverter idConverter, FeedItemConverter itemConverter);
      public FeedItemConverter FeedItemConverter { get; set; }
      public FeedIdConverter IdConverter { get; set; }
      public KnowledgeSyncProvider StoreProvider { get; set; }
      public EndpointState ConsumeFeed(Stream feedStream);
    }

    As you can see, the constructors of those classes basically receive the converters along with a sync repository (KnowledgeSyncProvider), which knows the specific details of the underlying data source (for this example, it should know about the customer data in the database). All the magic is done by the ProduceFeed and ConsumeFeed methods, which write a feed to and read a feed from an existing stream, respectively. Again, I have three ideas to improve these classes:

    a. They are tied to RSS/Atom; they should be extensible enough to support any document payload.

    b. The underlying stream must contain an existing RSS/Atom feed in order to save or read items on it. This is how these classes determine the format of the items. It would be great to have a way to specify a formatter for the items at the moment of saving or reading them, an approach very similar to the one taken by the Syndication API in WCF.

    c. I would like to have better integration with the WCF syndication API; I mean a direct way to expose a REST endpoint for a FeedSync feed. The consumer/producer classes could accept a SyndicationFeedFormatter as input to generate or consume the feed.

    Once the data has been read from the underlying stream into a sync repository, it can be used with the "SyncOrchestrator" (part of the Sync Framework) to initiate a new synchronization session.
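
    Putting the pieces together, the calls involved can be sketched as follows. The helper class and method names (FeedSyncSketch, Publish, Import, Synchronize) are hypothetical; the FeedProducer and FeedConsumer calls use only the members listed above, and the providers, converters and streams are assumed to be supplied by your own code.

```csharp
using System.IO;
using Microsoft.Synchronization;
using Microsoft.Synchronization.FeedSync;

public static class FeedSyncSketch
{
    // Writes the items known to the store provider into the given stream.
    // As noted above, the stream is expected to already hold an RSS/Atom feed.
    public static void Publish(KnowledgeSyncProvider storeProvider,
        FeedIdConverter idConverter, FeedItemConverter itemConverter, Stream feedStream)
    {
        var producer = new FeedProducer(storeProvider, idConverter, itemConverter);
        producer.ProduceFeed(feedStream);
    }

    // Reads a FeedSync feed from the stream into the store provider.
    public static EndpointState Import(KnowledgeSyncProvider storeProvider,
        FeedIdConverter idConverter, FeedItemConverter itemConverter, Stream feedStream)
    {
        var consumer = new FeedConsumer(storeProvider, idConverter, itemConverter);
        return consumer.ConsumeFeed(feedStream);
    }

    // Runs a bidirectional session between two providers.
    public static void Synchronize(KnowledgeSyncProvider local, KnowledgeSyncProvider remote)
    {
        var orchestrator = new SyncOrchestrator
        {
            LocalProvider = local,
            RemoteProvider = remote,
            Direction = SyncDirectionOrder.UploadAndDownload
        };
        orchestrator.Synchronize();
    }
}
```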

    Complete Example "Browser Favorites"

    A complete example that illustrates all the steps to implement a custom sync provider and expose its data as a FeedSync feed can be found here, http://blogs.msdn.com/sync/archive/2008/04/08/synchronization-of-browser-favorites-using-feedsync-and-the-microsoft-sync-framework.aspx

    It basically shows how to synchronize browser favorites using FeedSync as the underlying synchronization protocol.

    Do you want to synchronize FeedSync data with clients behind a firewall and NAT?

    Now that is possible thanks to WCF and the Relay service provided in BizTalk Services. Imagine a WCF service that exposes FeedSync items through the Relay service. If we use a WCF duplex contract for that service, it will be able to start a bidirectional synchronization with any client behind NAT and firewalls. I will try to show this functionality soon.

    In the meantime, if you want to know more about WCF duplex callbacks through NAT and firewalls, I recommend this post written by Christian Weyer.


  • Federation Over TCP With WCF

    One of the discussions we had during the last summit with the rest of the "Connected Systems" MVPs was the possibility of supporting a federation scenario over TCP in WCF. For many of us that scenario was possible in theory, but unfortunately no documentation or samples existed to support it. In fact, WCF only comes with a pre-built binding for federation scenarios, "WSFederationHttpBinding", which is completely tied to HTTP.

    For that reason, I decided to give it a shot and try to build some custom bindings that use TCP instead of the commonly used HTTP transport. One curious thing about TCP is that it requires security sessions (SecureConversation with requireSecurityContextCancellation set to "true") in order to work. If you do not configure the binding with those security settings, WCF throws an error message saying that the order of the binding elements is not correct. At the beginning I did not configure it that way, and it took me some time to figure out what the problem was; a better error description would have saved me some time.

    The resulting bindings for client, STS and the sample service were the following (In this sample, the client is authenticating against the service with a client certificate).

    1. Client

    <bindings>
      <customBinding>
        <binding name="STSBinding">
          <security authenticationMode="SecureConversation" requireSecurityContextCancellation="true">
            <secureConversationBootstrap authenticationMode="MutualCertificate"/>
          </security>
          <binaryMessageEncoding/>
          <tcpTransport />
        </binding>
        <binding name="ServiceBinding">
          <security authenticationMode="SecureConversation">
            <secureConversationBootstrap authenticationMode="IssuedToken">
              <issuedTokenParameters tokenType="http://docs.oasis-open.org/wss/oasis-wss-saml-token-profile-1.1#SAMLV1.1">
                <issuer address="net.tcp://localhost:8000/sts" bindingConfiguration="STSBinding" binding="customBinding">
                  <identity>
                    <dns value="STSAuthority"/> <!--Sample Cert for the STS -->
                  </identity>
                </issuer>
              </issuedTokenParameters>
            </secureConversationBootstrap>
          </security>
          <binaryMessageEncoding/>
          <tcpTransport />
        </binding>
      </customBinding>
    </bindings>

    2. STS

    <bindings>
      <customBinding>
        <binding name="MutualCertificateBinding">
          <security authenticationMode="SecureConversation" requireSecurityContextCancellation="true">
            <secureConversationBootstrap authenticationMode="MutualCertificate"/>
          </security>
          <binaryMessageEncoding/>
          <tcpTransport />
        </binding>
      </customBinding>
    </bindings>

    3. Sample Service

    <bindings>
      <customBinding>
        <binding name="SampleService">
          <security authenticationMode="SecureConversation" requireSecurityContextCancellation="true">
            <secureConversationBootstrap authenticationMode="IssuedToken">
              <issuedTokenParameters tokenType="http://docs.oasis-open.org/wss/oasis-wss-saml-token-profile-1.1#SAMLV1.1">
              </issuedTokenParameters>
            </secureConversationBootstrap>
          </security>
          <binaryMessageEncoding/>
          <tcpTransport />
        </binding>
      </customBinding>
    </bindings>

    The STS and the service are not both required to use the TCP transport to communicate with the client, which is a cool thing because now we can combine different transports in a whole federation scenario. For instance, we can have HTTP communication between the client and the STS, and TCP communication between the client and the final service.
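
    The same binding stack can also be built in code, which makes it easy to see how the transport is swapped. The sketch below is a rough code equivalent of the STS binding shown earlier (secure conversation bootstrapped with mutual certificates); the class and method names are made up, and the text encoding chosen for the HTTP variant is an assumption rather than part of the original sample.

```csharp
using System.ServiceModel.Channels;

public static class FederationBindingSketch
{
    // Builds the STS binding over TCP, or over HTTP when useHttp is true.
    public static CustomBinding CreateStsBinding(bool useHttp)
    {
        // Bootstrap with mutual certificates, then layer a security session on top.
        SecurityBindingElement bootstrap =
            SecurityBindingElement.CreateMutualCertificateBindingElement();
        SecurityBindingElement security =
            SecurityBindingElement.CreateSecureConversationBindingElement(bootstrap, true);

        if (useHttp)
        {
            return new CustomBinding(
                security,
                new TextMessageEncodingBindingElement(),
                new HttpTransportBindingElement());
        }

        return new CustomBinding(
            security,
            new BinaryMessageEncodingBindingElement(),
            new TcpTransportBindingElement());
    }
}
```

    Only the encoding and transport elements at the bottom of the stack change; the security element, which is what the federation scenario depends on, stays the same.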

    The complete sample is available to download from here.

     


  • My first MVP summit

    As an MVP rookie, the only thing I can say is that the event met all the expectations I had before traveling to Seattle; the wait was worth it. I attended a lot of interesting sessions about WCF, WWF, and OSLO among others, which gave me a complete picture of where Microsoft is headed with respect to service-oriented applications.

    Another interesting aspect of the summit was the opportunity to meet other MVPs in person, very smart people and community leaders like Scott Hanselman, Roy Osherove (IXMLSerializable), Sam Gentile, Jesus Rodriguez or Roman Kiss (the brain behind many of the cool WSE, WWF and WCF samples that you can find around), to name a few.

    I look forward to attending the summit again next year :).


  • Windows Live Contacts API

    With the boom in social networking, many web sites have started offering new tools for building or expanding your initial network of contacts.

    For instance, sites like Facebook or LinkedIn provide a tool to get your Windows Live contacts and use them to more easily find or invite those people into your social networks.

    Although the idea is very good, the current implementation contains some serious security issues from my point of view. The user has to somehow enter his Windows Live ID credentials into those sites (using a custom HTTP form) so they can log into Windows Live and get the user's contacts.

    There is no particular difference between this approach and a phishing web site created by a malicious user with the sole intention of getting your personal information.

    Not everyone (myself included) trusts these sites enough to give them valuable Windows Live ID credentials. These sites are very well known, but we cannot be sure who will end up using our credentials (or even whether the sites are storing them somewhere).

    This security issue could be basically solved if Windows Live provided two fundamental things:

    1. HTTP REST services to get the user's contacts or other personal information.

    2. An authentication mechanism for those services based on security tokens, something similar to what OAuth provides, so the user never has to enter his credentials in a site other than Microsoft's again. Another advantage of this protocol is that the user ultimately decides whether he authorizes third-party sites to get his personal information. If you are curious about how OAuth works behind the scenes, an excellent "Beginner's guide" is available here.

    Fortunately, the Windows Live team has made excellent progress on these two aspects. On one hand, they have developed a "Windows Live ID Delegated Authentication SDK" to integrate application providers through a protocol pretty similar to OAuth (I haven't had enough time yet to take a more detailed look at this SDK).

    From the MSDN site,

    "Delegated Authentication is based on a block of information, called a consent token, that is provided to your Web site by the Windows Live ID service for a given resource provider (such as contacts and photos). To obtain a consent token for use at a particular resource provider, you must first request it from the user by means of the Windows Live ID consent service. Your application must then manage the authentication data that is returned. For detailed information about how to request and manage consent, see the Windows Live Delegated Authentication SDK."

    On the other hand, as part of the "Windows Live User Data APIs", they have started providing REST services that allow Windows Live users to safely and securely share the information they store in Windows Live services. One of these services is what they have called the "Windows Live Contacts API", a REST service that enables developers to programmatically submit queries to, and retrieve results from, the Windows Live Contacts Address Book database service.

    Hopefully, it will only be a matter of time until Facebook or LinkedIn start integrating services like these into their sites for the benefit of all :).


  • MVP Again 2008!!!

    I am so excited; this is the third time I have received the Microsoft MVP award. Thanks to all the people involved in the evaluation process and to my MVP lead, Fernando Garcia Loera.

    I look forward to continuing that collaboration throughout this year.

    Thanks again Microsoft!!!
