Solving big business problems in our little toolbox application. A use case for Project Distributor.
Project Distributor: Introduction to our distributed web service model
So Darren and I have put in about a month now on the Project Distributor website. We are starting to reach that critical point where the site is pretty cool, we have plenty of users, we are thinking about running out of the allowable bandwidth for the demo site, and all sorts of other things that tend to happen all at once. Now, there are some problems you can design yourself out of, and others that you really have to throw some money at. Our latest enhancements can be summed up in a short list.
- Buy a domain name and start hosting in two places. Project Distributor.com should be up fairly soon to accompany MarkItUp.ASPXConnection.com
- Have people host their own versions of the application. And that means a big source release is in the future. At this juncture risk fragmentation.
- Design away fragmentation with a series of ingenious features that will make everyone want to use the application at hand.
I'm here to talk about the last two, since Darren already bought some additional hosting for us. The concept will be to release a fairly stable version of the application so that groups can host tools, code snippets and other source/binary releases for their teams to share. The application is very lightweight and easy to set-up, so it won't require a bunch of hand holding and configuration to get up and running initially. From our standpoint we solve a number of issues at this juncture. The most obvious problem is what we classify the Lutz Roeder use case. .NET Reflector is the key type of application we'd love to get hosted because it makes it a bit easier to find, not that Google does a bad job, we'd just like to get a bunch of tools in one place, with some features for feedback, new releases, and some cool client tools for publishing.
Now, Lutz would put his application up and he'd whack our bandwidth. He is the prime example of someone that should be hosting their own tools, but possibly using our interface. He doesn't have to, we haven't even asked him yet in fact, but if he decides to do so, then all the better for the web application moving forward. Users such as Lutz probably want a certain level of control over their own sites as well in terms of branding and controlling access. This will only come from hosting the application yourself (and maybe some other features we'll see later).
From a security standpoint many teams will also want to host their own servers. In this manner they get control over the hardware their sources and binaries are stored on. They can accept tools up to any maximum (instead of our imposed limits) and provide unlimited download bandwidth if they choose. Or they can take advantage of our gating mechanisms to make sure their server doesn't get overloaded with downloads and open their tools up to the public.
The only major problem from this source release is that the initial problem we were trying to solve, promoting the visibility of tools, starts to erode. You see, the more sites that host their own tools the harder it is to find the right site with the right tools. We are trying to solve this in a number of ways. The first is allowing users of a site to store bookmarks to other projects and external resources. This is only a temporary fix, because it still doesn't allow a mass search and categorization infrastructure required to truly promote the visibility of the tools being hosted. We have to come up with a solution that brings all of the sites, but we don't want to create just another portal or gateway site. That is boring. Now you have the background, so how will we solve the fragmentation issue?
Designing away Fragmentation
I won't lie to you, I've implemented this model several times, but have never had a project that was capable of really showing off the feature set we are about to talk about. The concept is to unify all of the sites, by allowing them to easily manage views of data from all of the sites combined. Each site owns their own content, maintains their own users, but in turn peers with other sites to obtain additional content.
Web services provide a dual feature set in this model. At the current level they allow us to generate really great client-side tools for managing, well, your tools! We have a drop-client target right now so you can drag and drop new releases to existing projects in just a few seconds. Some new tools for working with build systems to promote the source code up to the server are in the works. We natively integrate with your RSS reader and will have our own alert services in the drop client just in case you don't have one. There aren't any search or local caching features, but those are also planned for the drop client so you can background download new releases, just like Windows Update.
That doesn't solve fragmentation though, that just makes me realize how much work I have left to do. The second feature of web services lies in the ability for each site to aggregate data from the many other sites that are out there hosting the application. Remember, everything we make available at the service layer can also now be remoted. The more caching we put into the data layer, the more performant the entire process will be, and we can even tune the caching depending on whether the data layer is merging off-site contents or database contents.
I'm sure there is another name out there somewhere, but for the past 2 years I've called these peer sites. Each instance of the project distributor will have a number of options allowing for adding peers that will be aggregated and added to the local collection while users traverse the site. The first step is to get the peer sites running in a read-only mode. And set up some really great options so the entire process can be controlled. This solves a number of use case scenarios for us including the following.
- Fragmentation can be mitigated through proper configuration. If everyone aggregates 5 or 6 sites into their peers, then we have a huge network now of interconnected peers and users can pick and choose which one they use for purposes of searching the tool network.
- Peer connections are unidirectional or bidirectional. Access is configurable. Teams can include tools from external sites while keeping their own tools completely private. They can exist behind a DMZ or a private network.
- Users can host their own personal tool sites in the same manner as the team sites. They can configure statically which projects to make available even. In this way you can build a collection of personal tools that you love, and have the latest information automatically update on your machine for your perusal.
Peer sites solve plenty of visibility issues, but that is pretty much all they solve for now. We still want to enable all of the features available to the client tools. After all, the web service methods and proxy infrastructure is in place to do so much more.
Well, we want to solve another problem. That is where you edit your data. A master site is where the users, groups, projects, etc... are all hosted, but thankfully, you'll be able to log in through any site (assuming it is peered with your master site) and then edit your own projects and such. This is a remote principal context and is actually one of the cooler features associated with the peering functionality of project distributor. We'll be fully secure in our login and credentials region, but unfortunately we'll still be transferring data in open text in the short term. Maybe we'll fix that with enough push back.
A clone site is where we empower a site to act on behalf of a master site. For me, my local project distributor is currently cloned to the main project distributor site. What does this mean? Right now it means I get all of the data from PD, and that users who trust my site can log-in to their project distributor accounts and cross edit data. Pretty nice if you ask me. It basically means you can fully host a project distributor installation and never, ever have to install a database server. Users can just act on behalf of a remote server.
This isn't a super reusable model like some of those you read about in the popular software architecture books, and it probably accounts for why master/peer/clone sites don't exist very often. The considerations for every option are heavily customized to the problem being solved, and I'm sure we'll be making modifications or updating the configuration context for a while. Right now you can independently configure your primary server type, whether master or clone, whether or not users can use you for a pass-through authentication and edit server, whether or not web services are enabled so peers can enable unidirectional only communications, setting up asymmetric security credentials. Man, you name it and it is in there
For the peer section we have full and selective modes. A full peer pulls all of the data on the remote peer locally for display (in a delay caching manner, just like you'd expect, unless you set up a scheduled pull which is also possible). I expect most people to configure full peers because they really are really easy to set up and maintain. A selective peer is where you specify the groups/projects that you want to display. This is best for a user setting up their own personal toolbox who wants to select a couple of items from many different peers.
We have an extensively exhaustive configuration module already and we'll be continuously adding more to it. The concept is to easily modify your toolbox to your own designs without having to touch the code. If we haven't given you enough options to satisfy your need then we'll have to make something up, because I'm just about running out ;-)
These are the basics of the model ideas I have for project distributor. That doesn't mean Darren doesn't have other great ideas happening as well. He has some pretty extensive UI enhancements, but I'll let him talk about those. We even have another product idea that is kind of a bolt-on for project distributor, but that is probably a couple of months out putting it into next year. Unfortunately we have too many ideas for our own good right now. Better than not having any ideas I guess. I'll try to drop some code with some of the ideas above, that way you can get a look at how the entire system is implemented. I have some diagrams as well, but I'm far too tired right now to add the img tags to the HTML view.