Around the end of April, I put v11 of POP Forums into production on CoasterBuzz. Probably the biggest feature of that release was all of the new real-time stuff in the forum, with new posts appearing before your eyes and in the topic lists and such. This was all enabled in part by SignalR, the framework that allows for bidirectional communication between the browser and the server over an open connection (or simulated open connection, depending on the browser).
The first problem is that Googlebot appears to be somewhat stupid. While it finds the endpoint, the actual URL that SignalR uses, it seems to have no regard for what data has to be posted to it. That's where the exceptions come from: SignalR doesn't understand the request.
The second problem is that Googlebot understandably expects to get a response and move along. But SignalR likes to keep an open connection so that the client and server bits can talk to each other. That's kind of the whole point. I didn't catch this issue until I used Google Webmaster Tools to see what my load times were looking like. You can very plainly see where I started using SignalR, and when I fixed the problem. Google was hanging on for as long as two seconds.
The reason I looked is that Google was being relentless at one point, banging on the thing hard enough to generate hundreds of exceptions every hour. The fix was easy enough: just put a few lines in your robots.txt file that tell Google to back off:
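The original snippet isn't reproduced here, but a minimal version would look something like this, assuming SignalR is mapped at its default /signalr path (adjust the path to match your own route):

```
User-agent: Googlebot
Disallow: /signalr/
```

With the endpoint disallowed, the crawler stops posting junk at SignalR and stops hanging on those long-lived connections.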
The more you know, the smarter you grow.
[This is actually a repost from my personal blog, but I think the technical audience might “dig” it as well.]
Innovation is hard. You can definitely foster it, but you can't really force it. It's completely fascinating when people innovate in a massively disruptive way. While you can't make innovation happen, it's something I strive for. There are certain ways that I've had a great deal of success innovating, and others where I haven't. Professionally, it's easy to get into the rut of doing things a certain way, because everyone else does it that way. The first step to doing it in a better way often requires questioning the establishment. While my inner rebel is all about that, it's also an exhausting practice.
Coaching volleyball is one of those scenarios where the questioning comes easy. For example, before a match, you're given several minutes of court time to warm up (the actual time depends on the governing organization). Since I was in high school, that time was always used by coaches to send perfectly tossed balls into the air for hitting, while your one or two short defensive specialists tried to dig those hit balls. This results in a lot of "whoo-hoo's" and pleasure on the part of your athletes, but I wasn't sure if it was constructive.
Attacking the ball is always step three in volleyball. Someone has to expertly pass the ball first, then someone has to set it for the hitter. Without those two things, there is no hitting. So after a season or two, I thought, why am I wasting time on this, especially when my kids can't pass to save their lives? So despite the protests of the kids (and parents, who always have the answers), I ditched the hitting lines. I put six kids on the other side of the net, and tossed balls in for them to pass, set and hit. I rotated them around. This exercised all of the skills necessary to score, including the ever-important transition on and off the net. It was real, core to the game, and made a huge difference. It also happened to be noisy and menacing in appearance, which freaked out the other team, so that was a plus.
I tossed out what everyone else was doing, and tried something that seemed to better serve the scenario. I try to do this with all things in life. And yes, it can be exhausting questioning everything, especially if you end up where you started, and "everyone" had it right.
It's a lot harder to innovate your way out of the norm in my line of work. In terms of the actual computer science, sure, there are a lot of things that have been thought to death and they're good ideas. It tends to be the process and the associated people issues that are harder to change. There is an important parallel to the volleyball warm-up, though. It turns out that process is almost always fraught with wasted time for things that don't matter, that don't get to something real and valuable. Even in celebrated (capital "A") Agile practices, teams have a hard time identifying the things they do that aren't adding value, let alone innovating.
Innovation isn't easy, but you can get practice at it. It starts when you stop accepting sheep behavior and ask if there's a better way.
I wrote previously about how I built a "live blog" app in Azure, so we could use it for PointBuzz during last week's festivities at Cedar Point. Not surprisingly, it worked just fine. As I expected, the whole thing was kind of overkill. Sweet, over-provisioned overkill.
The traffic loads we encountered were not a big deal. At one point, we were close to 300 simultaneous connections. We didn't really need Azure to handle that, but the truth is that I wasn't entirely sure what to expect in terms of SignalR and its effect on the servers. What better reason to spin up virtual machines in Azure? I still think that's the biggest magic about cloud services, that you can turn on stuff just when you need it, and pay just for that. It sure beats having to buy or rent a rack of equipment.
The performance was stellar. Average CPU usage never went over 1.5%. I ran two instances of the MVC app as a Web role. I chose this over straight Web sites because of the distributed cache that comes free with it. Of course I didn't really need it, but why not? I didn't do any page rendering timing at the server, because grabbing a few objects out of the cache and making them into a simple page was probably stupid fast, but testing from the park's WiFi, the time from request to last byte (not counting images) was generally under 200 ms. The AJAX calls on scroll for more content were just slightly faster.
The CDN performance was similarly pretty solid. I did a few unscientific tests on that prior to our big day, comparing the latency of the CDN to direct calls to blob storage. Predictably, the first hit to the CDN was always slower as it grabbed the data from storage, but after that, they were about the same. Again, this was not scientific, and I also can't control which point of presence I was hitting on the CDN. This was another feature I certainly didn't need, but figured I would try it since it was there.
We moved a total of 6 gigs of photos that day, which was a lot more than I expected. This isn't a big deal for a few days of activity, but if I were using this (or any of the cloud services) long-term, bandwidth costs would be a concern. They're still a lot higher than the "free" terabytes of transfer you typically get when you rent a box in some giant data center.
At the end of the day, the app proved two things. The first was that SignalR imposes very little overhead, even with hundreds of open connections. The service bus functionality, still in beta, works great to shuttle messages between running instances of the app. The other thing it proved is that you could throw this simple thing up for a big live blog event, like an Apple product announcement, and I bet it would work just fine. I need to find someone willing to take that chance now. :)
So what does all of this overkill cost?
166 compute hours (2 small instances): $13.28
6 gigs out of the CDN: $0.60
18,000 CDN transactions: $0 (it's a dime for a million)
11,000 service bus messages: $0 (it's a buck for a million)
1.2 gigs out from app: $0.10
200 MB storage: < $1
SQL database: < $5 for a month
There are a lot of things that one can find satisfying about building stuff for the Web. For a lot of people, it's probably just the act of building something cool, pretty and useful. These are certainly things to strive for, but for me, the interesting thing has always been to build something that can scale.
Like so many things in life, this particular desire grew out of experience. Very early on, before I was technically getting paid to be a software developer, I learned about scale problems. In the wild west of 2000, I launched CoasterBuzz and did some advertising for it. I was on a shared hosting plan, and the site started to get slow in a hurry. There were a number of things I did poorly, including some recursive database queries, and worse, fetching more data than I needed. You live and learn, as they say, and I got better at it over time.
Many years later, I would have the chance to work on the MSDN/TechNet forums, which served well over 45 million pages per month. It's not lost on me how rare it is for anyone to get to work on a Web app that has to scale to that size. My team was actually there to try to rein it in a little, because it required a huge number of servers to run. There was a lot of low-hanging fruit, and some really hard things to do as well. I didn't directly do a lot of the performance enhancing stuff (though I did pair for it), but I still took a lot away from that experience.
My own sites collectively do 12 to 15 million pages per year, depending on what's going on that year. Respectable, but under any normal circumstances, not a lot. At peak times, that works out to between 6 and 10 pages per second, and less than 1 in off-peak times. It's very rare that my server gets pushed beyond 25% CPU usage (it doesn't hurt that it's total overkill, with four fast cores).
Still, I've noticed that people who work on Web applications don't always think in Web terms. By that, I mean it's not uncommon for them to think in "offline" terms, where time is not nearly as critical. For example, someone who works a typical job doing line-of-business applications doesn't care if they build a SQL query that has a ten-way join over two views. It might take a few seconds (or minutes) to get results, but it doesn't matter for the report it's going to generate. For the Web, that timing matters.
So here are a few of the things that I think people building apps for the Web need to think about. If there are others you can think of, I'd love to hear them! This is not an exhaustive list...
- Denormalize! Disk space is cheap, and disks are huge. Really, it's OK to duplicate data if it means you don't have to do a bunch of expensive database joins. This is even more important now, in an age where we might use different kinds of data storage, like the various flavors of table and blob storage.
- Calculate once. This is perhaps the biggest sin I've seen. You might have a large set of rules on whether or not you should display some piece of data. You have two choices: You can make those calculations every time the data is requested, or you can do it once and store the outcome of that decision. Which is going to be faster? Calculating once, probably in an infrequent data writing situation, or calculating every time, in a frequent read-only situation? I think the answer is pretty obvious.
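To make the "calculate once" point concrete, here's a minimal sketch. The rules and names (a post's visibility, a spam flag) are hypothetical stand-ins, not from any particular app: the point is that the rule evaluation happens on the infrequent write path, and every read is just a property check.

```typescript
// Hypothetical example: a post that is shown only when several rules pass.
interface Post {
  body: string;
  authorBanned: boolean;
  flaggedBySpamFilter: boolean;
  isVisible: boolean; // calculated once, at write time
}

// Run the rules a single time, when the post is saved...
function savePost(body: string, authorBanned: boolean, flagged: boolean): Post {
  const isVisible = !authorBanned && !flagged && body.trim().length > 0;
  return { body, authorBanned, flaggedBySpamFilter: flagged, isVisible };
}

// ...so every read is a cheap filter, not a rule evaluation per request.
function visiblePosts(posts: Post[]): Post[] {
  return posts.filter(p => p.isVisible);
}
```

If the rules change, you recalculate the stored flag in one pass over the data, which is still far cheaper than evaluating them on every read.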
- Use caching, but only when it makes sense. Slapping an extra box with a bunch of memory on your network to store data is a pretty quick way to boost performance. There are some pitfalls to avoid, however. If the data changes frequently, make sure your code to invalidate the cache is well tested. Beware giant object graphs that serialize into gigantic objects that are many times larger than their binary counterparts. If you're caching because of expensive data querying or composition, fix that problem first.
- Don't wait until the end to understand performance. I'll be honest, premature optimization annoys the crap out of me. Developers who waste time on what-ifs and try to code for them drive me nuts. That said, you can't pretend that performance is a last mile consideration. Fortunately, most shops these days are working with continuous integration environments at least as far as staging or testing, so problems should become apparent early on.
- Use appropriate instrumentation. I worked with one company that had a hard time finding the weak spots in its system, because it wasn't obvious where the problems were. Big distributed systems can have a lot of moving parts, and you need insight into how each part talks to the others. For that company, I insisted that we have a dashboard showing the average times and failure rates for calls to an external system. (I also wanted complete logging of every interaction, but didn't get it.) Sure enough, one system was choking on requests at the same point every day, and we could address it.
- Remember your HTTP basics. I'm being intentionally broad here. Think about the size and shape of scripts and CSS (minification and compression), the limits in the number of connections a browser has to any one host, cookies and headers, the very statelessness of what you're doing. The Web is not VB6, regardless of the layers of abstraction you pile on top of it.
These are mostly off the top of my head, but I'd love to hear more suggestions.
If you're a technology nerd, then you've probably seen one technology news site or another do a "live blog" at some product announcement. This is basically a page on the Web where text and photo updates stream into the page as you sit there and soak it in. I don't remember which year these started to appear, but you may recall how frequently they failed. The traffic would overwhelm the site, and down it would go.
So I got to thinking, how would I build something like this? We've got a pretty big media day at Cedar Point coming up for GateKeeper, and it would be fun to post updates in real time. Ars Technica posted an article about how they tackled the problem a couple of months ago, and while elegant, it wasn't how I would do it.
My traffic expectations are lower. I don't expect to get tens of thousands of simultaneous visitors, but a couple thousand is possible. The last time we even had the chance to publish real-time from an event was 2006, for Skyhawk. Maverick got delayed the next year, and that event was scaled back to a few hours in the morning. Still, the server got stressed enough back in 2006 with a lot of open connections, in this case because I was serving video myself, and I wasn't writing the kind of code then that I write today. Regardless, I still wanted to build this using cloud services, as if I were expecting insane traffic. The resulting story, from a development standpoint, is wholly unremarkable, but I'll get to why that's important.
So the design criteria went something like this:
- Be able to add instances on the fly to address rising traffic conditions.
- Update in real-time with open connections, not a browser polling mechanism.
- Keep as much stuff in memory as possible.
- Serve media from a CDN or something not going through the Web site itself.
- Not spend a ton of time on it.
The first thing I did was wire up the bits on the server and client to fire up a SignalR connection, and have an admin push content to the browsers. I won't go deeper into that, because there are plenty of examples around the Internets showing how to do it with a few lines of code. Later in the process, I added the extra line of code and downloaded the package to make SignalR work through Azure Service Bus. This means that if I ran three instances of the app, and the admin pushed content out from one instance, the content would go via the service bus to the other instances, where other users are connected via SignalR. It's stupid easy. Add instances on the fly and make it real-timey: check.
Next, I needed a way to persist the content. Originally I toyed with using table storage for this, because it's cheaper than SQL. However, ordering in a range becomes a problem, because while you can take a number of entities in a time stamp range, and then order them in code, there's no guarantee you'll get that number of entities. After thinking about it, SQL is $5/month for 100 MB, and I was only going to be using it for a few days. Performance differences would likely be negligible, and since I was going to cache the heck out of everything, that was even less important. I used SQL Azure instead.
Instead of using Azure Web Sites, I used Web roles, or "cloud services" as they're labeled in the Azure portal. These are the original PaaS offerings that drew me to Azure in the first place. Sure, they're technically full-blown VMs, and you can even RDP into them, but I like the way they're intended to exist for a single purpose, maintained and updated for you. More to the point, they have Azure Cache available, which is essentially AppFabric spun up across every instance you have. So if you have two instances up, and they use 30% of the memory of these small instances, that adds up to around a gigabyte of distributed cache, for free. Yes, please! I had my data repositories use this cache liberally. The infinite scroll functionality takes n content items after a certain date, which means different people starting up the page at different times will have different "pages" of data, but I cache those pages despite the overlap. Why not? It's cheap! Keep stuff in memory, check.
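That "cache the overlapping pages anyway" idea can be sketched like this. Each infinite-scroll request takes n items after a timestamp, and the cache key is simply that timestamp plus the count, so two visitors with overlapping pages get separate cached entries. The names and shapes here are mine, not from the actual app:

```typescript
interface ContentItem { id: number; published: number; }

// Cache keyed by (after-timestamp, count); overlap between entries is fine.
const pageCache = new Map<string, ContentItem[]>();

function getPage(all: ContentItem[], after: number, count: number): ContentItem[] {
  const key = `page:${after}:${count}`;
  let page = pageCache.get(key);
  if (page === undefined) {
    page = all
      .filter(i => i.published > after)    // everything newer than the cursor
      .sort((a, b) => b.published - a.published) // newest first
      .slice(0, count);
    pageCache.set(key, page); // memory is cheap; duplicated items are fine
  }
  return page;
}
```

The trade-off is deliberate: you burn a little memory on duplicated items in exchange for never recomputing a page that any visitor has already requested.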
The CDN functionality is pretty easy too. I probably didn't need this at all, but why not? Again, for a day or two, given the amount of content, it's not an expensive endeavor. The Azure CDN is simply an extension of your blob storage, so there's little more to do beyond turning it on, adding a CNAME to my DNS, and off we go. CDN, check.
I stole a bunch of stuff from POP Forums, too. Image resizing was already there, the infinite scrolling, the date formatting, the time updating... all copy and paste with a few tweaks. Granted, most of it wasn't used. I didn't do the page design either; my PointBuzz partner Walt did that. Total time into this endeavor was around 10 hours. Not spend a lot of time, check.
Here's the Visio diagram:
As I said, if this sounds unremarkable from a development standpoint, it is, and that's really the point. I'm whipping up and provisioning a long list of technologies without having to buy a rack full of equipment. That's awesome. Think about what this app is using:
- Two Web servers
- Distributed cache (on the Web servers, in this case)
- Database server
- A service bus
- External storage
- A CDN
For the four days or so that I'll use this stuff, it's going to cost me less than twenty bucks. This, my friends, is why cloud infrastructure and platform services get me so excited. We can build these enterprisey things with practically no money at all. Compare this to 2000, when the most cost effective way to run a couple of quasi-popular Web sites was to get a T-1 to my house, where I ran my own box, and paid $1,200 a month for 1.5 mbits. Things are so awesome now.
I'll let you know how it goes after the live event on May 9!
I'm not shy about telling people that I'm not much of a computer science kind of guy. It's not that I don't respect computer science or understand it, I'm just not one to get academic over it to the point of not building anything. And while I can't always remember what the hell SOLID stands for, I do remember that the "I" stands for the "interface segregation principle." It says, "Thou shalt not force everything to use one interface, because specific interfaces are better."
I've seen this principle violated many times, but twice in the last six months I've seen projects that just abuse it to death. The big problem for me is that trying to force everything into a particular interface leads to pain and suffering whenever you want to change something. While developers generally understand what dependency injection is, I find that they're often doing it in a way that violates the interface segregation principle.
For example, I've seen several instances where people are passing in the dependency resolver itself (for MVC, this is one of the IDependencyResolver interfaces) to various classes, and then those classes new up instances of whatever they need. This is bad for two reasons. For one, you're then coding against One Interface to rule them all, and for another, the overhead of mocking and verifying in testing gets bigger. That's no fun.
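Here's a contrived sketch of that anti-pattern and its fix. The resolver and service names are illustrative (MVC's actual IDependencyResolver is a .NET type, not this), but the shape of the problem is the same: the "bad" class hides its real dependency behind the resolver, while the "good" class declares the narrow thing it actually uses.

```typescript
// The anti-pattern: handing the whole resolver to a class, which then
// news up whatever it wants.
interface Resolver { resolve<T>(name: string): T; }

class TopicServiceBad {
  constructor(private resolver: Resolver) {}
  post(text: string): string {
    // Hidden dependency: nothing in the constructor says this needs a filter,
    // so every test has to mock the entire resolver.
    const filter = this.resolver.resolve<(s: string) => string>("profanityFilter");
    return filter(text);
  }
}

// The fix: depend on the narrow thing you actually use.
class TopicServiceGood {
  constructor(private filter: (s: string) => string) {}
  post(text: string): string {
    return this.filter(text);
  }
}
```

Testing the "good" version takes one stub; testing the "bad" one means faking the resolver and verifying the right string key was asked for, which is exactly the mocking overhead described above.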
Another anti-pattern, related to the ISP, is the master do-it-all abstract class. These drive me nuts, too. While an abstract class isn't exactly an interface per se, it ends up being used as one. In an effort to keep the interface concise (as if it's easy to change when it's used in a thousand places), it ends up having one or two methods in order to conform to the base type. This is suboptimal because it keeps you from grouping similar functionality together, it abuses generics (which are fun when you have to bounce between value and reference types), and more to the point, that single interface isn't single for any really good reason. I would rather see you inject whatever functionality you need by way of a specific interface than force One Interface.
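A small, hypothetical sketch of that contrast: one do-it-all base type versus a couple of specific interfaces. None of these names come from a real project; the point is that a consumer that only reads should only have to ask for a reader.

```typescript
// The do-it-all version: every implementor must carry members it may
// not need, just to conform to the base type.
abstract class RepositoryBase<T> {
  abstract get(id: number): T | undefined;
  abstract save(item: T): void;
  abstract search(term: string): T[]; // most implementors don't need this
}

// The segregated version: small interfaces, taken a la carte.
interface Reader<T> { get(id: number): T | undefined; }
interface Writer<T> { save(item: T): void; }

class UserScreen {
  // This class only reads, so it only asks for a Reader.
  constructor(private users: Reader<string>) {}
  title(id: number): string {
    return this.users.get(id) ?? "unknown";
  }
}
```

A test double for UserScreen is a one-line object literal; a double for anything built on RepositoryBase has to stub save and search too, for no good reason.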
Think about it in terms of the mature frameworks that you've used. There aren't a lot of interfaces to be had that are widely used, because they would be hard to change, and force-feed members that don't need to be widely used. When there are widely used interfaces, they're pretty sparse (think IDisposable).
Specific is better. Don't try to cure cancer with an interface.
Get the juicy bits on CodePlex!
This is a significant upgrade that adds lots of real-time features (using SignalR), as well as a number of style improvements. I’m really excited about this version. I have to give a big shout out to the members of the forum on CoasterBuzz who have been so excellent in providing feedback.
This release has no data changes, and is consistent with v10. Views, scripts, content and of course the core library have all been updated.
- Updated to use v4.5 of .NET.
- External references now use NuGet.
- Adding an award definition in admin now bounces you to its edit page.
- Fixed: Show more posts updates topic context with updated page counts.
- Activities and awards restyled.
- User profiles are tabbed.
- Activity feed shows real-time view of activity sent via the scoring game API.
- Times are updated every minute, formatted to current culture.
- More posts are loaded on scroll (a la Facebook), but pager links are maintained for search engine discoverability.
- New posts appear inline at end of post stream as they're made.
- Forum home and individual topic lists updated in real time.
- Breadcrumb/navigation floats at top of browser.
- .forumGrid CSS removes outline, so it's more Metro-y.
None to report yet.
I won’t repost all of the changes here, but this is the version of the app that gets all real-time and stuff (thank you, SignalR!). I also spent some time refining the UI. You can get these naughty bits, and the overall change log, here:
I’ve got the app running in production on CoasterBuzz, and I have to give that community a shout out for being my guinea pigs and an excellent source of feedback.
A few weeks to track down any final issues, and I’ll release the final version.
(I wrote this for my personal blog, but it’s obviously an important topic here.)
I've had the unusual opportunity to hire and manage people on and off starting with my first real job after college. I think it's one of the hardest things to do because it's time-consuming and expensive to make mistakes. When I first had to assemble a team in a consulting gig (I think it was 2005, for context), I found out it's even harder to hire software developers.
First off, check out my former boss, Jonathan, and the talk he did with another guy about how not to do technical interviewing. The irony to people who have had bad experiences interviewing at Microsoft is not lost on me, but Jonathan gets it. Obviously, since he hired me. :) Go rate up his video!
The problem in hiring starts with the fact that resumes don't mean much. You scan for keyword matches, and from there look at the depth and breadth of the experience. If it doesn't smell like bullshit, you move on to a phone screen. From there you further dismiss the fakers. By the time you bring someone in, I would guess that 90% of the time you can already be pretty sure they would be a good fit, and you can have your choice of candidates, provided they like you and your offer.
But it's the screening part that is such a huge burden. The resume part isn't that big of a deal. I can get through a stack of resumes pretty fast and figure out who I want to follow up with. It's the next stage of the screening that takes too damn long and sucks the life out of you.
My typical phone screen is more about gauging the person's knowledge. I don't ask them to identify acronyms like SOLID or DRY (I can never remember what the former stands for), but I can walk them through questions about language and object-oriented patterns and get a pretty good feel for where they are. But even if you're trying to get a faker off of the phone, these still take a half-hour at least, and that's not counting the overhead in agreeing to a time to talk.
If I bring someone in for real interviews, that's going to take at least three hours, including some time working on a real software problem with, you know, a computer. I don't complain about that part; it's the screening process that is a huge burden.
Hiring people, even for something as technical as a software development position, is still largely a problem with human beings. Expectations are set, social contracts have to be followed and of course people have to get along. It just doesn't feel like it should be so inefficient.
First off, job boards are nearly useless. They're just keyword matching devices. The quality of candidates varies by board, but it's still not a great value prop. Staffing agencies add even less value, especially now that it's common for a person sitting in India or China to just roll through a database, making keyword matches, spamming people.
I've been talking with people a lot lately about how to make the discovery and vetting process more efficient. The use cases for smaller to medium sized businesses in particular interest me, since they might not even have someone who is technically proficient enough to make that first critical hire.
I'm open to suggestions. How do you make the discovery and vetting easier? Is there a technical process that could help?
I think I started to mess with HTML in 1995, and the Internets became the focus of my profession in 1999. The fun thing about this is that I’ve watched the tools and development technology evolve most of the way, and it has been an awesome ride.
That said, the Web has only had what I would consider a small number of “wow” moments in terms of development technology. AJAX as a concept was a game changer, but it wasn’t really until jQuery came along that it became stupid easy to perform AJAX tasks. The ASP.NET MVC framework was another great moment, but as it was clearly inspired by other platforms, I don’t know that it’s a big deal outside of my little world. Beyond that, dev tech has been slow and evolutionary.
Until now. SignalR, as far as I’m concerned, is a huge deal. It really does change everything. It lifts the constraint inherent to standard HTTP exchanges, in that we call the server, get something back, and we’re done. Now the client and server can talk to each other, and do so on a potentially massive scale.
At first it sounds like we should credit WebSockets for this, but by itself, that technology is only slightly useful. I say that because browser implementation is not always consistent, and SignalR compensates by gracefully falling back to long “open” connections, or polling, if necessary. It’s also not entirely useful without the backend portion, which handles the registration of clients and the ability to remotely call code at the client end.
There are a great many things that I’m thinking about using it for, not the least of which is POP Forums, my open source project. I’m wrapping up real-time updating right now, in fact, for various forum and topic lists. The amount of code to do it has been trivial. It’s a big deal.
Go try it. You won’t regret it.