Jeff Makes Software

The software musings of Jeff Putz

  • The great Azure outage of 2014

We had some downtime on Tuesday night for our sites, about two hours or so. On one hand, November is the slowest month for the sites anyway, but on the flip side, we pushed a new version of PointBuzz that I wanted to monitor, and I did post a few photos from IAAPA that were worthy of discussion. It doesn't matter either way, because the sites were down and there was nothing I could do about it, thanks to a serious failure in protocol with Microsoft's Azure platform.

    I'm going to try and be constructive here. I'll start by talking about the old days of dedicated hardware. Back in the day, if you wanted to have software running on the Internet, you rented servers. Maybe you had virtual networks between machines, but you still had physical and specific hardware you were running stuff on. If you wanted redundancy, you paid a lot more for it.

I switched to the cloud last summer, after about 16 years in different hosting situations. At one point I had a T-1 and servers at my house (at a grand per month, believe it or not, that was the cheapest solution). Big data centers and cheap bandwidth eventually became normal, and most of that time I was spending $200 or less per month. Still, as a developer, I had to spend a lot of time on things that I didn't care about: patching software, maintaining backups, configuration tasks and so on. It also meant that I would encounter some very vanilla failures, like hard disks going bad or routing problems.

    Indeed, for many years I was at SoftLayer, which is now owned by IBM and was formerly called The Planet. There was usually one instance of downtime every other year. I had a hard drive failure once, a router's configuration broke in a big way, and one time there was even a fire in the data center. Oh, and one time I was down about five hours as they physically moved my aging server between locations (I didn't feel like upgrading... I was getting a good deal). In every case, either support tickets were automatically generated by their monitoring system, or I initiated them (in the case of the drive failure). There was a human I could contact and I knew someone was looking into it.

    I don't like downtime, but I accept that it will happen sometimes. I'm cool with that. In the case of SoftLayer, I was always in the loop and understood what was going on. With this week's Azure outage, that was so far from the case that it was inexcusable. They eventually wrote up an explanation about what happened. Basically they did a widespread rollout of an "improvement" that had a bug, even though they insist that their own protocol prohibits this.

But it was really the communication failure that frustrated most people. Like I said, I think most people can get over a technical failure; they won't like it, but they'll deal with it. What we got were vague Twitter posts about what "may" affect customers, and a dashboard that was completely useless. It said "it's all good" when it clearly wasn't. Worse, if you describe a problem with blob storage but declare websites and VM's all green, even though they depend on storage, you're doing it wrong. Not all customers would know that. If a dependency is down, then that service is down too.
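That dependency argument generalizes to any status dashboard: a service should only report green if everything it depends on is also healthy. A minimal sketch of that rollup logic, in Python, with hypothetical service names and statuses (this is not Azure's actual health model, just an illustration of the principle):

```python
# Hypothetical status rollup: a service is only "green" if it and
# every service it (transitively) depends on are healthy.

DEPENDENCIES = {
    "blob-storage": [],
    "websites": ["blob-storage"],
    "vms": ["blob-storage"],
}

def effective_status(service, raw_status, deps=DEPENDENCIES):
    """Roll a dependency outage up into the dependent service's status."""
    if raw_status[service] != "green":
        return raw_status[service]
    # Walk dependencies; any non-green dependency degrades this service.
    for dep in deps.get(service, []):
        if effective_status(dep, raw_status, deps) != "green":
            return "degraded (dependency down)"
    return "green"

raw = {"blob-storage": "down", "websites": "green", "vms": "green"}
print(effective_status("websites", raw))  # degraded (dependency down)
```

With this rollup, a blob storage outage automatically flips websites and VMs to degraded, which is what customers needed to see that night.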

    The support situation is also frustrating. Basically, there is no support unless you have a billing issue or you pay for it. Think about that for a minute. If something bad happens beyond your control, you have no recourse unless you pay for it. Even cable companies have better support than that (though not by much).

Microsoft has to do better. I think what people really wanted to hear was, "Yeah, we messed up really bad, not just in service delivery, but in the way we communicated." The support situation has to change too. I have two friends now who had VM's more or less disappear, and they couldn't get them back. They had to buy support, which then failed to "find" them. Talk about insult to injury.

Hopefully this is just a growing pain, but from a communication standpoint, a significant problem can't go down like this again.


  • Bootstrap in POP Forums, and why I resisted

    I haven't been writing much lately, in part because I spent a good portion of my free time in the last week overhauling the POP Forums UI to use the Bootstrap framework. You can see what it looks like on the demo site. It took me a long time to cave and do this, but I think I had pretty good reasoning.

The forum app has always been at the core of my personal site projects, chief among them CoasterBuzz. I'm a little meticulous about markup and CSS; I hate having too much of it. I hated using jQuery UI because it felt like bloat. Grid frameworks always seemed to require more markup (and they still do), and global CSS almost always causes trouble with other styles when you drop it in. All prior experiments with these things failed, and let's be honest... a two-column layout is a nut that has long since been cracked and requires very little markup or CSS. Because the forums have to bend a little to match whatever site contains them, there's incentive to keep the CSS light.

    I mostly achieved this. The number of overriding classes was not huge, and more global stuff around common elements mostly worked. You could basically drop the forum inside of a div and be on your way. It even worked pretty well on tablets, and in my last significant set of tweaks back in 2012, doing a responsive design didn't feel like a priority.

And then there was the mobile experience. One of the trade-offs for responsive design is that you typically end up with more markup and CSS instead of less, so I wasn't ready to fully embrace it. Some people still weren't on LTE networks either, so I was a bit conscious of page weight. Since the UI rendering was done by ASP.NET MVC, it was easy to strip down the UI to mobile-specific views, and it only took a few hours to do it for the entire app, as well as CoasterBuzz. I also didn't force it; users could choose mobile with a link at the bottom of the page. In fact, you can see it today on CB if you scroll to the bottom. It's super fast, super lightweight and concentrates on the reading of text. You can debate the merits of separate views vs. responsive all day, but in this case it did exactly what I wanted with very little effort.

    Around the time of that CB release in 2012, Twitter open sourced Bootstrap and it was starting to get popular. Early last year, it seemed like the web in general was starting to adopt its own look and feel, largely due to Bootstrap. It's like the web as an OS started to have a UI style guide. I was finally starting to think seriously about it because its use was so widespread, and they were even baking it into the MVC project templates. Then they released v3, and it broke a lot of stuff. That threw me back into caution mode.

Since that time, several things have motivated me to reconsider Bootstrap. Again, there's the bigger issue of adoption, which has become pretty epic in scope. Then there's the large number of themes available, ranging from free to cheap, and it isn't hard to make your own either. I've also been dissatisfied with mobile ad formats, because they don't pay, and the regular ones aren't well suited to a mobile-specific UI. After two years of phone upgrade cycles, more people have more bandwidth and faster connections. On top of that, the devices themselves are faster at rendering. Oh, and most importantly, Bootstrap itself has very clearly matured. That's pretty compelling.

    So I made the revisions and committed them. The admin pages haven't been updated, but I'll get there. I feel like this gives me a good fresh start to make more changes and continue to see its evolution.


  • I moved my Web sites to Azure. You won't believe what happened next!

    TL;DR: I eventually saved money.

    I wrote about the migration of my sites, which is mostly CoasterBuzz and PointBuzz, from a dedicated server to the various Azure services. I also wrote about the daily operation of the sites after the move. I reluctantly wrote about the pain I was experiencing, too. What I haven't really talked about is the cost. Certainly moving to managed services and getting out of the business of feeding and caring for hardware is a plus, but the economics didn't work out for the longest time. That frustrated me, because when I worked at Microsoft in 2010 and 2011, I loved the platform despite its quirks.

My hosting history started with a site on a shared service that I paid nearly $50/month for back in 1998. It went up to a dedicated server at more than $650, and then they threatened to boot me for bandwidth, so I started paying a grand a month for a T-1 to my house, plus the cost of hardware. Eventually dedicated servers came down again, and for years were right around $200. The one I had for the last three years was $167. That was the target.

    Let me first say that there is some benefit to paying a little more. While you won't get the same amount of hardware (or the equivalent virtual resources) and bandwidth, you are getting a ton of redundancy for "free," and I think that's a hugely overlooked part of the value proposition. For example, your databases in SQL Azure exist physically in three places, and the cost of maintaining and setting that up yourself is enormous. Still, I wanted to spend less instead of more, because market forces being what they are, it can only get cheaper.

    Here's my service mix:

    • Azure Web Sites, 1 small standard instance
    • Several SQL Azure databases, two of which are well over 2 gigs in size (both on the new Standard S1 service tier)
    • Miscellaneous blob storage accounts and queues
    • A free SendGrid account

    My spend went like this:

    • Month 1: $204
    • Month 2: $176
    • Month 3: $143

    So after two and a half months of messing around and making mistakes, I'm finally to a place where I'm beating the dedicated server spend. Combined with the stability after all of the issues I wrote about previously, this makes me happy. I don't expect the spend to increase going forward, but you might be curious to know how it went down.

    During the first month and a half, only the old web/business tiers were available for SQL Azure. The pricing on these didn't make a lot of sense, because they were based on database size instead of performance. Think about that for a minute... a tiny database that had massive use cost less than a big one that was used very little. The CoasterBuzz database, around 9 gigs, was going to cost around $40. Under the new pricing, it was only $20. That was preview pricing, but as it turns out, the final pricing will be $30 for the same performance, or $15 for a little less performance.

There ended up being another complication when I moved to the new pricing tiers. They were priced such that any instance of a database, spun up for even a minute, incurred a full day's charge. I don't know if it was a technical limitation or what, but it was a terrible idea. You see, when you do an automated export of the database, which I was doing periodically (this was before self-service restore came along), you incurred an entire day's charge for that export copy. Fortunately, they're moving to hourly pricing starting next month.
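To see why the full-day minimum stung, consider a daily export where the database copy only exists for a few minutes. A back-of-the-envelope comparison, using a hypothetical $1/day tier rate (not Azure's actual pricing, just to show the shape of the math):

```python
# Hypothetical comparison of daily-minimum vs. hourly billing for a
# short-lived database copy created by an automated daily export.

DAY_RATE = 1.00          # assumed $/day for the tier, for illustration only
EXPORT_MINUTES = 10      # the copy exists ~10 minutes per export
EXPORTS_PER_MONTH = 30   # one export per day

# Daily-minimum billing: every spin-up incurs a full day's charge.
daily_min_cost = EXPORTS_PER_MONTH * DAY_RATE

# Hourly billing: partial hours round up to one billed hour.
hourly_rate = DAY_RATE / 24
hours_billed = -(-EXPORT_MINUTES // 60)   # ceiling division -> 1 hour
hourly_cost = EXPORTS_PER_MONTH * hours_billed * hourly_rate

print(f"daily minimum: ${daily_min_cost:.2f}/month")  # $30.00/month
print(f"hourly:        ${hourly_cost:.2f}/month")     # $1.25/month
```

Under the daily minimum, the backups alone cost as much as running a second database all month; under hourly billing the same exports are nearly free.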

    I also believe there were some price reductions on the Web sites instances, but I'm not sure. There was a reduction in storage costs, but they're not a big component of the cost anyway. Honestly, I always thought bandwidth was my biggest concern, but that's because much of what I used on dedicated hardware was exporting backups. On Azure, I'm using less than 300 gigs out.

So now that things have evened out and I understand how to deal with all of the unknowns from previous months, coupled with a lot of enhancements the Azure team has been working on, I'm in a good place. It feels like it should not have been so difficult, but Azure has been having an enormous growth and maturity spurt in the last six months or so. It's really been an impressive thing to see.


  • You have a people problem, not a technology problem

    [This is a repost from my personal blog.]

    Stop me if you've heard this one before. Things are going poorly in your world of software development, and someone makes a suggestion.

    "If we just use [framework or technology here], everything will be awesome and we'll cure cancer!"

    I like new and shiny things, and I like to experiment with stuff. I really do. But every time I hear something like the above statement, it's like nails on a chalkboard. You know, most of the NoSQL arguments over the last few years sound like that. It's not that the technology isn't useful or doesn't have a place, but when I'm looking at it from a business standpoint, I have a perfectly good database system, that happens to be relational, that could do the same thing, is installed on my servers, will scale just fine for the use case, and I employ people who already know how to use it. Maybe I have something in production that uses it wrong, but that isn't a technology problem.

I'm sure we're all guilty of this at various points in our careers. We've all walked into situations where there is an existing code base, and we're eager to rewrite it all using the new hotness. It's true, there are often great new alternatives that you could use, but I find it very rare that the technologies in play are inadequate; they're just poorly used. That kind of thing happens because of inexperience, poor process, transient consultants or some combination of all of those things.

    The poor implementation is only a part of the people problem. There is a big layer of failure often caused by process, which is, you know, implemented by people. For example (this is real life, from a previous job), you've come up with this idea of processing events in almost-real-time by queuing them and then handing them off to various arbitrary pieces of code to process, a service bus of sorts. So you look at your toolbox and say, "Well, our servers all run Windows, so MSMQ will be adequate for the queue job." Shortly thereafter, your infrastructure people are like, "No, we can't install that, sorry." And then your release people are like, "Oh, this is a big change, we can't do this." You bang your head against the wall, because all of this kingdom building, throw-it-over-the-wall, lack of collaboration is 100% people problems, not technology. Suggesting some other technology doesn't solve the problem, because it will manifest itself again in some other way.
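For what it's worth, the technical idea in that story is simple, which is the point: the hard part was never the code. A minimal in-process sketch of the queued-event pattern, in Python with hypothetical names (a real deployment would put MSMQ or another broker where the standard-library queue sits):

```python
# Sketch of the queued-event idea: producers enqueue events, and a
# dispatcher hands each one to whatever handlers are registered for
# its type. The stdlib queue stands in for MSMQ; names are hypothetical.
import queue

handlers = {}  # event type -> list of handler functions

def subscribe(event_type, handler):
    """Register a handler for an event type."""
    handlers.setdefault(event_type, []).append(handler)

def dispatch(q):
    """Drain the queue, routing each event to its registered handlers."""
    results = []
    while not q.empty():
        event_type, payload = q.get()
        for handler in handlers.get(event_type, []):
            results.append(handler(payload))
    return results

subscribe("user.signup", lambda name: f"welcome email to {name}")
q = queue.Queue()
q.put(("user.signup", "alice"))
print(dispatch(q))  # ['welcome email to alice']
```

An afternoon's worth of plumbing, in other words; it was the org chart that made it impossible.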

    What do you do about this? Change itself isn't that hard (if you really believe in the Agile Manifesto), but people changing is hard. If you have the authority, you remove the people who can't change. If you don't, then you have to endure a slower process of politicking to get your way. It's slow, but it works. You convince people, one at a time, to see things in a way that removes friction. Get enough people on board, and momentum carries you along so that everyone has to follow (or get off the boat). I knew a manager at Microsoft who was exceptionally good at this, and his career since has largely been to convince teams that there was a better way.

    At a more in-the-weeds level, you get people engaged beyond code. One of the weakest skills people in the software development profession have is connection with the underlying business. Mentor them so that they understand. Explain why the tool you use is adequate when used in the right context and will save the business time and money, compared to a different technology that has more cost associated with learning new skills, licenses or whatever. It's like the urge to buy a new phone every year to have the new hotness... It's fine when it's your money, but not so much when it comes at the cost of your employer.

    As technologists, yes, we want to solve problems with technology. Just don't let that desire obscure the fact that the biggest problems in our line of work are rarely technological in nature.


  • Suppressing FormsAuth redirect when using OWIN external logins

    This is probably the most specific post I’ve written in a long time, but given how long I let it fester, and how much debugging it took to figure out, I figure it’s worth saving someone the time. Last fall you might recall that I did a little bit of reverse engineering, and some cutting and pasting of source code, to use the OWIN-based external authentication stuff, decoupling it from ASP.NET Identity. This was a pretty exciting win for me because I was completely not interested in using yet another auth system in POP Forums, when the one I had was already pretty simple and embedded in some of my own projects.


  • The indie publisher moving to Azure, part 2: operation

About a month ago, I wrote all about my experience migrating my sites off of dedicated hardware and into Azure. I figured I would wait awhile before writing about the daily operation of those sites, so I could gather enough experience to make a meaningful assessment. As I said in the previous post, this is a move that I had been looking forward to making for a good three years or so, ever since I actually worked with Azure from within Microsoft. The pricing finally came down to a point where it made sense for an indie publisher, and here we are.


  • First impression: Xamarin 3

    Almost a year ago, I started to be a lot more interested in Xamarin, since I already was something of a Mac guy writing software for the Microsoft platform. I've been working in Windows VM's via Parallels for years. At work, the firm we were working with to build out our mobile apps was using Xamarin too. For me, it wasn't just about being able to use C# to write apps for iOS and Android, it was the idea that you could share a lot of code. That's pretty exciting.


  • The ugly evolution of running a background operation in the context of an ASP.NET app

If you’re one of the two people who have followed my blog for many years, you know that I’ve been going at POP Forums for almost 15 years now. Publishing it as an open source app has been a big help because it helps me understand how people want to use it, and having it translated to six languages is pretty sweet. Despite this warm and fuzzy group hug, there has been an ugly hack hiding in there for years.
