Why is ASP.NET encoding &’s in script URLs? A tale of looking in entirely the wrong place for the cause of a nonexistent bug.
Several people have reported seeing errors in their logs that seem to be due to requests such as this:
[lots of junk]&amp;
The important part here is the HTML-encoded “&amp;” sequence, which stands for “&” of course. If this exact URL is sent to the server, the server won’t know what to do with the escape sequence (URLs are not supposed to be HTML-encoded on the wire), so the parameters won’t get separated as expected, potentially resulting in a server error. This bug in the toolkit is an example of that: http://ajaxcontroltoolkit.codeplex.com/WorkItem/View.aspx?WorkItemId=13134
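To make the failure mode concrete, here is a minimal Python sketch of a server splitting a query string on “&”. It is a simplified stand-in, not ASP.NET’s actual parser, and the parameter names d and t and their values are made up for illustration.

```python
def split_query(query):
    """Naively split a query string into name/value pairs on '&'.

    A simplified stand-in for server-side parsing; real servers also
    percent-decode values and handle repeated names.
    """
    pairs = {}
    for part in query.split("&"):
        name, _, value = part.partition("=")
        pairs[name] = value
    return pairs


# Hypothetical parameters, as a browser would actually send them.
good = split_query("d=abc&t=123")

# The same query pasted verbatim from the HTML source, still HTML-encoded.
bad = split_query("d=abc&amp;t=123")

print(good)  # {'d': 'abc', 't': '123'}
print(bad)   # {'d': 'abc', 'amp;t': '123'}
```

In the second case the server sees a parameter named “amp;t” and no “t” at all, which is the kind of input a handler can reasonably reject.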
Of course, when people see 500 errors popping up in their server logs, they immediately assume the application is failing for some users. Or that some idiot at Microsoft did something incredibly stupid (that’s what we idiots at Microsoft do after all).
Case in point, a quick peek into the source code of the application’s pages immediately reveals that the script tags generated by ScriptManager do indeed contain these URLs:
So that’s where it came from! See? When I copy this URL into the browser’s URL bar, I do get the same error!
Then ensue various more or less rational reactions such as:
- Correlate the user agent to the faulty requests (which correlates more or less with normal browser usage, i.e. lots of IE and then lots of Firefox, when there is a large enough sample).
- Blame IE6 (lots of these requests come from IE6, hence it must be responsible: IE6 sucks).
- “Fix” ScriptManager and remove the HTML encoding.
Well, by copying the URL from the source view into the URL bar, you did indeed reproduce the problem. A little too well. Better than you realize.
Here is why. All of the errors in your server logs come from people doing precisely what you just did: copying the URL from the source view into the browser’s URL bar. They do it for various reasons: to look at the source code of the scripts, to understand what these weird URLs are, who knows?
But the point is, you will never be able to reproduce these errors during normal use of the application. There is nothing to fix here. The value that gets sent to the server never has the “&amp;” sequence. You can verify it in IE6, you can verify it in any browser on any OS, it will just work.
When putting a URL in an HTML attribute, you should *always* HTML-encode it. It’s the standard, and for good reason (it enables the browser to tell the difference between “&”, “&amp;” and “&amp;amp;”, it enables quotes to be embedded into attributes, etc.).
A consequence of that is that if you’re going to copy the value of one of these attributes from the source view, you should do what the browser does when parsing the HTML: decode the value first (in other words, replace “&amp;” with “&”).
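The round trip can be sketched in a few lines of Python using the standard html module. This is only an illustration of the encode-then-decode principle, not what ScriptManager or any browser literally runs, and the URL below (path and parameter values) is a placeholder.

```python
from html import escape, unescape

# Hypothetical script URL; the path and parameter values are placeholders.
url = "/ScriptResource.axd?d=abc&t=123"

# What belongs in the HTML source: the attribute value is HTML-encoded.
attribute_value = escape(url, quote=True)
tag = '<script src="{}" type="text/javascript"></script>'.format(attribute_value)

# What the browser requests: the attribute value, decoded exactly once.
requested = unescape(attribute_value)

print(tag)
print(requested == url)  # True

# Encoding is what keeps a literal "&amp;" in a URL distinguishable from "&":
# it is written as "&amp;amp;" in the source and decodes back to "&amp;".
literal = escape("&amp;")
print(literal, unescape(literal))
```

Copying attribute_value straight into the URL bar skips the decoding step, which is exactly how the bad requests end up in the logs.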
So yes, people do fail to do that and copy the URL without decoding it. Well, they are not supposed to do that, nor do they need to. The error is normal: it results from a bad URL having been entered manually. Nobody would be surprised to get an error when querying foo.aspx?somenumber=thisisnotanumber, for example. Same thing here. Pretty much.
Of course, this is not entirely trivial to figure out, and I did pull out some of my remaining hair trying to understand what was going on. You tend to trust people when they tell you there is a problem, especially when the description seems to make sense. There is some sort of confirmation bias going on there. But the more I looked at the different pieces of evidence, the more this explanation looked like the most likely one, by far.
But of course, I may be missing something…