Tales from the Evil Empire

Bertrand Le Roy's blog

Why is ASP.NET encoding &’s in script URLs? A tale of looking in entirely the wrong place for the cause of a nonexistent bug.

(c) Bertrand Le Roy 2003

Several people have reported seeing errors in their logs that seem to be due to requests such as this:

/ScriptResource.axd?d=[lots of junk]&amp;t=ffffffffee24147c

The important part here is the HTML-encoded “&amp;” sequence, which stands for “&” of course. If this exact URL is sent to the server, the server won’t know what to do with the escape sequence (URLs are not supposed to be HTML-encoded on the wire), so the parameters won’t get separated as expected, potentially resulting in a server error. This bug in the toolkit is an example of that: http://ajaxcontroltoolkit.codeplex.com/WorkItem/View.aspx?WorkItemId=13134
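To make the failure concrete, here is a minimal Python sketch of a server splitting a query string on "&". It is only an illustration, not ASP.NET's actual parser, and "abc123" is a made-up stand-in for the real (much longer) d value.

```python
def split_query(query: str) -> dict:
    """Split a raw query string on '&' only (illustrative, not ASP.NET's parser)."""
    params = {}
    for pair in query.split("&"):
        key, _, value = pair.partition("=")
        params[key] = value
    return params


# "abc123" is a hypothetical stand-in for the real d value.
good = split_query("d=abc123&t=ffffffffee24147c")
bad = split_query("d=abc123&amp;t=ffffffffee24147c")

print(good)  # {'d': 'abc123', 't': 'ffffffffee24147c'}
print(bad)   # {'d': 'abc123', 'amp;t': 'ffffffffee24147c'}
```

With the literal escape sequence on the wire, the second parameter ends up named "amp;t" instead of "t", so a handler looking for "t" would not find it.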

Of course, when people see 500 errors popping up in their server logs, they immediately assume the application is failing for some users. Or that some idiot at Microsoft did something incredibly stupid (that’s what we idiots at Microsoft do after all).

Case in point, a quick peek into the source code of the application’s pages immediately reveals that the script tags generated by ScriptManager do indeed contain these URLs:

<script src="/ScriptResource.axd?d=[lots of junk]&amp;t=ffffffff8824ac28" type="text/javascript"></script>

So that’s where it came from! See? When I copy this URL into the browser’s URL bar, I do get the same error!

Then ensue various more or less rational reactions such as:

  • Correlate the user agent to the faulty requests (which correlates more or less with normal browser usage, i.e. lots of IE and then lots of Firefox, when there is a large enough sample).
  • Blame IE6 (lots of these requests come from IE6, hence it must be responsible: IE6 sucks).
  • “Fix” ScriptManager and remove the HTML encoding.

Well, by copying the URL from the source view into the URL bar, you did indeed reproduce the problem. A little too well. Better than you realize.

This is why. All of the errors in your server logs come from people doing precisely what you just did: copy the URL from the source view into the browser’s URL bar. They do it for various reasons: look at the source code for the scripts, understand what these weird URLs are, who knows?

But the point is, you will never be able to reproduce these errors during normal use of the application. There is nothing to fix here. The value that gets sent to the server never has the “&amp;” sequence. You can verify it in IE6, you can verify it in any browser on any OS, it will just work.
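One way to check this without opening a browser is to let an HTML parser do the decoding. The sketch below uses Python's standard html.parser module as a stand-in for a browser's parser. That is an assumption: real browsers have their own parsers, but they decode character references in attribute values in the same way. "abc123" is again a hypothetical d value.

```python
from html.parser import HTMLParser


class ScriptSrcCollector(HTMLParser):
    """Collect the decoded src attribute of every <script> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives with character references already decoded.
        if tag == "script":
            for name, value in attrs:
                if name == "src" and value is not None:
                    self.sources.append(value)


markup = ('<script src="/ScriptResource.axd?d=abc123&amp;t=ffffffff8824ac28" '
          'type="text/javascript"></script>')

collector = ScriptSrcCollector()
collector.feed(markup)
collector.close()

print(collector.sources[0])  # /ScriptResource.axd?d=abc123&t=ffffffff8824ac28
```

The value the parser hands back contains a plain "&", and that decoded value is what a browser would use to build the request.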

When putting a URL in an HTML attribute, you should *always* HTML-encode it. It’s the standard, and for good reason (it enables the browser to distinguish between “&”, “&amp;” and “&amp;amp;”, it enables quotes to be embedded into attributes, etc.).
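A short Python sketch of why the encoding matters, using the standard html.escape and html.unescape functions rather than anything ASP.NET-specific. The three sample values are made up for illustration.

```python
import html

# Three different raw values that would be ambiguous without encoding.
raw_values = ["a=1&b=2", "a=1&amp;b=2", "a=1&amp;amp;b=2"]

encoded_values = []
for raw in raw_values:
    encoded = html.escape(raw, quote=True)  # what goes into the attribute
    decoded = html.unescape(encoded)        # what an HTML parser hands back
    encoded_values.append(encoded)
    print(f"{raw!r} -> {encoded!r} -> {decoded!r}")
    assert decoded == raw  # each value survives the round trip unchanged
```

Each raw value gets its own distinct encoded form, and decoding once recovers exactly the value that was put in, which is what lets a parser tell the three apart.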

A consequence of that is that if you’re going to copy the value of one of these attributes from the source view, you should do what the browser does when parsing the HTML: decode the value first (in other words, replace “&amp;” with “&”).

So yes, people do fail to do that and copy the URL without decoding. Well, they are not supposed to do that, nor do they need to do it. The error is normal, it results from a bad URL having been entered manually. Nobody would be surprised to get an error when querying foo.aspx?somenumber=thisisnotanumber for example. Same thing here. Pretty much.

Of course, this is not entirely trivial to figure out, and I did pull out some of my remaining hair trying to understand what was going on. You tend to trust people when they tell you there is a problem, especially when the description seems to make sense. There is some sort of confirmation bias going on there. But the more I looked at the different pieces of evidence, the more this explanation looked like the most likely one, by far.

But of course, I may be missing something…

Comments

Andrei Rinea said:

Haven't thought it could be such a simple thing.. But still, if you would fire up firebug and watch the Net tab you would see a 200 code for the script request.

# June 6, 2009 4:52 AM

Bertrand Le Roy said:

@Andrei: not sure what you're saying here: 200 is OK.

# June 7, 2009 4:16 PM

gunteman said:

The fact that the ampersands should be html encoded (and a lot of other things as well. HTML encoding is not used enough...) is something that should be shouted out loud to the web development community. I often encounter very experienced web developers that are completely oblivious to this fact, and often they even say "yeah, right".

# June 7, 2009 6:06 PM

Herman said:

@Bertrand: Yes, that's what he meant. Looking in Firebug would have already shown you nothing is wrong here, as the request returns a 200 status.

# June 8, 2009 3:11 AM

mattbrooks said:

@Bertrand: Please see this MS Connect issue for a similar but 'real' issue: connect.microsoft.com/.../ViewFeedback.aspx.

# June 15, 2009 6:15 AM

Bertrand Le Roy said:

@Matt: yes, that issue is definitely real, and definitely different from this. The issue you're pointing to is still under active investigation by Dave Reed. I can get you in contact with him if you want details or if you want to help with the investigation. We have a number of hypotheses for this that we're looking at, one of which is browser cache corruption.

# June 15, 2009 12:09 PM

SLa said:

Well, what about when it's in a URL to a 3rd-party site? For example, you generate an image URL to the Flickr site, and ASP.NET keeps insisting on replacing all & with &amp;, so the images just don't work.

# May 16, 2011 11:41 AM

Bertrand Le Roy said:

@SLa: you might want to re-read the post. ASP.NET is doing the right thing here.

# May 17, 2011 5:20 PM