[In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu]
This is the nineteenth in a series of blog posts I’m doing on the upcoming VS 2010 and .NET 4 release.
Today’s post covers a small, but very useful, new syntax feature being introduced with ASP.NET 4 – which is the ability to automatically HTML encode output within code nuggets. This helps protect your applications and sites against cross-site script injection (XSS) and HTML injection attacks, and enables you to do so using a nice concise syntax.
Cross-site script injection (XSS) and HTML encoding attacks are two of the most common security issues that plague web-sites and applications. They occur when hackers find a way to inject client-side script or HTML markup into web-pages that are then viewed by other visitors to a site. This can be used to both vandalize a site, as well as enable hackers to run client-script code that steals cookie data and/or exploits a user’s identity on a site to do bad things.
One way to help mitigate against cross-site scripting attacks is to make sure that rendered output is HTML encoded within a page. This helps ensures that any content that might have been input/modified by an end-user cannot be output back onto a page containing tags like <script> or <img> elements.
How to HTML Encode Content Today
ASP.NET applications (especially those using ASP.NET MVC) often rely on using <%= %> code-nugget expressions to render output. Developers today often use the Server.HtmlEncode() or HttpUtility.Encode() helper methods within these expressions to HTML encode the output before it is rendered. This can be done using code like below:
While this works fine, there are two downsides of it:
- It is a little verbose
- Developers often forget to call the Server.HtmlEncode method – and there is no easy way to verify its usage across an app
New <%: %> Code Nugget Syntax
With ASP.NET 4 we are introducing a new code expression syntax (<%: %>) that renders output like <%= %> blocks do – but which also automatically HTML encodes it before doing so. This eliminates the need to explicitly HTML encode content like we did in the example above. Instead, you can just write the more concise code below to accomplish the exact same thing:
We chose the <%: %> syntax so that it would be easy to quickly replace existing instances of <%= %> code blocks. It also enables you to easily search your code-base for <%= %> elements to find and verify any cases where you are not using HTML encoding within your application to ensure that you have the correct behavior.
Avoiding Double Encoding
While HTML encoding content is often a good best practice, there are times when the content you are outputting is meant to be HTML or is already encoded – in which case you don’t want to HTML encode it again.
ASP.NET 4 introduces a new IHtmlString interface (along with a concrete implementation: HtmlString) that you can implement on types to indicate that its value is already properly encoded (or otherwise examined) for displaying as HTML, and that therefore the value should not be HTML-encoded again. The <%: %> code-nugget syntax checks for the presence of the IHtmlString interface and will not HTML encode the output of the code expression if its value implements this interface. This allows developers to avoid having to decide on a per-case basis whether to use <%= %> or <%: %> code-nuggets. Instead you can always use <%: %> code nuggets, and then have any properties or data-types that are already HTML encoded implement the IHtmlString interface.
Using ASP.NET MVC HTML Helper Methods with <%: %>
For a practical example of where this HTML encoding escape mechanism is useful, consider scenarios where you use HTML helper methods with ASP.NET MVC. These helper methods typically return HTML. For example: the Html.TextBox() helper method returns markup like <input type=”text”/>. With ASP.NET MVC 2 these helper methods now by default return HtmlString types – which indicates that the returned string content is safe for rendering and should not be encoded by <%: %> nuggets.
This allows you to use these methods within both <%= %> code nugget blocks:
As well as within <%: %> code nugget blocks:
In both cases above the HTML content returned from the helper method will be rendered to the client as HTML – and the <%: %> code nugget will avoid double-encoding it.
This enables you to default to always using <%: %> code nuggets instead of <%= %> code blocks within your applications. If you want to be really hardcore you can even create a build rule that searches your application looking for <%= %> usages and flags any cases it finds as an error to enforce that HTML encoding always takes place.
Scaffolding ASP.NET MVC 2 Views
When you use VS 2010 (or the free Visual Web Developer 2010 Express) to build ASP.NET MVC 2 applications, you’ll find that the views that are scaffolded using the “Add View” dialog now by default always use <%: %> blocks when outputting any content. For example, below I’ve scaffolded a simple “Edit” view for an Article object. Note the three usages of <%: %> code nuggets for the label, textbox, and validation message (all output with HTML helper methods):
The new <%: %> syntax provides a concise way to automatically HTML encode content and then render it as output. It allows you to make your code a little less verbose, and to easily check/verify that you are always HTML encoding content throughout your site. This can help protect your applications against cross-site script injection (XSS) and HTML injection attacks.
Hope this helps,