New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

[In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu]

This is the nineteenth in a series of blog posts I’m doing on the upcoming VS 2010 and .NET 4 release.

Today’s post covers a small, but very useful, new syntax feature being introduced with ASP.NET 4 – which is the ability to automatically HTML encode output within code nuggets.  This helps protect your applications and sites against cross-site script injection (XSS) and HTML injection attacks, and enables you to do so using a nice concise syntax.

HTML Encoding

Cross-site script injection (XSS) and HTML injection attacks are two of the most common security issues that plague web-sites and applications.  They occur when hackers find a way to inject client-side script or HTML markup into web-pages that are then viewed by other visitors to a site.  This can be used both to vandalize a site and to let hackers run client-script code that steals cookie data and/or exploits a user’s identity on a site to do bad things.

One way to help mitigate cross-site scripting attacks is to make sure that rendered output is HTML encoded within a page.  This helps ensure that any content that might have been input/modified by an end-user cannot be output back onto a page as live markup such as <script> or <img> elements.

How to HTML Encode Content Today

ASP.NET applications (especially those using ASP.NET MVC) often rely on using <%= %> code-nugget expressions to render output.  Developers today often use the Server.HtmlEncode() or HttpUtility.HtmlEncode() helper methods within these expressions to HTML encode the output before it is rendered.  This can be done using code like below:

image
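The screenshot is not reproduced here. As a minimal sketch of the pattern being described, assuming a hypothetical Model.Message property that holds user-supplied text, explicit encoding inside a <%= %> nugget looks roughly like this:

```aspx
<%-- Model.Message is a hypothetical property used only for illustration --%>
<div class="message">
    <%= Server.HtmlEncode(Model.Message) %>
</div>
```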

While this works fine, it has two downsides:

  1. It is a little verbose
  2. Developers often forget to call the Server.HtmlEncode method – and there is no easy way to verify its usage across an app

New <%: %> Code Nugget Syntax

With ASP.NET 4 we are introducing a new code expression syntax (<%:  %>) that renders output like <%= %> blocks do – but which also automatically HTML encodes it before doing so.  This eliminates the need to explicitly HTML encode content like we did in the example above.  Instead, you can just write the more concise code below to accomplish the exact same thing:

image
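The screenshot is not reproduced here. A minimal sketch of the same idea with the new syntax, again assuming a hypothetical Model.Message property:

```aspx
<%-- Model.Message is a hypothetical property used only for illustration --%>
<div class="message">
    <%: Model.Message %>
</div>
```

If Model.Message held the string <script>alert(1)</script>, the nugget above would write &lt;script&gt;alert(1)&lt;/script&gt; into the page, so the browser displays the text instead of running it.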

We chose the <%: %> syntax so that it would be easy to quickly replace existing instances of <%= %> code blocks.  It also enables you to easily search your code-base for <%= %> elements to find and verify any cases where you are not using HTML encoding within your application to ensure that you have the correct behavior.

Avoiding Double Encoding

While HTML encoding content is often a good best practice, there are times when the content you are outputting is meant to be HTML or is already encoded – in which case you don’t want to HTML encode it again. 

ASP.NET 4 introduces a new IHtmlString interface (along with a concrete implementation: HtmlString) that you can implement on types to indicate that their values are already properly encoded (or otherwise examined) for displaying as HTML, and that therefore the values should not be HTML-encoded again.  The <%: %> code-nugget syntax checks for the presence of the IHtmlString interface and will not HTML encode the output of the code expression if its value implements this interface.  This allows developers to avoid having to decide on a per-case basis whether to use <%= %> or <%: %> code-nuggets.  Instead you can always use <%: %> code nuggets, and then have any properties or data-types that are already HTML encoded implement the IHtmlString interface.
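As a rough sketch of what implementing the interface can look like, here is a small wrapper type. The TrustedMarkup name is hypothetical and only for illustration; IHtmlString and its ToHtmlString() method are the parts that come from System.Web:

```csharp
using System.Web;

// Hypothetical type for illustration: wraps markup that has already been
// encoded or sanitized elsewhere, so <%: %> should emit it unchanged.
public sealed class TrustedMarkup : IHtmlString
{
    private readonly string _html;

    public TrustedMarkup(string html)
    {
        _html = html ?? string.Empty;
    }

    // Used by the <%: %> nugget in place of HTML encoding the ToString() result.
    public string ToHtmlString()
    {
        return _html;
    }

    public override string ToString()
    {
        return _html;
    }
}
```

With this in place, <%: new TrustedMarkup("<em>ok</em>") %> would render the markup as-is, while <%: "<em>ok</em>" %> would still be encoded. For simple cases the built-in HtmlString class does the same job: <%: new HtmlString("<em>ok</em>") %>. Either way, the type only declares that the content is safe; it is up to your code to make sure that it really is.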

Using ASP.NET MVC HTML Helper Methods with <%: %>

For a practical example of where this HTML encoding escape mechanism is useful, consider scenarios where you use HTML helper methods with ASP.NET MVC.  These helper methods typically return HTML.  For example: the Html.TextBox() helper method returns markup like <input type="text"/>.  With ASP.NET MVC 2 these helper methods now by default return HtmlString types, which indicates that the returned string content is safe for rendering and should not be encoded by <%: %> nuggets.

This allows you to use these methods within both <%= %> code nugget blocks:

image

As well as within <%: %> code nugget blocks:

image

In both cases above the HTML content returned from the helper method will be rendered to the client as HTML – and the <%: %> code nugget will avoid double-encoding it.
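The two screenshots are not reproduced here. A minimal sketch of the two usages, assuming a hypothetical "FirstName" field name:

```aspx
<%-- "FirstName" is a hypothetical field name used only for illustration --%>

<%-- Unencoded nugget: the helper's markup is written as-is --%>
<%= Html.TextBox("FirstName") %>

<%-- Encoding nugget: the helper result implements IHtmlString, so it is not encoded again --%>
<%: Html.TextBox("FirstName") %>
```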

This enables you to default to always using <%: %> code nuggets instead of <%= %> code blocks within your applications.  If you want to be really hardcore you can even create a build rule that searches your application looking for <%= %> usages and flags any cases it finds as an error to enforce that HTML encoding always takes place.
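One way such a check could be sketched is a small console program that a build step runs against the project folder. Everything here (the NuggetCheck name, the list of file extensions, the message format) is an assumption for illustration, not a built-in ASP.NET or Visual Studio feature:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical console tool: returns a non-zero exit code if any view file
// still contains an unencoded <%= nugget, so a build step can fail on it.
public static class NuggetCheck
{
    // Assumed set of view file types to scan.
    private static readonly string[] ViewExtensions = { ".aspx", ".ascx", ".master" };

    public static int Main(string[] args)
    {
        // Folder to scan; defaults to the current directory.
        string root = args.Length > 0 ? args[0] : ".";
        int hits = 0;

        var files = Directory.EnumerateFiles(root, "*.*", SearchOption.AllDirectories)
            .Where(f => ViewExtensions.Contains(Path.GetExtension(f), StringComparer.OrdinalIgnoreCase));

        foreach (string file in files)
        {
            int lineNumber = 0;
            foreach (string line in File.ReadLines(file))
            {
                lineNumber++;
                if (line.Contains("<%="))
                {
                    // file(line): error: ... is a format many build tools pick up.
                    Console.Error.WriteLine("{0}({1}): error: use <%: instead of <%=", file, lineNumber);
                    hits++;
                }
            }
        }

        return hits == 0 ? 0 : 1;
    }
}
```

This is a plain text search, so it flags every <%= it sees, including any you left in on purpose. A real rule would probably need a way to whitelist those cases.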

Scaffolding ASP.NET MVC 2 Views

When you use VS 2010 (or the free Visual Web Developer 2010 Express) to build ASP.NET MVC 2 applications, you’ll find that the views that are scaffolded using the “Add View” dialog now by default always use <%: %> blocks when outputting any content.  For example, below I’ve scaffolded a simple “Edit” view for an Article object.  Note the three usages of <%: %> code nuggets for the label, textbox, and validation message (all output with HTML helper methods):

image
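The screenshot is not reproduced here. The relevant part of a scaffolded "Edit" view looks roughly like the sketch below, assuming the Article object has a hypothetical Title property; the exact markup the "Add View" dialog generates may differ:

```aspx
<%-- Sketch only: Title is a hypothetical property on the Article model --%>
<% using (Html.BeginForm()) { %>
    <fieldset>
        <legend>Fields</legend>

        <div class="editor-label">
            <%: Html.LabelFor(model => model.Title) %>
        </div>
        <div class="editor-field">
            <%: Html.TextBoxFor(model => model.Title) %>
            <%: Html.ValidationMessageFor(model => model.Title) %>
        </div>

        <p>
            <input type="submit" value="Save" />
        </p>
    </fieldset>
<% } %>
```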

Summary

The new <%: %> syntax provides a concise way to automatically HTML encode content and then render it as output.  It allows you to make your code a little less verbose, and to easily check/verify that you are always HTML encoding content throughout your site.  This can help protect your applications against cross-site script injection (XSS) and HTML injection attacks. 

Hope this helps,

Scott

Published Tuesday, April 06, 2010 11:57 PM by ScottGu

Comments

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 3:30 AM by Tom Robinson

Will it be possible to extend this so that it uses libraries like AntiXSS instead? See: http://antixss.codeplex.com/

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 3:43 AM by shiju

Hi Scott,

The new <%: %> is cool. Thanks for sharing this.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 3:55 AM by Sergey Lutay

Very useful. Scott thanks!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 4:19 AM by zhe

Nice feature. Thanks

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 4:28 AM by ASP.NET MvcPager

Very cool feature that I always forget to use!!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 4:30 AM by Magnus Markling

Great post, that will make things easier!

But what the interface really means, is that the class CAN return an HTML encoded string (via the ToHtmlString() method), rather than that it already IS encoded. It actually gives a choice between ToString() and ToHtmlString(). If this is correct, I found that part of your blog post as well as the MSDN entry a bit misleading.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 4:32 AM by Pete S

Looks like a good feature for MVC. In Web Forms, is there any way to incorporate this with the <%# %> databinding syntax? It would be very useful if I could have my Eval statements automatically encoded like this.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 4:32 AM by Andrei Rinea

I wonder why Scott hasn't mentioned that : is like = but seen from a side. I first read about this funny way of remembering it at Haack.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 5:13 AM by Helge Kalnes

This is absolutely good news. But how about databinding syntax <%# %>?

We are planning to upgrade a couple of ASP.NET 2.0 Web Forms sites where databinding is used heavily, so it would be even better news if there were some simplifications on that area as well.

Thanx!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 5:34 AM by shiju

Will it work on ASP.NET MVC 2 with VS 2008?

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 5:40 AM by atagaew

HtmlString type is really great!

Thanks for sharing this!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 6:11 AM by Nick Masao

Wish there was a way to make this work with ASP.NET MVC 2 on VS 2008.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 7:52 AM by Scott Kuhn

Excellent addition; thank you!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 8:29 AM by Ben

Holy crap!  I didn't know about this at all! Sadly, I'm working with ASP.Net 4. Makes me wonder what else I'm missing and where I can find out...

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 8:35 AM by DavisBains

Hello Scott,

 What if you wanted the user to post HTML but also wanted to remove any malicious scripts (Javascript)?

Thank you

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 8:44 AM by Jay

Nice work Scott!!!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 9:02 AM by Nathan

Is this going to work with data binding as well?.. ie <%#:Eval("blah")%>  ?

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 9:12 AM by Elijah Manor

I like how Scott Hanselman described the new syntax during one of his Mix10 talks... "It's like the <%= syntax, but you are looking at the equals character from the side"

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 9:17 AM by Greg

In "classic" ASP.NET, I can never remember the existing code nuggets and Intellisense is of no help. Does this version improve code expression discoverability, or are they still a quasi-hidden feature as far as the ASPX editor is concerned?

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 9:23 AM by ignatandrei

It is simple and effective!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 9:52 AM by James Alexander

I've been working w/ .NET 4 and MVC 2 for a while now and had no idea! Thanks for dropping this bit of awesome!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 10:22 AM by Fergara

Scott, I never found the explanation for <%# ... %> and <%= ... %>. I do know when to use them, but do not know the difference in syntax.

Thanks.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 11:19 AM by Keith

Great feature, but can we please bring back the light-blue block highlighting from the old ASP/Visual Interdev days? My code now looks like someone is highlighting randomly across the page, and now with this new feature it's just going to get worse...

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 12:58 PM by Dave

Can the Microsoft Anti-XSS library be used to encode the html with this syntax?

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 1:27 PM by Joo Park

Hi Scott,  what if article => article.Content contained malicious code?  Wouldn't avoiding double encoding allow malicious code to get to the client?  <%: HTML.TextBodyFor(article => article.Content) %>

thx!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 2:50 PM by giorgio novello

dear Scott,

can you provide more info about ASP-like tags and how to use them in developer scenarios?

thank you

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 3:37 PM by jdu

I have two remarks:

First, is that IHtmlString interface really required? I personally think that you're adding infrastructure and runtime checks, which aren't required or even worthwhile.

I mean: if a web developer can't understand (or doesn't pay attention to) the difference between <%: and <%= (or more generally between encoded and non-encoded text), he really shouldn't be coding a website.

You should always do <%= Html.TextBoxFor() %> (and failure to do so would be very quickly noticed anyway).

And you should always do <%: Data.SomeUserComment %> (and failure to do so is dangerous, but IHtmlString doesn't help).

If you're unsure whether your expression is html (pass-through) or text (needs to be encoded), then you have a problem with your application.

Secondly, I - like other posters - would love to see a similar shortcut for the databinding tag. Let's say you're pulling user comments from your database through a SqlDataSource, you'll typically do this at some place: <%# Eval("Comments") %>. Today, I have to introduce the HtmlEncode call in that nugget, or bind it to some Literal control with the proper encoding mode set. This is as verbose as your example, and doing something like <%:: (or whatever) to automatically encode it would be just as nice. Please don't forget the old-school ASP.NET... not everyone is doing MVC.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 3:42 PM by Areg Sarkissian

@Joo Park

In your example the HTML.TextBodyFor() helper does indeed HTML encode the article.Content, and because the method returns the encoded result as an HtmlString, the "<%:" nugget will not encode it, avoiding double encoding.

You are confusing two separate things: the encoding done by the Html helper is separate from the encoding done by the "<%:" nugget. The Html helpers always encode their input, but the "<%:" nugget only encodes if a normal string type is passed to it and does not encode if an HtmlString type is passed to it.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 5:14 PM by Ahmed Elbaz

cool

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 6:00 PM by Dave Tigweld

Nice, can't wait for you to add it to Classic ASP ;-)

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 07, 2010 7:37 PM by Kim

GREEEEEEEEAAAAAAAAAAAAAAAATTTTTTTTT

Always great

thank you

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 12:14 AM by Sam Stephens

This is a great feature, as far as it goes. However, HTML encoding is not the only form of encoding out there, web developers will regularly need to encode strings for use in html attributes, URLs, and javascript, to start with.

I'm especially concerned about this as less experienced developers often don't think clearly about the different encodings. Having HTML encoding built into a code expression, but the other encodings only accessible via supporting classes, will only exacerbate this.

A nicer solution would be a scheme such as ERB uses, that supports multiple encodings. Whilst ERB doesn't support all the encoding types I mention above it has these nice syntaxes (from www.stuartellis.eu/.../erb):

This will be HTML escaped: <%= h(this & that) %>

This will be JSON encoded: <%= j(this & that) %>

This will be converted to Textile markup: <%= t(this & that) %>

This will be URL encoded: <%= u(this & that) %>

Whilst these are actually functions with one letter names, a scheme like <%:h for HTML, <%:a for attribute and <%:u for URL would be awesome. And depending on how it was implemented, extensible... I would implement <%:c to escape stuff for CSV downloads, for starters :-)

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 1:56 AM by Andrew

I just broke Stephen Walter comment system by using this tag

<script id="movieTemplate" type="text/html">  

basically using such tag is not correct !

use StringBuilder from MicrosoftAjax.js and inject it with jQuery !

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 6:24 AM by Foysal

less typing is always cool. nice feature.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 8:09 AM by w3cvalidation

Nice information, I really appreciate the way you presented.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 9:24 AM by Ben

Scott,

Just wanted you to know that I've been following this series since about halfway through (and have caught up on what I missed in the first half), and it's doing a terrific job of getting both myself and my coworkers excited about moving up to .NET 4. Thanks so much.

Ben

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Thursday, April 08, 2010 12:55 PM by Bill Xie

Awesome! It should have come earlier, but it's still not too late.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Friday, April 09, 2010 5:10 AM by Lee Timmins

As a Web Forms developer, what about support for <%#?  Also it would be great if the built-in controls encoded things more consistently or preferably not at all.  For example, I've found encoding the Text property on the ListItem class a real pain.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Friday, April 09, 2010 10:59 PM by Thiago Couto

Great! It's easier than checking per-case if html is already encoded :)

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Tuesday, April 13, 2010 1:34 PM by Dennis Gorelik

@Scott: thank you -- this new syntax <%: %> would help to improve quality of ASPX code.

@jdu -- I don't agree that "don't code" is a good solution.

It's good that <%: %> syntax helps to cover for developers mistakes. Such help from development environment helps to speed up development and maintain good quality of the code.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 14, 2010 7:03 AM by sachin

Great

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Monday, April 19, 2010 7:08 AM by marry

excellent:)

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Tuesday, April 27, 2010 7:54 AM by DalSoft

Great! Thanks, I'm glad it's compatible with html helpers, so that developers can default to <%: for most scenarios.

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Friday, April 30, 2010 8:51 AM by Wrongful Death Lawyer

Excellent addition; thank you!

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Friday, May 14, 2010 11:26 AM by Edmond Deuser

Scott,

will they include support for AntiXss in this release ?

# re: New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Friday, May 28, 2010 9:19 AM by mark

Finally. Not encoding by default is/was a huge ASP.NET problem.

How about some other encoding types (some mentioned above),

like javascript string (in html), etc...