New <%: %> Syntax for HTML Encoding Output in ASP.NET 4 (and ASP.NET MVC 2)

Wednesday, April 7, 2010

[In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu]

This is the nineteenth in a series of blog posts I’m doing on the upcoming VS 2010 and .NET 4 release.

Today’s post covers a small, but very useful, new syntax feature being introduced with ASP.NET 4 – which is the ability to automatically HTML encode output within code nuggets. This helps protect your applications and sites against cross-site script injection (XSS) and HTML injection attacks, and enables you to do so using a nice concise syntax.

HTML Encoding

Cross-site script injection (XSS) and HTML injection attacks are two of the most common security issues that plague websites and applications. They occur when hackers find a way to inject client-side script or HTML markup into web pages that are then viewed by other visitors to a site. This can be used both to vandalize a site and to run client-script code that steals cookie data and/or exploits a user’s identity on a site to do bad things.

One way to help mitigate cross-site scripting attacks is to make sure that rendered output is HTML encoded within a page. This helps ensure that any content that might have been input or modified by an end-user cannot be output back onto a page as live markup such as <script> or <img> elements.
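To make the effect of HTML encoding concrete, here is a minimal console sketch. The class name and the sample input are made up for illustration; the only framework call used is HttpUtility.HtmlEncode, which needs a reference to System.Web.dll.

```csharp
using System;
using System.Web; // requires a reference to System.Web.dll

static class EncodeDemo
{
    static void Main()
    {
        // Hypothetical user-supplied value containing markup.
        string userInput = "<script>alert(1)</script>";

        // The angle brackets are replaced with entities, so a browser
        // displays the text instead of executing it as script.
        Console.WriteLine(HttpUtility.HtmlEncode(userInput));
        // prints: &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```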

How to HTML Encode Content Today

ASP.NET applications (especially those using ASP.NET MVC) often rely on using <%= %> code-nugget expressions to render output. Developers today often use the Server.HtmlEncode() or HttpUtility.HtmlEncode() helper methods within these expressions to HTML encode the output before it is rendered. This can be done using code like below:
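The original code screenshot is not reproduced here. A minimal sketch of the pattern, assuming a view whose model exposes a hypothetical Content property holding user-supplied text:

```aspx
<%-- Model.Content is a hypothetical property used for illustration --%>
<div class="content">
    <%= Server.HtmlEncode(Model.Content) %>
</div>
```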

While this works fine, it has two downsides:

It is a little verbose
Developers often forget to call the Server.HtmlEncode method – and there is no easy way to verify its usage across an app

New <%: %> Code Nugget Syntax

With ASP.NET 4 we are introducing a new code expression syntax (<%: %>) that renders output like <%= %> blocks do – but which also automatically HTML encodes it before doing so. This eliminates the need to explicitly HTML encode content like we did in the example above. Instead, you can just write the more concise code below to accomplish the exact same thing:
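The original code screenshot is not reproduced here. As a sketch, the same hypothetical Model.Content property rendered with the new syntax would look something like this:

```aspx
<%-- Model.Content is a hypothetical property used for illustration --%>
<div class="content">
    <%: Model.Content %>
</div>
```

The only change from an explicitly encoded <%= %> expression is the ":" in place of "=" and the removal of the encoding call.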

We chose the <%: %> syntax so that it would be easy to quickly replace existing instances of <%= %> code blocks. It also enables you to easily search your code-base for <%= %> elements to find and verify any cases where you are not using HTML encoding within your application to ensure that you have the correct behavior.

Avoiding Double Encoding

While HTML encoding content is often a good practice, there are times when the content you are outputting is meant to be HTML or is already encoded – in which case you don’t want to HTML encode it again.

ASP.NET 4 introduces a new IHtmlString interface (along with a concrete implementation: HtmlString) that you can implement on types to indicate that their value is already properly encoded (or otherwise examined) for displaying as HTML, and that therefore the value should not be HTML-encoded again. The <%: %> code-nugget syntax checks for the presence of the IHtmlString interface and will not HTML encode the output of the code expression if its value implements this interface. This allows developers to avoid having to decide on a per-case basis whether to use <%= %> or <%: %> code-nuggets. Instead you can always use <%: %> code nuggets, and then have any properties or data-types that are already HTML encoded implement the IHtmlString interface.
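As a rough sketch of what implementing the interface can look like, the class below wraps a piece of text in a <b> element. BoldText is a hypothetical type invented for this example; IHtmlString, its ToHtmlString() method, and HttpUtility.HtmlEncode are the only framework members used.

```csharp
using System.Web;

// Hypothetical type for illustration; not part of ASP.NET.
public sealed class BoldText : IHtmlString
{
    private readonly string _text;

    public BoldText(string text)
    {
        _text = text ?? string.Empty;
    }

    // The value returned here is written to the page without being
    // encoded again, so any untrusted part must be encoded here.
    public string ToHtmlString()
    {
        return "<b>" + HttpUtility.HtmlEncode(_text) + "</b>";
    }
}
```

A view could then write <%: new BoldText(Model.Title) %> (Title being another assumed property) and get the <b> element rendered as markup, with only the inner text encoded. For a string you already trust, wrapping it in the built-in type, as in new HtmlString(value), has the same opt-out effect.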

Using ASP.NET MVC HTML Helper Methods with <%: %>

For a practical example of where this HTML encoding escape mechanism is useful, consider scenarios where you use HTML helper methods with ASP.NET MVC. These helper methods typically return HTML. For example, the Html.TextBox() helper method returns markup like <input type="text"/>. With ASP.NET MVC 2 these helper methods now by default return HtmlString types – which indicates that the returned string content is safe for rendering and should not be encoded by <%: %> nuggets.

This allows you to use these methods within both <%= %> code nugget blocks:

As well as within <%: %> code nugget blocks:
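The original screenshots for these two cases are not reproduced here. A sketch of both forms, assuming a hypothetical form field named "Title":

```aspx
<%-- "=" form: the helper output is written without any encoding --%>
<%= Html.TextBox("Title") %>

<%-- ":" form: the helper output is an HTML string type, so it is not encoded again --%>
<%: Html.TextBox("Title") %>
```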

In both cases above the HTML content returned from the helper method will be rendered to the client as HTML – and the <%: %> code nugget will avoid double-encoding it.

This enables you to default to always using <%: %> code nuggets instead of <%= %> code blocks within your applications. If you want to be really hardcore you can even create a build rule that searches your application looking for <%= %> usages and flags any cases it finds as an error to enforce that HTML encoding always takes place.
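One possible shape for such a build rule is a small console program that runs as a build step and fails when it finds a <%= nugget. This is only a sketch under stated assumptions: the class name, the list of file extensions, and the error message format are all invented here, and the check is a plain text search that will also flag intentional uses.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical build-step tool: exits with 1 if any view contains "<%=".
static class UnencodedNuggetCheck
{
    // Assumed set of view file types to scan.
    static readonly string[] ViewExtensions = { ".aspx", ".ascx", ".master" };

    static int Main(string[] args)
    {
        // The first argument is the project root; default to the current directory.
        string root = args.Length > 0 ? args[0] : ".";
        int hits = 0;

        foreach (string path in Directory.EnumerateFiles(root, "*.*", SearchOption.AllDirectories))
        {
            string ext = Path.GetExtension(path);
            if (!ViewExtensions.Contains(ext, StringComparer.OrdinalIgnoreCase))
                continue;

            string[] lines = File.ReadAllLines(path);
            for (int i = 0; i < lines.Length; i++)
            {
                if (lines[i].Contains("<%="))
                {
                    // Report the file and 1-based line number of each match.
                    Console.Error.WriteLine("{0}({1}): error: unencoded <%= %> nugget", path, i + 1);
                    hits++;
                }
            }
        }

        // A non-zero exit code lets the calling build step fail.
        return hits == 0 ? 0 : 1;
    }
}
```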

Scaffolding ASP.NET MVC 2 Views

When you use VS 2010 (or the free Visual Web Developer 2010 Express) to build ASP.NET MVC 2 applications, you’ll find that the views that are scaffolded using the “Add View” dialog now by default always use <%: %> blocks when outputting any content. For example, below I’ve scaffolded a simple “Edit” view for an Article object. Note the three usages of <%: %> code nuggets for the label, textbox, and validation message (all output with HTML helper methods):
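The screenshot of the scaffolded view is not reproduced here. The fragment below is a rough sketch of what the body of such a view typically contains, assuming the Article model has a hypothetical Title property; the actual scaffold output may differ in detail.

```aspx
<% using (Html.BeginForm()) { %>
    <fieldset>
        <legend>Fields</legend>

        <div class="editor-label">
            <%: Html.LabelFor(model => model.Title) %>
        </div>
        <div class="editor-field">
            <%: Html.TextBoxFor(model => model.Title) %>
            <%: Html.ValidationMessageFor(model => model.Title) %>
        </div>

        <p>
            <input type="submit" value="Save" />
        </p>
    </fieldset>
<% } %>
```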

Summary

The new <%: %> syntax provides a concise way to automatically HTML encode content and then render it as output. It allows you to make your code a little less verbose, and to easily check/verify that you are always HTML encoding content throughout your site. This can help protect your applications against cross-site script injection (XSS) and HTML injection attacks.

Hope this helps,

Scott

37 Comments

Hi Scott,
The new <%: %> is cool. Thanks for sharing this.

shiju - Wednesday, April 7, 2010 7:43:11 AM

Very useful. Scott thanks!

Sergey Lutay - Wednesday, April 7, 2010 7:55:55 AM

Nice feature. Thanks

zhe - Wednesday, April 7, 2010 8:19:19 AM

Very cool feature that I always forget to use!!

ASP.NET MvcPager - Wednesday, April 7, 2010 8:28:16 AM

Great post, that will make things easier!

But what the interface really means, is that the class CAN return an HTML encoded string (via the ToHtmlString() method), rather than that it already IS encoded. It actually gives a choice between ToString() and ToHtmlString(). If this is correct, I found that part of your blog post as well as the MSDN entry a bit misleading.

Magnus Markling - Wednesday, April 7, 2010 8:30:20 AM

Looks like a good feature for MVC. In Web Forms, is there any way to incorporate this with the databinding syntax? It would be very useful if I could have my Eval statements automatically encoded like this.

Pete S - Wednesday, April 7, 2010 8:32:21 AM

I wonder why Scott hasn't mentioned that : is like = but seen from a side. I first read about this funny way of remembering it at Haack.

Andrei Rinea - Wednesday, April 7, 2010 8:32:43 AM

This is absolutely good news. But how about databinding syntax ?

We are planning to upgrade a couple of ASP.NET 2.0 Web Forms sites where databinding is used heavily, so it would be even better news if there were some simplifications on that area as well.

Thanx!

Helge Kalnes - Wednesday, April 7, 2010 9:13:58 AM

Will it work on ASP.NET MVC 2 with VS 2008?

shiju - Wednesday, April 7, 2010 9:34:48 AM

HtmlString type is really great!

Thanks for sharing this!

atagaew - Wednesday, April 7, 2010 9:40:24 AM

Wish there was a way to make this work with ASP.NET MVC 2 on VS 2008.

Nick Masao - Wednesday, April 7, 2010 10:11:21 AM

Excellent addition; thank you!

Scott Kuhn - Wednesday, April 7, 2010 11:52:20 AM

Holy crap! I didn't know about this at all! Sadly, I'm working with ASP.Net 4. Makes me wonder what else I'm missing and where I can find out...

Ben - Wednesday, April 7, 2010 12:29:59 PM

Hello Scott,
What if you wanted the user to post HTML but also wanted to remove any malicious scripts (Javascript)?
Thank you

DavisBains - Wednesday, April 7, 2010 12:35:00 PM

Nice work Scott!!!

Jay - Wednesday, April 7, 2010 12:44:17 PM

Is this going to work with data binding as well?.. ie ?

Nathan - Wednesday, April 7, 2010 1:02:42 PM

I like how Scott Hanselman described the new syntax during one of his Mix10 talks... "It's like the <%= syntax, but you are looking at the equals character from the side"

Elijah Manor - Wednesday, April 7, 2010 1:12:11 PM

I've been working w/ .NET 4 and MVC 2 for a while now and had no idea! Thanks for dropping this bit of awesome!

James Alexander - Wednesday, April 7, 2010 1:52:11 PM

Scott, I never found the explanation for and . I do know when to use them, but do not know the difference in syntax.

Thanks.

Fergara - Wednesday, April 7, 2010 2:22:14 PM

Great feature, but can we please bring back the light-blue block highlighting from the old ASP/Visual Interdev days? My code now looks like someone is highlighting randomly across the page, and now with this new feature it's just going to get worse...

Keith - Wednesday, April 7, 2010 3:19:12 PM

Can the Microsoft Anti-XSS library be used to encode the html with this syntax?

Dave - Wednesday, April 7, 2010 4:58:46 PM

Hi Scott, what if article => article.Content contained malicious code? Wouldn't avoiding double encoding allow malicious code to get to the client? article.Content) %>

thx!

Joo Park - Wednesday, April 7, 2010 5:27:57 PM

dear Scott,
can you make more info circa asp like tags and how to use in developer scenario?

thank you

giorgio novello - Wednesday, April 7, 2010 6:50:33 PM

@Joo Park

In your example the HTML.TextBodyFor() helper does indeed HTML-encode the article.Content, and because the method returns the encoded result as an HtmlString, the "<%:" will not encode it, avoiding double encoding.
You are confusing two separate things: the encoding done by the Html helper is separate from the encoding done by the "<%:" tag. The Html helpers always encode their input, but the "<%:" tag only encodes if a normal string type is passed to it and does not encode if an HtmlString type is passed to it.

Areg Sarkissian - Wednesday, April 7, 2010 7:42:51 PM

Nice, can't wait for you to add it to Classic ASP ;-)

Dave Tigweld - Wednesday, April 7, 2010 10:00:41 PM

GREEEEEEEEAAAAAAAAAAAAAAAATTTTTTTTT
Always great
thank you

Kim - Wednesday, April 7, 2010 11:37:15 PM

less typing is always cool. nice feature.

Foysal - Thursday, April 8, 2010 10:24:47 AM

Nice information, I really appreciate the way you presented.

w3cvalidation - Thursday, April 8, 2010 12:09:00 PM

Scott,

Just wanted you to know that I've been following this series since about halfway through (and have caught up on what I missed in the first half), and it's doing a terrific job of getting both myself and my coworkers excited about moving up to .NET 4. Thanks so much.

Ben

Ben - Thursday, April 8, 2010 1:24:42 PM

awesome! It should come early but still not late yet.

Bill Xie - Thursday, April 8, 2010 4:55:21 PM

As a Web Forms developer what about support for <%#? Also it would be great if the built-in controls encoded things more consistently or preferably not at all. For example I've found encoding the Text property on the ListItem class a real pain.

Lee Timmins - Friday, April 9, 2010 9:10:51 AM

Great! It's easier than checking per case whether the HTML is already encoded :)

Thiago Couto - Saturday, April 10, 2010 2:59:47 AM

@Scott: thank you -- this new syntax would help to improve quality of ASPX code.

@jdu -- I don't agree that "don't code" is a good solution.
It's good that the syntax helps to cover for developers' mistakes. Such help from the development environment helps to speed up development and maintain good quality of the code.

Dennis Gorelik - Tuesday, April 13, 2010 5:34:11 PM

Great! Thanks, I'm glad it's compatible with html helpers, so that developers can default to <%: for most scenarios.

DalSoft - Tuesday, April 27, 2010 11:54:18 AM

Scott,
will they include support for AntiXss in this release ?

Edmond Deuser - Friday, May 14, 2010 3:26:22 PM

Finally. Not encoding by default is/was a huge ASP.NET problem.

How about some other encoding types (some mentioned above),
like javascript string (in html), etc...

mark - Friday, May 28, 2010 1:19:08 PM
