Project Description
Cleans HTML using a whitelist of tags and attributes, and filters attribute values with regular expressions.
It includes a model binder for MVC 3 that detects the UIHint attribute on a model's properties and cleans those properties' values before the action method runs. It is a good tool for normalizing posted HTML.

What this laundry cleans

If your web site has to accept HTML posts (it is probably some kind of blog or forum, or it is simply editable in place by administrators), it suffers from roughly formatted, invalid, and insecure HTML data. Html Laundry can catch and clean the posted HTML before it reaches your action method. You have complete, fine-grained control over what it lets through and what it filters out.

You can use Html Laundry to protect your site from badly formatted posts and to add a layer of protection against XSS.

How it is integrated into MVC 3

If you install it with NuGet, your project gets an AppStart_HtmlLaundry.cs file in its root. This file replaces the default model binder. The new model binder detects the UIHint("html"), DataType(DataType.Html) and AllowHtml attributes on your model's properties and cleans those properties' values as part of the binding process.
So it is automatic; you do not have to do anything. If you would like to fine-tune how it works, you can do so with whitelists and with the WhitelistAttribute on your properties.

How the washing machine works

Html Laundry takes the strict, old-fashioned approach. It treats the input HTML as a string and builds an XML tree from that string. The building process ensures that every HTML element gets its closing tag, so the result is valid XML (or XHTML, if you like). It also ensures that everything that looks like a tag becomes an element in the tree and that all content text remains plain text, with no tricks.
After building the XML tree, Html Laundry takes a whitelist (whitelists are customizable, so you can use different ones for different tasks) and drops every element that does not exist in the whitelist. Dropped elements leave their contents behind in their parent element, so no data is lost. Finally, the XML DOM is converted back to a string. The result is clean HTML that is safe to display back to the user.
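
The process described above can be sketched in a few lines. The following is only a language-neutral illustration written in Python, not Html Laundry's actual C# implementation: the WHITELIST layout, the Node and TreeBuilder classes, and the clean function are all hypothetical names made up for this example. It builds a tree in which every element is closed, unwraps elements that are not whitelisted (keeping their contents), keeps only attributes whose values match a regex, and serializes the result.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> {attribute name -> regex the value must fully match}.
WHITELIST = {
    "p": {},
    "b": {},
    "i": {},
    "a": {"href": re.compile(r"https?://[^\s\"'<>]+")},
}

VOID_TAGS = {"br", "hr", "img"}  # elements that never take a closing tag


class Node:
    def __init__(self, tag=None, attrs=None):
        self.tag = tag            # None marks the artificial root node
        self.attrs = attrs or []  # list of (name, value) pairs
        self.children = []        # Node instances or plain text strings


class TreeBuilder(HTMLParser):
    """Builds a tree; elements left open in the input are closed implicitly."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node()
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; stray end tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def render(node, whitelist):
    out = []
    for child in node.children:
        if isinstance(child, str):
            out.append(escape(child, quote=False))  # text stays text
            continue
        inner = render(child, whitelist)
        rules = whitelist.get(child.tag)
        if rules is None:
            out.append(inner)  # drop the element, keep its contents
            continue
        attrs = "".join(
            ' {}="{}"'.format(name, escape(value))
            for name, value in child.attrs
            if name in rules and value is not None and rules[name].fullmatch(value)
        )
        if child.tag in VOID_TAGS:
            out.append("<{}{} />".format(child.tag, attrs))
        else:
            out.append("<{}{}>{}</{}>".format(child.tag, attrs, inner, child.tag))
    return "".join(out)


def clean(html, whitelist=WHITELIST):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return render(builder.root, whitelist)


print(clean('<p>Hello <b>world<script>alert(1)</script></p>'))
# prints: <p>Hello <b>worldalert(1)</b></p>
```

In this sketch the unclosed b element is closed automatically, the script element is dropped while its text survives as harmless plain text, and an input such as `<a href="javascript:alert(1)" onclick="x()">link</a>` comes out as `<a>link</a>` because neither attribute passes the whitelist.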

Warning!

Html Laundry can protect against XSS, but I do not guarantee that it will. That is not its main purpose. If you are looking for an XSS protector, try AntiXSS instead. Html Laundry depends heavily on the whitelist, so if you want to be sure it protects your site, you have to test it in your own environment.

Last edited Sep 8, 2011 at 6:44 PM by Tocsi, version 12