I have seen many versions of these, and a lot of the time people expect a bad word to be written out in full, i.e. BADWORD. They sometimes overlook the fact that others will work out this rule and simply bypass it by adding symbols between the letters, i.e. B*A*D*W*O*R*D. Of course, this would not be recognized by simply searching the string for BADWORD.
The technique I have used here relies on a base list in XML. I have created a class called BadWordFilter, which uses the singleton pattern. I do this because the class first has to compile a list of Regexes from the words inside the base XML file, and as I do not want these recompiled at every bad word check, I have opted for the singleton pattern.
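To give an idea of the shape of this, here is a minimal sketch in C#. It is not the actual class: the file name `BadWords.xml`, the `<words><word>…</word></words>` layout, the `ContainsBadWord` method and the simplified pattern builder are all assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical sketch only, not the original BadWordFilter source.
public sealed class BadWordFilter
{
    // Assumed file name and layout: <words><word>badword</word></words>
    private const string WordListPath = "BadWords.xml";

    // Lazy<T> creates the single instance on first access, in a thread-safe way.
    private static readonly Lazy<BadWordFilter> instance =
        new Lazy<BadWordFilter>(() => new BadWordFilter(XDocument.Load(WordListPath)));

    public static BadWordFilter Instance
    {
        get { return instance.Value; }
    }

    // Built once in the constructor and reused for every check.
    private readonly List<Regex> patterns;

    // Private constructor: the only way in is through Instance.
    private BadWordFilter(XDocument wordList)
    {
        patterns = wordList
            .Descendants("word")
            .Select(e => e.Value.Trim())
            .Where(w => w.Length > 0)
            .Select(w => new Regex(BuildPattern(w),
                                   RegexOptions.IgnoreCase | RegexOptions.Compiled))
            .ToList();
    }

    // Stand-in pattern: the letters of the word, in order, with any run of
    // non-letters allowed between them.
    private static string BuildPattern(string word)
    {
        return string.Join("[^a-zA-Z]*",
                           word.Select(c => Regex.Escape(c.ToString())));
    }

    public bool ContainsBadWord(string input)
    {
        return patterns.Any(p => p.IsMatch(input));
    }
}
```

Every caller would then go through `BadWordFilter.Instance.ContainsBadWord(text)`, so the XML is read and the expressions are compiled only once, on first use, rather than at every check.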
For any word in the list, the rendered pattern follows a set trend. So if we look again at BADWORD, the regular expression I have come up with would be as follows.
I create the pattern at runtime. It accepts each letter in either lower or upper case and, ultimately, matches anything that spells our bad word once everything that is not a letter is ignored.
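As a rough illustration of how such a pattern could be generated at runtime, here is a small C# sketch. The `BadWordPattern` class and its `Build` method are hypothetical names, and the exact expression is my assumption rather than the one used by the filter; it also treats only A-Z and a-z as letters.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class BadWordPattern
{
    // Turns each letter into an upper/lower case class and allows any run
    // of non-letters between consecutive letters.
    // Build("bad") gives "[Bb][^A-Za-z]*[Aa][^A-Za-z]*[Dd]".
    public static string Build(string word)
    {
        var letters = word
            .Where(c => (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))
            .Select(c => "[" + char.ToUpperInvariant(c) + char.ToLowerInvariant(c) + "]");

        return string.Join("[^A-Za-z]*", letters);
    }
}
```

With this, `Regex.IsMatch("B*A*D*W*O*R*D", BadWordPattern.Build("BADWORD"))` and `Regex.IsMatch("badword", BadWordPattern.Build("BADWORD"))` both return true. Note that spaces count as non-letters too, so "bad word" would also match; whether that is desirable depends on the word list.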
I have created a simple test page here for you to have a go. Please note that I have only put the really serious words in the list for the purposes of this demonstration. I have not published this list, as I do not think it is necessary. I have used a simple XML structure, so please feel free to copy the code here and add as many bad words as you like.
Example page: http://andrewrea.co.uk/badwordfilter/Default.aspx
The BadWordFilter class
The XML file which I have used is below. Dead simple, but does the job.