Matt is absolutely right, and in my opinion it's also a bit easier to allow only the acceptable characters than to list every bad one.
This is Perl that shows both ways, taken from the CERT site:
#! The first function takes the negative approach.
#! Use a list of bad characters to filter the data
sub FilterNeg {
    my ( $fd ) = @_;    # lexical copy of the argument
    $fd =~ s/[<>\"\'\%\;)(\&+]//g;
    return( $fd ) ;
}
#! The second function takes the positive approach.
#! Use a list of good characters to filter the data
sub FilterPos {
    my ( $fd ) = @_;    # lexical copy of the argument
    $fd =~ tr/A-Za-z0-9\ //dc;
    return( $fd ) ;
}
$Data = "This is a test string<script>";
$Data = &FilterNeg( $Data );
print "$Data\n";
$Data = "This is a test string<script>";
$Data = &FilterPos( $Data );
print "$Data\n";
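On the test string above both filters print the same thing ("This is a test stringscript"), so it doesn't really show the difference. Here is a rough sketch of my own, not from the CERT page, that feeds both functions an input with characters the bad list doesn't mention. The input string is made up for illustration, and the two subs are just copies of the ones above written with "my".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Negative approach: delete characters that appear on the bad list.
sub FilterNeg {
    my ( $fd ) = @_;
    $fd =~ s/[<>\"\'\%\;)(\&+]//g;
    return $fd;
}

# Positive approach: delete every character NOT on the good list
# (c = complement the list, d = delete what matches).
sub FilterPos {
    my ( $fd ) = @_;
    $fd =~ tr/A-Za-z0-9\ //dc;
    return $fd;
}

# Hypothetical input: the backticks and the pipe are not on the bad list.
my $input = 'name`id`|mail<b>';

print FilterNeg( $input ), "\n";    # prints: name`id`|mailb
print FilterPos( $input ), "\n";    # prints: nameidmailb
```

The negative filter lets the backticks and the pipe through because nobody thought to list them. The positive filter drops them without having to know about them, which is why I find it the easier one to get right.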
If you're interested in more, see the article:
"Understanding Malicious Content Mitigation for Web Developers"
at:
http://www.cert.org/tech_tips/malicious_code_mitigation.html/
--ph