Hello,

I'm creating a word-filtering script and I have a set of "bad words" that I currently filter. The problem is that some users are substituting non-ASCII characters to simulate these bad words, so I need a way to filter out non-ASCII characters. Maybe just allow a-z, 0-9 and characters like @, ?, #, %, !, & and ~, for example.

How can I accomplish this?

Thanks!

    I'm not certain how PHP deals with non-ASCII characters, but I bet you could get away with pattern matching. If you want to remove every character outside your allowed set, use preg_replace. If you just want to know whether a given word contains any characters outside the set you specified, use preg_match.
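    For the preg_match case, a minimal sketch might look like the following. The function name has_disallowed_chars is made up for illustration, the allowed set is just the one from your post plus space and period, and the type declarations assume PHP 7 or later. Without the /u modifier the pattern works byte by byte, so every byte of a multibyte UTF-8 character falls outside the set and gets flagged.

```php
<?php
// Hypothetical helper (not a built-in): reports whether $str contains
// any byte outside the allowed character set.
function has_disallowed_chars(string $str): bool {
    // The leading ^ negates the class: match anything NOT listed.
    $pattern = '/[^a-zA-Z0-9@?#%!&~ .]/';
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $str) === 1;
}

var_dump(has_disallowed_chars('hello world!'));   // bool(false)
var_dump(has_disallowed_chars("h\xC3\xA9llo"));   // bool(true), "é" as UTF-8 bytes
```

    You could use a check like this to reject a post outright instead of silently stripping characters from it.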

    Something like this should get rid of characters that aren't in your specified set:

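    function remove_weird_chars($str) {
      // Negated class: matches any character NOT in the allowed set.
      // Note A-Z, not A-z; the range A-z would also allow [ \ ] ^ _ and `.
      $pattern = '/[^a-zA-Z0-9@?#%!&~ .]/';
      // Replace every disallowed character with an empty string.
      $result = preg_replace($pattern, '', $str);
      return $result;
    }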
    