Is there any way to get wordwrap working correctly with UTF-8?

I use this function:

function utf8_wordwrap($str,$len,$what){
# usage: utf8_wordwrap("text",3,"<br>");
# by tjomi4`, thanks to SiMM.
# www.yeap.lv
$from=0;
$i=0;      # loop counter (was uninitialised)
$total=''; # accumulated output (was uninitialised)
# count characters: every byte that is ASCII or a UTF-8 lead byte
$str_length = preg_match_all('/[\x00-\x7F\xC0-\xFD]/', $str, $var_empty);
$while_what = $str_length / $len;
while($i <= round($while_what)){
# skip the first $from characters, then keep the next $len characters
$string = preg_replace('#^(?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$from.'}'.
                       '((?:[\x00-\x7F]|[\xC0-\xFF][\x80-\xBF]+){0,'.$len.'}).*#s',
                       '$1',$str);
$total .= $string.$what;
$from = $from+$len;
$i++;
}
return $total;
}

It works, but the problem is that it counts spaces too, so it splits in the middle of words. For example,

$text = utf8_wordwrap("this is my text",6," ");

returns
this i s my text

    This is a severely convoluted function; I've no idea how it's supposed to work.

    Probably the author didn't realise that:
    - mbstring functions exist
    - preg_replace can take the "u" pattern modifier, which specifies that the pattern and subject are UTF-8

    In the latter case, I've no idea which Unicode characters count as "word" or "non-word" characters in a preg expression.

    Likewise, whitespace is confusing: there are quite a lot of unusual types of whitespace in Unicode.
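
As a rough illustration of the two points above (mbstring and the "u" modifier), here is a minimal sketch of a word wrap that breaks only at whitespace and counts characters rather than bytes. The function name utf8_wordwrap_sketch is made up for this example and is not from the original function. It assumes the mbstring extension is loaded and the input is valid UTF-8. Exactly which Unicode spaces \s matches depends on the PCRE build, so treat that part as an assumption too.

```php
<?php
// Hypothetical helper (not the original author's code): wrap UTF-8 text at
// whitespace so that no line exceeds $width characters where possible.
// Assumes the mbstring extension is available and $str is valid UTF-8.
function utf8_wordwrap_sketch(string $str, int $width, string $break = "\n"): string
{
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8.
    // Which non-ASCII spaces \s matches depends on the PCRE build.
    $words = preg_split('/\s+/u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return $str; // e.g. invalid UTF-8: give the input back unchanged
    }

    $lines = [];
    $line = '';
    foreach ($words as $word) {
        $candidate = ($line === '') ? $word : $line . ' ' . $word;
        // mb_strlen counts characters, not bytes.
        if ($line !== '' && mb_strlen($candidate, 'UTF-8') > $width) {
            $lines[] = $line; // current line is full, start a new one
            $line = $word;
        } else {
            $line = $candidate;
        }
    }
    if ($line !== '') {
        $lines[] = $line;
    }
    // A single word longer than $width is left whole on its own line.
    return implode($break, $lines);
}

echo utf8_wordwrap_sketch("this is my text", 6, "\n"), "\n";
```

With the example from the question this puts "this", "is my" and "text" on separate lines instead of splitting "is" in the middle. Runs of whitespace are collapsed to a single space, which may or may not be what you want.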

    Mark
