Originally posted by Weedpacket
I think what is being asked for is "I have a string of words with spaces between them. What is the word that contains the 18th (or whatever) character?" The answer to that could be found with strpos() (using 18 as the offset parameter), strrpos() (using 18 as the offset parameter since PHP5) and substr().

The question I have is "what if the 18th character is a space? Or punctuation?"

This is what I meant. I want the word returned when I know the position. If the 18th char is a space or something else, it should count back until it finds a word.

    I really shouldn't be writing the code for you, but I wanted to give it a try (itchy fingers), so here's my take on it:

    function parseForWord($str, $pos, $delim = ' ') {
    	$len = strlen($str);
    	//perform bounds checking
    	if ($pos >= 0 && $pos < $len) {
    		//is the character at $pos something other than a delimiter?
    		if ($str[$pos] != $delim) {
    			//$pos marks a character within a word
    			$word_len = 1;
    			//scan left for start of word
    			for ($start = $pos - 1; $start >= 0; $start--) {
    				if ($str[$start] != $delim) {
    					$word_len++;
    				} else {
    					break;
    				}
    			}
    			//scan right to end of word
    			for ($i = $pos + 1; $i < $len; $i++) {
    				if ($str[$i] != $delim) {
    					$word_len++;
    				} else {
    					break;
    				}
    			}
    			//$start + 1 here because of $start-- or an out-of-bounds $start
    			return substr($str, $start + 1, $word_len);
    		} else {
    			//$pos marks position of a delimiter
    			$word_len = 0;
    			//scan left only
    			for ($i = $pos - 1; $i >= 0; $i--) {
    				if ($str[$i] != $delim) {
    					$word_len = 1;
    					break;
    				}
    			}
    			//$word_len = 1 > 0 if (last character of) word found
    			if ($word_len > 0) {
    				//scan left for start of word
    				for ($start = $i - 1; $start >= 0; $start--) {
    					if ($str[$start] != $delim) {
    						$word_len++;
    					} else {
    						break;
    					}
    				}
    				return substr($str, $start + 1, $word_len);
    			} else {
    				return '';
    			}
    		}
    	} else {
    		return false;
    	}
    }

    Note that the position of a character within the string here is numbered from 0.

      Whoooooaaaaa, what a function... I am really, really impressed. I will try to study and learn from it! Is there really no simple solution to my request? Like a backwards strpos?

      I did find this... but sometimes it fails. Your function also returns the whole word if the position is in the middle of a word!! Great!

      Take a look at this one:

      print array_pop(explode(' ',substr($text,0,$positie)));

        Originally posted by starbbs
        like a backwards strpos?

        strpos()
        See also: strrpos()

          is there really no simple solution to my request ?

          Maybe, e.g. with regex, but I'm not sure how to go about constructing one that fits the bill.
          The complexity lies in the way you define words, and deal with them when the position is actually a delimiter's position.

          like a backwards strpos?

          I think you mean the inverse of strpos() (going from a position to a word) rather than a strpos() that searches in reverse.

            Well laserlight... I am really grateful for your help. I did not realise that this request isn't all that simple.

            But did you take a look at my example?
            print array_pop(explode(' ',substr($text,0,$positie)));

              print array_pop(explode(' ',substr($text,0,$positie)));

              At first glance, this might be ok, though you'll have problems with multiple delimiters.
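To make that concrete, here is a small sketch. The sample string and the wrapper name lastWordBefore() are made up for illustration and are not from the thread. explode() produces an empty final element whenever the cut-off string ends in a space, so the one-liner returns an empty string instead of counting back to the previous word.

```php
<?php
// Hypothetical wrapper around the one-liner from the thread.
function lastWordBefore($text, $positie) {
    // array_pop() takes its argument by reference, so store the array first.
    $parts = explode(' ', substr($text, 0, $positie));
    return array_pop($parts);
}

$text = 'one  two'; // note the two spaces between the words

var_dump(lastWordBefore($text, 3)); // cut is "one": the last element is "one"
var_dump(lastWordBefore($text, 5)); // cut is "one  ": the last element is ""
```

The same empty result shows up with a single space whenever the cut ends right after it, so the delimiter case needs handling of its own.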

              The code example I gave you was originally coded by defining words as alphanumeric strings, so ctype_alnum() was used to test instead of a direct comparison with the delimiter.
              This allows you to work with more than one delimiter with just a small modification of the function.

                Obviously, the real code is the content of the second loop. The loops are only there as test harnesses to make sure that the right words are returned for every possible value of $pos.

                $sentence = "Dit is een simpele text die nergens op slaat. I'd write this sentence in Dutch if I knew any Dutch."; 
                
                // Okay, let's test this.
                for($pos = 0; $pos<strlen($sentence); ++$pos)
                	echo $pos,' ',array_pop(explode(' ',substr($sentence,0,$pos))),"\n";
                
                // Not what is desired, I take it.
                
                // Regexps? Okay, let's have a crack with them.
                
                for($pos=0; $pos<strlen($sentence); ++$pos)
                {
                	// Break the sentence into two pieces at $pos. The word we want will be the 
                	// last one on the left. If $pos was inside the word at the time, a fragment
                	// of the word will end up on the right.
                	$left = substr($sentence, 0, $pos);
                	$right = substr($sentence, $pos);
                	// Isolate the last word to the left of $pos and the first word to the right of $pos.
                	// There may be some whitespace after the last word, and there may be some before the first.
                	preg_match('/(\\S*)(\\s*)$/', $left, $last_word_and_space);
                	preg_match('/^(\\s*)(\\S*)/', $right, $space_and_first_word);
                	// We don't need $junk - it's the entire string matched - word and space and all.
                	list($junk, $last_word, $space_on_left) = $last_word_and_space;
                	list($junk, $space_on_right, $first_word) = $space_and_first_word;
                	if($space_on_left=='' && $space_on_right=='')
                	{
                		// $pos lay inside a word, which got bro
                		// ken into two pieces.
                		$word = $last_word.$first_word;
                	}
                	elseif($space_on_right!='')
                	{
                		// $pos was in whitespace
                		//  at the end of a word
                		$word = $last_word;
                	}
                	elseif($space_on_left!='') // Don't really need to test this
                	{
                		// $pos was positioned 
                		// at the start of the word
                		$word = $first_word;
                	}
                	echo $pos,' ',$word,"\n";
                }
                

                (Edit: just noticed vBulletin had turned my \s's and \S's into s's and S's. That's not right.)

                  Thanks for this one... but I really have to make this last code work, because it does not return the word found to the left of the position I know/want?

                  So what does this one do?

                    function parseForWord($str, $pos, $delim = ' ')

                    I also saw that this one returns words like:

                    name,

                    You see the , at the end? I thought it tests for this to filter it out?

                      What about doing a search for the previous delimiter, then using the mid function (substr() in PHP) to return the word using the two positions?
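A minimal sketch of that idea, under some assumptions: the function name wordBetweenDelimiters() is hypothetical, positions are 0-based, the delimiter is a single character, and a position on a delimiter counts back to the previous word as requested earlier in the thread. strrpos() finds the previous delimiter, strpos() the next one, and substr() cuts out what lies between.

```php
<?php
// Hypothetical helper, not from the thread. $delim must be a single character.
function wordBetweenDelimiters($str, $pos, $delim = ' ') {
    $len = strlen($str);
    if ($pos < 0 || $pos >= $len) {
        return false; // position outside the string
    }
    // Step back over delimiters so $pos rests on the last character of a word.
    while ($pos >= 0 && $str[$pos] === $delim) {
        $pos--;
    }
    if ($pos < 0) {
        return ''; // nothing but delimiters at or before the requested position
    }
    // Previous delimiter: search backwards within $str[0..$pos].
    $before = strrpos(substr($str, 0, $pos + 1), $delim);
    $start = ($before === false) ? 0 : $before + 1;
    // Next delimiter: search forwards starting at $pos.
    $after = strpos($str, $delim, $pos);
    $end = ($after === false) ? $len : $after;
    return substr($str, $start, $end - $start);
}

echo wordBetweenDelimiters('Dit is een simpele text', 8), "\n"; // inside "een"
echo wordBetweenDelimiters('Dit is een simpele text', 3), "\n"; // on the space after "Dit"
```

Like the space-delimited parseForWord(), this keeps punctuation attached to the word, because only the delimiter ends a word.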

                        You see the , at the end? I thought it tests for this to filter it out?

                        It doesn't, because you define the delimiter as a space.

                        It does for my own version, because I defined words as alphanumeric strings, rather than as non-delimiters.

                          So you wrote a similar function for yourself? Can you post this?

                            function parseForWord($str, $pos) {
                                $len = strlen($str);
                                //perform bounds checking
                                if ($pos >= 0 && $pos < $len) {
                                    //is the character at $pos part of a word?
                                    if (ctype_alnum($str[$pos])) {
                                        //$pos marks a character within a word
                                        $word_len = 1;
                                        //scan left for start of word
                                        for ($start = $pos - 1; $start >= 0; $start--) {
                                            if (ctype_alnum($str[$start])) {
                                                $word_len++;
                                            } else {
                                                break;
                                            }
                                        }
                                        //scan right to end of word
                                        for ($i = $pos + 1; $i < $len; $i++) {
                                            if (ctype_alnum($str[$i])) {
                                                $word_len++;
                                            } else {
                                                break;
                                            }
                                        }
                                        //$start + 1 here because of $start-- or an out-of-bounds $start
                                        return substr($str, $start + 1, $word_len);
                                    } else {
                                        //$pos marks position of a delimiter
                                        $word_len = 0;
                                        //scan left only
                                        for ($i = $pos - 1; $i >= 0; $i--) {
                                            if (ctype_alnum($str[$i])) {
                                                $word_len = 1;
                                                break;
                                            }
                                        }
                                        //$word_len = 1 > 0 if (last character of) word found
                                        if ($word_len > 0) {
                                            //scan left for start of word
                                            for ($start = $i - 1; $start >= 0; $start--) {
                                                if (ctype_alnum($str[$start])) {
                                                    $word_len++;
                                                } else {
                                                    break;
                                                }
                                            }
                                            return substr($str, $start + 1, $word_len);
                                        } else {
                                            return '';
                                        }
                                    }
                                } else {
                                    return false;
                                }
                            }

                              Ooh! Ooh! Can I play too?

                              function getWord($text, $pos) {
                                  if ($pos > strlen($text)) {
                                      return '';
                                  }
                                  if ($pos < 1)  $pos = 1;

                                  /*  Get a copy of the string up until $pos  */
                                  $tmp = substr($text, 0, $pos);
                                  /*  Match the last word of the temporary string keeping any trailing non-word characters  */
                                  $tmp = preg_replace('/^.*?([\w\'\-]+[^\w\'\-]*)$/s', '\1', $tmp);
                                  /*  Append the second part of the original string back onto our temporary string  */
                                  $tmp .= substr($text, $pos);
                                  /*  Match the word at the beginning of the string and return it  */
                                  return preg_replace('/^([\w\'\-]*).*$/s', '\1', $tmp);
                              }
                              

                              This one counts the first character in the string as position 1. It uses regex's "word" character to find word breaks, which should vary by locale.

                              Edited to include apostrophes and hyphens in words. Also changed PHP tags to CODE since it's nearly impossible to get backslashes right in PHP blocks. grrr

                                Just my two (euro) cents:

                                function getWord2($text, $pos) {
                                  if ($pos > strlen($text)) {
                                    return 'Text is far too short!';
                                  }
                                  preg_match_all('#\b\w+#', substr($text, 0, $pos), $out);
                                  return array_pop($out[0]);
                                } 

                                Using the assertion \b (which consumes no characters) does the magic trick.

                                Edit: This BBcode doesn't like the regex pattern. Hence the CODE tag!

                                  This BBcode doesn't like the regex pattern. Hence the CODE tag !

                                  Yeah, I think it's a bug with the PHP BBcode tag.

                                  There's a bug with your solution though, in that if the position is in the middle of the word, not the whole word is returned (due to the substr())

                                    Originally posted by mtmosier
                                    This one counts the first character in the string as position 1. It uses regex's "word" character to find word breaks, which should vary by locale.

                                    It also breaks on "I'd", a word I deliberately used in my test string because of this. It's also why I didn't do anything about punctuation, since nothing was specified about them in the original problem (even though I asked🙂).

                                      It also breaks on "I'd", a word I deliberately used in my test string because of this.

                                      Quite true. Easily fixed, but then the question becomes: what else is a valid part of a word? A hyphen I suppose, but there must be more. Alternatively, one could simply define what constitutes a character on which to break.

                                      I think I need more detailed specs.
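Until the specs are settled, one option is to leave the word definition as a parameter. The sketch below is only an illustration: wordAt() and its $wordChars argument are hypothetical names, positions are 0-based, and $wordChars is assumed to be the inside of a regex character class that contains no "/" or "]". It collects every run of word characters with its offset and returns the last one that starts at or before $pos, which is either the word containing $pos or the nearest word to its left.

```php
<?php
// Hypothetical helper, not from the thread.
// $wordChars is the inside of a regex character class (assumed free of "/" and "]").
function wordAt($text, $pos, $wordChars = "\\w'-") {
    if ($pos < 0 || $pos >= strlen($text)) {
        return false; // position outside the string
    }
    // Collect every run of word characters together with its byte offset.
    preg_match_all('/[' . $wordChars . ']+/', $text, $m, PREG_OFFSET_CAPTURE);
    $found = '';
    foreach ($m[0] as list($word, $offset)) {
        if ($offset > $pos) {
            break; // this word starts after $pos, so keep the previous one
        }
        $found = $word;
    }
    return $found;
}

echo wordAt("I'd write this", 1), "\n";        // default class keeps the apostrophe
echo wordAt("I'd write this", 2, '\w'), "\n";  // plain \w splits at the apostrophe
echo wordAt('name, address', 4), "\n";         // the comma is not part of the word
```

Changing the definition of a word then means changing one argument rather than the scanning logic.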

                                        There's a bug with your solution though, in that if the position is in the middle of the word, not the whole word is returned (due to the substr())

                                        True. If that's what he needs, it is easy to correct it by adding just one line:

                                        function getWord2($text, $pos) {
                                          if ($pos > strlen($text)) {
                                            return 'Text is far too short!';
                                          }
                                          while ($pos < strlen($text) && $text[$pos] != ' ') $pos++;        //  <--------   added line
                                          preg_match_all('#\b\w+#', substr($text, 0, $pos), $out);
                                          return array_pop($out[0]);
                                        } 

                                        As for the "I'd" problem, PCRE syntax considers it as two words. Which is grammatically correct, I guess (forgive me if I am wrong, English is not my native language!).

                                        If not, change the regex pattern to #\b[\w']+#.

                                        Et voilà. That's all there is to it!
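A quick side-by-side of the two patterns, as a sketch on a made-up sample string. Note that inside a single-quoted PHP string the apostrophe in the character class has to be escaped.

```php
<?php
$text = "I'd write this";

// \w does not match an apostrophe, so "I'd" is split in two.
preg_match_all('#\b\w+#', $text, $plain);
print_r($plain[0]); // I, d, write, this

// With the apostrophe added to the character class, "I'd" stays together.
preg_match_all('#\b[\w\']+#', $text, $withApostrophe);
print_r($withApostrophe[0]); // I'd, write, this
```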