Hi all, for fun (and to learn jquery / ajax) I'm building a word jumble game. Given 5 letters, the user would guess a combination to try to make words. for example acep possible answers would be ace, pace, ape, pea (more?) anyway, I'm using a dictionary api to check the submissions, but I don't want to even check the dictionary if they were to type ave (maybe a typo of ace).

This is what I came up with, I'm wondering if there is possibly a better way:

foreach( str_split($word) as $letter ) {
   if( in_array($letter,$letters) ) {
      $k = array_search($letter,$letters);
      unset($letters[$k]);
   } else {
      $result = array();
      $result['valid'] = FALSE;
      $result['word'] = $word;
      $result['error'] = 'Invalid Letters';
      echo json_encode($result);
      exit;
   }
}

$letters is an array with the current puzzles letter, may contain more than one of a single letter.

    <?php
    //first, you need a string with the allowed letters
    $allowed = implode('',$letters);
    
    // then you can use a regex
    if(preg_match("/[^$allowed]/",$word)){ /* disallowed letters! */ }
    else{ /* all is right with the world */ }

      traq's suggestion will meet your "uses disallowed letters" test, but I have a question: Given the letters "ACEP", would "PEACE" or "PAPA" be acceptable responses?

        @weed no they would not, that's why I didn't use a regular expression myself, I didn't know how to limit it to now more instances of a letter than is in the allowed list. IE given aceelpr replace is acceptable, peeler, while a real word, is not acceptable.

          In that case I'd still go with traq's suggestion as one step; a subsequent step would catch cases where letters are overused. That automatically makes me think of [man]count_chars[/man].

            Couldn't you skip the regexp altogether and simply check the count of each character in the 'guessed' string to make sure that none exceeded the frequency that they occurred in the 'jumbled' string (which could be 0 if they entered a a letter that didn't appear at all)?

              You certainly could skip the regexp but I suspect using it to first look for the presence of an invalid character first will be faster and look cleaner than counting how often they each occur and seeing if their totals are greater than 0 (it's only a suspicion, mind you).

                Guess I figured the overhead of regexp would be more than the overhead of count_chars(). I might be curious enough to do some benchmarks once I get home, but I definitely don't think using both regexps + count_chars would be any cleaner than simply using count_chars()... quick example:

                function check_valid_guess($jumbled, $guess)
                {
                	$jumbled_freq = count_chars($jumbled, 1);
                
                foreach(count_chars($guess, 1) as $char => $freq)
                	if(!isset($jumbled_freq[$char]) || $freq > $jumbled_freq[$char])
                		return false;
                
                return true;
                }

                or, if repeatedly calling isset() is more expensive than initializing an array of length 256:

                function check_valid_guess($jumbled, $guess)
                {
                	$jumbled_freq = count_chars($jumbled);
                
                foreach(count_chars($guess, 1) as $char => $freq)
                	if($freq > $jumbled_freq[$char])
                		return false;
                
                return true;
                }

                EDIT: Of course, the array initialization cost offset/savings could be nonexistent if it's done in both cases... guess it would depend on the C code used to implement count_chars() when the second parameter is 0 vs. 1.

                  Your second code sample is essentially what I came up with, with the regular expression test removed (and therefore necessarily retaining counts of all characters because any one of them might be looked up - what difference that makes depends on just how associative arrays' hashtables work); so if you are going to test, here's another.

                  function jumble_test($source_letters, $candidate)
                  {
                  	if(preg_match("/[^$source_letters]/", $candidate))
                  	{
                  		return false;
                  	}
                  	$source_letters = count_chars($source_letters, 1);
                  	foreach(count_chars($candidate, 1) as $char=>$count)
                  	{
                  		if($count > $source_letters[$char])
                  		{
                  			return false;
                  		}
                  	}
                  	return true;
                  }
                  

                  Note that using 1 for the second parameter when counting the characters in $candidate means the loop will have far fewer elements to iterate over.

                  bradgrafelman wrote:

                  guess it would depend on the C code used to implement count_chars() when the second parameter is 0 vs. 1.

                  	if (mymode < 3) {
                  		array_init(return_value);
                  	}
                  
                  for (inx = 0; inx < 256; inx++) {
                  	switch (mymode) {
                   		case 0:
                  			add_index_long(return_value, inx, chars[inx]);
                  			break;
                   		case 1:
                  			if (chars[inx] != 0) {
                  				add_index_long(return_value, inx, chars[inx]);
                  			}
                  			break;
                  /* ... cases 2, 3, and 4 */
                  		}
                  	}
                  
                    Weedpacket;10994104 wrote:

                    Note that using 1 for the second parameter when counting the characters in $candidate means the loop will have far fewer elements to iterate over.

                    Ah crap, forgot to update that too. Initially, I had reversed the behavior between mode 0 (the default) and mode 1 in my head; while editing my post, I forgot to switch to using mode 1 in both code samples for the 'guessed' parameter. My mistake!

                    On the same topic, however, wouldn't your code generate E_NOTICE level errors if the 'guessed' parameter contained characters that were not present in the 'jumbled' one (hence why I included isset() in the relevant coding example)?

                    At any rate, I managed (after much searching) to locate a sample framework I'd written ages ago to test a series of functions for performance across a set of parameters. Overkill? Yes. But I was bored (both then and now :p), so here is the source:

                    <?php
                    
                    /**************
                     * Test setup *
                     **************/
                    $funcs = array('check_valid_guess_rev_1', 'check_valid_guess_rev_2', 'check_valid_guess_rev_3');
                    $params = array(
                    	array('APEC', 'PEACE'),
                    	array('APEC', 'SPACES'),
                    	array('TOM MARVOLO RIDDLE', 'I AM LORD VOLDEMORT')
                    );
                    $num_times = 25000;
                    $unit_conv = array(1000, 'msec');
                    $precision = 6;
                    $sound = true;
                    
                    /*****************************************
                     * Definitions of functions to be tested *
                     *****************************************/
                    function check_valid_guess_rev_1($jumbled, $guess) 
                    { 
                        $jumbled_freq = count_chars($jumbled, 1); 
                    
                    foreach(count_chars($guess, 1) as $char => $freq) 
                        if(!isset($jumbled_freq[$char]) || $freq > $jumbled_freq[$char]) 
                            return false; 
                    
                    return true; 
                    }
                    
                    function check_valid_guess_rev_2($jumbled, $guess) 
                    { 
                        $jumbled_freq = count_chars($jumbled); 
                    
                    foreach(count_chars($guess, 1) as $char => $freq) 
                        if($freq > $jumbled_freq[$char]) 
                            return false; 
                    
                    return true; 
                    }
                    
                    /* bradgrafelman: renamed version of Weedpacket's jumble_test() */
                    function check_valid_guess_rev_3($source_letters, $candidate) 
                    { 
                        if(preg_match("/[^$source_letters]/", $candidate)) 
                        { 
                            return false; 
                        } 
                        $source_letters = count_chars($source_letters, 1); 
                        foreach(count_chars($candidate, 1) as $char=>$count) 
                        { 
                            if($count > $source_letters[$char]) 
                            { 
                                return false; 
                            } 
                        } 
                        return true; 
                    }
                    
                    /***********************************
                     * !!! END OF EDITABLE REGIONS !!! *
                     ***********************************/
                    
                    
                    
                    /*******************************
                     * Init stat-keeping variables *
                     *******************************/
                    $times = array_fill(0, count($funcs), array());
                    $medians = array_fill(0, count($params), array());
                    
                    /*************
                     * Run tests *
                     *************/
                    for($func_idx = 0; $func_idx < count($funcs); $func_idx++)
                    {
                    	$times[$func_idx] = array_fill(0, count($params), array());
                    
                    for($param_idx = 0; $param_idx < count($params); $param_idx++)
                    {
                    	for($i = 0; $i < $num_times; $i++)
                    	{
                    		$start = microtime(true);
                    		call_user_func_array($funcs[$func_idx], $params[$param_idx]);
                    		$end = microtime(true);
                    
                    		$times[$func_idx][$param_idx][] = ($end - $start) * $unit_conv[0];
                    	}
                    
                    	// Sort stored times
                    	sort($times[$func_idx][$param_idx]) or die('error sorting times');
                    
                    	// Store median value
                    	$medians[$param_idx][$func_idx] =
                    		$times[$func_idx][$param_idx][floor($num_times / 2)];
                    }
                    
                    // Announce completion of testing for function
                    if($sound)
                    	echo "\x07";
                    }
                    
                    /***********************************
                     * Generate output of test results *
                     *      (grouped by function)      *
                     ***********************************/
                    echo "Number of executions: $num_times\n\n";
                    
                    for($func_idx = 0; $func_idx < count($funcs); $func_idx++)
                    {
                    	echo "Function #$func_idx :\n";
                    
                    for($param_idx = 0; $param_idx < count($params); $param_idx++)
                    {
                    	echo "\t$funcs[$func_idx](" . implode(',', $params[$param_idx]) . ")\n";
                    
                    	// Variable rename for brevity
                    	$p = $precision;
                    
                    	printf(
                    		"\t\tMin: %.{$p}f %4\$s\n\t\tMax: %.{$p}f %4\$s\n"
                    			. "\t\tAvg: %.{$p}f %4\$s\n",
                    		$times[$func_idx][$param_idx][0],
                    		$times[$func_idx][$param_idx][$num_times - 1],
                    		(array_sum($times[$func_idx][$param_idx]) / $num_times),
                    		$unit_conv[1]
                    	);
                    }
                    }
                    
                    echo "\n--------------------------------------------------\n\n";
                    
                    /***********************************
                     * Generate output of test results *
                     *     (grouped by parameters)     *
                     *     (based on func medians)     *
                     ***********************************/
                    for($param_idx = 0; $param_idx < count($params); $param_idx++)
                    {
                    	echo "Parameter Group #$param_idx (" . implode(',', $params[$param_idx])
                    		. ") :\n";
                    	echo "\tMedians :\n";
                    
                    // Sort median values
                    asort($medians[$param_idx]) or die('error sorting medians');;
                    
                    foreach($medians[$param_idx] as $func_idx => $median)
                    {
                    	// Variable rename for brevity
                    	$p = $precision;
                    
                    	printf(
                    		"\t\t%s: %.{$p}f %s\n",
                    		$funcs[$func_idx],
                    		$median,
                    		$unit_conv[1]
                    	);
                    }
                    }

                    and the results:

                    Number of executions: 25000
                    
                    Function #0 :
                            check_valid_guess_rev_1(APEC,PEACE)
                                    Min: 0.009775 msec
                                    Max: 0.121832 msec
                                    Avg: 0.011212 msec
                            check_valid_guess_rev_1(APEC,PACES)
                                    Min: 0.010014 msec
                                    Max: 0.446081 msec
                                    Avg: 0.011878 msec
                            check_valid_guess_rev_1(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                                    Min: 0.011921 msec
                                    Max: 0.262976 msec
                                    Avg: 0.013390 msec
                    Function #1 :
                            check_valid_guess_rev_2(APEC,PEACE)
                                    Min: 0.043869 msec
                                    Max: 0.850916 msec
                                    Avg: 0.046381 msec
                            check_valid_guess_rev_2(APEC,PACES)
                                    Min: 0.044823 msec
                                    Max: 0.217199 msec
                                    Avg: 0.046985 msec
                            check_valid_guess_rev_2(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                                    Min: 0.044823 msec
                                    Max: 0.602007 msec
                                    Avg: 0.047439 msec
                    Function #2 :
                            check_valid_guess_rev_3(APEC,PEACE)
                                    Min: 0.013828 msec
                                    Max: 0.666142 msec
                                    Avg: 0.015597 msec
                            check_valid_guess_rev_3(APEC,PACES)
                                    Min: 0.007868 msec
                                    Max: 0.097036 msec
                                    Avg: 0.009406 msec
                            check_valid_guess_rev_3(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                                    Min: 0.016928 msec
                                    Max: 0.356197 msec
                                    Avg: 0.018463 msec
                    
                    --------------------------------------------------
                    
                    Parameter Group #0 (APEC,PEACE) :
                            Medians :
                                    check_valid_guess_rev_1: 0.010967 msec
                                    check_valid_guess_rev_3: 0.015020 msec
                                    check_valid_guess_rev_2: 0.046015 msec
                    Parameter Group #1 (APEC,PACES) :
                            Medians :
                                    check_valid_guess_rev_3: 0.009060 msec
                                    check_valid_guess_rev_1: 0.011921 msec
                                    check_valid_guess_rev_2: 0.046968 msec
                    Parameter Group #2 (TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT) :
                            Medians :
                                    check_valid_guess_rev_1: 0.013113 msec
                                    check_valid_guess_rev_3: 0.017881 msec
                                    check_valid_guess_rev_2: 0.046968 msec

                    @: It would appear your suspicions were correct; your function (rev_3) beat my faster function (rev_1) in group #1 where an invalid character was used.

                    EDIT: ... and I'm stupid. Forgot that the whole point of your use of the regexp was such that the count_chars() was only done for guessed strings that didn't contain any invalid letters. Oopsie.

                    Regardless, removing the isset() only affected the function's median time for param group #2 with the given precision (presumably because it was a longer string and the isset() was called more times).

                      Just for thoroughness I ran this on my PC plus the function I made. Seems to be significantly better with count_chars :

                      function validateLetters($word,$letters) {
                         $letters = str_split(strtolower($letters));
                         $word = str_split(strtolower($word));
                      
                         foreach( $word as $letter ) {
                            if( in_array($letter,$letters) ) {
                               $k = array_search($letter,$letters);
                               unset($letters[$k]);
                            } else {
                               return FALSE;
                            }
                         }
                         return TRUE;
                      }
                      Number of executions: 25000
                      
                      Function #0 :
                      	check_valid_guess_rev_1(APEC,PEACE)
                      		Min: 0.004768 msec
                      		Max: 0.622034 msec
                      		Avg: 0.005807 msec
                      	check_valid_guess_rev_1(APEC,SPACES)
                      		Min: 0.004768 msec
                      		Max: 0.081778 msec
                      		Avg: 0.005964 msec
                      	check_valid_guess_rev_1(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                      		Min: 0.004768 msec
                      		Max: 0.054836 msec
                      		Avg: 0.006160 msec
                      Function #1 :
                      	check_valid_guess_rev_2(APEC,PEACE)
                      		Min: 0.015974 msec
                      		Max: 0.092983 msec
                      		Avg: 0.017594 msec
                      	check_valid_guess_rev_2(APEC,SPACES)
                      		Min: 0.016928 msec
                      		Max: 0.499010 msec
                      		Avg: 0.018405 msec
                      	check_valid_guess_rev_2(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                      		Min: 0.016928 msec
                      		Max: 0.103951 msec
                      		Avg: 0.018601 msec
                      Function #2 :
                      	check_valid_guess_rev_3(APEC,PEACE)
                      		Min: 0.005960 msec
                      		Max: 0.136137 msec
                      		Avg: 0.006796 msec
                      	check_valid_guess_rev_3(APEC,SPACES)
                      		Min: 0.002861 msec
                      		Max: 0.043154 msec
                      		Avg: 0.003478 msec
                      	check_valid_guess_rev_3(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                      		Min: 0.006914 msec
                      		Max: 0.065088 msec
                      		Avg: 0.007931 msec
                      Function #3 :
                      	validateLetters(APEC,PEACE)
                      		Min: 0.006914 msec
                      		Max: 0.067949 msec
                      		Avg: 0.007633 msec
                      	validateLetters(APEC,SPACES)
                      		Min: 0.003815 msec
                      		Max: 0.169039 msec
                      		Avg: 0.005040 msec
                      	validateLetters(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                      		Min: 0.015974 msec
                      		Max: 0.214100 msec
                      		Avg: 0.019852 msec
                      
                      --------------------------------------------------
                      
                      Parameter Group #0 (APEC,PEACE) :
                      	Medians :
                      		check_valid_guess_rev_1: 0.005960 msec
                      		check_valid_guess_rev_3: 0.006914 msec
                      		validateLetters: 0.007153 msec
                      		check_valid_guess_rev_2: 0.017166 msec
                      Parameter Group #1 (APEC,SPACES) :
                      	Medians :
                      		check_valid_guess_rev_3: 0.003099 msec
                      		validateLetters: 0.005007 msec
                      		check_valid_guess_rev_1: 0.005960 msec
                      		check_valid_guess_rev_2: 0.018120 msec
                      Parameter Group #2 (TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT) :
                      	Medians :
                      		check_valid_guess_rev_1: 0.005960 msec
                      		check_valid_guess_rev_3: 0.007868 msec
                      		validateLetters: 0.017881 msec
                      		check_valid_guess_rev_2: 0.018120 msec
                      

                        @: One thing to consider is that this:

                              if( in_array($letter,$letters) ) { 
                                 $k = array_search($letter,$letters);

                        contains a bit of redundancy; why force PHP to walk through the array with in_array() only to make it do the same thing again with array_search()? Instead, simply call array_search() and check to see if $letter was found or not.

                        Also consider that your function is calling strtolower() on both inputs, whereas neither of the other three functions touch the given inputs before performing the checks.

                        EDIT: Another consideration; it may be that for small strings (which is probably all that these functions will ever see, unless you plan on jumbling up the entirety of War and Peace) calling unset() is more expensive than simply overwriting that location in the array with the empty string.

                          Damn you brad! I was literally sitting on my bed about to lay down when my phone got the email! I had to come back and update the function per your suggestions. Unset seems to be faster than assigning the empty string:

                          function validateLetters($letters,$word) {
                             $letters = str_split($letters);
                             $word = str_split($word);
                          
                             foreach( $word as $letter ) {
                                if( ($k = array_search($letter,$letters)) === FALSE ) {
                                   return FALSE;
                                } else {
                                   $letters[$k] = ''; // Also used unset... See results
                                }
                             }
                             return TRUE;
                          }
                          Number of executions: 25000
                          
                          Function #0 :
                          	check_valid_guess_rev_1(APEC,PEACE)
                          		Min: 0.003815 msec
                          		Max: 0.034094 msec
                          		Avg: 0.005238 msec
                          	check_valid_guess_rev_1(APEC,SPACES)
                          		Min: 0.004768 msec
                          		Max: 0.036955 msec
                          		Avg: 0.005827 msec
                          	check_valid_guess_rev_1(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                          		Min: 0.004768 msec
                          		Max: 0.063181 msec
                          		Avg: 0.006246 msec
                          Function #1 :
                          	check_valid_guess_rev_2(APEC,PEACE)
                          		Min: 0.015974 msec
                          		Max: 0.113010 msec
                          		Avg: 0.017712 msec
                          	check_valid_guess_rev_2(APEC,SPACES)
                          		Min: 0.015974 msec
                          		Max: 0.089884 msec
                          		Avg: 0.017999 msec
                          	check_valid_guess_rev_2(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                          		Min: 0.015974 msec
                          		Max: 0.069141 msec
                          		Avg: 0.017710 msec
                          Function #2 :
                          	check_valid_guess_rev_3(APEC,PEACE)
                          		Min: 0.005960 msec
                          		Max: 0.077009 msec
                          		Avg: 0.006696 msec
                          	check_valid_guess_rev_3(APEC,SPACES)
                          		Min: 0.002861 msec
                          		Max: 0.104904 msec
                          		Avg: 0.003424 msec
                          	check_valid_guess_rev_3(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                          		Min: 0.005960 msec
                          		Max: 0.054836 msec
                          		Avg: 0.007585 msec
                          Function #3 :
                          	validateLettersEmptyString(APEC,PEACE)
                          		Min: 0.005960 msec
                          		Max: 0.052929 msec
                          		Avg: 0.006797 msec
                          	validateLettersEmptyString(APEC,SPACES)
                          		Min: 0.003815 msec
                          		Max: 0.092030 msec
                          		Avg: 0.004727 msec
                          	validateLettersEmptyString(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                          		Min: 0.012875 msec
                          		Max: 0.095129 msec
                          		Avg: 0.014931 msec
                          Function #4 :
                          	validateLettersUnset(APEC,PEACE)
                          		Min: 0.004768 msec
                          		Max: 0.086069 msec
                          		Avg: 0.006044 msec
                          	validateLettersUnset(APEC,SPACES)
                          		Min: 0.003815 msec
                          		Max: 0.045061 msec
                          		Avg: 0.004660 msec
                          	validateLettersUnset(TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT)
                          		Min: 0.010967 msec
                          		Max: 0.093937 msec
                          		Avg: 0.012673 msec
                          
                          --------------------------------------------------
                          
                          Parameter Group #0 (APEC,PEACE) :
                          	Medians :
                          		check_valid_guess_rev_1: 0.005007 msec
                          		validateLettersUnset: 0.005960 msec
                          		check_valid_guess_rev_3: 0.006199 msec
                          		validateLettersEmptyString: 0.006914 msec
                          		check_valid_guess_rev_2: 0.017166 msec
                          Parameter Group #1 (APEC,SPACES) :
                          	Medians :
                          		check_valid_guess_rev_3: 0.003099 msec
                          		validateLettersUnset: 0.005007 msec
                          		validateLettersEmptyString: 0.005007 msec
                          		check_valid_guess_rev_1: 0.005960 msec
                          		check_valid_guess_rev_2: 0.017881 msec
                          Parameter Group #2 (TOM MARVOLO RIDDLE,I AM LORD VOLDEMORT) :
                          	Medians :
                          		check_valid_guess_rev_1: 0.005960 msec
                          		check_valid_guess_rev_3: 0.007153 msec
                          		validateLettersUnset: 0.012159 msec
                          		validateLettersEmptyString: 0.015020 msec
                          		check_valid_guess_rev_2: 0.017166 msec
                          

                            Well at least you have some empirical data in case you need to shave off as many microseconds as possible.

                            (Oh, and since you're still up, be sure to let me know where I can send the invoice of $499.98 for the license to use my testing framework two times; I also offer site licenses for $14999.98 - 5&#37; off if you mention PHPBuilder! 🙂)

                              Send the bill to my accountant. Of course keep in mind he may mention something about there not being a licence, copyright or terms of use and therefore refuse to pay you, but that's between you and him! 😛

                                Write a Reply...