Binomial choose k elements from a set of n.

function subsets($needed, $set)
{
	if($needed == 0)
	{
		yield [];
	}
	else
	{
		--$needed;
		$n = count($set);
		for($i = 0; $i < $n; ++$i)
		{
			$element = array_pop($set);
			foreach(subsets($needed, $set) as $subset)
			{
				$subset[] = $element;
				yield $subset;
			}
		}
	}
}


$k = 3;
$n = str_split('abcdefg');
foreach(subsets($k, $n) as $subset)
{
	echo join($subset) . ' '; // 7!/(3!4!) = 35 different subsets.
}
a month later
3 years later

Well, this has been quiet for a while. So I'm going to share a routine I knocked together to emulate sliced array views. The fun of array elements that contain variable references. I've pulled this stunt before....

function slice(array &$input, int $start = 0, int $end = PHP_INT_MAX, int $step = 1): array
{
	$start = max($start, 0);
	$end = min($end, count($input));
	$step = max($step, 1);

$return = [];
for($i = $start; $i < $end; $i += $step)
{
	$return[] = &$input[$i];
}
return $return;
}


// Examples

$array = range(0,20);

echo "\n\n";
echo 'Original array: ', join(' ', $array), "\n";


$pick = slice($array, 2, 12, 3);

echo 'Picking every third item starting from 2 up to 12: ', join(' ', $pick), "\n";

$pick[3] = 'foo!';

echo 'Changed one item: ', join(' ', $pick), "\n";

echo 'Original array now looks like: ', join(' ', $array), "\n";

echo "\n\n";

$again = slice($array, start: 4, step: 5); // Note: PHPv8 named arguments because they're nice.

echo 'Every fifth item starting from 4 and continuing to the end: ', join(' ', $again), "\n";

foreach($again as &$poke)
{
	$poke = 'wibble';
}
unset($poke);

echo 'Answer every question the same way: ', join(' ', $again), "\n";

echo 'Making the original array: ', join(' ', $array), "\n";

// And variable references are symmetric, so
$array[19] = 'Yeah';
echo join(' ', $again), " you can go the other way as well.\n\n";
// These views are live.


// And just a little more elaborate...

$stride = 9;
$rows = array_fill(0, $stride**2, 0);

$columns = [];
for($i = 0; $i < $stride; ++$i)
{
	$columns[] = slice($rows, $i, step: 9); // Variable references survive this sort of treatment.
}
$columns = array_merge(...$columns); // And this sort too; we're not actually touching the array elements themselves

for($i = 0; $i < $stride**2; ++$i)
{
	$columns[$i] = $i;
}

echo join(' ', $rows);

/*
The same result can be achieved with
$stride = 9;
$rows = array_merge(...array_map(null, ...array_chunk(range(0, $stride**2-1), $stride)));
but that would be missing the point.
*/
?>
    4 months later

    Okay, here's a wee thing.

    $this->source->tap() returns arrays [bool, mixed] with the first part showing whether the second part is cromulent; in this particular instance the second part is irrelevant if it is not so. I want to keep tapping the source for those second part items while that first part remains true.

    		while(([, $item] = $this->source->tap())[0]) {...}
    		

    See, the assignment expression evaluates to the whole array, with both 0 and 1 elements. The left hand side of the assignment binds $item to the value of the 1 element and ignores the 0 element. But since the assignment operation itself evaluates to the entire array, the 0 element is happily sitting there to be addressed to see if it is true or not.

    Here, have a look at the opcode instructions, (as generated by php -d opcache.opt_debug_level = 0x10000 when you have opcache enabled):

    ...
    0005 T2 = FETCH_OBJ_R THIS string("source")
    0006 INIT_METHOD_CALL 0 T2 string("tap")
    0007 V3 = DO_FCALL
    0008 V4 = FETCH_LIST_R V3 int(1)
    0009 ASSIGN CV0($item) V4
    0010 T6 = FETCH_DIM_R V3 int(0)
    0011 JMPNZ T6 0003
    

    Instructions 5-6 retrieve the source property from $this, and the tap method from that, using the register T2 to store the property along the way.
    Instruction 7 makes the call to tap and stuffs the result into register V3. That's the two-element [bool, mixed] array.
    Instruction 8 fetches element 1 from what's in register V3 (for the purposes of a list() assignment) and stuffs it into register V4.
    Instruction 9 promptly copies what's in V4 into local variable 0 (known to you and me as $item).
    Instruction 10 now fetches element 0 from the [bool, mixed] array still sitting in register V3 and stuffs it into register T6.
    Finally, instruction 11 decides whether or not to jump back to instruction 3 (not shown) depending on whether or not the content of register T6 is nonzero or not. For whatever reason, while loops are written in bytecode with their controlling tests at the end of the loop body (indicated in the bytecode and source snippet by ...).

      14 days later

      Don't know if I've ever used array_diff() before, but came up with this use case today where it seems to do the trick.

      I have a set of tests that use JSON files for the test definitions. In them I have an optional field for each test called "identities"* that can either be a simple string (which then is used to get the array of identities under that key from another JSON file), or it can be an explicit array, e.g. "identities": ["id_1", "id_5"].

      I decided I wanted a way to say, "Use this set, except for these identities which I want you to skip." I then added the ability to use this syntax:

      "identities": "foo except [\"id_1\", \"id_5\"]"
      

      To handle that in the script that interpolates and executes each test definition:

            elseif(preg_match('/^(\w+) except (\[.+\])$/', $test['identities'], $matches)) {
              $exceptions = json_decode($matches[2], true);
              $ids = array_diff($identities[$matches[1]], $exceptions);
              if(empty($ids) or !is_array($ids)) {
                die("Failed to populate 'except' identities\n");
              }
              $api->identities($ids);
            }
      

      __________
      * "identities" are a specific thing in the app -- don't worry what it is, just know that it's a string here. 😉

      PS: I may change it so that the exceptions are an array in the identities file, and then just do:

      "identities": "foo except bar"
      
        2 years later

        I wanted to generate the prime numbers in sequence. For reasons related to reindexing hashtables.

        function primeGenerator()
        {
        	yield 2;
        	yield 3;
        	$primes = [2, 3];
        	$b = 6;
        	while(true)
        	{
        		$t1 = $b - 1;
        		$t2 = $b + 1;
        		$t1s = sqrt($t1);
        		$t2s = sqrt($t2);
        		foreach($primes as $p)
        		{
        			if($p > $t1s)
        			{
        				break;
        			}
        			if($t1 % $p == 0)
        			{
        				goto t1Composite;
        			}
        		}
        		$primes[] = $t1;
        		yield $t1;
        t1Composite:
        		foreach($primes as $p)
        		{
        			if($p > $t2s)
        			{
        				break;
        			}
        			if($t2 % $p == 0)
        			{
        				goto t2Composite;
        			}
        		}
        		$primes[] = $t2;
        		yield $t2;
        t2Composite:
        		$b += 6;
        	}
        };

        Iterating that gives me the first 100,000 primes in about two seconds.

        3 months later

        Weedpacket

        Oh, hey, maybe it should do properties, too:

        
        function map_method(array $array): object
        {
        	return new class($array)
        	{
        		public function __construct(private array $array){}
        
        		public function __call(string $method, array $arguments): array
        		{
        			return array_map(fn($element) => $element->{$method}(...$arguments), $this->array);
        		}
        		public function __get(string $property): array
        		{
        			return array_map(fn($element) => $element->{$property}, $this->array);
        		}
        	};
        }
        
        $array = [ // Only because I couldn't be bothered creating some realistic objects
        	(object)['a' => 17],
        	(object)['a' => 42],
        	(object)['a' => 21],
        	(object)['a' => 19],
        	(object)['a' => 4]
        ];
        
        $as = map_method($array)->a;
        
        echo join(' ', $as);

        And, just to make things even cuter:

        $array = [
        	(object)['numerals' => 17, 'words' => 'seventeen'],
        	(object)['numerals' => 42, 'words' => 'forty-two'],
        	(object)['numerals' => 21, 'words' => 'twenty-one'],
        	(object)['numerals' => 19, 'words' => 'nineteen'],
        	(object)['numerals' => 40, 'words' => 'forty']
        ];
        
        $mapped = map_method($array);
        $numerals = $mapped->numerals;
        $words = $mapped->words;
        echo '[', join(' ', $numerals), " , ", join(' ', $words), ']';

        Mapping property accesses can be done with array_column, but array_column won't map method calls. I should probably implement __isset as well to go with __get (as you need to if you want __get to be called by array_column), but that would mean either examining the objects being mapped to see if the property is in fact set for each of them, or just punting by returning true, and letting each individual object complain if in fact the property isn't set in its particular case. Or I could just have the anonymous class's __get use array_column itself. I mean, it's there...

          4 months later

          I was getting an (iterable) stream of items in batches, processing them together, and send the result onwards. Then there were changes and my source was no longer able to guarantee that the items were nicely batched for me. Sometimes not enough, often too many.

          I had to stick a buffer in to the workflow.

          So here's what I did. Instead of my real code (which would involve a lot more context) I've written an example where I'm working on a series of strings that are all supposed to be concatenated together into one long file which is then base64-encoded (you can base64 a bytestream in chunks if each chunk is a multiple of three bytes in length). So imagine that the strings I'm getting are all sorts of different lengths and I want to smooth things out into suitably-sized chunks that I can encode on the fly.

          What I had previously:

          foreach($iterable_line_source as $line)
          {
          	echo base64_encode($line), "\r\n";
          }

          What I now had:

          foreach(chunker($iterable_line_source) as $line)
          {
          	echo base64_encode($line), "\r\n";
          }

          With all the buffer-handling inside that chunker function:

          function chunker(iterable $source, int $chunk_size = 57): Generator
          {
          	$buffer = '';
          	foreach($source as $chunk)
          	{
          		$buffer .= $chunk;
          		if(strlen($buffer) >= $chunk_size)
          		{
          			$chunks = str_split($buffer, $chunk_size);
          			$buffer = array_pop($chunks);
          			yield from $chunks;
          		}
          	}
          	yield $buffer;
          }

          (The significance of "57" is a base64 thing: 57 input bytes is enough for exactly one line of MIME-formatted output.)

            25 days later

            More dealing with iterables.

            I needed to generate the subsets of a given set of values that I got from an iterable. Problem was that I wouldn't just be given an array; new values would be arriving even while I was generating subsets of what I already had.

            I could just run the source iterable to exhaustion and put all the values into an array, then do the needful from there — I certainly needed to build such an array because there will be subsets that include both the first and last elements from the iterable, which means I'd need to have both on hand — but this would have been a bottleneck that causes further processing to stall.

            So here are a couple of functions that "generate all the subsets of a given set" on the fly: generating what subsets can be generated for each new set element as it becomes available. The first generates all subsets of the input set, the latter those subsets that are of a specific size. At the bottom is an example of them being used.

            
            function powerset(iterable $source): Generator
            {
            	// This could be a fully-fledged named function in its own right,
            	// but as long as this is the only place it's used...
            	$powersets_for = function(array $set)use(&$powersets_for): Generator
            	{
            		if(empty($set))
            		{
            			yield [];
            		}
            		else
            		{
            			$s = array_shift($set);
            			foreach($powersets_for($set) as $ss)
            			{
            				yield $ss;
            				yield [$s, ...$ss];
            			}
            		}
            	};
            
            	$collection = [];
            	yield []; // Even the empty set has one subset.
            	foreach($source as $s)
            	{
            		foreach($powersets_for($collection) as $ss)
            		{
            			yield [...$ss, $s];
            		}
            		$collection[] = $s;
            	}
            }
            
            function subsets(iterable $source, int $k): Generator
            {
            	if($k == 0)
            	{
            		yield [];
            	}
            	else
            	{
            		$subsets_for = function(array $set, int $k)use(&$subsets_for): Generator
            		{
            			$n = count($set);
            			if($k == 0)
            			{
            				yield [];
            			}
            			elseif($k == 1)
            			{
            				foreach($set as $s)
            				{
            					yield [$s];
            				}
            			}
            			elseif($k == $n)
            			{
            				yield $set;
            			}
            			elseif($k < $n)
            			{
            				--$k;
            				while(count($set) >= $k)
            				{
            					$s = array_shift($set);
            					foreach($subsets_for($set, $k) as $ss)
            					{
            						yield [$s, ...$ss];
            					}
            				}
            			}
            		};
            
            		$collection = [];
            		foreach($source as $term)
            		{
            			if(count($collection) >= $k - 1)
            			{
            				foreach($subsets_for($collection, $k - 1) as $part)
            				{
            					yield [...$part, $term];
            				}
            			}
            			$collection[] = $term;
            		}
            	}
            }
            
            
            
            function maker(int $n): Generator
            {
            	for($i = 1; $i <= $n; ++$i)
            	{
            		echo "Create $i\n";
            		yield $i;
            	}
            }
            
            foreach(subsets(maker(8), 5) as $sset)
            {
            	echo "Subset: [" . join(',', $sset) . "]\n";
            }
            
            echo "\n\n";
            
            foreach(powerset(maker(8)) as $set)
            {
            	echo "Subset: [" . join(',', $set) . "]\n";
            }
              7 months later

              While searching for something else I stumbled on this old post from 2010:
              Weedpacket
              wherein I presented the curry function

              function curry($f)
              {
                      if(!is_callable($f))
                      {
                              trigger_error(E_USER_WARNING, "curry()'s argument needs to be callable.");
                              return null;
                      }
                      return function($x)use($f)
                      {
                              // An arbitrary number of arguments could be passed...
                              return function()use($x,$f)
                              {
                                      $args = func_get_args();
                                      // ...and the first one already HAS been...
                                      array_unshift($args, $x);
                                      return call_user_func_array($f, $args);
                              };
                      };
              }
              

              It's been a long while since I've had to make use of call_user_func_array, so I wondered what that function would look like if I wrote it today. So I wrote it today:

              function curry(callable $f): callable
              {
              	return fn($x) => fn(...$args) => $f($x, ...$args);
              }
              

              Just a little snapshot of 14 years' worth of PHP language evolution.

                10 months later

                Weedpacket

                Huh. That function is doing two jobs. Should be two functions.

                function selections(int $t, int $n)
                {
                	if($n == 0) {
                		yield [];
                	} elseif($t == 0) {
                		yield array_fill(0, $n, false);
                	} elseif($t == $n) {
                		yield array_fill(0, $n, true);
                	} else {
                		foreach(selections($t - 1, $n - 1) as $s) {
                			yield [true, ...$s];
                		}
                		foreach(selections($t, $n - 1) as $s) {
                			yield [false, ...$s];
                		}
                	}
                }
                
                function subsets(int $needed, array $children)
                {
                	// Ensure the indices are normalised.
                	$children = array_values($children);
                	$select = fn($kept) => $children[$kept];
                	foreach(selections($needed, count($children)) as $selection) {
                		yield array_map($select, array_keys($selection, true));
                	}
                }
                
                
                $k = 3;
                $n = str_split('abcdefg');
                foreach(subsets($k, $n) as $subset)
                {
                	echo join($subset) . ' '; // 7!/(3!4!) = 35 different subsets.
                }

                Bonus for getting them in lexicographical order.

                Because the recursive part doesn't deal with the set elements directly, it can take some shortcuts and finish earlier.

                Note the anonymous function that has been pulled out of the array_map into $select. It's actually been pulled out of the loop to avoid instantiating a whole new Closure object on every iteration (that would be destroyed at the end of the iteration only for another new practically identical object to be created at the start of the next).

                  Write a Reply...