I've written a class method that implements AMF3 serialization of an array. I posted the documented source code earlier, but here it is again:
public function write_array($d) {
// Arrays go onto the object_table if we haven't seen them before
// NOTE: strict equality will return true if two distinct arrays
// have identical keys and values...if we can find a way to check for
// a real reference, that would rule
// DISABLED FOR NOW because we don't want arrays to be artificially referencing each other
# $key = array_search($d, $this->object_table, TRUE);
# if ($key !== FALSE) {
if (FALSE) {
// we have already written an identical array to the object table
// write a ref and be done with it
$ref = $key << 1; // low bit is zero to indicate ref, other bits are 29-bit variable int to specify index
$this->write_integer($ref);
return;
}
// not a ref, store reference to this object in object_table if there is room
if (count($this->object_table) < self::AMF_OBJECT_MAX_REFERENCES) {
# $this->object_table[] =& $d
$this->object_table[] = ''; // store an empty-string placeholder so that, when
// we do add an object reference, the ref index
// matches what Flash expects
}
// divide the array into two arrays, one for numeric keys, one for associative keys
$num = array();
$assoc = array();
$hi_key = 0;
$prev_num_key = NULL;
$nums_ordered_and_dense = TRUE;
foreach($d as $key => $val) {
if (is_int($key) && ($key >= 0)) {
$num[$key] = $val;
$hi_key = max($hi_key, $key);
// track whether the numeric values are both ordered and dense
// this will help with performance for write_array when we are
// serializing large dense numeric arrays
if (is_null($prev_num_key)) {
// we haven't checked anything yet...first key in dense, ordered array must be zero
if ($key !== 0) $nums_ordered_and_dense = FALSE;
} else {
// current key *must* equal previous key+1
if (($prev_num_key+1) !== $key) $nums_ordered_and_dense = FALSE;
}
$prev_num_key = $key;
} else {
$assoc[$key] = $val;
}
}
if ($nums_ordered_and_dense) {
$contig_len = count($num);
} else {
// either the array is non contiguous or out of order or both
// check to see how much of the numeric part is contiguous
$num_len = count($num);
if ($hi_key == ($num_len - 1)) {
// all numeric values are contiguous
$contig_len = $num_len;
} else {
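// the numeric indices are non-contiguous (negative keys were already routed to $assoc)
for($i=0; $i<$hi_key; $i++) {
// array_key_exists() rather than isset(): isset() returns FALSE for a NULL value
if (!array_key_exists($i, $num)) {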
$contig_len = $i;
break;
}
}
}
}
// due to the way array lengths are encoded, we have a maximum upper length for arrays
if ($contig_len > self::AMF_ARRAY_MAX_CONTIG_LENGTH) {
throw new Exception('Array contiguous length exceeded the maximum allowed of ' . self::AMF_ARRAY_MAX_CONTIG_LENGTH);
}
// write the length of the contiguous part to the buffer
// as length packed together with the value/reference flag bit
// living in the lowest bit of a variable length integer
$len_val_ref = ($contig_len << 1) | 0x01;
$this->write_integer($len_val_ref);
// output the key/value pairs
// e.g., arr['AB'] = 127
// first the key, a string:
// 00000101 - first byte indicates key length, here it says length is 2, low bit is 'val' flag
// 01000001 - UTF-8 code for upper case A
// 01000010 - UTF-8 code for upper case B
// next, the value
// 00000100 - first byte is type indicator, here an integer
// 01111111 - the value of the integer, here 127
// first, the non-contiguous numeric indices
foreach($num as $key => $val) {
if ($key >= $contig_len) { // $num holds only non-negative keys, so no "< 0" test is needed.
// $num[$contig_len] is normally empty: it is either past the end
// of a contiguous array or the lowest empty slot of a
// non-contiguous one. Using >= means an element stored there can never be silently dropped.
$this->write_string((string)$key); // numeric keys are integers here; pass them as strings
$this->serialize_r($val);
}
}
// then the associative values
foreach($assoc as $key => $val) {
$this->write_string((string)$key); // negative integer keys also land in $assoc
$this->serialize_r($val);
}
// after name-value pairs are finished, output an empty string
$this->write_string('');
// now output just the values of the contiguous part of the num array (if any);
for($i=0; $i<$contig_len; $i++) {
$this->serialize_r($num[$i]);
}
}
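For reference, here is a minimal sketch of how the length header written by write_array ends up as bytes. It assumes that write_integer() emits a bare AMF3 U29 (the 29-bit variable-length integer); encode_u29() and array_header() are hypothetical names used only for illustration and are not part of the class above.

```php
<?php
// Hypothetical helper: encode a value in the range 0..0x1FFFFFFF as an AMF3 U29.
// The first three bytes carry 7 bits each, with the high bit set when another
// byte follows; a fourth byte, if present, carries a full 8 bits.
function encode_u29(int $v): string
{
    if ($v < 0 || $v > 0x1FFFFFFF) {
        throw new InvalidArgumentException('U29 value out of range');
    }
    if ($v < 0x80) {
        return chr($v);
    }
    if ($v < 0x4000) {
        return chr(($v >> 7) | 0x80) . chr($v & 0x7F);
    }
    if ($v < 0x200000) {
        return chr(($v >> 14) | 0x80)
             . chr((($v >> 7) & 0x7F) | 0x80)
             . chr($v & 0x7F);
    }
    return chr(($v >> 22) | 0x80)
         . chr((($v >> 15) & 0x7F) | 0x80)
         . chr((($v >> 8) & 0x7F) | 0x80)
         . chr($v & 0xFF);
}

// Hypothetical helper: the array header is the dense length shifted left by
// one, with the low bit set to 1 to mean "value follows" (0 would mean "reference").
function array_header(int $dense_len): string
{
    return encode_u29(($dense_len << 1) | 0x01);
}

echo bin2hex(array_header(2)), "\n";   // 05
echo bin2hex(array_header(100)), "\n"; // 8149
```

Because one of the 29 bits is spent on the flag, only 28 bits remain for the length, which is presumably why the function has to cap the contiguous length.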
Note that AMF3 serialization treats associative arrays quite differently from numeric ones, and that it considers a numeric array with missing indices to be associative, at least for the part of the array after the first missing index. This adds a lot of complexity to the function, because it has to check that the numeric indices are both ordered and dense.
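To make the dense/associative rule concrete, here is a small sketch that computes how many leading indices (0, 1, 2, ...) an array has. dense_prefix_length() is a hypothetical name, not something from the class above; everything outside that prefix would be written as key/value pairs.

```php
<?php
// Hypothetical helper: length of the run of keys 0, 1, 2, ... present in $a.
// array_key_exists() is used so that elements holding NULL still count.
function dense_prefix_length(array $a): int
{
    $n = 0;
    while (array_key_exists($n, $a)) {
        $n++;
    }
    return $n;
}

echo dense_prefix_length(['a', 'b', 'c']), "\n";                // 3: fully dense
echo dense_prefix_length([0 => 'a', 1 => 'b', 3 => 'd']), "\n"; // 2: index 2 is missing
echo dense_prefix_length([1 => 'b', 0 => 'a']), "\n";           // 2: insertion order is irrelevant
echo dense_prefix_length([1 => 'a', 2 => 'b']), "\n";           // 0: index 0 is missing
echo dense_prefix_length(['0' => 'a', '1' => 'b']), "\n";       // 2: PHP casts '0' and '1' to integer keys
echo dense_prefix_length(['x' => 1]), "\n";                     // 0: purely associative
```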
I would like to reduce the number of passes over the array values, because the efficiency of this serialization is extremely important for a project I've been working on. I also ask because I aspire to rewrite this as a PECL extension (i.e., in C), and I'm dreading converting this code to C.
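One possible way to cut down the looping, offered as a sketch rather than a drop-in replacement: probe the keys 0, 1, 2, ... directly to find the dense length, skip the key/value scan entirely when the whole array is dense, and never build separate $num and $assoc copies. walk_amf3_array() and its $emit callback are hypothetical stand-ins for write_integer(), write_string() and serialize_r(). Unlike the original, the pairs come out in the array's own iteration order rather than numeric-first, so this assumes the receiving side does not depend on that order.

```php
<?php
// Hypothetical sketch: visit an array in AMF3 output order.
// $emit('header', int)       stands in for write_integer()
// $emit('pair', string, val) stands in for write_string() + serialize_r()
// $emit('end')               stands in for write_string('')
// $emit('value', val)        stands in for serialize_r() on a dense element
function walk_amf3_array(array $d, callable $emit): void
{
    $total = count($d);

    // Probe 0, 1, 2, ... until the first missing key: that is the dense length.
    // array_key_exists() (unlike isset()) also counts elements whose value is NULL.
    $dense_len = 0;
    while ($dense_len < $total && array_key_exists($dense_len, $d)) {
        $dense_len++;
    }

    $emit('header', ($dense_len << 1) | 0x01);

    // Only scan for key/value pairs when something lies outside the dense prefix.
    if ($dense_len < $total) {
        foreach ($d as $key => $val) {
            if (!is_int($key) || $key < 0 || $key >= $dense_len) {
                $emit('pair', (string)$key, $val);
            }
        }
    }
    $emit('end');

    // Finally the dense values, in index order.
    for ($i = 0; $i < $dense_len; $i++) {
        $emit('value', $d[$i]);
    }
}

// Usage: record the calls instead of writing bytes.
$log = [];
walk_amf3_array(
    [0 => 'a', 1 => null, 3 => 'd', 'x' => 'y'],
    function (...$event) use (&$log) { $log[] = $event; }
);
// $log now holds: header 5, pair '3' => 'd', pair 'x' => 'y', end, value 'a', value null
```

For a fully dense array this touches each element once while probing and once while writing, and the same two loops should translate fairly directly to C.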