Weedpacket;10906864 wrote:A PHP string is already in its binary representation: one character == one byte. If you want to see the bytes in a string, array_map('ord',str_split($string)), or pack('H').
This script outputs the ASCII ordinals of the characters in a string.
$str = 'abc';
$v = array_map('ord',str_split($str));
print_r($v);
outputs this:
Array
(
[0] => 97
[1] => 98
[2] => 99
)
I understand what that function is doing. $v contains an array of integers that correspond to the ASCII ordinals of a, b, and c.
This I don't understand at all:
$str = pack('H', 'ABC');
echo 'len:' . strlen($str) . "\n";
$v = array_map('ord',str_split($str));
print_r($v);
It outputs this:
len:1
Array
(
[0] => 160
)
If I put a * after the H (that is, pack('H*', 'ABC')), then I get a string of length 2:
len:2
Array
(
[0] => 171
[1] => 192
)
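For anyone following along, here is a minimal sketch of what seems to be going on, assuming the 'H' format reads its argument as hexadecimal digits (high nibble first) rather than as characters. With no repeater it consumes a single hex digit; with * it consumes all of them and zero-fills the last half-byte. The variable names are mine.

```php
<?php
// 'H' with no repeater consumes ONE hex digit ("A"); the low nibble is zero-filled.
$one = pack('H', 'ABC');    // the single byte 0xA0 (decimal 160)

// 'H*' consumes every hex digit; the odd third digit "C" is padded to 0xC0.
$all = pack('H*', 'ABC');   // the two bytes 0xAB 0xC0 (decimal 171, 192)

echo bin2hex($one), "\n";   // a0
echo bin2hex($all), "\n";   // abc0
print_r(array_map('ord', str_split($all)));
```

So 'ABC' here is never treated as the letters A, B and C; it is read as the hex number ABC.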
I have also tried pack with C and c instead but that just returns an array with zero as its only member. Now I know that with those ordinals above, I can use [man]base_convert[/man] and get something like this:
function myfunc($s) {
$ord = ord($s);
$s2 = strval($ord);
return base_convert($s2, 10, 2);
}
$str = 'abc';
$v = array_map('myfunc',str_split($str));
print_r($v);
which outputs this:
Array
(
[0] => 1100001
[1] => 1100010
[2] => 1100011
)
I also know that what is happening here is that we are grabbing the ordinals (integers) and converting them to base 2. I hope I'm not being totally obtuse here when I point out that those binary numbers have only 7 digits and a byte has 8 bits. I know that if PHP has a byte in memory somewhere, it has one more bit. Can I assume that the missing bit is a leading zero, or is there some kind of two's complement thing going on? Or some endian bushwhacking?
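As far as I can tell, the missing bit is just a leading zero that base_convert does not print, since ord() returns a value from 0 to 255 and every ASCII character is below 128. A minimal sketch that pads each byte to 8 digits, assuming PHP 7 or later for the type declarations; str_to_bits is a hypothetical helper name, not a built-in:

```php
<?php
// Hypothetical helper: render every byte of a string as 8 binary digits.
function str_to_bits(string $s): string {
    if ($s === '') {
        return '';
    }
    $bits = [];
    foreach (str_split($s) as $byte) {
        // ord() yields 0..255; %08b left-pads the binary form with zeros to 8 digits.
        $bits[] = sprintf('%08b', ord($byte));
    }
    return implode(' ', $bits);
}

echo str_to_bits('abc'), "\n";   // 01100001 01100010 01100011
```

Each byte is handled on its own here, so endianness does not enter into it at this level.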
I'm certainly feeling obtuse here. I was just hoping for a way to check exactly what bits and bytes I've managed to [man]pack[/man] up in my string, to determine whether it matches the bit and byte order described in the Adobe AMF3 Specification. Without being able to actually look at the ones and zeros, I feel like I'm working in the dark. I did manage to cook up a function to show me the bits in a number value, but it returns 00000000 whenever you feed it a string:
function my2bin($c) {
$result = '';
for($i=0; $i<8; $i++) {
$p2 = pow(2, $i);
if ($p2 & $c) {
$result = '1' . $result;
} else {
$result = '0' . $result;
}
}
return $result;
}
Also, I'm not sure how to detect the 'bit length' of a given argument so I can do doubles or floats or whatever.
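One possible way around detecting the bit length, sketched under the assumption that you already know which pack format you intend to use: pack the value first, then count and dump the bytes of the resulting string. packed_bits is a hypothetical helper name.

```php
<?php
// Hypothetical helper: pack a value with one format code, then show its size and bits.
function packed_bits(string $format, $value): string {
    $bytes = pack($format, $value);
    $out = [];
    foreach (str_split($bytes) as $b) {
        // One byte at a time, zero-padded to 8 binary digits.
        $out[] = sprintf('%08b', ord($b));
    }
    return (strlen($bytes) * 8) . ' bits: ' . implode(' ', $out);
}

echo packed_bits('n', 1), "\n";    // 16 bits: 00000000 00000001
echo packed_bits('N', 1), "\n";    // 32 bits: 00000000 00000000 00000000 00000001
echo packed_bits('d', 1.0), "\n";  // 64 bits, in this machine's byte order
```

The 'n' and 'N' formats are always big-endian, so their output is the same everywhere; 'd' depends on the machine.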
Weedpacket;10906864 wrote:Because you're telling [man]pack[/man] to treat 1 as a double: the number 1.0. What you're looking at is the IEEE-754-standard binary representation of 1.0 (in sixty-four bits).
Sadly, I went looking for the IEEE-754 standard and was prompted to purchase it at the IEEE website. I was naively expecting that a big-endian 8-byte (that's 64 bits) representation of the number 1 would look like this:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
.
I went and looked some more, and there's an article on Wikipedia that I'll be attempting to digest. I'm still rather choking on the AMF3 spec and Augmented Backus-Naur Form.
Weedpacket;10906864 wrote:Yes; AMFPHP has got its test back to front. AMFPHP_BIG_ENDIAN is true iff the machine is little-endian (e.g., Intel-based).
Aha! Just as I suspected. A bad constant name. Or something like that. Definitely confusing. There are few comments in the AMFPHP code, too. Also, I don't think their AMF serialization supports args being passed by reference. I could be wrong. I have increasing confidence that my crusade to write this AMFPHP class is justified. BTW, I think it's worth noting that this constant is defined as true on both an Intel Mac and a dual-core AMD machine running CentOS. So PHP on both of these machines is little-endian? Or is that a function of the OS?
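My understanding is that byte order comes from the CPU architecture rather than from PHP or the OS, and x86 chips from both Intel and AMD are little-endian, which would explain the identical result on both machines. A minimal sketch of a check that avoids the confusing constant, comparing pack's machine-order format against its explicitly little-endian one; machine_is_little_endian is a hypothetical name, not part of AMFPHP:

```php
<?php
// Hypothetical helper: true when pack()'s machine-order formats are little-endian here.
function machine_is_little_endian(): bool {
    // 'S' = unsigned 16-bit in machine byte order.
    // 'v' = unsigned 16-bit, always little-endian.
    // If both produce the same two bytes, the machine is little-endian.
    return pack('S', 1) === pack('v', 1);
}

var_dump(machine_is_little_endian());
```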
Weedpacket;10906864 wrote:The IEEE-754 representation (network byte order, hence big-endian) of 1.0 would be
00111111 11110000 00000000 00000000 00000000 00000000 00000000 00000000
*------- ---===== ======== ======== ======== ======== ======== ========
* Sign bit: positive
- Exponent: 1023
= Mantissa: 1.00000000000000000000000000000000000000000000000000000 (in binary)
Value: (1-sign*2) * mantissa * 2^(exponent-1023)
I like your notation and thank you very very very much for that. I do think I'm getting somewhere here. I think I now understand that my machine, when packing doubles, will use its own machine-specific endianness, and that is why I must detect the endianness of my machine if I am to reliably pack doubles for this protocol.
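Putting that together, here is a minimal sketch of packing a double in big-endian (network) order regardless of the machine, assuming the 'd' format always yields the 8 IEEE-754 bytes in machine order. pack_double_be is a hypothetical name of my own.

```php
<?php
// Hypothetical helper: 8-byte IEEE-754 double in big-endian (network) byte order.
function pack_double_be(float $x): string {
    $bytes = pack('d', $x);                          // machine byte order
    $littleEndian = pack('S', 1) === pack('v', 1);   // detect this machine's order
    // On a little-endian machine the bytes come out reversed, so flip them.
    return $littleEndian ? strrev($bytes) : $bytes;
}

echo bin2hex(pack_double_be(1.0)), "\n";   // 3ff0000000000000
```

The hex output 3f f0 00 ... is the same 00111111 11110000 ... pattern shown in the quote above. On newer PHP versions (7.0.15 / 7.1.1 and later, as far as I know) the 'E' format packs a big-endian double directly, which would make the manual reversal unnecessary.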