I've been using stream_socket_recvfrom to receive short plain text ascii packets from some internet enabled remote devices. So the letter I, for example, arrives as a 49, which is its ascii code in hex, & the socket puts out an I.
However when I set the remotes into AES encrypted mode they not only encrypt but they switch to sending in hex instead of ascii. So after the I has been encrypted it might arrive as a C4.
This is a real problem because the socket listener puts out strange characters (like Ä for C4 or ‰ for 89). If the characters are printable I can use an ascii to hex converter after the socket listener to get the original hex back but many ascii representations of hex values just won't convert back to the original hex.
So to successfully use mcrypt or similar to decode the encrypted packets I need a socket listener that can handle hexadecimal strings.
Seems simple enough but I'm stumped on how to do it. Any ideas from you learned practitioners of the art?

    You will probably need to provide a lot more information for this to be sorted out. For instance, you say you "set the remotes into AES encrypted mode". Encrypted communication requires that the two communicating parties share some keys in order to encrypt and decrypt messages. It may be that when you turn on this encrypted mode the remote device tries to do some handshaking before it actually sends the data. In other words, turning on encryption might enable some kind of elaborate protocol dance and your socket server will need to honor that protocol to make any sense of the incoming messages.

    If that sounds confusing, let me try to simplify. I could send you an encrypted message, but if you don't know what encryption technique I've used and what key you should use to decipher it, it is useless to you. That is the point of encryption.

      Thanks for the thoughts, chief, but we've got all that stuff with keys & IVs sorted. No, it's purely an issue with the socket assuming the input is ascii when it's hex. If I put an ascii to hex converter downstream of the socket then mcrypt actually works about half the time (when all the hex values in that particular string generate a printable ascii character).

        Forgive me for trying to teach you something you already know. Your question doesn't provide a lot of detail about your situation so I assumed you were more inexperienced.

        Not sure what you mean when you refer to chars as "hex". A single char can be represented either as the char itself or as a hexadecimal representation of its ascii number. For instance, the lower case letter j is 100 according to this ascii chart. The number 105 expressed in hexadecimal is 6A. One thing to keep in mind is character encoding (see [man]utf8_encode[/man]). If you utf8_encode any char with an ordinal of 127 or lower in the ascii chart, it is unaffected. If you utf8_encode anything higher than 127 then you get a two-byte string back -- which means that chars like Ä (with ascii ordinal 142) become two bytes instead of just one.
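
        A minimal PHP sketch of the point about representations, using the letter I and the byte C4 from the original question rather than the codes quoted above. It assumes the mbstring extension is available; mb_convert_encoding stands in here for utf8_encode, which is deprecated as of PHP 8.2. The variable names are only for illustration.

        ```php
        <?php
        // One byte, three views: the character, its decimal code, its hex code.
        $char = 'I';
        $decimal = ord($char);
        $hex = strtoupper(bin2hex($char));
        printf("%s = %d decimal = %s hex\n", $char, $decimal, $hex); // I = 73 decimal = 49 hex

        // A byte above 7F is still exactly one byte in a PHP string.
        $raw = "\xC4";
        printf("%d byte: %s\n", strlen($raw), strtoupper(bin2hex($raw))); // 1 byte: C4

        // Converting it from ISO-8859-1 to UTF-8 (what utf8_encode does) yields two bytes.
        $utf8 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');
        printf("%d bytes: %s\n", strlen($utf8), strtoupper(bin2hex($utf8))); // 2 bytes: C384
        ```

        So the raw byte and its UTF-8 encoded form are different byte strings, and only the raw one matches what was sent on the wire.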

        Another possibility is that the data returned is [man]base64_encode[/man]d. I've seen base64 encoding used a lot because it doesn't get mangled passing through various sockets, http requests, or email line-wrapping type stuff. NOTE that when you base64_encode something, it's usually longer than the original string. You might be able to check the length of the string you are sending against the one you receive on your socket listener. If the string at the socket listener is about 33% longer, then you might have a base64 encoded string.
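
        The length check could be sketched roughly like this. The 48-byte payload is made up for illustration, and the strict-decode test at the end is an extra heuristic that was not part of the suggestion above.

        ```php
        <?php
        // base64 emits 4 output chars for every 3 input bytes, padding with "=".
        $original = random_bytes(48);            // hypothetical 48-byte packet
        $encoded = base64_encode($original);
        printf("original: %d bytes, encoded: %d chars\n", strlen($original), strlen($encoded));

        // Expected encoded length for n input bytes is 4 * ceil(n / 3).
        $expected = 4 * (int) ceil(strlen($original) / 3);
        var_dump(strlen($encoded) === $expected);

        // In strict mode, base64_decode returns false for bytes outside the
        // base64 alphabet, so raw high bytes like C4 FE FF cannot be base64.
        var_dump(base64_decode("\xC4\xFE\xFF", true));
        ```

        If the received data contains bytes such as C4 or FF, it is not base64, since base64 output only uses printable ascii.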

          I'm an old hardware man, sneaky, so please bear with me. When the remotes are in normal mode they only send codes up to 127 (i.e. the normal ascii character set), so your lower case j is actually 106 in decimal = 6A in hex = 0110 1010 in binary.

          When the remotes are running encryption they use the full range for an 8 bit byte, ie up to 1111 1111 in binary = FF in hex. So the socket listener seems to be utf8_encoding this data & returning Ä for C4 when what I need to pass to mcrypt is the C4.

          Looks like utf8_encoding is a sort of global variable we set elsewhere for the whole page with the socket listener in it. Can you suggest a coding format that will return C4 = 1100 0100 as C4 rather than Ä?

            Hmm. I'm a bit out of my depth in this arena. Sorry for the typos in the ascii codes. I had a lot going on here at that moment.

            You may or may not want to take a peek at this thread where I went over a lot of questions about sockets in PHP and utf8. Weedpacket was kind enough to spend some time teaching me about strings in php.

            [man]stream_socket_recvfrom[/man] is a php function that returns a string. It's important to understand that PHP strings are just raw bytes really and it's up to YOU to know what kind of string is in there because various string functions interpret them as character strings in different ways. It's up to you as a programmer to know if a given string value is really just ASCII text or whether it's utf8 or some other multibyte char string or something else entirely. I use socket functions in some PHP applications to transmit bitmapped images, for instance. socket_read might hand me a PHP string but I know it's a string of bytes that represent a bitmap image.
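
            A small sketch of the "strings are just raw bytes" idea. The $packet literal is a made-up stand-in for whatever stream_socket_recvfrom would hand back for an encrypted packet; nothing here touches a real socket.

            ```php
            <?php
            // Hypothetical received data: three raw bytes, none printable ascii.
            $packet = "\xC4\xFE\xFF";

            echo strlen($packet), "\n";      // number of bytes in the string
            echo bin2hex($packet), "\n";     // the same bytes written out as hex digits

            // Per-byte decimal values; 'C*' means "unsigned chars, all of them".
            print_r(array_values(unpack('C*', $packet)));

            // hex2bin() reverses bin2hex() and gives back the identical raw bytes.
            var_dump(hex2bin(bin2hex($packet)) === $packet);
            ```

            The bytes themselves never change here; bin2hex only produces a different way of looking at them.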

            Even if you are sure your string value is supposed to be text, it could have some kind of multibyte encoding. You'd need to know what it is so you know what functions to use or whether to utf8_decode it. There are various different string functions you might use. For example, [man]strlen[/man] just counts bytes, so its result is the number of chars only if each byte is exactly one char (as in extended ascii or iso8859). [man]mb_strlen[/man], on the other hand, lets you specify an encoding. The two functions may return different values on the exact same string variable. If the data coming off your socket represents a JPEG then you might want to write it directly to a file with a .jpg extension.
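
            To illustrate the strlen versus mb_strlen difference, here is a short sketch on the UTF-8 form of Ä. It assumes the mbstring extension is installed.

            ```php
            <?php
            // "Ä" encoded as UTF-8 is the two bytes C3 84.
            $text = "\xC3\x84";

            echo strlen($text), "\n";                   // 2: counts bytes
            echo mb_strlen($text, 'UTF-8'), "\n";       // 1: counts chars under UTF-8
            echo mb_strlen($text, 'ISO-8859-1'), "\n";  // 2: same bytes read as Latin-1
            ```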

            If the highest byte sent by your remotes is 0111 1111, then utf8 encoding them wouldn't change a thing. As far as I know, any char with a zero in the high bit is the same utf8 encoded as it is in plain ascii. In fact, UTF8 encoding may have nothing to do with your problem. I just don't understand how you'd ever get a 1 in your high bit if you're only using ascii chars from 0000 0000 to 0111 1111. Are you transmitting raw data or something? If not, it seems to me that the data you are reading from your socket is not sub-127 ascii but either something that is still encrypted or something that has been encoded somehow.

            I apologize if I'm just confusing things. Perhaps we can simplify. Let's say your "remote" is something hooked up via socket to your socket listener. Let's say your remote takes the word "Yes" and just sends it over the socket. To me, I would expect:

            0101 1001 - Y
            0110 0101 - e
            0111 0011 - s
            

            Byte order would be irrelevant because we're talking single bytes regardless of whether it's utf8 or ASCII.
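
            The bit patterns above can be reproduced in PHP with something like the following. bitsOf is a hypothetical helper name, not a built-in.

            ```php
            <?php
            // Format one byte as two 4-bit groups, e.g. "0101 1001".
            function bitsOf(string $byte): string {
                $bits = sprintf('%08b', ord($byte));
                return substr($bits, 0, 4) . ' ' . substr($bits, 4);
            }

            // Walk the string one byte at a time.
            foreach (str_split('Yes') as $ch) {
                echo bitsOf($ch), ' - ', $ch, "\n";
            }

            // The same helper works for a byte with the high bit set.
            echo bitsOf("\xC4"), " - C4\n";
            ```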

            Now if your remote encrypted the message and then sent it across the socket, I'd have no idea what to expect on the other side. This is what I was thinking in my first post. If you could spell out a bit more about the encryption stack you have, that would be helpful.

              When the remotes are in normal mode they just send ascii so 'yes' is 79 65 73 in hex. If I WireShark (a packet sniffer) the data on the local network as it heads for the internet it looks like this: 79 65 73...... yes. Wireshark displays like an eprom programmer, if you've ever used one: it shows hex & ascii.

              When the remotes are in encrypted mode 'yes' might become C4 FE FF for instance. WireShark shows C4 FE FF..... Äþÿ but the socket delivers Äþÿ only & what I need to pass to mcrypt is C4 FE FF.

              It's really frustrating for me because I know C4 is just a binary bit stream 1100 0100 but all I can get out of this php socket are these useless ascii representations of values above 7Fh
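
              For what it's worth, a rough sketch of getting the Wireshark-style hex view out of PHP. hexDump and receivePacket are hypothetical helper names, and $socket is assumed to be a UDP server stream (for example from stream_socket_server with STREAM_SERVER_BIND). The demo line runs on literal bytes rather than a live socket.

              ```php
              <?php
              // Format raw bytes the way a packet sniffer's hex column shows them.
              function hexDump(string $bytes): string {
                  return strtoupper(implode(' ', str_split(bin2hex($bytes), 2)));
              }

              // Read one datagram and return it both raw and as hex.
              // $socket is assumed to be an already-bound UDP stream.
              function receivePacket($socket, int $maxLength = 1500): array {
                  $raw = stream_socket_recvfrom($socket, $maxLength, 0, $peer);
                  return ['raw' => $raw, 'hex' => hexDump((string) $raw), 'peer' => $peer];
              }

              // Demo on literal bytes standing in for an encrypted 'yes'.
              echo hexDump("\xC4\xFE\xFF"), "\n"; // C4 FE FF
              ```

              The string the socket returns already holds the bytes C4 FE FF; the Äþÿ only appears when those bytes are displayed as text. As far as I know, mcrypt_decrypt takes the raw byte string as its data argument, so the hex form is mainly useful for comparing against Wireshark.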
