Question
I'm playing around with an application for secure email-like communication and I want to perform length hiding padding on the plaintext messages so they always have a consistent size before encrypting with AES.
I would like to use PKCS7-style padding if possible, because it makes it easy to work out how much to strip from the decrypted output. But how would you do such padding if the amount of padding exceeds what one byte can describe? For example, if I have a message that is 2,000 bytes in size and I want to pad it to 16,000 bytes, the padding is 14,000 bytes, which is 0x36B0 in hex. What would the padding look like for such a value, and how would I know to read the last two bytes instead of the last byte to find the padding size?
Explanation / Answer
As you note, PKCS7 padding isn't designed to do what you want. It is designed to pad up to the next multiple of the block size (8 or 16 bytes for common block ciphers), and it does that rather well. Because every padding byte holds the pad length, it cannot describe more than 255 bytes of padding.
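To make the one-byte limit concrete, here is a minimal sketch of PKCS7-style padding in Python. The function names `pkcs7_pad` and `pkcs7_unpad` are made up for illustration and are not taken from any particular library.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes, each with value n, to reach a multiple of block_size."""
    if not 1 <= block_size <= 255:
        # The pad length must fit in a single byte.
        raise ValueError("block size must be between 1 and 255")
    n = block_size - (len(data) % block_size)  # always 1..block_size
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes) -> bytes:
    """Read the pad length from the last byte and strip that many bytes."""
    if not padded:
        raise ValueError("empty input")
    n = padded[-1]
    if n == 0 or n > len(padded) or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


padded = pkcs7_pad(b"hello")        # 5 bytes of data, 11 bytes of padding
assert len(padded) == 16
assert pkcs7_unpad(padded) == b"hello"
```

A pad length of 14,000 (0x36B0) has no single-byte representation, which is why the scheme cannot be stretched to the case in the question without changing how the length is encoded.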
I would note that for block ciphers, as long as you also include a good Message Authentication Code (or some other way of ensuring integrity), the actual padding method isn't critical to security. The AES encryption itself will ensure that there is no information leakage other than message length, and the MAC will ensure that if the attacker tries to play games by modifying the message, the result is always a MAC failure, so the attacker won't learn anything. This is true even if you try to design a padding scheme specifically to leak information (assuming, of course, that your padding method doesn't use the key).
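As a rough illustration of the integrity check, the sketch below computes an HMAC-SHA256 tag over the ciphertext and refuses to hand the ciphertext on unless the tag verifies. Tagging the ciphertext (encrypt-then-MAC) and the choice of HMAC-SHA256 are assumptions made for this example; the answer only asks for "a good MAC". The ciphertext is treated as opaque bytes, and the helper names are hypothetical.

```python
import hashlib
import hmac


def make_tag(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the ciphertext."""
    return hmac.new(mac_key, ciphertext, hashlib.sha256).digest()


def verify_before_use(mac_key: bytes, ciphertext: bytes, tag: bytes) -> bytes:
    """Return the ciphertext only if the tag checks out.

    Decryption and padding removal would happen after this call, so a
    modified message is rejected before the padding is ever inspected.
    """
    expected = make_tag(mac_key, ciphertext)
    if not hmac.compare_digest(expected, tag):  # constant-time comparison
        raise ValueError("MAC check failed")
    return ciphertext


mac_key = b"k" * 32                  # placeholder key for the example only
ciphertext = b"\x01\x02\x03\x04"     # stands in for real AES output
tag = make_tag(mac_key, ciphertext)
assert verify_before_use(mac_key, ciphertext, tag) == ciphertext
```

In a real system the MAC key would be separate from the encryption key, and both would come from a proper key-management process.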
Hence, the only real constraints on your padding scheme are:
Not to leak any information through the message length (that's the one thing AES does not disguise)
Be able to remove the padding to obtain the original message without ambiguity
As long as you follow those two constraints, you're golden. One obvious way to meet both goals is to pad the message out to a fixed length and add a 2-byte (or 4-byte) 'original message length' field at either the beginning or the end of the message.
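The fixed-length approach could be sketched as follows. This version assumes a 4-byte big-endian length field at the beginning, zero bytes as filler, and a 16,000-byte target taken from the question; `PADDED_SIZE`, `pad_fixed`, and `unpad_fixed` are names invented for the example.

```python
import struct

PADDED_SIZE = 16_000  # assumed target size; a multiple of the 16-byte AES block
LEN_FIELD = 4         # assumed width of the length field, in bytes


def pad_fixed(message: bytes, total: int = PADDED_SIZE) -> bytes:
    """Prefix the original length, then fill with zero bytes up to total."""
    capacity = total - LEN_FIELD
    if len(message) > capacity:
        raise ValueError("message too long for the fixed size")
    filler = b"\x00" * (capacity - len(message))
    return struct.pack(">I", len(message)) + message + filler


def unpad_fixed(padded: bytes) -> bytes:
    """Read the length field and return exactly that many message bytes."""
    if len(padded) < LEN_FIELD:
        raise ValueError("input shorter than the length field")
    (n,) = struct.unpack(">I", padded[:LEN_FIELD])
    if n > len(padded) - LEN_FIELD:
        raise ValueError("length field exceeds available data")
    return padded[LEN_FIELD:LEN_FIELD + n]


msg = b"x" * 2_000
padded = pad_fixed(msg)
assert len(padded) == PADDED_SIZE
assert unpad_fixed(padded) == msg
```

Because the length is stored explicitly, the filler bytes never need to be parsed, so there is no ambiguity even when the message itself ends in zero bytes. Every padded message comes out at the same size, so the ciphertext length reveals nothing about the original length.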
I'm assuming that you aren't concerned about interoperability with existing systems; if you are, then you'll need to live within whatever system they are using.
And the above answer, "the padding method is not critical", applies only to block ciphers used along with some MAC; when you consider other cryptographic primitives (say, RSA), the padding method does become important.