Question
For quite some time I've been thinking about constructing a hashing scheme in which the content contains its own checksum value and can thereby verify itself. With hashing algorithms like SHA-1 and MD5 this seems to be difficult, although not impossible, as explained here. The hashed content could never change afterwards, but I can think of a few situations where that is exactly what you want: for example, certificates containing their own thumbprint calculated over all fields. I once tried to design the basics myself, but that turned out to be more ambitious than I had foreseen.
The way I see it there are two approaches:
1) Narrow down the possible hash values by analyzing the content, then brute-force the remaining possibilities to see whether the content and the hash match. I have implemented this, and although it worked, it was anything but usable.
2) Calculate a hash and adjust the content to match the hash value. For hashing algorithms like MD5 this is next to impossible, and any new algorithm designed to allow it would probably be weakened by it.
I'm convinced it is both possible and usable in some way, so it surprises me how little there is to find on the subject. Are there any case studies or related algorithms for this idea?
Explanation / Answer
Maybe you can create a hash function like that, but it will have a major security weakness: to achieve your goal, you need some correlated manipulations of the input and the output, and an attacker can exploit that correlation for cryptanalysis.
1) Assume you have a given input and know its corresponding hash value.
2) If you flip a single bit in the input, every bit of the hash value should flip with probability 0.5; otherwise the function is vulnerable to differential cryptanalysis. Formally, such a weakness can be seen as a linear correlation in some part of the algorithm.
That last part, the correlation, is exactly what you need for your self-containing hash function, but it also gives the attacker an advantage in finding collisions or even preimages, running the internal algorithm of the hash backwards, and so on.
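To make the "probability 0.5" property from step 2 concrete, here is a small sketch that measures it for SHA-256. The sample message and the function names are made up for illustration; the code flips each input bit in turn and counts how many of the 256 output bits change. A well-behaved hash should land close to 128 on average.

```python
import hashlib


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the bit at position index inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def average_flipped_bits(message: bytes) -> float:
    """Average number of SHA-256 output bits that change per single-bit input flip."""
    base = hashlib.sha256(message).digest()
    n_bits = len(message) * 8
    total = 0
    for i in range(n_bits):
        changed = hashlib.sha256(flip_bit(message, i)).digest()
        total += hamming_distance(base, changed)
    return total / n_bits


# Hypothetical sample input, standing in for the certificate fields.
message = b"certificate fields go here"
avg = average_flipped_bits(message)
print(f"{avg:.1f} of 256 output bits change on average")
```

If the average came out far from half the output length, or if particular output bits tracked particular input bits, that would be the kind of correlation an attacker looks for.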
The fact that something exists does not mean there is a more efficient way to find it than just testing all candidates.
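As a rough illustration of "just testing all candidates", the sketch below searches for a text that states the first few hex digits of its own SHA-256 hash. Everything in it is an assumption made for the example: the sentence format, the truncation to `k` hex digits, the `nonce` field and the function names. The nonce is there because, for one fixed text, a matching value need not exist at all: if the hash behaves randomly, each of the 16^k candidates matches with probability 16^-k, so there is roughly a 1/e chance that none of them does.

```python
import hashlib
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def build_text(nonce: int, digest: str) -> str:
    # Hypothetical message format; the embedded digest is always the last token.
    return f"nonce={nonce}; the first {len(digest)} hex digits of my SHA-256 are {digest}"


def short_hash(text: str, k: int) -> str:
    """First k hex digits of the SHA-256 hash of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:k]


def find_self_hashing_text(k: int, max_nonce: int = 100):
    """Try every k-digit candidate for each nonce until the text matches its own hash."""
    for nonce in range(max_nonce):
        for digits in product(HEX_DIGITS, repeat=k):
            candidate = "".join(digits)
            text = build_text(nonce, candidate)
            if short_hash(text, k) == candidate:
                return text
    return None  # no match within the nonce budget


def verify(text: str, k: int) -> bool:
    """Check that the last k characters of text equal its own truncated hash."""
    return short_hash(text, k) == text[-k:]


found = find_self_hashing_text(k=3)
print(found)
```

With `k=3` this finishes almost instantly, but the work grows as 16^k per nonce, so embedding a full 64-digit SHA-256 value this way is out of reach.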