
Question

Preface: This question was originally asked on Theoretical Computer Science, and the kind people there referred me to this web site. It is being repeated here in an attempt to find a satisfying answer.

Over the years, two novel encryption techniques have come to mind and been implemented as programming libraries that could be integrated into applications. However, it has never been very clear how to analyze their security and vulnerability characteristics, so their usage has been limited mainly to experimental tests. Are there tools available for automated examination of the properties one might want to understand about an encryption library? Are there groups of people interested in being introduced to new encryption concepts so they can perform their own analysis? I'm not sure where to look.

The first encryption algorithm is a mono-alphabetic simple substitution cipher. It requires two keys to operate and is designed to frustrate frequency analysis. The longer of the keys forms a table with which a normal substitution cipher is applied to the plaintext. Each encoded byte is then split into four two-bit values. The second, shorter key defines four groups of sixty-four unique bytes each; each two-bit value selects which group to draw from, and an output byte is chosen at random from that group. Encoding has two disadvantages: the output is four times larger than the input, and encoding repeated data may still permit some frequency analysis.
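For concreteness, here is a minimal sketch of that encoding in Python. Everything named here (key_table as a 256-entry permutation built from the longer key, groups as the four key-defined lists of sixty-four byte values) is an illustrative assumption based on the description above, not the author's actual library:

    import secrets

    # Sketch only: key_table is a permutation of 0..255 from the longer key;
    # groups partitions the 256 byte values into four lists of 64 according
    # to the shorter key.
    def encode_byte(b, key_table, groups):
        s = key_table[b]                      # ordinary substitution step
        out = bytearray()
        for shift in (6, 4, 2, 0):            # split into four 2-bit values
            two_bits = (s >> shift) & 0b11    # selects one of the four groups
            out.append(secrets.choice(groups[two_bits]))  # random homophone
        return bytes(out)                     # hence the 4x expansion

Decoding would reverse this by mapping each ciphertext byte back to the index of the group that contains it, reassembling the four two-bit values, and inverting key_table.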

The second encryption algorithm is a stream cipher like the first, but internally it operates on blocks of data. It uses two keys: the first is a two-dimensional array describing how to construct a (virtual) multidimensional grid, and the second is an initialization vector for the encoding/decoding engine. It attempts to defeat frequency analysis by encoding each byte together with a window of preceding bytes (initialized from the second key). A byte, together with its preceding window of bytes, forms a multidimensional index into the aforementioned grid. Unfortunately, encoding duplicate blocks of data longer than the window size starts yielding identical output.
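The grid construction is not specified, so any code can only be a loose model. The sketch below stands in for the multidimensional grid lookup with a window-derived choice among ordinary substitution tables; tables, window_size, and the index derivation are all assumptions made for illustration:

    from collections import deque

    # Loose model: derive an index from the window contents and use it to
    # pick a substitution table, approximating "window + byte -> grid cell".
    def encode(data, tables, iv, window_size):
        window = deque(iv[:window_size], maxlen=window_size)  # from key two
        out = bytearray()
        for b in data:
            idx = sum(window) % len(tables)   # stand-in for the grid index
            out.append(tables[idx][b])        # encode byte under that table
            window.append(b)                  # slide the window forward
        return bytes(out)

Even in this toy form the stated weakness is visible: once two identical plaintext runs exceed window_size bytes, the windows coincide and the outputs repeat.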

Explanation / Answer

Typically, there are a variety of steps in evaluating a new algorithm. They start with a quick review:

is it already known?
does it vary only in nonrelevant ways from what is known?

which is often enough to expose vulnerabilities in many amateur attempts at encryption. The point is that there are a number of well-known ways to translate strings of symbols into other strings of symbols, and many of those have been evaluated in common cryptographic studies. Good books on cryptography cover these.

Then, the more intensive analyses begin. These include:

functional inversion
analysis of the symbol statistics
differential cryptanalysis

and a variety of other mathematical techniques for extracting information about the keys from the intercepted stream (the symbol-statistics check is sketched below).
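For example, the symbol-statistics step often begins with something as simple as a chi-squared test of the ciphertext's byte frequencies against the uniform distribution. A sketch (the function name is chosen here for illustration):

    from collections import Counter

    # Compare observed byte frequencies against the uniform expectation.
    # Output from a strong cipher should look flat; a statistic far above
    # the ~255 degrees of freedom suggests exploitable structure.
    def chi_squared_uniform(ciphertext):
        counts = Counter(ciphertext)
        expected = len(ciphertext) / 256
        return sum((counts.get(v, 0) - expected) ** 2 / expected
                   for v in range(256))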

Also, if the technique involves nontraditional information stores (if it doesn't have keys, if it has strange communication channels and protocols, etc.), then you also have to analyse the various impersonation and interception scenarios on each channel.

For the description given, it would appear that the first cryptographic algorithm would fail the first quick review. Substitution ciphers are well known. It doesn't really matter how many times you compose substitution ciphers; the resulting properties are the same. If you had 10 keys and applied substitution 10 times before the final send, it would be equivalent to a single-key substitution cipher and offer no better security. Unless there is some other tweak to the algorithm, it would quickly be dismissed as already known.
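The equivalence is easy to demonstrate: composing two substitution tables yields a third table that encrypts identically in a single step, as in this short Python example (the random tables stand in for keys):

    import random

    # Two chained substitution tables collapse into one equivalent table.
    random.seed(0)
    t1 = list(range(256)); random.shuffle(t1)
    t2 = list(range(256)); random.shuffle(t2)

    composed = [t2[t1[b]] for b in range(256)]    # the single equivalent table

    plaintext = b"attack at dawn"
    two_step = bytes(t2[t1[b]] for b in plaintext)
    one_step = bytes(composed[b] for b in plaintext)
    assert two_step == one_step                    # same cipher, one key

The same collapse happens for any number of chained tables, which is why repeated substitution adds no security.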