Question
I have two ciphertexts, and I suspect they were encrypted with RC4 under a reused key. I have XORed both ciphertexts and obtained a message that is the XOR of the two cleartexts.
I suppose that the underlying messages are written in some structured format, but unfortunately I do not know anything about the character frequencies (I might guess that there will be, e.g., plenty of { } if it's JSON).
Most sources that I have read state that the separation is simple, but do not discuss the method for separating the texts. My idea was to use a backtracking algorithm and try to construct both messages so that their sum (XOR) matches the known combined text.
Is this the correct solution to this issue, or is there some smarter way? I'm also sure that I am not the first programmer solving this issue, so if you can point me to some implementation or library (preferably in Java), I will be grateful.
Thanks.
Explanation / Answer
Maarten appears to make it look like an impossible (or at least exceedingly difficult) task to recover the two plaintexts.
Indeed, if you literally know nothing about the plaintexts, it can be difficult. However, you typically have a reason you are interested in the messages, and hence often have a clue as to what language they might be in. If the language is 'random bits', or something else with high entropy, it can be a difficult or impossible task. However, real plaintexts often don't fall into that category. You mentioned that they might be JSON files; let's go with that.
One common trick that is useful in many languages is crib dragging: you pick an n-gram that is common in the language (for JSON, the 4-gram consisting of the characters ", :, space and " is one possibility) and XOR it into the XORed plaintexts at various offsets. If one plaintext contains that 4-gram at that offset, the result of the XOR is the content of the other plaintext at the same location. Hence, if the result of the XOR is plausible plaintext, you might have successfully decoded both messages there (and can possibly extend the decode, if the other plaintext has obvious previous or next characters).
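Since the question asked for Java, here is a minimal sketch of the crib-dragging step described above. It is not an existing library: the class name CribDrag, its method names, and the "printable ASCII" plausibility test are all assumptions made for illustration, and a real plausibility check would need to be tuned to the expected format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are illustrative, not from any library.
final class CribDrag {

    /** XOR two byte arrays, truncated to the shorter length. */
    static byte[] xor(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        byte[] out = new byte[n];
        for (int i = 0; i < n; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }

    /**
     * XOR the crib into the combined text at the given offset. If one
     * plaintext really contains the crib there, the result is the other
     * plaintext at the same position.
     */
    static byte[] reveal(byte[] combined, byte[] crib, int offset) {
        byte[] out = new byte[crib.length];
        for (int i = 0; i < crib.length; i++) {
            out[i] = (byte) (combined[offset + i] ^ crib[i]);
        }
        return out;
    }

    /** Assumed plausibility test: every byte is printable ASCII. */
    static boolean isPlausible(byte[] fragment) {
        for (byte b : fragment) {
            // Java bytes are signed, so values >= 0x80 are negative here.
            if (b < 0x20 || b > 0x7e) {
                return false;
            }
        }
        return true;
    }

    /** Offsets at which dragging the crib yields a plausible fragment. */
    static List<Integer> candidateOffsets(byte[] combined, byte[] crib) {
        List<Integer> hits = new ArrayList<>();
        for (int off = 0; off + crib.length <= combined.length; off++) {
            if (isPlausible(reveal(combined, crib, off))) {
                hits.add(off);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Stand-in plaintexts; in practice only their XOR is known.
        byte[] p1 = "{\"name\": \"alice\"}".getBytes(StandardCharsets.US_ASCII);
        byte[] p2 = "[1, 2, 3, 4, 5, 6]".getBytes(StandardCharsets.US_ASCII);
        byte[] combined = xor(p1, p2);
        byte[] crib = "\": \"".getBytes(StandardCharsets.US_ASCII);

        for (int off : candidateOffsets(combined, crib)) {
            String fragment = new String(reveal(combined, crib, off),
                    StandardCharsets.US_ASCII);
            System.out.println(off + ": [" + fragment + "]");
        }
    }
}
```

The printable-ASCII filter is deliberately loose, so expect false positives; the candidates it lists still need to be judged by eye, and a stricter filter (for example, one that only accepts characters legal in the suspected format) would cut the list down.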
By using several such cribs, and by expanding revealed plaintexts when possible, I have found it to be a doable process (my experience is with ASCII-encoded English). I don't have any specific experience with JSON, but I would expect it to be doable as well.
I don't personally know of any tooling that would help you (I built my own tooling on the fly when I did it), though I wouldn't be surprised if something exists. I would expect any such tooling to require some hand-holding, as people are more likely than a program to judge correctly whether a candidate plaintext is plausible.