Dear lazyweb...

nicholas on 2008-08-18T11:11:56

Dear lazyweb...

There is some code we have. It implements part of its security by calculating a digest that is not entirely unlike MD5. Well, to be exact, it's specified as taking the Digest::MD5 source code


static void
MD5Init(MD5_CTX *ctx)
{
  /* Start state */
  ctx->A = 0x67452301;
  ctx->B = 0xefcdab89;
  ctx->C = 0x98badcfe;
  ctx->D = 0x10325476;

  /* message length */
  ctx->bytes_low = ctx->bytes_high = 0;
}

and replacing those 4 values with 4 others.

Now, I don't like repeating myself (in code at least. Real life is another matter) and I was trying to find a way to avoid having a complete patched fork of the Digest::MD5 source. That's a static C function, so I can't replace it in a subclass. I've skimmed the source, and I can't see any way to directly knobble A, B, C and D. Am I right in thinking that changing the start state in this fashion before digesting a string $glurpp is exactly equivalent to computing the (true) MD5 of a string "$prefix$glurpp", where $prefix is some fixed prefix string that I don't know yet? If yes, is there any efficient way of computing that prefix, short of brute force?


Brute force, I'm afraid

mauzo on 2008-08-18T16:29:15

Since A, B, C, and D are just the four 32-bit words of the digest, there's no way to find a prefix that produces them without breaking MD5 (by brute force or otherwise).

Re:Brute force, I'm afraid

Ron Savage on 2008-08-19T01:45:45

Well, these days brute force isn't always needed. Rainbow Tables (see Google for details) may help: people precompute the MD5 checksum for vast numbers of strings, so if your target happens to be covered you just need to look up the result to reverse the process.
And, now, let's hear it for: SHA1!!!!

Why do it that way?

ask on 2008-08-26T23:40:04

What do you gain from changing the start state over just adding some secret to the data you are md5'ing?

Oh wait - that's what you are asking about too. :-)

  - ask

Quite difficult

polettix on 2008-09-23T08:40:23

If I recall the subject correctly, $prefix should be the inverse hash calculated on 0x67452301efcdab8998badcfe10325476 in *your* system (i.e. the system with your initial state).

Now, you have a system that has more or less the same strength as MD5 (apart of course from your initial state, which might be stronger or weaker), and you're facing the problem of inverting a hash, which makes it quite difficult for you to find $prefix. If you do manage to find it, you know that you have to flush your system down the toilet, more or less ;)
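One wrinkle may be worth checking before hunting for $prefix. This is offered as a sketch rather than a verdict. MD5 finishes by appending the total message length in bits as a 64-bit little-endian field. The true MD5 of "$prefix$glurpp" would count the prefix bytes, whereas the digest of $glurpp alone with a modified start state would not, so the final padded block would differ unless the byte counter were adjusted as well. The helpers below (md5_length_field and length_field_differs are hypothetical names, not part of Digest::MD5) compute that field for both cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Digest::MD5: write the 8-byte
 * little-endian bit count that MD5 padding appends to a message
 * of nbytes bytes. */
static void
md5_length_field(uint64_t nbytes, uint8_t out[8])
{
  uint64_t bits = nbytes << 3;   /* length is stored in bits, modulo 2^64 */
  size_t i;

  assert(out != NULL);
  for (i = 0; i < 8; i++)
    out[i] = (uint8_t)(bits >> (8 * i));   /* least significant byte first */
}

/* Return 1 if prepending prefix_len bytes changes the length field
 * of a msg_len-byte message, 0 otherwise. */
static int
length_field_differs(uint64_t msg_len, uint64_t prefix_len)
{
  uint8_t plain[8], prefixed[8];
  size_t i;

  md5_length_field(msg_len, plain);
  md5_length_field(msg_len + prefix_len, prefixed);
  for (i = 0; i < 8; i++)
    if (plain[i] != prefixed[i])
      return 1;
  return 0;
}
```

For example, length_field_differs(3, 64) reports a difference for a 3-byte message with an assumed 64-byte prefix, which suggests the two computations would only line up if bytes_low were also started at the prefix length.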

On the other hand, this whole prefix stuff (normally referred to as "salting") could be beneficial. I suspect that you *have* to use the system with the alternative initialisation, because of some past work done on it (possibly with other systems), so you might be stuck with it. If this is the case, I have two suggestions:

1. try to reflect (or get the others to reflect) on the fact that not all initialisers are good for MD5, and you might end up with a weaker system (as an aside, remind them that there's no security through obscurity). Consider salting (with different salts, of course) as an alternative;

2. patch the MD5 module so that it can accept any initialisation sequence, keeping the default as it is now. The module owner might be willing to accept such a patch (no change in behaviour, only added features), and you would have a hook to change those values at will without the need to fork. Sounds like a win-win situation (but I'm not the maintainer, so I can't speak for them).
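A rough sketch of what the second suggestion might look like on the C side, assuming a simplified context struct whose field names follow the MD5Init snippet quoted in the post. MD5InitWithState and MD5_DEFAULT_STATE are hypothetical names, not anything Digest::MD5 provides, and the real module's struct may hold more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the module's context (an assumption);
 * field names are taken from the MD5Init quoted in the post. */
typedef struct {
  uint32_t A, B, C, D;              /* current state */
  uint32_t bytes_low, bytes_high;   /* message length counter */
} MD5_CTX;

/* Standard MD5 start state, as in the quoted MD5Init. */
static const uint32_t MD5_DEFAULT_STATE[4] = {
  0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
};

/* Hypothetical hook: initialise with a caller-supplied start state.
 * Passing NULL selects the standard values. */
static void
MD5InitWithState(MD5_CTX *ctx, const uint32_t state[4])
{
  assert(ctx != NULL);
  if (state == NULL)
    state = MD5_DEFAULT_STATE;

  /* Start state */
  ctx->A = state[0];
  ctx->B = state[1];
  ctx->C = state[2];
  ctx->D = state[3];

  /* message length */
  ctx->bytes_low = ctx->bytes_high = 0;
}

/* The existing entry point keeps its current behaviour. */
static void
MD5Init(MD5_CTX *ctx)
{
  MD5InitWithState(ctx, NULL);
}
```

Existing callers of MD5Init would see no change, while code needing the alternative digest could pass its own four values. How such a hook would be exposed to Perl is left open here.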

Flavio.