Perl source obfuscators are stupid

brian_d_foy on 2006-09-20T04:24:06

I'm writing the chapter in Mastering Perl on cleaning up source code, so I figured I'd look at some code obfuscators. I'm sure other people will have stories to tell.

The most stupid obfuscators just get rid of whitespace. perltidy clears that right up.

The oddest one I found looked like it did a lot of stuff, but the last statement in the file was always eval($foo). I changed the eval() to print() and there's the program. A slightly fancier one had several rounds of that. Still, I had the source in two minutes, and that's just doing it manually.
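To make the trick concrete, here is a minimal sketch of peeling such layers without running anything. The layer format (a hex string handed to pack and then eval) and the names wrap_layer and peel_layers are invented for this illustration; real obfuscators encode their payload in their own ways, so the pattern would have to be adapted to each one.

```perl
use strict;
use warnings;

# Hypothetical layer format, invented for this sketch:
#     eval(pack('H*','<hex digits>'));
sub wrap_layer {
    my ($code) = @_;
    return sprintf q{eval(pack('H*','%s'));}, unpack('H*', $code);
}

# Undo the layers by decoding the string that would have gone to eval(),
# without ever executing it. This is the "change eval to print" step,
# repeated until no wrapper is left.
sub peel_layers {
    my ($code) = @_;
    my $rounds = 0;
    while ($code =~ /\A\s*eval\(pack\('H\*','([0-9a-f]+)'\)\);\s*\z/) {
        $code = pack 'H*', $1;
        $rounds++;
    }
    return ($code, $rounds);
}

# A toy "protected" program with three rounds of wrapping.
my $original   = qq{print "Hello, world\\n";\n};
my $obfuscated = wrap_layer(wrap_layer(wrap_layer($original)));

my ($recovered, $rounds) = peel_layers($obfuscated);
print "Recovered after $rounds rounds:\n$recovered";
```

The loop never calls eval, so the hidden program is shown but not executed.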

I'm thinking, just for the heck of it, creating some de-obfuscators just to put in the book.


de-obfuscators

glob on 2006-09-20T05:30:41

> I'm thinking, just for the heck of it, creating some de-obfuscators just to put in the book.

    perl -MO=Deparse obfuscated.pl

Re:de-obfuscators

brian_d_foy on 2006-09-20T08:11:00

It's a bit more complicated than you think. Deparse can clean up simple-minded things, but the eval trick isn't something Deparse will figure out. It will still show a huge string, the operations on that huge string, and an eval().
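A small sketch of that limitation, using B::Deparse from inside a script rather than from the command line. The obfuscated() routine and its hex string are made up for the example; coderef2text is the documented B::Deparse method for turning a code reference back into source text.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical obfuscated routine: the real code lives in a hex string
# that is only decoded at run time.
sub obfuscated {
    my $blob = '7072696e742022686922';    # hex for: print "hi"
    my $code = pack 'H*', $blob;
    return eval $code;
}

# Deparse reconstructs the routine from its compiled form. The output
# still contains the hex string, the pack call, and the eval, because
# the hidden program only exists as data until the eval actually runs.
my $text = B::Deparse->new->coderef2text(\&obfuscated);
print $text, "\n";
```

The routine is never called here, so the payload is never compiled, which is exactly why Deparse has nothing more to show.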

Re:de-obfuscators

Aristotle on 2006-09-20T09:07:54

Override CORE::GLOBAL::eval to print/save the code before running it?

Re:de-obfuscators

jjore on 2006-09-20T15:26:46

The GOO;eval($code) pattern appeared to be really common in the stuff people showed me so I thought about making B::Deobfuscate optionally run the GOO and replace GOO;eval($code) with the $code.

I just didn't get around to it.
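Roughly what that could look like, as a sketch only and not anything B::Deobfuscate does: rewrite the trailing eval($var) so the variable is returned instead of executed, then run the preamble to get the code. The function name extract_payload and the sample program are invented here, and the regex only handles the simplest eval($name); ending. Note that this does execute the preamble, so it belongs in a throwaway environment when the input is untrusted.

```perl
use strict;
use warnings;

# Run the preamble (the "GOO") and hand back the string it would have
# passed to eval, instead of executing that string.
sub extract_payload {
    my ($source) = @_;

    # Turn a trailing  eval($var);  into a bare  $var;  so that the
    # variable becomes the value of the last statement.
    (my $rewritten = $source) =~ s/\beval\s*\(\s*(\$\w+)\s*\)\s*;?\s*\z/$1;/
        or return;

    my $payload = eval $rewritten;    # runs the preamble only
    die "preamble failed: $@" if $@;
    return $payload;
}

# Toy input: the preamble builds the string "print 42" from byte values,
# and the last statement is the usual eval($code).
my $sample = q{my $code = pack 'C*', 112, 114, 105, 110, 116, 32, 52, 50; eval($code);};

my $payload = extract_payload($sample);
print "Hidden code: $payload\n";
```

For multi-round obfuscation the same function could be applied again to its own output until the substitution no longer matches.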

B::Deobfuscate

Adrian on 2006-09-20T16:53:09

I assume you've come across B::Deobfuscate?

It was caused by a rather entertaining thread on perlmonks a few years back :-)

Re:B::Deobfuscate

brian_d_foy on 2006-09-20T17:37:14

Yes, I ran into that module. I just can't get it to install. :(

Re:B::Deobfuscate

jjore on 2006-09-20T18:02:46

Really? Works for me (IIRC). Send me an RT ticket and I might make it work.

Re:B::Deobfuscate

brian_d_foy on 2006-09-20T18:21:35

My installation failed the signature test. I think I'm just going to write a fake Test::Signature to always return ok.

I could just delete the one that's already there, but something else keeps installing it.

Re:B::Deobfuscate

jjore on 2006-09-20T18:49:25

Ah. perlmonks have told me there are other problems too, but no one bothered to cc me on those either.

Re:B::Deobfuscate

jjore on 2006-09-21T02:33:24

FYI, I removed all the SIGNATURE stuff and fixed a few other minor things. It's released as 0.15 now. It's still no more special than a B::Deparse with a renaming function.

Re:B::Deobfuscate

brian_d_foy on 2006-09-21T04:10:31

I like renaming everything to flowers. :)

Re:B::Deobfuscate

jjore on 2006-09-21T05:45:31

Thanks for noticing. 0.16 even lets you use the Flowers dictionary:

    -MO=Deobfuscate,-DFlowers

    B::Deobfuscate->new( -DFlowers )

If you want to obfuscate...

Alias on 2006-09-22T06:41:14

... then as far as I'm concerned Perl::Squish is a good start (because at least it removes comments/pod and compresses) but anything beyond that is dubious at best.

The whole goal is information extraction, to remove anything that humans need for maintenance that the machine isn't going to need at run-time. But there's only so much of that you can do.

I can see some PPI-based functionality coming down the line eventually to munge the names of lexical scalars, but beyond that I honestly can't think of much you can do that de-obfu can't reverse.

If you write the program properly (assuming a large program) then the best you can really do is deny the attacker the unit tests, docs, comments, layout (if you do any custom layouts for readability that Perl::Tidy can't reinstate) and some lexical name munging.

And possibly you might also be able to remove platform-compatibility and compile-time optimisation stuff, so that what gets deployed to one platform is specific to that platform and lacks anything you did for cross-platform functionality.

But there's really a limit to how far one could take it.

Of course, then there's always something like crypto you can add, but that will have its own caveats.

Adam K

Re:If you want to obfuscate…

Aristotle on 2006-09-22T14:26:07

You can do more. Besides removing information a human needs, you can also do the following:

  • Dilute intent with redundant information

    In each scope, assign all variables from outer scopes that are used to new variables, so it becomes harder to track what is being modified where.

    If you can analyse the source code sufficiently well, you could even introduce global variables used in multiple places as the new location for values.

  • Reduce abstraction

    Inline constants, except for a few instances. Fold most constant expressions you can find. Inline short subroutines in a randomly selected set of their callers. Unroll loops you can analyse. Transform analysable simple cases of recursion into explicit iteration.

  • Increase indirection

    Randomly extract short bits of code into subroutines. Find bits of code that are similar on a tiny scale and extract those too (e.g. several instances of for(LIST){...} becoming xyzzy([LIST],sub{...}) – that sort of thing). Stick random sets of variables together into an array and refer to them by index.

All of these are modifications that no deobfuscator will be able to reverse.

Funnily enough, these are all simple refactorings – which, ironically, would be hard to implement for Perl because the language is impossible to parse, whereas it would be easy to abuse the refactoring tools in Eclipse to automatically obfuscate Java.
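A hand-made before/after pair may make the list above more tangible. Everything here is invented for illustration (the function names, the 0.2 rate, and the xyzzy helper borrowed from the example in the list); it applies three of the ideas by hand: inlining a constant, replacing a loop with a callback helper, and packing variables into one array.

```perl
use strict;
use warnings;

# Readable version: a named constant, named variables, a plain loop.
use constant TAX_RATE => 0.2;

sub total_readable {
    my @prices   = @_;
    my $subtotal = 0;
    for my $price (@prices) {
        $subtotal += $price;
    }
    my $tax = $subtotal * TAX_RATE;
    return $subtotal + $tax;
}

# Increased indirection: the loop body becomes a callback.
sub xyzzy {
    my ($list, $body) = @_;
    $body->($_) for @$list;
    return;
}

# Same computation after the transformations. The names that carried
# the intent (prices, subtotal, tax, TAX_RATE) are gone for good.
sub total_obfuscated {
    my @v = ([@_], 0, 0);                    # prices, subtotal, tax in one array
    xyzzy($v[0], sub { $v[1] += $_[0] });    # loop extracted into xyzzy
    $v[2] = $v[1] * 0.2;                     # constant inlined
    return $v[1] + $v[2];
}

my @prices = (10, 20, 30);
printf "readable:   %s\nobfuscated: %s\n",
    total_readable(@prices), total_obfuscated(@prices);
```

Both functions compute the same total, but a tool looking only at the second one has no way to know that $v[1] was ever called a subtotal.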

Re:If you want to obfuscate…

Alias on 2006-09-23T03:21:13

> Funnily enough, these are all simple refactorings – which, ironically,
> would be hard to implement for Perl because the language is impossible
> to parse, whereas it would be easy to abuse the refactoring tools in
> Eclipse to automatically obfuscate Java.

Which is kind of what I meant by that being all we can do.

It's not that it's impossible in the general case, it's just that WE (Perl) can't do them. Or at least, we can't do many.