Java Decompilers

April 2002 Note.  This article isn't anywhere near current, but it is still valid as general commentary on java decompilers.  I'm not planning to re-review the modern crop, but I will note new programs as I become aware of them.

and are they worth worrying about

Generally, the object of a decompiler is to accept a java .class file as input, and produce a compilable source file as its result. In the chaotic world of software development there are many reasons, legitimate and otherwise, to wish for such a tool. The transparent and information-rich structure of java .class files, which makes Java's dynamic linking so much better than previously common models, also makes such tools particularly easy to build.
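To give a feel for how transparent the .class format is, here is a minimal sketch that reads the fixed header at the start of every class file: the magic number, the minor and major version, and the constant pool count. The class name ClassHeaderPeek and the hand-built ten-byte header are made up for illustration; none of the reviewed decompilers is known to work this way.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
class ClassHeaderPeek {
    /** Decodes the first ten bytes of a .class file. */
    static String describe(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = in.getShort() & 0xFFFF;      // u2 minor_version
        int major = in.getShort() & 0xFFFF;      // u2 major_version
        int poolCount = in.getShort() & 0xFFFF;  // u2 constant_pool_count
        return "major=" + major + " minor=" + minor
                + " constantPoolCount=" + poolCount;
    }

    public static void main(String[] args) {
        // Hand-built header: version 45.3, a version from the JDK 1.0/1.1 era.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0, 3, 0, 45, 0, 16
        };
        System.out.println(describe(header));
    }
}
```

Everything after this header (constant pool, fields, methods, attributes) is laid out just as regularly, which is a large part of why decompilers are feasible.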

What tools are available?

All of these tools are pure java, so the essential distribution consists of a java class library and instructions to invoke it. They're all a little quirky to set up and use, a characteristic shared by many standalone java applications. Once set up, they more or less "just work", producing output that is nearly ready for the compiler.

Testing method

I chose a small utility library, consisting of about 15 classes, as my standard test set. I compiled the library using JDK 1.02, with -o and without -g. I decompiled with all three decompilers, then manually edited the decompiled sources until they could be successfully recompiled. I then decompiled these three sets of "second generation" binaries with each of the three decompilers, yielding nine sets of "third generation" sources. I then manually compared various pairs of sources, looking for inconsistencies which might indicate incorrectly decompiled code. Since this was "only a test", I had the luxury of referring to the original sources, and the double luxury that I wrote these sources myself; two advantages that would not generally be available to anyone using a decompiler in earnest.

The test set was not specifically designed to validate or torture decompilers, and there is no way to know if the results here are representative of all classes, or if the list of problems encountered is complete. It should, however, give you some idea what to watch for.

I organized decompilation errors according to the taxonomy below, based on the general idea that easy-to-spot and easy-to-fix errors were less significant than hidden or hard to fix errors. The very worst thing a decompiler can do is produce code that passes through a compiler without complaint, but which is not functionally equivalent to the original code.

Error Taxonomy

Class 1 errors
    General description: flagged by compiler, easily fixed.
    Examples: boolean variable incorrectly identified as int; missing, but trivial, type cast.

Class 2 errors
    General description: flagged by compiler, not easily fixed.
    Example: generating code containing goto.

Class 3 errors
    General description: ugly, incomprehensible, but correct code.
    Examples: unreconstructed flow control; unreconstructed use of + for string append.

Class 4 errors
    General description: subtle misprints; subtly incorrect programs.
    Examples: failing to use \ to escape characters in string constants; misprinting character constants.

Class 5 errors
    General description: total failure; gross errors.
    Example: crash without producing output.

Class 6 errors
    General description: severely damaged semantics; no warning, and hard to identify.
    Examples: misuse or non-use of "this."; other patently incorrect code.

Decompiler Errors by type

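Mocha, version beta 1
    Class 1 errors: a few
    Class 2 errors: no
    Class 3 errors: no
    Class 4 errors: no
    Class 5 errors: yes, mocha crashes on some class files.
    Class 6 errors: no

WingDis, version 2.06
    Class 1 errors: just one
    Class 2 errors: no
    Class 3 errors: overuse of if(x!=false) and similar constructions
    Class 4 errors: no
    Class 5 errors: no
    Class 6 errors: misuse or non-use of "super."; mistranslation of x=a++; to a++; x=a;

DeJaVu, version 1.0
    Class 1 errors: a few
    Class 2 errors: no
    Class 3 errors: major problem with flow analysis
    Class 4 errors: yes
    Class 5 errors: no
    Class 6 errors: no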

Conclusion, would you use these tools?

As a basis for a new software product, based on reverse engineered code? Definitely not.
As a way to deduce what a particular method of a class library is actually doing? Definitely.
As a means to make an emergency repair in someone else's classes? Maybe, and with great reluctance.

Your mileage may vary.


Detailed Examples

Class 1 problems

Mocha and DejaVu sometimes failed to infer boolean type for integer operations, though it is interesting that they failed in different places.
    PrintStream PrintStream()
    {
        return new PrintStream(outputstream, 1);  // 1 should be true
    }
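
A likely reason for this kind of slip is that the bytecode carries boolean values as the integers 0 and 1, so the decompiler has to recover the type from the signature of the method being called. The hand repair is trivial, as this minimal sketch shows; the class name BooleanRepair and its methods are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical example of the one-token repair for a class 1 error.
class BooleanRepair {
    // The decompiled form, new PrintStream(out, 1), is rejected by the
    // compiler because the second parameter is declared boolean (autoflush).
    static PrintStream open(ByteArrayOutputStream out) {
        return new PrintStream(out, true);
    }

    // Writes through the repaired stream and returns what was captured.
    static String demo() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        PrintStream p = open(out);
        p.print("ok");
        p.flush();
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```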

Beautiful, but it's not Java.

Mocha transformed a static initializer into an elegant, but illegal construction.
    public ConsoleWindow(String string, int i1)
    {
        dead = false;
        styles = { "Plain", "Bold", "Italic" };
        sizes = { "8", "9", "10", "12", "14", "16", "18", "24" };
        ...
Wingdis produced equally beautiful, syntactically correct, but semantically incorrect code for this, making a class 6 error.

On the other hand, DejaVu emitted perfectly legal, but ugly, code as follows:

    public ConsoleWindow(String arg1, int arg2) {
    ...
        String[] Har1;
        Har1 = new String[3];
        Har1[0] = "Plain";
        Har1[1] = "Bold";
        Har1[2] = "Italic";
        this.styles = Har1;
        ...

Class 2 problems

Earlier versions of WingDis sometimes produced code containing GOTO statements, which were not java and nearly impossible to re-code into correct java. I'm pleased to find that this class of error seems to be extinct. The reviewed version of wingdis seems to do a flawless job of flow analysis, as does Mocha.

DeJaVu sometimes emits correct but nearly incomprehensible code; see the class 3 section.


Class 3 problems

Reconstructions that are correct, but perhaps not as easy to read or understand as the original. The quality of reconstruction varies widely, from almost magically good to definitely abysmal.

All three programs were able to reconstruct simple loops quite well, but only mocha and wingdis were able to handle almost everything with equal grace. Dejavu frequently resorts to emitting correct, but nearly incomprehensible code dominated by switch statements.

Simple cases, good results

Two cases are shown for each program: use of "A" + B to generate strings, followed by simple loop reconstruction.
original
public String toString () 
{
        String myname = this.getName();
        return("#<" 
        + super.toString()
        + (myname!=null ? (" " + myname) : "" )
        + ">");
}
public static int LList_Length(LList l)
{       int len=0;
  while(l!=null) { len++; l=l.next; }
  return(len);
}
mocha
public String toString()
{
     String string;
     string = getName();
     return "#<" + super.toString() 
                + ((string != null) 
                        ? (" " + string) 
                        : "") 
                + ">";
}
public static int LList_Length(LList lList)
{
   int i = 0;
   for (; lList != null; lList = lList.next)
         i++;
   return i;
}
wingdis
public String toString() {
  String Stri1= getName(); 
  return "#<" + super.toString() 
        + ( (Stri1 == null) 
            ? "" 
            : new StringBuffer(" " 
                + Stri1).toString() ) 
                + ">"; 
    }
 public static int LList_Length(
     LList LLis0) {
     int int1;
     for (int1= 0; (LLis0!= null) ; LLis0= LLis0.next) {
         int1++; 
     }
     return int1; 
 }

Dejavu
public String toString() {
    String obj;
    StringBuffer Hobj1;
    String Hobj;
    obj = this.getName();
    Hobj1 = new StringBuffer().append("#<")
                                .append(super.toString());
    if (!(obj == null)) {
         Hobj = new StringBuffer().append(" ")
                                .append(obj).toString();
     }
     else {
         Hobj = "";
     }
     return Hobj1.append(Hobj).append(">").toString();
}
 public static int LList_Length(LList arg0) {
   int i;
   i = 0;
   while (arg0 != null) {
        i++;
        arg0 = arg0.next;
   }   /* end while loop */
   return i;
}
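
The StringBuffer chains in the WingDis and DejaVu output are not inventions of the decompilers: compilers of that era translated + on strings into StringBuffer append calls, and a decompiler that does not fold them back simply shows what is in the bytecode. The following sketch, with made-up names, checks that the two spellings produce the same string.

```java
// Hypothetical side-by-side of the source form and the unreconstructed form.
class ConcatEquivalence {
    // The form a programmer would write.
    static String withPlus(String base, String name) {
        return "#<" + base + (name != null ? (" " + name) : "") + ">";
    }

    // The form left behind when the append chain is not folded back into +.
    static String withBuffer(String base, String name) {
        StringBuffer head = new StringBuffer().append("#<").append(base);
        String tail;
        if (name != null) {
            tail = new StringBuffer().append(" ").append(name).toString();
        } else {
            tail = "";
        }
        return head.append(tail).append(">").toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("obj", "n").equals(withBuffer("obj", "n")));
    }
}
```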

More troublesome cases

Complex loop reconstruction
original
public LList Sort_Short_LList (CompareFunction fn) {
           LList out_list = this;
                 LList l=this;
                 LList in_list = l.next;
                l.next = null;
                while(in_list!=null) 
                { /* scan through the in list, performing an insertion
                        sort into the out list */
                LList current_list = in_list;
                Object current_item = current_list.contents;
                LList scan_list = out_list;
                LList prev_scan_list = null;
                in_list = in_list.next;
                while(scan_list!=null 
                  && !fn.InOrder(current_item,scan_list.contents)) 
                        {
                        prev_scan_list = scan_list;
                        scan_list = scan_list.next;
                        }
                current_list.next = scan_list;
        if(prev_scan_list!=null) 
                   { prev_scan_list.next = current_list; 
                   }
                   else
                   {out_list = current_list;
                   }
            }
          return(out_list);
        }
mocha
    public LList Sort_Short_LList(CompareFunction compareFunction)
    {
        LList lList1 = this;
        LList lList2 = next;
        next = null;
        while (lList2 != null)
        {
            LList lList3 = lList2;
            Object object = lList3.contents;
            LList lList4 = lList1;
            LList lList5 = null;
            lList2 = lList2.next;
            for (; lList4 != null && !compareFunction.InOrder(object, lList4.contents); lList4 = lList4.next)
                lList5 = lList4;
            lList3.next = lList4;
            if (lList5 != null)
                lList5.next = lList3;
            else
                lList1 = lList3;
        }
        return lList1;
    }
wingdis
    public LList Sort_Short_LList(
        CompareFunction Comp1) {
        
        dlib.LList LLis2= this; 
        LList LLis3= next; 
        next= null; 
        while (LLis3!= null)  {
            LList LLis4= LLis3; 
            Object Obje5= LLis4.contents; 
            LList LLis6= LLis2; 
            LList LLis7= null; 
            for (LLis3= LLis3.next; (LLis6!= null) ; LLis6= LLis6.next) {
                if (Comp1.InOrder(Obje5, LLis6.contents)!= false)  {
                    break;
                }
                LLis7= LLis6; 
            }

            LLis4.next= LLis6; 
            if (LLis7== null)  {
                LLis2= LLis4; 
                continue;
            }
            LLis7.next= LLis4; 
        }

        return LLis2; 

    }
DejaVu
    public LList Sort_Short_LList(CompareFunction arg1) {
        LList obj5 = null;
        LList obj4 = null;
        LList obj3 = null;
        Object obj2 = null;
        LList obj1 = null;
        LList obj = null;
        int CTL_PC = 1;
        while (true) {
            switch (CTL_PC) {
                case 1: {
                    obj5 = this;
                    obj4 = this.next;
                    this.next = null;
                    CTL_PC = 2;
                    break;
                }
                case 2: {
                    if (obj4 != null) { CTL_PC = 3; break; }
                    CTL_PC = 10;
                    break;
                }
                case 10: {
                    return obj5;
                }
                case 3: {
                    obj3 = obj4;
                    obj2 = obj3.contents;
                    obj1 = obj5;
                    obj = null;
                    obj4 = obj4.next;
                    CTL_PC = 4;
                    break;
                }
                case 4: {
                    if (obj1 == null) { CTL_PC = 7; break; }
                    CTL_PC = 5;
                    break;
                }
                case 5: {
                    if (!(arg1.InOrder(obj2, obj1.contents))) { CTL_PC = 6; break; }
                    CTL_PC = 7;
                    break;
                }
                case 7: {
                    obj3.next = obj1;
                    if (obj == null) { CTL_PC = 8; break; }
                    CTL_PC = 9;
                    break;
                }
                case 9: {
                    obj.next = obj3;
                    CTL_PC = 2;
                    break;
                }
                case 8: {
                    obj5 = obj3;
                    CTL_PC = 2;
                    break;
                }
                case 6: {
                    obj = obj1;
                    obj1 = obj1.next;
                    CTL_PC = 4;
                    break;
                }
            }
        }
    }
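
DejaVu's output above is easier to read once you see the pattern: every basic block becomes a case, and a program counter variable stands in for the gotos. Here is a small sketch of the same dispatch pattern applied to the simple list-length loop. The names DispatchLoopSketch, Node and pc are made up, and this is not actual DejaVu output.

```java
// Hypothetical illustration of a loop flattened into a switch-driven state machine.
class DispatchLoopSketch {
    static final class Node {
        Node next;
        Node(Node next) { this.next = next; }
    }

    // Structured form, as a programmer would write it.
    static int lengthStructured(Node l) {
        int len = 0;
        while (l != null) { len++; l = l.next; }
        return len;
    }

    // The same logic with control flow left unreconstructed.
    static int lengthDispatched(Node l) {
        int len = 0;
        int pc = 1;
        while (true) {
            switch (pc) {
                case 1: // loop test
                    if (l != null) { pc = 2; break; }
                    pc = 3;
                    break;
                case 2: // loop body, then jump back to the test
                    len++;
                    l = l.next;
                    pc = 1;
                    break;
                case 3: // loop exit
                    return len;
            }
        }
    }

    public static void main(String[] args) {
        Node list = new Node(new Node(new Node(null)));
        System.out.println(lengthStructured(list) == lengthDispatched(list));
    }
}
```

Both methods compile and agree, which is the point of a class 3 problem: nothing is wrong except readability.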


Class 4 problems

DeJaVu made an error printing this character constant
            //the original
            if((ch == '\r') || (ch =='\n')) 
             { charisready = true; break; 
             }
            //Oops! character constants in this format are base 8, so should be '\15'
            if (c == '\13' || !(c != '\n')) {   
                  this.charisready = true;
            }
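
The arithmetic is easy to check: carriage return is 13 decimal, which is 15 in octal, while '\13' is 11 decimal, the vertical tab. A short sketch with made-up method names shows that the misprinted test compiles cleanly but no longer recognizes '\r'.

```java
// Hypothetical check of the octal character escapes involved.
class OctalEscapes {
    // The original test.
    static boolean isLineEnd(char ch) {
        return ch == '\r' || ch == '\n';
    }

    // The test as misprinted: '\13' is octal 13, i.e. decimal 11.
    static boolean isLineEndAsMisprinted(char ch) {
        return ch == '\13' || ch == '\n';
    }

    public static void main(String[] args) {
        System.out.println((int) '\15');                 // numeric value of the correct escape
        System.out.println((int) '\13');                 // numeric value of the misprinted escape
        System.out.println(isLineEndAsMisprinted('\r')); // whether '\r' is still recognized
    }
}
```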

Class 5 problems

Mocha crashed (or caused the Java runtime to crash) while decompiling this method, and consequently no output was generated for the class that contained it. Apart from the fact that the existence of a problem is obvious, this is as serious a bug as a decompiler can have.
        public void run()
                {
                int target;
                while((target = pickclass())<classlist.length)
                        {
                        try 
                                {Class.forName(classlist[target]);
                                classes_to_go--;
                                }
                        catch (ClassNotFoundException err)
                                {classes_to_go--;
                                System.out.println("Class " + classlist[target] 
                                + " not found " + err.toString());
                                }
                        changeLabel();
                        }}

Class 6 problems

The very worst thing a decompiler can do is to produce incorrect code, without any warning that there is a problem. Wingdis has a serious, systematic problem with "super.": it adds it where it should not be, and omits it where it must be.

For example, this:

    public void Dispose() 
     { 
       super.Dispose(); 
     }
became
    public void Dispose() 
    {   Dispose();     //missing "super"
        return; 
    }
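
The consequence is worse than a skipped call: without "super." the method calls itself, so the recompiled class recurses until the stack is exhausted. A minimal sketch, using made-up class names rather than the library under test:

```java
// Hypothetical demonstration of what a dropped "super." does at run time.
class SuperCallSketch {
    static class Base {
        boolean disposed = false;
        public void dispose() { disposed = true; }
    }

    static class Correct extends Base {
        public void dispose() { super.dispose(); }
    }

    static class Broken extends Base {
        public void dispose() { dispose(); } // missing "super": calls itself
    }

    // Returns true if calling dispose() exhausts the stack.
    static boolean overflows(Base b) {
        try {
            b.dispose();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(overflows(new Correct()));
        System.out.println(overflows(new Broken()));
    }
}
```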

Wingdis produced this beautiful, legal java code for a static initializer,

    public ConsoleWindow(String string, int i1)
    {
        dead = false;
       String [] styles = { "Plain", "Bold", "Italic" };
       String [] sizes = { "8", "9", "10", "12", "14", "16", "18", "24" };
        ...
Unfortunately, the desired effect of initializing the instance variables "styles" and "sizes" is not preserved, because the declarations create new local variables that hide them. One would have to also add a line such as "this.styles=styles;"
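
The effect is easy to demonstrate. In the sketch below (class names made up for illustration) the first constructor declares a local variable that hides the field, so the field stays null; the second assigns the field directly, using the array creation expression that Java 1.1 and later accept.

```java
// Hypothetical demonstration of a local declaration hiding an instance variable.
class FieldShadowSketch {
    static class Shadowed {
        String[] styles;
        Shadowed() {
            // Declares a new local; the instance variable is never assigned.
            String[] styles = { "Plain", "Bold", "Italic" };
        }
    }

    static class Repaired {
        String[] styles;
        Repaired() {
            // Assigns the instance variable itself.
            this.styles = new String[] { "Plain", "Bold", "Italic" };
        }
    }

    public static void main(String[] args) {
        System.out.println(new Shadowed().styles == null);
        System.out.println(new Repaired().styles.length);
    }
}
```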

Wingdis also has a problem reconstructing expressions which contained ++ and --, for example,

        a[b++]=c;
tends to become
        b++;
        a[b]=c;
which uses the wrong, already incremented, value of b.
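
A small sketch makes the difference concrete. The method names are made up; the third variant shows one way a hand repair could restore the original order of evaluation.

```java
// Hypothetical comparison of a[b++]=c with its mistranslation.
class PostIncrementSketch {
    static int[] original(int c) {
        int[] a = new int[3];
        int b = 0;
        a[b++] = c;   // stores at index 0, then increments b
        return a;
    }

    static int[] mistranslated(int c) {
        int[] a = new int[3];
        int b = 0;
        b++;          // increments first
        a[b] = c;     // so the store lands at index 1
        return a;
    }

    static int[] repaired(int c) {
        int[] a = new int[3];
        int b = 0;
        a[b] = c;     // store with the old value of b
        b++;          // then increment
        return a;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(original(7)));
        System.out.println(java.util.Arrays.toString(mistranslated(7)));
    }
}
```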


comments/suggestions to: ddyer@real-me.net If you think your software is treated unfairly, fix your bugs or convince me they're features. If you want your software reviewed, be patient, (or send me free software :-)
