Sunday, September 6, 2009

LambdaJ: the wild man's closure


A couple weeks back I was talking about using anonymous inner classes to lend a more functional style to your Java. A common response to this was "hey champ, just use Scala" or "hey tough guy, have you tried Clojure?" Thanks. In the immortal words of the Dude, "Obviously you're not a golfer." That aside, I wouldn't have anything useful to say about closures in Java but for a posting on tss that linked to me. As luck would have it that post also had a link to something REALLY interesting: Mario Fusco's lambdaJ. What is lambdaJ? Only one of the most interesting and creative uses of Java syntax I've seen since Szczepan Faber's Mockito came out a couple years ago.

LambdaJ is a library that very artfully plasters over some of the uglier bits of Java, like the collections framework and the lack of closures. I'm not going to waste prose trying to describe it. I'll just cut to the chase with some code:


//example of lambdaJ closure
Closure println = closure(); { of(System.out).println(var(String.class)); }
println.each("one", "two", "three");

//example of lambdaJ forEach
List personInFamily = asList(new Person("Domenico"), new Person("Mario"), new Person("Irma"));
forEach(personInFamily).setLastName("Fusco");

//example of lambdaJ sort
List sortedByAgePersons = sort(persons, on(Person.class).getAge());

The Catch
If you haven't seen that before, I bet you are excited. I was too the first time I saw it and I nearly ran out and refactored all the code I was working on to use lambdaJ. Then I decided to slow down and do some benchmarks. That is where I hit a snag. It turns out that all this magic comes with a price. Most iterative operations are an order of magnitude slower than java's for loops. Here is some benchmarking code.

import static ch.lambdaj.Lambda.forEach;
import static ch.lambdaj.Lambda.forEach;
import static ch.lambdaj.Lambda.on;
import static ch.lambdaj.Lambda.sum;
import static org.junit.Assert.assertTrue;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

public class LambdaJBenchmark {
public class Sequence{

private List<BigDecimal> sequence;

public Sequence(int size){
sequence = new ArrayList<BigDecimal>();
for(int i = 0; i<size; i++){
sequence.add(new BigDecimal(i));
}
}

public int iterativeSum(){
BigDecimal sum = new BigDecimal(0);
for(BigDecimal value : sequence){
sum = sum.add(value);
}
return sum.intValue();
}

public void iterateEntireSequence(){
for(BigDecimal value : sequence){
value.intValue();
}
}

public int lambdaJSum(){
return sum(sequence, on(BigDecimal.class).intValue());
}

public void forEachEntireSequence(){
forEach(sequence).intValue();
}

}

@Test
public void lambdaJIsSlowerThanForForTenThousandItems() {
long startTime = System.currentTimeMillis();
new Sequence(10000).lambdaJSum();
long endTime = System.currentTimeMillis();
long lambdaJRunningTime = endTime - startTime;

startTime = System.currentTimeMillis();
new Sequence(10000).iterativeSum();
endTime = System.currentTimeMillis();
long javaRunningTime = endTime - startTime;

System.out.println("Summing 10000 numbers. LambdaJ run time: " + lambdaJRunningTime + " and Java: " + javaRunningTime);
}

@Test
public void lambdaJIsSlowerThanForForOneItems() {
long startTime = System.currentTimeMillis();
new Sequence(1).lambdaJSum();
long endTime = System.currentTimeMillis();
long lambdaJRunningTime = endTime - startTime;

startTime = System.currentTimeMillis();
new Sequence(1).iterativeSum();
endTime = System.currentTimeMillis();
long javaRunningTime = endTime - startTime;

System.out.println("Summing 1 number. LambdaJ run time: " + lambdaJRunningTime + " and Java: " + javaRunningTime);
}

@Test
public void lambdaJNestedLoopsAreMoreOrLessAsFastAsFlatLoopOfSameSize() {
List<Sequence> sequences = new ArrayList<Sequence>();
for(int i = 0; i<100; i++){
sequences.add(new Sequence(100));
}

//Seems that "warming up" lambdaJ makes a big difference.
//Without this line the second test is an order of mag. faster.
new Sequence(10).lambdaJSum();

long startTime = System.currentTimeMillis();
new Sequence(10000).lambdaJSum();
long endTime = System.currentTimeMillis();
long flatRunTime = endTime - startTime;

startTime = System.currentTimeMillis();
forEach(sequences).lambdaJSum();
endTime = System.currentTimeMillis();
long nestedRunTime = endTime - startTime;

assertTrue("Summing 100 numbers 100 times. Run time: " + nestedRunTime + " but summing 10000 numbers took: " + flatRunTime, (nestedRunTime < (flatRunTime + 15)) && (nestedRunTime > (flatRunTime - 15)));
}
}
Here is sample output:
Summing 10000 numbers.  LambdaJ run time: 259 and Java: 19


All those tests show you how much slower things are with lambdaJ. I haven't really looked into the lambdaJ code, but I imagine reflection is what is slowing this down... still, for some applications perhaps the trade-off between speed and readability is worth it. Another thing to bear in mind is that lambdaJ is a pretty new project. They might still be optimizing this stuff. Looking at the LambdaDemoTestMain class (in the lambdaJ source) it is clear that Mario and company are thinking about this performance stuff. They have a more exhaustive set of speed comparisons like the ones I created.

More Catches
Bear in mind that I'm new to lambdaJ and might be missing a few tricks here. But here are some gotchas that have tripped me up. That closure trick is really clever, but when you try to use it I think you will find the behavior of lambdaJ closures to surprising. If you try something like this:

Sequence sequenceOfOne = new Sequence(1);
Closure foo = closure(); { sequenceOfOne = new Sequence(2); }

You realize you can't really do assignment because you need to use of() to bind sequenceOfOne to the active closure. Another surprise to me was that you can't have a closure with a return value. See this example:

private Integer plusTen(int lambdaJSum) {
return lambdaJSum + 10;
}

Closure sequenceSumPlusTen = closure();{ of(this).plusTen(var(Sequence.class).lambdaJSum()); }
Integer sequenceOfOnePlusTen = (Integer)sequenceSumPlusTen.apply(new Sequence(1));


I expected sequenceOfOnePlusTen to be 11, but sequenceOfOnePlusTen gets set to null. And no, you can't just inline a return statement in there. I hope I don't seem to be picking on lambdaJ. I think it is great. I would love to be able to use the sort methods and forEach methods in my codebase. But while we are talking about gotchas, here are some more: Using forEach on a null or zero sized collection requires using a different forEach method that has a slightly different syntax. And using lambdaJ on classes that are final will cause problems because lambdaJ can't proxy those classes. I suppose Strings must play hell with lambdaJ for that reason. And trying this code out, sure enough, I get a class cast exception.


List<String> strings = new ArrayList<String>();
strings.add("foo");
forEach(strings).toUpperCase();


Other interesting behavior I noticed was that the first lambdaJ operation you do is much slower than subsequent ones. See my benchmarking tests from before. The last test case will fail horribly if you comment out the lambdaJ "warm up code". I haven't dug through the lambdaJ code enough to understand why.

Conclusion
In the end you see how lambdaJ is a fantastic project full of great ideas. It offers you a taste of how things could be different if Java implemented closures. However it is just papering over this ugliness in Java. And because of the amount of proxy, thread local and reflection magic they had to use to do it, you should be prepared for some surprises when things don't work as you would expect. Regardless, my hat is off to the lambdaJ team. This is a great project that deserves to get more attention. I hope the project matures, inspires more DSL style code and whets the Java communities appetite for real closures and a better collections framework.

2 comments:

Mario Fusco said...

Thanks a lot to give my library a try. Of course there is still room for improvements, but as you can imagine java is not a DSL friendly language.

Anyway if you have any suggestion, advice or critic please feel free to post them on the lambdaj WishList wiki page

http://code.google.com/p/lambdaj/wiki/WishList

or to email me at:

mario.fusco[at]gmail.com

Thanks again
Mario

Mario Fusco said...

Hi again,

I was rereading your trouble with lambdaj closures and I just realized you were using them in the wrong way.

Actually lambdaj closures can return values. About your example I suppose you were trying to do something like that:

private Integer plusTen(int lambdaJSum) {
return lambdaJSum + 10;
}

Closure sequenceSumPlusTen = closure();{ of(this).plusTen(var(Integer.class)); }
Integer sequenceOfOnePlusTen = (Integer)sequenceSumPlusTen.apply(1);

Am I missing something?

Bye
Mario