Functional Java

Stephen Swensen

4.11/5 (6 votes)

Nov 13, 2009

CPOL

8 min read

40764

285

Functional programming with functors and object streams in Java.

Download source - 139 KB

Introduction

The goal of the swensen.functional package is to provide Java with:

a set of generic functors,
an immutable wrapper type unifying Java arrays and Iterables featuring method chaining and lazy functional transformations, and
a set of generic tuple types to facilitate ad-hoc data views common in functional programming.

The project was conceived by the author as a means to grow comfortable with Java, and inspired by the author's experience with C# and F#. The types in swensen.functional do not seek to extend Java as a language, they work with core language features, making heavy use of generics and anonymous inner classes in particular. In this spirit, it is the author's intent to maximize the utility of this library within the day-to-day Java programming practices of developers and minimize the set of new tools they are required learn; we do not need another dynamically typed DSL or over-wrought framework. To that end, we gain a serious portion of the expressiveness and guarantees of functional programming but without necessarily its succinctness, which is not possible within the bounds of the Java language.

Generic Functors (Func0, Func1, Func2, ... ; Action0, Action1, Action2, ...)

Comparator and Runnable are well-known examples of functors in Java. They are interfaces specifying a single method, and they are often created via anonymous inner classes. With generics, we can forego ad-hoc functors and refine ourselves to two varieties: FuncX (non-void return type), and ActionX (void return type), where X is the number arguments encapsulated by our functors (arbitrarily implemented through 7 in swensen.functional). For example, let's take Func2:

package swensen.functional;

/**
 * @author Stephen Swensen
 * A generic functor for non-Void methods with two parameters.
 * @param <T1>    the type of the first parameter of call
 * @param <T2>    the type of the second parameter of call
 * @param <R>    the return type of call
 */
public interface Func2 <T1,T2,R> {
    /**
     * Invoke this functor synchronously.
     * @param t1    the first parameter
     * @param t2    the second parameter
     * @return    the return value
     */
    public R call(T1 t1, T2 t2);
}

Here we see that T1, T2, and R are type parameters allowing us to encapsulate any method having two (non-primitive) arguments T1, T2 and a (non-primitive) return type R. We might use it as follows:

Func2<String,String,Integer> func = new Func2<String,String,Integer>() {
    public Integer call(String t1, String t2) {
        t1 = (t1 == null) ? "" : t1.trim();
        t2 = (t2 == null) ? "" : t2.trim();
        return t1.length() - t2.length();
    }
};

Integer result = func.call("hello world   ", "  hello world"); //result == 0

swensen.functional also supplies a set of functors designed to facilitate compatibility between generic and legacy functors: CallableFunc, ComparatorFunc, and RunnableAction. Each of these abstract classes implement both their indicated legacy functors and their generic analogs, and may be instantiated directly by an anonymous inner class or by the static method from overloaded to accept pre-existing instances of either super type. Additional compatibility functors may be added by request. One other functor, Predicate<T>, is an abstract class implementing Func1<T,Boolean> and exposing several methods for building compound Predicates.

Sequences (Seq)

Seq is an immutable type implementing Iterable and featuring method chaining and lazy evaluation. Seqs are constructed via the static method of overloaded for Iterables and all primitive and generic vararg arrays. The wrapped data-structure is not actually copied into the resulting Seq, and it is never modified: all operations against Seqs result in new Seqs. This is achieved by using anonymously constructed Iterators to produce Seq elements on demand (a technique used throughout the Seq class).

Note that while Seq is itself immutable, it allows some vulnerability to this guarantee:

When constructing a Seq, the supplied Iterator is not copied and therefore may potentially be mutated by external reference holders, an allowance under performance considerations.
It has a public constructor for the sole purpose of allowing extension (the static of methods are the preferred methods for construction), reaching a compromise between guarantees for library consumers and flexibility for library designers.

In general, this is the design compromise we favor when developing immutable types such as Tuples further on.

Since Seqs are immutable, their underlying Iterator's remove method should always throw an UnsupportedOperationException. Therefore, we create an abstract class called ReadonlyIterator extending Iterator for this purpose:

public abstract class ReadonlyIterator<E> implements Iterator<E> {
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

And we demonstrate how it is used when constructing a Seq from a generic vararg array:

/**
 * Construct a sequence from a generic array.
 * Lazy Iterable wrapper around arr, no new memory is allocated.
 * Outside structural mutation to arr compromises the immutability of this Seq.
 * Note that since we override Seq.of for each and every primitive array type, we will
 * never get surprising behavior like that seen with Arrays.asList.
 * @param <E>    element type
 * @param arr    an array, may be null
 * @return    a new Seq constructed from arr
 * (or Seq.EMPTY if arr is null or length 0)
 */
public static <E> Seq<E> of(final E...arr)
{
    if(arr == null || arr.length == 0)
        return Seq.getEmpty();

    return Seq.of(new Iterable<E>(){
        public Iterator<E> iterator() {
            return new ReadonlyIterator<E>(){
                int cur = 0;

                public boolean hasNext() {
                    return cur < arr.length;
                }

                public E next() {
                    if(cur < arr.length)
                        return arr[cur++];
                    else
                        throw new NoSuchElementException();
                }
            };
        }
    });
}

Seq makes extensive use of the functors we introduced earlier. It includes fundamental functional methods like filter, map, and foldl, along with many others useful for manipulating and generating Seq object streams. Here is a sample of the form and style of programming we can enjoy:

/* elements generated on demand, no memory allocated
   up front except for Seq object itself. */
Seq<Integer> range1 = Seq.range(1, 10);

/* range1 is neither mutated nor traversed at this time. */
Seq<Integer> range2 = range1.append(34, 36, 38, 39);

Func1<Integer,Integer> minus3 = new Func1<Integer,Integer>(){
    public Integer call(Integer t1) {
        return t1 - 3;
    }
};

/* could alternatively use Predicate<Integer> */
Func1<Integer,Boolean> isPositiveOdd = new Func1<Integer,Boolean>(){
    public Boolean call(Integer t1) {
        return (t1 % 2) == 1;
    }
};

/* note the use of method chaining
   {1,2,3,4,5,6,7,8,9,10,34,36,38,39} ->
   {-2,-1,0,1,2,3,4,5,6,7,31,33,35,36} -> {1,3,5,7,31,33,35} */
Seq<Integer> result = range2.map(minus3).filter(isPositiveOdd);

Action1<Object> println = new Action1<Object>() {
    public void call(Object obj){
        System.out.println(obj);
    }
};

/* apply the println Action to each element immediately. */
result.iter(println);

Func2<Integer,Integer,Integer> sum = new Func2<Integer,Integer,Integer>(){
    public Integer call(Integer t1,Integer t2) {
        return t1 + t2;
    }
};

/* 115 */
Integer resultSum = result.foldl(0, sum);

Tuples (Tuple1, Tuple2, Tuple3, ...)

The decision to include tuple implementations was a difficult one, and new to version 2.0, as the author did not want to introduce unnecessary elements for developers to learn. But ultimately, it became apparent that these were critical for achieving rich functional expressiveness, and the alternative required an even more burdensome collection of less generic types. Similar to FuncX and ActionX, we have a set of generic TupleX implementations (immutable, of course) where X is the number of fields represented, implemented from 1 to 7 (arbitrarily). Additionally, we have a class Tuples filled with static methods for constructing various tuples, giving us the benefit of better type inference over TupleX constructors, and a single overloaded method create for all our construction needs. Let's look at the implementation of Tuple3; notice our implementations for toString(), equals(), and hashCode():

public class Tuple3<T1,T2,T3> {
    //note that we can use public fields instead
    //of getters since they are marked final
    public final T1 t1;
    public final T2 t2;
    public final T3 t3;
    public Tuple3(T1 t1, T2 t2, T3 t3) {
        this.t1 = t1;
        this.t2 = t2;
        this.t3 = t3;
    }
    public static <T1,T2,T3> Tuple3<T1,T2,T3> create(T1 t1, T2 t2, T3 t3)
    {
        return new Tuple3<T1,T2,T3>(t1,t2,t3);
    }
    public String toString()
    {
        return String.format("(%s, %s, %s)", t1, t2, t3);
    }
    public boolean equals(Object other) {
        if (this == other)
            return true;
        if (!(other instanceof Tuple3<?,?,?>))
            return false;

        Tuple3<T1,T2,T3> rhs = (Tuple3<T1,T2,T3>) other;
        if (!(this.t1==null ? rhs.t1==null : this.t1.equals(rhs.t1)))
            return false;

        if (!(this.t2==null ? rhs.t2==null : this.t2.equals(rhs.t2)))
            return false;

        if (!(this.t3==null ? rhs.t3==null : this.t3.equals(rhs.t3)))
            return false;

        return true;
    }
    public int getHashCode() {
        int hash = 1;
        hash *= 31 + (t1 == null ? 0 : t1.hashCode());
        hash *= 31 + (t2 == null ? 0 : t2.hashCode());
        hash *= 31 + (t3 == null ? 0 : t3.hashCode());
        return hash;
    }
}

And we might use it as follows:

Tuple3<Integer, String, Seq<Double>> tuple3 = 
                Tuples.create(5, "hello", Seq.of(0.5, 1.2, 5.3));
int t1 = tuple3.t1;
String t2 = tuple3.t2;
Seq<Double> t3 = tuple3.t3;
        
//(5, hello, {0.5, 1.2, 5.3})
System.out.println(tuple3.toString());

Features at a Glance

What follows is a limited reference of some of the features found in this library:

The following demonstrate all of the ways Seqs may be constructed using the of method:

Seq<Integer> a = Seq.of(1,2,3,4); //generic vararg array
Seq<Integer> b = Seq.of(new Integer[] {1,2,3,4}); //generic array
Seq<Integer> c = Seq.of(new int[] {1,2,3,4}); //primitive array
Seq<Integer> d = Seq.of(Arrays.asList(1,2,3,4)); //any Iterable

Seq also provides a rich set of bounded and unbounded range Seq overloaded for int, long, float, double, BigInteger, and BigDecimal, including reverse ranges and skips:

Seq.range(-2, 2); // -2, -1, 0, 1, 2
Seq.range(2L, -2L); // 2L, 1L, 0L, -1L, -2L
Seq.range(BigInteger.valueOf(-4), 
          BigInteger.valueOf(4), 
          BigInteger.valueOf(2)); // -4, -2, 0, 2, 4 (all BigIntegers)
Seq.unboundedRange(2.0, .5); // 2.0, 2.5, 3.0, 3.5, 4.0, ...

We can also generate infinite sequences using Seq.generate which takes a producer functor as an argument. Here is a simple example using Seq.generate to produce an infinite sequence of random numbers:

final Random r = new Random();
Seq.generate(new Func0<Integer>() {
    public Integer call() {
        return r.nextInt();
    }
});

Since Seq implements Iterable, it may be used anywhere Iterables are used in the Java standard library or your own libraries. Additionally, Seq implements a few methods for converting Seqs to standard Java collection types:

List<Integer> list = Seq.of(1,1,2,2,3,3).toArrayList();
Integer[] array = Seq.of(1,1,2,2,3,3).toArray(Integer.class);

/* [1,2,3] */
Set<Integer> set = Seq.of(1,1,2,2,3,3).toHashSet();

/* [1L : "1", 2L : "2", 3L : "3"] */
Map<Long, String> map = Seq.of(1,1,2,2,3,3).toHashMap(
    new Func1<Integer, Long>() {
        public Long call(Integer integer) {
            return integer.longValue();
        }
    },
    new Func1<Integer, String>() {
        public String call(Integer integer) {
            return integer.toString();
        }
    }
);

Seqs may be created by concatenating other Seqs:

/* 1,2,3,5,6 */
Seq.of(1,2,3).append(4,5,6);

/* 4,5,6,1,2,3 */
Seq.of(1,2,3).prepend(4,5,6);

/* 1,2,3,4,5,6,7,8,9 */
Seq.concat(Seq.of(null, Seq.of(1,2,3), Seq.<Integer>getEmpty(), 
           Seq.of(4,5,6), null, Arrays.asList(7,8,9), null));

In contrast to index / length based methods for extracting sub-sequences typical of Lists and Arrays, functional data-structures typically use a combination of take and skip methods:

/* 0,1,2 */
Seq.of(0,1,2,3,4,5,6,7,8).take(3)

/* 3,4,5,6,7,8 */
Seq.of(0,1,2,3,4,5,6,7,8).skip(3)

/* 3,4,5 */
Seq.of(0,1,2,3,4,5,6,7,8).skip(3).take(3);

Seq provides a sensible overload for toString as well as structural, pair-wise equals:

/* "{1, 2, null, 4, 5}" */
Seq.of(1,2,null,4,5).toString();

/* true */
Seq.of(1,2,3,4).equals(Seq.of(new int[]{1,2,3,4}));

Seq provides several methods for querying itself, including aggregate functions:

Seq<Integer> s = Seq.of(4,3,2,1);
/* true */
s.skip(4).isEmpty();

/* 4 */
s.count();

/* the following three methods have overloads for Comparator, 
   Func2, and ComparatorFunc as well */    

/* 1,2,3,4 */
s.sort();

/* 4 */
s.max();

/* 1 */
s.min();

Seqs may also be manipulated as sets for convenience, although these are not implemented lazily:

Seq<Integer> s1 = Seq.of(0,0,1,1,2,2);
Seq<Integer> s2 = Seq.of(1,1,2,2,3,3);

/* 0,1,2,3 */
s1.union(s2);

/* 1,2 */
s1.intersect(s2);

/* 0,1,2 */
s1.distinct();

There are several other methods which may be performed on Seqs beyond what has been explored here, and the real power of functional programming with Seq lays in chaining these methods together in a series of light-weight steps.

Unit Testing

This project represents my first experience with unit testing. Since then I have broadened my experience with unit and integration testing through work on Groovy on Grails and Java Swing projects, with teams seriously dedicated to the practice. What I find both interesting and gratifying with the tests for swensen.functional is that no mocking frameworks or dynamic features are required to effectively test these libraries; being filled with all immutable data-structures and pure functions, there are no internal dependencies which need to be accounted for. Indeed, unit tests in functional libraries serve as mere verification of the implementation of algorithms which can be mathematically proven. Given a set of inputs, the outputs are always the same. This builds extreme confidence in refactoring and implementation tweaking, which is indeed one of the goals of unit testing in general, but in my experience is never so rock-solid in programs filled with mutation and consequently with tests relying on mocking frameworks.

Challenges

The initial design phase was quite intense. In particular, choosing the best representation of functors was an early challenge. Initially, I experimented with dynamic functors which invoked standard methods using Reflection. But the inadequacies of generics (no runtime Reflection of parametric-types), complexity (for example, resolving overloaded methods), and realization that a Reflection based approach would sour many users who treasure compile-time type checking, led me to ultimately settle on the simple, generic, strongly-typed, familiar approach using interfaces which would be instantiated as anonymous classes.

Probably the biggest, continuing challenge has been the awkwardness of the Iterator interface which is at the core of all lazily evaluated Seq methods. Specifically, hasNext(), which:

performs a potentially expensive operation,
may be called repeatedly.

Therefore, certain algorithms have to take special pains to actually retrieve and store the next value when hasNext() is called. Take Seq.filter() for example, which is conceptually straight-forward, but surprisingly difficult to implement with Java's main mechanism for doing so:

public Seq<E> filter(final Func1<? super E,Boolean> predicate)
{
    if(predicate == null)
        throw new IllegalArgumentException("predicate cannot be null");

    if(this == EMPTY)
        return Seq.getEmpty();

    final Iterable<E> that = this;
    return Seq.of(new Iterable<E>(){
        public Iterator<E> iterator() {
            return new ReadonlyIterator<E>(){
                Iterator<E> thatIterator = that.iterator();

                boolean endReached = false;
                boolean curSet = false;
                E cur = null;

                public boolean hasNext() {
                    if(endReached)
                        return false;
                    else if(curSet)
                        return true;

                    while(thatIterator.hasNext()) {
                        cur = thatIterator.next();
                        if(predicate.call(cur)) {
                            curSet = true;
                            return true;
                        }
                    }

                    return !(endReached = true);
                }

                public E next() {
                    if(hasNext()) {
                        curSet = false;
                        return cur;
                    }
                    else
                        throw new NoSuchElementException();
                }
            };
        }
    });
}

Final Remarks

The "Download source" link at the top of this article is a zip file containing the swensen.functional package source (tested against Sun's JRE version 1.5 update 22 for Win32), JavaDocs, and JUnit 4 tests. I appreciate any suggestions or bug reports.

History

Version	Date	Description
1.00	November 11, 2009	First release.
1.01	November 14, 2009	Added `null` argument checks to several methods. Changed `Seq.any` and `Seq.all` to accept `Func1` instead of `Predicate`. Overloaded `Seq.min`/`max`/`sort` for `Comparator`, `Func2`, and `ComparatorFunc`. Fixed `Seq.isEmpty()`.
1.02	March 7, 2010	Worked around bug in Mac OSX Java 6 compiler for `Seq.min`/`max`/`sort`. Fixed some JavaDoc issues (illuminated by switching from Eclipse to IntelliJ IDEA). Created `test.swensen.functional.SeqTest` JUnit 4 test class and created basic test cases for `Seq`.
1.03	May 22, 2010	Fixed bug in `Seq.foldl` not respecting pair-wise associativity.
2.0, Major Release	December 5, 2010	Renamed `Seq.eager` to `Seq.force`. Renamed `Seq.elementAt` to `Seq.ith`. Added `Seq.nth`. Replaced `Seq.subSequence` with `Seq.skip` and `Seq.take`. Added `Seq.range` overload with a step. Added `Seq.range` overloads for `Long`, `Float`, `Double`, `BigInteger`, and `BigDecimal`. Fixed potential for stack overflow bug in `Seq.filter`. Added `Seq.takeWhile` and `Seq.skipWhile`. Removed `generate` overload which took terminator predicate: use `generate().takeWhile()` instead. Added `truncate`. Added Tuple types. Added `Seq.groupBy` and supporting `Grouping` and `GroupingSeq` types (extending a `Tuple` and `Seq` types, respectively). Added the static `Seq.concat` method. Added `Seq.partition` and supporting `Partition` type (extending a Tuple). Added `Seq.iteri` and `Seq.itern`. Changed all uses of `NullPointerException` to `IllegalArgumentException`. Changed target JRE from 1.6 to 1.5 update 22. Added `Seq.shuffle`. Added `Seq.unboundedRange` overloads. Added `Seq.zip` and `Seq.zip`.