Generating binary keys for algorithmic comparators. A user may construct an algorithmic comparator by creating a ComparatorExpr object (through the various static methods in this class). The user can then create a KeyGenerator object and use it to create binary keys for tuples. The same KeyGenerator object can be reused for different tuples that conform to the same schema. Sorting the tuples by their binary keys yields the same ordering as sorting them with the algorithmic comparator. Basic idea (without optimization):
TODO Remove the strong dependency on Pig by adding a DatumExtractor interface that allows applications to extract a leaf datum from user objects, something like the following:
interface DatumExtractor { Object extract(Object o); }
A user may then do something like this:
class MyObject {
  int a;
  String b;
}
ComparatorExpr expr = KeyBuilder.createLeafExpr(
    new DatumExtractor() {
      // Interface methods are implicitly public, so the implementation must be public too.
      public Object extract(Object o) {
        MyObject obj = (MyObject) o;
        return obj.b;
      }
    },
    DataType.CHARARRAY);
TODO Change BagExpr to IteratorExpr, so that it may be used in a more general context (any Java collection).
TODO Add an ArrayExpr (for Java arrays, []).
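The ordering property stated at the top of this comment (sorting by binary key gives the same order as sorting by the algorithmic comparator) can be illustrated with a small, self-contained sketch. It does not use or reproduce KeyGenerator or ComparatorExpr. The names BinaryKeySketch, Row, encode and compareKeys are hypothetical, and the encoding shown (sign-bit flip plus big-endian bytes for ints, UTF-8 bytes plus a NUL terminator for strings) is one common order-preserving technique, assumed here purely for illustration. It further assumes strings contain no NUL characters and no supplementary (non-BMP) characters.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of order-preserving binary keys; not the KeyGenerator implementation. */
class BinaryKeySketch {

    /** Hypothetical two-field tuple: ordered by a (int), then by b (String). */
    static final class Row {
        final int a;
        final String b;
        Row(int a, String b) { this.a = a; this.b = b; }
    }

    /** The "algorithmic" comparator that the binary key must agree with. */
    static final Comparator<Row> ALGORITHMIC =
        Comparator.<Row>comparingInt(r -> r.a).thenComparing(r -> r.b);

    /** Encodes a row so that unsigned byte-wise comparison matches ALGORITHMIC. */
    static byte[] encode(Row r) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Flip the sign bit so negative ints sort before positive ones as unsigned bytes.
        int flipped = r.a ^ 0x80000000;
        out.write(flipped >>> 24); // write(int) keeps only the low 8 bits
        out.write(flipped >>> 16);
        out.write(flipped >>> 8);
        out.write(flipped);
        // UTF-8 bytes followed by a NUL terminator (assumes the string has no NUL characters).
        byte[] s = r.b.getBytes(StandardCharsets.UTF_8);
        out.write(s, 0, s.length);
        out.write(0);
        return out.toByteArray();
    }

    /** Lexicographic comparison of two keys, treating each byte as unsigned. */
    static int compareKeys(byte[] x, byte[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xff) - (y[i] & 0xff);
            if (d != 0) return d;
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row(3, "a"), new Row(-5, "b"), new Row(3, ""), new Row(-5, "ab"));

        List<Row> byComparator = new ArrayList<>(rows);
        byComparator.sort(ALGORITHMIC);

        List<Row> byKey = new ArrayList<>(rows);
        byKey.sort((p, q) -> compareKeys(encode(p), encode(q)));

        // Row does not override equals, so this checks that both lists hold
        // the same objects in the same positions.
        System.out.println("same order: " + byComparator.equals(byKey));
    }
}
```

In this sketch the key is re-encoded on every comparison for brevity; in practice one would generate each key once and sort the byte arrays, which is the point of producing binary keys in the first place.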