The optimizer takes the user-specified program plan and creates an optimized plan that describes exactly how the physical execution will take place. It first translates the user program into an internal optimizer representation and then chooses among the alternative shipping strategies and local strategies.
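As an illustration of what choosing among shipping and local strategies can look like, the following minimal sketch enumerates every combination for a single two-input join and picks the cheapest one. The function name `choose_join_strategy`, the strategy labels, and the cost formulas are all assumptions made up for this example; they are not taken from the optimizer itself, and a real cost model would be considerably richer.

```python
from itertools import product


def choose_join_strategy(size_left, size_right, parallelism):
    """Pick the cheapest (shipping, local) pair under a toy cost model.

    All strategy names and cost formulas here are hypothetical.
    """
    # Toy shipping cost: number of records sent over the network.
    shipping = {
        "repartition both inputs": size_left + size_right,
        "broadcast left input": size_left * parallelism,
        "broadcast right input": size_right * parallelism,
    }
    # Toy local cost: a hash join reads both inputs and pays extra for the
    # side it builds a table on; a sort-merge join pays extra on both sides.
    local = {
        "hash join, build left": 2 * size_left + size_right,
        "hash join, build right": size_left + 2 * size_right,
        "sort-merge join": 2 * (size_left + size_right),
    }
    # Enumerate every combination and keep the one with the lowest total cost.
    candidates = [
        (ship, loc, shipping[ship] + local[loc])
        for ship, loc in product(shipping, local)
    ]
    return min(candidates, key=lambda candidate: candidate[2])
```

With a large left input and a small right input, for example `choose_join_strategy(1000, 10, 4)`, this toy model selects broadcasting the right input and building the hash table on it. Because the two cost tables are independent here, the exhaustive enumeration is more than strictly needed; it is kept to show the shape of the search.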
The basic principle follows the optimizers of systems such as Volcano/Cascades and Selinger/System-R/DB2. The optimizer first descends from the sinks toward the sources, generating interesting properties; it then ascends from the sources, generating alternative plans and pruning them against those interesting properties.
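The two passes can be sketched on a simple chain of operators. This is a simplified illustration under stated assumptions, not the optimizer's actual code: the classes `Alternative` and `Plan`, the constant `RESHUFFLE_COST`, the property string `"partitioned(k)"`, and all cost figures are hypothetical, and the sketch handles only a linear pipeline and only the direct consumer's requirements, whereas a real optimizer works on a DAG and propagates properties further.

```python
from dataclasses import dataclass
from typing import Optional

RESHUFFLE_COST = 6.0  # hypothetical cost of establishing a missing property


@dataclass(frozen=True)
class Alternative:
    name: str
    cost: float
    requires: Optional[str]  # property the input must have, if any
    delivers: Optional[str]  # property the output has, if any


@dataclass(frozen=True)
class Plan:
    steps: tuple
    cost: float
    delivers: Optional[str]


def interesting_properties(operators):
    """Top-down pass: walk from the sink toward the source and record, for
    each operator's output, the properties its consumer could exploit."""
    interesting = [set() for _ in operators]
    for i in range(len(operators) - 2, -1, -1):
        consumer_alternatives = operators[i + 1]
        interesting[i] = {
            alt.requires for alt in consumer_alternatives if alt.requires is not None
        }
    return interesting


def prune(candidates, interesting):
    """Keep the cheapest plan overall plus the cheapest plan that delivers
    each interesting property."""
    kept = [min(candidates, key=lambda plan: plan.cost)]
    for prop in interesting:
        matching = [plan for plan in candidates if plan.delivers == prop]
        if matching:
            best = min(matching, key=lambda plan: plan.cost)
            if best not in kept:
                kept.append(best)
    return kept


def optimize(operators, use_interesting=True):
    """Bottom-up pass: walk from the source toward the sink, expand every
    surviving plan with every alternative, then prune."""
    interesting = interesting_properties(operators)
    plans = [Plan((), 0.0, None)]
    for i, alternatives in enumerate(operators):
        candidates = []
        for plan in plans:
            for alt in alternatives:
                cost, steps = plan.cost + alt.cost, plan.steps
                if alt.requires is not None and plan.delivers != alt.requires:
                    cost += RESHUFFLE_COST  # input lacks the required property
                    steps += ("reshuffle",)
                candidates.append(Plan(steps + (alt.name,), cost, alt.delivers))
        plans = prune(candidates, interesting[i] if use_interesting else set())
    return min(plans, key=lambda plan: plan.cost)


# Hypothetical pipeline, listed from source to sink.
PIPELINE = [
    [Alternative("scan", 1.0, None, None)],
    [
        Alternative("broadcast join", 5.0, None, None),
        Alternative("repartition join", 8.0, None, "partitioned(k)"),
    ],
    [Alternative("reduce", 2.0, "partitioned(k)", "partitioned(k)")],
    [Alternative("write", 1.0, None, None)],
]
```

In this made-up pipeline the broadcast join is locally cheaper, but the repartition join delivers a partitioning that the following reduce can reuse. `optimize(PIPELINE)` keeps the repartition join alive because its output property is interesting and ends with a total cost of 12.0, while `optimize(PIPELINE, use_interesting=False)` prunes it early and ends with 15.0.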
The optimizer also assigns memory to the individual tasks. This is currently done in a very simple fashion: all sub-tasks that need memory (e.g., reduce or join) are given an equal share of memory.
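The equal-share rule described above can be sketched in a few lines. The function name `assign_memory`, the dictionary-based input, and the use of integer division (which drops any remainder) are assumptions made for this illustration only.

```python
def assign_memory(total_memory, tasks):
    """Split total_memory evenly among the sub-tasks that need memory.

    tasks maps a (hypothetical) sub-task name to True if it needs memory.
    Sub-tasks that need no memory are left out of the result.
    """
    consumers = [name for name, needs_memory in tasks.items() if needs_memory]
    if not consumers:
        return {}
    share = total_memory // len(consumers)  # assumption: remainder is unused
    return {name: share for name in consumers}
```

For example, `assign_memory(1024, {"map": False, "reduce": True, "join": True})` gives the reduce and the join 512 units each and assigns nothing to the map.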