Abstract:
The conditions for building optimizing parallelizing compilers for distributed-memory computing systems are described. The target systems are "supercomputer-on-a-chip" integrated circuits. Both optimizing program transformations specific to distributed-memory systems and transformations needed for distributed-memory and shared-memory systems alike are presented. The problem of minimizing interprocessor transfers when parallelizing a recursive function is discussed. The central approach to building such compilers is block-affine placement of data in distributed memory with minimization of interprocessor transfers. It is shown that parallelizing compilers for distributed-memory systems should be built on a high-level internal representation and a high-level output language.
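
To give a concrete flavor of the central technique, the following is a minimal sketch of a block-affine data placement: an array is split into fixed-size blocks, and each block is assigned to a processor by an affine function of its block index. The helper names (`owner`, `local_index`) and the specific parameter choices are illustrative assumptions, not the paper's actual interface.

```python
# Sketch of block-affine placement (hypothetical helper names, not from the paper).
# An array of n elements is split into blocks of size b; block k is mapped to
# processor (alpha * k + beta) mod p -- an affine function of the block index,
# which is the general shape of a block-affine distribution.

def owner(i: int, b: int, p: int, alpha: int = 1, beta: int = 0) -> int:
    """Return the processor that owns element i under the block-affine map."""
    block = i // b                    # block index of element i
    return (alpha * block + beta) % p

def local_index(i: int, b: int) -> int:
    """Offset of element i inside its block (its position in local memory)."""
    return i % b

# Example: 16 elements, block size 4, 4 processors, identity affine map:
# elements 0..3 land on proc 0, 4..7 on proc 1, and so on.
placement = [owner(i, b=4, p=4) for i in range(16)]
```

Choosing the block size and the affine coefficients so that elements accessed together land on the same processor is what drives the minimization of interprocessor transfers described above.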