Towards complete automatic code generation
Abstract
A compiler consists of two phases: source program analysis and object program synthesis.
The problem of compilation has been tackled in depth, and many tools and techniques exist for building the analysis phase. Similar elegant solutions do not exist for automatic generation of back-ends. Existing studies of automatic code generation have attempted to automate particular tasks of code generation and do not offer a complete solution.
This thesis describes a new code generator generator environment.
Our environment, Poorna, takes a description of an architecture and produces a code generator. A new architectural description language (ADL), Shilpa, has been designed. The ADL analyzer outputs a set of tree patterns and architectural information. The tree patterns, together with a built-in pattern matching algorithm, form the instruction selector, and the architectural information can be directly used in optimization modules.
Shilpa is an ADL designed for code generation. The emphasis is on the description of the semantics of various operations at the instruction level. A description is in the form of attributed objects. Important objects of an architecture are instructions, runtime specifications, and instruction execution details. The semantics of an instruction is accepted in tree form. Runtime information is accepted for a register model in the form of a list of directives. To make the specification more general, context-sensitive tuning of the intermediate representation (IR) is allowed.
The environment does not attempt to automate particular optimization phases. Instead, it tries to take a complete description of instruction characteristics. Towards this end, an Extended Finite State Automaton (EFSA) is defined. In the automaton, states represent resources and edges define instruction execution characteristics.
A variant of the top-down tree pattern matching algorithm forms the instruction selector. This algorithm is designed to cover subject DAGs in terms of tree patterns. The algorithm also considers one level of term rewriting. It identifies templates of rewrites in the subject structure and considers both the template and the result of the rewrite rule for matching. This is expected to make the front-end completely independent of the back-end while ensuring optimal instruction selection.
Algorithms have been defined to extract information from an instance of the EFSA. The information is provided in the form of a set of initialized arrays which can be directly used in optimization modules. Register information can be completely derived, and register allocation algorithms are fully retargetable. The latency information is used in instruction scheduling algorithms. Algorithms are provided to derive latencies and parallel paths of execution. Incorporation of different instruction scheduling-related strategies is discussed.
Code generators for ISCO and RS6000 have been built as part of the environment by the author, and two other code generators (R3000 and SPARC) have been built as part of student projects. The performance of these is analyzed. The environment does not seem to have any inherent drawback in producing good retargetable code generators.