UPDATE [13 Feb 2010]: This website is severely outdated and may contain incorrect or incomplete information.

Context-Free Language Translation for the DDC Assembler


Jim Gaines
Sponsor: Hill Air Force Base
Faculty Adviser: Dr. Chris Myers

Department of Electrical and Computer Engineering
University of Utah
Salt Lake City, UT 84102

Abstract

In computer science, as in linguistics, language translation occupies a large part of the literature. While the linguist is frequently concerned with phonetics and universal properties, the computer scientist is often interested in optimization and platform independence. Further, to ensure consistently correct operation, there can be no ambiguity in the translation from source to machine code. Thus, every possible line of code in DDC assembly must reduce to a sequence of recognized symbols that will, in turn, translate to a unique stream of bits. This injective mapping from source to bit streams has been achieved in the DDC assembler by developing a context-free grammar, written in extended Backus-Naur Form (EBNF), from which a lexer and parser are constructed.

Lexical and syntactic analysis, akin to spelling and grammar checks in a word processor, are handled by the lexer and parser, respectively. This analysis essentially checks for uniqueness: it throws an exception upon any ambiguity or, given grammatically correct source, calls backend methods to generate object code that is later transferred to the DDC ASIC. As manual lexer/parser design has often been considered difficult and resistant to future changes in the language specification, it was decided in the early phases of the project that this task would be best left to a parser generator: a tool that generates optimized lexers and parsers from a formal context-free grammar description of a specified language.
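
The flow described above (lex, then parse, then call the backend to emit bits) can be illustrated with a small hand-written sketch in Python. Everything in it is an assumption made for illustration: the two mnemonics, the R0-R3 register names, the 14-bit encoding, and the names lex, parse, encode and assemble are hypothetical and are not taken from the DDC instruction set or from the ANTLR-generated assembler.

```python
import re

# Hypothetical instruction set; NOT the real DDC opcodes.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010}

# Token patterns, tried in order. REGISTER precedes MNEMONIC so "R1" is not
# lexed as a mnemonic.
TOKEN_RE = re.compile(
    r"(?P<REGISTER>R[0-3]\b)|(?P<MNEMONIC>[A-Z]+)|(?P<NUMBER>\d+)|(?P<COMMA>,)"
)
WHITESPACE_RE = re.compile(r"\s+")


class AssemblyError(Exception):
    """Raised when a line cannot be lexed or parsed."""


def lex(line):
    """Split a source line into (kind, text) tokens: the 'spelling check'."""
    tokens, pos = [], 0
    while pos < len(line):
        ws = WHITESPACE_RE.match(line, pos)
        if ws:
            pos = ws.end()
            continue
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise AssemblyError(f"unrecognized symbol {line[pos]!r} at column {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse(tokens):
    """Check the single rule: instruction := MNEMONIC REGISTER ',' NUMBER."""
    kinds = [kind for kind, _ in tokens]
    if kinds != ["MNEMONIC", "REGISTER", "COMMA", "NUMBER"]:
        raise AssemblyError(f"syntax error, token kinds were {kinds}")
    mnemonic, register, _, number = (text for _, text in tokens)
    if mnemonic not in OPCODES:
        raise AssemblyError(f"unknown mnemonic {mnemonic!r}")
    immediate = int(number)
    if immediate > 0xFF:
        raise AssemblyError(f"immediate {immediate} does not fit in 8 bits")
    return mnemonic, int(register[1]), immediate


def encode(mnemonic, register, immediate):
    """Backend: pack 4-bit opcode, 2-bit register, 8-bit immediate."""
    return (OPCODES[mnemonic] << 10) | (register << 8) | immediate


def assemble(line):
    """Lex, parse, and encode one line into a 14-bit word."""
    return encode(*parse(lex(line)))


if __name__ == "__main__":
    print(format(assemble("LOAD R1, 5"), "014b"))  # 00010100000101
```

In this toy encoding every accepted line maps to exactly one 14-bit word, and a line that does not match the grammar raises an exception instead of producing bits. A generated parser plays the same role for a full grammar without the hand-written matching.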

ANTLR, an open-source, license- and royalty-free, predicated-LL(k) parser generator, has been used to this end. As the product of a doctoral thesis and almost two decades of research and active community interaction, ANTLR was the obvious choice. Unlike the archetypal yacc/bison and lex/flex generators, ANTLR takes a novel approach to language translation in that it makes little distinction between the lexer and parser and uses an extended BNF grammar definition for both. Using ANTLR has allowed the current implementation of the DDC assembler to inherit the extensibility, optimization and hierarchical exception handling typically found only in more mature assemblers.

  • Presentations
    • [29 March 2007] Technical Open House (Awarded Best Presentation) (pdf)
  • Papers
    • [03 May 2007] Developer's Guide
    • [12 April 2007] DDC Assembly Language Specification and Assembler Implementation (section: "Context Free Language Translation")
    • Grammar, Reduced Backus-Naur
    • Grammar, Complete Extended Backus-Naur in ANTLR/C#
  • ANTLR Links: