CISC 471/672 Programming Assignment #2 — MeggyJava Scanner and setPixel Compiler
Introduction
You are about to embark on writing your first compiler and raise your compiler wizardry level! PA2 is the first programming assignment for CISC 471/672 in which you can work in a team. Your team will extend a JLex-based scanner to include all of the tokens in the MeggyJava language. You will also write a parser that generates AVR assembly code for the PA2 grammar, which is basically just setPixel() statements. All the files you need to get started are located in the PA2 folder.
For the scanner part of the assignment, an incomplete regression testing system is provided. This testing system iterates through all input files fname.in in a TestCases directory and compares the output with the corresponding fname.in.OK files. You are provided with a MJPA2Driver.java program that calls the scanner repeatedly and outputs each symbol that is read in.
Example: The file plus.in contains:
    + + + + + + + + + + +
and scanning it should produce the following result (NOTE that the given driver doesn't quite do this. You have to edit it.):
    symbol: #2 symbolValue: [+ at: (1,1) value: -1]
    symbol: #2 symbolValue: [+ at: (2,1) value: -1]
    symbol: #2 symbolValue: [+ at: (2,3) value: -1]
    symbol: #2 symbolValue: [+ at: (2,5) value: -1]
    symbol: #2 symbolValue: [+ at: (3,5) value: -1]
    symbol: #2 symbolValue: [+ at: (4,3) value: -1]
    symbol: #2 symbolValue: [+ at: (4,7) value: -1]
    symbol: #2 symbolValue: [+ at: (5,1) value: -1]
    symbol: #2 symbolValue: [+ at: (5,9) value: -1]
    symbol: #2 symbolValue: [+ at: (6,2) value: -1]
    symbol: #2 symbolValue: [+ at: (6,8) value: -1]
The symbol number "#2" corresponds to a constant for the PLUS token (see sym.java, generated by the JavaCUP tool). A Symbol object contains a token and a SymbolValue object. A SymbolValue object contains a lexeme, line, position, and integer value. The default integer value is -1. This default indicates that no extra value is associated with a token such as PLUS, whereas a ColorLiteral, a ToneLiteral, or a number will have a useful associated value. See the Meggy Java Tokens description in the Resources section of the website for a description of Symbols and their associated values.
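To make the output format above concrete, here is a minimal Java sketch of a value holder that prints in the same shape. It is only an illustration: the class name SymbolValueSketch, its field names, and the describe() helper are assumptions and are not taken from the provided starter code, whose SymbolValue class may be organized differently.

```java
// Hypothetical stand-in for the provided SymbolValue class.
// All names here are assumptions made for illustration only.
final class SymbolValueSketch {
    final String lexeme; // text matched by the scanner
    final int line;      // 1-based line number
    final int pos;       // 1-based position within the line
    final int value;     // associated integer value, -1 if none

    SymbolValueSketch(String lexeme, int line, int pos, int value) {
        this.lexeme = lexeme;
        this.line = line;
        this.pos = pos;
        this.value = value;
    }

    // Tokens such as PLUS carry no extra value, so they get the default -1.
    SymbolValueSketch(String lexeme, int line, int pos) {
        this(lexeme, line, pos, -1);
    }

    @Override
    public String toString() {
        return "[" + lexeme + " at: (" + line + "," + pos + ") value: " + value + "]";
    }

    // Formats one line of driver output for a given symbol number.
    static String describe(int symbolNumber, SymbolValueSketch sv) {
        return "symbol: #" + symbolNumber + " symbolValue: " + sv;
    }

    public static void main(String[] args) {
        System.out.println(describe(2, new SymbolValueSketch("+", 1, 1)));
        System.out.println(describe(46, new SymbolValueSketch("45", 1, 12, 45)));
    }
}
```

The second constructor mirrors the rule that -1 is used whenever a token has no associated value.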
Another example: funny.in:
    3abc abc3 (45,67,89ten) /* ,,, ***/ ;;;; // end
funny.in.OK:
    symbol: #46 symbolValue: [3 at: (1,1) value: 3]
    symbol: #47 symbolValue: [abc at: (1,2) value: -1]
    symbol: #47 symbolValue: [abc3 at: (1,6) value: -1]
    symbol: #5 symbolValue: [( at: (1,11) value: -1]
    symbol: #46 symbolValue: [45 at: (1,12) value: 45]
    symbol: #21 symbolValue: [, at: (1,14) value: -1]
    symbol: #46 symbolValue: [67 at: (1,15) value: 67]
    symbol: #21 symbolValue: [, at: (1,17) value: -1]
    symbol: #46 symbolValue: [89 at: (1,18) value: 89]
    symbol: #47 symbolValue: [ten at: (1,20) value: -1]
    symbol: #6 symbolValue: [) at: (1,23) value: -1]
    symbol: #34 symbolValue: [; at: (1,37) value: -1]
    symbol: #34 symbolValue: [; at: (1,38) value: -1]
    symbol: #34 symbolValue: [; at: (1,39) value: -1]
    symbol: #34 symbolValue: [; at: (1,40) value: -1]
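The funny.in example shows two scanner behaviors worth noticing: "3abc" and "89ten" each split into a number followed by an identifier, and comments and whitespace produce no symbols but still advance the position. The following plain-Java sketch approximates that behavior with java.util.regex. It is not JLex and not the provided scanner; the class name FunnyScanSketch, the identifier pattern, and the small set of punctuation handled are assumptions chosen only to cover this one example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex-based approximation of the scanner's behavior on funny.in.
final class FunnyScanSketch {
    // Each alternative matches as much as it can, so "89ten" yields the
    // number 89 and then the identifier ten. The id pattern is an assumption.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<skip>/\\*.*?\\*/|//[^\\n]*|\\s+)"
        + "|(?<num>[0-9]+)"
        + "|(?<id>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<punct>[(),;])",
        Pattern.DOTALL);

    // Returns one "[lexeme at: (line,pos) value: v]" string per token.
    static List<String> scan(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int line = 1;      // current 1-based line
        int lineStart = 0; // offset of the first character of the current line
        int at = 0;        // offset of the next unread character
        while (at < input.length()) {
            m.region(at, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                    "illegal character at (" + line + "," + (at - lineStart + 1) + ")");
            }
            String lexeme = m.group();
            if (m.group("skip") == null) {
                int value = m.group("num") != null ? Integer.parseInt(lexeme) : -1;
                out.add("[" + lexeme + " at: (" + line + "," + (at - lineStart + 1)
                    + ") value: " + value + "]");
            } else {
                // Comments and whitespace emit nothing but may contain newlines.
                for (int i = 0; i < lexeme.length(); i++) {
                    if (lexeme.charAt(i) == '\n') {
                        line++;
                        lineStart = at + i + 1;
                    }
                }
            }
            at = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : scan("3abc abc3 (45,67,89ten) /* ,,, ***/ ;;;; // end")) {
            System.out.println(s);
        }
    }
}
```

In your actual solution these rules live in mj.lex, and the symbol numbers come from sym.java rather than from anything in this sketch.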
For the second part of this assignment, you will create the MeggyJava-to-AVR compiler for the PA2 subset of the MeggyJava language. This subset allows any number of pixels to be set with Meggy.setPixel() and includes byte casts as parameters to the Meggy.setPixel() function calls. You will
- use your lexer from Part I of the assignment,
- build the parser with JavaCUP, and
- perform syntax-directed code generation to AVR assembly code for Meggy.setPixel() statements.
The Assignment
Part I
To complete this assignment you must:
- Extend mj.lex so that it scans all the tokens specified in the Meggy Java Tokens description. Do not change the order of the terminal definitions in the bogus parser, as this would change their symbol numbers.
- Run regress.sh and confirm all provided test cases pass.
- Add test cases for non-provided token types (see Terminals doc).
- Add error cases for all token types (see Terminals doc).
Part II
For the second part of the assignment, you should create a jar file, MJ.jar, that can be executed as follows:
    java -jar MJ.jar InputFile.java
The input file can be any PA2 MeggyJava program. The PA2Test.java example you wrote for PA1 is a possible test case for MJ.jar. The output file, named InputFile.java.s, should be an AVR assembly program that, after going through the build process, can run on the Meggy Jr device. The InputFile.java.s file must also run in the AVR simulator.
Assembly (.s) programs have a prolog and epilog. Files for these are provided in avrH.rtl.s and avrF.rtl.s.
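As a rough picture of how the prolog and epilog wrap the generated code, consider the following Java sketch. The class name AsmFileSketch and its method names are hypothetical, and it assumes avrH.rtl.s and avrF.rtl.s sit in the working directory; your compiler may locate and write these files differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper showing the layout of the output file:
// prolog, then generated code, then epilog.
final class AsmFileSketch {
    // InputFile.java becomes InputFile.java.s
    static String outputName(String inputName) {
        return inputName + ".s";
    }

    // Concatenates the three parts in order.
    static String assemble(String prolog, String body, String epilog) {
        return prolog + body + epilog;
    }

    static String read(Path p) throws IOException {
        return new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
    }

    // args[0] is the MeggyJava source file name.
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AsmFileSketch InputFile.java");
            return;
        }
        String body = ""; // code generated for the setPixel statements would go here
        String text = assemble(read(Paths.get("avrH.rtl.s")), body,
                               read(Paths.get("avrF.rtl.s")));
        Files.write(Paths.get(outputName(args[0])),
                    text.getBytes(StandardCharsets.UTF_8));
    }
}
```

With an empty body, the output is just the prolog followed by the epilog, which is presumably what a MeggyJava program with no statements should produce.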
The set of instructions you will need includes those in the avrH.rtl.s and avrF.rtl.s files and the following:
    # Examples of each statement type are provided.
    # Register numbers, constant values, and function names can change.
    ldi r24,lo8(1)
    ldi r24,73
    call functionName
Execute MJSIM.jar for a list of available instructions. For more details about the instructions see the AVR Overview from Michelle Strout.
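To illustrate how the ldi and call instructions could combine for a single Meggy.setPixel() statement with constant arguments, here is a hedged Java sketch of an emitter. Several things in it are assumptions rather than facts from this handout: the class and method names are hypothetical, the choice of r24, r22, and r20 follows the usual avr-gcc convention for the first three one-byte arguments, the function name is passed in as a parameter because the real runtime function name is not given here, and lo8() is used to model a byte cast on a constant by keeping only its low 8 bits.

```java
// Hypothetical emitter for one setPixel call with constant arguments.
final class SetPixelEmitSketch {
    // Assumed argument registers (avr-gcc style): first, second, third byte argument.
    private static final String[] ARG_REGS = {"r24", "r22", "r20"};

    // x, y: pixel coordinates; colorValue: numeric value of the color literal.
    // functionName: placeholder for the runtime routine that sets the pixel.
    static String emitSetPixel(int x, int y, int colorValue, String functionName) {
        int[] values = {x, y, colorValue};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            // lo8(...) keeps the low byte, mirroring a (byte) cast on a constant.
            sb.append("    ldi ").append(ARG_REGS[i])
              .append(",lo8(").append(values[i]).append(")\n");
        }
        sb.append("    call ").append(functionName).append("\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // "functionName" is the same placeholder used in the instruction list.
        System.out.print(emitSetPixel(1, 2, 5, "functionName"));
    }
}
```

In a syntax-directed compiler, a method like this would be called from the parser action for the setPixel statement rule, appending its result to the body that goes between the prolog and epilog. Check the AVR Overview and the simulator for the register usage and function names your generated code actually needs.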
Notice that avrF.rtl.s already has an infinite loop at the end of main so that the program will always remain running on the Meggy Jr device even if there is no while loop in the MeggyJava program.
For this assignment, no error handling is necessary. In other words, you can assume the input is correct.
You will start off by downloading PA2Start.tar.gz.
Provided Test Cases for Part I
You are provided with various test cases. This set is incomplete. For each token type (Specials, Reserved Words, Reserved Phrases, Int-Literal, Id, Comments), create files fnm.in and fnm.in.OK that thoroughly test that type, and provide files errfnm.in and errfnm.in.OK that thoroughly test incorrect tokens of that type; e.g., an errspec.in could contain ">@$" and more.
Submitting the Assignment
- Make sure you test your implementations thoroughly.
- Include all of the regression test terminal files (.in and .in.OK) that you wrote.
- Include a README file explaining the file structure (your test files) and other comments you want your TA to know (e.g. features not implemented).
- Submit the assignment by committing and pushing your files, including the PA1.jar, to your team's private GitHub repository, to which the instructor and TA have shared access. Submit one assignment per group. Make sure to call the tarball PA2.tar.
- Sanity Check (procedure the TA will use to grade your assignment):
    # Unpack and build:
    > tar xf PA2.tar
    > cd PA2    # note the expected directory name
    > cat README
    > make
    # Examine tests for part I:
    > cd TestCases
    > find *.in | xargs -I % sh -c 'echo -e "----\n" % "\n-----"; cat %; \
        echo -e "\n * * * \n"; cat %.OK; echo -e "\n\n\n"'
    # Execute tests:
    > ./regress.sh
    # Test cases for part II:
    java -jar MJ.jar TestCase.java
    java -jar MJSIM.jar TestCase.java.s > t1
    javac TestCase.java
    java TestCase > t2
    diff t1 t2
Evaluation
This assignment is graded on a 70-point scale. Your submission will be graded with the following breakdown:
- 25 points: Part I: Scanner for PA2 Grammar
  - 10 points: Extension of mj.lex to correctly scan all tokens in the PA2 grammar, demonstrated by all test cases for PA2 tokens running correctly
  - 5 points: All original test cases from regress.sh run correctly
  - 5 points: Added sufficient test cases for non-provided token types
  - 5 points: Added error cases for all token types
- 35 points: Part II: Syntax-directed compiler for PA2 grammar
  - 5 points: parses correctly for the PA2 grammar, regardless of code generation capability
  - 5 points: generates correct code for the smallest possible MeggyJava program (one with no statements)
  - 5 points: generates correct code for the PA2 subset for setting pixels with at least 2 different constant integer expressions
  - 5 points: generates correct code for the PA2 subset for setting pixels with at least 2 different color literals
  - 5 points: generates correct code for the PA2 subset for setting pixels with byte casting
  - 5 points: generates correct code for a sequence of (at least 10) setPixel calls
  - 5 points: awesome test cases to show off what your first compiler can do!
  - Note: no error handling is required. However, if you include error handling with error messages and test cases to demonstrate it, you can earn up to 5 extra credit points.
- 5 points: README explaining the file structure (your test files) and other comments you want your TA to know (e.g. features not implemented)
- 5 points: Followed instructions for deliverable and submission format/files