CS 477 – Project – 4366
College of Computing and Informatics Project Deadline: Sunday 12/05/2024 @ 23:59 [Total Mark is 14] Student Details: CRN: Name: Name: Name: ID: ID: ID: Instructions: • You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on Blackboard via the allocated folder. These files must not be in compressed format. • It is your responsibility to check and make sure that you have uploaded both the correct files. • Zero mark will be given if you try to bypass the SafeAssign (e.g. misspell words, remove spaces between words, hide characters, use different character sets, convert text into image or languages other than English or any kind of manipulation). • Email submission will not be accepted. • You are advised to make your work clear and well-presented. This includes filling your information on the cover page. • You must use this template, failing which will result in zero mark. • You MUST show all your work, and text must not be converted into an image, unless specified otherwise by the question. • Late submission will result in ZERO mark. • The work should be your own, copying from students or other resources will result in ZERO mark. • Use Times New Roman font for all your answers. General Instructions Pg. 01 General Instructions • • • • • • • You are allowed to work in a group of 3 students for this assessment activity. Each submitted part of this Project should contain two components: o A brief report describing the development progress o Source code of implementation The source code should be separated into modules using an understandable coding style and naming rules. The report should contain instructions about how to run your code. If any extra environment needs to be setup for your code, please give full instruction in your report. The report should list all your test cases, including the expected result for each test case. If we cannot run your code, a demo will be required and you will be notified through email. You can use any programming language for implementation in the project e.g. Java, C etc. The submission must contain a .zip file: • • of all your code a set of input files to be used for testing purpose, as well as a printout of the resulting output of the program for each input file Grading Scheme: Work Component Part 1 Part 2 Part 3 Part 4 Deliverables 1. 2. 1. 2. 1. 2. 1. 2. Documentation Implementation Documentation Implementation Documentation Implementation Documentation Implementation Marks / Comments 3.5 3.5 3.5 3.5 If the deliverables are complete and the scanner is working as expected. If the deliverables are complete and syntax analysis phase is implemented correctly If the deliverables are complete and semantic analysis phase is implemented correctly If the deliverables are complete and code generation phase is implemented correctly Part 1: Scanner Pg. 02 Part 1: Scanner Design and implement a scanner for a programming language whose lexical specifications are given below. The scanner identifies and outputs tokens (valid words and punctuation) in the source program. Its output is a token that can thereafter be used by the syntax analyzer to verify that the program is syntactically correct. When invoked, the scanner should extract the next token from the source program and should be able to output a token even if the input does not form a correct program. The syntax of the language will be specified later in Part 2. Atomic lexical elements of the language: id ::= letter alphanum* alphanum ::= letter | digit | _ integer ::= nonzero digit* | 0 float ::= integer fraction [e[+|−] integer] fraction ::= .digit* nonzero | .0 letter ::= a..z |A..Z digit ::= 0..9 nonzero ::= 1..9 Operators, punctuation and reserved words: == < > = ; , . : :: + * / = && ! || ( ) { } [ ] /* */ // if then else for class integer float read write return main Notes: • • • You are responsible for providing appropriate test cases that test for a wide variety of valid and invalid cases. It is up to you to analyze this lexical definition and figure out if there are ambiguities or other errors. Any changes to this definition, if any, must be justified and should not result in diminishing the expressive power of the language. Also, you have to design the lexical analyzer in a flexible way, so that it can easily be adapted if changes are needed during the designing of syntax analyzer. The tokens // and /* followed */ by denote comments in the source code Part 1: Scanner Pg. 03 Deliverables: Document • Lexical specifications: Identify the lexical specification you used in the design of the scanner, as well as any changes that you may have applied to original lexical specifications. • Finite state automaton: Finite state machine describing the operation of your scanner. • Error reporting and recovery: Identify all the possible lexical errors that the scanner can encounter. Identify and explain the error recovery technique that you implement. • Design: Give a brief overview of the overall structure of your solution, as well as a brief description of the role of each component of your implementation. • Implementation Tools: Identify all the tools/libraries/techniques that you have used in your analysis or implementation and justify why you have used these particular ones as opposed to others. Implementation 1. Scanner: 1.1. Develop a scanner that recognizes the above-mentioned tokens. 1.2. It should be a function that returns a data structure containing the information about the next token identified in the source program file. 1.3. The data structure should contain information such as (1) the token type (2) its value (or lexeme) when applicable and (3) its location in the source code. Fully describe this structure in your documentation. 2. Driver: 2.1. Include a driver that repeatedly calls the lexical analysis function and prints the token type of each token until the end of the source program is reached. 2.2. The scanner should optionally print the token stream to a file in the AtoCC format. 2.3. Another file (for verification purposes) should contain representative error messages each time an error is encountered in the input program. 2.4. The scanner should not stop after encountering an error. Error messages should be clear and should identify the location of the error in the source code. 3. Test cases: 3.1. Include many test cases that test a wide variety of valid and invalid cases. Selected examples: 0123 [Invalid number:0123] OR [integer:0][integer:123] 01.23 12.340 12345.6789e-123 12345 abc abc_1 _abc1 1abc [Invalid number:01.23] OR [integer:0][float:1.23] [Invalid number:12.340] OR [float:12.34][integer:0] [float:12345.6789e-123] [integer:12345] [id:abc] [id:abc_1] [Invalid identifier:_abc1] [Invalid identifier:1abc] OR [integer:1][id:abc] Part 2: Syntax Analysis Pg. 04 Part 2: Syntax Analysis This section is about the design and implement a syntax analyzer for the language specified by the grammar specified on the next page. Deliverables: Documentation 1. Transformed Grammar into LL(1): 1.1. Remove all the EBNF notations and replace them by right-recursive list-generating productions. 1.2. Analyze the syntactical definition. List in your documentation all the ambiguities and left recursions or any error you may have found in the grammar. 1.3. Modify the productions so that the left recursions and ambiguities are removed without modifying the language. 1.4. Include in your documentation the set of productions that can be parsed using the topdown predictive parsing method, i.e. an LL(1) grammar. 2. FIRST and FOLLOW sets: 2.1. Derive the FIRST and FOLLOW sets for each non-terminal in your transformed grammar. 3. Design: 3.1. Give a brief overview of the overall structure of your solution, as well as a brief description of the role of each component of your implementation. 4. Implementation Tools: 4.1. Identify all the tools/libraries/techniques that you have used in your analysis or implementation and justify why you have used these particular ones as opposed to others. Implementation • Parser: Implement a predictive parser (recursive descent or table-driven) for your modified set of grammar rules. The result of the parser should be the creation of a tree data structure representing the parse tree as identified by the parsing process. This tree will become the intermediate representation used by the two following assignments. • Derivation Output: Your parser should write to a file the derivation that proves that the source program can be derived from the starting symbol. • Error-Reporting: Parser should properly identify all errors in the input program and print a meaningful message to the user for each error encountered. The error messages should be informative on the nature of the errors, as well as the location of errors in the input file. • Error Recovery: The parser should implement an error recovery method that permits to report all errors present in the source code. • Test Cases: Write a set of source files that enable to test the parser for all syntactical structures involved in the language. Include cases testing for a variety of different errors to demonstrate the accuracy of your error reporting and recovery. • Note that the grammar analysis/transformation process can be greatly simplified by the use of tools such as the kfgEdit AtoCC tool. Part 2: Syntax Analysis Pg. 05 Grammar: The syntactical definition is using the following conventions: • Terminals (lexical elements, or tokens) are represented in single quotes ‘thisWay’. • Non-terminals are represented in italics, thisWay. • The empty phrase is represented by EPSILON. • EBNF-style repetition notation is represented using curly brackets {This way}. It represents zero or more occurrence of the sentential form enclosed in the brackets. • EBNF-style optionality notation is represented using square brackets [This way]. It represents zero or one occurrence of the sentential form enclosed in the brackets. • The non-terminal is the starting symbol of the grammar. Except from the EBNF constructs, the grammar is expressed using the syntax used by the kfgEdit AtoCC toolkit application. prog -> {classDecl} {funcDef} ‘main’ funcBody ‘;’ classDecl -> ‘class’ ‘id’ [‘:’ ‘id’ {‘,’ ‘id’}] ‘{‘ {varDecl} {funcDecl} ‘}’ ‘;’ funcDecl -> type ‘id’ ‘(‘ fParams ‘)’ ‘;’ funcHead -> type [‘id’ ‘sr’] ‘id’ ‘(‘ fParams ‘)’ funcDef -> funcHead funcBody ‘;’ funcBody -> ‘{‘ {varDecl} {statement} ‘}’ varDecl -> type ‘id’ {arraySize} ‘;’ statement -> assignStat ‘;’ | ‘if’ ‘(‘ expr ‘)’ ‘then’ statBlock ‘else’ statBlock ‘;’ | ‘for’ ‘(‘ type ‘id’ assignOp expr ‘;’ relExpr ‘;’ assignStat ‘)’ statBlock ‘;’ | ‘read’ ‘(‘ variable ‘)’ ‘;’ | ‘write’ ‘(‘ expr ‘)’ ‘;’ | ‘return’ ‘(‘ expr ‘)’ ‘;’ assignStat -> variable assignOp expr statBlock -> ‘{‘ {statement} ‘}’ | statement | EPSILON expr -> arithExpr | relExpr relExpr -> arithExpr relOp arithExpr arithExpr -> arithExpr addOp term | term sign -> ‘+’ | ‘-‘ term -> term multOp factor | factor factor -> variable | functionCall | ‘intNum’ | ‘floatNum’ | ‘(‘ arithExpr ‘)’ | ‘not’ factor | sign factor variable -> {idnest} ‘id’ {indice} functionCall -> {idnest} ‘id’ ‘(‘ aParams ‘)’ idnest -> ‘id’ {indice} ‘.’ | ‘id’ ‘(‘ aParams ‘)’ ‘.’ indice -> ‘[‘ arithExpr ‘]’ arraySize -> ‘[‘ ‘intNum’ ‘]’ type -> ‘integer’ | ‘float’ | ‘id’ fParams -> type ‘id’ {arraySize} {fParamsTail} | EPSILON aParams -> expr {aParamsTail} | EPSILON fParamsTail -> ‘,’ type ‘id’ {arraySize} aParamsTail -> ‘,’ expr assignOp -> ‘=’ relOp -> ‘eq’ | ‘neq’ | ‘lt’ | ‘gt’ | ‘leq’ | ‘geq’ addOp -> ‘+’ | ‘-‘ | ‘or’ multOp -> ‘*’ | ‘/’ | ‘and’ Part 2: Syntax Analysis Pg. 06 Tokens Keywords: Operators: main | class | if | then | else | for | read | write | return | integer | float eq (==) | neq () | lt () | leq (=) + ||* |/ not (!) | and (&&) | or (||) = sr (::) Punctuation: : | , | . | ; | [ | ] | { | } | ( | ) id: floatNum: intNum: follows specification for program identifiers found in Part 1 follows specification for float literals found in Part 1 follows specification for integer literals found in Part 1 Part 3: Semantic Analysis Pg. 07 Part 3: Semantic Analysis In this section, you have to implement a semantic analyzer for the language described in Part 2. This in turn implies implementation of two interrelated sub-phases: 1. Implementing semantic actions for the generation of symbol tables 2. Implementing semantic checking and type-checking semantic actions. Some of the characteristics of the language semantics and the tasks required for their implementation/documentation are as follows: • The symbol table helps resolving the typing and cross-referencing of the identifiers used in the program, taking into consideration the typing and scoping of the identifiers as expressed in the analyzed program. • The symbol table contains an entry for all identifiers (variables, functions, classes) defined in its own scope. There are scopes for each class definition, free or member function definition, and a global scope for the whole program. • A program includes a set of free functions including only one main function. Information about the free functions must be available in the symbol table before function calls can be semantically verified. In order to allow functions to be called before they are defined, you need to implement at least two passes: o Pass 1: Constructs the symbol tables o Pass 2: Uses the information generated in the first pass. • Classes represent the encapsulation of a user-defined data type and declarations for its member functions. The member functions are all defined before the free functions. As with free functions, two passes (as described above) have to be used to allow a class to refer to another class that is declared after it. As member function are only declared in the class declaration and defined later, two passes are necessary to link the function declaration’s symbol table entry with its corresponding local symbol table. • If a class has an inheritance list, the symbol table of its directly inherited class(es) should be linked in this class, so that inherited members are effectively considered as members of the class, even though they are part of a separate scope. Inherited members with the same name and kind (i.e. variable or function) as a class member should be shadowed by the class member. In any case, circular class dependencies must be reported a semantic error. • It is allowed to have two members with the same name in two different classes or functions, as they are not defined in the same scope. • There is no point in checking for indexes out of bounds, as indexes can be expressions, whose value cannot be determined at compile time. • Variables declared inside functions (local variables) or classes (data members) are considered local and thus can only be used in the current function or class scope. Data members can be used in all member functions of their respective class. This requires a nested symbol table structure: o The global symbol table, representing all the symbols defined in the global scope, exists until the end of the compilation process. o All local symbol tables represent sub-scopes and should be bound to their respective elements in the current symbol table. o A local symbol table is created at the beginning of the compilation of any function or class. Therefore, you have to associate with each variable and function identifier a symbol table record that contains its properties, and include it in the symbol table representing the scope in which this identifier is declared. You have to keep in mind that you might have to change the structure of these records later in the design of Part 3: Semantic Analysis Pg. 08 • code generation phase of the compiler. Make sure you can change the record structure with minimal changes to your symbol table manipulating functions. When using operators in expressions, attribute migration must be done to determine the type of sub-expressions. For simplicity of code generation later, it is suggested that it should be semantically invalid to have operands of arithmetic operators to be of different types. For the same reason, both operands of an assignment must be of the same type. Deliverables: Implementation 1. Symbol table creation phase: 1.1. Implement the data structures and functions for the symbol tables so that they can support the type checking semantic actions. 1.2. Using syntax-directed translation, implement AST traversal that activates semantic actions so that: 1.2.1.A new table is created at the beginning of the program for the global scope. 1.2.2.A new entry is created in the global table for each class declared in the program. These entries should contain links to local tables for these classes. 1.2.3.An entry in the appropriate table is created for each variable defined in the program, i.e. class data members or function’s local variables. 1.2.4.An entry in the appropriate table is created for each function definition (free functions and member functions). These entries should be links to local tables for these functions. 1.2.5.During symbol table creation, there are some semantic errors that are detected and reported, such as multiply declared identifiers in the same scope, as well warnings for shadowed inherited members. 1.2.6.All declared member functions should have a corresponding function definition. 1.2.7.The content of the symbol tables should be output into a file in order to demonstrate their correctness/completeness. 1.2.8.In functions, (including member functions in classes) variables should be defined before being used in the statements. If not, an “undeclared variable” error message is issued. 1.2.9.An identifier cannot be declared twice in the same scope. In such a case, a “multiply declared identifier” message is issued. 2. Semantic/type checking phase: 2.1. Using syntax-directed translation, implement AST tree traversal that triggers semantic actions so that the following semantic checks are done, which should result in error reporting if the check fails: 2.1.1. Type checking is applied on expressions, as well as on assignment and return statements. 2.1.2. Any id referred to is defined in the scope where it is used (failure results in the following error messages: undefined local variable, undefined free function, undefined member, undefined class) 2.1.3.Function calls are made with the right number and type of parameters. Expressions passed as parameters in a function call must be of the same type as declared in the function declaration. Part 3: Semantic Analysis Pg. 09 2.1.4. Referring to an array variable should be made using the same number of dimensions as declared in the variable declaration. Expressions used as an index must be of integer type. 2.1.5.Circular class dependencies (through data membersinheritance) are reported as semantic error. 2.1.6.The “.” operator should be used only on variables of a class type. If so, its right operand must be a member of that class. If not, a “undefined member” should be issued. 2.1.7. The type of the operand of a return statement must be the same as declared in its function’s return type, as declared in the function’s declaration. 3. Error reporting: 3.1. All semantic errors/warnings present in the entire program should be reported properly, mentioning the location in the program source file where the error was located. 3.2. Only existing errors should be reported. 3.3. Errors should be reported in synchronized order, even if different phases are implemented and errors are found in different phases. 4. Output of symbol tables data structure: After the symbol table structure has been completely created, the program should output it to a file, allowing to easily verify if the symbol table correctly corresponds to the input program. Documentation 1. Analysis: 1.1. Description of the semantic rules used in the implementation (see above). 2. Design: 2.1. Description of the purpose of each phase involved in the implementation. For each phase, mapping of semantic actions to AST nodes, along with a description of the effect/role of each semantic action. 2.2. Justification of the overall structure of the solution and the roles of the individual components used in the applied solution. 3. Implementation Tools: 3.1. Description of tools/libraries/techniques used in the analysis/implementation. 3.2. Successful use of tools/libraries/techniques used in the analysis/implementation. Example Symbol Table: Part 4: Code Generation Pg. 10 Part 4: Code Generation In this section, you have to implement a code generation phase for the language as described in sections 2 and 3. Below is a list of different specific constructs/concepts for which code generation needs to be implemented for all the aspects of the language to become executable: 1. Memory allocation: As variables are declared, memory space must be allocated that will eventually be used to store their values. Memory cells are either identified using a unique Moon assembly code label, or an offset relative to the stack frame on which the variable will reside during execution of the scope in which it is declared. The size of the allocated memory block should depend on the declared type/kind of variable (integer, float, array, object, array of objects, etc.) 1.1. Allocate memory for basic types (integer, float). 1.2. Allocate memory for arrays of basic types. 1.3. Allocate memory for objects. 1.4. Allocate memory for objects with inheritance. 1.5. Allocate memory for objects having object members. 1.6. Allocate memory for arrays of objects. 2. Functions: The code generated for functions should be associated with a Moon assembly code subroutine, i.e. a block of code that can be jumped to using a label, then when the function resolves, jump back to the instruction following the initial jump. Parameters should be passed/received during the function call. The return value should be passed to the calling function when the function resolves. In order to allow recursive function calls, a function call stack mechanism needs to be implemented, which implies that memory allocation for variables must all be done using offsets relative to the bottom of the current function’s stack frame. If a call to a member function is made, the member function should have access to the data members of the object from which it was called. 2.1. Branch to a function’s code block, execute the code, branch back to the calling function. 2.2. Branch to a function that has been branched upon. 2.3. Pass parameters as local values to the function’s code block. 2.4. Upon function resolution, pass the return value back to the calling function. 2.5. Function call stack mechanism and recursive function calls. 2.6. Call to member functions. 2.7. Call to deeply nested member function. 3. Statements: Implementation of Moon code for every kind of statement as defined in the grammar: assignment conditional statement, loop statement, input/output statement, return statement. Correct implementation the specific branching mechanisms for control flow statements. 3.1. Assignment statement: correct assignment of the resulting value of an expression to a variable, independently of what is the expression. 3.2. Conditional statement: correct implementation of branching mechanism, including for imbricated conditional statements. 3.3. Loop statement: correct implementation of branching mechanism, including for imbricated loop statements. 3.4. Input/output statement: execution of a read() statement should result in the user being prompted for a value from the keyboard by the Moon program, and assign this value to the parameter passed to the read() statement. Execution of a write() statement should print to the Moon console the result of evaluating the expression passed as a parameter to the write() statement. Part 4: Code Generation Pg. 11 3.5. Return statement: execution of the return statement should result in passing the value of the expression passed to the return statement back to the calling function. 4. Aggregate data members access: Aggregate data types such as arrays and objects contain a group of data values. Code must be generated so that contained member values in such an aggregated value can be accessed when referred to as factors in an expression, or the left-hand side of an assignment statement. This includes access to array elements and object members. 4.1. For arrays of basic types (integer and float), access to an array’s elements, single or multidimensional. 4.2. For arrays of objects, access to an array’s elements, single or multidimensional. 4.3. For objects, access to members of basic types. 4.4. For objects, access to members of array types, as well as the elements of the array. 4.5. For objects, access to members of object types, as well as the elements of this object. 4.6. For objects, access to deeply nested objects. 4.7 For objects, access to the members of a superclass. 5. Expressions: Computing of the resulting value of an entire expression, including the simple case when an expression is either a variable name of even a single literal value, up to complex expressions involving a kind of operators, array indexed with expressions, deeply nested object members, etc. This involves memory allocation for temporary results, register allocation/deallocation 5.1. Computing the value of an entire expression. 5.2. Expression is a single variable or literal value. 5.3. Expressions involving all of: arithmetic, relational and logic operators in one expression 5.4. Expression involving an array factor whose indexes are themselves expressions. 5.5. Expression involving an object factor referring to deeply nested object members. 5.6. Memory allocation for temporary results. 5.7. Register allocation/deallocation scheme Deliverables: Documentation 1. Analysis 1.1. Description of the register allocation/deallocation scheme used in the implementation. 1.2. Description of the memory usage scheme used in the implementation (tag-based, stack-based), including for temporary variables, function calls (free functions or member functions, data members, and calculation of variables’ allocated memory). 1.3. Description of the purpose of each phase involved in the implemented code generation. For each phase, mapping of semantic actions to AST nodes, along with a description of the effect/role of each semantic action. 2. Design 2.1. Description/rationale of the overall structure of the solution and the roles of the individual components used in the applied solution. 3. Implementation Tools 3.1. Description of tools/libraries/techniques used in the analysis/implementation. 3.2. Successful use of tools/libraries/techniques used in the analysis/implementation.
Collepals.com Plagiarism Free Papers
Are you looking for custom essay writing service or even dissertation writing services? Just request for our write my paper service, and we'll match you with the best essay writer in your subject! With an exceptional team of professional academic experts in a wide range of subjects, we can guarantee you an unrivaled quality of custom-written papers.
Get ZERO PLAGIARISM, HUMAN WRITTEN ESSAYS
Why Hire Collepals.com writers to do your paper?
Quality- We are experienced and have access to ample research materials.
We write plagiarism Free Content
Confidential- We never share or sell your personal information to third parties.
Support-Chat with us today! We are always waiting to answer all your questions.