JoaoESmoreira
JUC Compiler
The main goal of the project is to build a working compiler capable of performing lexical, syntactic, and semantic analysis for the JUC language. The grammar was provided in EBNF notation and required disambiguation to avoid Shift-Reduce and Reduce-Reduce conflicts, allowing smooth integration with Yacc for bottom-up parsing.
The full repository and documentation can be found here: JUC-Project Repository.
Tools and Technologies
- C Language — Core implementation language
- Lex — Lexical analysis tool
- Yacc — Syntactic analysis (parser generator)
- LLVM - High-level assembly language
- Linux/Unix Environment — Compilation and execution
Repository Structure
- src/: source code implementing Lex, Yacc, and compiler logic
- docs/: project documentation (including this report)
- Makefile: build and run automation
- README.org: project overview and build instructions
Grammar Design
The grammar was rewritten to eliminate ambiguities by defining operator precedence and associativity using directives:
%left,%right, and%nonassoc- Precedence is defined from lower to higher levels
To correctly handle optional if-then-else statements, a special NO
ELSE non-associative operator was introduced with the %prec
directive to control precedence when else is omitted.
Data Structures
Token
A token_t structure was created to store the lexical information
produced by Lex, containing:
value(token value)lineandcol(token position in the source file)
Abstract Syntax Tree (AST)
The AST represents the hierarchical structure of the source program.
Each node is represented by an ast_node_t structure containing:
id: node name (e.g., "MethodDecl", "FieldDecl")value: token value (yytextfrom lexical analysis)type: node type (set during semantic analysis)line,col: position in source codechild,brother: pointers to the first child and next sibling nodes
Symbol Table
The symbol table stores information about identifiers, parameters, and methods. It uses two main structures:
param_list: linked list for method parameterssymbol_table: stores symbols and nested scopes
Each function symbol points to another symbol table representing its local scope.
Algorithms
The algorithms used involve recursive traversal of trees and linked lists. During AST traversal:
- Global variables and functions are identified through sibling traversal
- Function bodies are analyzed by exploring children before siblings
- Information is propagated upward during recursion to annotate the AST and update the symbol table
Footer
Copyright © 2025 Joao ES Moreira
The contents of this website are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License (CC-BY-ND 4.0).
The source code of this website is licensed under the MIT license, and available in GitHub repositor. User-submitted contributions to the site are welcome, as long as the contributor agrees to license their submission with the CC-BY-ND 4.0 license.