Little Duck Compiler

Little Duck Compiler is a compiler for the Little Duck programming language. It uses ANTLR4 for lexical analysis and parsing, performs semantic analysis, generates intermediate code in the form of quadruples, and includes a virtual machine that executes the compiled code.

Overview

The Little Duck Compiler transforms Little Duck source code into intermediate quadruples, which are then executed by a virtual machine. This process involves several stages:

  1. Lexical Analysis: Tokenizing the input source code with the ANTLR4-generated lexer.
  2. Parsing: Building a parse tree with the ANTLR4-generated parser.
  3. Semantic Analysis: Walking the parse tree with a custom listener to check that the program is semantically correct.
  4. Intermediate Code Generation: Emitting, during the same walk, a list of quadruples that represents the program logic.
  5. Execution: Running the generated quadruples on a virtual machine.
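
The stages above all revolve around the quadruple, a four-field instruction. The following is a minimal sketch, assuming quadruples are plain 4-tuples of (operator, left operand, right operand, result) as the listings later in this README suggest. The QuadrupleList class and its emit and dump methods are hypothetical names chosen for illustration; they are not the API of the project's quadruple.py.

```python
from typing import List, Optional, Tuple

Operand = Optional[int]
Quadruple = Tuple[str, Operand, Operand, Operand]


class QuadrupleList:
    """Hypothetical append-only store for quadruples."""

    def __init__(self) -> None:
        self.quadruples: List[Quadruple] = []

    def emit(self, op: str, left: Operand = None,
             right: Operand = None, result: Operand = None) -> int:
        """Append a quadruple and return its index in the list."""
        self.quadruples.append((op, left, right, result))
        return len(self.quadruples) - 1

    def dump(self) -> str:
        """Format the list as 'index: quadruple', one per line."""
        return "\n".join(f"{i}: {q}" for i, q in enumerate(self.quadruples))


quads = QuadrupleList()
quads.emit("GOTO", None, None, 1)   # jump to the first quadruple of the main block
quads.emit("=", 10000, None, 1000)  # copy the value at address 10000 into address 1000
quads.emit("END")
print(quads.dump())
```

Returning the index from emit is a design choice that makes it easy to remember a jump quadruple and fill in its target later.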

Features

  • ANTLR4 Integration: Utilizes ANTLR4 for robust lexical analysis and parsing.
  • Semantic Analysis: Implements semantic checks using a semantic cube and symbol tables.
  • Quadruple Generation: Transforms high-level code into intermediate quadruples for execution.
  • Virtual Machine: Executes quadruples, managing memory and operations.
  • Testing: Includes pytest cases that validate quadruple generation and compiler correctness.
  • Function Support: Handles function declarations, parameter passing, and local memory management.
  • Control Structures: Supports conditionals (si, sino) and loops (mientras).
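
To illustrate the semantic cube mentioned above, here is a minimal sketch of a lookup table that maps (left type, operator, right type) to a result type. Only entero and the operators < and > appear in this README; the flotante type, the bool result type, the arithmetic operators, and the promotion rules are assumptions made for the example, and result_type is a hypothetical helper, not the interface of semantic_cube.py.

```python
from typing import Dict, Tuple

ARITHMETIC = ("+", "-", "*", "/")  # assumed operator set
RELATIONAL = ("<", ">")

# (left type, operator, right type) -> result type
SEMANTIC_CUBE: Dict[Tuple[str, str, str], str] = {}


def _register(left: str, right: str, operators, result: str) -> None:
    """Record the result type for every operator in `operators`."""
    for op in operators:
        SEMANTIC_CUBE[(left, op, right)] = result


# Assumed rules: entero with entero stays entero, any flotante promotes.
_register("entero", "entero", ARITHMETIC, "entero")
for left, right in [("entero", "flotante"),
                    ("flotante", "entero"),
                    ("flotante", "flotante")]:
    _register(left, right, ARITHMETIC, "flotante")

# Comparisons between numeric types yield a boolean.
for left in ("entero", "flotante"):
    for right in ("entero", "flotante"):
        _register(left, right, RELATIONAL, "bool")


def result_type(left: str, op: str, right: str) -> str:
    """Return the result type, or raise TypeError for an invalid combination."""
    try:
        return SEMANTIC_CUBE[(left, op, right)]
    except KeyError:
        raise TypeError(f"type mismatch: {left} {op} {right}") from None


print(result_type("entero", "+", "flotante"))  # flotante
print(result_type("entero", "<", "entero"))    # bool
```

Any combination missing from the table is treated as a type error, so adding a type means registering only its valid combinations.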

Architecture

The compiler is organized into several modules, each responsible for a specific aspect of the compilation process:

  • Lexer and Parser: Generated by ANTLR4 based on the Little Duck grammar.
  • Semantic Modules:
    • semantic_cube.py: Defines type operations and compatibility.
    • variable_table.py: Manages variable declarations and scopes.
    • function_table.py: Handles function declarations, parameters, and scopes.
    • stack.py: Implements operand and type stacks for expression evaluation.
    • quadruple.py: Manages quadruple generation and storage.
  • Custom Listener: LittleDuckCustomListener traverses the parse tree to perform semantic actions and generate quadruples.
  • Custom Error Listener: LittleDuckErrorListener handles syntax and semantic errors during parsing.
  • Virtual Machine: Executes the generated quadruples, managing memory and control flow.
  • Driver Script: Driver.py serves as the entry point for compiling and executing Little Duck programs.
  • Tests: pytest cases that verify quadruple generation and compiler functionality.
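
The role of the operand stack during expression evaluation can be sketched as follows. The example assumes the operands and operators arrive in postfix order, which is the order in which a listener's exit callbacks would typically fire for an expression tree. It uses symbolic names (a, b, t0) instead of virtual addresses for readability, and postfix_to_quadruples is a hypothetical function, not part of stack.py or the custom listener.

```python
from typing import List, Tuple

OPERATORS = {"+", "-", "*", "/"}  # assumed operator set


def postfix_to_quadruples(tokens: List[str]) -> Tuple[List[tuple], str]:
    """Turn a postfix token list into quadruples using an operand stack.

    Returns the quadruples and the name that holds the final value.
    """
    operands: List[str] = []
    quads: List[tuple] = []
    temp_count = 0
    for token in tokens:
        if token in OPERATORS:
            # The right operand was pushed last, so it is popped first.
            right = operands.pop()
            left = operands.pop()
            temp = f"t{temp_count}"
            temp_count += 1
            quads.append((token, left, right, temp))
            operands.append(temp)  # the temporary becomes an operand
        else:
            operands.append(token)
    return quads, operands.pop()


# a + b * c in postfix order is: a b c * +
quads, result = postfix_to_quadruples(["a", "b", "c", "*", "+"])
for index, quad in enumerate(quads):
    print(f"{index}: {quad}")
# 0: ('*', 'b', 'c', 't0')
# 1: ('+', 'a', 't0', 't1')
```

A parallel type stack, consulted together with the semantic cube, would let each pop also validate the operand types before the quadruple is emitted.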

Installation

Prerequisites

  • Python 3.8+
  • ANTLR4: Ensure ANTLR4 is installed and the runtime is available in your environment.

Steps

  1. Clone the Repository

    git clone https://github.com/davidmartinezhi/little-duck-language.git
    cd little-duck-language
  2. Create a Virtual Environment

    It's recommended to use a virtual environment to manage dependencies.

    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install Dependencies

    pip install -r requirements.txt
  4. Generate ANTLR4 Artifacts

    With the antlr4 command available on your PATH, generate the Python lexer and parser into the generated/ directory:

    antlr4 -Dlanguage=Python3 little_duck.g4 -o generated

    Replace little_duck.g4 with the path to your ANTLR4 grammar file if it's located elsewhere.

Usage

Running a Little Duck Program from a Text File

To execute a Little Duck program by providing it with a .txt or .duck file, follow these steps:

  1. Prepare Your Little Duck Source File

    Create a .duck or .txt file containing your Little Duck code. For example, create a file named program.duck with the following content:

    programa test_program;
    vars {
        a, b : entero;
    }
    inicio {
        a = 10;
        b = 20;
        si (a < b) haz {
            escribe("a is less than b");
        } sino haz {
            escribe("a is not less than b");
        }
    }
    fin
    
  2. Use the Driver Script to Compile and Execute

    The Driver.py script serves as the main entry point for compiling and running Little Duck programs. It handles parsing, semantic analysis, quadruple generation, and execution via the virtual machine.

    Driver Script Overview:

    """
    This script is a driver for parsing the Little Duck language using ANTLR4.
    It takes an input file as a command-line argument, tokenizes it with a lexer,
    parses it using the Little Duck parser, and performs semantic analysis using a custom listener.
    After parsing, it retrieves the generated quadruples and virtual memory,
    and runs the virtual machine to execute the program.
    """
    
    import sys
    from antlr4 import *
    from generated.little_duckLexer import little_duckLexer  # Generated lexer for Little Duck language
    from generated.little_duckParser import little_duckParser  # Generated parser for Little Duck language
    from src.custom_listener import LittleDuckCustomListener  # Custom listener for semantic analysis
    from src.custom_error_listener import LittleDuckErrorListener  # Custom error listener
    from semantics.virtual_machine import VirtualMachine  # Virtual Machine class
    
    
    def main(argv):
        if len(argv) != 2:
            print("Usage: python3 Driver.py <input_file>")
            return
    
        # Load the input file provided as a command-line argument and create an input stream
        input_file = argv[1]
        input_stream = FileStream(input_file)
    
        # Initialize the lexer with the input stream (breaks the input into tokens)
        lexer = little_duckLexer(input_stream)
    
        # Create a stream of tokens from the lexer output
        stream = CommonTokenStream(lexer)
    
        # Initialize the parser with the token stream (uses the tokens to create a parse tree)
        parser = little_duckParser(stream)
    
        # Remove default error listeners
        parser.removeErrorListeners()
    
        # Add the custom error listener
        error_listener = LittleDuckErrorListener()
        parser.addErrorListener(error_listener)
    
        # Start parsing the input according to the grammar rule 'programa' (the entry point of the grammar)
        tree = parser.programa()
    
        # Initialize the custom listener with traversal printing disabled
        listener = LittleDuckCustomListener(print_traversal=False)
    
        # Walk the parse tree with the custom listener to perform semantic actions
        walker = ParseTreeWalker()
        walker.walk(listener, tree)
    
        # Retrieve the quadruples, virtual memory, and function table
        quadruples = listener.quadruple_manager.quadruples
        virtual_memory = listener.virtual_memory
        function_table = listener.function_table
    
        # Initialize and run the virtual machine
        vm = VirtualMachine(quadruples, virtual_memory, function_table, print_traversal=False)
        vm.run()
    
    
    if __name__ == '__main__':
        main(sys.argv)
  3. Execute the Driver Script

    Run the Driver.py script with your source file as an argument.

    python3 Driver.py program.duck

    Expected Output:

    Generated Quadruples:
    0: ('GOTO', None, None, 1)
    1: ('=', 10000, None, 1000)          # a = 10
    2: ('=', 10001, None, 1001)          # b = 20
    3: ('<', 1000, 1001, 9000)           # t0 = a < b
    4: ('GOTOF', 9000, None, 8)          # if not t0, jump to quad 8 (sino block)
    5: ('print_str', 12000, None, None)  # print "a is less than b"
    6: ('print_newline', None, None, None)
    7: ('GOTO', None, None, 10)          # skip the sino block, jump to quad 10
    8: ('print_str', 12001, None, None)  # print "a is not less than b"
    9: ('print_newline', None, None, None)
    10: ('END', None, None, None)         # End of program
    
    a is less than b
    

    Explanation:

    • The compiler generates quadruples based on the source code.
    • The virtual machine executes the quadruples, resulting in the printed message "a is less than b".
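
Jump targets for si / sino are not known when the jump quadruple is first emitted, so compilers of this kind usually fill them in afterwards (backpatching). The sketch below shows that standard technique with a jump stack; it is an assumption about how such targets can be resolved, not a description of the project's listener. The JumpEmitter class and its method names are hypothetical. Under this scheme the GOTOF lands on the first quadruple of the sino block and the GOTO lands just past that block.

```python
class JumpEmitter:
    """Hypothetical quadruple emitter with backpatching for si / sino."""

    def __init__(self):
        self.quads = []       # lists, so the result field can be patched
        self.jump_stack = []  # indices of jumps whose target is still unknown

    def emit(self, op, left=None, right=None, result=None):
        self.quads.append([op, left, right, result])
        return len(self.quads) - 1

    def begin_if(self, condition_address):
        # Target unknown until the end of the si block.
        self.jump_stack.append(self.emit("GOTOF", condition_address))

    def begin_else(self):
        # The si block ends with a GOTO that skips the sino block.
        goto_index = self.emit("GOTO")
        gotof_index = self.jump_stack.pop()
        self.quads[gotof_index][3] = len(self.quads)  # first sino quadruple
        self.jump_stack.append(goto_index)

    def end_if(self):
        pending = self.jump_stack.pop()
        self.quads[pending][3] = len(self.quads)  # first quadruple after the statement


e = JumpEmitter()
e.emit("GOTO", None, None, 1)
e.emit("=", 10000, None, 1000)   # a = 10
e.emit("=", 10001, None, 1001)   # b = 20
e.emit("<", 1000, 1001, 9000)    # t0 = a < b
e.begin_if(9000)
e.emit("print_str", 12000)
e.emit("print_newline")
e.begin_else()
e.emit("print_str", 12001)
e.emit("print_newline")
e.end_if()
e.emit("END")

print(e.quads[4])  # ['GOTOF', 9000, None, 8]
print(e.quads[7])  # ['GOTO', None, None, 10]
```

Using a stack rather than a single variable means nested si statements resolve correctly, because the innermost pending jump is always patched first.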

Running Tests

The project includes a suite of pytest cases that validate quadruple generation and compiler functionality.

  1. Navigate to the Tests Directory

    cd tests
  2. Run Pytest

    pytest test_quadruples_generation.py

    This will execute all the test cases and report any failures or successes.

Project Structure

little-duck-language/
│
├── generated/                      # ANTLR4-generated lexer and parser
│   ├── little_duckLexer.py
│   ├── little_duckParser.py
│   └── little_duckListener.py
│
├── semantics/                      # Semantic analysis modules
│   ├── semantic_cube.py
│   ├── variable_table.py
│   ├── function_table.py
│   ├── stack.py
│   ├── quadruple.py
│   └── virtual_machine.py          # Virtual Machine implementation
│
├── src/                            # Source code
│   ├── custom_listener.py          # Custom ANTLR4 listener for semantic actions
│   └── custom_error_listener.py    # Custom error listener
│
├── tests/                          # Pytest test cases
│   └── test_quadruples_generation.py
│
├── Driver.py                       # Driver script for compiling and executing programs
├── little_duck.g4                  # ANTLR4 grammar file
├── requirements.txt                # Python dependencies
└── README.md                       # Project documentation

Examples

Simple Assignment

Little Duck Code:

programa test_simple_assignment;
vars {
    a : entero;
}
inicio {
    a = 5;
}
fin

Expected Quadruples:

0: ('GOTO', None, None, 1)
1: ('=', 10000, None, 1000)  # a = 5
2: ('END', None, None, None)

Explanation:

  1. GOTO: Jumps to the inicio block.
  2. Assignment: Assigns the constant 5 (address 10000) to variable a (address 1000).
  3. END: Marks the end of the program.
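
The addresses in the quadruples come from separate memory segments. The sketch below assumes the segment base addresses that can be read off the listings in this README (1000 for entero variables, 9000 for boolean temporaries, 10000 for entero constants, 12000 for string constants). The segment names, the AddressAllocator class, and the reuse of one address per distinct constant are assumptions for illustration; the project's actual memory layout may differ.

```python
class AddressAllocator:
    """Hypothetical virtual-address allocator with one counter per segment."""

    # Base addresses inferred from the example listings (assumption).
    BASES = {
        "var_entero": 1000,
        "temp_bool": 9000,
        "const_entero": 10000,
        "const_str": 12000,
    }

    def __init__(self):
        self._next = dict(self.BASES)  # next free address per segment
        self._constants = {}           # (segment, value) -> address
        self.values = {}               # address -> constant value

    def allocate(self, segment):
        """Reserve and return the next free address in a segment."""
        address = self._next[segment]
        self._next[segment] += 1
        return address

    def constant(self, segment, value):
        """Return the address of a constant, allocating it on first use."""
        key = (segment, value)
        if key not in self._constants:
            address = self.allocate(segment)
            self._constants[key] = address
            self.values[address] = value
        return self._constants[key]


alloc = AddressAllocator()
a = alloc.allocate("var_entero")           # variable a
five = alloc.constant("const_entero", 5)   # constant 5
again = alloc.constant("const_entero", 5)  # same constant, same address
print(("=", five, None, a))  # ('=', 10000, None, 1000)
```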

Conditional Statement

Little Duck Code:

programa test_conditional_statement;
vars {
    a, b : entero;
}
inicio {
    a = 5;
    b = 3;
    si (a > b) haz {
        escribe("a is greater");
    }
}
fin

Expected Quadruples:

0: ('GOTO', None, None, 1)
1: ('=', 10000, None, 1000)          # a = 5
2: ('=', 10001, None, 1001)          # b = 3
3: ('>', 1000, 1001, 9000)           # t0 = a > b
4: ('GOTOF', 9000, None, 7)          # if not t0, jump to quad 7 (END)
5: ('print_str', 12000, None, None)  # print "a is greater"
6: ('print_newline', None, None, None)
7: ('END', None, None, None)

Explanation:

  1. GOTO: Jumps to the inicio block.
  2. Assignments: Assigns 5 to a and 3 to b.
  3. Comparison: Checks if a > b and stores the result in temporary address 9000.
  4. GOTOF: If the condition is false, jumps to quad 7 (END), skipping the print quadruples.
  5. Print Statement: Prints "a is greater".
  6. Newline: Adds a newline after the print.
  7. END: Marks the end of the program.
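
To make the execution step concrete, here is a minimal sketch of an interpreter loop that runs the quadruples listed above. It assumes memory is a plain dictionary from address to value, preloaded with the constants, and it supports only the operators that appear in this README. The run function is hypothetical and much simpler than the project's VirtualMachine class, which also handles functions and local memory.

```python
import operator

BINARY_OPS = {">": operator.gt, "<": operator.lt}


def run(quadruples, memory):
    """Execute quadruples against `memory` and return the printed text."""
    output = []
    ip = 0  # instruction pointer: index of the quadruple to execute next
    while True:
        op, left, right, result = quadruples[ip]
        if op == "END":
            break
        if op == "GOTO":
            ip = result
            continue
        if op == "GOTOF":
            if not memory[left]:
                ip = result
                continue
        elif op == "=":
            memory[result] = memory[left]
        elif op in BINARY_OPS:
            memory[result] = BINARY_OPS[op](memory[left], memory[right])
        elif op == "print_str":
            output.append(memory[left])
        elif op == "print_newline":
            output.append("\n")
        else:
            raise ValueError(f"unknown operator: {op}")
        ip += 1
    return "".join(output)


QUADS = [
    ("GOTO", None, None, 1),
    ("=", 10000, None, 1000),
    ("=", 10001, None, 1001),
    (">", 1000, 1001, 9000),
    ("GOTOF", 9000, None, 7),
    ("print_str", 12000, None, None),
    ("print_newline", None, None, None),
    ("END", None, None, None),
]

# Constants preloaded at their addresses: 5, 3, and the string literal.
constants = {10000: 5, 10001: 3, 12000: "a is greater"}
text = run(QUADS, dict(constants))
print(text, end="")  # a is greater
```

Swapping the two constants makes the comparison false, so the GOTOF jumps straight to END and nothing is printed.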

Contributing

Contributions are welcome! Please follow these steps to contribute:

  1. Fork the Repository

  2. Create a Feature Branch

    git checkout -b feature/YourFeatureName
  3. Commit Your Changes

    git commit -m "Add your feature"
  4. Push to the Branch

    git push origin feature/YourFeatureName
  5. Open a Pull Request

    Describe your changes and submit the pull request for review.

License

This project is licensed under the MIT License.

Acknowledgements

  • ANTLR4 for providing powerful tools for language recognition.
  • Pytest for facilitating robust testing.
  • Contributors and the open-source community for their invaluable support and resources.

Feel free to reach out or open issues if you encounter any problems or have suggestions for improvements!
