Commit ed51ca2

Dockerfile, Leaderboard generation, Minor fixes
1 parent 5b4e22e commit ed51ca2

File tree

6 files changed: +262, -98 lines


Dockerfile

Lines changed: 12 additions & 0 deletions
```diff
@@ -0,0 +1,12 @@
+FROM python:3.10-slim
+
+WORKDIR /app
+
+COPY . /app
+
+RUN pip install --trusted-host pypi.python.org -r requirements.txt
+
+WORKDIR /app/src
+
+# Run main_runner.py when the container launches
+ENTRYPOINT ["python", "main_runner.py"]
```

README.md

Lines changed: 107 additions & 41 deletions
````diff
@@ -1,83 +1,149 @@
 # TypeEvalPy: A Micro-benchmarking Framework for Python Type Inference Tools
 
 <p align="center">
+  <br>
   <img src="TypeEvalPy.jpg" width="60%" align="center">
+  <br>
 </p>
 
-
 ## 📌 **Key Features**:
+
 - 📜 Contains **154 code snippets** to test and benchmark.
 - 🏷 Offers **845 type annotations** across a diverse set of Python functionalities.
 - 📂 Organized into **18 distinct categories** targeting various Python features.
 - 🚢 Seamlessly manages the execution of **containerized tools**.
 - 🔄 Efficiently transforms inferred types into a **standardized format**.
-- 📊 Produces **meaningful metrics** for in-depth assessment and comparison.
+- 📊 Automatically produces **meaningful metrics** for in-depth assessment and comparison.
+
+---
+
+## 🛠️ Supported Tools
+
+| Supported :white_check_mark: | In-progress :wrench: | Planned :bulb: |
+| --- | --- | --- |
+| [HeaderGen](https://github.com/ashwinprasadme/headergen) | [Intellij PSI](https://plugins.jetbrains.com/docs/intellij/psi.html) | [Llama 2](https://ai.meta.com/llama/) |
+| [Jedi](https://github.com/davidhalter/jedi) | [Pyre](https://github.com/facebook/pyre-check) | [ChatGPT](https://openai.com/blog/chatgpt) |
+| [Pyright](https://github.com/microsoft/pyright) | [PySonar2](https://github.com/yinwang0/pysonar2) | |
+| [HiTyper](https://github.com/JohnnyPeng18/HiTyper) | [Pytype](https://github.com/google/pytype) | |
+| [Scalpel](https://github.com/SMAT-Lab/Scalpel/issues) | [TypeT5](https://github.com/utopia-group/TypeT5) | |
+| [Type4Py](https://github.com/saltudelft/type4py) | | |
 
 ---
 
 ## 🏆 TypeEvalPy Leaderboard
 
-Below is a comparison showcasing exact matches across different tools, coupled with `top_n` predictions for our ML-based tools.
+Below is a comparison showcasing exact matches across different tools, coupled with `top_n` predictions for ML-based tools.
 
-| 🛠️ Tool | Top-n | Function Return Type | Function Parameter Type | Local Variable Type | Total |
-|----|----|----|----|----|----|
-| **[HeaderGen](https://github.com/ashwinprasadme/headergen)** | 1 | 186 | 56 | 321 | 563 |
-| **[Jedi](https://github.com/davidhalter/jedi)** | 1 | 122 | 0 | 293 | 415 |
-| **[Pyright](https://github.com/microsoft/pyright)** | 1 | 100 | 8 | 297 | 405 |
-| **[HiTyper](https://github.com/JohnnyPeng18/HiTyper)** | 1<br>3<br>5 | 163<br>173<br>175 | 27<br>37<br>37 | 179<br>225<br>229 | 369<br>435<br>441 |
-| **[HiTyper (static)](https://github.com/JohnnyPeng18/HiTyper)** | 1 | 141 | 7 | 102 | 250 |
-| **[Scalpel](https://github.com/SMAT-Lab/Scalpel/issues)** | 1 | 155 | 32 | 6 | 193 |
-| **[Type4Py](https://github.com/saltudelft/type4py)** | 1<br>3<br>5 | 39<br>103<br>109 | 19<br>31<br>31 | 99<br>167<br>174 | 157<br>301<br>314 |
+| Rank | 🛠️ Tool | Top-n | Function Return Type | Function Parameter Type | Local Variable Type | Total |
+|----|----|----|----|----|----|----|
+| 1 | **[HeaderGen](https://github.com/ashwinprasadme/headergen)** | 1 | 186 | 56 | 322 | 564 |
+| 2 | **[Jedi](https://github.com/davidhalter/jedi)** | 1 | 122 | 0 | 293 | 415 |
+| 3 | **[Pyright](https://github.com/microsoft/pyright)** | 1 | 100 | 8 | 297 | 405 |
+| 4 | **[HiTyper](https://github.com/JohnnyPeng18/HiTyper)** | 1<br>3<br>5 | 163<br>173<br>175 | 27<br>37<br>37 | 179<br>225<br>229 | 369<br>435<br>441 |
+| 5 | **[HiTyper (static)](https://github.com/JohnnyPeng18/HiTyper)** | 1 | 141 | 7 | 102 | 250 |
+| 6 | **[Scalpel](https://github.com/SMAT-Lab/Scalpel/issues)** | 1 | 155 | 32 | 6 | 193 |
+| 7 | **[Type4Py](https://github.com/saltudelft/type4py)** | 1<br>3<br>5 | 39<br>103<br>109 | 19<br>31<br>31 | 99<br>167<br>174 | 157<br>301<br>314 |
+
+*<sub>(Auto-generated based on the analysis run on 20-10-23 14:51)</sub>*
 
 ---
+## :whale: Running with Docker
+
+### 1️⃣ Clone the repo
+
+```bash
+git clone https://github.com/ashwinprasadme/TypeEvalPy.git
+```
+
+### 2️⃣ Build Docker image
+
+```bash
+docker build -t typeevalpy .
+```
+
+### 3️⃣ Run TypeEvalPy
 
-## 📥 Installation
+🕒 Takes about 30mins on first run to build Docker containers.
 
-1. **Install Dependencies and Set Up Virtual Environment**
+📂 Results will be generated in the `results` folder within the root directory of the repository.
+Each results folder will have a timestamp, allowing you to easily track and compare different runs.
 
-   Run the following commands to set up your virtual environment and activate the virtual environment.
+```bash
+docker run \
+    -v /var/run/docker.sock:/var/run/docker.sock \
+    -v ./results:/app/results \
+    typeevalpy
+```
 
-   ```bash
-   python3 -m venv .env
-   ```
+🔧 **Optionally**, run analysis on specific tools:
 
-   ```bash
-   source .env/bin/activate
-   ```
+```bash
+docker run \
+    -v /var/run/docker.sock:/var/run/docker.sock \
+    -v ./results:/app/results \
+    typeevalpy --runners headergen scalpel
+```
 
-   ```bash
-   pip install -r requirements.txt
-   ```
+🛠️ Available options: `headergen`, `pyright`, `scalpel`, `jedi`, `hityper`, `type4py`, `hityperdl`
 
 ---
 
-## 🚀 Usage: Running the Analysis
+<details>
+<summary><b>Running From Source...</b></summary>
+
+## 1. 📥 Installation
+
+1. **Clone the repo**
+
+   ```bash
+   git clone https://github.com/ashwinprasadme/TypeEvalPy.git
+   ```
+
+
+2. **Install Dependencies and Set Up Virtual Environment**
+
+   Run the following commands to set up your virtual environment and activate the virtual environment.
+
+   ```bash
+   python3 -m venv .env
+   ```
+
+   ```bash
+   source .env/bin/activate
+   ```
+
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+---
+
+## 2. 🚀 Usage: Running the Analysis
 
-1. **Navigate to the `src` Directory**
+1. **Navigate to the `src` Directory**
 
-```bash
-cd src
-```
+   ```bash
+   cd src
+   ```
 
-2. **Execute the Analyzer**
+2. **Execute the Analyzer**
 
-Run the following command to start the benchmarking process on all tools:
+   Run the following command to start the benchmarking process on all tools:
 
-```bash
-python main_runner.py
-```
+   ```bash
+   python main_runner.py
+   ```
 
-or
+   or
 
-Run analysis on specific tools
+   Run analysis on specific tools
 
-```
-python main_runner.py --runners headergen hityperdl
-```
+   ```
+   python main_runner.py --runners headergen scalpel
+   ```
 
-Available options: headergen, pyright, scalpel, jedi, hityper, type4py, hityperdl
+</details>
 
-The results will be generated in the `.results` folder within the root directory of the repository. Each results folder will have a timestamp, allowing you to easily track and compare different runs.
 
 ---
 
````
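The `--runners headergen scalpel` invocation in the README passes a space-separated subset of tool names. Since `main_runner.py` imports `ArgumentParser`, a flag like this is typically declared with `nargs="+"` plus `choices`; below is a minimal sketch of such a parser, assuming the runner defaults to all tools when the flag is omitted (the actual `get_args` body is not shown in this diff, so names here are illustrative):

```python
from argparse import ArgumentParser

AVAILABLE_RUNNERS = [
    "headergen", "pyright", "scalpel", "jedi",
    "hityper", "type4py", "hityperdl",
]

def get_args(argv=None):
    parser = ArgumentParser(description="TypeEvalPy benchmark runner (sketch)")
    parser.add_argument(
        "--runners",
        nargs="+",                  # one or more tool names after the flag
        choices=AVAILABLE_RUNNERS,  # reject unknown tool names early
        default=AVAILABLE_RUNNERS,  # omitting the flag runs every tool
        help="subset of tools to benchmark",
    )
    return parser.parse_args(argv)

# Mirrors: docker run ... typeevalpy --runners headergen scalpel
args = get_args(["--runners", "headergen", "scalpel"])
print(args.runners)  # ['headergen', 'scalpel']
```

With `choices`, a typo such as `--runners headrgen` fails fast with a usage error instead of silently running nothing.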

src/main_analyze_results.py

Lines changed: 15 additions & 15 deletions
```diff
@@ -969,27 +969,27 @@ def run_results_analyzer():
     results_dir = results_dir / "analysis_results"
     # Move logs
     # TODO: Improve generation of logs files
-    os.rename("results_analysis.log", f"{str(results_dir)}/results_analysis.log")
-    os.rename(
+    shutil.move("results_analysis.log", f"{str(results_dir)}/results_analysis.log")
+    shutil.move(
         "results_analysis_info.log", f"{str(results_dir)}/results_analysis_info.log"
     )
-    os.rename(
+    shutil.move(
         "tools_error_result_data.csv",
         f"{str(results_dir)}/tools_error_result_data.csv",
     )
-    os.rename(
+    shutil.move(
         "tools_sound_complete_data.csv",
         f"{str(results_dir)}/tools_sound_complete_data.csv",
    )
-    os.rename(
+    shutil.move(
         "tools_exact_match_data.csv",
         f"{str(results_dir)}/tools_exact_match_data.csv",
     )
-    os.rename(
+    shutil.move(
         "tools_exact_match_category_data.csv",
         f"{str(results_dir)}/tools_exact_match_category_data.csv",
     )
-    os.rename(
+    shutil.move(
         "tools_sensitivities_data.csv",
         f"{str(results_dir)}/tools_sensitivities_data.csv",
     )
@@ -998,23 +998,23 @@ def run_results_analyzer():
     os.makedirs(results_dir / "missing", exist_ok=True)
     os.makedirs(results_dir / "paper_tables", exist_ok=True)
 
-    os.rename(
+    shutil.move(
         "paper_table_1.csv",
         f"{str(results_dir)}/paper_tables/paper_table_1.csv",
     )
-    os.rename(
+    shutil.move(
         "paper_table_2.csv",
         f"{str(results_dir)}/paper_tables/paper_table_2.csv",
     )
-    os.rename(
+    shutil.move(
         "paper_table_3.csv",
         f"{str(results_dir)}/paper_tables/paper_table_3.csv",
     )
     shutil.copy(
         f"{str(results_dir)}/tools_sound_complete_data.csv",
         f"{str(results_dir)}/paper_tables/paper_table_4.csv",
     )
-    os.rename(
+    shutil.move(
         "paper_table_5.csv",
         f"{str(results_dir)}/paper_tables/paper_table_5.csv",
     )
@@ -1023,20 +1023,20 @@ def run_results_analyzer():
         f"{str(results_dir)}/paper_tables/paper_table_6.csv",
     )
     for tool in list(tools_results.keys()):
-        os.rename(
+        shutil.move(
             f"{tool}_mismatches_reasons.csv",
             f"{str(results_dir)}/mismatches/{tool}_mismatches_reasons.csv",
         )
-        os.rename(
+        shutil.move(
             f"{tool}_not_found_reasons.csv",
             f"{str(results_dir)}/missing/{tool}_not_found_reasons.csv",
         )
 
         if tool in utils.ML_TOOLS:
-            os.rename(
+            shutil.move(
                 f"top_n_table_{tool}.csv", f"{str(results_dir)}/top_n_table_{tool}.csv"
             )
-            os.rename(
+            shutil.move(
                 f"top_n_exact_match_table_{tool}.csv",
                 f"{str(results_dir)}/top_n_exact_match_table_{tool}.csv",
             )
```
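The blanket `os.rename` → `shutil.move` swap above is more than cosmetic: `os.rename` raises `OSError` (errno `EXDEV`) when source and destination sit on different filesystems, exactly the situation a bind-mounted `results` volume creates inside the new Docker setup, whereas `shutil.move` falls back to copy-then-delete. A minimal sketch of the pattern (`move_log` is a hypothetical helper, not part of the commit):

```python
import shutil
import tempfile
from pathlib import Path

def move_log(src: str, dest_dir: str) -> str:
    """Move src into dest_dir and return the new path.

    shutil.move behaves like os.rename when both paths share a
    filesystem, but copies and deletes across mounts, where
    os.rename would raise OSError (EXDEV).
    """
    dest = Path(dest_dir) / Path(src).name
    shutil.move(src, str(dest))
    return str(dest)

# Usage: relocate a freshly written log into a results directory.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "results_analysis.log"
    log.write_text("ok\n")
    results = Path(tmp) / "analysis_results"
    results.mkdir()
    moved = move_log(str(log), str(results))
    assert not log.exists() and Path(moved).read_text() == "ok\n"
```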

src/main_runner.py

Lines changed: 5 additions & 2 deletions
```diff
@@ -1,5 +1,6 @@
 import logging
 import os
+import shutil
 import tarfile
 import time
 from argparse import ArgumentParser
@@ -299,7 +300,9 @@ def get_args():
 
 def main():
     args = get_args()
-    host_results_path = f"../results/results_{datetime.now().strftime('%d-%m %H:%M')}"
+    host_results_path = (
+        f"../results/results_{datetime.now().strftime('%d-%m-%y %H:%M')}"
+    )
 
     available_runners = {
         "headergen": (
@@ -345,7 +348,7 @@ def main():
 
     run_results_analyzer()
 
-    os.rename("main_runner.log", f"{str(host_results_path)}/main_runner.log")
+    shutil.move("main_runner.log", f"{str(host_results_path)}/main_runner.log")
 
 
 if __name__ == "__main__":
```
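The widened timestamp format (`'%d-%m %H:%M'` → `'%d-%m-%y %H:%M'`) adds a two-digit year, so results folders from different years no longer collide, and it matches the `20-10-23 14:51` stamp on the regenerated leaderboard. A small sketch of the naming (`results_path` is an illustrative helper, not from the commit):

```python
from datetime import datetime

def results_path(now: datetime, base: str = "../results") -> str:
    # New format: day-month-year hour:minute, e.g. 20-10-23 14:51
    return f"{base}/results_{now.strftime('%d-%m-%y %H:%M')}"

run = datetime(2023, 10, 20, 14, 51)
print(results_path(run))  # ../results/results_20-10-23 14:51
```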

0 commit comments