Collect stdlib benchmarks results #7599

Conversation

Akirathan
Member

@Akirathan Akirathan commented Aug 17, 2023

Closes #7598

Notable changes to the bench_download.py script:

  • Added a new -s|--source parameter that can be either engine or stdlib.
  • Data is downloaded asynchronously, which makes it blazingly fast: downloading all the benchmarks for the last 2 months without the cache takes 5 seconds.
  • The cache is not used by default.
  • Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
    • The default retention period for all artifacts is 90 days and cannot be changed.
  • Use the standard xml library to parse the bench reports instead of regular-expression matching (a sketch follows below).
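
A rough sketch of the XML-based parsing, purely as an illustration - the element names used here (<cases>, <case>, <label>, <scores>, <score>) are assumptions, not necessarily the real layout of the bench report:

import xml.etree.ElementTree as ET
from typing import Dict, List

def parse_bench_report(path: str) -> Dict[str, List[float]]:
    # Parse the report with the standard library instead of regexes.
    root = ET.parse(path).getroot()
    results: Dict[str, List[float]] = {}
    for case in root.findall("./cases/case"):
        label = (case.findtext("label") or "").strip()
        scores = [float(s.text) for s in case.findall("./scores/score") if s.text]
        results[label] = scores
    return results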

Changes to the generated website:

  • Added "filter by date" functionality.
  • By default, only benchmarks for the last month are displayed on the charts (see the sketch below).
  • Minor style tweaks.
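
Purely as an illustration of the default one-month window (the actual filtering happens in the generated page's JavaScript), a minimal sketch of the cutoff logic:

from datetime import datetime, timedelta
from typing import List, Tuple

def filter_last_month(points: List[Tuple[datetime, float]],
                      days: int = 30) -> List[Tuple[datetime, float]]:
    # Keep only points newer than `days` days; mirrors the default chart window.
    cutoff = datetime.now() - timedelta(days=days)
    return [(ts, score) for ts, score in points if ts >= cutoff]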

@Akirathan Akirathan self-assigned this Aug 17, 2023
@Akirathan Akirathan added the "CI: No changelog needed" label Aug 23, 2023
@Akirathan Akirathan marked this pull request as ready for review August 23, 2023 21:40
@radeusgd
Member

> Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
>
>   • The default retention period for all artifacts is 90 days and cannot be changed.

If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.
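
A minimal sketch of such an extra upload step, assuming boto3 and a hypothetical bucket and key layout (the real CI has its own tooling and credential handling):

import boto3

def upload_bench_report(report_path: str, run_id: str,
                        bucket: str = "enso-benchmark-results") -> None:
    # Hypothetical bucket name and key layout; credentials would come from the
    # job's environment (the usual AWS_* variables).
    s3 = boto3.client("s3")
    key = f"stdlib/{run_id}/bench-report.xml"
    s3.upload_file(report_path, bucket, key)
    print(f"Uploaded {report_path} to s3://{bucket}/{key}")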

@Akirathan
Member Author

> > Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
> >
> >   • The default retention period for all artifacts is 90 days and cannot be changed.
>
> If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.

Upload to S3 would be very nice; that would solve a lot of problems. Do you know how to do that? Or can you refer me to the point where we used to do that?

Co-authored-by: GregoryTravis <[email protected]>
@Akirathan Akirathan added the "CI: Ready to merge" label Aug 28, 2023
@radeusgd
Member

> > > Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
> > >
> > >   • The default retention period for all artifacts is 90 days and cannot be changed.
> >
> > If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.
>
> Upload to S3 would be very nice; that would solve a lot of problems. Do you know how to do that? Or can you refer me to the point where we used to do that?

I wrote this using GitHub Actions a long time ago. I think we are still doing this, but I'm less familiar with the new Rust CI structure - I think

pub async fn update_manifest(repo_context: &impl IsRepo, edition_file: &Path) -> Result {
    let bucket_context = BucketContext {
        client: aws_sdk_s3::Client::new(&aws_config::load_from_env().await),
        bucket: EDITIONS_BUCKET_NAME.to_string(),
        upload_acl: ObjectCannedAcl::PublicRead,
        key_prefix: Some(repo_context.name().to_string()),
    };
    let new_edition_name = Edition(
        edition_file
            .file_stem()
            .context("Edition file path is missing filename stem!")?
            .as_str()
            .to_string(),
    );
    ide_ci::fs::expect_file(edition_file)?;
    let manifest = bucket_context.get_yaml::<Manifest>(MANIFEST_FILENAME).await?;
    debug!("Got manifest index from S3: {:#?}", manifest);
    let (new_manifest, nightlies_to_remove) =
        manifest.with_new_nightly(new_edition_name, NIGHTLY_EDITIONS_LIMIT);
    for nightly_to_remove in nightlies_to_remove {
        debug!("Should remove {}", nightly_to_remove);
    }
    let new_edition_filename =
        edition_file.file_name().context("Edition file path is missing filename!")?;
    bucket_context
        .put(new_edition_filename.as_str(), ByteStream::from_path(&edition_file).await?)
        .await?;
    bucket_context.put_yaml("manifest.yaml", &new_manifest).await?;
    Ok(())
}
may be a starting point but you may need to ask @mwu-tow for more details.

@Akirathan
Member Author

> I wrote this using GitHub Actions a long time ago. […] may be a starting point but you may need to ask @mwu-tow for more details.

Thanks for the comments. I have created #7669 to track uploading the results; that would make the result processing much simpler.

@JaroslavTulach
Member

When I run this script on my Ubuntu 22.04 I get:

$ python --version
Python 3.10.12
tools/performance/engine-benchmarks$ ./bench_download.py -s stdlib -b develop
Traceback (most recent call last):
  File "tools/performance/engine-benchmarks/./bench_download.py", line 908, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "tools/performance/engine-benchmarks/./bench_download.py", line 814, in main
    await get_bench_runs(since, until, branch, workflow_id)
  File "tools/performance/engine-benchmarks/./bench_download.py", line 429, in get_bench_runs
    async with asyncio.TaskGroup() as task_group:
AttributeError: module 'asyncio' has no attribute 'TaskGroup'

I assume this is caused by the attempt to download the results in parallel, which uses Python features that are "too advanced" for my Ubuntu 22.04. I'd prefer a slower download but greater portability.
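
One way to make the failure clearer (purely illustrative; the fix taken later in this thread was to stop using TaskGroup) would be an explicit version guard, since asyncio.TaskGroup only exists on Python 3.11+:

import sys

# asyncio.TaskGroup was added in Python 3.11, so fail early with a readable
# message instead of an AttributeError deep inside the download logic.
if sys.version_info < (3, 11):
    sys.exit("bench_download.py needs Python 3.11+ for asyncio.TaskGroup; "
             f"found {sys.version.split()[0]}")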

@JaroslavTulach
Member

The following patch makes the script work on my Python 3.10.2:

tools/performance/engine-benchmarks$ git diff
diff --git tools/performance/engine-benchmarks/bench_download.py tools/performance/engine-benchmarks/bench_download.py
index 55aa752df8..8ec3fb4d82 100755
--- tools/performance/engine-benchmarks/bench_download.py
+++ tools/performance/engine-benchmarks/bench_download.py
@@ -45,7 +45,6 @@ Dependencies for the script:
         - Used as a template engine for the HTML.
 """
 
-import asyncio
 import json
 import logging
 import logging.config
@@ -81,7 +80,7 @@ except ModuleNotFoundError as err:
 DATE_FORMAT = "%Y-%m-%d"
 ENGINE_BENCH_WORKFLOW_ID = 29450898
 """
-Workflow ID of engine benchmarks, got via `gh api 
+Workflow ID of engine benchmarks, got via `gh api
 '/repos/enso-org/enso/actions/workflows'`.
 The name of the workflow is 'Benchmark Engine'
 """
@@ -92,7 +91,7 @@ since 2023-08-22.
 """
 STDLIBS_BENCH_WORKFLOW_ID = 66661001
 """
-Workflod ID of stdlibs benchmarks, got via `gh api 
+Workflod ID of stdlibs benchmarks, got via `gh api
 '/repos/enso-org/enso/actions/workflows'`.
 The name is 'Benchmark Standard Libraries'
 """
@@ -297,7 +296,7 @@ def _read_json(json_file: str) -> Dict[Any, Any]:
         return json.load(f)
 
 
-async def _invoke_gh_api(endpoint: str,
+def _invoke_gh_api(endpoint: str,
                    query_params: Dict[str, str] = {},
                    result_as_text: bool = True) -> Union[Dict[str, Any], bytes]:
     query_str_list = [key + "=" + value for key, value in query_params.items()]
@@ -307,21 +306,18 @@ async def _invoke_gh_api(endpoint: str,
         "api",
         f"/repos/enso-org/enso{endpoint}" + ("" if len(query_str) == 0 else "?" + query_str)
     ]
-    logging.info(f"Starting subprocess `{' '.join(cmd)}`")
-    proc = await asyncio.create_subprocess_exec("gh", *cmd[1:],
-                                                stdout=subprocess.PIPE,
-                                                stderr=subprocess.PIPE)
-    out, err = await proc.communicate()
-    logging.info(f"Finished subprocess `{' '.join(cmd)}`")
-    if proc.returncode != 0:
-        print("Command `" + " ".join(cmd) + "` FAILED with errcode " + str(
-            proc.returncode))
-        print(err.decode())
-        exit(proc.returncode)
-    if result_as_text:
-        return json.loads(out.decode())
-    else:
-        return out
+    logging.info(f"Running subprocess `{' '.join(cmd)}`")
+    try:
+        ret = subprocess.run(cmd, check=True, text=result_as_text, capture_output=True)
+        if result_as_text:
+            return json.loads(ret.stdout)
+        else:
+            return ret.stdout
+    except subprocess.CalledProcessError as err:
+        print("Command `" + " ".join(cmd) + "` FAILED with errcode " + str(err.returncode))
+        print(err.stdout)
+        print(err.stderr)
+        exit(err.returncode)
 
 
 class Cache:
@@ -393,7 +389,7 @@ class FakeCache:
         return 0
 
 
-async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow_id: int) -> List[JobRun]:
+def get_bench_runs(since: datetime, until: datetime, branch: str, workflow_id: int) -> List[JobRun]:
     """
     Fetches the list of all the job runs from the GH API for the specified `branch`.
     """
@@ -407,16 +403,16 @@ async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow
         # Start with 1, just to determine the total count
         "per_page": "1"
     }
-    res = await _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", query_fields)
+    res = _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", query_fields)
     total_count = int(res["total_count"])
     per_page = 3
     logging.debug(f"Total count of all runs: {total_count} for workflow ID "
                   f"{workflow_id}. Will process {per_page} runs per page")
 
-    async def get_and_parse_run(page: int, parsed_bench_runs) -> None:
+    def get_and_parse_run(page: int, parsed_bench_runs) -> None:
         _query_fields = query_fields.copy()
         _query_fields["page"] = str(page)
-        res = await _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", _query_fields)
+        res = _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", _query_fields)
         bench_runs_json = res["workflow_runs"]
         _parsed_bench_runs = [_parse_bench_run_from_json(bench_run_json)
                               for bench_run_json in bench_runs_json]
@@ -426,15 +422,13 @@ async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow
     query_fields["per_page"] = str(per_page)
     num_queries = math.ceil(total_count / per_page)
     parsed_bench_runs = []
-    async with asyncio.TaskGroup() as task_group:
-        # Page is indexed from 1
-        for page in range(1, num_queries + 1):
-            task_group.create_task(get_and_parse_run(page, parsed_bench_runs))
+    for page in range(1, num_queries + 1):
+        get_and_parse_run(page, parsed_bench_runs)
 
     return parsed_bench_runs
 
 
-async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Optional[JobReport]:
+def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Optional[JobReport]:
     """
     Extracts some data from the given bench_run, which was fetched via the GH API,
     optionally getting it from the cache.
@@ -451,7 +445,7 @@ async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Op
     # There might be multiple artifacts in the artifact list for a benchmark run
     # We are looking for the one named 'Runtime Benchmark Report', which will
     # be downloaded as a ZIP file.
-    obj: Dict[str, Any] = await _invoke_gh_api(f"/actions/runs/{bench_run.id}/artifacts")
+    obj: Dict[str, Any] = _invoke_gh_api(f"/actions/runs/{bench_run.id}/artifacts")
     artifacts = obj["artifacts"]
     assert len(artifacts) == 1, "There should be exactly one artifact for a benchmark run"
     bench_report_artifact = artifacts[0]
@@ -466,7 +460,7 @@ async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Op
         return None
 
     # Get contents of the ZIP artifact file
-    artifact_ret = await _invoke_gh_api(f"/actions/artifacts/{artifact_id}/zip", result_as_text=False)
+    artifact_ret = _invoke_gh_api(f"/actions/artifacts/{artifact_id}/zip", result_as_text=False)
     zip_file_name = os.path.join(temp_dir, artifact_id + ".zip")
     logging.debug(f"Writing artifact ZIP content into {zip_file_name}")
     with open(zip_file_name, "wb") as zip_file:
@@ -683,7 +677,7 @@ def ensure_gh_installed() -> None:
         exit(1)
 
 
-async def main():
+if __name__ == '__main__':
     default_since: datetime = (datetime.now() - timedelta(days=14))
     default_until: datetime = datetime.now()
     default_cache_dir = path.expanduser("~/.cache/enso_bench_download")
@@ -808,27 +802,20 @@ async def main():
     """ Set of all gathered benchmark labels from all the job reports """
     job_reports_per_branch: Dict[str, List[JobReport]] = {}
     for branch in branches:
-        bench_runs: List[JobRun] = []
         for workflow_id in bench_source.workflow_ids():
-            bench_runs.extend(
-                await get_bench_runs(since, until, branch, workflow_id)
-            )
-        if len(bench_runs) == 0:
-            print(
-                f"No successful benchmarks found within period since {since}"
-                f" until {until} for branch {branch}")
-            exit(1)
+            bench_runs = get_bench_runs(since, until, branch, workflow_id)
+            if len(bench_runs) == 0:
+                print(
+                    f"No successful benchmarks found within period since {since}"
+                    f" until {until} for branch {branch}")
+                exit(1)
 
         job_reports: List[JobReport] = []
 
-        async def _process_report(_bench_run):
-            _job_report = await get_bench_report(_bench_run, cache, temp_dir)
-            if _job_report:
-                job_reports.append(_job_report)
-
-        async with asyncio.TaskGroup() as task_group:
-            for bench_run in bench_runs:
-                task_group.create_task(_process_report(bench_run))
+        for bench_run in bench_runs:
+            job_report = get_bench_report(bench_run, cache, temp_dir)
+            if job_report:
+                job_reports.append(job_report)
 
         logging.debug(f"Got {len(job_reports)} job reports for branch {branch}")
         if len(job_reports) == 0:
@@ -839,13 +826,15 @@ async def main():
 
         logging.debug("Sorting job_reports by commit date")
 
-        def _get_timestamp(job_report: JobReport) -> datetime:
+
+        def get_timestamp(job_report: JobReport) -> datetime:
             return datetime.strptime(
                 job_report.bench_run.head_commit.timestamp,
                 GH_DATE_FORMAT
             )
 
-        job_reports.sort(key=lambda report: _get_timestamp(report))
+
+        job_reports.sort(key=lambda report: get_timestamp(report))
 
         if create_csv:
             write_bench_reports_to_csv(job_reports, csv_output)
@@ -902,7 +891,3 @@ async def main():
     index_html_abs_path = path.abspath(site_path)
     print(f"The generated HTML is in {index_html_abs_path}")
     print(f"Open file://{index_html_abs_path} in the browser")
-
-
-if __name__ == "__main__":
-    asyncio.run(main())

@JaroslavTulach JaroslavTulach self-requested a review August 31, 2023 05:34
Member

@JaroslavTulach JaroslavTulach left a comment


Please make the script portable so that it continues to run on Ubuntu 22.04 and/or GraalPy 17.

TaskGroup requires Python 3.11+
@Akirathan
Member Author

@JaroslavTulach Fixed with 8484e6e - asyncio.TaskGroup requires Python 3.11+, so I replaced it with an explicit list of tasks and tested the script on the python:3.7.16 Docker image; it should work on your side.

I don't want to give up on asynchronous fetching, since I deliberately changed the behavior of --use-cache to be false by default: I ran into some cache-inconsistency bugs that were difficult to catch, and with asynchronous fetching you can generate a website with results from the last 2 months in under 20 seconds.
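
A minimal sketch of that kind of replacement - an explicit list of tasks awaited with asyncio.gather, available since Python 3.7; the helpers below are illustrative and not the exact code from 8484e6e:

import asyncio
import subprocess

async def fetch(endpoint: str) -> bytes:
    # Run one `gh api` call as a subprocess without blocking the event loop.
    proc = await asyncio.create_subprocess_exec(
        "gh", "api", endpoint,
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(err.decode())
    return out

async def fetch_all(endpoints):
    # Explicit task list + gather instead of asyncio.TaskGroup (3.11+ only).
    tasks = [asyncio.ensure_future(fetch(e)) for e in endpoints]
    return await asyncio.gather(*tasks)

# For example: asyncio.run(fetch_all(["/repos/enso-org/enso/actions/workflows"]))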

@mergify mergify bot merged commit 50a9b39 into develop Sep 6, 2023
@mergify mergify bot deleted the wip/akirathan/7598-Ensure-that-stdlib-benchmark-reports-are-collected branch September 6, 2023 14:30
Labels
CI: No changelog needed - Do not require a changelog entry for this PR.
CI: Ready to merge - This PR is eligible for automatic merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure that stdlib benchmark reports are collected
5 participants