Collect stdlib benchmarks results #7599

Conversation

Akirathan
Member

@Akirathan Akirathan commented Aug 17, 2023

Closes #7598

Notable changes to the bench_download.py script:

  • Added a new -s|--source parameter that can be either engine or stdlib.
  • Data is downloaded asynchronously, which makes it blazingly fast: downloading all the benchmarks for the last 2 months without the cache takes 5 seconds.
  • The cache is not used by default.
  • Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
    • The default retention period for all artifacts is 90 days and cannot be changed.
  • Use the standard xml library to parse the bench reports instead of regular-expression matching (a sketch follows below).
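
A rough sketch of the XML-based parsing, purely as an illustration - the element names used here (<cases>, <case>, <label>, <scores>, <score>) are assumptions, not necessarily the real layout of the bench report:

import xml.etree.ElementTree as ET
from typing import Dict, List

def parse_bench_report(path: str) -> Dict[str, List[float]]:
    # Parse the report with the standard library instead of regexes.
    root = ET.parse(path).getroot()
    results: Dict[str, List[float]] = {}
    for case in root.findall("./cases/case"):
        label = (case.findtext("label") or "").strip()
        scores = [float(s.text) for s in case.findall("./scores/score") if s.text]
        results[label] = scores
    return results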

Changes to the generated website:

  • Added "filter by date" functionality.
  • By default, only benchmarks for the last month are displayed on the charts (see the sketch below).
  • Minor style tweaks.
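
Purely as an illustration of the default one-month window (the actual filtering happens in the generated page's JavaScript), a minimal sketch of the cutoff logic:

from datetime import datetime, timedelta
from typing import List, Tuple

def filter_last_month(points: List[Tuple[datetime, float]],
                      days: int = 30) -> List[Tuple[datetime, float]]:
    # Keep only points newer than `days` days; mirrors the default chart window.
    cutoff = datetime.now() - timedelta(days=days)
    return [(ts, score) for ts, score in points if ts >= cutoff]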

@Akirathan Akirathan self-assigned this Aug 17, 2023
@Akirathan Akirathan added the "CI: No changelog needed" label Aug 23, 2023
@Akirathan Akirathan marked this pull request as ready for review August 23, 2023 21:40
@radeusgd
Member

> Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
>
>   • The default retention period for all artifacts is 90 days and cannot be changed.

If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.
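
A minimal sketch of such an extra upload step, assuming boto3 and a hypothetical bucket and key layout (the real CI has its own tooling and credential handling):

import boto3

def upload_bench_report(report_path: str, run_id: str,
                        bucket: str = "enso-benchmark-results") -> None:
    # Hypothetical bucket name and key layout; credentials would come from the
    # job's environment (the usual AWS_* variables).
    s3 = boto3.client("s3")
    key = f"stdlib/{run_id}/bench-report.xml"
    s3.upload_file(report_path, bucket, key)
    print(f"Uploaded {report_path} to s3://{bucket}/{key}")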

@Akirathan
Member Author

> > Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
> >
> >   • The default retention period for all artifacts is 90 days and cannot be changed.
>
> If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.

Upload to S3 would be very nice; that would solve a lot of problems. Do you know how to do that? Or can you refer me to the point where we used to do that?

Co-authored-by: GregoryTravis <[email protected]>
@Akirathan Akirathan added the "CI: Ready to merge" label Aug 28, 2023
@radeusgd
Member

> > > Issues a warning if the user wants to download benchmarks that have already expired according to the GH retention policy.
> > >
> > >   • The default retention period for all artifacts is 90 days and cannot be changed.
> >
> > If we need to, it is rather straightforward to add a step to the benchmark job that uploads the results not only as a GH artifact but also as a file to S3 - we already have (or had) such functionality. On S3 we can store the results indefinitely if we want to.
>
> Upload to S3 would be very nice; that would solve a lot of problems. Do you know how to do that? Or can you refer me to the point where we used to do that?

I wrote this using GitHub Actions a long time ago. I think we are still doing this, but I'm less familiar with the new Rust CI structure - I think

pub async fn update_manifest(repo_context: &impl IsRepo, edition_file: &Path) -> Result {
    let bucket_context = BucketContext {
        client: aws_sdk_s3::Client::new(&aws_config::load_from_env().await),
        bucket: EDITIONS_BUCKET_NAME.to_string(),
        upload_acl: ObjectCannedAcl::PublicRead,
        key_prefix: Some(repo_context.name().to_string()),
    };
    let new_edition_name = Edition(
        edition_file
            .file_stem()
            .context("Edition file path is missing filename stem!")?
            .as_str()
            .to_string(),
    );
    ide_ci::fs::expect_file(edition_file)?;
    let manifest = bucket_context.get_yaml::<Manifest>(MANIFEST_FILENAME).await?;
    debug!("Got manifest index from S3: {:#?}", manifest);
    let (new_manifest, nightlies_to_remove) =
        manifest.with_new_nightly(new_edition_name, NIGHTLY_EDITIONS_LIMIT);
    for nightly_to_remove in nightlies_to_remove {
        debug!("Should remove {}", nightly_to_remove);
    }
    let new_edition_filename =
        edition_file.file_name().context("Edition file path is missing filename!")?;
    bucket_context
        .put(new_edition_filename.as_str(), ByteStream::from_path(&edition_file).await?)
        .await?;
    bucket_context.put_yaml("manifest.yaml", &new_manifest).await?;
    Ok(())
}
may be a starting point but you may need to ask @mwu-tow for more details.

@Akirathan
Member Author

> I wrote this using GitHub Actions a long time ago. […] may be a starting point but you may need to ask @mwu-tow for more details.

Thanks for the comments. I have created #7669 to track uploading the results; that would make the result processing much simpler.

@JaroslavTulach
Member

When I run this script on my Ubuntu 22.04 I get:

$ python --version
Python 3.10.12
tools/performance/engine-benchmarks$ ./bench_download.py -s stdlib -b develop
Traceback (most recent call last):
  File "tools/performance/engine-benchmarks/./bench_download.py", line 908, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "tools/performance/engine-benchmarks/./bench_download.py", line 814, in main
    await get_bench_runs(since, until, branch, workflow_id)
  File "tools/performance/engine-benchmarks/./bench_download.py", line 429, in get_bench_runs
    async with asyncio.TaskGroup() as task_group:
AttributeError: module 'asyncio' has no attribute 'TaskGroup'

I assume this is caused by the attempt to download the results in parallel, which uses Python features that are "too advanced" for my Ubuntu 22.04. I'd prefer a slower download but greater portability.
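
One way to make the failure clearer (purely illustrative; the fix taken later in this thread was to stop using TaskGroup) would be an explicit version guard, since asyncio.TaskGroup only exists on Python 3.11+:

import sys

# asyncio.TaskGroup was added in Python 3.11, so fail early with a readable
# message instead of an AttributeError deep inside the download logic.
if sys.version_info < (3, 11):
    sys.exit("bench_download.py needs Python 3.11+ for asyncio.TaskGroup; "
             f"found {sys.version.split()[0]}")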

@JaroslavTulach
Member

The following patch makes the script work on my Python 3.10.2:

tools/performance/engine-benchmarks$ git diff
diff --git tools/performance/engine-benchmarks/bench_download.py tools/performance/engine-benchmarks/bench_download.py
index 55aa752df8..8ec3fb4d82 100755
--- tools/performance/engine-benchmarks/bench_download.py
+++ tools/performance/engine-benchmarks/bench_download.py
@@ -45,7 +45,6 @@ Dependencies for the script:
         - Used as a template engine for the HTML.
 """
 
-import asyncio
 import json
 import logging
 import logging.config
@@ -81,7 +80,7 @@ except ModuleNotFoundError as err:
 DATE_FORMAT = "%Y-%m-%d"
 ENGINE_BENCH_WORKFLOW_ID = 29450898
 """
-Workflow ID of engine benchmarks, got via `gh api 
+Workflow ID of engine benchmarks, got via `gh api
 '/repos/enso-org/enso/actions/workflows'`.
 The name of the workflow is 'Benchmark Engine'
 """
@@ -92,7 +91,7 @@ since 2023-08-22.
 """
 STDLIBS_BENCH_WORKFLOW_ID = 66661001
 """
-Workflod ID of stdlibs benchmarks, got via `gh api 
+Workflod ID of stdlibs benchmarks, got via `gh api
 '/repos/enso-org/enso/actions/workflows'`.
 The name is 'Benchmark Standard Libraries'
 """
@@ -297,7 +296,7 @@ def _read_json(json_file: str) -> Dict[Any, Any]:
         return json.load(f)
 
 
-async def _invoke_gh_api(endpoint: str,
+def _invoke_gh_api(endpoint: str,
                    query_params: Dict[str, str] = {},
                    result_as_text: bool = True) -> Union[Dict[str, Any], bytes]:
     query_str_list = [key + "=" + value for key, value in query_params.items()]
@@ -307,21 +306,18 @@ async def _invoke_gh_api(endpoint: str,
         "api",
         f"/repos/enso-org/enso{endpoint}" + ("" if len(query_str) == 0 else "?" + query_str)
     ]
-    logging.info(f"Starting subprocess `{' '.join(cmd)}`")
-    proc = await asyncio.create_subprocess_exec("gh", *cmd[1:],
-                                                stdout=subprocess.PIPE,
-                                                stderr=subprocess.PIPE)
-    out, err = await proc.communicate()
-    logging.info(f"Finished subprocess `{' '.join(cmd)}`")
-    if proc.returncode != 0:
-        print("Command `" + " ".join(cmd) + "` FAILED with errcode " + str(
-            proc.returncode))
-        print(err.decode())
-        exit(proc.returncode)
-    if result_as_text:
-        return json.loads(out.decode())
-    else:
-        return out
+    logging.info(f"Running subprocess `{' '.join(cmd)}`")
+    try:
+        ret = subprocess.run(cmd, check=True, text=result_as_text, capture_output=True)
+        if result_as_text:
+            return json.loads(ret.stdout)
+        else:
+            return ret.stdout
+    except subprocess.CalledProcessError as err:
+        print("Command `" + " ".join(cmd) + "` FAILED with errcode " + str(err.returncode))
+        print(err.stdout)
+        print(err.stderr)
+        exit(err.returncode)
 
 
 class Cache:
@@ -393,7 +389,7 @@ class FakeCache:
         return 0
 
 
-async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow_id: int) -> List[JobRun]:
+def get_bench_runs(since: datetime, until: datetime, branch: str, workflow_id: int) -> List[JobRun]:
     """
     Fetches the list of all the job runs from the GH API for the specified `branch`.
     """
@@ -407,16 +403,16 @@ async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow
         # Start with 1, just to determine the total count
         "per_page": "1"
     }
-    res = await _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", query_fields)
+    res = _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", query_fields)
     total_count = int(res["total_count"])
     per_page = 3
     logging.debug(f"Total count of all runs: {total_count} for workflow ID "
                   f"{workflow_id}. Will process {per_page} runs per page")
 
-    async def get_and_parse_run(page: int, parsed_bench_runs) -> None:
+    def get_and_parse_run(page: int, parsed_bench_runs) -> None:
         _query_fields = query_fields.copy()
         _query_fields["page"] = str(page)
-        res = await _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", _query_fields)
+        res = _invoke_gh_api(f"/actions/workflows/{workflow_id}/runs", _query_fields)
         bench_runs_json = res["workflow_runs"]
         _parsed_bench_runs = [_parse_bench_run_from_json(bench_run_json)
                               for bench_run_json in bench_runs_json]
@@ -426,15 +422,13 @@ async def get_bench_runs(since: datetime, until: datetime, branch: str, workflow
     query_fields["per_page"] = str(per_page)
     num_queries = math.ceil(total_count / per_page)
     parsed_bench_runs = []
-    async with asyncio.TaskGroup() as task_group:
-        # Page is indexed from 1
-        for page in range(1, num_queries + 1):
-            task_group.create_task(get_and_parse_run(page, parsed_bench_runs))
+    for page in range(1, num_queries + 1):
+        get_and_parse_run(page, parsed_bench_runs)
 
     return parsed_bench_runs
 
 
-async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Optional[JobReport]:
+def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Optional[JobReport]:
     """
     Extracts some data from the given bench_run, which was fetched via the GH API,
     optionally getting it from the cache.
@@ -451,7 +445,7 @@ async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Op
     # There might be multiple artifacts in the artifact list for a benchmark run
     # We are looking for the one named 'Runtime Benchmark Report', which will
     # be downloaded as a ZIP file.
-    obj: Dict[str, Any] = await _invoke_gh_api(f"/actions/runs/{bench_run.id}/artifacts")
+    obj: Dict[str, Any] = _invoke_gh_api(f"/actions/runs/{bench_run.id}/artifacts")
     artifacts = obj["artifacts"]
     assert len(artifacts) == 1, "There should be exactly one artifact for a benchmark run"
     bench_report_artifact = artifacts[0]
@@ -466,7 +460,7 @@ async def get_bench_report(bench_run: JobRun, cache: Cache, temp_dir: str) -> Op
         return None
 
     # Get contents of the ZIP artifact file
-    artifact_ret = await _invoke_gh_api(f"/actions/artifacts/{artifact_id}/zip", result_as_text=False)
+    artifact_ret = _invoke_gh_api(f"/actions/artifacts/{artifact_id}/zip", result_as_text=False)
     zip_file_name = os.path.join(temp_dir, artifact_id + ".zip")
     logging.debug(f"Writing artifact ZIP content into {zip_file_name}")
     with open(zip_file_name, "wb") as zip_file:
@@ -683,7 +677,7 @@ def ensure_gh_installed() -> None:
         exit(1)
 
 
-async def main():
+if __name__ == '__main__':
     default_since: datetime = (datetime.now() - timedelta(days=14))
     default_until: datetime = datetime.now()
     default_cache_dir = path.expanduser("~/.cache/enso_bench_download")
@@ -808,27 +802,20 @@ async def main():
     """ Set of all gathered benchmark labels from all the job reports """
     job_reports_per_branch: Dict[str, List[JobReport]] = {}
     for branch in branches:
-        bench_runs: List[JobRun] = []
         for workflow_id in bench_source.workflow_ids():
-            bench_runs.extend(
-                await get_bench_runs(since, until, branch, workflow_id)
-            )
-        if len(bench_runs) == 0:
-            print(
-                f"No successful benchmarks found within period since {since}"
-                f" until {until} for branch {branch}")
-            exit(1)
+            bench_runs = get_bench_runs(since, until, branch, workflow_id)
+            if len(bench_runs) == 0:
+                print(
+                    f"No successful benchmarks found within period since {since}"
+                    f" until {until} for branch {branch}")
+                exit(1)
 
         job_reports: List[JobReport] = []
 
-        async def _process_report(_bench_run):
-            _job_report = await get_bench_report(_bench_run, cache, temp_dir)
-            if _job_report:
-                job_reports.append(_job_report)
-
-        async with asyncio.TaskGroup() as task_group:
-            for bench_run in bench_runs:
-                task_group.create_task(_process_report(bench_run))
+        for bench_run in bench_runs:
+            job_report = get_bench_report(bench_run, cache, temp_dir)
+            if job_report:
+                job_reports.append(job_report)
 
         logging.debug(f"Got {len(job_reports)} job reports for branch {branch}")
         if len(job_reports) == 0:
@@ -839,13 +826,15 @@ async def main():
 
         logging.debug("Sorting job_reports by commit date")
 
-        def _get_timestamp(job_report: JobReport) -> datetime:
+
+        def get_timestamp(job_report: JobReport) -> datetime:
             return datetime.strptime(
                 job_report.bench_run.head_commit.timestamp,
                 GH_DATE_FORMAT
             )
 
-        job_reports.sort(key=lambda report: _get_timestamp(report))
+
+        job_reports.sort(key=lambda report: get_timestamp(report))
 
         if create_csv:
             write_bench_reports_to_csv(job_reports, csv_output)
@@ -902,7 +891,3 @@ async def main():
     index_html_abs_path = path.abspath(site_path)
     print(f"The generated HTML is in {index_html_abs_path}")
     print(f"Open file://{index_html_abs_path} in the browser")
-
-
-if __name__ == "__main__":
-    asyncio.run(main())

@JaroslavTulach JaroslavTulach self-requested a review August 31, 2023 05:34
Member

@JaroslavTulach JaroslavTulach left a comment


Please make the script portable so that it continues to run on Ubuntu 22.04 and/or GraalPy 17.

TaskGroup requires Python 3.11+
@Akirathan
Member Author

@JaroslavTulach Fixed with 8484e6e - asyncio.TaskGroup requires Python 3.11+, so I replaced it with an explicit list of tasks and tested the script on the python:3.7.16 Docker image; it should work on your side.

I don't want to give up on asynchronous fetching, since I deliberately changed the behavior of --use-cache to be false by default: I ran into some cache-inconsistency bugs that were difficult to catch, and with asynchronous fetching you can generate a website with results from the last 2 months in under 20 seconds.
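
A minimal sketch of that kind of replacement - an explicit list of tasks awaited with asyncio.gather, available since Python 3.7; the helpers below are illustrative and not the exact code from 8484e6e:

import asyncio
import subprocess

async def fetch(endpoint: str) -> bytes:
    # Run one `gh api` call as a subprocess without blocking the event loop.
    proc = await asyncio.create_subprocess_exec(
        "gh", "api", endpoint,
        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(err.decode())
    return out

async def fetch_all(endpoints):
    # Explicit task list + gather instead of asyncio.TaskGroup (3.11+ only).
    tasks = [asyncio.ensure_future(fetch(e)) for e in endpoints]
    return await asyncio.gather(*tasks)

# For example: asyncio.run(fetch_all(["/repos/enso-org/enso/actions/workflows"]))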

@mergify mergify bot merged commit 50a9b39 into develop Sep 6, 2023
@mergify mergify bot deleted the wip/akirathan/7598-Ensure-that-stdlib-benchmark-reports-are-collected branch September 6, 2023 14:30
Labels
CI: No changelog needed - Do not require a changelog entry for this PR.
CI: Ready to merge - This PR is eligible for automatic merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ensure that stdlib benchmark reports are collected
5 participants