Skip to content

Commit b4fa4fd

Browse files
authored
Generate Kani Metrics (#235)
Add a new "Kani Metrics" workflow (that runs every week) that calls `./scripts/run-kani --run metrics`, then creates a pull request to this repository with the computed metrics. See [here](https://github.com/carolynzech/verify-rust-std/pull/38) for an example of what the pull request will look like. Callouts: - This is a separate workflow from the Kani workflow because it is a cronjob instead of running on every pull request. I thought the plots would be too noisy if we ran on every PR. - See the "Notes" section of `kani_std_analysis.py` for other callouts. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses.
1 parent 2b2baa8 commit b4fa4fd

File tree

6 files changed

+384
-4
lines changed

6 files changed

+384
-4
lines changed

.github/workflows/kani-metrics.yml

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
name: Kani Metrics Update
2+
3+
on:
4+
schedule:
5+
- cron: '0 0 * * 0' # Run at 00:00 UTC every Sunday
6+
workflow_dispatch:
7+
8+
defaults:
9+
run:
10+
shell: bash
11+
12+
jobs:
13+
update-kani-metrics:
14+
runs-on: ubuntu-latest
15+
16+
steps:
17+
- name: Checkout Repository
18+
uses: actions/checkout@v4
19+
with:
20+
submodules: true
21+
22+
# The Kani metrics collection uses a Python script (kani_std_analysis.py), so make sure Python is installed
23+
- name: Set up Python
24+
uses: actions/setup-python@v4
25+
with:
26+
python-version: '3.x'
27+
28+
- name: Compute Kani Metrics
29+
run: ./scripts/run-kani.sh --run metrics --path ${{github.workspace}}
30+
31+
- name: Create Pull Request
32+
uses: peter-evans/create-pull-request@v7
33+
with:
34+
commit-message: Update Kani metrics
35+
title: 'Update Kani Metrics'
36+
body: |
37+
This is an automated PR to update Kani metrics.
38+
39+
The metrics have been updated by running `./scripts/run-kani.sh --run metrics`.
40+
branch: update-kani-metrics
41+
delete-branch: true
42+
base: main

.gitignore

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -46,7 +46,6 @@ package-lock.json
4646
## Kani
4747
*.out
4848

49-
5049
# Added by cargo
5150
#
5251
# already existing elements were commented out
Lines changed: 307 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,307 @@
1+
#!/usr/bin/env python3
2+
3+
import argparse
4+
from collections import defaultdict
5+
import csv
6+
import json
7+
import sys
8+
from datetime import datetime
9+
import matplotlib.pyplot as plt
10+
import matplotlib.dates as mdates
11+
from matplotlib.ticker import FixedLocator
12+
13+
# Output metrics about Kani's application to the standard library by:
14+
# 1. Postprocessing results from running `kani list`, which collects data about Kani harnesses and contracts.
15+
# 2. Postprocessing results from std-analysis.sh, which outputs metrics about the standard library unrelated to Kani (e.g., a list of the functions in each crate).
16+
# 3. Comparing the output of steps #1 and #2 to compare the Kani-verified portion of the standard library to the overall library,
17+
# e.g., what fraction of unsafe functions have Kani contracts.
18+
19+
# Notes:
20+
# - This script assumes that std-analysis.sh and `kani list` have already run, since we expect to invoke this script from `run-kani.sh`.
21+
# - The results are architecture-dependent: the standard library has functions that are only present for certain architectures,
22+
# and https://github.com/model-checking/verify-rust-std/pull/122 has Kani harnesses that only run on x86-64.
23+
# - The total functions under contract are not necessarily equal to the sum of unsafe and safe functions under contract.
24+
# We've added a few functions (three, as of writing) with contracts to this fork, e.g. ffi::c_str::is_null_terminated.
25+
# Since std-analysis.sh runs on the standard library given a toolchain (not this fork), it doesn't know about this function and therefore can't classify it as safe or unsafe.
26+
# But `kani list` runs on this fork, so it can still see it and add it to the total functions under contract.
27+
# - See #TODOs for known limitations.
28+
29+
# Process the results from Kani's std-analysis.sh script for each crate.
30+
# TODO For now, we just handle "core", but we should process all crates in the library.
31+
class GenericSTDMetrics():
32+
def __init__(self, results_dir):
33+
self.results_directory = results_dir
34+
self.unsafe_fns_count = 0
35+
self.safe_abstractions_count = 0
36+
self.safe_fns_count = 0
37+
self.unsafe_fns = []
38+
self.safe_abstractions = []
39+
self.safe_fns = []
40+
41+
self.read_std_analysis()
42+
43+
# Read {crate}_overall_counts.csv
44+
# and return the number of unsafe functions and safe abstractions
45+
def read_overall_counts(self):
46+
file_path = f"{self.results_directory}/core_scan_overall.csv"
47+
with open(file_path, 'r') as f:
48+
csv_reader = csv.reader(f, delimiter=';')
49+
counts = {row[0]: int(row[1]) for row in csv_reader if len(row) >= 2}
50+
self.unsafe_fns_count = counts.get('unsafe_fns', 0)
51+
self.safe_abstractions_count = counts.get('safe_abstractions', 0)
52+
self.safe_fns_count = counts.get('safe_fns', 0)
53+
54+
# Read {crate}_scan_functions.csv
55+
# and return an array of the unsafe functions and the safe abstractions
56+
def read_scan_functions(self):
57+
expected_header_start = "name;is_unsafe;has_unsafe_ops"
58+
file_path = f"{self.results_directory}/core_scan_functions.csv"
59+
60+
with open(file_path, 'r') as f:
61+
csv_reader = csv.reader(f, delimiter=';', quotechar='"')
62+
63+
# The row parsing logic below assumes the column structure in expected_header_start,
64+
# so assert that is how the header begins before continuing
65+
header = next(csv_reader)
66+
header_str = ';'.join(header[:3])
67+
if not header_str.startswith(expected_header_start):
68+
print(f"Error: Unexpected CSV header in {file_path}")
69+
print(f"Expected header to start with: {expected_header_start}")
70+
print(f"Actual header: {header_str}")
71+
sys.exit(1)
72+
73+
for row in csv_reader:
74+
if len(row) >= 3:
75+
name, is_unsafe, has_unsafe_ops = row[0], row[1], row[2]
76+
# An unsafe function is a function for which is_unsafe=true
77+
if is_unsafe.strip() == "true":
78+
self.unsafe_fns.append(name)
79+
else:
80+
assert is_unsafe.strip() == "false" # sanity check against malformed data
81+
self.safe_fns.append(name)
82+
# A safe abstraction is a safe function with unsafe ops
83+
if has_unsafe_ops.strip() == "true":
84+
self.safe_abstractions.append(name)
85+
86+
def read_std_analysis(self):
87+
self.read_overall_counts()
88+
self.read_scan_functions()
89+
90+
# Sanity checks
91+
if len(self.unsafe_fns) != self.unsafe_fns_count:
92+
print(f"Number of unsafe functions does not match core_scan_functions.csv")
93+
print(f"UNSAFE_FNS_COUNT: {self.unsafe_fns_count}")
94+
print(f"UNSAFE_FNS length: {len(self.unsafe_fns)}")
95+
sys.exit(1)
96+
97+
if len(self.safe_abstractions) != self.safe_abstractions_count:
98+
print(f"Number of safe abstractions does not match core_scan_functions.csv")
99+
print(f"SAFE_ABSTRACTIONS_COUNT: {self.safe_abstractions_count}")
100+
print(f"SAFE_ABSTRACTIONS length: {len(self.safe_abstractions)}")
101+
sys.exit(1)
102+
103+
if len(self.safe_fns) != self.safe_fns_count:
104+
print(f"Number of safe functions does not match core_scan_functions.csv")
105+
print(f"SAFE_FNS_COUNT: {self.safe_fns_count}")
106+
print(f"SAFE_FNS length: {len(self.safe_fns)}")
107+
sys.exit(1)
108+
109+
# Process the results of running `kani list` against the standard library,
110+
# i.e., the Kani STD metrics for a single date (whichever day this script is running).
111+
class KaniListSTDMetrics():
112+
def __init__(self, kani_list_filepath):
113+
self.total_fns_under_contract = 0
114+
# List of (fn, has_harnesses) tuples, where fn is a function under contract and has_harnesses=true iff the contract has at least one harness
115+
self.fns_under_contract = []
116+
self.expected_kani_list_version = "0.1"
117+
118+
self.read_kani_list_data(kani_list_filepath)
119+
120+
def read_kani_list_data(self, kani_list_filepath):
121+
try:
122+
with open(kani_list_filepath, 'r') as f:
123+
kani_list_data = json.load(f)
124+
except FileNotFoundError:
125+
print(f"Kani list JSON file not found.")
126+
sys.exit(1)
127+
128+
if kani_list_data.get("file-version") != self.expected_kani_list_version:
129+
print(f"Kani list JSON file version does not match expected version {self.expected_kani_list_version}")
130+
sys.exit(1)
131+
132+
for contract in kani_list_data.get('contracts', []):
133+
func_under_contract = contract.get('function', '')
134+
has_harnesses = len(contract.get('harnesses', [])) > 0
135+
self.fns_under_contract.append((func_under_contract, has_harnesses))
136+
137+
self.total_fns_under_contract = kani_list_data.get('totals', {}).get('functions-under-contract', 0)
138+
139+
# Generate metrics about Kani's application to the standard library over time
140+
# by reading past metrics from metrics_file, then computing today's metrics.
141+
class KaniSTDMetricsOverTime():
142+
def __init__(self, metrics_file):
143+
self.dates = []
144+
self.unsafe_metrics = ['total_unsafe_fns', 'unsafe_fns_under_contract', 'verified_unsafe_fns_under_contract']
145+
self.safe_abstr_metrics = ['total_safe_abstractions', 'safe_abstractions_under_contract', 'verified_safe_abstractions_under_contract']
146+
self.safe_metrics = ['total_safe_fns', 'safe_fns_under_contract', 'verified_safe_fns_under_contract']
147+
# The keys in these dictionaries are unsafe_metrics, safe_abstr_metrics, and safe_metrics, respectively; see update_plot_metrics()
148+
self.unsafe_plot_data = defaultdict(list)
149+
self.safe_abstr_plot_data = defaultdict(list)
150+
self.safe_plot_data = defaultdict(list)
151+
152+
self.date = datetime.today().date()
153+
self.metrics_file = metrics_file
154+
155+
self.read_historical_data()
156+
157+
# Read historical data from self.metrics_file and initialize the date range.
158+
def read_historical_data(self):
159+
try:
160+
with open(self.metrics_file, 'r') as f:
161+
all_data = json.load(f)["results"]
162+
self.update_plot_metrics(all_data)
163+
except FileNotFoundError:
164+
all_data = {}
165+
166+
self.dates = [datetime.strptime(data["date"], '%Y-%m-%d').date() for data in all_data]
167+
self.dates.append(self.date)
168+
169+
print(f"Dates are {self.dates}\n")
170+
171+
# TODO For now, we don't plot how many of the contracts have been verified, since the line overlaps with how many are under contract.
172+
# If we want to plot this data, we should probably generate separate plots.
173+
# Update the plot data with the provided data.
174+
def update_plot_metrics(self, all_data):
175+
for data in all_data:
176+
for metric in self.unsafe_metrics:
177+
if not metric.startswith("verified"):
178+
self.unsafe_plot_data[metric].append(data[metric])
179+
180+
for metric in self.safe_abstr_metrics:
181+
if not metric.startswith("verified"):
182+
self.safe_abstr_plot_data[metric].append(data[metric])
183+
184+
for metric in self.safe_metrics:
185+
if not metric.startswith("verified"):
186+
self.safe_plot_data[metric].append(data[metric])
187+
188+
# Read output from kani list and std-analysis.sh, then compare their outputs to compute Kani-specific metrics
189+
# and write the results to {self.metrics_file}
190+
def compute_metrics(self, kani_list_filepath, analysis_results_dir):
191+
# Process the `kani list` and `std-analysis.sh` data
192+
kani_data = KaniListSTDMetrics(kani_list_filepath)
193+
generic_metrics = GenericSTDMetrics(analysis_results_dir)
194+
195+
print("Comparing kani-list output to std-analysis.sh output and computing metrics...")
196+
197+
(unsafe_fns_under_contract, verified_unsafe_fns_under_contract) = (0, 0)
198+
(safe_abstractions_under_contract, verified_safe_abstractions_under_contract) = (0, 0)
199+
(safe_fns_under_contract, verified_safe_fns_under_contract) = (0, 0)
200+
201+
for (func_under_contract, has_harnesses) in kani_data.fns_under_contract:
202+
if func_under_contract in generic_metrics.unsafe_fns:
203+
unsafe_fns_under_contract += 1
204+
if has_harnesses:
205+
verified_unsafe_fns_under_contract += 1
206+
if func_under_contract in generic_metrics.safe_abstractions:
207+
safe_abstractions_under_contract += 1
208+
if has_harnesses:
209+
verified_safe_abstractions_under_contract += 1
210+
if func_under_contract in generic_metrics.safe_fns:
211+
safe_fns_under_contract += 1
212+
if has_harnesses:
213+
verified_safe_fns_under_contract += 1
214+
215+
# Keep the keys here in sync with unsafe_metrics, safe_metrics, and safe_abstr_metrics
216+
data = {
217+
"date": self.date,
218+
"total_unsafe_fns": generic_metrics.unsafe_fns_count,
219+
"total_safe_abstractions": generic_metrics.safe_abstractions_count,
220+
"total_safe_fns": generic_metrics.safe_fns_count,
221+
"unsafe_fns_under_contract": unsafe_fns_under_contract,
222+
"verified_unsafe_fns_under_contract": verified_unsafe_fns_under_contract,
223+
"safe_abstractions_under_contract": safe_abstractions_under_contract,
224+
"verified_safe_abstractions_under_contract": verified_safe_abstractions_under_contract,
225+
"safe_fns_under_contract": safe_fns_under_contract,
226+
"verified_safe_fns_under_contract": verified_safe_fns_under_contract,
227+
"total_functions_under_contract": kani_data.total_fns_under_contract,
228+
}
229+
230+
self.update_plot_metrics([data])
231+
232+
# Update self.metrics_file so that these results are included in our historical data for next time
233+
with open(self.metrics_file, 'r') as f:
234+
content = json.load(f)
235+
content["results"].append(data)
236+
237+
with open(self.metrics_file, 'w') as f:
238+
json.dump(content, f, indent=2, default=str)
239+
240+
print(f"Wrote metrics data for date {self.date} to {self.metrics_file}")
241+
242+
# Make a single plot with specified data, title, and filename
243+
def plot_single(self, data, title, outfile):
244+
plt.figure(figsize=(14, 8))
245+
246+
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#946F7bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
247+
248+
for i, (metric, values) in enumerate(data.items()):
249+
color = colors[i % len(colors)]
250+
plt.plot(self.dates, values, 'o-', color=color, label=metric, markersize=6)
251+
252+
# Mark each point on the line with the y value
253+
for x, y in zip(self.dates, values):
254+
plt.annotate(str(y),
255+
(mdates.date2num(x), y),
256+
xytext=(0, 5),
257+
textcoords='offset points',
258+
ha='center',
259+
va='bottom',
260+
color=color,
261+
fontweight='bold')
262+
263+
plt.title(title)
264+
plt.xlabel("Date")
265+
plt.ylabel("Count")
266+
267+
# Set x-axis to only show ticks for the dates we have
268+
plt.gca().xaxis.set_major_locator(FixedLocator(mdates.date2num(self.dates)))
269+
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
270+
271+
plt.gcf().autofmt_xdate()
272+
plt.tight_layout()
273+
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
274+
275+
plt.savefig(outfile, bbox_inches='tight', dpi=300)
276+
plt.close()
277+
278+
print(f"PNG graph generated: {outfile}")
279+
280+
def plot(self):
281+
self.plot_single(self.unsafe_plot_data, title="Contracts on Unsafe Functions in core", outfile="core_unsafe_metrics.png")
282+
self.plot_single(self.safe_abstr_plot_data, title="Contracts on Safe Abstractions in core", outfile="core_safe_abstractions_metrics.png")
283+
self.plot_single(self.safe_plot_data, title="Contracts on Safe Functions in core", outfile="core_safe_metrics.png")
284+
285+
def main():
286+
parser = argparse.ArgumentParser(description="Generate metrics about Kani's application to the standard library.")
287+
parser.add_argument('--metrics-file',
288+
type=str,
289+
default="metrics-data.json",
290+
help="Path to the JSON file containing metrics data (default: metrics-data.json)")
291+
parser.add_argument('--kani-list-file',
292+
type=str,
293+
default="kani-list.json",
294+
help="Path to the JSON file containing the Kani list data (default: kani-list.json)")
295+
# The default is /tmp/std_lib_analysis/results because, as of writing, that's always where std-analysis.sh outputs its results.
296+
parser.add_argument('--analysis-results-dir',
297+
type=str,
298+
default="/tmp/std_lib_analysis/results",
299+
help="Path to the folder file containing the std-analysis.sh results (default: /tmp/std_lib_analysis/results)")
300+
args = parser.parse_args()
301+
302+
metrics = KaniSTDMetricsOverTime(args.metrics_file)
303+
metrics.compute_metrics(args.kani_list_file, args.analysis_results_dir)
304+
metrics.plot()
305+
306+
if __name__ == "__main__":
307+
main()
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
{
2+
"results": [
3+
{
4+
"date": "2025-01-01",
5+
"total_unsafe_fns": 6987,
6+
"total_safe_abstractions": 1704,
7+
"total_safe_fns": 14666,
8+
"unsafe_fns_under_contract": 144,
9+
"verified_unsafe_fns_under_contract": 132,
10+
"safe_abstractions_under_contract": 41,
11+
"verified_safe_abstractions_under_contract": 41,
12+
"safe_fns_under_contract": 77,
13+
"verified_safe_fns_under_contract": 77,
14+
"total_functions_under_contract": 224
15+
}
16+
]
17+
}
Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
matplotlib

0 commit comments

Comments
 (0)