Commit 10b4dc3

author
V
committed
Made some comments and readme fixes.
1 parent 868ec50 commit 10b4dc3

2 files changed, +29 -7 lines changed

jobs/gtfs-schedule-validator/Dockerfile

+1 -1
@@ -28,7 +28,7 @@ COPY ./gtfs-validator-4.1.0-cli.jar ${V4_1_VALIDATOR_JAR}
 ENV V4_2_VALIDATOR_JAR=/gtfs-validator-4.2.0-cli.jar
 COPY ./gtfs-validator-4.2.0-cli.jar ${V4_2_VALIDATOR_JAR}
 
-# v5.0.0 from https://github.com/MobilityData/gtfs-validator/releases/download/v5.0.0/gtfs-validator-5.0.0-cli.jar
+# v5 from https://github.com/MobilityData/gtfs-validator/releases/download/v5.0.0/gtfs-validator-5.0.0-cli.jar
 ENV V5_VALIDATOR_JAR=/gtfs-validator-5.0.0-cli.jar
 COPY ./gtfs-validator-5.0.0-cli.jar ${V5_VALIDATOR_JAR}
 

jobs/gtfs-schedule-validator/README.md

+28 -6
@@ -27,9 +27,31 @@ since data creation.
 If you run into trouble when adding the new validator jar, it's because the default set for check-added-large-files in our pre-commit config which is a relatively low 500Kb. It's more meant as an alarm for local development than as an enforcement mechanism.
 You can make one commit that adds the jar and temporarily adds a higher file size threshold to the pre-commit config [like this one](https://github.com/cal-itp/data-infra/pull/2893/commits/7d40c81f2f5a2622123d4ac5dbbb064eb35565c6) and then a second commit that removes the threshold modification [like this one](https://github.com/cal-itp/data-infra/pull/2893/commits/1ec4e4a1f30ac95b9c0edffcf1f2b12e53e40733). That'll get the file through.
 
-Remember you need to rebuild and push the latest docker file to dhcr before changes will be reflected in airflow runs.
-
-You will need to parse the rules.json from the mobility validator. [Here is a gist to help](https://gist.github.com/vevetron/7d4bbebd2f1d524728d5349293906e3a).
-
-Here is a command to test
-docker-compose run airflow tasks test unzip_and_validate_gtfs_schedule_hourly validate_gtfs_schedule 2024-03-22T18:00:00
+Remember you need to rebuild and push the latest docker file to `dhcr.io` before changes will be reflected in airflow runs.
+
+You will need to parse the `rules.json` from the mobility validator. Here is a code example for the upgrade to v5:
+```
+# https://github.com/MobilityData/gtfs-validator/releases/tag/v5.0.0
+import json
+import pandas as pd
+
+# Replace with your JSON data
+with open('rules.json') as f:
+    data = json.load(f)
+result = []
+for key in data.keys():
+    # print(key)
+    result.append({
+        'code': data[key]['code'],
+        'human_readable_description': data[key]['shortSummary'],
+        'version': 'v5.0.0',
+        'severity': data[key]['severityLevel']
+    })
+# Create CSV
+df = pd.DataFrame(result)
+df.to_csv('gtfs_schedule_validator_rule_details_v5_0_0.csv', index=False)
+```
+
+Here is a command to test once you have appropriate gtfs zip files in the test bucket:
+
+`docker-compose run airflow tasks test unzip_and_validate_gtfs_schedule_hourly validate_gtfs_schedule YYYY-MM-DDTHH:MM:SS`
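
The `rules.json` conversion added to the README could also be sketched without the pandas dependency, which may make it easier to check on a small input before running it against a real release file. This is only a sketch under assumptions: the field names (`code`, `shortSummary`, `severityLevel`) and the top-level mapping keyed by rule name are taken from the README snippet in this commit, while the helper names `rules_to_rows` and `rows_to_csv` and the sample rule are hypothetical and do not come from the repository.

```python
import csv
import io
import json

# Column order matches the keys used in the README snippet.
FIELDNAMES = ["code", "human_readable_description", "version", "severity"]


def rules_to_rows(rules, version):
    """Flatten a rules.json-style mapping into one dict per rule.

    Assumes each value carries 'code', 'shortSummary' and 'severityLevel',
    as the README snippet does.
    """
    return [
        {
            "code": rule["code"],
            "human_readable_description": rule["shortSummary"],
            "version": version,
            "severity": rule["severityLevel"],
        }
        for rule in rules.values()
    ]


def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical one-rule sample in the assumed shape (not real validator output).
sample = json.loads(
    '{"some_rule": {"code": "some_rule", '
    '"shortSummary": "Example summary.", "severityLevel": "ERROR"}}'
)
csv_text = rows_to_csv(rules_to_rows(sample, "v5.0.0"))
```

For a real run, `sample` would be replaced by `json.load` on the downloaded `rules.json`, and `csv_text` written to the target CSV file.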
