Run NEON sample end to end in Dev #355
Comments
@mbthornton-lbl this failed; there is no output directory. The Cromwell run directory is 41ec9955-f8b4-4676-86c3-f2714a67b016.
The root cause is mismatched input key names between the WDL and workflows.yaml.
@vlilanl @mbthornton-lbl please check on this: there are no records matching {'was_informed_by': 'nmdc:omprc-11-qww9t685'} in workflow_execution_set on Mongo dev.
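For anyone repeating this check, here is a minimal sketch of the same lookup. It assumes `collection` is a pymongo-style handle to workflow_execution_set (anything exposing `count_documents`); the helper name `count_executions` is made up for illustration, and how you connect to Mongo dev is left out on purpose.

```python
def build_query(data_generation_id):
    """Build the filter used in the comment above."""
    return {"was_informed_by": data_generation_id}


def count_executions(collection, data_generation_id):
    """Count workflow execution records tied to one data generation ID.

    `collection` is assumed to behave like a pymongo Collection,
    i.e. it provides count_documents(filter).
    """
    return collection.count_documents(build_query(data_generation_id))
```

A count of 0 for nmdc:omprc-11-qww9t685 would match what was reported here.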
Run on Dev:
test 1 (from before the Jan NERSC maintenance): nmdc:omprc-11-hcjd7y28
Scheduler Log:
Watcher Log:
Job Failed:
test 2: nmdc:omprc-11-rejajm76
Scheduler Log:
Watcher Log:
Test 2 also failed:
Cromwell exec path:
@mbthornton-lbl What does the Cromwell API report about 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d? On the file system, the exit code for the last task is 0, so I would expect the status of this job to be successful in our logging.
@aclum Cromwell reports {"status":"Failed","id":"7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d"}. Here is the response from https://nmdc-cromwell.freeddns.org:8443/api/workflows/v1/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/logs
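A rough sketch of how the status and logs lookups quoted above could be scripted. The base URL and the `/logs` path come from this thread; using the `/status` endpoint for the status payload is my assumption about how that JSON was obtained, and the function names are illustrative. Any authentication the server needs is not handled here.

```python
import json
from urllib.request import urlopen

# Base URL taken from the comment above.
CROMWELL_BASE = "https://nmdc-cromwell.freeddns.org:8443"


def workflow_url(base, workflow_id, endpoint):
    """Build a Cromwell workflow URL, e.g. endpoint='status' or 'logs'."""
    return f"{base.rstrip('/')}/api/workflows/v1/{workflow_id}/{endpoint}"


def parse_status(body):
    """Extract (id, status) from a status response body."""
    payload = json.loads(body)
    return payload["id"], payload["status"]


def fetch_status(base, workflow_id, timeout=30):
    """Query the status endpoint (assumes no auth is required)."""
    with urlopen(workflow_url(base, workflow_id, "status"), timeout=timeout) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

Comparing this status with the task-level exit codes on disk is what exposed the mismatch discussed in this thread.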
@aclum Here are the submission details now being logged:
The tasks have a zero exit code, but the final workflow-level checks reveal an issue with the WDL in the finish_rqc step: the links to the data don't resolve.
"submission": "2025-02-10T18:21:55.908Z",
(nersc-python) nmdcda@perlmutter:login32:/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution> ls -ltr
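If the unresolved links are filesystem symlinks (an assumption; the listing is not shown here), a small helper like the following could list the dangling ones in an execution directory. `find_dangling_links` is a made-up name for this sketch.

```python
from pathlib import Path


def find_dangling_links(directory):
    """Return symlinks under `directory` whose targets do not exist."""
    dangling = []
    for path in sorted(Path(directory).rglob("*")):
        # is_symlink() inspects the link itself; exists() follows it to the target.
        if path.is_symlink() and not path.exists():
            dangling.append(path)
    return dangling
```

Running it against the call-finish_rqc/execution directory would show which outputs point at missing files, even though the task itself exited 0.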
The action here is to update the Cromwell config, restart Cromwell, and submit a new test dataset (see the query in the description of this ticket to get the next test ID).
Cromwell / Condor services restarted - submitted next NEON sample:
Watcher Log:
Updated scheduler logging and restarted on
Watcher Finds and Submits the 4 scheduled jobs:
There is no Cromwell job folder, so this is likely a mismatch between the WDL and the inputs JSON made by workflow automation. @mbthornton-lbl please take over #298 and sort out passing the boolean flag to the inputs.json so it matches the WDL.
If you get a printout of the inputs.json, you can use womtool or jaws validate to check that an inputs JSON file is valid against a given WDL.
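As a quick pre-check before running a real validator, the key names can be compared directly. This is only a sketch: it does not parse the WDL, so the list of declared input names has to come from somewhere else (for example the template that `womtool inputs` prints), and the input names in the usage example below are hypothetical.

```python
import json


def compare_input_keys(inputs_json_text, declared_inputs):
    """Compare the keys of an inputs JSON against the WDL's declared input names.

    Returns keys the WDL does not declare ('unexpected') and declared names
    absent from the JSON ('missing'). Optional WDL inputs can legitimately
    show up under 'missing'.
    """
    supplied = set(json.loads(inputs_json_text))
    declared = set(declared_inputs)
    return {
        "unexpected": sorted(supplied - declared),
        "missing": sorted(declared - supplied),
    }


# Hypothetical example: the automation writes 'shortRead' but the WDL declares 'short_read'.
example = compare_input_keys(
    '{"nmdc_rqcfilter.proj": "nmdc:xyz", "nmdc_rqcfilter.shortRead": false}',
    ["nmdc_rqcfilter.proj", "nmdc_rqcfilter.short_read"],
)
```

A non-empty "unexpected" list is the kind of mismatch that would keep Cromwell from ever creating a job folder; womtool or jaws validate remains the authoritative check.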
Start with the following Data Gen ID:
test 1 (from before the Jan NERSC maintenance): nmdc:omprc-11-hcjd7y28
test 2: nmdc:omprc-11-rejajm76
test 3: nmdc:omprc-11-qww9t685
If more test IDs are needed, use the next record from this aggregation query: