Run NEON sample end to end in Dev #355

Open
mbthornton-lbl opened this issue Jan 21, 2025 · 17 comments · Fixed by #382
@mbthornton-lbl
Contributor

mbthornton-lbl commented Jan 21, 2025

Start with the following Data Gen ID:
test 1 (from before the Jan NERSC maintenance): nmdc:omprc-11-hcjd7y28
test 2: nmdc:omprc-11-rejajm76
test 3: nmdc:omprc-11-qww9t685

If more test IDs are needed, use the next record from this aggregation query:

db.getCollection('data_generation_set').aggregate(
  [
    {
      $match: {
        associated_studies:
          'nmdc:sty-11-34xj1150',
        analyte_category: 'metagenome'
      }
    },
    {
      $lookup: {
        from: 'workflow_execution_set',
        localField: 'id',
        foreignField: 'was_informed_by',
        as: 'workflow_execution_set'
      }
    },
    {
      $lookup: {
        from: 'data_object_set',
        localField: 'has_output',
        foreignField: 'id',
        as: 'do_output'
      }
    },
    {
      $match: {
        'do_output.in_manifest': {
          $exists: false
        }
      }
    },
    {
      $match: {
        workflow_execution_set: { $size: 0 }
      }
    }
  ],
  { maxTimeMS: 60000, allowDiskUse: true }
);
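
For anyone who prefers to run the query from Python rather than the Mongo shell, the following is a minimal sketch of the same five-stage pipeline. It assumes `db` is a pymongo `Database` handle for the dev instance; the helper names `build_pipeline` and `candidate_ids` are hypothetical and not part of the automation code.

```python
from typing import Any, Dict, Iterator, List


def build_pipeline(study_id: str, analyte_category: str = "metagenome") -> List[Dict[str, Any]]:
    """Return the same five stages as the shell query above."""
    return [
        # Data generation records for the study and analyte category.
        {"$match": {"associated_studies": study_id,
                    "analyte_category": analyte_category}},
        # Attach any workflow executions informed by each record.
        {"$lookup": {"from": "workflow_execution_set",
                     "localField": "id",
                     "foreignField": "was_informed_by",
                     "as": "workflow_execution_set"}},
        # Attach the data objects each record outputs.
        {"$lookup": {"from": "data_object_set",
                     "localField": "has_output",
                     "foreignField": "id",
                     "as": "do_output"}},
        # Keep records whose output data objects are not in a manifest.
        {"$match": {"do_output.in_manifest": {"$exists": False}}},
        # Keep records that have no workflow executions yet.
        {"$match": {"workflow_execution_set": {"$size": 0}}},
    ]


def candidate_ids(db: Any, study_id: str) -> Iterator[str]:
    """Yield data generation IDs; `db` is assumed to be a pymongo Database."""
    cursor = db["data_generation_set"].aggregate(
        build_pipeline(study_id), maxTimeMS=60000, allowDiskUse=True
    )
    for record in cursor:
        yield record["id"]
```

Calling `next(candidate_ids(db, "nmdc:sty-11-34xj1150"))` would return the first unprocessed record, which mirrors taking "the next record" from the shell output.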
@aclum
Contributor

aclum commented Jan 22, 2025

@mbthornton-lbl this failed; there is no output directory. The Cromwell run directory is 41ec9955-f8b4-4676-86c3-f2714a67b016

@aclum
Contributor

aclum commented Jan 22, 2025

The root cause is mismatched input key names between the WDL and workflows.yaml.

@aclum
Contributor

aclum commented Feb 3, 2025

@vlilanl @mbthornton-lbl please check on this, there are no records {'was_informed_by': 'nmdc:omprc-11-qww9t685'} in workflow_execution_set on mongo dev.

@mbthornton-lbl
Contributor Author

mbthornton-lbl commented Feb 7, 2025

Run on Dev: test 1 (from before the Jan NERSC maintenance): nmdc:omprc-11-hcjd7y28

Scheduler Log:

2025-02-07 17:55:50,751 INFO: Initializing Scheduler
2025-02-07 17:55:57,373 INFO: Found 1 new jobs for nmdc:omprc-11-hcjd7y28
2025-02-07 17:56:01,145 INFO: JOB RECORD: nmdc:cab6b37c-e57c-11ef-8a5b-8aabdb8b4ad6

Watcher Log:

2025-02-07 09:49:13,349 INFO: Entering polling loop
2025-02-07 09:56:16,260 INFO: Found 1 unclaimed jobs.
2025-02-07 09:56:16,260 INFO: Claiming job nmdc:cab6b37c-e57c-11ef-8a5b-8aabdb8b4ad6
2025-02-07 09:56:19,895 INFO: Prepare and cache new job: nmdc:sys010cwj211
2025-02-07 09:56:21,439 INFO: Submitted job 64aefe25-1b8a-4c77-8d13-3fa37839bad4
2025-02-07 09:56:21,439 INFO: Job 64aefe25-1b8a-4c77-8d13-3fa37839bad4 submitted

Job Failed:

2025-02-08 00:30:52,394 INFO: Found 1 failed jobs.
2025-02-08 00:30:52,506 INFO: Processing failed job: nmdc:sys010cwj211, nmdc:wfrqc-12-kss1s587.1
2025-02-08 00:30:52,506 WARNING: Job nmdc:sys010cwj211 failed 2 times. Retrying.
2025-02-08 00:30:53,883 INFO: Submitted job dd776a12-1817-49ad-b85e-bc375841377c
2025-02-08 00:30:53,883 INFO: Job dd776a12-1817-49ad-b85e-bc375841377c submitted
2025-02-08 00:31:54,173 INFO: Found 1 failed jobs.
2025-02-08 00:31:54,173 INFO: Processing failed job: nmdc:sys010cwj211, nmdc:wfrqc-12-kss1s587.1
2025-02-08 00:31:54,173 ERROR: Job nmdc:sys010cwj211 failed 2 times. Skipping.

@mbthornton-lbl
Contributor Author

test 2: nmdc:omprc-11-rejajm76

Scheduler Log:

2025-02-10 18:20:42,304 INFO: Initializing Scheduler
2025-02-10 18:20:48,594 INFO: Found 1 new jobs for nmdc:omprc-11-rejajm76
2025-02-10 18:20:51,066 INFO: JOB RECORD: nmdc:c203e860-e7db-11ef-aa40-8aabdb8b4ad6

Watcher Log:

2025-02-10 10:21:50,834 INFO: Claiming job nmdc:c203e860-e7db-11ef-aa40-8aabdb8b4ad6
2025-02-10 10:21:54,400 INFO: Prepare and cache new job: nmdc:sys04tw0gn52
2025-02-10 10:21:55,929 INFO: Submitted job 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d
2025-02-10 10:21:55,929 INFO: Job 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d submitted

@mbthornton-lbl
Contributor Author

Test 2 Also Failed:

2025-02-10 10:21:50,833 INFO: Found 1 unclaimed jobs.
2025-02-10 10:21:50,834 INFO: Claiming job nmdc:c203e860-e7db-11ef-aa40-8aabdb8b4ad6
2025-02-10 10:21:54,400 INFO: Prepare and cache new job: nmdc:sys04tw0gn52
2025-02-10 10:21:55,929 INFO: Submitted job 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d
2025-02-10 10:21:55,929 INFO: Job 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d submitted
2025-02-10 12:40:01,977 INFO: Found 1 failed jobs.
2025-02-10 12:40:01,977 INFO: Processing failed job: nmdc:sys04tw0gn52, nmdc:wfrqc-12-ktp25807.1
2025-02-10 12:40:01,978 WARNING: Job nmdc:sys04tw0gn52 failed 2 times. Retrying.
2025-02-10 12:40:03,431 INFO: Submitted job c4609891-709c-4d2e-a9ea-cff0ab435fa4
2025-02-10 12:40:03,431 INFO: Job c4609891-709c-4d2e-a9ea-cff0ab435fa4 submitted
2025-02-10 12:42:04,064 INFO: Found 1 failed jobs.
2025-02-10 12:42:04,064 INFO: Processing failed job: nmdc:sys04tw0gn52, nmdc:wfrqc-12-ktp25807.1
2025-02-10 12:42:04,064 ERROR: Job nmdc:sys04tw0gn52 failed 2 times. Skipping.

Cromwell exec path:
/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d

@aclum
Contributor

aclum commented Feb 12, 2025

@mbthornton-lbl What does the cromwell API report about 7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d? On the file system the exit code for the last task is 0 so I would expect the status of this job to be successful in our logging.

/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution> cat rc
0

@mbthornton-lbl
Contributor Author

mbthornton-lbl commented Feb 12, 2025

@aclum {"status":"Failed","id":"7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d"}

Here is the response from https://nmdc-cromwell.freeddns.org:8443/api/workflows/v1/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/logs

{
  "calls": {
    "nmdc_rqcfilter.stage": [
      {
        "stderr": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-stage/execution/stderr",
        "stdout": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-stage/execution/stdout",
        "attempt": 1,
        "shardIndex": -1
      }
    ],
    "nmdc_rqcfilter.qc": [
      {
        "stderr": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-qc/execution/stderr",
        "stdout": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-qc/execution/stdout",
        "attempt": 1,
        "shardIndex": -1
      }
    ],
    "nmdc_rqcfilter.make_info_file": [
      {
        "stderr": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-make_info_file/execution/stderr",
        "stdout": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-make_info_file/execution/stdout",
        "attempt": 1,
        "shardIndex": -1
      }
    ],
    "nmdc_rqcfilter.finish_rqc": [
      {
        "stderr": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution/stderr",
        "stdout": "/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution/stdout",
        "attempt": 1,
        "shardIndex": -1
      }
    ]
  },
  "id": "7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d"
}
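
The status and logs responses above come from Cromwell's per-workflow REST endpoints. A minimal sketch for fetching them from Python follows; it uses only the standard library, assumes the server needs no extra authentication, and the helper names `workflow_url` and `fetch` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Per-workflow endpoints used in this thread.
ENDPOINTS = {"status", "logs", "metadata"}


def workflow_url(base_url: str, workflow_id: str, endpoint: str) -> str:
    """Build the URL for one of the per-workflow Cromwell endpoints."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unsupported endpoint: {endpoint}")
    return f"{base_url.rstrip('/')}/api/workflows/v1/{workflow_id}/{endpoint}"


def fetch(base_url: str, workflow_id: str, endpoint: str,
          timeout: float = 30.0) -> Dict[str, Any]:
    """GET the endpoint and decode the JSON body."""
    url = workflow_url(base_url, workflow_id, endpoint)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

The `logs` endpoint only lists stdout and stderr paths per call; the `metadata` endpoint is the one that carries the workflow-level failure messages.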

@mbthornton-lbl
Contributor Author

@aclum Here are the submission details now being logged:

2025-02-12 12:31:35,443 INFO: Entering polling loop
2025-02-12 12:33:37,055 INFO: Found 1 unclaimed jobs.
2025-02-12 12:33:37,055 INFO: Claiming job nmdc:86338970-e980-11ef-ba19-9215da79eba9
2025-02-12 12:33:40,620 INFO: Prepare and cache new job: nmdc:sys03rfws445
2025-02-12 12:33:41,824 INFO: WDL file: /tmp/tmpqb7f1wgi.wdl
2025-02-12 12:33:41,824 INFO: Bundle file: /tmp/tmptefbsh1o.zip
2025-02-12 12:33:41,824 INFO: Workflow inputs:
2025-02-12 12:33:41,824 INFO: {
  "nmdc_rqcfilter.proj": "nmdc:wfrqc-12-7fh2nz04.1",
  "nmdc_rqcfilter.input_fastq1": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2023/BMI_HCNKKBGX5_mms_R1/BMI_HCNKKBGX5_Plate8WellE10_R1.fastq.gz",
  "nmdc_rqcfilter.input_fastq2": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2023/BMI_HCNKKBGX5_mms_R2/BMI_HCNKKBGX5_Plate8WellE10_R2.fastq.gz"
}
2025-02-12 12:33:41,824 INFO: Workflow labels:
2025-02-12 12:33:41,824 INFO: {
  "release": "v1.0.14",
  "wdl": "interleave_rqcfilter.wdl",
  "git_repo": "https://github.com/microbiomedata/ReadsQC",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.14",
  "pipeline": "interleave_rqcfilter.wdl",
  "activity_id": "nmdc:wfrqc-12-7fh2nz04.1",
  "opid": "nmdc:sys03rfws445"
}
2025-02-12 12:33:41,852 INFO: Submitted job e9507189-05ff-4caf-9559-26eb7ece5ebd
2025-02-12 12:33:41,852 INFO: Metadata:
2025-02-12 12:33:41,852 INFO: {
  "id": "e9507189-05ff-4caf-9559-26eb7ece5ebd",
  "status": "Submitted"
}

@mbthornton-lbl mbthornton-lbl linked a pull request Feb 12, 2025 that will close this issue
@github-project-automation github-project-automation bot moved this from In Progress to Done in 2025 - Sprint 56 - Feb 10-21,2025 Feb 13, 2025
@mbthornton-lbl mbthornton-lbl moved this from Done to In Progress in 2025 - Sprint 56 - Feb 10-21,2025 Feb 13, 2025
@aclum aclum reopened this Feb 13, 2025
@aclum
Contributor

aclum commented Feb 13, 2025

The tasks have a zero exit code, but the final workflow-level checks reveal an issue with the WDL in the finish_rqc step: the links to the data don't resolve.
Relevant parts of the response from https://nmdc-cromwell.freeddns.org:8443/api/workflows/v1/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/metadata:

"submission": "2025-02-10T18:21:55.908Z",
"status": "Failed",
"failures": [
  {
    "causedBy": [
      {
        "message": "Could not process output, file not found: /pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution/nmdc_wfrqc-12-ktp25807.1_filtered.fastq.gz",
        "causedBy": []
      }
    ],
    "message": "Workflow failed"
  }
],
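
Because the useful message is nested under `causedBy`, a small helper can pull out the innermost causes from a metadata response. This is a sketch assuming the `failures` / `causedBy` / `message` structure shown in the excerpt above; the function name `failure_messages` is hypothetical.

```python
from typing import Any, Dict, List


def failure_messages(metadata: Dict[str, Any]) -> List[str]:
    """Collect the innermost failure messages from Cromwell workflow metadata."""
    messages: List[str] = []

    def walk(failures: List[Dict[str, Any]]) -> None:
        for failure in failures:
            causes = failure.get("causedBy", [])
            if causes:
                # Descend until a failure has no further causes.
                walk(causes)
            else:
                messages.append(failure.get("message", ""))

    walk(metadata.get("failures", []))
    return messages
```

For the response above this yields the single "Could not process output, file not found" message rather than the generic "Workflow failed".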

(nersc-python) nmdcda@perlmutter:login32:/pscratch/sd/n/nmdcda/cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/execution> ls -ltr
total 52
-rw-r--r-- 1 nmdcda nmdcda 816 Feb 10 12:39 submitFile
-rw-r--r-- 1 nmdcda nmdcda 57 Feb 10 12:39 stdout.submit
-rw-r--r-- 1 nmdcda nmdcda 0 Feb 10 12:39 stderr.submit
-rw-r--r-- 1 nmdcda nmdcda 2230 Feb 10 12:39 script.submit
-rwxr-xr-x 1 nmdcda nmdcda 2963 Feb 10 12:39 script
-rwxr-xr-x 1 nmdcda nmdcda 1814 Feb 10 12:39 dockerScript
-rw-r--r-- 1 nmdcda nmdcda 0 Feb 10 12:39 stdout
-rw-r--r-- 1 nmdcda nmdcda 0 Feb 10 12:39 stderr
-rw-r--r-- 1 nmdcda nmdcda 140 Feb 10 12:39 stats.json
-rw-r--r-- 1 nmdcda nmdcda 2 Feb 10 12:39 rc
-rw-r--r-- 1 nmdcda nmdcda 140 Feb 10 12:39 nmdc_wfrqc-12-ktp25807.1_qa_stats.json
lrwxrwxrwx 1 nmdcda nmdcda 123 Feb 10 12:39 nmdc_wfrqc-12-ktp25807.1_filterStats.txt -> /cromwell-executions/nmdc_rqcfilter/7b84f6e2-7a5f-4741-8cdd-aca4a7117c3d/call-finish_rqc/inputs/-1838980290/filterStats.txt
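
The listing shows a symlink whose target starts with `/cromwell-executions/`, a path that may exist only inside the container. One way to confirm which links fail to resolve on the host is a check like the following; it is a sketch, and the function name `dangling_symlinks` is hypothetical.

```python
import os
from typing import List


def dangling_symlinks(directory: str) -> List[str]:
    """Return names of symlinks in `directory` whose targets do not exist."""
    broken: List[str] = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        # islink() inspects the link itself; exists() follows it to the target.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(entry)
    return broken
```

Running it against the `call-finish_rqc/execution` directory from the host would list every output link that Cromwell cannot follow when it collects outputs.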

@aclum aclum moved this from In Progress to Blocked in 2025 - Sprint 56 - Feb 10-21,2025 Feb 13, 2025
@aclum
Contributor

aclum commented Feb 14, 2025

The action here is to update the Cromwell config, restart Cromwell, and submit a new test dataset (see the query in the description of this ticket to get the next test ID).

@aclum aclum moved this from Blocked to In Progress in 2025 - Sprint 56 - Feb 10-21,2025 Feb 14, 2025
@mbthornton-lbl
Contributor Author

Cromwell / Condor services restarted - submitted next NEON sample:

2025-02-14 23:31:51,247 INFO: Allowing: nmdc:omprc-11-0g272j90
2025-02-14 23:31:51,247 INFO: Starting Scheduler
2025-02-14 23:31:57,757 INFO: Found 1 workflow process nodes
2025-02-14 23:31:57,771 INFO: Creating a job Reads QC Interleave:v1.0.14 for nmdc:omprc-11-0g272j90
2025-02-14 23:31:57,771 INFO: Found 1 new jobs for nmdc:omprc-11-0g272j90
2025-02-14 23:31:57,771 INFO: new job: informed_by: nmdc:omprc-11-0g272j90 trigger: nmdc:omprc-11-0g272j90 wf: Reads QC Interleave ver: v1.0.14
2025-02-14 23:31:59,892 INFO: JOB RECORD: nmdc:e3279fc0-eb2b-11ef-93b4-4afe4a5e2512

@mbthornton-lbl
Contributor Author

Watcher Log:

2025-02-14 15:25:35,300 INFO: Status 'Failed': 10 job(s)
2025-02-14 15:25:35,300 INFO: Status 'Unknown': 2 job(s)
2025-02-14 15:25:35,300 INFO: Status 'Running': 1 job(s)
2025-02-14 15:25:35,300 INFO: Adding 13 new jobs from state file.
2025-02-14 15:25:36,688 INFO: Entering polling loop
2025-02-14 15:32:39,416 INFO: Job from State: nmdc:omprc-11-0g272j90 / nmdc:wfrqc-12-7btzkt48.1, Last Status: Unknown /nmdc:sys0f583f738 / nmdc:e3279fc0-eb2b-11ef-93b4-4afe4a5e2512
2025-02-14 15:32:39,416 INFO: Status 'Unknown': 1 job(s)
2025-02-14 15:32:39,416 INFO: Adding 1 new jobs from state file.
2025-02-14 15:36:41,736 INFO: Found 2 unclaimed jobs.
2025-02-14 15:36:41,736 INFO: Claiming job nmdc:899553c0-eb2c-11ef-883a-4afe4a5e2512
2025-02-14 15:36:44,677 INFO: Prepare and cache new job: nmdc:sys0a9hy1517
2025-02-14 15:36:45,841 INFO: WDL file: /tmp/tmpn58fb40g.wdl
2025-02-14 15:36:45,841 INFO: Bundle file: /tmp/tmpal8kekff.zip
2025-02-14 15:36:45,841 INFO: Workflow inputs:
2025-02-14 15:36:45,841 INFO: {
  "nmdc_rqcfilter.proj": "nmdc:wfrqc-12-3f745454.1",
  "nmdc_rqcfilter.input_fastq1": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2021/BMI_HVNV5BGXJ_mms_R1/BMI_HVNV5BGXJ_20S_08_0734_R1.fastq.gz",
  "nmdc_rqcfilter.input_fastq2": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2021/BMI_HVNV5BGXJ_mms_R2/BMI_HVNV5BGXJ_20S_08_0734_R2.fastq.gz"
}
2025-02-14 15:36:45,841 INFO: Workflow labels:
2025-02-14 15:36:45,841 INFO: {
  "release": "v1.0.14",
  "wdl": "interleave_rqcfilter.wdl",
  "git_repo": "https://github.com/microbiomedata/ReadsQC",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.14",
  "pipeline": "interleave_rqcfilter.wdl",
  "activity_id": "nmdc:wfrqc-12-3f745454.1",
  "opid": "nmdc:sys0a9hy1517"
}
2025-02-14 15:36:45,880 INFO: Submitted job d75255fc-eacd-4cd2-a410-232ea65c1c20
2025-02-14 15:36:45,880 INFO: Metadata:
2025-02-14 15:36:45,880 INFO: {
  "id": "d75255fc-eacd-4cd2-a410-232ea65c1c20",
  "status": "Submitted"
}
2025-02-14 15:36:45,880 INFO: Job d75255fc-eacd-4cd2-a410-232ea65c1c20 submitted
2025-02-14 15:36:45,880 INFO: Claiming job nmdc:8a6cb932-eb2c-11ef-883a-4afe4a5e2512
2025-02-14 15:36:48,799 INFO: Prepare and cache new job: nmdc:sys0vmpept25
2025-02-14 15:36:49,798 INFO: WDL file: /tmp/tmprtbkj9ps.wdl
2025-02-14 15:36:49,798 INFO: Bundle file: /tmp/tmp8fmw6wb6.zip
2025-02-14 15:36:49,798 INFO: Workflow inputs:
2025-02-14 15:36:49,798 INFO: {
  "nmdc_rqcfilter.proj": "nmdc:wfrqc-12-0dysns96.1",
  "nmdc_rqcfilter.input_fastq1": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2022/HHM55BGXM_R1/BMI_21S_23_2083_mms_HHM55BGXM_R1.fastq.gz",
  "nmdc_rqcfilter.input_fastq2": "https://storage.neonscience.org/neon-microbial-raw-seq-files/2022/HHM55BGXM_R2/BMI_21S_23_2083_mms_HHM55BGXM_R2.fastq.gz"
}
2025-02-14 15:36:49,798 INFO: Workflow labels:
2025-02-14 15:36:49,798 INFO: {
  "release": "v1.0.14",
  "wdl": "interleave_rqcfilter.wdl",
  "git_repo": "https://github.com/microbiomedata/ReadsQC",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.14",
  "pipeline": "interleave_rqcfilter.wdl",
  "activity_id": "nmdc:wfrqc-12-0dysns96.1",
  "opid": "nmdc:sys0vmpept25"
}
2025-02-14 15:36:49,833 INFO: Submitted job 72f2e0e8-e6e8-4b5a-8258-a4c5a2914578
2025-02-14 15:36:49,834 INFO: Metadata:
2025-02-14 15:36:49,834 INFO: {
  "id": "72f2e0e8-e6e8-4b5a-8258-a4c5a2914578",
  "status": "Submitted"
}
2025-02-14 15:36:49,834 INFO: Job 72f2e0e8-e6e8-4b5a-8258-a4c5a2914578 submitted

@mbthornton-lbl
Contributor Author

Updated scheduler logging and restarted on nmdc-dev:

2025-02-18 18:55:46,138 INFO: Initializing Scheduler
2025-02-18 18:55:46,159 INFO: Reading Allowlist
2025-02-18 18:55:46,160 INFO: Read 3 items
2025-02-18 18:55:46,160 INFO: Allowing: nmdc:omprc-11-0g272j90
2025-02-18 18:55:46,160 INFO: Allowing: nmdc:omprc-11-57xww552
2025-02-18 18:55:46,160 INFO: Allowing: nmdc:omprc-11-769ab655
2025-02-18 18:55:46,160 INFO: Starting Scheduler
2025-02-18 18:55:52,615 INFO: Found workflow process node nmdc:omprc-11-0g272j90
2025-02-18 18:55:52,615 INFO: Found workflow process node nmdc:wfrqc-12-7btzkt48.1
2025-02-18 18:55:52,615 INFO: Found workflow process node nmdc:omprc-11-769ab655
2025-02-18 18:55:52,615 INFO: Found workflow process node nmdc:omprc-11-57xww552
2025-02-18 18:55:52,615 INFO: Found workflow process node nmdc:wfrqc-12-3f745454.1
2025-02-18 18:55:52,629 INFO: Skipping existing job fornmdc:omprc-11-0g272j90 Reads QC Interleave:v1.0.14
2025-02-18 18:55:52,640 INFO: Creating a job Readbased Analysis:v1.0.9 for nmdc:wfrqc-12-7btzkt48.1
2025-02-18 18:55:52,651 INFO: Creating a job Metagenome Assembly:v1.0.7 for nmdc:wfrqc-12-7btzkt48.1
2025-02-18 18:55:52,651 INFO: Found 2 new jobs for nmdc:wfrqc-12-7btzkt48.1
2025-02-18 18:55:52,651 INFO: new job: informed_by: nmdc:omprc-11-0g272j90 trigger: nmdc:wfrqc-12-7btzkt48.1 wf: Readbased Analysis ver: v1.0.9
2025-02-18 18:55:57,448 INFO: JOB RECORD: nmdc:fcd24ad2-ee29-11ef-89dd-1680ce44ff41
2025-02-18 18:55:57,587 INFO: new job: informed_by: nmdc:omprc-11-0g272j90 trigger: nmdc:wfrqc-12-7btzkt48.1 wf: Metagenome Assembly ver: v1.0.7
2025-02-18 18:55:59,536 INFO: JOB RECORD: nmdc:fe10da9e-ee29-11ef-89dd-1680ce44ff41
2025-02-18 18:55:59,540 INFO: Skipping existing job fornmdc:omprc-11-769ab655 Reads QC Interleave:v1.0.14
2025-02-18 18:55:59,540 INFO: Skipping existing job fornmdc:omprc-11-57xww552 Reads QC Interleave:v1.0.14
2025-02-18 18:55:59,540 INFO: Creating a job Readbased Analysis:v1.0.9 for nmdc:wfrqc-12-3f745454.1
2025-02-18 18:55:59,541 INFO: Creating a job Metagenome Assembly:v1.0.7 for nmdc:wfrqc-12-3f745454.1
2025-02-18 18:55:59,541 INFO: Found 2 new jobs for nmdc:wfrqc-12-3f745454.1
2025-02-18 18:55:59,541 INFO: new job: informed_by: nmdc:omprc-11-57xww552 trigger: nmdc:wfrqc-12-3f745454.1 wf: Readbased Analysis ver: v1.0.9
2025-02-18 18:56:01,918 INFO: JOB RECORD: nmdc:ff7c610a-ee29-11ef-89dd-1680ce44ff41
2025-02-18 18:56:01,922 INFO: new job: informed_by: nmdc:omprc-11-57xww552 trigger: nmdc:wfrqc-12-3f745454.1 wf: Metagenome Assembly ver: v1.0.7
2025-02-18 18:56:03,501 INFO: JOB RECORD: nmdc:006dd81e-ee2a-11ef-89dd-1680ce44ff41
2025-02-18 18:57:06,243 INFO: Skipping existing job fornmdc:wfrqc-12-7btzkt48.1 Readbased Analysis:v1.0.9
2025-02-18 18:57:06,254 INFO: Skipping existing job fornmdc:wfrqc-12-7btzkt48.1 Metagenome Assembly:v1.0.7
2025-02-18 18:57:06,254 INFO: Skipping existing job fornmdc:wfrqc-12-3f745454.1 Readbased Analysis:v1.0.9
2025-02-18 18:57:06,254 INFO: Skipping existing job fornmdc:wfrqc-12-3f745454.1 Metagenome Assembly:v1.0.7

@mbthornton-lbl
Contributor Author

Watcher Finds and Submits the 4 scheduled jobs:

2025-02-18 10:56:24,519 INFO: Found 4 unclaimed jobs.
2025-02-18 10:56:24,521 INFO: Claiming job nmdc:fcd24ad2-ee29-11ef-89dd-1680ce44ff41
2025-02-18 10:56:27,684 INFO: Prepare and cache new job: nmdc:sys0p5m0ch17
2025-02-18 10:56:29,081 INFO: WDL file: /tmp/tmpthqbhx6i.wdl
2025-02-18 10:56:29,081 INFO: Bundle file: /tmp/tmpqyouhe2g.zip
2025-02-18 10:56:29,081 INFO: Workflow inputs:
2025-02-18 10:56:29,081 INFO: {
  "ReadbasedAnalysis.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-0g272j90/nmdc:wfrqc-12-7btzkt48.1/nmdc_wfrqc-12-7btzkt48.1_filtered.fastq.gz",
  "ReadbasedAnalysis.proj": "nmdc:wfrbt-12-yw16js60.1"
}
2025-02-18 10:56:29,081 INFO: Workflow labels:
2025-02-18 10:56:29,081 INFO: {
  "release": "v1.0.9",
  "wdl": "ReadbasedAnalysis.wdl",
  "git_repo": "https://github.com/microbiomedata/ReadbasedAnalysis",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.9",
  "pipeline": "ReadbasedAnalysis.wdl",
  "activity_id": "nmdc:wfrbt-12-yw16js60.1",
  "opid": "nmdc:sys0p5m0ch17"
}
2025-02-18 10:56:29,131 INFO: Submitted job d5386a59-9890-495d-80a7-d06548ec86a3
2025-02-18 10:56:29,131 INFO: Metadata:
2025-02-18 10:56:29,131 INFO: {
  "id": "d5386a59-9890-495d-80a7-d06548ec86a3",
  "status": "Submitted"
}
2025-02-18 10:56:29,131 INFO: Job d5386a59-9890-495d-80a7-d06548ec86a3 submitted
2025-02-18 10:56:29,132 INFO: Claiming job nmdc:fe10da9e-ee29-11ef-89dd-1680ce44ff41
2025-02-18 10:56:31,952 INFO: Prepare and cache new job: nmdc:sys061xjm629
2025-02-18 10:56:33,311 INFO: WDL file: /tmp/tmpwi4odjv9.wdl
2025-02-18 10:56:33,311 INFO: Bundle file: /tmp/tmpw7tucsep.zip
2025-02-18 10:56:33,311 INFO: Workflow inputs:
2025-02-18 10:56:33,311 INFO: {
  "jgi_metaASM.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-0g272j90/nmdc:wfrqc-12-7btzkt48.1/nmdc_wfrqc-12-7btzkt48.1_filtered.fastq.gz",
  "jgi_metaASM.rename_contig_prefix": "nmdc:wfmgas-12-1py2fw62.1",
  "jgi_metaASM.proj": "nmdc:wfmgas-12-1py2fw62.1"
}
2025-02-18 10:56:33,311 INFO: Workflow labels:
2025-02-18 10:56:33,311 INFO: {
  "release": "v1.0.7",
  "wdl": "jgi_assembly.wdl",
  "git_repo": "https://github.com/microbiomedata/metaAssembly",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.7",
  "pipeline": "jgi_assembly.wdl",
  "activity_id": "nmdc:wfmgas-12-1py2fw62.1",
  "opid": "nmdc:sys061xjm629"
}
2025-02-18 10:56:33,360 INFO: Submitted job 68ae6ada-f346-4210-a750-8638cd37a4bf
2025-02-18 10:56:33,360 INFO: Metadata:
2025-02-18 10:56:33,360 INFO: {
  "id": "68ae6ada-f346-4210-a750-8638cd37a4bf",
  "status": "Submitted"
}
2025-02-18 10:56:33,360 INFO: Job 68ae6ada-f346-4210-a750-8638cd37a4bf submitted
2025-02-18 10:56:33,360 INFO: Claiming job nmdc:ff7c610a-ee29-11ef-89dd-1680ce44ff41
2025-02-18 10:56:36,276 INFO: Prepare and cache new job: nmdc:sys0zhmyar16
2025-02-18 10:56:37,421 INFO: WDL file: /tmp/tmpa9kfpxcz.wdl
2025-02-18 10:56:37,421 INFO: Bundle file: /tmp/tmp4c_yp1cc.zip
2025-02-18 10:56:37,421 INFO: Workflow inputs:
2025-02-18 10:56:37,421 INFO: {
  "ReadbasedAnalysis.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-57xww552/nmdc:wfrqc-12-3f745454.1/nmdc_wfrqc-12-3f745454.1_filtered.fastq.gz",
  "ReadbasedAnalysis.proj": "nmdc:wfrbt-12-78gmh062.1"
}
2025-02-18 10:56:37,421 INFO: Workflow labels:
2025-02-18 10:56:37,421 INFO: {
  "release": "v1.0.9",
  "wdl": "ReadbasedAnalysis.wdl",
  "git_repo": "https://github.com/microbiomedata/ReadbasedAnalysis",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.9",
  "pipeline": "ReadbasedAnalysis.wdl",
  "activity_id": "nmdc:wfrbt-12-78gmh062.1",
  "opid": "nmdc:sys0zhmyar16"
}
2025-02-18 10:56:37,456 INFO: Submitted job 78684acd-49d3-4b72-8df4-c9857671aeda
2025-02-18 10:56:37,456 INFO: Metadata:
2025-02-18 10:56:37,456 INFO: {
  "id": "78684acd-49d3-4b72-8df4-c9857671aeda",
  "status": "Submitted"
}
2025-02-18 10:56:37,456 INFO: Job 78684acd-49d3-4b72-8df4-c9857671aeda submitted
2025-02-18 10:56:37,456 INFO: Claiming job nmdc:006dd81e-ee2a-11ef-89dd-1680ce44ff41
2025-02-18 10:56:40,309 INFO: Prepare and cache new job: nmdc:sys0xkv1bn10
2025-02-18 10:56:41,315 INFO: WDL file: /tmp/tmpwb_x7x3d.wdl
2025-02-18 10:56:41,315 INFO: Bundle file: /tmp/tmpg2_wgcwv.zip
2025-02-18 10:56:41,315 INFO: Workflow inputs:
2025-02-18 10:56:41,315 INFO: {
  "jgi_metaASM.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-57xww552/nmdc:wfrqc-12-3f745454.1/nmdc_wfrqc-12-3f745454.1_filtered.fastq.gz",
  "jgi_metaASM.rename_contig_prefix": "nmdc:wfmgas-12-j7g7f116.1",
  "jgi_metaASM.proj": "nmdc:wfmgas-12-j7g7f116.1"
}
2025-02-18 10:56:41,315 INFO: Workflow labels:
2025-02-18 10:56:41,315 INFO: {
  "release": "v1.0.7",
  "wdl": "jgi_assembly.wdl",
  "git_repo": "https://github.com/microbiomedata/metaAssembly",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.7",
  "pipeline": "jgi_assembly.wdl",
  "activity_id": "nmdc:wfmgas-12-j7g7f116.1",
  "opid": "nmdc:sys0xkv1bn10"
}
2025-02-18 10:56:41,344 INFO: Submitted job 1e09fc07-89c2-4d04-b425-7f6197c5965c
2025-02-18 10:56:41,344 INFO: Metadata:
2025-02-18 10:56:41,344 INFO: {
  "id": "1e09fc07-89c2-4d04-b425-7f6197c5965c",
  "status": "Submitted"
}
2025-02-18 10:56:41,344 INFO: Job 1e09fc07-89c2-4d04-b425-7f6197c5965c submitted
2025-02-18 10:57:42,193 INFO: Found 2 failed jobs.
2025-02-18 10:57:42,193 INFO: Processing failed job: nmdc:sys061xjm629, nmdc:wfmgas-12-1py2fw62.1
2025-02-18 10:57:42,195 WARNING: Job nmdc:sys061xjm629 failed 2 times. Retrying.
2025-02-18 10:57:43,191 INFO: WDL file: /tmp/tmp0oq0bu1c.wdl
2025-02-18 10:57:43,191 INFO: Bundle file: /tmp/tmpyz3nzqum.zip
2025-02-18 10:57:43,191 INFO: Workflow inputs:
2025-02-18 10:57:43,191 INFO: {
  "jgi_metaASM.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-0g272j90/nmdc:wfrqc-12-7btzkt48.1/nmdc_wfrqc-12-7btzkt48.1_filtered.fastq.gz",
  "jgi_metaASM.rename_contig_prefix": "nmdc:wfmgas-12-1py2fw62.1",
  "jgi_metaASM.proj": "nmdc:wfmgas-12-1py2fw62.1"
}
2025-02-18 10:57:43,191 INFO: Workflow labels:
2025-02-18 10:57:43,191 INFO: {
  "release": "v1.0.7",
  "wdl": "jgi_assembly.wdl",
  "git_repo": "https://github.com/microbiomedata/metaAssembly",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.7",
  "pipeline": "jgi_assembly.wdl",
  "activity_id": "nmdc:wfmgas-12-1py2fw62.1",
  "opid": "nmdc:sys061xjm629"
}
2025-02-18 10:57:43,223 INFO: Submitted job 1ef281cb-d386-4ebd-a414-0e48d9dfbe3e
2025-02-18 10:57:43,223 INFO: Metadata:
2025-02-18 10:57:43,223 INFO: {
  "id": "1ef281cb-d386-4ebd-a414-0e48d9dfbe3e",
  "status": "Submitted"
}
2025-02-18 10:57:43,223 INFO: Job 1ef281cb-d386-4ebd-a414-0e48d9dfbe3e submitted
2025-02-18 10:57:43,223 INFO: Processing failed job: nmdc:sys0xkv1bn10, nmdc:wfmgas-12-j7g7f116.1
2025-02-18 10:57:43,226 WARNING: Job nmdc:sys0xkv1bn10 failed 2 times. Retrying.
2025-02-18 10:57:44,244 INFO: WDL file: /tmp/tmpzfno3vbd.wdl
2025-02-18 10:57:44,244 INFO: Bundle file: /tmp/tmpr2cyujp4.zip
2025-02-18 10:57:44,244 INFO: Workflow inputs:
2025-02-18 10:57:44,244 INFO: {
  "jgi_metaASM.input_file": "https://data.microbiomedata.org/data/nmdc:omprc-11-57xww552/nmdc:wfrqc-12-3f745454.1/nmdc_wfrqc-12-3f745454.1_filtered.fastq.gz",
  "jgi_metaASM.rename_contig_prefix": "nmdc:wfmgas-12-j7g7f116.1",
  "jgi_metaASM.proj": "nmdc:wfmgas-12-j7g7f116.1"
}
2025-02-18 10:57:44,244 INFO: Workflow labels:
2025-02-18 10:57:44,244 INFO: {
  "release": "v1.0.7",
  "wdl": "jgi_assembly.wdl",
  "git_repo": "https://github.com/microbiomedata/metaAssembly",
  "submitter": "nmdcda",
  "pipeline_version": "v1.0.7",
  "pipeline": "jgi_assembly.wdl",
  "activity_id": "nmdc:wfmgas-12-j7g7f116.1",
  "opid": "nmdc:sys0xkv1bn10"
}
2025-02-18 10:57:44,273 INFO: Submitted job c520c9e5-3e40-4f26-8eb3-0e2c324565a0
2025-02-18 10:57:44,273 INFO: Metadata:
2025-02-18 10:57:44,273 INFO: {
  "id": "c520c9e5-3e40-4f26-8eb3-0e2c324565a0",
  "status": "Submitted"
}
2025-02-18 10:57:44,273 INFO: Job c520c9e5-3e40-4f26-8eb3-0e2c324565a0 submitted
2025-02-18 10:58:44,606 INFO: Found 2 failed jobs.
2025-02-18 10:58:44,606 INFO: Processing failed job: nmdc:sys061xjm629, nmdc:wfmgas-12-1py2fw62.1
2025-02-18 10:58:44,606 ERROR: Job nmdc:sys061xjm629 failed 2 times. Skipping.
2025-02-18 10:58:44,609 INFO: Processing failed job: nmdc:sys0xkv1bn10, nmdc:wfmgas-12-j7g7f116.1
2025-02-18 10:58:44,609 ERROR: Job nmdc:sys0xkv1bn10 failed 2 times. Skipping.

@aclum
Contributor

aclum commented Feb 18, 2025

There is no Cromwell job folder, so this is likely a mismatch between the WDL and the inputs JSON made by workflow automation. @mbthornton-lbl please take over #298 and sort out passing the boolean flag to the inputs.json so it matches the WDL.

@aclum
Contributor

aclum commented Feb 18, 2025

If you get a printout of the inputs.json, you can use womtool or jaws validate to check that an inputs JSON file is valid against a given WDL.
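
As a quick pre-check before running a validator, the logged workflow inputs can be compared key by key against an inputs template for the WDL. The sketch below assumes the template is a JSON object mapping input names to type strings, as produced by womtool's `inputs` subcommand, and that optional inputs carry the word "optional" in their type string. That marker and the function name `compare_inputs` are assumptions; this does not replace womtool or jaws validation.

```python
from typing import Dict, List, Tuple


def compare_inputs(template: Dict[str, str],
                   supplied: Dict[str, object]) -> Tuple[List[str], List[str]]:
    """Return (missing required keys, unexpected keys) for a workflow inputs dict."""
    # Assumption: optional inputs are flagged by "optional" in the type string.
    required = {name for name, wdl_type in template.items()
                if "optional" not in wdl_type}
    supplied_keys = set(supplied)
    missing = sorted(required - supplied_keys)
    unexpected = sorted(supplied_keys - set(template))
    return missing, unexpected
```

Non-empty `missing` or `unexpected` lists point at exactly the kind of key mismatch between the WDL and the generated inputs described in this thread.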
