Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Template/update #133

Open
wants to merge 37 commits into
base: dev
Choose a base branch
from
Open

Template/update #133

wants to merge 37 commits into from

Conversation

anoronh4
Copy link
Collaborator

@anoronh4 anoronh4 commented Jan 8, 2025

Pipeline template update

Some well overdue pipeline template updates. Because there hasn't been a sync in so long i had to follow the sync retrospectively guide, but now that we are in sync we should in the future be able to follow the manual sync guide, provided we don't wait too long between updates.

Some highlights in the update:

  1. pipeline initialization has been structured differently as a subworkflow rather than groovy library methods. prints version info, summarizes and dumps pipeline parameters, validates parameters, and now reads in and validates the input samplesheet.
  2. The fastq samplesheet is now validated using nf-schema and the definition file assets/schema_input.json. The maf input file is also now validated using nf-schema using a separate definition file assets/schema_maf_input.json.
  3. software dump prints to a different file in the same location and pipeline template now has different logic for generating it.
  4. pipeline completion workflow added, which sends emails upon completion or failure.

I am also switching the main development branch to dev so as to more closely reflect the pipeline and avoid discordance with future template syncs. All github actions triggering with develop should now trigger with dev, and the base branch of PRs should be dev, unless the PR for dev to be merged into main

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the mskcc/forte branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@anoronh4 anoronh4 marked this pull request as draft January 8, 2025 04:34
@anoronh4 anoronh4 requested a review from gongyixiao January 10, 2025 22:59
@anoronh4
Copy link
Collaborator Author

it might make sense to create a new branch off of develop called dev and use that instead. the template wants dev and our code keeps getting incongruous because i for some reason chose develop.

@anoronh4 anoronh4 changed the base branch from develop to dev January 27, 2025 19:22
@anoronh4 anoronh4 requested a review from johnoooh February 14, 2025 19:09
@johnoooh johnoooh marked this pull request as ready for review February 24, 2025 15:43
Copy link

@johnoooh johnoooh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Template updates are hard because there are so many little changes spread across so many files but ultimately they shouldnt change too much. I left a few comments, but they're not urgent. Overall everything looks good!

CHANGELOG.md Outdated
@@ -26,6 +27,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- [#132](https://github.com/mskcc/forte/pull/132) - fix generate cff split/span logic for fusioncatcher and arriba

- [#133](https://github.com/mskcc/forte/pull/133) - Template update for nf-core/tools v3.1.1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only change in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i elaborated a bit more. since the PR is linked someone can always follow the link to get more information.

nextflow.config Outdated
// TODO nf-core: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
[
name: 'Anne Marie Noronha',
affiliation: '',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add your info?

@@ -11,7 +11,7 @@ workflow GROUP_READS {
def fastq_pair_id = meta.fastq_pair_id
def meta_clone = meta.clone().findAll { !["read_group","fastq_pair_id"].contains(it.key) }
meta_clone.id = meta.sample
[groupKey(meta_clone,meta.fq_num), reads, read_group, fastq_pair_id]
[meta_clone, reads, read_group, fastq_pair_id]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need fq_num anymore? is it in the meta_clone item now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think i was struggling to find a place for fq_num because the input sheet workflow changed so much. either way I don't think it's necessary because without groupKey it should still group the correct number of fastqs anyways? It would make sense to add back if we were using something like watchPath, i believe.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this deleted?

@pintoa1-mskcc
Copy link
Collaborator

pintoa1-mskcc commented Mar 3, 2025

after I updated nextflow to work with this branch, something is wrong with the baits file now?

error I'm getting

  • --baits ({"idt_v2":{"baits":"/juno/work/ccs/cmopipeline/forte/GRCh38_probes/xgen-exome-hyb-panel-v2-probes-hg38.bed","targets":"/juno/work/ccs/cmopipeline/forte/GRCh38_probes/xgen-exome-hyb-panel-v2-targets-hg38.bed"}}): Value is [object] but should be [string]

Curious if its something to do with the change ifrom having params in getGenomeAttribute to removing it in this:

forte/main.nf

Line 22 in db0c18f

params.baits = getGenomeAttribute('baits')

Copy link
Collaborator

@pintoa1-mskcc pintoa1-mskcc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new template doesnt work with our assets? Idk something wrong and I cannot run the pipeline on terra

@@ -208,6 +204,8 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof
- A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/)
- `apptainer`
- A generic configuration profile to be used with [Apptainer](https://apptainer.org/)
- `wave`
- A generic configuration profile to enable [Wave](https://seqera.io/wave/) containers. Use together with one of the above (requires Nextflow ` 24.03.0-edge` or later).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the readme we should explicitly state that the version of nextflow must be 24.03.0 or later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants