Remove unused variables and refactor GENE_LIST #437

Merged
merged 1 commit into from
Feb 22, 2024

Conversation

joverlee521
Contributor

Noted in previous PRs that the GENES and GENES_SPACE_DELIMITED variables are not needed¹ or used in the workflow,² so refactor the GENE_LIST to be a hardcoded list of genes.

If we want to ensure that we do not miss any genes from the Nextclade dataset, we could parse out the gene names from the dataset's genome_annotation.gff file. However, I think that will over-complicate the Snakemake workflow so I'm leaving the hardcoded list.
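For reference, a minimal sketch of what the GFF-parsing alternative could look like. This is not part of the PR. The function name `parse_gene_names` is hypothetical, and the attribute keys it checks (`Name`, `gene_name`, `gene`) are assumptions about how the dataset's annotation file labels genes; the actual file should be inspected before relying on them.

```python
from urllib.parse import unquote


def parse_gene_names(gff_text, feature_type="gene",
                     name_keys=("Name", "gene_name", "gene")):
    """Return gene names from GFF3 text, in file order, without duplicates.

    Hypothetical helper: `name_keys` lists attribute keys that are assumed
    to hold the gene name; the first one present on a feature is used.
    """
    names = []
    for line in gff_text.splitlines():
        # Skip blank lines, directives (##...) and comments (#...).
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns;
        # column 3 is the feature type, column 9 holds the attributes.
        if len(columns) != 9 or columns[2] != feature_type:
            continue
        attributes = {}
        for pair in columns[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                # GFF3 attribute values are percent-encoded.
                attributes[key.strip()] = unquote(value.strip())
        for key in name_keys:
            if key in attributes:
                if attributes[key] not in names:
                    names.append(attributes[key])
                break
    return names
```

In a Snakefile this could be called as `GENE_LIST = parse_gene_names(open(path).read())`, where `path` points at the dataset's genome_annotation.gff. That requires the dataset to be present when the workflow is parsed, which is part of the added complexity mentioned above.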

¹ #372 (comment)
² #435 (comment)

Checklist

  • Checks pass

@joverlee521 joverlee521 requested a review from a team February 21, 2024 18:28
@joverlee521
Contributor Author

Tested locally by running the debug config:

nextstrain build \
    --envdir ~/Repos/env.d/aws/ \
    --image nextstrain/ncov-ingest \
    . \
        --configfile config/debug_sample_genbank.yaml \
        --config s3_dst=s3://nextstrain-data/files/ncov/open/branch/update-gene-list

All translation_*.fasta.zst files have been uploaded to s3://nextstrain-data/files/ncov/open/branch/update-gene-list

Member

@corneliusroemer corneliusroemer left a comment

LGTM

@corneliusroemer corneliusroemer merged commit 35a5bea into master Feb 22, 2024
2 checks passed
@corneliusroemer corneliusroemer deleted the update-gene-list branch February 22, 2024 12:08