Ingest: Derive URL column during ingest #80

Merged (1 commit) on Dec 16, 2024

@j23414 j23414 commented Dec 11, 2024

Description of proposed changes

Add a URL column during ingest so that the node callout works automatically. This attempt uses the jq method.

Related issue(s)

Checklist

  • Checks pass

@@ -107,6 +107,8 @@ rule curate:
--abbr-authors-field {params.abbr_authors_field} \
| augur curate apply-geolocation-rules \
--geolocation-rules {input.all_geolocation_rules} \
| jq -c --arg GENBANK "{params.id_field}" \
Contributor

If a workflow used strain as the id_field, this would create invalid URLs. Hmm, it might be best to add a new config param (e.g. genbank_accession_field) that is explicitly for this command.
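
To illustrate the concern, here is a minimal Python sketch (not the jq command from this PR) that derives the URL from an explicitly named accession field instead of reusing id_field. The NCBI nuccore URL pattern, the function names, and the default field names `accession` and `url` are assumptions for illustration only.

```python
import json

# Assumed URL pattern for NCBI nucleotide records; not confirmed by this PR.
NCBI_NUCCORE = "https://www.ncbi.nlm.nih.gov/nuccore/"


def add_url(record, accession_field="accession", url_field="url"):
    """Return a copy of `record` with a URL built from the accession field.

    `accession_field` is passed explicitly so that a workflow using
    id_field=strain does not end up with strain names inside the URL.
    """
    updated = dict(record)
    accession = (record.get(accession_field) or "").strip()
    # Leave the URL empty when no accession is present rather than emit a broken link.
    updated[url_field] = NCBI_NUCCORE + accession if accession else ""
    return updated


def add_urls_to_ndjson(lines, accession_field="accession"):
    """Apply add_url to every non-empty NDJSON line and return the new lines."""
    return [
        json.dumps(add_url(json.loads(line), accession_field))
        for line in lines
        if line.strip()
    ]
```

With this shape, a record such as `{"strain": "A/X/1", "accession": "MN908947"}` gets its URL from `accession` regardless of which column the workflow uses as its ID.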

Contributor Author

@j23414 j23414 Dec 12, 2024

Ah, good point, especially if we use id_field=strain for segmented viruses (e.g. flu, or do you have another example?) to match isolates across tangle trees. I assume in that case the strain name, rather than the GenBank accession, is in the header line of sequences.fasta.

I can add an explicit genbank_accession_field here.

Contributor Author

Thanks, added an explicit config['curate']['genbank_accession'] field in b2936f8.

@j23414 j23414 force-pushed the 76-ingest-add-rule-to-create-url-column-for-accessions branch from 58a9a99 to edeee12 Compare December 16, 2024 19:47
@j23414 j23414 force-pushed the 76-ingest-add-rule-to-create-url-column-for-accessions branch from edeee12 to b2936f8 Compare December 16, 2024 19:52

j23414 commented Dec 16, 2024

After discussion: #76 (comment)

Moved away from the jq implementation and back to the csvtk implementation. Thanks all, this is ready for review.
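
The csvtk command itself is not shown in this thread. As a rough equivalent of a column-based approach, the Python sketch below appends a URL column to a TSV. The column names `accession` and `url`, the function name, and the NCBI nuccore URL pattern are all assumptions, not taken from the merged commit.

```python
import csv
import io

# Assumed URL pattern for NCBI nucleotide records; not confirmed by this PR.
NCBI_NUCCORE = "https://www.ncbi.nlm.nih.gov/nuccore/"


def add_url_column(tsv_text, accession_column="accession", url_column="url"):
    """Return `tsv_text` with a URL column derived from the accession column."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames or [])
    if accession_column not in fieldnames:
        raise KeyError(f"column {accession_column!r} not found in TSV header")
    if url_column not in fieldnames:
        fieldnames.append(url_column)

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # skip stray cells beyond the header
    )
    writer.writeheader()
    for row in reader:
        accession = (row[accession_column] or "").strip()
        # Rows without an accession get an empty URL instead of a broken link.
        row[url_column] = NCBI_NUCCORE + accession if accession else ""
        writer.writerow(row)
    return out.getvalue()
```

Working on the tabular metadata keeps the accession column explicit in the call, which matches the point raised in review about not reusing id_field.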

@j23414 j23414 merged commit b400173 into main Dec 16, 2024
1 check passed
@j23414 j23414 deleted the 76-ingest-add-rule-to-create-url-column-for-accessions branch December 16, 2024 23:30
Development

Successfully merging this pull request may close these issues.

ingest: add rule to create url column for accessions
3 participants