Enable prohibiting landing zone uploads of specific file types #2064

Open
mmilek opened this issue Feb 7, 2025 · 3 comments
Labels: app: landingzones (Issue in the landingzones app), app: taskflowbackend (Issue in the taskflowbackend app), feature (Requested feature or enhancement)

Comments

@mmilek

mmilek commented Feb 7, 2025

Problem

For some projects managed by the external SODAR instance, sharing primary human sequence data (i.e. identifying data) is not allowed. We would need to prevent the upload of certain file types, such as FASTQ, to SODAR.

Solution

Perhaps by file suffix or by content validation?

@mmilek mmilek added the feature Requested feature or enhancement label Feb 7, 2025
@mikkonie mikkonie changed the title from "Forbid the upload of identifying data files, such as fastq with primary human seq info" to "Enable prohibiting landing zone uploads of specific file types" Feb 7, 2025
@mikkonie mikkonie added tbd Comments wanted, spec/schedule/prioritization to be decided, etc. app: landingzones Issue in the landingzones app app: taskflowbackend Issue in the taskflowbackend app labels Feb 7, 2025
@mikkonie
Contributor

mikkonie commented Feb 7, 2025

This seems to be a reasonable feature to add. There are some assumptions and questions that come to mind:

  • This should obviously be at least project specific
    • Is more granularity needed? E.g. assay specific?
      • If so, the controls for this would be ISA-Tab / assay plugin specific
      • Best choice would be to configure this with SODAR specific ISA-Tab comments
  • Are file suffix checks sufficient?
    • If not, what else would be needed?
  • How and where to communicate these limitations to the user?
    • In the end, this will depend on how and where exactly we choose to configure this

See also #2065.

@mmilek
Author

mmilek commented Feb 7, 2025

I think project specific is enough; assay specific may not be needed.

File suffix checks could be the initial solution, but I think content validation is the way to go. For FASTQ, the format is well defined.
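To illustrate what content validation for FASTQ might involve, here is a minimal Python sketch. It assumes the common four-line record layout (header starting with `@`, sequence, separator starting with `+`, quality string of equal length) and restricts sequences to the characters `ACGTN`; both are simplifications. The function name `looks_like_fastq` and its parameters are hypothetical and not part of SODAR.

```python
# Assumption: only A, C, G, T and N are accepted as sequence characters.
ALLOWED_BASES = set('ACGTNacgtn')


def looks_like_fastq(lines, max_records=10):
    """
    Heuristically check whether the leading lines form four-line FASTQ
    records. Hypothetical helper: multi-line (wrapped) FASTQ is not handled.
    """
    lines = [line.rstrip('\r\n') for line in lines]
    records = 0
    # Step through complete four-line records; a trailing partial record
    # is ignored.
    for i in range(0, len(lines) - 3, 4):
        if records >= max_records:
            break
        header, seq, sep, qual = lines[i:i + 4]
        if not header.startswith('@') or not sep.startswith('+'):
            return False
        if not seq or not set(seq) <= ALLOWED_BASES:
            return False
        if len(seq) != len(qual):
            return False
        records += 1
    # At least one complete, valid record is required.
    return records > 0
```

Only the first few records need to be read, so a check like this could run on the head of a file rather than its full contents. Compressed files would have to be decompressed first.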

The data steward is responsible for creating a project with the necessary limitations and communicating with the user, e.g. via the project description.

@mikkonie
Contributor

mikkonie commented Feb 7, 2025

Thanks for the feedback. In that case, I'd add a simple project setting similar to BAM/VCF file omission and check against file suffixes in landing zone validation.
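A rough sketch of what such a suffix check could look like follows. The setting format (a comma-separated string of suffixes) and the names `parse_suffixes` and `find_prohibited` are assumptions made for illustration, not the actual SODAR implementation.

```python
from pathlib import PurePosixPath


def parse_suffixes(setting_value):
    """
    Parse a hypothetical comma-separated project setting into normalized
    suffixes: lowercased, without leading dots, empty entries dropped.
    """
    return [
        s.strip().lower().lstrip('.')
        for s in setting_value.split(',')
        if s.strip()
    ]


def find_prohibited(paths, suffixes):
    """Return the paths whose file name ends with a prohibited suffix."""
    dotted = tuple('.' + s for s in suffixes)
    return [
        p for p in paths
        if PurePosixPath(p).name.lower().endswith(dotted)
    ]


# Example: validation would fail if this list is not empty.
suffixes = parse_suffixes('fastq, fastq.gz, .fq, fq.gz')
zone_files = [
    'a/b/sample.fastq.gz',
    'a/readme.txt',
    'x/S1.FQ',
    'fastq/notes.md',
]
prohibited = find_prohibited(zone_files, suffixes)
```

Matching only the file name, case-insensitively, means a directory called `fastq` does not trigger the check, while `S1.FQ` does.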

We can of course make this fancier later on, but with this initial implementation this should be trivial to get in v1.1. Tagging the issue in the milestone.

@mikkonie mikkonie self-assigned this Feb 7, 2025
@mikkonie mikkonie removed the tbd Comments wanted, spec/schedule/prioritization to be decided, etc. label Feb 7, 2025
@mikkonie mikkonie added this to the v1.1.0 milestone Feb 7, 2025