Inheritance principle clarification for exact matches #1583
ha. apparently bids-matlab would get confused by this.
Copying @Remi-Gau's reply from my original comment:
This seems like a sane enough check for both humans and machines, but, yeah, I can see where this leads with inheritance in derivatives. I look forward to seeing dozens of similar, but not quite, json's littered all over the top of derivative datasets.
I am glad you are volunteering to update the regex that will inevitably become necessary to parse all this. 😘
I think the proposed formulation would need further clarification: The original E.g. currently So, reformulation might need to explicitly separate cases for different levels -- old version for levels higher than the corresponding "data file" level, and then relaxation (multiple allowed, but only 1 exactly matching is taken...) at the data file level. But that would complicate it even more :-/ But it also requires analysis (I have only a vague memory) of the cases we might already have where some entities are omitted on purpose (e.g.
for the latter concern, looking at Example 4 at https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#the-inheritance-principle
which would still be ok under proposed formulation since that "special casing" of 4. would not kick in, but if we extend by hypothetical addition of a file with extra entity (
So I was going here for a minimal relaxation, but after discussing with @ericearl, @Remi-Gau and @tsalo, we ended up at something dangerously close to @Lestropie's "even more relaxed" approach (Lestropie#5). This is my understanding of the rules, for the sake of making sure we're agreed (I'm generally happy to accept Rob's wording):
This has the following advantages:
There is a cliff where tools that do not respect this algorithm may not be able to handle datasets constructed after its establishment. My hope is that this will primarily affect derivatives, as both driving use cases come up in derivatives, where the tooling is in greater flux.
Note that any relaxation may be disallowed before BIDS 2.0 (#1339). In any case, I think the relaxation I suggested in my last comment is off the table unless the SG reconsiders. The solutions I see are:
I think that we can certainly reconsider #1339. I am wondering whether this also suggests a reprioritization of 2.0, given this additional pressure? I also wonder whether the direction pursued (and abandoned) in #1280 can help us circumvent some of these issues, without change to the inheritance principle? I still think that it has a naturalness to it that is appealing, possibly preferable to these relatively opaque (if simple!) rules.
Muahahaha. Welcome all back to my nightmare... 😈
Personally I would strongly discourage this. Firstly, it's an unnecessary and unintuitive layer of complexity on a rule that is already highly criticised for those very attributes. Secondly, it would directly preclude the very situation for which I require a more advanced inheritance principle for BEP016.
Agreed. Unfortunately it may be silent mishandling. My thoughts here have generally arrived at:
Can you link to any relevant discussion? I might have ideas given how long I've spent trying to get around this problem.
@Remi-Gau was saying the same thing.
I am extremely confident that the algorithm described above can be done in pybids and the validator. I'm sure in bids-matlab as well. While I'm okay with pybids being a little experimental, the validator is very nearly identical to the spec for most people, so we'd want to co-release. If a working branch is needed to demonstrate the feasibility, I'm for it. @ericearl is also interested in writing a set of tools that implement the inheritance principle that can be called/imported in various languages.
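As a rough illustration of what a small reference implementation could look like, here is a minimal Python sketch of the general inheritance idea: walk from the dataset root down to the data file's directory, collect the JSON sidecars that apply at each level, and let files closer to the data file override keys from files higher up. It is not the algorithm discussed above (those rules are not reproduced here), and it is not how pybids or the validator do it. The helpers `parse_name` and `resolve_metadata` are hypothetical names, and the ordering of several applicable sidecars within one level (fewest entities first) is an assumption made for the sketch.

```python
import json
from pathlib import Path


def parse_name(filename):
    """Split an entity-style filename into (entities, suffix, extension).

    Returns None for names that are not key-value style,
    e.g. dataset_description.json.
    """
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    if not all("-" in pair for pair in pairs):
        return None
    return dict(pair.split("-", 1) for pair in pairs), suffix, extension


def resolve_metadata(dataset_root, data_file):
    """Merge applicable sidecars from the root down to the data file."""
    root = Path(dataset_root).resolve()
    data_path = Path(data_file).resolve()
    d_entities, d_suffix, _ = parse_name(data_path.name)

    # Directories from the dataset root down to the data file's directory.
    # relative_to() raises ValueError if the file is outside the dataset.
    levels = [root]
    for part in data_path.parent.relative_to(root).parts:
        levels.append(levels[-1] / part)

    merged = {}
    for directory in levels:
        applicable = []
        for sidecar in directory.glob("*.json"):
            parsed = parse_name(sidecar.name)
            if parsed is None:
                continue
            entities, suffix, _ = parsed
            # A sidecar applies if the suffix matches and every entity it
            # names has the same value in the data file's name.
            if suffix == d_suffix and entities.items() <= d_entities.items():
                applicable.append((len(entities), sidecar))
        # Assumption: within one level, less specific files are read first,
        # so more specific files override them.
        for _, sidecar in sorted(applicable):
            merged.update(json.loads(sidecar.read_text()))
    return merged
```

Because later levels update the same dictionary, a key set in a subject-level sidecar replaces the value set by a dataset-level sidecar, while keys set only at the top survive.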
Very difficult to enforce, but Eric's plan of a reference implementation would help. If it's fast and easy to use from multiple languages, even better.
If you just mean implementation, we can do it. If you mean something like "different behavior based on the dataset's claimed
Possibly. The steering group gets the final word, obviously.
I just meant this (from first post):
I could change it to:
Practically, this probably means always adding a That would be dissatisfying to me, but if that's what's needed, we can do it.
Side comment: I think that there is an empty niche in our tooling ecosystem for a package that generates dummy datasets matching certain "requirements" so they can be used for testing. There is f(ake)mriprep, but it is not very configurable and very narrow in use case. Maybe this could be the occasion to have a first draft of such a tool.
I am sure that bids-matlab could actually be improved from this.
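A first draft of such a generator could be very small. The sketch below is only an assumption about what it might look like: `make_dummy_dataset` is a hypothetical helper, the file list is invented for illustration, and data files are created empty because only the names matter when testing inheritance.

```python
import tempfile
from pathlib import Path


def make_dummy_dataset(root, relative_paths, sidecar_content="{}"):
    """Create empty data files and minimal JSON sidecars under root."""
    root = Path(root)
    created = []
    for rel in relative_paths:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        if path.suffix == ".json":
            # Sidecars need to be valid JSON to be readable by tools.
            path.write_text(sidecar_content)
        else:
            # Data files are left empty; only their names are of interest.
            path.touch()
        created.append(path)
    return created


if __name__ == "__main__":
    # Hypothetical layout, made up for this example.
    layout = [
        "sub-01/anat/sub-01_T1w.nii.gz",
        "sub-01/anat/sub-01_T1w.json",
        "sub-01/anat/sub-01_space-MNI_T1w.nii.gz",
        "sub-01/anat/sub-01_space-MNI_T1w.json",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for path in make_dummy_dataset(tmp, layout):
            print(path.relative_to(tmp))
```

Pairing each layout with the expected file associations would let different tools be checked against the same fixture.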
Although I see the rationale, I worry that this one would make it too restrictive, as indeed cases like
or alike could easily lead to conflicts. edit: I do not worry about tools (as you have mentioned, they can disambiguate) but rather about humans. But maybe my concern is ungrounded -- it is worth looking through OpenNeuro datasets to see where we would find such conflicts ATM.
When it comes to the inclusion of entities, I have always conceptualised it as being on an as-needed-to-disambiguate basis. That's the reason why we don't put I don't like the idea of having the difference between two data files being the presence vs. absence of an entity, as opposed to different values for an entity. Given enough thought I wonder if I could make the case that the specification should be forbidding it outright (for data files; for metadata that's just inheritance, and should IMO be used wherever possible).
Being more explicit about your use case here, since it's important for the argument: You want it to be possible for a tool to update an existing derivatives dataset with content that was not requested when that dataset was first created. So that's introducing a degree of mutability to a derivatives dataset over and above regenerating absent / withheld data. In that scenario, I would extend my entity philosophy above and say that you need to include whatever entities are necessary to provide disambiguation for all possible outputs from that tool from all possible re-runs.
Thanks for accidentally providing support for my banning of data files with a strict entity subset of another :-P
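For anyone wanting to survey existing datasets for this situation, the check is easy to sketch. The Python below is a minimal illustration, not an existing validator rule: `parse_name` and `strict_subset_pairs` are hypothetical helpers, and the sketch assumes every filename is a plain sequence of `key-value` entities followed by a suffix and extension.

```python
from itertools import combinations


def parse_name(filename):
    """Split an entity-style filename into (entities, suffix, extension)."""
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    return dict(pair.split("-", 1) for pair in pairs), suffix, extension


def strict_subset_pairs(filenames):
    """Return (smaller, larger) pairs where one file's entities are a
    strict subset of the other's, with identical suffix and extension."""
    parsed = [(name, *parse_name(name)) for name in filenames]
    pairs = []
    for (fa, ea, sa, xa), (fb, eb, sb, xb) in combinations(parsed, 2):
        if (sa, xa) != (sb, xb):
            continue
        # "<" on dict item views is a proper-subset test, so two files that
        # differ only in the value of an entity are not reported.
        if ea.items() < eb.items():
            pairs.append((fa, fb))
        elif eb.items() < ea.items():
            pairs.append((fb, fa))
    return pairs
```

Files that differ by the value of an entity (for example two subjects) are ignored; only presence versus absence of an entity is flagged.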
Regarding this one, I would actually suggest just taking something like "advanced example 1" currently proposed for addition in #1003, and producing a filesystem realisation of it along with the expected associations & orders for cross-checking.
From previous discussions, there does not seem to be clear consensus on either relaxing or tightening up the inheritance principle. I can report, however, a practical difficulty with the current rule and derivatives as written. Example:
According to the current rule, both JSON files apply to the MNI-space T1w.nii.gz, which violates
One way of relaxing (or clarifying) this would be:
This is what is implemented in https://github.com/bids-standard/bids-validator/pull/1773, which is needed to handle fMRIPrep derivatives right now. The alternative is to clarify derivatives to figure out what fMRIPrep should be doing differently. We have been trying to avoid inheritance principle ambiguities, and yet we ran into this.
For what it's worth, the proposal above comports with PyBIDS' current behavior, which was intended to support the current reading of inheritance, and has not changed for several years.
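To make the exact-match idea concrete, here is a minimal Python sketch of selecting one sidecar within a single directory. It is not the code in the validator pull request or in PyBIDS. It assumes that a sidecar applies when its suffix matches and its entities are a subset of the data file's, and that when several apply, the one whose entities match exactly is taken. The helper names and the filenames in the usage note are hypothetical.

```python
def parse_name(filename):
    """Split an entity-style filename into (entities, suffix, extension)."""
    stem, _, extension = filename.partition(".")
    *pairs, suffix = stem.split("_")
    return dict(pair.split("-", 1) for pair in pairs), suffix, extension


def applicable_sidecars(data_file, candidates):
    """Sidecars whose suffix matches and whose entities are a subset of
    the data file's entities."""
    d_entities, d_suffix, _ = parse_name(data_file)
    hits = []
    for candidate in candidates:
        c_entities, c_suffix, c_ext = parse_name(candidate)
        if (
            c_ext == "json"
            and c_suffix == d_suffix
            and c_entities.items() <= d_entities.items()
        ):
            hits.append(candidate)
    return hits


def select_sidecar(data_file, candidates):
    """Pick the sidecar to use at this level, preferring an exact match."""
    hits = applicable_sidecars(data_file, candidates)
    if not hits:
        return None
    if len(hits) == 1:
        return hits[0]
    # Several sidecars apply: assumed relaxation is to take the one whose
    # entities are identical to the data file's, if there is exactly one.
    d_entities, _, _ = parse_name(data_file)
    exact = [hit for hit in hits if parse_name(hit)[0] == d_entities]
    if len(exact) == 1:
        return exact[0]
    raise ValueError(f"Ambiguous sidecars for {data_file}: {hits}")
```

With the hypothetical candidates `sub-01_T1w.json` and `sub-01_space-MNI_T1w.json`, both apply to `sub-01_space-MNI_T1w.nii.gz` under the subset reading, and the sketch selects the second because its entities match exactly. When no exact match exists among several applicable files, it raises rather than guessing.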
Related issues: