-
-
Notifications
You must be signed in to change notification settings - Fork 308
Add discussion of runtime errors and refusal-to-process vs validation success/failure #1316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -608,6 +608,38 @@ | |
</t> | ||
</section> | ||
|
||
<section title="Evaluation Results"> | ||
<t> | ||
Evaluating an instance against a schema MUST produce one of four general outcomes: | ||
|
||
<list> | ||
<t> | ||
success: the evaluation completed without failing any validation assertions | ||
</t> | ||
<t> | ||
failure: the evaluation completed with assertion failures | ||
</t> | ||
<t> | ||
runtime error: the schema was understood but the evaluation could not | ||
complete | ||
</t> | ||
<t> | ||
refusal to process: the schema was not understood and evaluation never began | ||
</t> | ||
</list> | ||
</t> | ||
<t> | ||
Runtime errors include but are not limited to: failure of reference resolution, | ||
detection of an infinite reference loop, or excessive consumption of memory. | ||
</t> | ||
<t> | ||
Unlike runtime errors, refusal-to-process is the expected outcome in certain cases. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I stil think this line is a bit fuzzy. Consider parsing a number from a string. Failing to parse
In both cases, the outcome is failure external to the normal processing of a function. Similarly, refusal to process and runtime failures both have an outcome of a failure that is external to the normal processing of a schema. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gregsdennis what do you suggest? There is clearly a difference between "I asked for a feature to be supported and it's not so I'm going to do the sensible thing and not use this schema" and "oops something that was expected to work didn't work." What way of describing these things would you find appropriate? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's my point: the difference is semantic. It's the difference between two errors, and I don't think that the specification needs to define that difference. The specification just needs to say, "validation cannot proceed in this case," and let the implementation decide what that means for its language. A feature not being supported is not expected. (In fact, .Net has There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are two phases here: "Can I use this?" and "I'm using this." The way to react to failures in each is different. What is the advantage in preventing these from being distinct? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The other part of this is that the spec intentionally requires a "refuse to process" behavior. What's up for debate is not whether that is a behavior, as that was part of a past debate and was written into the spec. What is up for debate now is how to explain "refuse to process" compared to other conditions. PRs are not the place to re-litigate past decisions. This PR just needs to clarify what's there. Changing what's there requires a new issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. They are different outcomes. One is directed at the person who tried to use the schema and tells them to go load a plugin or something. It is specifically called out in the spec as a distinct outcome. The other types of errors tell the person who wrote the schema that something is wrong with the schema.
These are different things. In every dimension I can think of. BTW, I do not care what form the errors take. You can just throw different exceptions, that's fine with me- nothing in this PR has anything to do with how they are reported. But everything about these categories of outcomes is different. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
When evaluating a schema, the author of the schema is (theoretically) uninvolved and likely will not receive notification that something went wrong. The implementation's only recourse is to halt and report to the user why it halted. Maybe it halts differently, I guess, but it still halts. Whether the implementation chooses to not process a schema or it can't process a schema is irrelevant. The outcome is that the schema isn't processed and must report some sort of failure. The messaging in the failure tells the user
It's just like a JSON parser that can handle comments but isn't configured to do so. When it encounters comments, it stops and reports what's wrong. My point is that processing halts. I don't think that the specification needs to say anything more than that. Are you saying that the specification should prescribe the (minimum) content of a failure (if not the mechanism, e.g. an exception)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Or.... are you saying that you want to distinguish between errors that must be caught at evaluation-time and errors which could be caught in some sort of static analysis? Would that require implementations to support static analysis (mine doesn't)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's how I view it -- there are two main kinds of exceptions:
It might also be worthwhile to distinguish between transient and permanent errors. In my implementation, I try to only error at "compile time" for things that are not correctable (e.g. syntax errors), and anything that could be corrected between schema-loading time and data-instance-evaluation time is deferred until runtime (loading optional modules for format handling, checking if the URI in a $ref actually resolves to a known schema location, etc).
When you load a schema into your implementation, do you generate an error if the schema is malformed (e.g. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, but they would be considered a deserialization (JSON text -> model) error. I don't see that as static analysis. I think for this discussion we can assume that we have a syntactically valid schema. Evaluation against a meta-schema does (optionally) happen, but a lot of that is baked into the deserialization because the models are strongly typed and literally can't accept invalid values. |
||
Currently, the only situations where refusal-to-process is either required or allowed | ||
are when a vocabulary cannot be supported, or a dialect cannot be determined. These | ||
scenarios are noted in the appropriate sections of the specification. | ||
</t> | ||
</section> | ||
|
||
<section title="Extending JSON Schema" anchor="extending"> | ||
<t> | ||
Additional schema keywords and schema vocabularies MAY be defined | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that these need to be distinct. I think the outcome of a "refusal to process" is still a runtime error.
In .Net, a "runtime error" means that an exception would be thrown. This is the outcome in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're distinct because the expected responses are distinct (and this might actually mean I have these definitions a bit wrong in the current text).
A "refusal to process" is a correct outcome. There is nothing you can or should do to produce a different outcome. If you do something like remove an unsupported vocabulary that was preventing evaluation, you're really just trying something else entirely. It might be represented by a validation status of
null
orNone
or whatever, instead oftrue
orfalse
. It could be an exception, but it's not really something that is handled.A "runtime error" is an incorrect outcome. It indicates either an error in the schema (referencing a non-existent internal location, which can be determined immediately) or an error in configuration (failing to supply the necessary external schema for resolving references). These are correctable errors- exceptions you can potentially handle.
For example, if you get an exception that says that a reference to
https://example.com/external-schema#/$defs/whatever
cannot be resolved, you might be able to download that resource, load it, and re-try the original evaluation.Does this correct-but-no vs correctable-error distinction help? I am definitely open to re-framing this, "runtime error" and "refusal to process" were just what came to mind, and the definitions were off the top of my head.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... That's subtle, and I don't think I'd go through the effort of making the distinction except maybe throwing different exception types. But I'd still throw an exception for both cases.
If the spec is going to identify "refusal to process" as a non-erroring result, then accommodation needs to be made for it in the output. Currently there is no such accommodation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also identify the specific scenarios where "refusal to process" is the desired result (as we have with
$vocabulary
)."Runtime error" should be the catch-all result for anything not specified as one of the other result states.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gregsdennis
I like this idea. I think I got too caught up in when things can happen (schema load or evaluation) which produced a weird ambiguity/overlap in these categories.
An unresolvable reference should always be a "runtime error" even if it is detected at schema load time.
I don't think implementation details such as what is or is not an exception (some languages don't even have exceptions, some implementations don't use them) are relevant, though. I wouldn't put refusal-to-process in the output format, but I'm OK with deferring that decision until later (I wouldn't fight it tooth and nail either- it's a weak preference).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with that. Sounds like the paragraph at line 642 needs to be updated, then, as it attempts to make the load/evaluation distinction.
Honestly, I'm not sure we need it at all unless you want to CREF it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gregsdennis I removed it and rewrote the previous paragraph- please see if this seems clear and reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the distinction between "runtime error" and "refuse to process" is hazy, especially since I think that encountering an unknown $vocabulary fits into both categories. You say that "refuse to process" means that nothing you can do would provide a better outcome, whereas an unresolvable $ref is fixable (e.g. by loading the missing resource). But isn't that the same as a missing $vocabulary? I could resolve that error by informing the evaluator about that vocabulary (that is, perform some extra bit of code to load that vocabulary logic into the evaluator).
Perhaps you're seeing the distinction as between "we can detect this condition by statically analyzing the schema", or "we don't know if the evaluation can be run to completion until we try to evaluate it against an instance"? We can certainly detect a missing $schema or $vocabulary in static analysis; whether or not you choose to detect a missing $ref at runtime is more tricky (certainly if you allow for loading resources from disk/network at runtime, you could delay this determination; also, even if all schemas need to be preloaded, circular references mean that one cannot immediately error on loading a schema if it contains an unknown $ref, as that resource may be added sometime in the future but still in time for a successful evaluation).