Guidelines should clarify how fragmentary Schematron should be expanded #2444

Closed
martindholmes opened this issue Jun 28, 2023 · 7 comments

@martindholmes
Contributor

martindholmes commented Jun 28, 2023

When a TEI <constraint> element contains three <sch:assert> elements (for example) with no other wrapping Schematron, there are multiple possible ways to interpret what is intended. The output Schematron could consist of

  1. a single <sch:pattern> element with a single <sch:rule> element containing all three <sch:assert>s, or
  2. a single <sch:pattern> containing three <sch:rule>s each containing an <sch:assert>; or
  3. three <sch:pattern> elements each containing a single <sch:rule> with a single <sch:assert>.

Each of these output options has consequences for how the Schematron will work.

@sydb has always assumed that the fact that all these <sch:assert>s are grouped in a single <constraint> element here implies that the intention is (1); in other words, that the TEI <constraint> element should give rise to a <sch:pattern> and a <sch:rule> wrapping its contents. I don't believe that's necessarily so, not least because a complete <sch:pattern> could be supplied in the same context; at the very least, the situation is ambiguous and the rules should be clarified. One approach to this would be to find out what the current Stylesheets do and codify that behaviour in the Guidelines. Alternatively, Council may decide that a different set of processing rules makes more sense, and we could then back-port that to the existing Stylesheets as well as implementing it in ATOP.

@sydb
Member

sydb commented Jun 28, 2023

I completely agree with @martindholmes that the Guidelines should be explicit about what an ODD processor will do in this case. (Probably in section 22.5.2.) But I am quite comfortable keeping the current behavior. I will go into more detail below, but concisely, of the three possibilities, (1) is the current behavior; (2) is a terrible idea because it means only the 1st <sch:assert> will be in a rule that gets fired; and (3) is essentially the opposite of (1) and is equally expressive, but since users have already gotten used to (1) I see no reason to change.

(1) = current behavior
If the input is

  <elementSpec module="core" ident="p" mode="change">
    <constraintSpec mode="add" ident="unwrapped" scheme="schematron">
      <constraint>
        <sch:report test="true()">Yes. It’s true. This man …</sch:report>
        <sch:assert test="false()">We live in a world where unfortunately the distinction between true
          and false appears to become increasingly blurred by manipulation of facts, by exploitation
          of uncritical minds, and by the pollution of the language.</sch:assert>
        <sch:report test="1 + 1 + 1 = 3">Got to be good looking ’cause he’s so hard to see</sch:report>
      </constraint>
    </constraintSpec>
  </elementSpec>

then our two current processors (the Stylesheets and the standalone extract-isosch.xsl) produce the same Schematron, except for different generated values for @id:

  <sch:pattern id="demo_2444-p-unwrapped-constraint-report-6">
    <sch:rule context="tei:p">
      <sch:report test="true()">Yes. It’s true. This man …</sch:report>
        <sch:assert test="false()">We live in a world where unfortunately the distinction between true
          and false appears to become increasingly blurred by manipulation of facts, by exploitation
          of uncritical minds, and by the pollution of the language.</sch:assert>
      <sch:report test="1 + 1 + 1 = 3">Got to be good looking ’cause he’s so hard to see</sch:report>
    </sch:rule>
  </sch:pattern>

and

  <sch:pattern id="schematron-constraint-demo_2444-p-unwrapped-11">
    <sch:rule context="tei:p">
      <sch:report test="true()">Yes. It’s true. This man …</sch:report>
      <sch:assert test="false()">We live in a world where unfortunately the distinction between true
        and false appears to become increasingly blurred by manipulation of facts, by exploitation
        of uncritical minds, and by the pollution of the language.</sch:assert>
      <sch:report test="1 + 1 + 1 = 3">Got to be good looking ’cause he’s so hard to see</sch:report>
    </sch:rule>
  </sch:pattern>

(They also had different ideas of which namespaces should be expressed with @xmlns and which should be expressed with a prefix, but I have elided those differences here.)

(2) = 1 pattern, 3 rules
This method of processing would produce

  <sch:pattern id="schematron-constraint-demo_2444-p-unwrapped-whatever">
    <sch:rule context="tei:p">
      <sch:report test="true()">Yes. It’s true. This man …</sch:report>
    </sch:rule>
    <sch:rule context="tei:p">      
      <sch:assert test="false()">We live in a world where unfortunately the distinction between true
        and false appears to become increasingly blurred by manipulation of facts, by exploitation
        of uncritical minds, and by the pollution of the language.</sch:assert>
    </sch:rule>
    <sch:rule context="tei:p">      
      <sch:report test="1 + 1 + 1 = 3">Got to be good looking ’cause he’s so hard to see</sch:report>
    </sch:rule>
  </sch:pattern>

This is untenable, because most users would not realize that the 2nd and 3rd rules will never fire. (In Schematron, within a given pattern, only the 1st rule (in document order) whose context is matched is fired.)

(3) = 3 patterns, 3 (separate) rules
This method of processing would produce

  <sch:pattern id="schematron-constraint-demo_2444-p-unwrapped-whatever-1">
    <sch:rule context="tei:p">
      <sch:report test="true()">Yes. It’s true. This man …</sch:report>
    </sch:rule>
  </sch:pattern>
  <sch:pattern id="schematron-constraint-demo_2444-p-unwrapped-whatever-2">
    <sch:rule context="tei:p">      
      <sch:assert test="false()">We live in a world where unfortunately the distinction between true
        and false appears to become increasingly blurred by manipulation of facts, by exploitation
        of uncritical minds, and by the pollution of the language.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:pattern id="schematron-constraint-demo_2444-p-unwrapped-whatever-3">
    <sch:rule context="tei:p">      
      <sch:report test="1 + 1 + 1 = 3">Got to be good looking ’cause he’s so hard to see</sch:report>
    </sch:rule>
  </sch:pattern>

This, I think, would have been a perfectly reasonable thing to do back in the day. But it is not what the Stylesheets have done for over a decade, and I do not see any reason, let alone a compelling reason, to change that behavior.

It is worth re-stating that what is being discussed here is what happens when a user does not explicitly specify how patterns, rules, assertions, and reports should be grouped. Users are always allowed to express this explicitly, so we are only talking about what happens when users want to be concise.
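
A minimal sketch of that explicit form, reusing the <elementSpec> for <p> from the example above. The ident value "wrapped" and the message texts are invented for illustration, and the namespace declarations are written out only so that the fragment is well-formed on its own. Because the grouping is spelled out in full, the processor would have nothing left to decide:

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             module="core" ident="p" mode="change">
  <!-- ident="wrapped" is a hypothetical name, not taken from any real ODD -->
  <constraintSpec mode="add" ident="wrapped" scheme="schematron">
    <constraint>
      <!-- The ODD author supplies the pattern and the rule explicitly -->
      <sch:pattern>
        <sch:rule context="tei:p">
          <sch:report test="true()">Illustrative report message</sch:report>
          <sch:assert test="false()">Illustrative assertion message</sch:assert>
        </sch:rule>
      </sch:pattern>
    </constraint>
  </constraintSpec>
</elementSpec>
```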

Further note that with (1) a user can express either the “three assertions in one pattern” or the “three patterns with one assertion each” outcome (by using one <constraintSpec> for the former, or three separate <constraintSpec>s for the latter), all without explicitly using <sch:pattern> or <sch:rule>.
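
As a sketch of the second case, the same three tests could be split across three <constraintSpec>s. The ident values "unwrapped-1" to "unwrapped-3" and the shortened messages are invented for illustration, and the namespace declarations are included only to make the fragment well-formed on its own. Under behavior (1), each <constraintSpec> would be expected to yield its own pattern with a single rule, which is the grouping shown for (3):

```xml
<elementSpec xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:sch="http://purl.oclc.org/dsdl/schematron"
             module="core" ident="p" mode="change">
  <!-- The three ident values below are hypothetical -->
  <constraintSpec mode="add" ident="unwrapped-1" scheme="schematron">
    <constraint>
      <sch:report test="true()">Illustrative message 1</sch:report>
    </constraint>
  </constraintSpec>
  <constraintSpec mode="add" ident="unwrapped-2" scheme="schematron">
    <constraint>
      <sch:assert test="false()">Illustrative message 2</sch:assert>
    </constraint>
  </constraintSpec>
  <constraintSpec mode="add" ident="unwrapped-3" scheme="schematron">
    <constraint>
      <sch:report test="1 + 1 + 1 = 3">Illustrative message 3</sch:report>
    </constraint>
  </constraintSpec>
</elementSpec>
```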

@joeytakeda
Contributor

joeytakeda commented Jun 28, 2023

Agreed with @sydb: option 1 (just wrap them all in a sch:pattern/sch:rule) makes good sense to me. (I believe the same goes for a <constraint> with one or more sch:rules: all of the sch:rules would be grouped into a single sch:pattern, right?)

@sydb
Member

sydb commented Jun 28, 2023

Yes.
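
A sketch of the case confirmed above, with two <sch:rule>s in one <constraint>. The ident "two-rules", the contexts, and the messages are invented for illustration; the output is shown without the generated @id, and the namespace declarations are included only so that each fragment is well-formed:

```xml
<!-- Input (hypothetical): two rules, no enclosing pattern -->
<constraintSpec xmlns="http://www.tei-c.org/ns/1.0"
                xmlns:sch="http://purl.oclc.org/dsdl/schematron"
                mode="add" ident="two-rules" scheme="schematron">
  <constraint>
    <sch:rule context="tei:p[@n]">
      <sch:assert test="false()">Illustrative message A</sch:assert>
    </sch:rule>
    <sch:rule context="tei:p">
      <sch:assert test="false()">Illustrative message B</sch:assert>
    </sch:rule>
  </constraint>
</constraintSpec>

<!-- Expected grouping: both rules inside one generated pattern
     (a real processor would also add a generated @id) -->
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:rule context="tei:p[@n]">
    <sch:assert test="false()">Illustrative message A</sch:assert>
  </sch:rule>
  <sch:rule context="tei:p">
    <sch:assert test="false()">Illustrative message B</sch:assert>
  </sch:rule>
</sch:pattern>
```

Because both rules end up in the same pattern, a <p> that has @n would fire only the first rule; the second would apply only to <p> elements without @n.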

@ebeshero
Member

ebeshero commented Sep 4, 2023

Council F2F Paderborn: Discussing this, we are fine with the way this is currently processed, and we recommend documenting what the default processing does somewhere, perhaps as a <remark> on the spec for either <constraint> or <constraintSpec>.

@martindholmes
Contributor Author

We believe now that this may be moot because the decision to require @context means that the user/ODD-writer will have to supply <sch:rule>, and therefore will be responsible for the grouping.

@sydb
Member

sydb commented Mar 13, 2024

This issue is now conceptually tied to #2510, and should be closed “won’t fix” when that one is implemented.

@joeytakeda
Contributor

PR #2513 (responding to #2510) now adds a warning that sch:rule[not(@context)] will be deprecated 2025-03-15, so closing this per @sydb's comment above.
