This repository has been archived by the owner on Nov 11, 2023. It is now read-only.

Roma sanity checker fails on previously found correct schemas #13

Open
GideonK opened this issue Apr 8, 2016 · 17 comments

@GideonK

GideonK commented Apr 8, 2016

(I am new to both GitHub and the TEI, so please excuse any faux pas on my part.)

Version

The Roma web site reports being at version 4.18.

It uses P5 3.0.0 to generate the schema.
shouldwork.zip

Steps to reproduce the issue

  1. Download and unzip the attachment. It includes an anonymized customisation file of a schema that used to work with the sanity checker under version 4.10.
  2. Go to the Roma website.
  3. Click on "New".
  4. Click on "Choose file" next to "Upload a customization", and select the file "shouldwork.xml" that was unzipped, and upload it.
  5. Click on "Start".
  6. Click on "Sanity checker".

Actual results

The page shows that the schema is broken.

Expected results

The schema generated from the customisation file should pass the sanity checker's tests.

Notes and observations

  • The attached customization file is the first of a series of around 9 or 10 that have all passed the sanity checker under version 4.10. Each new file is a superset of the previous, with some added elements. The original file is based on tei_bare.xml.
  • A slight adaptation was made to the original file in order to make TEI the root element instead of teiCorpus, since I am working with single texts with the idea of combining them later under a teiCorpus root.
  • I am using Jing to validate my mark-up against the produced schemas (in RNG/RNC). There are currently no problems.
  • The already existing template for TEI corpora, available on the Roma website, also does not pass the sanity checker. I have found a newer version here (http://www.tei-c.org/Guidelines/Customization/#community), but the customization file is missing (as far as I know one cannot upload ODDs and RNGs etc. to Roma or convert them - please correct me if I'm wrong, as I'm new to this). In any case, the changes between the two versions are few.
  • Even the TEI and teiCorpus root elements cannot be used, for all customization files that I have so far uploaded. It seems that not a single element can reach the root.
  • My guess would then be that there was a change from 4.10 to 4.18 that is invalidating the current customization. Perhaps my change of the root is to blame somehow? However this does not explain the sanity checker failing on the template file.
  • Whenever I would like to update an existing schema that used to work, it is now impossible to run the sanity checker to ensure a well-designed schema. Your help would be greatly appreciated.

Kind regards,

Gideon

@lb42
Member

lb42 commented Apr 8, 2016

Thanks for reporting this problem. The Roma at the TEI website was broken earlier this week because of a problem in the way the new stylesheets are being invoked, but (we think) that problem was fixed. The fact that the sanity checker has clearly gone insane may or may not be related; however, the sanity checker was an add-on to Roma that is not currently being maintained, so if the solution is not obvious we are unlikely to be able to fix it in the short term. You can of course always check that your schema is valid using oXygen. There was no ODD file attached to your ticket, so I cannot check it.

@jamescummings
Member

@lb42 If you click on the link 'shouldwork.zip' in the body of the issue on the github website, you will get a copy of the ODD in a zip file. However, we don't need to do that to see that this is the case.

If you go to Roma, start up with any schema (i.e. 'Build Up' or 'Reduce' or a template that is there) and then go to the sanity checker, it will have a problem.

My first suspicion is that this is related to the change to Pure ODD content models. If the sanity checker is working by tracking through routes to ensure that every element you have is available to use, then I suspect it may have the assumption of RelaxNG content models hard coded in whatever bit does this. I'll try to have a look to see if I can at least find the bit where it goes wrong later today.
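To illustrate the kind of route-tracking check suspected here, the following is a minimal Python sketch, not Roma's actual PHP code. It assumes a hypothetical `content_models` mapping from each element name to the elements its content model allows, walks outward from the chosen root, and reports every element that is never visited. The element data is invented for the example and is not a real TEI definition.

```python
from collections import deque


def unreachable_elements(content_models, root):
    """Return the sorted names of elements that cannot be reached from root."""
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        # Follow every element that the current content model allows.
        for child in content_models.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(set(content_models) - seen)


# Hypothetical, heavily simplified content models (assumption, not real TEI).
content_models = {
    "TEI": ["teiHeader", "text"],
    "teiHeader": ["fileDesc"],
    "fileDesc": [],
    "text": ["body"],
    "body": ["p"],
    "p": [],
    "orphan": ["p"],  # nothing refers to this element
}

print(unreachable_elements(content_models, "TEI"))  # ['orphan']
```

If a checker of this shape built its mapping from a source it could no longer read, every child list would be empty and every element other than the root would be reported, which would resemble the behaviour described in this issue.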

@lb42
Member

lb42 commented Apr 8, 2016

@jamescummings. Yes, the shouldwork.zip contains just a copy of the tei_bare ODD, provided, I assume, in order to show that the problem affects any ODD (which it does, and probably for the reason you suggest); I was curious to see what @GideonK's own ODD looked like. If the cause is as you surmise, we should probably switch off the sanity checker.

@jamescummings
Member

@lb42 Ah. I assumed he was just providing that to demonstrate that it was a general problem. But yes, seeing his ODD would be good as well. But given that it is affecting Roma with any ODD I think that can wait. ;-) (i.e. maybe his ODD has no problem whatsoever.)

@GideonK
Author

GideonK commented Apr 8, 2016

Thank you for your replies. The customization file that I provided is tei_bare.xml plus some elements added and adapted to use TEI as root (see OP). The associated ODD is attached. (I have noticed that PDF output is not supported anymore.)

shouldwork_doc.zip

Is there an offline version of the sanity checker that I could use in the meantime? I am not sure that I would currently/for now be able to use oXygen in my work environment.

@jamescummings
Member

Hi @GideonK,

Yes, PDF output of documentation seems to have been removed at some point (specifically, as part of commit 9ca90e2). I believe the idea was that, since there were so many problems with the PDF output, one should instead provide the raw LaTeX (so people can convert that to PDF if they want) and also replace it with Word output, since most people can generate PDF from Word (e.g. Export as PDF in LibreOffice).

@lb42
Member

lb42 commented Apr 8, 2016

The sanity checker in the older version of Roma at http://tei.oucs.ox.ac.uk/Roma is still functioning as before (i.e. using an older release of the Guidelines). It will, however, be updated at some time in the near future, or so I assume.

It's written in PHP, so if that's in your skill set, feel free to adapt it!
The source is at https://github.com/TEIC/Roma/blob/master/roma/sanitychecker.php

@GideonK
Author

GideonK commented Apr 8, 2016

Thanks for the explanation @jamescummings. @lb42 Great, good to know about that one. Sanity prevails. :) (my latest schema is correct).

One last question: Could you perhaps point me to a changelog of the changes between 4.10 and 4.18?

@lb42
Member

lb42 commented Apr 8, 2016

Not sure: would https://github.com/TEIC/Roma/commits/master/roma not fit the bill?

@GideonK
Author

GideonK commented Apr 8, 2016

@lb42 OK, thanks, that might be useful.

@jamescummings
Member

I'm intentionally holding off updating that version of Roma (which I would have pointed to as http://tei.oucs.ox.ac.uk/Roma btw; same thing, I know, but one day the .oucs virtual host might vanish) until we're sure all the problems resulting from 3.0.0 have been fixed. I'm tempted to keep two of them running. ;-)

@martindholmes
Contributor

The sanity checker looks quite nicely written and well-documented. But Syd, Julia and I have been talking about an XSLT-based sanity checker that could do a great deal more. We should probably go in that direction if we can. I fear it would have to be integrated into OxGarage for Roma to use it, though.

@jamescummings
Member

@martindholmes sure, but if we were doing that, then wouldn't we want to be spending that energy rewriting Roma as an ODD editor in any case?

@martindholmes
Contributor

Writing an XSLT sanity checker and integrating it into OxGarage seems to me to be a much smaller task than rewriting Roma. It would also help us get some insight into OxGarage, which is arguably a more important thing for us to maintain than Roma (by definition, since Roma depends on it).

@jamescummings
Member

@martindholmes: Oh true. Wasn't suggesting that it wasn't or that it wouldn't be useful in itself. Just expressing my preference. ;-)

@jamescummings
Member

Looking at sanitychecker.php, I think it builds a list of all the RNG <define> elements to work through. These no longer exist in a Pure ODD context. I think it is the two functions here https://github.com/TEIC/Roma/blob/master/roma/sanitychecker.php#L239 that may need to be changed (but I've only given it a quick glance).
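As a rough illustration of why a <define>-based approach would come up empty, here is a minimal Python sketch under stated assumptions; it is not a port of sanitychecker.php. The function name `define_graph` and both XML fragments are hypothetical. It collects each RELAX NG <define> and the <ref> names inside it, then runs the same collection over a Pure ODD style content model that uses <elementRef> instead.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def define_graph(xml_text):
    """Map each RELAX NG <define> name to the names its <ref> descendants point at."""
    root = ET.fromstring(xml_text)
    graph = {}
    for define in root.iter(f"{{{RNG_NS}}}define"):
        refs = [ref.get("name") for ref in define.iter(f"{{{RNG_NS}}}ref")]
        graph[define.get("name")] = refs
    return graph


# Hypothetical RELAX NG fragment: <body> contains one or more <p>.
rng_fragment = f"""
<grammar xmlns="{RNG_NS}">
  <define name="body">
    <element name="body"><oneOrMore><ref name="p"/></oneOrMore></element>
  </define>
  <define name="p"><element name="p"><text/></element></define>
</grammar>
"""

# Hypothetical Pure ODD fragment expressing the same content model.
pure_odd_fragment = f"""
<elementSpec xmlns="{TEI_NS}" ident="body">
  <content><elementRef key="p" minOccurs="1" maxOccurs="unbounded"/></content>
</elementSpec>
"""

print(define_graph(rng_fragment))       # {'body': ['p'], 'p': []}
print(define_graph(pure_odd_fragment))  # {}
```

The second call returns an empty mapping because the Pure ODD fragment contains no RELAX NG elements at all, so a checker that only looks for <define> and <ref> would have no routes to follow.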

@jamescummings
Member

For now I've updated the live code, and done the same in the GitHub repository, commenting out the sanity-checker tab.
