Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Documentation of Non-Trivial Operations #403

Open
baierd opened this issue Oct 11, 2024 · 3 comments
Open

Improve Documentation of Non-Trivial Operations #403

baierd opened this issue Oct 11, 2024 · 3 comments
Assignees
Milestone

Comments

@baierd
Copy link
Collaborator

baierd commented Oct 11, 2024

Some operations in SMTLib2, for example Integer modulo() and BV smodulo() (signed modulo), behave differently. Our documentation does not reflect this properly or fails to explain the differences in a well-understandable way. Additionally, we could update the documentation of some operations that are not well understandable. We should take a closer look at our public API documentation and update it accordingly.

Examples:

  • Integer modulo
  • BV smodulo
  • BV remainder
  • BV division
  • Integer modularCongruence
  • Numeral division
@baierd baierd self-assigned this Oct 11, 2024
@baierd baierd added this to the 5.1 milestone Oct 11, 2024
@PhilippWendler
Copy link
Member

PhilippWendler commented Oct 11, 2024

As one example of what I see that could be improved, I looked at the current JavaDoc of Integer modulo:

* Create a formula representing the modulo of two operands according to Boute's Euclidean
* definition. The quotient (div numerator denominator) of the internal modulo calculation is
* floored for positive denominators and rounded up for negative denominators.
*
* <p>If the denominator evaluates to zero (modulo-by-zero), either directly as value or
* indirectly via an additional constraint, then the solver is allowed to choose an arbitrary
* value for the result of the modulo operation (cf. SMTLIB standard for the division operator in
* Ints or Reals theory).
*
* <p>Examples:
*
* <ul>
* <li>10 % 5 == 0
* <li>10 % 3 == 1
* <li>10 % (-3) == 1
* <li>-10 % 5 == 0
* <li>-10 % 3 == 2
* <li>-10 % (-3) == 2

My thoughts:

  • Is there a link for this "Boute's Euclidean" definition? It does not ring a bell for me and I would have to google it.
  • What is "the internal modulo calculation" that is mentioned? Is this relevant for users? This part might be better rewritten.
  • There are these formulas like n = k*d + r that specify exactly which condition the result fulfills. Can't you write something like that here, i.e., an actual formal definition? Something like "modulo(n, d) == r holds iff ...".
  • I would generally avoid % in the examples. This may still be ok here, but why suddenly use an operator symbol in the examples that does not appear anywhere else in the text and is therefore not defined at all? And with smodulo/remainder it really causes confusion. The examples should simply use the function name, i.e., modulo(10, 5) == 0 etc.
  • And is it true for integer modulo that it is the same or similar or explicitly different to the % operator in common programming languages? Then it would be helpful for users to discuss this. Something like "Note that this function implements the same semantics as the % operator in programming languages like C or Java (with the obvious exception of applying it to unbounded integers instead of bitvectors)."
  • "then the solver is allowed to choose an arbitrary value for the result of the modulo operation": What does that mean exactly? According to the SMTLib standard, is the result nondeterministic, i.e., all values are possible? Or is the result always a specific value, but the value differs depending on the solver? Or is both permitted by the standard? These are all relevant different cases and it should be clarified which of them applies.
  • It would be best to link to the (relevant section of) the SMTLib standard.
  • For those theories were both remainder and modulo are defined, an explicit cross-reference and discussion of the differences would be good.

@baierd
Copy link
Collaborator Author

baierd commented Feb 27, 2025

Thank you for your input!

I've extended the discussed methods JavaDoc with many of your thoughts.

Some are not really applicable however:

  • "And is it true for integer modulo that it is the same or similar or explicitly different to the % operator in common programming languages?" What is a common programming language? Also, i would argue that the user has to check whatever is applicable here anyways.
  • "It would be best to link to the (relevant section of) the SMTLib standard." I 100% agree. But thats not possible, as the standard is... lets say a little scattered. Most information can be found by simply going to the standard and checking the page of the appropriate theory. I don't see a benefit in linking that.
  • "For those theories were both remainder and modulo are defined, an explicit cross-reference and discussion of the differences would be good." While i agree in general, we explicitly state the definitions and the differences are a result of those. A user has to understand what the function is doing anyways. Also, we might actually make a mistake in describing the differences.

Note: we currently have not only the most explanation for those functions out of any solver or library, but exceed the standard itself!

@PhilippWendler
Copy link
Member

I've extended the discussed methods JavaDoc with many of your thoughts.

Do you mean b1075c7 or 3882ab0 or something else? Referencing the issue number in the commit message would help to see the commits here and be able to look at the changes.

In the latter commit I see that (dividend = quotient * divisor + remainder) is part of the documentation of both smodulo and remainder. But the two methods differ, so the equation can only hold for one of them, right? Which one is it, and what is the equation that defines the result of the other method?

Also, I note that the result of the smodulo operation is called "remainder", isn't that confusing?

  • "And is it true for integer modulo that it is the same or similar or explicitly different to the % operator in common programming languages?" What is a common programming language? Also, i would argue that the user has to check whatever is applicable here anyways.

Why would you argue that this is the case? If JavaSMT tells me "this operation is the same as the modulo operator in C and Java", then as a verifier developer I would not have to check anything on my own, I would simply call the respective method and be done with it. This would be a significant simplification of my work in this situation.

And the question "What is a common programming language?" is just an irrelevant distraction from my suggestion to add such a statement. Even if you only know about one programming language and add this information to the JavaSMT documentation, this would already help. I am not even asking for research of this topic, but I know that you know the answer to this at least for C, and I am saying that it would be better if you would add this knowledge to the JavaSMT documentation. And if you also know it for Java and maybe something else, even better. I was just saying "common programming languages" because I would not make it a goal to list every programming language here, just some of the more relevant ones.

  • "It would be best to link to the (relevant section of) the SMTLib standard." I 100% agree. But thats not possible, as the standard is... lets say a little scattered. Most information can be found by simply going to the standard and checking the page of the appropriate theory. I don't see a benefit in linking that.

I do not understand why scattered information should make it impossible to link it? I would say that scattered information would even make it more important to link it, because users are less likely to be able to find the relevant information themselves!

You say "can be found by simply going to the standard and checking the page of the appropriate theory". But how do I "simply go to the standard"? Users who are not as familiar with SMT-LIB as the JavaSMT developers (like myself, for example) will not immediately know where it is. If I search for it on the internet I find https://smt-lib.org/language.shtml, but it has three different versions of the standard, even if I exclude the "previous versions". Which one should I use? Some JavaSMT Javadocs explicitly reference SMT-LIB version 2.6, does this mean that I should always use version 2.6., or does this mean that I should use version 2.6 where mentioned and the the latest version 2.7 for all other methods? I checked the JavaSMT readme, but it also does not tell me which SMT-LIB version is relevant. And then, if I want to read version 2.6, there are 4 different versions of SMT-LIB version 2.6 linked, all with the same version number. Which would should I then use?

And how I "check the page of the appropriate theory"? If I look at the table of contents of SMT-LIB 2.6 or SMT-LIB 2.7, there is no list of chapters for theories. In fact, terms like "float" and "bitvector" do not appear at all in the table of contents, and so I am completely lost and do not know where to search further.

So, please, add proper and unique references to the documentation, for example something like "SMT-LIB 2.6 Release 2024-09-20 Section x.y.z". And wherever possible make that reference a link to the appropriate place or as close enough to the target as possible (if no deep links to sections are possible, please at least add a link to the whole standard).

As an example, consider all the countless discussions in SV-Benchmarks, CPAchecker, probably similar tools, etc. about the intricacies of the C standard. There we all the time use something like "C11 §6.9.a.b (x)" instead of "cf. C11 for the pointer-dereference operator". References like this are what I as a user for JavaSMT need to be able to follow to the source of the definition.

  • "For those theories were both remainder and modulo are defined, an explicit cross-reference and discussion of the differences would be good." While i agree in general, we explicitly state the definitions and the differences are a result of those. A user has to understand what the function is doing anyways. Also, we might actually make a mistake in describing the differences.

But what is the actual downside of adding such short cross references to these particular cases where there are countless discussions and misunderstandings? The fact that each operation should be defined itself is a necessary condition for good documentation, but not necessarily sufficient. After all, you also have examples in the documentation because you find them valuable, but the example results are also just a result of the definitions of the operations.

The argument that you might make a mistake also only indicates how important it is here to have as much and as clear documentation as possible. If you, who has spent months deeply involved in this topic, are not confident that you get it right, then probably every single JavaSMT user (including myself for sure!) is even more likely to get it wrong on their own. And of course, there are several people who could review what is written in the documentation.

Note: we currently have not only the most explanation for those functions out of any solver or library, but exceed the standard itself!

That doesn't really help those people who still do not find the information they need in the JavaSMT documentation, it makes it just more important for JavaSMT to provide lots of good documentation (and it provides the potential for JavaSMT to become even more useful, well-respected, and famous in the SMT community).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants