-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Escaping of non SMT-LIB compliant names? #26
Comments
The same problem appears for "true" and "false", if they are created as integer variables. SMTInterpol cannot create variables with such names. It might also happen for other reserved tokens. |
SMTlib allows to escape names with |
Let's slow down a bit here, why do we have a variable with such a name in a first place? |
If we do that, the formulas in SMTInterpol will look different to formulas in any other solver, which would make transferring formulas between the solvers rather unpleasant. Of course, we can still escape names across all solvers, for both variables and UFs. |
What's the current state for this? I think the most important point is that declaring a variable like |
According to SMT-LIB documentation each variable called "x" can be referred We should test if that's indeed the case for all solvers and if that's the On Apr 4, 2016 12:49, "Philipp Wendler" [email protected] wrote:
|
Actually that turned out to be wrong: all solvers treat "x" and "|x|" to be different.
But solvers already have these utilities! And they use them! Just not in this case. I'm tempted to say that's just a bug in SMTInterpol, as all other solvers seem to be able to handle it. @PhilippWendler
Hence I suggest a following course of actions: I'll file a ticket with SMTInterpol developers, and for select/store/etc I would suggest escaping on CPAchecker level. |
Of course we can also have a workaround which would work now: apply escaping only for SMTInterpol, and only when declaring the function fails (catch exception and retry) |
I am not exactly sure what your proposal is, the two parts of the proposal seem to contradict each other. Anyway, I can only refer to my comment above (from April 4th), this summarizes my opinion. The only thing I would add is that if some user-visible escaping is applied anywhere, this would also need to be documented and utilities for unescaping should be provided. |
@PhilippWendler As to documenting, I'm not sure where it should go, any suggestions? |
If one solver needs escaping and this is not handled by JavaSMT, then JavaSMT's API does need escaping. One should not code against JavaSMT's API with only a specific solver in mind. If using JavaSMT's API needs escaping, then of course escaping utilities should be added to JavaSMT. It does not make sense at all to let every user of the API re-implement escaping. However, I still do not understand why you propose escaping in CPAchecker when at the same time you say solvers should do the escaping. And yes, all known problems with solvers should be documented if we cannot add workarounds. Documenting them once is much less effort than letting all users of JavaSMT rediscover them and spend effort debugging them. I suggest a section in the documentation. |
@PhilippWendler If there was a solver that does not accept variable names starting with a letter "a", does it mean we should have wrapping just for this solver? How far should catering for particular needs go? If we add an To me we have multiple evil options, and crashing on certain variable names is currently the least evil solution. |
Exactly.
But you say we do have this problem already because other solvers do escaping internally. So why is this a problem for SMTInterpol specifically? Why can't we add transparent unescaping for all solvers to solve the problem? You claim a performance problem, but the same performance problem would occur if users of JavaSMT apply the (un)escaping themselves. The most important criterion for the JavaSMT API in this case is IMHO that it should behave similarly for all solvers. Second criterion is to make it as easy to use as possible for users. Doing nothing satisfies neither of them. |
Except they don't have to, if they use a different solver, or don't create variables called
You seriously suggest to add all these problems in order to have a workaround for a problem in SMTInterpol? |
I still do not understand the performance argument. You just suggested to the SMTInterpol developers to add escaping, but you don't want to add the very same feature to JavaSMT because of performance? Why should this be slower in JavaSMT than in SMTInterpol? Why would users that escape themselves be affected if we escape Why should a workaround for SMTInterpol that makes it behave like the other solvers affect users of these other solvers? I would not expect any form of unescaping for SMT-LIB serialization, because escaping is simply required for SMT-LIB serialization (otherwise it would be invalid). I am pretty sure that the serialization output of all other solvers also contains escaping for |
Performance of what specifically though? I was saying that adding "|" on every single visitor traversal is very costly, what does this have to do with adding pipes in SMTInterpol when creating the symbol? Our problem (visitor provides a pre-computed String) does not apply to them.
I don't understand what this question is referring to.
I don't understand that question either, sorry.
So our output with your suggestion will be escaped twice, once by us, and once by the solver, giving a different string than without escaping. This can be bad. |
Performance of whatever you were claiming is a performance problem.
I cannot see any difference in code for the implementation of the escaping feature between doing it in JavaSMT and SMTInterpol itself. If you call
You were claiming that we doing escaping is a performance problem for users that do escaping themselves. I ask why this should be the case.
You were claiming that we doing escaping is a performance problem for users that use other solvers than SMTInterpol. I ask why this should be the case.
Why should there be dual escaping? I do not understand this at all. Our problem is that SMTInterpol does not do escaping, so if we give it an escaped variable I certainly do not expect SMTInterpol to escape it again when writing SMT-LIB output. So no dual escaping. |
If we only escape for SMTInterpol, how should serialization to SMT-LIB work? Should we undo our escaping first?
IIRC, SMTInterpol does do escaping for SMT-LIB serialization. If we escape as well, everything will be escaped twice. |
But why is this transparent unescaping more expensive if done by JavaSMT than if done by SMTInterpol? Why is the transparent unescaping a problem here if it is not a performance problem for all other solvers that already do it? If you really expect the unescaping for I simply propose to add the necessary adjustments to make SMTInterpol behave just like the other solvers. I do not see why we cannot do what all the other solvers successfully do.
Why should we do escaping for other solvers which manage to do it correctly themselves?
No, why?
If we give |
OK maybe I have misunderstood initially. |
@PhilippWendler The situation with pipe symbols in SMTInterpol is even more funny: it happily accepts them, treat the identifiers as different, and then crashes on printing models or context with an assertion error. |
I would simply file these crashes as bugs in SMTInterpol. |
@PhilippWendler It's not that simple, it's done on purpose. SMTInterpol decides to not accept such variables altogether, but does so in an inconsistent way, where the check is done only when performing the serialization, to SMT-LIB or to the model. I have filed this inconsistency as a bug, but the assertion would remain in any case. So what's your proposal now? We can still escape select/store/etc variables for SMTInterpol with some other symbol, e.g. with triangular brackets. That adds an extra layer of abstraction, and does not achieve a goal of being able to consistently create same variables across all solvers, as pipe symbols would still behave in an inconsistent way. Overall, trying to enforce consistency between different solvers gets us into a can of worms, and I'm not sure that this should be a goal of JavaSMT. |
I would go with escaping with triangular brackets or If pipes are supported inconsistently and we do not want to or cannot fix this, then maybe we should forbid them completely? The definition of JavaSMT is to provide a common API for a range of solvers, and an API does not only specify method signatures but also how the methods behave. So of course enforcing consistency is for me one of the main goals of JavaSMT. Users of JavaSMT should not have to know and follow a lot of solver-specific rules when using JavaSMT. |
@PhilippWendler Well OK great, how do we distinguish between our |
If we forbid variable names matching All this complexity of variable-name handling won't go away if JavaSMT refuses to handle it. The burden will be just be on each individual user of JavaSMT, and thus be much greater in the end. Everything that JavaSMT can do to improve consistency between solvers will have a very large benefit because otherwise the problem needs to be solved again and again by users of JavaSMT. |
On Fri, May 20, 2016 at 5:54 PM, Philipp Wendler [email protected]
|
But we have that problem already now! It is not an argument against doing our own escaping. If you want to fix this problem, then you need to add our own escaping that is used for all solvers.
Why should this be the case? It is not trivial to fix this problem in CPAchecker. In fact, the easiest place to fix this is in JavaSMT, because in JavaSMT would be the smallest amount of code that needs to be touched. The code that uses JavaSMT is much larger in size, and thus it is more difficult to handle it there and not forget a case. |
Should our API allow to create non-SMT-LIB compliant function names? Current status: Proposal 1: Proposal 2: I am in favour of the second proposal. |
For other solvers everything seems to work fine even if the formula is dumped. They do some escaping themselves. I wrote an email to Philipp before, maybe I should duplicate some of it here.
|
2016-05-27 11:18 GMT+02:00 George Karpenkov [email protected]:
Proposal 1: Then we have to check/assume that every function name given to
Proposal 1 will disallow "|" in all cases. However when parsing a formula,
I think, it is acceptable to have a difference between the dumped version Do you prefer the first proposal? |
Maybe I should re-formulate my intents (for proposal 2):
In all cases, the user has the possibility to insert invalid names into the solver, either via the SMT-LIB-code (Proposal 1 escapes only names given in the API, and some solvers support more parsing than others) or via the API (Proposal 2 allows the usage of arbitrary function names, in contrast to a SMT-LIB-compliant dump). There will always be a difference between the SMT-LIB-interface and the API-interface. And in my opinion, the API-interface should allow more possibilities to create function names. The only problem remaining is to guarantee identical behaviour for all solvers, and currently only SMTInterpol has problems with this. Another solution would be to ignore this topic and fully move the responsibility for creating and providing valid names towards the user. This might be the easiest solution, as it requires no code change for JavaSMT and is (at least partly) obvious to the user. But I do not prefer this option. |
Right. Is the problem specific to Mathsat? Admittedly, I haven't tried much, but for the cases I have tried Z3 managed to escape things with pipe symbols.
OK I see. In practice, things are usually escaped using pipe symbols (as it's a no-op under SMT-LIB semantics), yet SMTInterpol crashes on input pipes.
If I have a difficult formula I have constructed using JavaSMT, I should be able to dump it to e.g. report it to solver developers, or to store it. Execution on the dumped formula should be exactly the same as the execution on the formula as constructed inside JavaSMT.
I prefer not doing anything. Crashing on weird formula names, albeit unpleasant, is very obvious and very easy to fix for a client. Too-smart-for-it's-own-good-behavior which tries to do some transparent escaping and results in weird edge cases (e.g. serialized formula is not the same as what user is expecting) is IMO much worse. |
@kfriedberger In practical terms, do you have a solution which:
|
Sorry for bringing this up again, but this issue is closed and I cannot figure out, whether it will be fixed at some time or is considered as "won't fix". I ask because I have the problem again with a file that has a method called I see that a lot of things have to be considered with this, hence I'm fine (at least for this case) if it will never be fixed, just wanted to ask out of curiosity. |
@stephanlukasczyk I think we have failed to reach a productive conclusion.
I still feel that a much simpler way to solve this is in the |
Is there a chance this issue will result in backwards-incompatible changes? Then it should be done before 2.0. |
@PhilippWendler will it satisfy everyone if we simply crash on all non-SMT-LIB compliant names in |
A consistent handling across all solvers would at least be something, yes. A problem I see with rejecting names is that the list of names to be rejected depends on the SMT theories we support. Whenever we add a new theory, there will be new operator names that need to be blacklisted, meaning that adding new theories can break existing users' code. So at least JavaSMT needs to offer utilities for checking whether a name is forbidden and for escaping and unescaping, such that not every user needs to reimplement this themselves (and worse, keep a duplicate of the blacklist that JavaSMT uses). And if we provide such utilities, what was the reason again for not just applying them directly in JavaSMT? |
Would they? From my understanding, if the solver rejects a name, it always rejects it, JavaSMT support or not. It also seems a mostly theoretical concern, given that from what I remember we only have issues with
Yes.
Extra complexity and a non-intuitive behavior (the actual formula is not what the user expects). |
Also, what exception should we throw on bad var names? |
Can we add |
Note that Z3 4.7.1 behaves differently for variable names that match |
Currently all solvers and our API seem to be valid in the following points:
In my opinion, we should not escape user-given Strings as part of JavaSMT, because we can handle them directly without further problems. |
With commit 0dc6a4b we do no longer allow symbol names that are not SMT-LIB2-compliant. |
Now we forbid problematic variable names. Will we add an (optional) utility for escaping and unescaping variable names, such that not all users need to implement this themselves, which is complex and error-prone (and problematic because such an escaping would need to be adjusted every time JavaSMT changes its rules for variable names)? I suggested this already last year. |
This could be done as another (optional additional) software layer around JavaSMT.
There might be more problems than solutions through this approach. |
A first step would be to at least provide For me it was never a goal to add this to the dumping or parsing of SMTLib formulas, we can use the escaped names there. |
With SMTInterpol, the following code fails:
Exception:
The reason is that
select
is already an existing interpreted function in SMTInterpol.I think that JavaSMT should transparently escape such reserved names for variables and functions by quoting them. Probably
select
is not the only case where this needs to be done.The text was updated successfully, but these errors were encountered: