Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Lenses #441

Open
wants to merge 21 commits into
base: main
Choose a base branch
from
Open

Add Lenses #441

wants to merge 21 commits into from

Conversation

mrjazzybread
Copy link
Contributor

Hello

This PR is the first (I guess second if you count the updated standard
library) batch of changes I am proposing to the Gospel type checker.

Lenses

One of the major changes we made to the type checker and the surface
language is lenses, lightweight notation to describe ownership of
values and their connection to their logical counterparts.

As an example, let us see how we might specify a queue using lenses:

type 'a queue
(*@ model view : 'a sequence *)

val push : 'a queue -> 'a -> unit
(*@ push q x
    consumes q @ 'a queue
    produces q @ 'a queue
    ensures q.view = cons x (old q.view) *)

The consumes clause of this specification states that we receive the
argument q with the lens 'a queue, which means we receive q with full
ownership. The produces clause states that we return the q with the
same lens and therefore the same ownership conditions. Since we have
full ownership of q, we can access its model in the specification.

Since receiving and returning the same ownership is a very common use
case, one can use the modifies which is desugared into a produces and
consumes.

val push : 'a queue -> 'a -> unit
(*@ push q x
    modifies q @ 'a queue
    ensures q.view = cons x (old q.view) *)

There is one more clause that we added to the language: preserves,
which is desugared into a produces, consumes, as well as marking the
argument as read only. For example, the following specification:

val length : 'a queue -> int
(*@ l = length q
    preserves q @ 'a queue
    ensures q.view = Sequence.length (q.view) *)

is desugared into

val length : 'a queue -> int
(*@ l = length q
    consumes q @ 'a queue
    produces q @ 'a queue
    ensures q.view = old q.view
    ensures q.view = Sequence.length (q.view) *)

Other Lenses

The only other lens currently available is the loc lens, which
describes having no ownership over OCaml values. For example, if we
have a type t and we want to write a function for physical equality:

type t
(*@ model v : integer *)

val ph_eq : t -> t -> bool
(*@ b = ph_eq x y
    preserves x @ loc
    preserves y @ loc
    ensures x = y *)

In the postcondition, x and y are of type t loc, meaning that their
model v is not accessible. In this case, the term x = y is comparing
the physical locations of the two values, not their logical values.

Defaults

When the lens is omitted in a specification, we assume full
ownership. This means we can rewrite the specification for push as:

val push : 'a queue -> 'a -> unit
(*@ push q x 
    modifies q
    ensures q = cons x (old q)*)

and get the same specification as before. This means that the changes
proposed to the type checker are backwards compatible, although the
representation of values within the AST has changed.

Additionally, if no ownership clause is given for a variable, then we
assume a preserves clause with full ownership. This means that the
length example could be rewritten as:

val length : 'a queue -> int
(*@ l = length q
    ensures l = Sequence.length *)

Logical quantifiers

Sometimes, we might wish to state a universal property about some
OCaml type which requires us to use a lens to make . For example, if
we wish to state that all queues are non-empty, we could write.

(*@ axiom non_empty : 
      forall q : 'a queue. q.view <> empty *)

Which is desugared into

(*@ axiom non_empty :
      forall q @ 'a queue : 'a queue. q.view <> empty *)

Future Work

Lenses are still quite limited, they only allow us to either claim
full or no ownership. In the future, we might want to add something in
between where we state that we own a data structure but not its
contents. For instance, we could rewrite the length example as:

val length : 'a queue -> int
(*@ l = length q
    preserves q @ loc queue
    ensures l = Sequence.length q *)

Since we use the loc queue lens instead of 'a queue, length only
requires partial ownership of the data structure. This entails that
the type of q in the ensures clause is 'a loc sequence. Although this
is not very difficult to add to the type checker, we have not worked
out what the semantics of this would be.

Unnamed model

When giving a model to an OCaml type we can now choose to have it be
unnamed, for example, the previous queue example can be rewritten as:

type 'a queue
(*@ mutable model : 'a sequence *)

val push : 'a queue -> 'a -> unit
(*@ push q x
    modifies q @ 'a sequence
    ensures q = Sequence.cons q x *)

The only details worth noting is that if a type has an unnamed model,
it cannot have any named model (rejected by the parser). Also, the
previous syntax for models remains the same, meaning if a user has a
single model, they can still , meaning this is also
backwards compatible

This also means that coercions can finally be removed and replaced
with these unnamed models.

Suite of examples

I also added a suite of examples for Gospel specifications, which
serve as an integrity check for lenses. Additionally, examples of
Gospel specifications are just a good idea to have in the repository
in general.

Default Specifications

The Gospel type checker inject a default type specification into types
and value declaration when one is absent. In the case of functions,
all their arguments are preserved. In the case of type specifications,
its model is assumed to be a reflection of its OCaml type and the type
is marked as not ephemeral

Removal of the mutable keyword

When defining model fields, it is no longer possible to mark one model
field as mutable and another as not mutable. This is because the
Separation Logic of such a scheme becomes very difficult to define in
the case of having nested mutable data structures.

In the case of a data structure where one of its model fields does not
change (e.g. an array with a constant capacity), we now write:

type 'a array
(*@ ephemeral
    model elts : 'a sequence
    model capacity : integer *)

And in each function we state that only capacity is modified

val set : 'a array -> int -> 'a -> unit
(*@ set arr i x
    modifies arr.elts
    ensures ...*)

It's a bit more painful to write, but it is much easier to define the
semantics of this specification.

Next PR

I still have another PR lined up, which will be for the translation of
Gospel into Separation Logic. I am splitting them in two for the sake
of keeping these PRs focused and manageable.

@n-osborne n-osborne self-requested a review January 16, 2025 14:14
Copy link
Contributor

@n-osborne n-osborne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR!

I still have another PR lined up, which will be for the translation of Gospel into Separation Logic. I am splitting them in two for the sake of keeping these PRs focused and manageable.

Thanks, this is already a big PR to review as it contains:

  1. introduction of the lenses (in function specifications)
  2. introduction of unnamed models (in type specifications)
  3. add default specifications
  4. removal mutable keyword (in type specifications)

I'll start the review with some design discussion of those different points, then move to a per commit code review.

About lenses

One of the major changes we made to the type checker and the surface language is lenses, lightweight notation to describe ownership of values and their connection to their logical counterparts.

I do like the lenses name, I believe it conveys more information than the previous spatial type.

There are some occurrences of spatial in the code, I'm not sure I completely understand the connection. Are there leftovers from previous version (in that case, they should be modified accordingly), or is there a meaningful connection (in that case, it should be made clear somewhere).

There is one more clause that we added to the language: preserves , which is desugared into a produces, consumes, as well as marking the argument as read only.

I appreciate the symmetry and aesthetic appeal to have preserves as dual of modifies (I really do :-)). But I'm wondering whether we need all the combinatoric and whether the name could then be misleading.

  1. modifies x @ loc: doesn't make any sense and is forbidden.
  2. modifies x @ type_name: we can only modify a value if we have full ownership anyway, the lens is then redundant information.
  3. preserves x @ loc: this is the only thing we can do with a loc anyway.
  4. preserves x @ type_name: this clause doesn't bring any information as read-only and full ownership are the default.

Based on that, I would expect modifies clauses to stay the same (no need to allow adding a lens as it can only be the one expressing full ownership). It seems that the one information we want to be able to express (in addition to what we can already express in the current Gospel) is that we restrict the lens to loc, expressing that we don't have any ownership on the value. I believe preserves is a bad keyword to carry this piece of information as it puts the focus on the read-only aspect.

About unnamed models

This fixes #392.

I agree with the design. If you feel like it, you can extract these modification in a new PR for faster merging.

This also means that coercions can finally be removed and replaced with these unnamed models.

🎉 Once this is included in the main version and if we make the decision to go this way, I'm happy to update #433

About removal of mutable keyword

I believe this is related to the previous point. Again, it can be extracted in another PR (the same as the previous point or another one) for faster merging.

type 'a array
(*@ ephemeral
    model elts : 'a sequence
    model capacity : integer *)

I really like it! Especially how the information about mutability and model are clearly separated.

I have to confess, I've always found the ephemeral keyword a bit strange. If we can find a better one it can be the opportunity to change it.

About default specifications

I'm not sure I understand the reason why? Note that I don't have any reason why not neither.

Per commit review

844cf18 Introduce unnamed model field

You introduce the Tast.model type. What is the meaning of the Self constructor? Maybe this would need some documentation?

Also, Fields that replaces the current encoding, revert the order between mutable and symbol information. This generates some subsequent modifications, otherwise unnecessary. (Comments has not been updated accordingly neither).

In Tmodule.add_sig_contents you define and use an auxiliary function get_field: model -> lsymbol list.
I believe it would be better defined next to the model type.

3772b27 Add lenses to AST

In Typing.process_val_spec, if you flip the arguments in the first definition of spatial_term, you avoid the definition of the second one and you can write

let wr = List.map (spatial_terms Modifies) vs.sp_writes
and cs = List.map (spatial_terms Consumes) vs.sp_consumes in
...

Although, those additions are removed in the 12th commit, so maybe they don't need to be here in the first place?

f6389b0 Type symbols store the logical model of the type

Tast.model and Ttypes.model seems more different than necessary
Why Default becomes Model?

I believe we should keep the corresponding constructors in the same order (when
there are no semantic reason to do something else, lexicographic order is nice)

I guess in ts_model : (bool * model) the boolean correspond to the mutability
information. This would be nice (for our future selves) to document it, either
informally with a comments, or even with an intermediate record type (also, we
do have a mutable_flag in Tast).

Some leftovers from rebasing in test/locations/

11298ba Disallow old in specification for Gospel functions

Makes sense but seems a bit orthogonal to the PR.

cf659f7 Change type checking to use lenses

I'm not sure I understand what is full in Dterm.full_env. Maybe it could be a
good idea to document what means this type and the reason why you introduce it.

Also, as the Dterm.full_env type comes with some utility function, placing it
in its sub-module could make some sense and ease finding names.

There are some left over, as the commit add union_env and env_union.

Refactoring of lb_arg: why?
This refactoring is increasing the distance between Gospel ast and OCaml one.
I believe we should try to maximize the similarities, so that familiarity is transferable.

This makes you introduce a symbol for unit, which is OK. But I'm not sure
none_id and none_vsymbol are good names for it.

As a rule of thumb, it makes more sense and it is easier to review if
refactoring and adding a feature are put in different commits.

In Typing, replacing:

let args_to_env l = List.fold_left add_arg Mstr.empty l
let largs_to_env l = args_to_env (List.map snd l)

by:

let largs_to_env l = List.fold_left (fun (_, x) -> add arg x) Mstr.empty l

will be more efficient (no allocation of an intermediate list).

Thanks for the documentation of Typing.process_args :-)
Note that traditional OCaml documentation uses [function_name arg1 arg2...]
to name and talk about the parameters rather than the @param tag.
The former is also the documentation style adopted in this project.

let prod_env = Mstr.union (fun _ -> assert false) arg_env ret_env in

It is not easy to follow what you are adding to Typing.process_val_spec.
Arguably, the function is already in bad shape, containing a lot of
intermediate functions.

Your modifications increase its size from 183 loc to 217 (in the lase commit).
If you can find a way to extract at least some of the intermediate functions
and some of the logic, it would be very nice.

88c90d7 Change logical quantifiers to use lenses

type dbinder: please document what the two dtys correspond to.

I suggest rewriting:

 | Tquant (_, vl, t) ->
     Svs.diff (t_free_vars t) (Svs.of_list (List.map (fun x -> x.bind_vs) vl))

as:

 | Tquant (_, vl, t) ->
     let aux x = Svs.add x.bind_vs in
     let bounded = List.fold_right aux vl Svs.empty in
     Svs.diff (t_free_vars t) bounded

for efficiency and readability reasons.

0b35ab8 Stack example rewrite with lenses

I believe it is "Stack example rewritten with lenses"
And a better commit message would be: "Rewrite Stack example using lenses"
The commit could be pushed a bit later in the PR, keeping the examples update after all the code modifications.

f27597e Remove mutable keyword

The commit mixes documentation with implementation changes.
Note that updating the documentation is most welcome! Thank you!
There is some mix between spaces and tabulation that pollutes the display of the diff.

@mrjazzybread
Copy link
Contributor Author

Hello

Thank you very much for your review, I will go over your points in
more detail later (I have a lot on my plate right now). The rebase I
just pushed is nothing special, just renaming all instances of
"spatial" with "lens". I will go over some of the simpler points you
raised.

Default specifications

When we define the Separation Logic semantics of a Gospel annotated
OCaml program, we have to decide how we want to translate an empty
specification. Instead of having the Separation Logic semantics module
decide the meaning of an empty specification, this is done during type
checking. I think this is better since, when translating Gospel to
some verification framework, one might either use the Separation Logic
semantics (e.g. My CFML prototype) or the Gospel typed AST
(e.g. Cameleer). Given this, both of these representations should
agree on what the default specification is when one is omitted. I
could have put this in the following PR, but I wanted that one to just
be the Separation Logic semantics and this PR have all (or at least
most of) the changes to the type checker.

Ephemeral keyword

Regarding the ephemeral keyword, maybe we could replace it with
mutable since it is no longer used.

Remove mutable keyword f27597e

Regarding this commit, do you want me to split the documentation into
another commit? Keep in mind that the code within the docs has to be
changed regardless since the documentation must type check for the
project to compile. If we split the documentation into another commit,
we will have a commit where the docs do not reflect the example
specification (which I suppose is fine).

Documentation

I was pretty lazy about documenting my changes to the Gospel type
checker and I do apologize for that 😐. As soon as I can,I will add
some comments guiding what is happening in val_process_spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants