Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor models to satisfy desired invariants #49

Merged
merged 4 commits into from
Sep 5, 2019

Conversation

rohanpm
Copy link
Member

@rohanpm rohanpm commented Sep 4, 2019

Our model classes had various bugs and weaknesses. This pull request
fixes them and adds tests to ensure that the same mistakes can't happen
on new classes.

The original motivation here was to fix up inconsistent usage of tuples vs lists
for immutability, but testing found several additional bugs to be fixed, most
notably that the recently added maintenance classes had fields using an
entirely different type than what was declared in the model.

The changes made here include:

  • added tests which verify that all public attrs-using classes meet certain invariants,
    such as "actual types match declared types" and "all fields are immutable"
  • added a FrozenList immutable list type and used that for all model fields which were
    formerly using tuples or plain lists
  • fix incorrect implementations of Maintenance classes which used wrong types
  • make Page immutable too (seems like an oversight that it wasn't already)
  • bump version to 2.0.0 because these changes are not strictly backwards-compatible
    (although I doubt that any current users are affected).

More information on each of these changes can be found in the commit messages.

It turns out there are a few mistakes easy to make when defining
new model classes (and we've made them on almost all classes
defined so far). These mistakes can break expected usage of hash/==
and a few other things.

Before fixing this, let's add some tests to cover the problems.
These tests will verify that model objects satisfy basic invariants
such as "actual types match declared types" and hash/==
functionality.

As can be seen in the test, every model class right now
is unable to pass the tests and so are marked as XFAIL.
The intent is for all public model classes to be immutable.
However, a problem arises when a model is expected to have a list
field, as lists are mutable and not hashable.

Previously, the attempted fix for this was to use tuples for those
fields instead.  This was done since a tuple is the closest thing
to an immutable list built-in to Python.

Unfortunately it turns out there are some undesirable properties
from using tuples.  The biggest issue is that most APIs are defined
in terms of lists and developers are usually picking lists as the
default choice for an iterable.

This means that, even if we define e.g. signing_keys to be a tuple,
a dev is likely to write:

  Repository(signing_keys=['foo', 'bar'], ...)

And we can make that "work" by adding a conversion to tuple, but
unfortunately a tuple is not equal to a list of same length and
contents, which results in unexpected behavior:

  # assertion fails!
  assert Repository(signing_keys=['key']).signing_keys == ['key']

It seems that if we continue using tuples, users of these classes
would have to be constantly vigilant to remember whether a value
is to be passed and compared as a tuple or list, which is quite
user-unfriendly.

So, to keep the ease-of-use of lists while avoiding the downsides
of tuples, let's define our own immutable list type, FrozenList.
This type is just like a list except that no mutation is allowed,
and instances can be hashed (like a tuple).

This class should be used on attributes like:

- use default=Factory(FrozenList) if you want to default to an empty
  list

- use type=list, and not FrozenList, because `type` acts as the
  documented type and we are not committing to using this specific type

- use converter=FrozenList to convert from input iterables into a
  FrozenList

The resulting behavior is as follows:

    # can create with a plain list
    repo = Repository(signing_keys=['key'])

    # assertion passes
    assert repo.signing_keys == ['key']

    # attempting to modify crashes
    repo.signing_keys.append('other-key')
Most model classes had some bugs causing them to fail invariant
checks. Fix them all. The most common issue was missing validation
against type.

Two larger changes include:

- MaintenanceReport: some fields were declared and documented
  as type=datetime, but all of the code was implemented using
  strings (see related test update). This has been fixed to use
  datetime throughout.

- Page previously was not frozen at all. Now it is, like the other
  model classes.
Recent model changes have adjusted the return type of some methods
and made certain behavior more strict. Though there probably aren't
affected clients right now, this is technically
backwards-incompatible, so the major version should be bumped.
default=attr.Factory(tuple),
type=tuple,
default=attr.Factory(FrozenList),
type=list,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not "FrozenList"?
In docs there's:

Please note that attrs doesn’t do anything with this metadata by itself. You can use it as part of your own code or for static type checking.

having list here makes it confusing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attrs docs aren't quite accurate, or at least, aren't telling the whole story. These types do actually make it into the API as they are added as Python3 type annotations, where tools like pylint, mypy and sphinx look at them. I expect them to end up in our docs eventually, though it's not quite working right now (#24).

For example, before this change the declared type of 'distributors' argument shows up in API as:

>>> inspect.signature(Repository.__init__).parameters['distributors'].annotation
<class 'tuple'>

After this change, it will be 'list'. I do believe list is more appropriate than FrozenList here because:

  • setting type=FrozenList is exposing that type as part of our public API, but I don't want to commit to returning instances of exactly that class
  • given that it's used as type annotation on the constructor, the appropriate usage seems more like 'declare this as the type you accept' and not 'declare this as the type you return/store'. So, type=list is fine as the constructor does indeed accept any type of list and doesn't have to be passed exactly a FrozenList.

@@ -0,0 +1,35 @@
class FrozenList(list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What the benefit of having custom class instead of tuple?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote explanation for that in 75790f1 commit message. In summary: the previous solution was to use tuple but it has usability issues because developers choose lists as the default container and tuples aren't as compatible with them as it may seem at first glance. For instance (1, 2, 3) == [1, 2, 3] isn't true.

That means, if it continued using tuples, developers have to get in the habit of remembering where they need to write a tuple and where they need to write a list, or writing tuples everywhere (which is a tough ask), or running into weird bugs.

@rohanpm rohanpm merged commit 9e38dc8 into release-engineering:master Sep 5, 2019
@rohanpm rohanpm deleted the model-invariants branch September 5, 2019 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants