feat: multi-plugins with extra schemas #231

henryiii · 2025-02-26T17:57:48Z

This closes #144 and closes #134, using the "That is a good approach!" approach. :)

This adds a new entry point, validate_pyproject.multi_schema, that returns a dictionary with tools as well as extra schemas. We could later add more optional keys to this dict, so it's flexible, and the tools table allows fragments to be specified. This would allow validate-pyproject-schema-store to return all of it's known schemas in one entry point, rather than having to add an entry point for every table (and would allow the extra schemas to finally be present).

A few points for review:

Check to make sure names are reasonable, I didn't think too long and hard about names.
Should tools be a required key (currently not)? (Someone could always add an empty dict if they specified the rest with classic entry points)
Should one of the two take priority over the other? For back-compat, validate-pyproject-schema-store will likely provide both for a while. Currently this is just de-duped based on either the tool name, if present, or the schema ID otherwise, and multi plugins are loaded afterwards, so I think they take priority, but I could switch it (and test it).
Should filtering be any different?
I'm pulling help text from the description

Signed-off-by: Henry Schreiner <[email protected]>

abravalheri

Thank you very much for working on this @henryiii, sorry for the delay in reviewing.

Check to make sure names are reasonable, I didn't think too long and hard about names.

Names are hard 😅. Seems reasonable. What is the rationale behind StoredPlugin? (I suppose the vision is that they come from SchemaStore).

Should tools be a required key (currently not)? (Someone could always add an empty dict if they specified the rest with classic entry points)

I am OK with that.

Should one of the two take priority over the other? For back-compat, validate-pyproject-schema-store will likely provide both for a while. Currently this is just de-duped based on either the tool name, if present, or the schema ID otherwise, and multi plugins are loaded afterwards, so I think they take priority, but I could switch it (and test it).

I don't have a strong opinion about this. I think the important part is to ensure consistency and replicability, which sorting is already doing.

Should filtering be any different?

Possibly also apply the filtering to the "multi" entry-points for the sake of symmetry? I don't foresee a lot of usage of this feature out of the context of tests (do you?), so it should not matter too much.

I'm pulling help text from the description

The help text is just a hint to show in the CLI (enable/disable bits). I suppose it is intended to be a "oneliner" about the plugin. Getting the description sounds reasonable. Maybe we want to truncate on the first line.
(I need to review that for setuptools and distutils plugins in a separate PR because now they are not displaying anything useful).

Very minor brainstorm, just to share (but I don't think it makes sense to spend much energy on it):

It seems that we are walking towards having a "generic" type of schema, and then a specialisation for schemas that also represent "tools".
Do you think we should work on the hierarchy of the plugin classes? Does it make sense to have PluginWrapper supporting the "generic" plugin that is loaded by "multi" and then a subclass just for the "tools"? Basically:

class PluginWrapper:
    # ...the implementation proposed for StoredPlugin in this PR

class ToolWrapper(PluginWrapper):
    # the methods that are now in PluginWrapper right now

(Possibly when I first named PluginWrapper I should have used a better name like SchemaWrapper, but it is not a concern for now :P)

abravalheri · 2025-03-10T18:07:36Z

src/validate_pyproject/plugins/__init__.py

@@ -105,23 +133,42 @@ def load_from_entry_point(entry_point: EntryPoint) -> PluginWrapper:
        raise ErrorLoadingPlugin(entry_point=entry_point) from ex


+def load_from_multi_entry_point(
+    entry_point: EntryPoint,
+) -> Generator[StoredPlugin, None, None]:


I find it easier to read Iterator[T] instead of Generator[T, None, None] and understand what the function does (we don't have to remember which argument order for the return, yield and send).

Since it does not affect too much the typechecking, I tend to stick with the simpler approach. Is there a benefit in using Generator?

This is a habit I have due to a problem in typing: things like contextlib.contextmanager takes an Iterator because people usually type these as iterators. This is incorrect, as contextmanager calls .close, which is part of the Generator protocol, not the Iterator protocol. So mypy can't find a mistake like passing an Iterator instead of a Generator to things like contextmanager. The reason they gave for not fixing it was that everyone types these as Iterator. So I never type it as Iterator. :) Python 3.13 also added single argument Generator[T], largely to help with this I believe.

src/validate_pyproject/plugins/__init__.py

Co-authored-by: Anderson Bravalheri <[email protected]> Signed-off-by: Henry Schreiner <[email protected]>

Signed-off-by: Henry Schreiner <[email protected]>

henryiii · 2025-03-11T15:22:53Z

It seems that we are walking towards having a "generic" type of schema, and then a specialization for schemas that also represent "tools".

That's not what these two things represent. One is a plugin that is already loaded, and the other is one that is lazily loaded. Either one can have an empty tool name, which makes it a generic schema and not a tool-related one, but the key difference is lazy vs. not lazy. If you are fine with renaming, I think LazySchema and StoredSchema, for example, might be better.

Signed-off-by: Henry Schreiner <[email protected]>

henryiii · 2025-03-12T17:04:50Z

Added a test for the ordering. Currently multi plugins override matching classic plugins. Since they can provide helper schemas too, I think that's a fine ordering. Also I'm sorting the multi entry points by the entry point name, so which overrides if multiple plugins provide matching tools is predictable.

abravalheri

Thank you Henry.

That's not what these two things represent. One is a plugin that is already loaded, and the other is one that is lazily loaded. Either one can have an empty tool name, which makes it a generic schema and not a tool-related one, but the key difference is lazy vs. not lazy. If you are fine with renaming, I think LazySchema and StoredSchema, for example, might be better.

We could change the name to LazySchema, that would be OK for me, but I don't think we need to tackle this now...

abravalheri · 2025-03-12T17:29:55Z

src/validate_pyproject/plugins/__init__.py

+        for e in sorted(
+            iterate_entry_points("validate_pyproject.multi_schema"),
+            key=lambda e: e.name,
+            reverse=True,
+        )


Also I'm sorting the multi entry points by the entry point name, so which overrides if multiple plugins provide matching tools is predictable.

Thank you very much @henryiii.

I am trying to understand the need of sorting twice to ensure replicability.
Ideally I would like to avoid sorting multiple times.

I wonder the following:

If we produce the iterators as:

tool_eps = ( (0, e.name, load_from_entry_point(e)) ... ) multi_eps = ( ((1, e.name, p) for p in load_from_multi_entry_point(e)) ... )

Could we remove the sorted in multi_eps, and then change the sorting key for something like:

def _predictable_sorting(element): priority, name, plugin = element return (priority, name, plugin.tool or plugin.id)

And then

sorting = sorted(eps, key=_predictable_sorting) dedup = {(e.tool or e.id): e for _, _, e in sorting}

(All code is untested)

Observation 1: Not sure we need priority... It may be the case we can simply get rid of it and obtain good enough results.
Observation 2: That eps: Iterable[Union[StoredPlugin, PluginWrapper]] explicit annotation will become a nuisance... I wish the typechecker could automatically infer it.

This is not sorting¹ twice, and it is not related to which type of entry point takes priority; to switch that, we just chain in the other order. This sort is sorting the multi entry points by name; this happens before they are called and it produces the Schemas. If there are two multi entry points producing 5 tools, this sort is over two items, the later sort is over the five tools. It adds stability if two plugins both define a multi entry point that produces the same name. So, for example, let's say a user has two plugins:

validate-pyproject-schema-store defines a multi entry point with the name schema-store that produces a bunch of tools, including "cibuildwheel"

A hypothetical my-schemas package defines a multi entry point with the name my-schemas that also includes "cibuildwheel"

By sorting the multi entry points before rendering them, that ensures that my-schemas's "cibuildwheel" is always the one chosen, making it repeatable.

Also, if a user has cibuildwheel itself installed, which has a single entry point that produces "cibuildwheel", that will not take priority over the multi entry points. If it provided a multi entry point, then the name of the multi entry point would control the override order. Maybe we want single entry points to take priority, but since they are less powerful, I left it this way, but could switch if that makes more sense.

Footnotes

The same items, that is. ↩

Ah, I see you are adding the name information. I can try that.

Okay, pushed something.

Signed-off-by: Henry Schreiner <[email protected]>

henryiii added 3 commits February 26, 2025 10:57

refactor: entry-point collection changes

f4b610c

Signed-off-by: Henry Schreiner <[email protected]>

feat: multi-plugins

50efc03

Signed-off-by: Henry Schreiner <[email protected]>

feat: support fragments

ff44b37

Signed-off-by: Henry Schreiner <[email protected]>

henryiii force-pushed the henryiii/feat/multi2 branch from 3790521 to ff44b37 Compare February 26, 2025 20:12

henryiii added 3 commits February 26, 2025 15:51

tests: add some tests for more coverage

ec8c6b9

Signed-off-by: Henry Schreiner <[email protected]>

refactor: use PluginProtocol instead of Union when approriate

407e378

Signed-off-by: Henry Schreiner <[email protected]>

docs: drop an extra blank line

a19460e

Signed-off-by: Henry Schreiner <[email protected]>

abravalheri approved these changes Mar 10, 2025

View reviewed changes

Update src/validate_pyproject/plugins/__init__.py

f5f2329

Co-authored-by: Anderson Bravalheri <[email protected]> Signed-off-by: Henry Schreiner <[email protected]>

henryiii force-pushed the henryiii/feat/multi2 branch from 077b1d0 to f5f2329 Compare March 11, 2025 15:12

fix: make tools required

3f3caf1

Signed-off-by: Henry Schreiner <[email protected]>

henryiii added 2 commits March 12, 2025 12:20

tests: add test for ordering

0d6dd3e

Signed-off-by: Henry Schreiner <[email protected]>

refactor: sort multiplugins, repeatable interaction

ccd5559

Signed-off-by: Henry Schreiner <[email protected]>

abravalheri reviewed Mar 12, 2025

View reviewed changes

refactor: sort in one step

fe098a9

Signed-off-by: Henry Schreiner <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: multi-plugins with extra schemas #231

feat: multi-plugins with extra schemas #231

henryiii commented Feb 26, 2025 •

edited

Loading

abravalheri left a comment

abravalheri Mar 10, 2025

henryiii Mar 11, 2025

henryiii commented Mar 11, 2025

henryiii commented Mar 12, 2025 •

edited

Loading

abravalheri left a comment

abravalheri Mar 12, 2025 •

edited

Loading

henryiii Mar 12, 2025 •

edited

Loading

henryiii Mar 12, 2025

henryiii Mar 12, 2025

feat: multi-plugins with extra schemas #231

Are you sure you want to change the base?

feat: multi-plugins with extra schemas #231

Conversation

henryiii commented Feb 26, 2025 • edited Loading

abravalheri left a comment

Choose a reason for hiding this comment

abravalheri Mar 10, 2025

Choose a reason for hiding this comment

henryiii Mar 11, 2025

Choose a reason for hiding this comment

henryiii commented Mar 11, 2025

henryiii commented Mar 12, 2025 • edited Loading

abravalheri left a comment

Choose a reason for hiding this comment

abravalheri Mar 12, 2025 • edited Loading

Choose a reason for hiding this comment

henryiii Mar 12, 2025 • edited Loading

Choose a reason for hiding this comment

Footnotes

henryiii Mar 12, 2025

Choose a reason for hiding this comment

henryiii Mar 12, 2025

Choose a reason for hiding this comment

henryiii commented Feb 26, 2025 •

edited

Loading

henryiii commented Mar 12, 2025 •

edited

Loading

abravalheri Mar 12, 2025 •

edited

Loading

henryiii Mar 12, 2025 •

edited

Loading