Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

openapi manager should support multiple documents for versioned APIs #7564

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

davepacheco
Copy link
Collaborator

@davepacheco davepacheco commented Feb 19, 2025

Goal: update the openapi-manager (cargo xtask openapi) to support multiple versions of OpenAPI documents. This is aimed at supporting client-side and server-side-only versioning as described in RFD 432 ("Versioning for internal HTTP APIs"). Critically, we want this workflow to be as simple as possible and as close as possible to the existing one for non-versioned APIs. And it should be really hard to accidentally mess things up (e.g., break compatibility with an older version that's supposed to be supported).

Status: I think this is basically ready for review. There's still some work left (see below) but I don't expect it to change the shape of this all that much and it'd be nice to parallelize here.

This change includes:

  • A rewrite of the OpenAPI manager. The user interface is basically unchanged. But the guts required a rewrite because the previous one was oriented around a simpler model of one OpenAPI document per API, plus maybe one or more extra validation files. The new one is aware of versions and many different types of problems (e.g., extra files, some types of files shouldn't be changeable, etc.) and various kinds of fixes for each one.
  • The dns-server API has been converted from lockstep to versioned. Well, kind of. We still don't have progenitor clients should specify API version progenitor#564, so I'm not able to update the DNS server implementation to support multiple versions. But in terms of the OpenAPI documents, this API is now versioned.

I realize this is a big change. (At least it's mostly new code?) But I didn't see a way to break it up well. If there's anything I can do to help review (e.g., video chat walkthrough), let me know. I'd suggest looking at stuff in this order:

  • dev-tools/openapi-manager/README.adoc: particularly "More about versioned APIs". This gives a brief overview of the goals and how the tool works now.
  • Some changes to the CLI-related parts:
    • The CLI-related files (src/{dispatch,check,generate,list}.rs) have been moved into src/cmd. Some have also been rewritten but they're logically equivalent to what they did before.
    • I'd suggest looking at dispatch.rs, then the rest of the PR, then coming back to the rest of this directory.
    • There's also a new "debug" command that prints out precisely what the tool found from all the sources it was given and how it resolved everything. Normally, the "check" and "generate" commands take care of interpreting all this and printing user-facing messages. The "debug" command is really only for understanding what happened when the tool did something unexpected.
    • "check", "generate", and "debug" all accept the same set of arguments that let you override how the tool finds blessed files, local files, and generated files. This is mainly intended for debugging and testing, though it might be useful for users to override the parent branch for blessed files when using stacked PRs or the like.
  • the guts of "collecting information from all the sources":
    • environment.rs: shows how the tool collects information from its various sources
    • spec_files_{blessed,generated,local}.rs: these are relatively small modules that use a common builder to build up a model of OpenAPI documents found
    • spec_files_generic.rs: has the common stuff used by the specific three sources
  • resolved.rs: the guts of the logic of this tool: given what we found, figure out what to do
  • go back to the CLI parts skipped: "list", "check", and "generate" commands
  • dns-server-api / clients/dns-service-client`: shows what the changes to a versioned API look like
  • anything else

If you want to see this in action: I demo'd this at last Friday's demo day. It's somewhat anti-climactic because the workflow is almost the same for versioned APIs as it was for APIs before (those are now called lockstep APIs). If you want to see exactly what's changed, check out the dns-server-api / clients/dns-service-client changes. This is also described in the README.

Work I plan to do before landing this (but don't want to block review on):

  • Add a few small unit tests.
  • You'll note a bunch of XXX-dap-last-step: these are all basically the same thing. To reduce the risk of mismerge with other changes to src/spec.rs upstream, I have kept the API definitions in src/spec.rs in roughly the same format as they were before. But before I land this, I intend to replace all those with literal definitions of ManagedApiConfig instead and rip out src/spec.rs altogether.
  • Re-run my manual test plan (I've done all this before, but I will do it again after any changes from code review):
    • Basic lockstep workflow (file missing, file stale, file up-to-date)
    • Basic versioned workflow (add version, make change; file missing, file stale, file up-to-date)
    • Retiring an old supported version
    • Error cases:
      • Cannot change a blessed version
      • Removed document for a blessed version
      • Wrong document for a blessed version
      • Warning (not error) generated for unexpected files
      • generated for stuff we can't parse (e.g., bad checksum, bad filename)
      • validation error: bad JSON file
      • validation error: not valid OpenAPI
      • validation error: Oxide-specific lint
      • validation error: stale or missing "extra" file (nexus_tags.txt)
      • bogus symlink: missing, stale
    • Test merge with a change that adds a version (older, the same as, newer than) a locally-added one.

Work to do after landing this:

  • Extract openapi-manager into a separate crate so that Propolis, Crucible, etc. can use it. I think this will be easier after this PR than it would have been before.
  • Write a small integration test suite. I think this will be easier after it's extracted because we can parametrize it by 1-2 APIs for testing only. We can use the flags that allow loading files from disk and exercise basically every possible problem/fix case. Although it's common for dev tools to have minimal tests (and there were no tests here before), I think it'd be worth it (and not too much work) to have some here.
  • @ahl is working on a tool for better determining compatibility between two OpenAPI specs. Right now, this tool considers any change to the OpenAPI spec as breaking. That includes even things like doc changes. Once that tool is available, I'll update this to use it. (But I think we'll still want to treat almost everything as breaking.)

Copy link
Contributor

@sunshowers sunshowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have to run, but have gotten through to spec_files_generated.rs. More tomorrow.

When adding an API, you will need to decide how the API is versioned when it comes to system upgrade. You have two choices:

* **Lockstep**: the client and server are always deployed as a unit (e.g., Wicket and Wicketd). It's impossible for versions to mismatch at runtime. This is ideal when possible, but rare.
* **Versioned**: the client and server may mismatch at runtime during an upgrade. See https://rfd.shared.oxide.computer/rfd/0532[RFD 532 ("Versioning for internal HTTP APIs")] for a more complete discussion of how this can work. For our purposes, the upshot is that this repo stores not just the current OpenAPI document (generated from the API trait), but also historical versions that must still be supported.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make RFD 532 public?

])
----
+
Among other things, the `api_versions!` call defines constants like `VERSION_MY_CHANGE` that you'll use in the next step.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit: I'd be explicit here and mention that MY_CHANGE becomes VERSION_MY_CHANGE, INITIAL becomes VERSION_INITIAL and so on.

Comment on lines +199 to +201
* If there is a blessed file for that version, then the version is blessed. The generated file must exactly match the blessed one. If not, the tool cannot fix this. You have to undo whatever changes you made that affected the blessed version. (See above on how to make changes to the API trait without affecting older versions.)
* If there is no blessed file for that version, then the version is locally-added. There should be exactly one local file for it and it should exactly match the generated file. The tool can fix any problems here by removing all local files and generating a new one based on the generated one.
* The tool also ensures that a "latest" symlink exists and points to the highest-numbered OpenAPI document.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on putting a Mermaid flowchart here?

* **local** documents: these are the OpenAPI documents in `openapi` in your working tree. These include both blessed versions and locally-added versions.
* **generated** documents: these are the OpenAPI documents generated by Dropshot from your API traits.

Putting all this together, the tool is pretty straightforward. For each supported version of a versioned API:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on putting a mermaid diagram here?


Of course, we don't need or want to support each version of an API forever. RFD 532 proposes supporting the one shipped in the last release, plus all the intermediate ones shipped in the current release. The specific policy doesn't really matter here.

To retire an old version, simply remove it from `api_versions!` and run `cargo xtask openapi generate`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is this enough? Wouldn't you also need to go and change all the APIs that refer to the old version in their #[endpoint] annotations?

Do we want to store a list of retired versions somewhere? I'm specifically thinking of a situation where a version identifier is recycled -- is that desirable?


for api in apis.iter_apis() {
if api.is_lockstep() {
for version in api.iter_versions_semver() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's just one version here, right?

Comment on lines +56 to +59
BadVersionedFileName::UnexpectedName {
ident: ident.clone(),
source: anyhow!("unexpected prefix"),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These errors could carry the basename (unless those are already being stored somewhere else, in which case ignore this)

Comment on lines +87 to +89
// Dropshot does not support pre-release strings and we don't either.
// This could probably be made to work, but it's easier to constrain
// things for now and relax it later.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed!

pub struct ApiSpecFilesBuilder<'a, T> {
apis: &'a ManagedApis,
spec_files: BTreeMap<ApiIdent, ApiFiles<T>>,
error_accumulator: &'a mut ErrorAccumulator,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge deal, but I personally tend to prefer passing in the accumulator. Storing a &mut makes things like cloning hard (which we may want to do in tests)

#[derive(Debug)]
pub struct ApiFiles<T> {
spec_files: BTreeMap<semver::Version, T>,
latest_link: Option<ApiSpecFileName>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remind me when latest_link is None? Does that happen only for lockstep APIs? If so would it make sense to only define latest_link for versioned APIs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants