-
Notifications
You must be signed in to change notification settings - Fork 285
[Cosmos] Refactor query responses to their own Pager and Page types #2393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Cosmos] Refactor query responses to their own Pager and Page types #2393
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nice and clean.
@analogrelay Are you saying that today this client methods make HTTP requests but that in the future, they might not? Instead, I'd keep these methods always making HTTP requests and then introduce NEW methods (perhaps on a new/different client) that had different (transport) behavior. |
Correct. In fact it often does not use HTTP in several other languages. That's why several Cosmos clients use their own "Response" type (.NET , Java, JavaScript). This change is essentially aligning the Rust SDK with those SDKs.
Customers should not have this expectation; it will cause unexpected surprises for them as we bring the more feature-rich transports1 online, and I'd rather make it very clear to customers that queries are NOT guaranteed to move through the HTTP pipeline. When we make HTTP requests, we do use that pipeline (though perhaps, to reinforce that point, we should not use the core pipeline and should instead use a separate pipeline for queries that can work for all transports; I'm not sure on that yet). Again, it's good to point out here that several of our other clients work this way. HTTP requests are made through the "standard" HTTP pipeline and can be observed in policies, but activity on the non-HTTP transport does not.
This would be much more surprising to users, particularly given the fact that the user is usually not directly aware that their request is spanning multiple HTTP responses OR using a different transport (of course we don't hide it, but the point is they would certainly be surprised to have to make this decision themselves). I think it's also worth noting a key distinction here with Cosmos. Cosmos data plane clients should not really be considered standard wrapper SDKs that call REST APIs. They are more akin to the client driver for a database, like PostgreSQL or SQL Server. Part of the transport surface area uses, or at least supports, REST, but a significant portion of that transport surface is non-HTTP (more like EventHubs AMQP support, for example). I think it's reasonable for us to reuse the Core HTTP primitives for APIs that are entirely REST-based (essentially the "ControlPlane-over-DataPlane" APIs like CRUD for Databases/Containers). But this is the first step in moving the Data Plane operations to be independent of HTTP primitives (I expect to move point read/write operations like Item CRUD to this model as well). Footnotes
|
@JeffreyRichter it's also worth pointing out that @analogrelay's changes to |
First, just because other language SDKs do something does not mean it is the right or best thing to do.. My main problem is that potential breaking changes occur if today these methods make HTTP requests using client options like retry policy, logging, api-version, customer policies, and so on, and then, in the future when the protocol changes, the same client method ignores all the options or does something very different. I am also concerned that a customer expects our Clients to make HTTP requests (as we never attempted to hide this or abstract it away) but that these methods will not use HTTP. However, this is much less of a concern to me than the breaking change concern as we do have clients today with methods that are not 1:1 with an HTTP request (like parallel blob upload/download). But, in these scenarios we must make it very clear to customers that these methods are using some rich algorithm on top of HTTP or not using HTTP at all. In other words, I can live with this decision but it's hard for me to live with a very-likely breaking change. |
In other SDKs where we support other transports, we have a configuration option for which transport to use (for example: https://learn.microsoft.com/azure/cosmos-db/nosql/sdk-connection-modes), so the behavior change you are concerned about only happens in response to a customer code change. What @analogrelay is proposing is to enable changing that connectivity mode without also having to change all the query and point operation calls to something different. |
Ok, that does make me feel better. So, my past experience scares me with this proposal but it is your SDK and if you're ok with maintaining this SDK and supporting your customers, go ahead. |
I should clarify, I was certainly never intending to imply that just because other SDKs do something it's right, but they do provide context for the decision in that we can look to the reasons why those SDKs made that decision (which is what I was referencing).
This is precisely why I want to make this transition now, well before our initial GA. We have other SDKs that have not considered this in advance (the Go SDK in particular) and we are expecting some fairly disruptive changes in order to support this (perhaps moving, as you suggest above, to brand new APIs that do property abstract the two transports). We have a fair amount of experience with abstracting these two transports in .NET and Java, and I plan to tap into that as we continue to move forward. |
I See. I misunderstood then. I thought the plan was to GA with an HTTP implementation and then change the protocol in a subsequent release. Doing this before GA also makes me feel much better. |
Co-authored-by: Heath Stewart <[email protected]>
b760316
to
b49fce2
Compare
I've rebased this and revived it after my week off last week. Looking for sign-off from Cosmos folks (@Pilchie @kirankumarkolli) |
API change check APIView has identified API level changes in this PR and created following API reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No new commits.
Cosmos DB queries for Items, Containers, Databases, or Offers, aren't guaranteed to be satisfied through HTTP requests. For example, Direct mode clients, such as the .NET SDK, use a custom TCP protocol to communicate directly with the replicas. In addition, cross-partition requests return "synthetic" pages of data that merge results from several requests (to each partition). In order to support this in the future, I'm changing the return type of the query related APIs from
azure_core::Pager<T>
to a newazure_data_cosmos::FeedPager<T>
type. Instead of yieldingazure_core::Response
values, it yieldsazure_data_cosmos::FeedPage<T>
values. AFeedPage
is a lot like aResponse
, except for a few things:It does retain the
azure_core::http::Headers
collection, because it's a collection (and thus could just be empty when we don't have any) and because all the Cosmos transports do support some form of headers that we may want to expose to the user.@heaths The easiest way for me to make this split was to refactor
azure_core::Pager<T>
a bit so that I could use it. I'm not wedded to that design if you have concern. I wanted to be able to use the existing work done onPager<T>
but did not want it to yieldazure_core::Response<T>
values. An alternative I considered and am willing to fall back to is to just copy thePager<T>
code, rather than using the type itself. I refactored it by creating two separate types:PageStream<T>
which contains all the paging logic, but yields bareT
values.Pager<T>
which is an alias forPageStream<Response<T>>
, which allows existing usages to work as they did before. In most cases, aPager<T>
should yield aResponse<T>
, so I didn't want to force every other paginated service API to spell outPager<Response<T>>
in its return type.