-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] [Resource Access Control] Finalizing the code design #5062
Comments
@DarshitChanpura I think we should expand on each option in a decision doc with more detail and capture some of what's been discussed in person on Github. As I see, we have discussed 2 main approaches, but there is now a 3rd one coming into view that takes into account the https://github.com/opensearch-project/opensearch-remote-metadata-sdk/ where resource metadata could be stored outside of OpenSearch. In the first 2 designs, we were operating under the assumption that resource metadata is stored in a system index. In this design, from a plugin developer's point of view they only need to tell the security plugin what index the sharable resources are stored in and security would handle everything else. Plugins would be unaware of whether the cluster was running with security or not as the same code would be written for both cases. Those 2 options are:
Definitions:
With
Each Conditions for sharingWith resource sharing, there are 2 conditions in which a resource is visible to the authenticated user.
1. Store resource owner and sharing info w/ the resource metadataIn this approach, there must be a way for security to write the
2. Centrally storing sharing information in an index for all resource types (preferred)Similar to 1, but centrally stored in a single index for all resource sharing info across the cluster. In this model, it can be assumed that this information is safe from being overridden because security owns the index, but it does introduce complications when plugins perform Search Requests and Get Requests on their resource indices.
Considerations for resource metadata stored outside of OpenSearchWith https://github.com/opensearch-project/opensearch-remote-metadata-sdk/, there is an effort to abstract away metadata storage for plugins and allow metadata to be stored outside of OpenSearch. With that in mind, I think the design for resource sharing should account for this to regardless of whether 1) resource metadata is stored in OpenSearch or 2) resource metadata is stored outside of OpenSearch. I like the idea of extrapolating the concept of DLS to remote stores, but I'm not sure how best to design that. One thing I was thinking about was whether to give plugin developers a mechanism for obtaining the IDs of resources visible to the authenticated user and leaving it up to the plugin developer to use that appropriated. i.e. From a plugin, they can make a call similar to:
If the plugin uses a remote store for resource metadata then they can figure out how to use the resource ids appropriately. The security plugin will also needs hooks onto when sharable resources are created/deleted. |
A couple of additions: Avoiding "human error"The issue lists as limitation of the last option "3. Plugins Make API Calls":
Even though the DLS approaches reduce this risk of security issue by wrongly using the provided concepts, they do not fix these completely. The DLS approaches always assume that a resource corresponds to exactly one document in an index. There are many thinkable cases where this is not the case, for example the alerting plugin has a concept called "alert comments" where the plugin implementation needs to do more checks to ensure authorized access: https://github.com/opensearch-project/alerting/blob/main/alerting/src/main/kotlin/org/opensearch/alerting/transport/TransportIndexAlertingCommentAction.kt ... thus, this risk to a certain extend applies to all approaches. DLSIt is already mentioned in the issue, but I'd just like to put emphasis on this: DLS has quite a few limitations which make DLS based approaches challenging and expensive to achieve.
|
Given the scalability issues with TLQ and synchronization issues with in-memory map when it comes to large clusters, API calls seems to be the "proper" solutions. With its only downside being that plugin devs must remember to call these APIs, this approach gives us the flexibility to handle internal/external data-store as well overcome DLS short-comings. I've already introduced appropriate APIs in #5016. If the implementations looks okay, we can finalize the approach. |
@DarshitChanpura @nibix @cwperks no intent to hold you folks off, but I believe instead of inventing yet another model, refining / improving / extending the existing security ones (DLS, etc) would not only benefit the existing applications but significantly reduce the maintenance costs of supporting both. |
To be honest, it seems a bit like this conversation is kind of stuck. There are good arguments for each side, but it seems to be unclear how to come to a synthesis. In order to move this forward a bit, I will try to create a bit more transparent (and possibly objective?) comparison of the possible approaches below. I will certainly get things wrong, so please correct me! :-) ComparisonApproachesThe discussed approaches. Each approach corresponds to one column in the matrix below.
MatrixRows describe whether an approach has a certain property or not. See the section below for longer explanation.
DetailsCovers complicated resource modelsCan an approach cover scenarios which have resource models where there is not a 1:1 relationship between a resource and a document. Examples: Derived resources, runtime resources, hierarchical models (folder with items, each a resource). No explicit checks in plugin necessaryDoes the approach provide a solution that a plugin can very easily use or is greater integration of the plugin code necessary? Suitable for big data resourcesIs the approach suitable for millions of resources? That means, can OpenSearch big data solutions be used to manage the big data. The DLS ext approach does require keeping the resource sharing information in the heap, thus it is not suitable for big data scenarios. Side-effect freeDoes the approach provide side channels which might expose information which is not really necessary? The DLS int approach requires updating of the resource documents whenever sharing config is changed. That means that the document version and seqno properties will be updated. This has effects on concurrent operations which rely on this information. This also exposes to each user who has access to a document the fact that resource sharing configuration was modified. Straight-forward to implementIs it already quite clear how this can be implemented? Or, is experimentation necessary to achieve the implementation? Provides benefits beyond the featureImplementation might also be used to provide enhancements in existing features, e.g. DLS. Can be implemented within 4 weeksDisclaimer: Obviously, this is a very subjective assessment. |
Just my opinion (in no way authoritative): I do believe that DLS could use some fundamental improvements. Especially the limitation that it only works for read operations, but does not prevent deleting or overwriting unauthorized documents is kind of odd and surprising. However, a property of good project management is to keep goals at a manageable size. A goal should be a single goal and not actually a combination of several goals. Thus, one could start with the goal of improving DLS. When this is achieved, one could revisit the resource sharing approach (which will have limitations as explained above). If, however, the goal to achieve a resource sharing framework cannot be delayed, it's clear that DLS should not be utilized for that. |
Another note: The argument that the approach of requiring the plugin to call a dedicated API was so far quite abstract. This makes it difficult for me to judge how big the issue actually is. To judge this better, I personally would think that a prototypical adaption of a real-world plugin (for example alerting) would be very helpful. |
@nibix Thank you for the detailed inputs. I already have anomaly-detection plugin in progress to be tested with this new change: opensearch-project/anomaly-detection#1400. This is still in draft state, as the core feature is stalled. Now that we have consensus, I can make progress on the feature PR and subsequently the AD plugin PR. Improving DLS can be done as a separate goal. I will open another issue for that improvement. For the resource sharing feature, API based model can be introduced. Once DLS enhancement is in place in future, we can provide plugin developers with an option to rely on DLS or continue with explicit calls depending on their use-case. |
Let's be specific when we bring up what this means. When I read API, my mind jumps to REST API and I don't think that's what @DarshitChanpura had in mind. Essentially, plugin devs already add a clause of their own on a SearchRequest (Example in AD) when searching through a system index containing resource metadata. I think @DarshitChanpura has been thinking of "replicating" this logic in a sense where plugin devs need to make a call ( Keep in mind that resource metadata could be stored outside of OpenSearch so I'm wondering if this could be truly agnostic with a simple assumption that any sharable resource has a unique identifier. |
I'm thinking about REST APIs. |
Background
W.r.t Resource Access Control, Doc-Level Security (DLS) approach has been updated in PR #5016. Recently, the plan shifted from implementing abstract APIs in OpenSearch core to modifying the Security plugin so it can automatically invoke resource-access-control for relevant indices. However, this method also has certain drawbacks around thread exhaustion. To guide the final decision, three primary approaches have been considered:
Below is an updated version of the approaches, each with Advantages and Limitations sections.
1. Terms Lookup Query
Description
Leverage TLQ to dynamically fetch resource-sharing information from a separate index, then match requested resource IDs against those entries.
Advantages
Limitations
2. In-Memory Map
Description
Load resource-sharing configuration into an in-memory map—similar to how the Security plugin loads its main security configuration. This map would be updated in near real-time whenever resource-sharing information changes (e.g., new resources or updated permissions).
Advantages
Limitations
3. Plugins Make API Calls
Description
Expose new APIs that other plugins can call whenever they need to check if a user has access to a given resource. These APIs handle the logic for determining resource access, potentially using the DLS approach behind the scenes or another method.
Advantages
Limitations
Conclusion
Feedback from plugin developers and end users will help guide the final choice. Each approach has trade-offs in terms of performance, maintainability, and extensibility.
The text was updated successfully, but these errors were encountered: