[POC] [Security Manager Replacement] Native Java Agent (dynamic code rewriting, must be low overhead) #16731

Draft: wants to merge 1 commit into main

Conversation

reta
Collaborator

@reta reta commented Nov 27, 2024

Description

Explore a native Java Agent (dynamic code rewriting, must be low overhead) as a Security Manager replacement.

How it works:

  • the application (OpenSearch) and the agent share a common bootstrap module
  • the application (OpenSearch) is run with the agent
  • the application (OpenSearch) uses the bootstrap module to apply security policies
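
The three steps above depend on one class being visible to both the application and the agent. A minimal sketch of that idea follows, assuming a hypothetical `SharedPolicyHolder` in the common bootstrap module. The PR's actual holder is `AgentPolicy`; the class names here, the use of a `PermissionCollection` in place of a `Policy`, and the addresses are illustrative assumptions only.

```java
import java.net.SocketPermission;
import java.security.Permission;
import java.security.PermissionCollection;
import java.security.Permissions;

// Hypothetical stand-in for the shared bootstrap class: one static slot that
// the application writes and the checks injected by the agent read.
final class SharedPolicyHolder {
    private static volatile PermissionCollection granted;

    private SharedPolicyHolder() {}

    // Application side: publish the permissions derived from the policy.
    static void set(PermissionCollection permissions) {
        granted = permissions;
    }

    // Agent side: while nothing has been published, nothing is enforced.
    static boolean isAllowed(Permission permission) {
        PermissionCollection current = granted;
        return current == null || current.implies(permission);
    }
}

final class SharedPolicyHolderDemo {
    public static void main(String[] args) {
        Permissions permissions = new Permissions();
        // IP literal instead of "localhost" so the sketch needs no name resolution.
        permissions.add(new SocketPermission("127.0.0.1", "connect"));
        SharedPolicyHolder.set(permissions);

        System.out.println(SharedPolicyHolder.isAllowed(new SocketPermission("127.0.0.1:9200", "connect")));
        System.out.println(SharedPolicyHolder.isAllowed(new SocketPermission("10.0.0.1:9200", "connect")));
    }
}
```

The holder only works as a hand-off point if both sides resolve it to the same loaded class, which is presumably why the bootstrap module is kept separate from the agent itself.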

Example:

The sample security.policy (stays the same as before):

grant codeBase "${codebase.opensearch-core}" {
   permission  java.net.SocketPermission "localhost", "connect";
};
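
To make the grant concrete, here is a small sketch of what a `SocketPermission` with the `connect` action implies, using only the standard library. The `SocketGrantSketch` class, the IP literal `127.0.0.1` (used instead of `localhost` so the check does not depend on name resolution) and the port numbers are assumptions for illustration, not part of the PR.

```java
import java.net.SocketPermission;

final class SocketGrantSketch {
    // Mirrors a policy line of the form:
    //   permission java.net.SocketPermission "<host>", "connect";
    static boolean allows(String grantedHost, String requestedTarget) {
        SocketPermission granted = new SocketPermission(grantedHost, "connect");
        SocketPermission requested = new SocketPermission(requestedTarget, "connect");
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        // A grant without a port covers every port on that host.
        System.out.println(allows("127.0.0.1", "127.0.0.1:9300"));           // true
        // A grant with a port range rejects targets outside the range.
        System.out.println(allows("127.0.0.1:9200-9300", "127.0.0.1:9400")); // false
    }
}
```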

The application (OpenSearch) is run with the agent:

-javaagent:agent/opensearch-agent-3.0.0-SNAPSHOT.jar

The application (OpenSearch) applies the security policy to the agent:

final Policy policy =  new PolicyFile("/security.policy");
AgentPolicy.setPolicy(policy);

Running with JDK 24-ea+31-3600:

[2025-01-22T11:58:11,913][INFO ][o.o.n.Node               ] [host] version[3.0.0-SNAPSHOT], pid[101497], build[tar/7cf6a66e74d8352cf42d60c50b97e46a2aa8866c/2025-01-21T18:11:00.731851515Z], OS[Linux/6.11.0-13-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/24-ea/24-ea+31-3600]                             
[2025-01-22T11:58:11,916][INFO ][o.o.n.Node               ] [host] JVM home [/home/user/jdk-24], using bundled JDK/JRE [false]
[2025-01-22T11:58:11,916][INFO ][o.o.n.Node               ] [host] JVM arguments [-Xshare:auto, -Dopensearch.networkaddress.cache.ttl=60, -Dopensearch.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -XX:+ShowCodeDetailsInExceptionMessages, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dio.netty.allocator.numDirectArenas=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.locale.providers=SPI,CLDR, -Xms1g, -Xmx1g, -XX:+UseG1GC, -XX:G1ReservePercent=25, -XX:InitiatingHeapOccupancyPercent=30, -Djava.io.tmpdir=/tmp/opensearch-12632241661790883371, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, --add-modules=jdk.incubator.vector, -Djava.util.concurrent.ForkJoinPool.common.threadFactory=org.opensearch.secure_sm.SecuredForkJoinWorkerThreadFactory, -javaagent:agent/opensearch-agent-3.0.0-SNAPSHOT.jar, -XX:MaxDirectMemorySize=536870912, -Dopensearch.path.home=/home/user/opensearch-3.0.0-jdk24, -Dopensearch.path.conf=/home/user/opensearch-3.0.0-jdk24/config, -Dopensearch.distribution.type=tar, -Dopensearch.bundled_jdk=true]
[2025-01-22T11:58:11,916][WARN ][o.o.n.Node               ] [host] version [3.0.0-SNAPSHOT] is a pre-release version of OpenSearch and is not suitable for production                              
[2025-01-22T11:58:11,967][WARN ][o.a.l.i.v.VectorizationProvider] [host] You are running with Java 23 or later. To make full use of the Vector API, please update Apache Lucene.                                                                                                                                                 
[2025-01-22T11:58:12,347][INFO ][o.o.i.r.ReindexModulePlugin] [host] ReindexPlugin reloadSPI called                                                                                                                                                                                                                              
[2025-01-22T11:58:12,348][INFO ][o.o.i.r.ReindexModulePlugin] [host] Unable to find any implementation for RemoteReindexExtension


Related Issues

Closes #16633

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added the enhancement Enhancement or improvement to existing feature or request label Nov 27, 2024
Contributor

❌ Gradle check result for 6b73ddf: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kumargu
Contributor

kumargu commented Nov 27, 2024

Thanks @reta, this is really interesting, and such quick progress.

On a side note, it would be useful to add a short intro snippet on how the agent would work overall.

@reta
Collaborator Author

reta commented Nov 27, 2024

Thanks @reta, this is really interesting, and such quick progress.

Thanks @kumargu

On a side note, it would be useful to add a short intro snippet on how the agent would work overall.

Absolutely, I have updated the description (but I will push it a bit once we get the JDK-21 baseline with #16366; that would simplify the API usage a lot).

@reta reta force-pushed the issue-16633 branch 2 times, most recently from 9858717 to ea045b0 Compare December 16, 2024 18:58
"Can-Retransform-Classes": "true",
"Agent-Class": "org.opensearch.javaagent.Agent",
"Premain-Class": "org.opensearch.javaagent.Agent",
"Boot-Class-Path": 'byte-buddy-1.15.10.jar opensearch-agent-bootstrap-3.0.0-SNAPSHOT.jar'
Collaborator Author

opensearch-agent-bootstrap is shared between the OpenSearch service and the agent (so the Policy instance could be propagated)
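
The `Premain-Class` and `Agent-Class` manifest attributes above name the class the JVM calls into. A minimal sketch of such an entry point follows, using only `java.lang.instrument`. `SketchAgent` is a hypothetical name (the PR's class is `org.opensearch.javaagent.Agent`); a real agent class would be public and would register its Byte Buddy transformers where the comment indicates.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent entry point, not the PR's implementation.
final class SketchAgent {
    private static volatile boolean installed;

    private SketchAgent() {}

    // Invoked before main() when the JVM starts with -javaagent (Premain-Class).
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        install(instrumentation);
    }

    // Invoked when the agent is attached to an already running JVM (Agent-Class).
    public static void agentmain(String agentArgs, Instrumentation instrumentation) {
        install(instrumentation);
    }

    private static synchronized void install(Instrumentation instrumentation) {
        if (installed) {
            return; // both entry points may fire; install only once
        }
        // Reflects the "Can-Retransform-Classes" manifest attribute.
        if (!instrumentation.isRetransformClassesSupported()) {
            throw new IllegalStateException("class retransformation is not available");
        }
        // A real agent would register its class transformers here.
        installed = true;
    }

    static boolean isInstalled() {
        return installed;
    }
}
```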

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled Issues that have stalled label Jan 16, 2025

@opensearch-trigger-bot opensearch-trigger-bot bot removed the stalled Issues that have stalled label Jan 17, 2025

@kumargu
Contributor

kumargu commented Jan 28, 2025

@reta is it feasible for the agent to coexist with SM enabled in 3.0, meaning both SM and Agent will enforce socket restrictions?

@reta
Collaborator Author

reta commented Jan 28, 2025

@reta is it feasible for the agent to coexist with SM enabled in 3.0, meaning both SM and Agent will enforce socket restrictions?

@kumargu I think it is feasible in theory but should not be necessary in practice. Could you share your thoughts on why we may need that?

@kumargu
Contributor

kumargu commented Jan 28, 2025

@reta is it feasible for the agent to coexist with SM enabled in 3.0, meaning both SM and Agent will enforce socket restrictions?

@kumargu I think it is feasible in theory but should not be necessary in practice. Could you share your thoughts on why we may need that?

I was thinking we could bring in replacements for JSM in 3.0 while JSM remains enabled in 3.0 (because we'd still be on JDK-21 in 3.0). Having the alternatives coexist for some time will give us confidence and enough community feedback before we decide to remove it in some 3.x or 4.0.

(note: the next LTS, JDK-25, will be available in Sep 2025; JDK-24 is not an LTS release)

@reta
Collaborator Author

reta commented Jan 28, 2025

Having the alternatives coexist for some time will give us confidence and enough community feedback before we decide to remove it in some 3.x or 4.0.

I think we would only target the most critical APIs with the Java Agent (we just cannot match it to SM); however, we should be able to run the Java Agent on JDK-21 at least.

@kumargu
Contributor

kumargu commented Jan 28, 2025

Having the alternatives coexist for some time will give us confidence and enough community feedback before we decide to remove it in some 3.x or 4.0.

I think we would only target the most critical APIs with the Java Agent (we just cannot match it to SM); however, we should be able to run the Java Agent on JDK-21 at least.

100% agree. Maybe just the Socket interceptor for now, since we see the problems with defining the port ranges in PR #17107.

@kumargu
Contributor

kumargu commented Feb 19, 2025

@reta, if you get a chance, an example of this working with the simplest Shiro Plugin would be very useful :)

I am willing to get this into 3.0 even in the absence of a File interceptor.

@reta
Collaborator Author

reta commented Feb 19, 2025

@reta, if you get a chance, an example of this working with the simplest Shiro Plugin would be very useful :)

Sure @kumargu, I will try to wrap the POC up this week so we can make a go / no-go decision (w.r.t. the Java Agent), thanks!

@kumargu
Contributor

kumargu commented Feb 23, 2025

(Probably not the right thread to ask this question, pardon me.)

Given that you have already done the heavy lifting here, do you think the FileInterceptor would look something like the below?

A FileInterceptor Impl

import net.bytebuddy.asm.Advice;

import java.io.File;
import java.io.FilePermission;
import java.lang.reflect.Method;
import java.security.Policy;
import java.security.ProtectionDomain;
import java.util.List;

// AgentPolicy and StackCallerChainExtractor are the classes introduced by this PR.
public class FileInterceptor {

    public static final String FILE_DELETE_ACTION = "delete";
    public static final String FILE_EXECUTE_ACTION = "execute";
    public static final String FILE_READ_ACTION = "read";
    public static final String FILE_WRITE_ACTION = "write";
    public static final String FILE_READLINK_ACTION = "readlink";

    // Runs on entry to each instrumented java.io.File instance method.
    @Advice.OnMethodEnter
    public static void intercept(@Advice.This File file, @Advice.Origin Method method) throws SecurityException {
        String action = getActionFromMethod(method.getName());

        if (action != null) {
            checkFilePermission(file, action);
        }
    }

    // Maps the intercepted method name to a FilePermission action, or null if the method is not checked.
    // Public (not private): Byte Buddy inlines the advice body into the instrumented class,
    // so every helper it calls must be accessible from there.
    public static String getActionFromMethod(String methodName) {
        switch (methodName) {
            case "delete":
            case "deleteOnExit":
                return FILE_DELETE_ACTION;
            case "canExecute":
            case "setExecutable":
                return FILE_EXECUTE_ACTION;
            case "canRead":
            case "setReadable":
            case "list":
            case "listFiles":
                return FILE_READ_ACTION;
            case "canWrite":
            case "setWritable":
            case "createNewFile":
            case "mkdir":
            case "mkdirs":
                return FILE_WRITE_ACTION;
            case "readSymbolicLink":
                return FILE_READLINK_ACTION;
            // maybe more cases?
            default:
                return null;
        }
    }

    // Denies the action unless every protection domain on the call stack is granted the permission.
    public static void checkFilePermission(File file, String action) throws SecurityException {
        Policy policy = AgentPolicy.getPolicy();
        if (policy == null) {
            return; // No policy set, allow all actions
        }

        String path = file.getAbsolutePath();
        FilePermission permission = new FilePermission(path, action);

        StackWalker walker = StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);
        List<ProtectionDomain> callers = walker.walk(new StackCallerChainExtractor());

        for (ProtectionDomain domain : callers) {
            if (!policy.implies(domain, permission)) {
                throw new SecurityException("Access denied: " + action + " on " + path);
            }
        }
    }
}
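
The core of `checkFilePermission` is the rule that every caller on the stack must be granted the permission. A self-contained sketch of that rule follows, using only the standard library. `CallerChainCheck` is a hypothetical name, and a plain `BiPredicate` stands in for the PR's `AgentPolicy` and `StackCallerChainExtractor`, whose internals are not shown in this thread.

```java
import java.io.FilePermission;
import java.security.Permission;
import java.security.ProtectionDomain;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.BiPredicate;

final class CallerChainCheck {
    private CallerChainCheck() {}

    // Collects the distinct protection domains of all classes on the current stack.
    static Set<ProtectionDomain> callerDomains() {
        return StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE).walk(frames -> {
            Set<ProtectionDomain> domains = new LinkedHashSet<>();
            frames.forEach(frame -> domains.add(frame.getDeclaringClass().getProtectionDomain()));
            return domains;
        });
    }

    // Throws unless the (stand-in) policy grants the permission to every caller domain.
    static void check(BiPredicate<ProtectionDomain, Permission> policy, Permission permission) {
        for (ProtectionDomain domain : callerDomains()) {
            if (!policy.test(domain, permission)) {
                throw new SecurityException("Access denied: " + permission);
            }
        }
    }

    public static void main(String[] args) {
        Permission read = new FilePermission("/tmp/example.txt", "read");
        check((domain, permission) -> true, read); // passes silently
        try {
            check((domain, permission) -> false, read);
        } catch (SecurityException e) {
            System.out.println("denied as expected");
        }
    }
}
```

Requiring all domains to pass, rather than only the immediate caller, mirrors the loop in the proposed interceptor: one untrusted frame anywhere in the chain is enough to deny the operation.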

High-level changes in Agent

       AgentBuilder.Transformer fileOperationsTransformer = (b, typeDescription, classLoader, module, pd) -> b.visit(
            Advice.to(FileInterceptor.class)
                .on(ElementMatchers.named("delete")
                        .or(ElementMatchers.named("canExecute"))
                        .or(ElementMatchers.named("canRead"))
                        .or(ElementMatchers.named("canWrite"))
                        .or(ElementMatchers.named("readAllBytes"))
                        .or(ElementMatchers.named("newBufferedReader"))
                        .or(ElementMatchers.named("write"))
                        .or(ElementMatchers.named("isSymbolicLink"))
                        ...
                        ...
                )
        );


        final ByteBuddy byteBuddy = new ByteBuddy().with(Implementation.Context.Disabled.Factory.INSTANCE);
        return new AgentBuilder.Default(byteBuddy)
            .with(AgentBuilder.InitializationStrategy.NoOp.INSTANCE)
            .with(AgentBuilder.RedefinitionStrategy.REDEFINITION)
            .with(AgentBuilder.RedefinitionStrategy.Listener.StreamWriting.toSystemError())
            .with(AgentBuilder.TypeStrategy.Default.REDEFINE)
            .type(ElementMatchers.isSubTypeOf(SocketChannel.class))
            .transform(socketTransformer)
            .type(ElementMatchers.isSubTypeOf(File.class).or(ElementMatchers.isSubTypeOf(Files.class)))
            .transform(fileOperationsTransformer);
    }
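
One detail that may matter for the matcher list above: `FileInterceptor.intercept` binds the receiver with `@Advice.This File file`, which presumes an instance method of `java.io.File`, whereas `readAllBytes`, `newBufferedReader`, `write` and `isSymbolicLink` are static methods of `java.nio.file.Files` that take a `Path`. A small reflection check shows the difference (the `ReceiverCheck` helper is hypothetical and uses the standard library only); how the advice should bind arguments for `Files` is left open here.

```java
import java.io.File;
import java.lang.reflect.Modifier;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReceiverCheck {
    private ReceiverCheck() {}

    // True when the named public method is static, i.e. has no receiver to bind as "this".
    static boolean isStatic(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            return Modifier.isStatic(type.getMethod(name, parameterTypes).getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + type.getName() + "." + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isStatic(File.class, "delete"));                    // false
        System.out.println(isStatic(Files.class, "readAllBytes", Path.class)); // true
        System.out.println(isStatic(Files.class, "isSymbolicLink", Path.class)); // true
    }
}
```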


@reta
Collaborator Author

reta commented Feb 23, 2025

Given that you have already done the heavy lifting here, do you think the FileInterceptor would look something like the below?

@kumargu I haven't finished the heavy lifting yet (I am working on changes in the distribution, etc.). I would like to finish those first before adding any more functionality to the agent; I hope that makes sense. Thanks!

@kumargu
Contributor

kumargu commented Feb 24, 2025

Given that you have already done the heavy lifting here, do you think the FileInterceptor would look something like the below?

@kumargu I haven't finished the heavy lifting yet (I am working on changes in the distribution, etc.). I would like to finish those first before adding any more functionality to the agent; I hope that makes sense. Thanks!

Sounds good. I was not aware that major changes in the distribution are needed. Thanks for the info.

…ing, must be low overhead)

Signed-off-by: Andriy Redko <[email protected]>

Labels
enhancement Enhancement or improvement to existing feature or request skip-changelog
4 participants