Add cats-effect instrumentation #13576

iRevive · 2025-03-23T19:23:14Z

Hey folks.

Cats Effect is a high-performance, asynchronous, composable framework for building real-world applications in a purely functional style within the Typelevel ecosystem.

How the instrumentation works

Cats Effect has its own context propagation mechanism, known as IOLocal. The 3.6.0 release provides a way to represent an IOLocal as a ThreadLocal, which creates an opportunity to manipulate the context from the outside.

The agent instruments the constructor of IORuntime and stores a ThreadLocal representation of the IOLocal[Context] in the bootstrap classloader, so the agent and the application both access the same instance.
The instrumentation installs a custom ContextStorage wrapper (for the agent context storage). This wrapper uses FiberLocalContextHelper to retrieve the fiber's current context (if available).
The agent instruments IOFiber's constructor and starts the fiber with the currently available context.
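
The wrapper step above can be sketched in plain Java. This is only an illustration of the "prefer the fiber's context, otherwise fall back" idea: SimpleContextStorage and FiberAwareStorage are hypothetical names, contexts are modelled as strings, and none of this is the real OpenTelemetry ContextStorage API or the code from this PR.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a context storage. This is NOT the OpenTelemetry ContextStorage API.
interface SimpleContextStorage {
  String current();
}

// Wrapper that prefers the fiber's context and falls back to the wrapped storage.
final class FiberAwareStorage implements SimpleContextStorage {
  private final SimpleContextStorage delegate;
  // Assumed lookup: returns the running fiber's context, or null outside of a fiber.
  private final Supplier<String> fiberContext;

  FiberAwareStorage(SimpleContextStorage delegate, Supplier<String> fiberContext) {
    this.delegate = delegate;
    this.fiberContext = fiberContext;
  }

  @Override
  public String current() {
    String fromFiber = fiberContext.get();
    return fromFiber != null ? fromFiber : delegate.current();
  }
}
```

Outside of a fiber the supplier returns null and the wrapped storage answers, so non-fiber code should behave as it did before the wrapper was installed.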

iRevive · 2025-03-23T19:26:55Z

.../opentelemetry/javaagent/instrumentation/catseffect/common/v3_6/IoLocalContextSingleton.java

+import cats.effect.IOLocal;
+import io.opentelemetry.javaagent.instrumentation.opentelemetryapi.context.AgentContextStorage;
+
+public class IoLocalContextSingleton {


It must be defined in a common package so we can later reuse it to instrument otel4s.

Here is a prototype: iRevive@b2f6501

iRevive · 2025-03-23T20:24:54Z

Some tests have failed with the following error:

java.lang.IllegalStateException: Cannot write to this reference for cats.effect.IO arg0 in read-only context

I assume some VMs aren't happy with the body modification of IO?

laurit · 2025-03-25T06:41:03Z

...ava/io/opentelemetry/javaagent/instrumentation/catseffect/v3_6/IoRuntimeInstrumentation.java

+
+    @Advice.OnMethodEnter(suppress = Throwable.class)
+    public static void onEnter() {
+      FiberLocalContextHelper.initialize(


What if you have deployed 2 wars on tomcat that use this library? Won't this break? Messing with the context storage is unusual; my hunch is that this is not a good idea. Typically, such instrumentations restore the otel context when a fiber starts running on a thread and save the context when it stops using the thread.

iRevive · 2025-03-25

That's a valid concern indeed.

deployed 2 wars on tomcat that use this library

If I understand correctly, each deployment (app) will have its own classloader, but the bootstrap classloader will still be shared.
If that's the case, my implementation won't work, I'm afraid.
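
A minimal sketch of why a shared bootstrap class is a problem with two deployments. It assumes the helper keeps a single static reference and that the last initialize() call wins; the PR's actual helper may behave differently. SharedFiberLocalHolder and TwoDeploymentsDemo are hypothetical names, and contexts are modelled as strings.

```java
// Hypothetical stand-in for a helper class that lives in the bootstrap classloader.
// Assumption: it keeps one static reference and the last initialize() call wins.
final class SharedFiberLocalHolder {
  private static volatile ThreadLocal<String> fiberLocal;

  static void initialize(ThreadLocal<String> local) {
    fiberLocal = local;
  }

  static String current() {
    ThreadLocal<String> local = fiberLocal;
    return local == null ? null : local.get();
  }
}

final class TwoDeploymentsDemo {
  // Returns what the shared helper reports while "app A" has a context set.
  static String contextSeenForAppA() {
    ThreadLocal<String> appA = new ThreadLocal<>(); // owned by deployment A
    ThreadLocal<String> appB = new ThreadLocal<>(); // owned by deployment B

    SharedFiberLocalHolder.initialize(appA);
    SharedFiberLocalHolder.initialize(appB); // silently replaces A's registration

    appA.set("context-of-A");
    try {
      // The helper now reads B's storage, so A's context is invisible (null).
      return SharedFiberLocalHolder.current();
    } finally {
      appA.remove();
    }
  }
}
```

Under that assumption, whichever runtime is constructed last owns the shared slot, and the other deployment's fibers are no longer visible to the agent.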

Suppose I don't find a proper way to make the instrumentation work. Can I distribute the current implementation as a third-party extension? Can the extension have access to the bootstrap loader?

laurit · 2025-03-25

Can the extension have access to the bootstrap loader?

Not directly, but you could try using byte-buddy to define the class you need in the boot loader, or you could experiment with Instrumentation.appendToBootstrapClassLoaderSearch.

iRevive · 2025-04-15T18:24:26Z

Hey @laurit, I've tried a few different less-invasive approaches. Unfortunately they don't work.

The Fiber's context (IOLocal) is slightly more complex than ThreadLocal because it is pinned to a fiber rather than a thread. The fiber can switch threads, be suspended/resumed, and more.

I've tried attaching the current context to a fiber via a VirtualField, but that means I must reimplement the IOLocal propagation logic at the agent level. For example, when the fiber switches threads (e.g., to execute a blocking task), I must activate the attached context on the new thread. This approach also won't work with otel4s (at least without drastic changes to the otel4s propagation model).
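
To illustrate the thread-switch problem in isolation, here is a small plain-Java sketch. It does not use Cats Effect or the agent; ThreadHopDemo and its methods are hypothetical, and a String stands in for the context. It shows that a thread-local value is not visible after a hop to another thread unless something captures it and activates it there.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadHopDemo {
  // Stand-in for a thread-local "current context".
  static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

  // Runs the task on a fresh worker thread and returns its result.
  private static String runOnOtherThread(Callable<String> task) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      return pool.submit(task).get();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  // No propagation: the worker thread has its own, empty slot.
  static String readWithoutPropagation() {
    return runOnOtherThread(CURRENT::get);
  }

  // Manual propagation: capture on the caller, activate on the worker, then clean up.
  static String readWithPropagation() {
    String captured = CURRENT.get();
    return runOnOtherThread(() -> {
      CURRENT.set(captured);
      try {
        return CURRENT.get();
      } finally {
        CURRENT.remove();
      }
    });
  }
}
```

Every place where a fiber can move between threads would need the equivalent of readWithPropagation, which is the reimplementation burden described above.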

I also tried installing a custom ContextWrapper (a variation of FiberContextBridge) for the application's context.

It works to some degree. However, the agent's tracer is unaware of the wrapper:

import cats.effect.IO
import cats.effect.unsafe.implicits.global
import io.opentelemetry.api.trace.Span

// `tracer` is an io.opentelemetry.api.trace.Tracer obtained elsewhere.
val span = tracer.spanBuilder("my-span").start()
val scope = span.makeCurrent()

IO {
  // Returns 'my-span' because the context wrapper is respected. All good.
  val current = Span.current()
  // Creates a brand new span, because it calls the agent's Context.current(),
  // which is a ThreadLocal<Context> in the agent scope.
  val span = tracer.spanBuilder("span").start()
}.unsafeRunSync()

scope.close()
span.end()

Unfortunately, I lack knowledge of agent instrumentation, so there may be other approaches I am unaware of.
Could you suggest some alternatives?

Currently, I have only a few ideas:

1. Would it be possible to keep the current instrumentation but disable it by default? Users would have to enable the instrumentation manually, so we should avoid some of the non-trivial cases. However, I understand that that's a subpar and dangerous implementation, and I'm fine with a no.
2. From what I see, I can create a customized distribution of the OTel agent, something similar to https://github.com/elastic/elastic-otel-java. We can test it for a few iterations, and if it works fine, we can upstream it to the OTel agent (point 1, basically).

laurit · 2025-05-06T12:44:00Z

Unfortunately, I lack knowledge of agent instrumentation, so there may be other approaches I am unaware of.
Could you suggest some alternatives?

Actually, I think the main question is whether you need this instrumentation at all. context.makeCurrent() sets the thread-local context to the provided context and returns a Scope that can be closed to restore the previous context. Essentially, this allows accessing the current context with Context.current() without needing to pass the context around. Thread locals don't play nicely when code can be relocated to a different thread.

Instead of using makeCurrent(), it might make more sense to consider the alternatives that the library provides. For example, when using Kotlin coroutines you'd use withContext(context1.with(animalKey, "dog").asContextElement()) {...} to update the context and coroutineContext.getOpenTelemetryContext() to access the current context. The code for this is here: https://github.com/open-telemetry/opentelemetry-java/tree/main/extensions/kotlin

Now, to interact with libraries that use Context.current(), you could still use makeCurrent() before calling the library code. The important bit is that the execution thread should not change between opening and closing the scope. For the ZIO instrumentation we didn't set this restriction, but in retrospect we probably should have. Allowing execution to transition while there are open scopes just creates problems: we can't reliably close the scope when execution is suspended and have to use Context.root().makeCurrent() to reset the thread context.
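
The makeCurrent()/Scope behaviour described above can be modelled in a few lines of plain Java. MiniContext and MiniScope are hypothetical miniatures with string contexts, not the OpenTelemetry classes; the sketch only shows the "install, then restore the previous value on close" contract.

```java
// Hypothetical miniature of the makeCurrent()/Scope pattern, not the OpenTelemetry classes.
interface MiniScope extends AutoCloseable {
  @Override
  void close(); // no checked exception, so it works cleanly in try-with-resources
}

final class MiniContext {
  private static final ThreadLocal<String> CURRENT = ThreadLocal.withInitial(() -> "root");

  static String current() {
    return CURRENT.get();
  }

  // Installs ctx on the calling thread and returns a scope that restores the previous value.
  static MiniScope makeCurrent(String ctx) {
    String previous = CURRENT.get();
    CURRENT.set(ctx);
    return () -> CURRENT.set(previous);
  }
}
```

Because both the install and the restore touch the calling thread's slot, the scope only behaves correctly when it is opened and closed on the same thread, for example inside a single try-with-resources block.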
What your instrumentation might wish to do is propagate context from the parent to newly launched fibers. We do this in the agent part of the Kotlin coroutines instrumentation. I don't know whether this would be easier or how helpful it would be.
If you look at the Kotlin coroutine instrumentation https://github.com/open-telemetry/opentelemetry-java/blob/5bda810da87731e113ecab85287d327ec88f9969/extensions/kotlin/src/main/java/io/opentelemetry/extension/kotlin/KotlinContextElement.java#L42 you'll see that it provides callbacks for when the coroutine is resumed and suspended, so we can activate the thread-local context. ZIO provides similar callbacks: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/zio/zio-2.0/javaagent/src/main/java/io/opentelemetry/javaagent/instrumentation/zio/v2_0/TracingSupervisor.java

It could help if cats-effect provided something similar, but it might not even be that useful if you replace makeCurrent() with something that is more cats-effect friendly. If cats-effect does not provide something similar to withContext, you could build a library that provides utilities and documentation for cats-effect users that steer them away from makeCurrent() towards alternatives better suited for cats-effect.
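
As a rough sketch of what such resume/suspend callbacks do, here is a plain-Java model. It is not the Kotlin or ZIO instrumentation code: FiberContextHooks, FiberState, onResume and onSuspend are hypothetical names, contexts are strings, and two short-lived threads stand in for a fiber that moves between threads.

```java
final class FiberContextHooks {
  // Stand-in for the thread-local context that Context.current() style lookups read.
  static final ThreadLocal<String> THREAD_CONTEXT = new ThreadLocal<>();

  // Hypothetical per-fiber state; a new fiber starts with its parent's context.
  static final class FiberState {
    String savedContext;

    FiberState(String parentContext) {
      this.savedContext = parentContext;
    }
  }

  // Called when the fiber starts (or resumes) running on the current thread.
  static void onResume(FiberState fiber) {
    THREAD_CONTEXT.set(fiber.savedContext);
  }

  // Called when the fiber stops using the current thread.
  static void onSuspend(FiberState fiber) {
    fiber.savedContext = THREAD_CONTEXT.get();
    THREAD_CONTEXT.remove();
  }

  private static void runOnNewThread(Runnable body) {
    Thread thread = new Thread(body);
    thread.start();
    try {
      thread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  // Simulates one fiber that runs on two different threads.
  static String contextAfterThreadSwitch() {
    FiberState fiber = new FiberState("parent-ctx");
    String[] seen = new String[1];
    runOnNewThread(() -> {
      onResume(fiber);
      THREAD_CONTEXT.set("child-ctx"); // the fiber changes its context
      onSuspend(fiber);
    });
    runOnNewThread(() -> {
      onResume(fiber);
      seen[0] = THREAD_CONTEXT.get(); // read on a different thread
      onSuspend(fiber);
    });
    return seen[0];
  }
}
```

The thread-local slot is only ever a cache of the fiber's own state here, which is why the runtime needs to expose the resume and suspend points for this approach to work.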
@iRevive does this make sense?

Add cats-effect instrumentation

65e7833

iRevive requested a review from a team as a code owner March 23, 2025 19:23

iRevive mentioned this pull request Mar 23, 2025

Add otel4s instrumentation #13549

Closed

iRevive commented Mar 23, 2025


iRevive added 6 commits March 24, 2025 11:54

Fix muzzle rules

6c238a3

Instrument IOFiber constructor instead of IO#unsafeRunFiber

9208afd

Address feedback

b45ecc5

Disable thread propagation debugger in tests

be975e0

Remove redundant JVM flags

6816d6a

Load instrumentation in the otel-api-bridge group

8251c17

laurit reviewed Mar 25, 2025

