Add Dynamic Tag Support in Log4j2Metrics#7389

Open
Harsh3305 wants to merge 5 commits into
micrometer-metrics:mainfrom
Harsh3305:main
Conversation

@Harsh3305

Adds support for attaching tags at runtime, derived from log events, in Log4j2Metrics. This allows us to send information like the exception name, Kafka partition, ... as a tag.

@Harsh3305 Harsh3305 marked this pull request as ready for review April 8, 2026 17:28
@jonatan-ivanov
Member

Thank you for the PR!
Could you please explain what is your exact use case? Why do you need this?
If you want to count exceptions from a component, it might be a better idea to instrument that component. Or in case of Kafka, instrument Kafka itself.

@jonatan-ivanov jonatan-ivanov added the waiting for feedback We need additional information before we can continue label Apr 8, 2026
@Harsh3305
Author

Thank you for the PR! Could you please explain what is your exact use case? Why do you need this? If you want to count exceptions from a component, it might be a better idea to instrument that component. Or in case of Kafka, instrument Kafka itself.

Hi @jonatan-ivanov
I was working on a use case where we want to monitor and set SLOs on the number of error logs across multiple Spring Boot applications. Down the line we use Micrometer core to push log4j2.events with the log level as a tag (by enabling log4j2.events in the config file, application.yml), and those metrics are then used in Datadog to set up monitors, alerting, and SLOs.
While working on that use case, we realised that the number of error logs alone is not sufficient to identify problems in the services, so I also started showing the exception name on the dashboard and set monitors accordingly. Since in Micrometer core the tags can only be set up at bean creation time, I was no longer able to use the Micrometer code as-is. Because Actuator has a ConditionalOnMissingBean condition on the Log4j2Metrics bean (reference), I created another class in my project that extends Log4j2Metrics and supports adding tags at runtime based on the log event. This solved the problem for me, but I then thought that other developers might have the same kind of use case, so I tried extending the current implementation of Micrometer core to support adding tags based on the log event.
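To make the idea concrete, here is a minimal, dependency-free sketch of what "deriving a tag from the log event at runtime" means. The `LogEvent` record, `DynamicTagCounters` class, and `onLogEvent` method are hypothetical stand-ins for illustration, not the Log4j2 or Micrometer API; the point is only that the tag set (here, the exception class name) is computed per event instead of being fixed at bean creation time:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Log4j2 LogEvent; not the real API.
record LogEvent(String level, Throwable thrown) {}

class DynamicTagCounters {
    // One count per (level, exception) tag combination, keyed by a flattened tag string.
    private final Map<String, Long> counters = new HashMap<>();

    // Derive the exception tag from the event at runtime instead of at bean creation.
    void onLogEvent(LogEvent event) {
        String exceptionTag = event.thrown() == null
                ? "none"
                : event.thrown().getClass().getName();
        String key = "log4j2.events{level=" + event.level() + ",exception=" + exceptionTag + "}";
        counters.merge(key, 1L, Long::sum);
    }

    long count(String level, String exceptionTag) {
        return counters.getOrDefault(
                "log4j2.events{level=" + level + ",exception=" + exceptionTag + "}", 0L);
    }
}
```

In a real Log4j2Metrics extension, the `Map` would of course be replaced by Micrometer `Counter`s; the sketch only models the tag-derivation step.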

If you want to count exceptions from a component, it might be a better idea to instrument that component. Or in case of Kafka, instrument Kafka itself.

Our services use multiple dependencies, and those dependencies can also throw, or log errors with a Throwable object, so we cannot rely on custom metrics alone. Since most of these dependencies use Log4j to log exceptions, I thought of using Micrometer core to publish the tags in log4j2.events, because multiple applications already use log4j2.events and have monitors set up on top of it.

@jonatan-ivanov
Member

jonatan-ivanov commented Apr 9, 2026

So if I understand correctly, you are trying to parse out the necessary information from the log message (and logger name, context map, log level, etc.). This sounds very brittle to me, but I might have an alternative solution where you don't need to parse strings. (I'm also afraid that your solution might have a performance impact that not every user wants to pay for if they don't attach tags dynamically.)

I would recommend taking a look at the Observation API, the whole Spring portfolio is instrumented with it as well as a lot of other frameworks/libraries.

The idea behind it is that when an instrumentation signals an error (calls observation.error(...)), registered handlers will be notified. Spring Boot registers handlers for distributed tracing and metrics (that's how you get metrics for Spring) but you can register your own custom handlers where you can count how many errors happened, for example:

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationHandler;

class ErrorCountingObservationHandler implements ObservationHandler<Observation.Context> {
    private final Meter.MeterProvider<Counter> counterProvider;

    ErrorCountingObservationHandler(MeterRegistry registry) {
        this.counterProvider = Counter.builder("errors")
            // here you can add all the static tags
            .tag("application", "appA")
            .tag("org", "eCommerce")
            .withRegistry(registry);
    }

    @Override
    public void onStop(Observation.Context context) {
        Throwable error = context.getError();
        if (error != null) {
            counterProvider
                // here you can add all the dynamic tags from the context
                .withTag("error", error.getClass().getName())
                .increment();
        }
    }

    @Override
    public boolean supportsContext(Observation.Context context) {
        // here you can select which Observations (contexts) should this handler listen
        // e.g.: return context instanceof ServerRequestObservationContext;
        return true; // means all
    }
}

This is a very basic and simple example and there are some details that might not be apparent:

  • Observation.Context is just a generic class but instrumentations have their own context classes, e.g.: Spring MVC has ServerRequestObservationContext, Spring RestClient has ClientRequestObservationContext, JDBC: DataSourceBaseContext, Spring Kafka: KafkaRecordSenderContext/KafkaRecordReceiverContext. These context objects don't just contain the exception object but important context-related information, e.g.: MVC and RestClient contexts can give you the uri/status code/http method/etc, the JDBC context can give you the db name/host/port/etc, while the Kafka context can give you the remote name/topic/partition/etc.
  • You can have one handler for all or you can write separate handlers for each thing you want to collect errors from (see the generic parameter of ObservationHandler and supportsContext)
  • Components that are not instrumented with the Observation API can be easily instrumented (@Observed is the simplest/easiest way)
  • I also used a MeterProvider to create counters dynamically
  • If you are curious about all the events that are happening in your app around observations, you can register an ObservationTextPublisher which will be chatty about it (only use it locally; apps in production will not be happy)
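To illustrate the notification mechanism described above without pulling in any dependencies, here is a simplified, self-contained model of it. The class names (`Context`, `Handler`, `ErrorCounter`, `Obs`) are stand-ins, not the real Micrometer Observation API: the point is only that when an observation stops, every handler whose `supportsContext` matches is notified and can count the recorded error:

```java
import java.util.List;

// Simplified stand-ins for Observation.Context / ObservationHandler; not the real API.
class Context {
    Throwable error;
}

interface Handler {
    boolean supportsContext(Context context);
    void onStop(Context context);
}

class ErrorCounter implements Handler {
    long errors; // stands in for the dynamically tagged Counter from the MeterProvider

    public boolean supportsContext(Context context) {
        return true; // listen to all observations, like `return true` in the example above
    }

    public void onStop(Context context) {
        if (context.error != null) {
            errors++; // the real handler would also tag by error.getClass().getName()
        }
    }
}

class Obs {
    private final Context context = new Context();
    private final List<Handler> handlers;

    Obs(List<Handler> handlers) { this.handlers = handlers; }

    void error(Throwable t) { context.error = t; }

    // On stop, every matching registered handler is notified with the context.
    void stop() {
        for (Handler h : handlers) {
            if (h.supportsContext(context)) {
                h.onStop(context);
            }
        }
    }
}
```

In the real API the registry, not the observation, holds the handlers, but the flow (instrumentation calls `error(...)`, handlers react on stop) is the same.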

Here's a conference talk if you want to learn more: https://www.youtube.com/watch?v=Qyku6cR6ADY

Please let me know if this helps or if the Observation API will not work for your use case.

@Harsh3305
Author

Thanks @jonatan-ivanov for sharing this alternative approach. I'll look into it and get back to you in case I have any doubts.

@github-actions

If you would like us to look at this PR, please provide the requested information. If the information is not provided within the next 7 days this PR will be closed.

@Harsh3305
Author

@jonatan-ivanov I was looking into the Observation API to understand whether I can use it to solve my use case, and I have some doubts about it.
Let me give you a little more context on the problem statement. We have multiple Spring Boot applications (some using Spring Boot 2, the rest using Spring Boot 3). These services fall into two major categories: offline services (event-processing applications consuming Kafka topics or queues like SQS, or batch-processing applications that run on a cron schedule and do some minor data analysis) and online services (applications that distribute data via REST or gRPC endpoints by reading from DBs or other upstream services). For online services, we also have a concept of handling exceptions gracefully. What I mean by that: say a call is made to a service that must return a list of results, and one of the results is faulty (for example, a field required by the API contract is not available in the DB). We throw an exception, say java.time.format.DateTimeParseException, but later in the flow we catch it and log it without rethrowing. The reason is that we do not want to fail the whole call; instead, we want to emit a metric for that kind of exception so that we can later do a postmortem and analyse what went wrong. I know this is not the ideal scenario, but our service SLOs are such that we are not strictly required to return "complete" and "accurate" data; our services should have low latency and support a high number of requests per second. (I think this is a common use case across organisations. For example, with product images, if one image URL is invalid or null, we still show the product images and just skip the corrupted one.)
As per my understanding (@jonatan-ivanov please correct me if these assumptions are not valid), the Observation API works best in the following scenario:

  1. The application is using Spring Boot 3.x.x, as mentioned previously.
  2. Metrics reflect the overall status of the request. For example, if the response status is 5xx, the metric will reflect the error (or trace) that caused the 5xx. But in our case we sometimes handle the exception gracefully, which might not be captured by the Observation API out of the box (below I describe the change that could enable this, but it requires a lot of manual effort).

Regarding the first point, sooner or later the applications will get migrated to Spring Boot 3, but my major concern is the second point.
Because we handle exceptions gracefully, we would need to make the code change below in multiple repositories and in all their dependencies (which might not be a feasible solution).

for (Data data : dbResults) {
    try {
        // parsing the data which we receive from the DB
        publisher.submit(parsedData);
    } catch (Exception e) {
        log.error("Error while parsing the data", e);
        // this change would need to be made in multiple applications and their dependencies
        observationRegistry.getCurrentObservation().error(e);
    }
}

But if we add dynamic tag support in Log4j2Metrics, we eliminate this change altogether with just a version bump of a dependency.

I'm also afraid that your solution might have a performance impact that not every user wants to pay for if they don't attach tags dynamically.)

I was thinking along similar lines and thought of adding a mechanism for deciding whether to opt into dynamic tags. If an application does not opt in, it will keep using the static counters and its performance will not be affected. Here are the proposed changes: 07af8ca (the build is failing; I will check and update it).
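The opt-in idea can be sketched roughly like this. This is a hypothetical, self-contained illustration (the class and method names are not the proposed PR API): when no dynamic tag provider is configured, the code takes the existing static path and pays no extra cost per log event.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of an opt-in mechanism; names are illustrative only.
class LogEventCounter {
    private final Function<Throwable, String> dynamicTagProvider; // null = not opted in
    private final Map<String, Long> counts = new HashMap<>();

    LogEventCounter(Function<Throwable, String> dynamicTagProvider) {
        this.dynamicTagProvider = dynamicTagProvider;
    }

    void record(Throwable thrown) {
        if (dynamicTagProvider == null) {
            // Static path: behaviour (and cost) identical to today's static counters.
            counts.merge("error", 1L, Long::sum);
        } else {
            // Dynamic path: only applications that opt in pay for the per-event tag lookup.
            counts.merge("error{exception=" + dynamicTagProvider.apply(thrown) + "}",
                    1L, Long::sum);
        }
    }

    long count(String key) {
        return counts.getOrDefault(key, 0L);
    }
}
```

A caller that does not opt in would construct `new LogEventCounter(null)` and see unchanged behaviour; an opted-in caller passes a provider such as `t -> t.getClass().getName()`.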

@jonatan-ivanov Let me know in case my understanding is not correct.
Thanks

Signed-off-by: Harsh Verma <harshverma3305@gmail.com>

Signed-off-by: Harsh3305 <harshverma3305@gmail.com>
@Harsh3305
Author

@jonatan-ivanov In commit aa44178, ci/circleci: build-jdk17 was failing while all the others were passing. I pushed an empty commit to re-trigger the workflow. On the empty commit, ci/circleci: build-jdk17 passes, but now ci/circleci: docker-tests is failing (it was passing on the previous commit). Since I do not have access to re-run the workflow, could you trigger it on my behalf?

@jonatan-ivanov jonatan-ivanov added waiting for team An issue we need members of the team to review and removed waiting for feedback We need additional information before we can continue labels Apr 24, 2026