
feat(ai): First implementation of TritonServer metrics feature #5687

Merged
merged 36 commits into from
Feb 20, 2025

Conversation

pierantoniomerlino
Contributor

@pierantoniomerlino pierantoniomerlino commented Jan 31, 2025

This PR adds a new API for managing status and performance metrics of an Inference Engine. The implementation for the Triton Server is added as well.

Related Issue: This PR fixes/closes N/A

Description of the solution adopted: The new InferenceEngineMetricsService interface provides APIs to get metrics from an inference engine in the form of a Map whose keys are the metric names.

The TritonServer implementation retrieves the GPU metrics from the metrics port via HTTP and the model statistics from the inference port via gRPC. Both are parsed and rearranged into a JSON format. The keys for the GPU metrics are:

gpu_metrics.<gpu_uuid>

where gpu_uuid is an identifier for the GPU device. For the model statistics, instead, the keys are:

model_metrics.<model.name>.<version>

Moreover, the old TritonServer implementation has been deleted, since it was deprecated in previous releases.
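As a rough illustration of the flat key scheme described above, a minimal sketch follows. The class and method names are hypothetical (they are not the PR's actual code); it only shows how keys like gpu_metrics.<gpu_uuid> and model_metrics.<model.name>.<version> could be assembled into the Map returned by the service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the flat metric key scheme; illustrative only.
public class MetricKeys {

    // Builds a key of the form "gpu_metrics.<gpu_uuid>"
    static String gpuKey(String gpuUuid) {
        return "gpu_metrics." + gpuUuid;
    }

    // Builds a key of the form "model_metrics.<model.name>.<version>"
    static String modelKey(String modelName, String version) {
        return "model_metrics." + modelName + "." + version;
    }

    public static void main(String[] args) {
        // Example values; the real UUID and statistics come from Triton.
        Map<String, String> metrics = new LinkedHashMap<>();
        metrics.put(gpuKey("GPU-5f8e1a2b"), "{\"utilization\": 0.42}");
        metrics.put(modelKey("resnet50", "1"), "{\"inference_count\": 128}");
        System.out.println(metrics.keySet());
    }
}
```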

@pierantoniomerlino
Contributor Author

@mattdibi could you please take a look at this PR, even if it is still in draft? I still have to test some corner cases and review some design choices, but the main work is here.

@pierantoniomerlino pierantoniomerlino marked this pull request as ready for review February 10, 2025 15:06
@pierantoniomerlino
Contributor Author

Tested on an NVIDIA Jetson Orin Nano with the following scenarios:

  • containerized TritonServer on the Orin Nano
  • remote TritonServer
  • old implementation (remote only)

The native and containerized versions expose the metrics configuration, while the remote one has no such properties. However, all of them implement InferenceEngineMetricsService, since the metrics are likely enabled in the Triton Server default configuration.

cardinality="0"
required="true"
default="true"
description="Enable the Triton Server Metrics feature. This property enables the default, CPU and GPU metrics, if available.">
Contributor


Suggested change
description="Enable the Triton Server Metrics feature. This property enables the default, CPU and GPU metrics, if available.">
description="Enable the Triton Server Metrics feature. This property enables the default CPU and GPU metrics, if available.">

Contributor Author


@MMaiero The property enables the metrics feature, which contains the CPU, GPU and standard statistics. I'll rephrase it as:

Enable the Triton Server Metrics feature. This property enables the standard statistics and CPU/GPU metrics, if available.

cardinality="0"
required="false"
default=""
description="A semi-colon separated list of metrics-specific configuration settings for the Triton Server Metrics. (i.e. counter_latencies=true)">
Contributor


Suggested change
description="A semi-colon separated list of metrics-specific configuration settings for the Triton Server Metrics. (i.e. counter_latencies=true)">
description="A semi-colon separated list of Triton Server Metrics. (i.e. counter_latencies=true)">

Contributor Author


@MMaiero This property allows configuring the metrics, enabling or disabling the statistics emitted by the service. For example, when the metrics are enabled, a user can disable the counter_latencies setting with --metrics-config counter_latencies=false. So they aren't new metrics, but configuration settings for metrics that are already there.
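To make the property format concrete, here is a minimal, hypothetical sketch (not the PR's actual code) of parsing the semicolon-separated settings string, e.g. "counter_latencies=false;summary_latencies=true", into key/value pairs that could then be forwarded to Triton as repeated --metrics-config flags. counter_latencies comes from the discussion above; summary_latencies is an assumed second key used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative parser for the semicolon-separated metrics configuration
// property; names and structure are assumptions, not the bundle's real code.
public class MetricsConfigParser {

    // Splits "key1=val1;key2=val2" into an ordered map, skipping malformed entries.
    static Map<String, String> parse(String property) {
        Map<String, String> settings = new LinkedHashMap<>();
        if (property == null || property.isBlank()) {
            return settings;
        }
        for (String entry : property.split(";")) {
            String[] kv = entry.trim().split("=", 2);
            if (kv.length == 2 && !kv[0].trim().isEmpty()) {
                settings.put(kv[0].trim(), kv[1].trim());
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        System.out.println(parse("counter_latencies=false;summary_latencies=true"));
    }
}
```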

@pierantoniomerlino pierantoniomerlino marked this pull request as draft February 17, 2025 16:24
@pierantoniomerlino pierantoniomerlino marked this pull request as ready for review February 19, 2025 09:54
pierantoniomerlino and others added 19 commits February 20, 2025 09:22
Signed-off-by: pierantoniomerlino <[email protected]>
…clipse.kura.ai.triton.server.TritonServerContainerService.xml

Co-authored-by: Matteo Maiero <[email protected]>
Signed-off-by: pierantoniomerlino <[email protected]>
@pierantoniomerlino
Contributor Author

The parser for the model statistics has been updated to use the JsonFormat class provided by com.google.protobuf:protobuf-java-util. The library is embedded in the org.eclipse.kura.ai.triton.server bundle.

The library has been successfully checked using the Eclipse Dash License Tool:

echo "com.google.protobuf:protobuf-java-util:3.25.3" | java -jar org.eclipse.dash.licenses-1.1.1-20250220.065102-441.jar -
[main] INFO Querying Eclipse Foundation for license data for 1 items.
[main] INFO Found 0 items.
[main] INFO Querying ClearlyDefined for license data for 1 items.
[main] INFO Found 1 items.
[main] INFO Vetted license information was found for all content. No further investigation is required.

mattdibi
mattdibi previously approved these changes Feb 20, 2025
@pierantoniomerlino pierantoniomerlino merged commit ddef8d6 into develop Feb 20, 2025
5 checks passed
@pierantoniomerlino pierantoniomerlino deleted the tritonserver_metrics branch February 20, 2025 13:26