feat(ai): First implementation of TritonServer metrics feature #5687
Conversation
@mattdibi could you please take a look at this PR, even if it is still in draft? I still have to test some corner cases and check some code choices, but the main stuff is here.
Tested on an Nvidia Jetson Orin Nano with the following scenarios:
The native and containerized versions expose the metrics configuration, while the remote one does not have any such property. However, they all still implement the new metrics interface.
cardinality="0" | ||
required="true" | ||
default="true" | ||
description="Enable the Triton Server Metrics feature. This property enables the default, CPU and GPU metrics, if available."> |
description="Enable the Triton Server Metrics feature. This property enables the default, CPU and GPU metrics, if available."> | |
description="Enable the Triton Server Metrics feature. This property enables the default CPU and GPU metrics, if available."> |
@MMaiero The property enables the metrics feature, which covers the CPU, GPU and standard statistics. I'll rephrase it as follows:
Enable the Triton Server Metrics feature. This property enables the standard statistics and CPU/GPU metrics, if available.
cardinality="0" | ||
required="false" | ||
default="" | ||
description="A semi-colon separated list of metrics-specific configuration settings for the Triton Server Metrics. (i.e. counter_latencies=true)"> |
description="A semi-colon separated list of metrics-specific configuration settings for the Triton Server Metrics. (i.e. counter_latencies=true)"> | |
description="A semi-colon separated list of Triton Server Metrics. (i.e. counter_latencies=true)"> |
@MMaiero This property allows configuring a metric, enabling or disabling the statistics emitted by the service. For example, when the metrics are enabled, a user can disable the counter_latencies metric by setting --metrics-config counter_latencies=false. So they aren't new metrics, but configurations for metrics that are already there.
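For illustration only, here is a minimal sketch of how such a semi-colon separated property could be expanded into `--metrics-config` arguments for the server command line. The `buildMetricsConfigArgs` helper, its class name and the exact property format are assumptions drawn from the description above, not code from this PR:

```java
import java.util.ArrayList;
import java.util.List;

public class MetricsConfigExample {

    // Hypothetical helper: splits a value such as
    // "counter_latencies=false;summary_latencies=true" into one
    // "--metrics-config <key>=<value>" pair per entry.
    static List<String> buildMetricsConfigArgs(String property) {
        List<String> args = new ArrayList<>();
        if (property == null || property.trim().isEmpty()) {
            return args;
        }
        for (String entry : property.split(";")) {
            String setting = entry.trim();
            if (!setting.isEmpty()) {
                args.add("--metrics-config");
                args.add(setting);
            }
        }
        return args;
    }

    public static void main(String[] args) {
        // Prints: [--metrics-config, counter_latencies=false, --metrics-config, summary_latencies=true]
        System.out.println(buildMetricsConfigArgs("counter_latencies=false;summary_latencies=true"));
    }
}
```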
The parser for the model statistics has been updated to use the JsonFormat class; the library providing it has been successfully checked using the Eclipse Dash License Tool.
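As a rough sketch of this approach, assuming JsonFormat is the com.google.protobuf.util.JsonFormat class shipped in the protobuf-java-util artifact and that the gRPC statistics arrive as a generated protobuf message (neither assumption is taken from the PR code):

```java
import com.google.protobuf.InvalidProtocolBufferException;
import com.google.protobuf.MessageOrBuilder;
import com.google.protobuf.util.JsonFormat;

public class ProtoToJsonExample {

    // Converts a protobuf message (e.g. the model statistics response received
    // over gRPC) into a JSON string, keeping the original proto field names.
    static String toJson(MessageOrBuilder statistics) throws InvalidProtocolBufferException {
        return JsonFormat.printer()
                .preservingProtoFieldNames()
                .print(statistics);
    }
}
```

From there the JSON fields can be re-arranged into the metric map that the service returns.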
This PR adds a new API for managing status and performance metrics of an Inference Engine. The implementation for the Triton Server is added as well.
Related Issue: This PR fixes/closes N/A
Description of the solution adopted: The new InferenceEngineMetricsService interface provides APIs for getting metrics from an inference engine in the form of a Map whose keys are the metric names. The implementation for the Triton Server engine retrieves the GPU metrics from the metrics port via HTTP and the model statistics from the inference port via gRPC. Both sets of metrics are parsed and re-arranged into JSON format. The keys for the GPU metrics contain a gpu_uuid, which identifies the GPU device, while the model statistics use their own set of keys.

Moreover, the old TritonServer implementation has been deleted, since it was deprecated in previous releases.
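To illustrate how a consumer might use the new API, here is a minimal sketch. The getMetrics method name, its return type and the example key filtering are assumptions based on the description above, not the actual interface declared in this PR:

```java
import java.util.Map;

public class MetricsConsumerExample {

    // Hypothetical shape of the new API: metric name -> metric value.
    // The real interface is InferenceEngineMetricsService; its exact
    // signature may differ from this sketch.
    interface InferenceEngineMetricsService {
        Map<String, Object> getMetrics();
    }

    static void printGpuMetrics(InferenceEngineMetricsService service) {
        Map<String, Object> metrics = service.getMetrics();
        // GPU metrics are reported per device, identified by a gpu_uuid
        // embedded in the metric key (key shape assumed for this example).
        metrics.forEach((name, value) -> {
            if (name.contains("gpu")) {
                System.out.println(name + " = " + value);
            }
        });
    }
}
```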