Support nvidia-smi v13 schema, plus a few other issues. #18176

robcowart · 2025-12-31T18:29:29Z

Summary

Checklist

No AI generated code was used in this PR
AI generated code used in this PR follows the InfluxData Policy on AI-Generated Code Contributions

I am not spending any time reading some policy. GitHub Copilot was certainly auto-completing a few lines here and there. If that is OK, accept the PR. If not ¯\_(ツ)_/¯

Related issues

resolves:

nvidia_smi,arch=Ampere,host=ava,index=0,name=NVIDIA\ RTX\ A5000,serial=1234567890123,uuid=GPU-12345678-aaaa-bbbb-cccc-0123456789ab ecc_errors_aggregate_dram_correctable=0i,ecc_errors_aggregate_dram_uncorrectable=0i,ecc_errors_aggregate_sram_uncorrectable_pcie=0i,remapped_rows_uncorrectable=0i,remapped_rows_pending="No",utilization_decoder=0i,pcie_link_gen_current=1i,fbc_stats_session_count=0i,driver_version="590.44.01",cuda_version="13.1",vbios_version="94.02.6D.00.0D",ecc_errors_volatile_dram_uncorrectable=0i,ecc_errors_aggregate_sram_uncorrectable_other=0i,ecc_errors_tpc_repair_pending="No",remapped_rows_failure="No",fbc_stats_average_fps=0i,ecc_errors_aggregate_sram_uncorrectable_secded=0i,ecc_errors_aggregate_sram_uncorrectable_microcontroller=0i,temperature_gpu=11i,fbc_stats_average_latency=0i,clocks_current_video=555i,compute_mode="Default",memory_reserved=441i,ecc_errors_aggregate_sram_uncorrectable_l2=0i,clocks_current_graphics=210i,clocks_current_sm=210i,power_draw=6.18,fan_speed=30i,memory_total=23028i,ecc_errors_volatile_sram_uncorrectable_parity=0i,ecc_errors_aggregate_sram_uncorrectable_parity=0i,ecc_errors_aggregate_sram_threshold_exceeded="No",remapped_rows_correctable=0i,utilization_encoder=0i,utilization_ofa=0i,pstate="P8",memory_free=555i,ecc_errors_volatile_sram_uncorrectable_secded=0i,utilization_memory=0i,utilization_jpeg=0i,pcie_link_width_current=16i,encoder_stats_session_count=0i,encoder_stats_average_fps=0i,display_active="Disabled",current_ecc="Enabled",memory_used=22033i,ecc_errors_volatile_dram_correctable=0i,ecc_errors_aggregate_sram_uncorrectable_sm=0i,ecc_errors_channel_repair_pending="No",utilization_gpu=0i,encoder_stats_average_latency=0i,ecc_errors_volatile_sram_correctable=0i,ecc_errors_aggregate_sram_correctable=0i,clocks_current_memory=405i,power_limit=230 1767202760000000000
nvidia_smi_process,host=ava,name=/root/src/llama.cpp/build/bin/llama-server,type=C pid=2623i,used_memory=22024i 1767202760000000000

or a bit easier to read...

{
  "fields": {
    "clocks_current_graphics": 210,
    "clocks_current_memory": 405,
    "clocks_current_sm": 210,
    "clocks_current_video": 555,
    "compute_mode": "Default",
    "cuda_version": "13.1",
    "current_ecc": "Enabled",
    "display_active": "Disabled",
    "driver_version": "590.44.01",
    "ecc_errors_aggregate_dram_correctable": 0,
    "ecc_errors_aggregate_dram_uncorrectable": 0,
    "ecc_errors_aggregate_sram_correctable": 0,
    "ecc_errors_aggregate_sram_threshold_exceeded": "No",
    "ecc_errors_aggregate_sram_uncorrectable_l2": 0,
    "ecc_errors_aggregate_sram_uncorrectable_microcontroller": 0,
    "ecc_errors_aggregate_sram_uncorrectable_other": 0,
    "ecc_errors_aggregate_sram_uncorrectable_parity": 0,
    "ecc_errors_aggregate_sram_uncorrectable_pcie": 0,
    "ecc_errors_aggregate_sram_uncorrectable_secded": 0,
    "ecc_errors_aggregate_sram_uncorrectable_sm": 0,
    "ecc_errors_channel_repair_pending": "No",
    "ecc_errors_tpc_repair_pending": "No",
    "ecc_errors_volatile_dram_correctable": 0,
    "ecc_errors_volatile_dram_uncorrectable": 0,
    "ecc_errors_volatile_sram_correctable": 0,
    "ecc_errors_volatile_sram_uncorrectable_parity": 0,
    "ecc_errors_volatile_sram_uncorrectable_secded": 0,
    "encoder_stats_average_fps": 0,
    "encoder_stats_average_latency": 0,
    "encoder_stats_session_count": 0,
    "fan_speed": 30,
    "fbc_stats_average_fps": 0,
    "fbc_stats_average_latency": 0,
    "fbc_stats_session_count": 0,
    "memory_free": 555,
    "memory_reserved": 441,
    "memory_total": 23028,
    "memory_used": 22033,
    "pcie_link_gen_current": 1,
    "pcie_link_width_current": 16,
    "power_draw": 6.25,
    "power_limit": 230,
    "pstate": "P8",
    "remapped_rows_correctable": 0,
    "remapped_rows_failure": "No",
    "remapped_rows_pending": "No",
    "remapped_rows_uncorrectable": 0,
    "temperature_gpu": 11,
    "utilization_decoder": 0,
    "utilization_encoder": 0,
    "utilization_gpu": 0,
    "utilization_jpeg": 0,
    "utilization_memory": 0,
    "utilization_ofa": 0,
    "vbios_version": "94.02.6D.00.0D"
  },
  "name": "nvidia_smi",
  "tags": {
    "arch": "Ampere",
    "host": "ava",
    "index": "0",
    "name": "NVIDIA RTX A5000",
    "serial": "1234567890123",
    "uuid": "GPU-12345678-aaaa-bbbb-cccc-0123456789ab"
  },
  "timestamp": 1767202870
}

{
  "fields": {
    "pid": 2623,
    "used_memory": 22024
  },
  "name": "nvidia_smi_process",
  "tags": {
    "host": "ava",
    "name": "/root/src/llama.cpp/build/bin/llama-server",
    "type": "C"
  },
  "timestamp": 1767202870
}
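For anyone consuming the JSON form above, here is a minimal Go sketch of decoding one metric and deriving a memory-usage percentage from `memory_used` and `memory_total`. This is not code from the PR: the `Metric` struct and the `decodeMetric` and `memoryUsedPercent` helpers are hypothetical names introduced only for illustration, and the sample payload is trimmed to the fields the sketch needs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metric mirrors the JSON shape shown above (fields, name, tags, timestamp).
// Hypothetical type for illustration; it is not part of Telegraf.
type Metric struct {
	Fields    map[string]interface{} `json:"fields"`
	Name      string                 `json:"name"`
	Tags      map[string]string      `json:"tags"`
	Timestamp int64                  `json:"timestamp"`
}

// decodeMetric parses a single JSON-serialized metric.
func decodeMetric(data []byte) (Metric, error) {
	var m Metric
	err := json.Unmarshal(data, &m)
	return m, err
}

// memoryUsedPercent returns memory_used / memory_total * 100.
// encoding/json decodes untyped numbers as float64, hence the assertions.
// The second return value is false if either field is missing or unusable.
func memoryUsedPercent(m Metric) (float64, bool) {
	used, okUsed := m.Fields["memory_used"].(float64)
	total, okTotal := m.Fields["memory_total"].(float64)
	if !okUsed || !okTotal || total == 0 {
		return 0, false
	}
	return used / total * 100, true
}

func main() {
	// Trimmed copy of the nvidia_smi metric from the PR description.
	sample := []byte(`{
	  "fields": {"memory_total": 23028, "memory_used": 22033, "pstate": "P8"},
	  "name": "nvidia_smi",
	  "tags": {"index": "0", "name": "NVIDIA RTX A5000"},
	  "timestamp": 1767202870
	}`)

	m, err := decodeMetric(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	if pct, ok := memoryUsedPercent(m); ok {
		fmt.Printf("%s index=%s: %.1f%% memory used\n", m.Name, m.Tags["index"], pct)
	}
}
```

String-valued fields such as `pstate` or `remapped_rows_pending` stay as strings in the `Fields` map, so a consumer has to type-check each field before doing arithmetic on it.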

telegraf-tiger · 2025-12-31T18:29:37Z

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

robcowart · 2025-12-31T18:46:28Z

Wow! Some of these requirements are pretty annoying just to submit a PR to an MIT-licensed open source project. There is no way that this is supportive of community contributions.

I am NOT going to spend any time reading about "Semantic PR and Commit messages". If a maintainer wants to rename the PR, go for it.
I am not going to sign a CLA without InfluxData committing to reimburse me for the costs of having my attorney review it first.

It really disappoints me now that I wasted time making this PR.

telegraf-tiger · 2025-12-31T18:53:56Z

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

☺️ This pull request doesn't significantly change the Telegraf binary size (less than 1%)

📦 Click here to get additional PR build artifacts

Artifact URLs

.DEB	.RPM	.TAR.GZ	.ZIP
amd64.deb	aarch64.rpm	darwin_amd64.tar.gz	windows_amd64.zip
arm64.deb	armel.rpm	darwin_arm64.tar.gz	windows_arm64.zip
armel.deb	armv6hl.rpm	freebsd_amd64.tar.gz	windows_i386.zip
armhf.deb	i386.rpm	freebsd_armv7.tar.gz
i386.deb	ppc64le.rpm	freebsd_i386.tar.gz
mips.deb	riscv64.rpm	linux_amd64.tar.gz
mipsel.deb	s390x.rpm	linux_arm64.tar.gz
ppc64el.deb	x86_64.rpm	linux_armel.tar.gz
riscv64.deb		linux_armhf.tar.gz
s390x.deb		linux_i386.tar.gz
		linux_mips.tar.gz
		linux_mipsel.tar.gz
		linux_ppc64le.tar.gz
		linux_riscv64.tar.gz
		linux_s390x.tar.gz

robcowart added 4 commits December 31, 2025 16:28

[17482] mvidia_smi - v13 schema support

58ff61d

[17417] nvidia_smi - correct tag selction

5cfba5e

[17416] nvidia_smi - add ECC errors to records

f9c6452

nvidia_smi - update tests for various changes

ece8f06

commit for some irrelevant test

fbf5481
