[vslib] Add High Frequency Telemetry (HFT) support for virtual SAI #1812
Draft
50n1c-rnsft wants to merge 20 commits into sonic-net:master
Conversation
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
9fb21e0 to f73d1ed
7621c8f to 76dc23e
- Add 'libnl', 'syscall', 'templatehdr' to tests/aspell.en.pws
- Add SWSS_LOG_ENTER() to test fixture methods in TestTAMIpfixTemplate.cpp (SetUp, getIpfixTemplates, createCounterSubscription, validateTemplateHeader)
- Add swss/logger.h include to test file

Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
6c00dcd to 17862f1
50n1c-rnsft added a commit to 50n1c-rnsft/sonic-buildimage that referenced this pull request Mar 25, 2026

Add a minimal generic netlink kernel module (sonic_stel) for the VS platform that registers the 'sonic_stel' family with an 'ipfix' multicast group. This module acts as a relay: vslib (virtual SAI) sends IPFIX data records via SONIC_STEL_CMD_SEND_IPFIX, and the module multicasts them to countersyncd via genlmsg_multicast.

Files:
- platform/vs/sonic-stel-module/ - kernel module source + debian packaging
- platform/vs/sonic-stel-ko.mk - buildimage rule

Related PR: sonic-net/sonic-sairedis#1812

Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
17862f1 to 64d2196
64d2196 to 5527bb2
- Add 'genl' and 'ko' to tests/aspell.en.pws
- Add '// SWSS_LOG_ENTER() omitted' comments to static helper functions:
  - SwitchStateBaseStel.cpp: put_u16_be, put_u32_be, put_u64_be, build_ipfix_data_message
  - SwitchStateBase.cpp: write_u16_be, write_u32_be
  - TestTAMIpfixTemplate.cpp: read_u16_be, read_u32_be

Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
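For readers unfamiliar with the helpers named in this commit, the following is a minimal sketch of what big-endian "put" helpers of this kind typically look like. The signatures here (appending to a `std::vector<uint8_t>`) are an assumption for illustration; the PR's actual `put_u16_be`, `put_u32_be`, and `put_u64_be` may take a raw pointer or differ in other ways.

```cpp
#include <cstdint>
#include <vector>

// Illustrative signatures only (assumption): the real helpers in
// SwitchStateBaseStel.cpp may be declared differently.

// Append a 16-bit value in network (big-endian) byte order.
static void put_u16_be(std::vector<uint8_t>& buf, uint16_t v)
{
    // SWSS_LOG_ENTER() omitted
    buf.push_back(static_cast<uint8_t>(v >> 8));
    buf.push_back(static_cast<uint8_t>(v & 0xFF));
}

// Append a 32-bit value, most significant byte first.
static void put_u32_be(std::vector<uint8_t>& buf, uint32_t v)
{
    // SWSS_LOG_ENTER() omitted
    for (int shift = 24; shift >= 0; shift -= 8)
    {
        buf.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
    }
}

// Append a 64-bit value, most significant byte first.
static void put_u64_be(std::vector<uint8_t>& buf, uint64_t v)
{
    // SWSS_LOG_ENTER() omitted
    for (int shift = 56; shift >= 0; shift -= 8)
    {
        buf.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
    }
}
```

Shifting and masking byte by byte keeps the output independent of host endianness, which matters because IPFIX fields are transmitted in network byte order.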
186e94e to 2c80d87
Comment on lines +572 to +624

    switch (attr->value.s32)
    {
        case SAI_TAM_TEL_TYPE_STATE_CREATE_CONFIG:
            send_tam_tel_type_config_change(tam_tel_type_id);
            break;

        case SAI_TAM_TEL_TYPE_STATE_START_STREAM:
        {
            // Count counter subscriptions for this tel_type to determine num_counters
            size_t num_counters = 0;
            auto it = m_objectHash.find(SAI_OBJECT_TYPE_TAM_COUNTER_SUBSCRIPTION);
            if (it != m_objectHash.end())
            {
                for (auto &kv : it->second)
                {
                    auto tel_it = kv.second.find(sai_serialize_attr_id(
                                *sai_metadata_get_attr_metadata(
                                    SAI_OBJECT_TYPE_TAM_COUNTER_SUBSCRIPTION,
                                    SAI_TAM_COUNTER_SUBSCRIPTION_ATTR_TEL_TYPE)));

                    if (tel_it != kv.second.end() &&
                            tel_it->second->getAttr()->value.oid == tam_tel_type_id)
                    {
                        num_counters++;
                    }
                }
            }

            // Default poll interval: 100ms (100000 us)
            // In real HW this comes from TAM report config; for vslib use a reasonable default
            uint32_t poll_interval_us = 100000;
            uint16_t template_id = 256;

            SWSS_LOG_NOTICE("Starting STEL stream for tel_type %s: %zu counters, %u us interval",
                    sai_serialize_object_id(tam_tel_type_id).c_str(),
                    num_counters, poll_interval_us);

            if (num_counters > 0)
            {
                startStelStream(poll_interval_us, template_id, num_counters);
            }
            break;
        }

        case SAI_TAM_TEL_TYPE_STATE_STOP_STREAM:
            SWSS_LOG_NOTICE("Stopping STEL stream for tel_type %s",
                    sai_serialize_object_id(tam_tel_type_id).c_str());
            stopStelStream();
            break;

        default:
            break;
    }

Check notice: Code scanning / CodeQL
Long switch case (Note)
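One common way to address a "Long switch case" notice is to move the counting loop out of the `START_STREAM` case into a small helper. The sketch below shows the idea with simplified stand-in types; `countSubscriptionsForTelType`, `AttrMap`, and `ObjectMap` are hypothetical names, and the real vslib object hash stores attribute wrappers rather than plain OIDs, so this is not a drop-in change.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Simplified stand-ins (assumptions): in vslib the attribute values are
// wrapper objects accessed via getAttr(), not raw 64-bit OIDs.
using ObjectId = uint64_t;
using AttrMap = std::map<std::string, ObjectId>;   // attribute name -> OID value
using ObjectMap = std::map<std::string, AttrMap>;  // serialized object id -> attributes

// Hypothetical helper: count the subscriptions whose TEL_TYPE attribute
// points at the given tel_type object.
static size_t countSubscriptionsForTelType(
        const ObjectMap& subscriptions,
        const std::string& telTypeAttrName,
        ObjectId telTypeId)
{
    size_t count = 0;

    for (const auto& kv : subscriptions)
    {
        auto it = kv.second.find(telTypeAttrName);

        if (it != kv.second.end() && it->second == telTypeId)
        {
            ++count;
        }
    }

    return count;
}
```

With a helper along these lines, the `START_STREAM` case would reduce to computing `num_counters`, logging, and calling `startStelStream()`.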
Per RFC 7011, an IPFIX Exporting Process must send Template Records before any Data Records. The STEL worker thread was only sending Data Records (Set ID=256), causing countersyncd to fail with:

'Failed to parse IPFIX data message: Error: Missing Template at 0x14'

Fix:
- Add build_ipfix_template_message() to wrap template set in IPFIX header
- Read IPFIX template from SAI_TAM_TEL_TYPE_ATTR_IPFIX_TEMPLATES on START_STREAM and pass to worker thread
- Send Template Record before first Data Record
- Re-send Template Record every ~30 seconds per RFC 7011 recommendation
- Update startStelStream/stelWorkerThread signatures to accept template

Tested: 203/203 unit tests pass, aspell/swsslogenter clean.

Bug found by msft-internal-linux during VS testbed integration testing.

Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
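To make "wrap template set in IPFIX header" concrete, here is a minimal sketch of prepending the 16-byte RFC 7011 message header (Version, Length, Export Time, Sequence Number, Observation Domain ID) to an already-encoded set. The function name `wrap_in_ipfix_header` and its parameters are hypothetical; this is not the PR's `build_ipfix_template_message()`, whose signature is not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (assumption, not the PR's code): prepend the 16-byte
// RFC 7011 message header to one or more already-encoded sets.
// Returns an empty vector if the result would not fit the 16-bit Length field.
static std::vector<uint8_t> wrap_in_ipfix_header(
        const std::vector<uint8_t>& sets,
        uint32_t export_time_sec,
        uint32_t sequence_number,
        uint32_t observation_domain_id)
{
    const size_t kHeaderLen = 16;
    const size_t total = kHeaderLen + sets.size();

    if (total > 0xFFFF)
    {
        return {};
    }

    std::vector<uint8_t> msg;
    msg.reserve(total);

    auto put16 = [&msg](uint16_t v) {
        msg.push_back(static_cast<uint8_t>(v >> 8));
        msg.push_back(static_cast<uint8_t>(v & 0xFF));
    };
    auto put32 = [&msg](uint32_t v) {
        for (int shift = 24; shift >= 0; shift -= 8)
        {
            msg.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
        }
    };

    put16(10);                            // Version: 10 identifies IPFIX
    put16(static_cast<uint16_t>(total));  // Length: header plus all sets
    put32(export_time_sec);               // Export Time, seconds since the epoch
    put32(sequence_number);               // Sequence Number
    put32(observation_domain_id);         // Observation Domain ID

    msg.insert(msg.end(), sets.begin(), sets.end());
    return msg;
}
```

For example, a 12-byte Template Set (Set ID 2) defining template 256 with one 8-byte field would yield a 28-byte message whose Length field reads 0x001C.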
Revert commit af43e00.

Per Ze's clarification, IPFIX templates are NOT sent via the data plane (genetlink). Instead:
1. orchagent queries SAI_TAM_TEL_TYPE_ATTR_IPFIX_TEMPLATES
2. orchagent writes template to STATE_DB (session_config field)
3. countersyncd reads template from STATE_DB

The genetlink channel only carries IPFIX Data Records. The 'Missing Template' error from countersyncd testing was because orchagent had not yet written the template to STATE_DB, not because vslib needed to send it.

Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
Signed-off-by: Ze Gan agent <ganze_12345@qq.com>
Description
Add High Frequency Telemetry (HFT) / Stream Telemetry support to the virtual SAI (vslib), enabling the full HFT pipeline to run in virtual/test environments.

What this PR does

Task 1: TAM object initialization
- SAI_SWITCH_ATTR_TAM_ST_REPORT_CHUNK_SIZE (default: 65535) and SAI_SWITCH_ATTR_TAM_ST_CHUNK_COUNT (default: 0) to set_initial_tam_objects()
- SAI_OBJECT_TYPE_TAM_TEL_TYPE and SAI_OBJECT_TYPE_TAM_COUNTER_SUBSCRIPTION to supported object types
- TAM_COUNTER_SUBSCRIPTION CRUD works through the existing generic create_internal() path

Task 2: IPFIX template generation
- refresh_tam_tel_ipfix_templates() with real IPFIX template binary per HLD 7.2.2
- (stat_id << 16) | object_type
- label | 0x8000 (enterprise bit set)
- observationTimeNanoseconds (Element ID=325, 8 bytes)
- SWSS_LOG_WARN for values exceeding 15-bit range

Task 3: STEL genetlink sender
- SwitchStateBaseStel.cpp: RAII GenlConnection wrapper using libnl-genl (genl_connect(), genl_ctrl_resolve(), genlmsg_put(), nla_put(), nl_send_auto())
- setTamTelType() to start/stop STEL stream on START_STREAM/STOP_STREAM state changes
- ~SwitchStateBase() destructor

Task 4: Stats capability extension
- queryBufferPoolStatsCapability() (15 stats) and queryIngressPriorityGroupStatsCapability() (9 stats)
- queryStatsCapability() to dispatch BUFFER_POOL and INGRESS_PRIORITY_GROUP object types

Files changed (14 files, +1089/-156 lines)
- vslib/SwitchStateBase.cpp
- vslib/SwitchStateBase.h
- vslib/SwitchStateBaseStel.cpp
- vslib/sonic_stel_uapi.h
- vslib/Makefile.am: -lnl-genl-3 -lnl-3 link flags
- vslib/tests.cpp
- syncd/Makefile.am: -lnl-genl-3 -lnl-3 to SAILIB
- syncd/tests/Makefile.am: -lnl-genl-3 -lnl-3 link flags
- saidiscovery/Makefile.am: -lnl-genl-3 -lnl-3 to SAILIB
- saisdkdump/Makefile.am: -lnl-genl-3 -lnl-3 to SAILIB
- tests/Makefile.am: -lnl-genl-3 -lnl-3 to SAILIB
- tests/aspell.en.pws
- unittest/vslib/TestTAMIpfixTemplate.cpp
- unittest/vslib/Makefile.am

Kernel module
The sonic_stel kernel module (genetlink family registration) is in a separate PR: platform/vs/sonic-stel-module/

Testing
- dpkg-buildpackage -Psyncd,vs,nopython2 passes in sonic-slave-bookworm:master-amd64 CI container
- EmptySubscription_GeneratesMinimalTemplate
- SingleSubscription_CorrectBinaryFormat
- MultipleSubscriptions_CorrectFieldCountAndEncoding
- UnmatchedSubscription_Filtered

References