Why is only part of the NICs used during allreduce, and why are inter-node connections not on the same rail even when I explicitly specify using all NICs? #1600

Open
FortPercent opened this issue Feb 11, 2025 · 2 comments

@FortPercent

When using NCCL, I found that the inter-node RDMA connections use only a subset of the NICs instead of all available ones. In addition, the inter-node connections do not appear to stay on the same rail. As a result, when testing allreduce on 16 H100 machines, the performance was only 10 GB/s.

Below are our topology diagram and the log printed with NCCL_DEBUG=INFO on 2 nodes.
Here are the environment variables we used:
NCCL_NET_GDR_LEVEL=2
NCCL_IBEXT_DISABLE=1
NCCL_SHARP_DISABLE=1

test.log
topo.txt

@kiskra-nvidia
Member

I suggest adding NCCL_DEBUG_SUBSYS=INIT,ENV,GRAPH,TUNING to gain more insight into the topology that NCCL is seeing and the collective algorithm/protocol it chooses.

The log you attached is from just one node -- based on it we can't be sure what the situation is on the other node.
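
As a rough sketch of the logging setup described above, the variables could be exported before launching the job on every node. `NCCL_DEBUG_FILE` is not mentioned in this thread; it is added here as an assumption to keep each host's output in a separate file, and the `/tmp` path is purely illustrative.

```shell
# Verbose NCCL logging, limited to the subsystems suggested above.
export NCCL_DEBUG=INFO
export NCCL_DEBUG_SUBSYS=INIT,ENV,GRAPH,TUNING

# Assumption: write one log file per host and process so that logs from
# both nodes can be compared. NCCL expands %h to the hostname and %p to
# the PID; the directory is a hypothetical choice.
export NCCL_DEBUG_FILE=/tmp/nccl_debug.%h.%p.log
```

These variables need to reach every rank, so depending on the launcher they may have to be forwarded explicitly (for example with Open MPI's `-x` option).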

NCCL will normally prefer rail-optimized connectivity but by default that's not actually enforced. See NCCL_CROSS_NIC for the list of available options.
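
For illustration, a minimal sketch of pinning traffic to rails through `NCCL_CROSS_NIC`. The value meanings in the comments follow the NCCL environment variable documentation; whether `0` actually helps on this particular fabric is an assumption that would need to be checked against the GRAPH log output.

```shell
# NCCL_CROSS_NIC controls whether a ring/tree may use different NICs
# on different nodes:
#   0 = always use the same NIC for the same ring/tree (stay on one rail)
#   1 = do not try to use the same NIC
#   2 = try to use the same NIC, but allow crossing if it performs better (default)
export NCCL_CROSS_NIC=0
```

Setting it to `0` is generally suited to networks with per-NIC switches (rails) and slow inter-rail links; on a fully connected fabric the default may already be appropriate.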

@FortPercent
Author

Thanks for the suggestion! I'll add NCCL_DEBUG_SUBSYS=INIT,ENV,GRAPH,TUNING for more insights and check NCCL_CROSS_NIC for connectivity options. I'll also collect logs from both nodes for a full picture. Appreciate your input!
