RSS
The rss-ladder tool now supports PCI-slot-based queue naming. I mean this:
30: 127355089 0 0 0 PCI-MSI-edge mlx5_comp0@pci:0000:01:00.0
31: 120112828 5482507 0 0 PCI-MSI-edge mlx5_comp1@pci:0000:01:00.0
32: 121978940 0 5524729 0 PCI-MSI-edge mlx5_comp2@pci:0000:01:00.0
33: 122736116 0 0 5465612 PCI-MSI-edge mlx5_comp3@pci:0000:01:00.0
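For reference, names like these can be picked out of /proc/interrupts with a few lines of Python. This is only a sketch of the idea, not the code rss-ladder actually uses; the function name parse_pci_queues and the regular expression are my own assumptions about what a "numbered queue" looks like.

```python
import re

# Assumed shape of a numbered queue name, e.g. "mlx5_comp0@pci:0000:01:00.0".
QUEUE_RE = re.compile(
    r"^(?P<prefix>[A-Za-z0-9_]+?)(?P<index>\d+)@(?P<device>pci:[0-9a-f:.]+)$"
)


def parse_pci_queues(interrupts_text, device):
    """Return (irq, queue_name) pairs for numbered queues of one PCI device."""
    queues = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip the header and non-numeric rows such as "NMI:".
        if len(fields) < 2 or not fields[0].rstrip(":").isdigit():
            continue
        name = fields[-1]
        match = QUEUE_RE.match(name)
        if match and match.group("device") == device:
            queues.append((int(fields[0].rstrip(":")), name))
    return queues
```

Entries without a trailing queue number (mlx5_async_eq and friends) are skipped on purpose, since only the numbered completion queues carry traffic here.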
How to tune it?
$ rss-ladder pci:0000:01:00.0
- distribute interrupts of pci:0000:01:00.0 (mlx5_async_eq) on socket 0
- distribute interrupts of pci:0000:01:00.0 (mlx5_cmd_eq) on socket 0
- distribute interrupts of pci:0000:01:00.0 (mlx5_comp) on socket 0
- pci:0000:01:00.0: queue mlx5_comp0@pci:0000:01:00.0 (irq 30) bound to CPU0
- pci:0000:01:00.0: queue mlx5_comp1@pci:0000:01:00.0 (irq 31) bound to CPU1
- pci:0000:01:00.0: queue mlx5_comp2@pci:0000:01:00.0 (irq 32) bound to CPU2
- pci:0000:01:00.0: queue mlx5_comp3@pci:0000:01:00.0 (irq 33) bound to CPU3
- distribute interrupts of pci:0000:01:00.0 (mlx5_pages_eq) on socket 0
It may not be perfect, but it works, at least for the mlx5 driver.
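The "ladder" in the output above is just queue N going to the N-th CPU of the socket. A rough sketch of that step, assuming the IRQ numbers and the socket's CPU list are already known: plan_ladder and apply_ladder are hypothetical names, not rss-ladder internals, and the only real interface used is the kernel's /proc/irq/<irq>/smp_affinity_list file.

```python
def plan_ladder(irqs, cpus):
    """Pair each IRQ with a CPU, wrapping around if there are more IRQs than CPUs."""
    if not cpus:
        raise ValueError("no CPUs to distribute interrupts on")
    return [(irq, cpus[i % len(cpus)]) for i, irq in enumerate(irqs)]


def apply_ladder(plan, proc_irq="/proc/irq"):
    """Write the planned affinity to the kernel; needs root on a real system."""
    for irq, cpu in plan:
        with open(f"{proc_irq}/{irq}/smp_affinity_list", "w") as f:
            f.write(str(cpu))
```

With the IRQs from the example, plan_ladder([30, 31, 32, 33], [0, 1, 2, 3]) gives the same irq-to-CPU pairs that the tool printed.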
RPS
The autorps tool no longer yells at you with a dreadful exception if you try to tune a multiqueue NIC. It just says that this may be a wrong idea and that you should use the -f flag to really change RPS settings.
Also, some processors have inverted CPU masks in the rps_cpus file, so you could end up putting all processing on a foreign NUMA node. I don't know how these masks work and don't want to know, so the default behaviour now is to copy the mask from /sys/class/net/$dev/device/local_cpus instead of computing it.
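Copying the mask is about as simple as it sounds. A minimal sketch, assuming the usual sysfs layout with per-queue files under /sys/class/net/$dev/queues/rx-*/rps_cpus; the function name copy_local_cpus_to_rps is made up for illustration and this is not the actual autorps code.

```python
import glob
import os


def copy_local_cpus_to_rps(dev, sysfs_net="/sys/class/net"):
    """Copy the device's local_cpus mask into rps_cpus of every RX queue."""
    base = os.path.join(sysfs_net, dev)
    with open(os.path.join(base, "device", "local_cpus")) as f:
        mask = f.read().strip()
    # One rps_cpus file per receive queue: queues/rx-0, queues/rx-1, ...
    targets = sorted(glob.glob(os.path.join(base, "queues", "rx-*", "rps_cpus")))
    for path in targets:
        with open(path, "w") as f:
            f.write(mask)
    return mask, targets
```

The mask is passed through as text, untouched, so whatever bit order the kernel reports for the NIC's local NUMA node is exactly what ends up in rps_cpus.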