There is no issue with the Mellanox NIC supporting basic RSS, i.e. outer src-IP + dst-IP + protocol,
but the expectation that the Mellanox NIC performs RSS on SRv6 header content is incorrect. With the current library (verbs) and firmware as of today, RSS and RPS can be validated with:
ethtool -S [interface-name] | grep packets | grep rx
- for HW RSS spread on multiple RX queues
grep mlx5 /proc/interrupts
- for queue to CPU mapping.
ethtool --show-rxfh-indir [interface-name]
- to identify the flow hash setting
Based on the comments, there is also a gap in understanding of the SRv6 packet format. The packet format is ETH + IPv6 (next header = 43, routing) + SRv6 header (whose next header can be IP/TCP/UDP).
Hence RSS is done on outer src-IP + dst-IP + protocol (43), and packets with different hashes are spread to different queues.
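To make the point above concrete, here is a minimal sketch of queue selection that only looks at the outer header. It is not the NIC's real hash: `zlib.crc32` is a stand-in chosen for illustration, and `rx_queue` and `NUM_RX_QUEUES` are hypothetical names. What it shows is that the function never sees the SRv6 header, so packets that differ only in their segment list cannot land in different queues.

```python
import ipaddress
import zlib

ROUTING_HEADER = 43   # IPv6 next-header value for a routing header (SRv6)
NUM_RX_QUEUES = 8     # hypothetical number of RX queues


def rx_queue(src, dst, next_header=ROUTING_HEADER, num_queues=NUM_RX_QUEUES):
    """Pick an RX queue from outer src-IP + dst-IP + protocol only.

    Assumption: crc32 stands in for the hardware hash. Nothing from the
    SRv6 header is an input here, which is the behaviour described above.
    """
    key = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + bytes([next_header])
    )
    return zlib.crc32(key) % num_queues


if __name__ == "__main__":
    dst = "2001:db8::100"
    # Varying the outer source address changes the hash input,
    # so flows may be placed on different queues.
    for i in range(1, 5):
        src = f"2001:db8::{i:x}"
        print(src, "-> queue", rx_queue(src, dst))
```

With a fixed outer src-IP and dst-IP, every SRv6 packet maps to one queue regardless of its segments, which matches a single loaded CPU core when the generator does not vary the outer addresses.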
Now, using an XDP program loaded on the interface, one can filter for SRv6 headers
and apply a simple XOR hash or murmur hash, then redirect to AF_XDP sockets or another interface.
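The hashing step can be sketched in user space as follows. This is only an illustration of the logic, not an XDP program (a real one would be written in C against the eBPF helpers); the header layout follows RFC 8754, and `parse_srh_segments`, `xor_fold`, `pick_socket`, and `build_srh` are hypothetical helper names.

```python
import ipaddress
import struct
from typing import List

SRH_ROUTING_TYPE = 4  # Segment Routing Header routing type (RFC 8754)


def parse_srh_segments(srh: bytes) -> List[bytes]:
    """Return the 16-byte segments carried in a raw SRv6 routing header."""
    (_next_hdr, hdr_ext_len, routing_type,
     _segments_left, last_entry, _flags, _tag) = struct.unpack_from("!BBBBBBH", srh, 0)
    if routing_type != SRH_ROUTING_TYPE:
        raise ValueError("not an SRv6 routing header")
    total_len = (hdr_ext_len + 1) * 8      # length field excludes the first 8 octets
    count = last_entry + 1                 # Last Entry is a zero-based index
    if len(srh) < total_len or 8 + 16 * count > total_len:
        raise ValueError("truncated or inconsistent SRH")
    return [srh[8 + 16 * i: 24 + 16 * i] for i in range(count)]


def xor_fold(segments: List[bytes]) -> int:
    """XOR all 32-bit words of all segments into one 32-bit value."""
    h = 0
    for seg in segments:
        for (word,) in struct.iter_unpack("!I", seg):
            h ^= word
    return h


def pick_socket(srh: bytes, num_sockets: int) -> int:
    """Map an SRH to a socket index, e.g. an AF_XDP socket slot."""
    return xor_fold(parse_srh_segments(srh)) % num_sockets


def build_srh(segments: List[str], next_header: int = 17) -> bytes:
    """Helper for the example: build an SRH with no TLVs (17 = UDP)."""
    packed = b"".join(ipaddress.IPv6Address(s).packed for s in segments)
    last = len(segments) - 1
    return struct.pack("!BBBBBBH", next_header, len(packed) // 8,
                       SRH_ROUTING_TYPE, last, last, 0, 0) + packed


if __name__ == "__main__":
    print(pick_socket(build_srh(["::1"]), 4))          # 1
    print(pick_socket(build_srh(["::1", "::2"]), 4))   # 3
```

A plain XOR fold is cheap but weak: identical segments cancel each other out, which is one reason to prefer a murmur-style hash when the segment lists are not well distributed.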
Hence the whole expectation and assumption is incorrect.
[EDIT-1] In the live debug session, we spent 1.5 hours explaining this.
[EDIT-2] Addressing the comments raised:
1. It refers to what the rx-flow-counter has already accumulated, not the increase in SRv6 packets
In the live debug, @takeru used the TREX packet generator
to send packets to the NIC, with the packet format ETH + SRC-IP-1 ... SRC-IP-n + DST-IP + SRv6.
With a direct interface-to-interface connection, no packets other than SRv6 packets will be received.
2. In fact, if you check the load on the CPU in the case of SRv6 packets, you will see that only one CPU core is being loaded
In the live debug, @takeru did not run top/htop,
so this is new information. @takeru was only trying to understand whether RSS on the outer IP was happening. I have requested a screenshot of the CPU usage and a tcpdump.
3. If it is only IPv6, the CPU load will be applied to other cores
A request has been made to run a simple XDP-eBPF program which redirects/drops IPv6 SRv6 packets. @takeru has not run it yet.
4. Only IPv6 and ip / udp cases have increased the value count by the debugging method you mentioned. The same thing happens with SRv6 in linux kernel
I have pointed out to @takeru that the TREX packet he is generating has the format ETH + IPv6 + next-hdr routing + SRv6 header + next-header UDP.
Hence the kernel statistics will update as IPv6/UDP, since the packet is neither TCP nor SCTP nor an unknown protocol.
Note: takeru's reference github project