Merge branch 'xsk-support-redirect-to-any-socket-bound-to-the-same-umem'

Magnus Karlsson says:

====================
xsk: support redirect to any socket bound to the same umem

This patch set adds support for directing a packet to any socket bound
to the same umem. This makes it possible to use the XDP program to
select what socket the packet should be received on. The user can
populate the XSKMAP with various sockets and as long as they share the
same umem, the XDP program can pick any one of them.

The implementation is straight-forward. Instead of testing that the
incoming packet is targeting the same device and queue id as the
socket is bound to, just check that the umem the packet was received
on is the same as the socket we want it to be received on. This
guarantees that the redirect is legal as it is already in the correct
umem.

Patch #1 implements the feature and patch #2 adds documentation.

Thanks: Magnus
====================

Link: https://lore.kernel.org/r/20240205123553.22180-1-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit is contained in:
Alexei Starovoitov 2024-02-05 20:01:16 -08:00
commit 6146fae67b
2 changed files with 22 additions and 14 deletions

View File

@ -329,23 +329,24 @@ XDP_SHARED_UMEM option and provide the initial socket's fd in the
sxdp_shared_umem_fd field as you registered the UMEM on that
socket. These two sockets will now share one and the same UMEM.
There is no need to supply an XDP program like the one in the previous
case where sockets were bound to the same queue id and
device. Instead, use the NIC's packet steering capabilities to steer
the packets to the right queue. In the previous example, there is only
one queue shared among sockets, so the NIC cannot do this steering. It
can only steer between queues.
In this case, it is possible to use the NIC's packet steering
capabilities to steer the packets to the right queue. This is not
possible in the previous example as there is only one queue shared
among sockets, so the NIC cannot do this steering as it can only steer
between queues.
In libbpf, you need to use the xsk_socket__create_shared() API as it
takes a reference to a FILL ring and a COMPLETION ring that will be
created for you and bound to the shared UMEM. You can use this
function for all the sockets you create, or you can use it for the
second and following ones and use xsk_socket__create() for the first
one. Both methods yield the same result.
In libxdp (or libbpf prior to version 1.0), you need to use the
xsk_socket__create_shared() API as it takes a reference to a FILL ring
and a COMPLETION ring that will be created for you and bound to the
shared UMEM. You can use this function for all the sockets you create,
or you can use it for the second and following ones and use
xsk_socket__create() for the first one. Both methods yield the same
result.
Note that a UMEM can be shared between sockets on the same queue id
and device, as well as between queues on the same device and between
devices at the same time.
devices at the same time. It is also possible to redirect to any
socket as long as it is bound to the same umem with XDP_SHARED_UMEM.
XDP_USE_NEED_WAKEUP bind flag
-----------------------------
@ -822,6 +823,10 @@ A: The short answer is no, that is not supported at the moment. The
switch, or other distribution mechanism, in your NIC to direct
traffic to the correct queue id and socket.
Note that if you are using the XDP_SHARED_UMEM option, it is
possible to switch traffic between any socket bound to the same
umem.
Q: My packets are sometimes corrupted. What is wrong?
A: Care has to be taken not to feed the same buffer in the UMEM into

View File

@ -313,10 +313,13 @@ static bool xsk_is_bound(struct xdp_sock *xs)
static int xsk_rcv_check(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
{
struct net_device *dev = xdp->rxq->dev;
u32 qid = xdp->rxq->queue_index;
if (!xsk_is_bound(xs))
return -ENXIO;
if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
if (!dev->_rx[qid].pool || xs->umem != dev->_rx[qid].pool->umem)
return -EINVAL;
if (len > xsk_pool_get_rx_frame_size(xs->pool) && !xs->sg) {