linux/drivers/net/ethernet
Brenden Blanco 326fe02d1e net/mlx4_en: protect ring->xdp_prog with rcu_read_lock
Depending on the preempt mode, the bpf_prog stored in xdp_prog may be
freed despite the use of call_rcu inside bpf_prog_put. The situation is
possible when running in PREEMPT_RCU=y mode, for instance, since the rcu
callback for destroying the bpf prog can run even during the bh handling
in the mlx4 rx path.

Several options were considered before this patch was settled on:

Add a napi_synchronize loop in mlx4_xdp_set, which would occur after all
of the rings are updated with the new program.
This approach has the disadvantage that as the number of rings
increases, the speed of update will slow down significantly due to
napi_synchronize's msleep(1).

Add a new rcu_head in bpf_prog_aux, to be used by a new bpf_prog_put_bh.
The action of the bpf_prog_put_bh would be to then call bpf_prog_put
later. Those drivers that consume a bpf prog in a bh context (like mlx4)
would then use the bpf_prog_put_bh instead when the ring is up. This has
the problem of complexity, in maintaining proper refcnts and rcu lists,
and would likely be harder to review. In addition, this approach to
freeing must be exclusive with other frees of the bpf prog, for instance
a _bh prog must not be referenced from a prog array that is consumed by
a non-_bh prog.

The placement of rcu_read_lock in this patch is functionally the same as
putting an rcu_read_lock in napi_poll. Actually doing so could be a
potentially controversial change, but would bring the implementation in
line with sk_busy_loop (though of course the nature of those two paths
is substantially different), and would also avoid copy/paste problems
in drivers that add XDP support later. Still, this patch does not take
that opinionated option.
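
The read-side pattern described above can be sketched in userspace C. This is only a model under stated assumptions, not the driver code: every `model_*` name is hypothetical, the RCU primitives are replaced by a nesting counter and a C11 atomic load, and the verdict field stands in for running a real bpf program. It shows the one point the text makes, namely that the program pointer is fetched and used entirely inside a read-side section that spans the whole poll pass.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel RCU read-side primitives (assumption:
 * a plain nesting counter; the real rcu_read_lock() is not modeled). */
static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
    model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
    assert(model_rcu_depth > 0);
    model_rcu_depth--;
}

/* Hypothetical stand-in for a bpf program: 1 = pass packet, 0 = drop. */
struct model_prog {
    int verdict;
};

/* Hypothetical stand-in for an rx ring holding an RCU-managed pointer. */
struct model_ring {
    _Atomic(struct model_prog *) xdp_prog;
};

/* Models rcu_dereference(): only legal inside a read-side section. */
static struct model_prog *model_rcu_dereference(struct model_ring *ring)
{
    assert(model_rcu_depth > 0);
    return atomic_load_explicit(&ring->xdp_prog, memory_order_acquire);
}

/* Models one poll pass: returns how many of `budget` packets are passed
 * up the stack. The program is read once and used only while the
 * read-side section is held, so it cannot be freed underneath us. */
static int model_process_rx(struct model_ring *ring, int budget)
{
    int passed = 0;

    model_rcu_read_lock();
    struct model_prog *prog = model_rcu_dereference(ring);

    for (int i = 0; i < budget; i++) {
        if (prog == NULL || prog->verdict == 1)
            passed++;
    }
    model_rcu_read_unlock();

    return passed;
}
```

With no program attached every packet passes; with a program whose verdict is 0, none do. In both cases the nesting counter returns to zero once the poll pass ends.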

Testing was done with kernels in either PREEMPT_RCU=y or
CONFIG_PREEMPT_VOLUNTARY=y+PREEMPT_RCU=n modes, with neither exhibiting
any drawback. With PREEMPT_RCU=n, the extra call to rcu_read_lock did
not show up in the perf report whatsoever, and with PREEMPT_RCU=y the
overhead of rcu_read_lock (according to perf) was the same before/after.
In the rx path, rcu_read_lock is eventually called for every packet
from netif_receive_skb_internal, so the napi poll call's rcu_read_lock
is easily amortized.

v2:
Remove extra rcu_read_lock in mlx4_en_process_rx_cq body
Annotate xdp_prog with __rcu, and convert all usages to rcu_assign or
rcu_dereference[_protected] as appropriate.
Add explicit mutex lock around rcu_assign instead of xchg loop.
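
The update side listed in the v2 notes can be modeled the same way. Again a hedged userspace sketch: all `sketch_*` names are hypothetical, the mutex is a single-threaded flag, `sketch_assign` stands in for rcu_assign_pointer, `sketch_deref_protected` stands in for rcu_dereference_protected, and the reference count is a plain integer. In the kernel the actual free behind the final put is deferred via call_rcu, which this model does not reproduce.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_RINGS 4

/* Hypothetical stand-in for a refcounted bpf program. */
struct sketch_prog {
    int refcnt;
};

struct sketch_ring {
    _Atomic(struct sketch_prog *) xdp_prog;
};

struct sketch_priv {
    bool lock_held;                 /* models a mutex (single-threaded) */
    int nrings;
    struct sketch_ring rings[SKETCH_MAX_RINGS];
};

static void sketch_lock(struct sketch_priv *priv)
{
    assert(!priv->lock_held);
    priv->lock_held = true;
}

static void sketch_unlock(struct sketch_priv *priv)
{
    assert(priv->lock_held);
    priv->lock_held = false;
}

/* Models rcu_dereference_protected(): the update-side lock, not a
 * read-side section, is what makes this plain load safe. */
static struct sketch_prog *sketch_deref_protected(struct sketch_priv *priv,
                                                  struct sketch_ring *ring)
{
    assert(priv->lock_held);
    return atomic_load_explicit(&ring->xdp_prog, memory_order_relaxed);
}

/* Models rcu_assign_pointer(): publish with release ordering. */
static void sketch_assign(struct sketch_ring *ring, struct sketch_prog *prog)
{
    atomic_store_explicit(&ring->xdp_prog, prog, memory_order_release);
}

/* Models dropping one reference; the kernel defers the real free. */
static void sketch_put(struct sketch_prog *prog)
{
    if (prog)
        prog->refcnt--;
}

/* Swap the program on every ring under one lock instead of an xchg loop. */
static void sketch_xdp_set(struct sketch_priv *priv, struct sketch_prog *prog)
{
    if (prog)
        prog->refcnt += priv->nrings;   /* one reference per ring */

    sketch_lock(priv);
    for (int i = 0; i < priv->nrings; i++) {
        struct sketch_prog *old = sketch_deref_protected(priv, &priv->rings[i]);

        sketch_assign(&priv->rings[i], prog);
        sketch_put(old);
    }
    sketch_unlock(priv);
}
```

Installing a program takes one reference per ring, and replacing it drops the old program's references as each ring is switched, all while the single lock is held.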

Fixes: d576acf0a2 ("net/mlx4_en: add page recycle to prepare rx ring for tx support")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-06 13:39:33 -07:00
3com
8390 net: ethernet: ax88796: avoid null pointer dereference 2016-08-01 13:32:51 -07:00
adaptec
adi net: bfin_mac: Fix a few spelling fixes 2016-08-13 15:14:56 -07:00
aeroflex net: ethernet: greth: use phy_ethtool_{get|set}_link_ksettings 2016-08-08 15:42:21 -07:00
agere net: ethernet: et131x: constify ethtool_ops structures 2016-08-31 09:22:30 -07:00
allwinner net: ethernet: sun4i-emac: use phy_ethtool_{get|set}_link_ksettings 2016-06-22 16:22:41 -04:00
alteon
altera ethernet: altera: add missing of_node_put 2016-08-01 21:42:57 -07:00
amazon net: ena: change the return type of ena_set_push_mode() to be void. 2016-08-23 17:42:33 -07:00
amd xgbe: constify get_netdev_ops and get_ethtool_ops 2016-08-31 14:17:30 -07:00
apm net: xgene: fix backward compatibility fix 2016-09-01 10:00:42 -07:00
apple
arc net: arc_emac: add missing of_node_put() in arc_emac_probe() 2016-08-06 00:07:38 -04:00
atheros Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-30 00:54:02 -04:00
aurora ethernet: aurora: nb8800: add missing of_node_put after calling of_parse_phandle 2016-08-01 21:43:47 -07:00
broadcom net: systemport: constify ethtool_ops structures 2016-08-31 09:22:31 -07:00
brocade bna: remove global bnad_list_mutex 2016-08-08 15:41:27 -07:00
cadence net: ethernet: macb: Add support for rx_clk 2016-08-18 20:58:42 -07:00
calxeda
cavium liquidio:CN23XX pause frame support 2016-09-02 17:11:31 -07:00
chelsio cxgb4: Add support for ndo_get_vf_config 2016-09-04 11:46:00 -07:00
cirrus net: cx89x0: Add DT support 2016-06-15 12:17:57 -07:00
cisco net: enic: use correct type specifier 2016-08-01 13:32:52 -07:00
davicom dm9000: Fix irq trigger type setup on non-dt platforms 2016-08-09 15:08:22 -07:00
dec net: fix up a few missing hashtable.h conflict resolutions 2016-08-13 14:51:02 -07:00
dlink
emulex be2net: replace polling with sleeping in the FW completion path 2016-08-08 15:38:27 -07:00
ezchip Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-07-24 00:53:32 -04:00
faraday net/faraday: Disallow using reversed MAC address from hardware 2016-07-20 21:05:18 -07:00
freescale Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-30 00:54:02 -04:00
fujitsu
hisilicon net: hisilicon: constify ethtool_ops structures 2016-08-31 09:22:31 -07:00
hp
i825xx
ibm ibmvnic: fix error return code in ibmvnic_probe() 2016-08-25 16:41:00 -07:00
intel ixgbe: Eliminate useless message and improve logic 2016-08-30 22:12:35 -07:00
marvell sky2: use napi_complete_done 2016-09-01 14:09:51 -07:00
mediatek net: ethernet: mediatek: enhance RX path by aggregating more SKBs into NAPI 2016-09-06 13:33:19 -07:00
mellanox net/mlx4_en: protect ring->xdp_prog with rcu_read_lock 2016-09-06 13:39:33 -07:00
micrel drivers: net: Don't print unpopulated net_device name 2016-05-17 12:30:19 -04:00
microchip enc28j60: Fix race condition in enc28j60 driver 2016-07-02 14:48:14 -04:00
moxa
myricom
natsemi
neterion net: s2io: simplify logical constraint 2016-08-01 13:32:52 -07:00
netronome nfp: check idx is -ENOSPC before using it is an index 2016-07-11 13:52:00 -07:00
nuvoton net: ethernet: nuvoton: fix spelling mistake: "aligment" -> "alignment" 2016-08-18 23:29:43 -07:00
nvidia
nxp net: lpc_eth: Check clk_prepare_enable() error 2016-08-23 17:10:16 -07:00
oki-semi
packetengines
pasemi net: ethernet: pasemi_mac: use phy_ethtool_{get|set}_link_ksettings 2016-07-15 16:41:34 -07:00
qlogic rtnetlink: fdb dump: optimize by saving last interface markers 2016-09-01 16:56:15 -07:00
qualcomm net: emac: emac gigabit ethernet controller driver 2016-09-01 23:32:05 -07:00
rdc net: r6040: Bump version and date 2016-07-05 00:10:30 -07:00
realtek 8139cp: Fix one possible deadloop in cp_rx_poll 2016-08-25 17:02:48 -07:00
renesas ravb: avoid unused function warnings 2016-08-30 23:32:11 -07:00
rocker bridge: switchdev: Add forward mark support for stacked devices 2016-08-26 13:13:36 -07:00
samsung net: ethernet: sxgbe: use phy_ethtool_{get|set}_link_ksettings 2016-06-28 09:12:35 -04:00
seeq
sfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-30 00:54:02 -04:00
sgi
silan
sis
smsc net: smc91x: fix SMC accesses 2016-08-28 23:44:55 -04:00
stmicro net: stmmac: dwmac-rk: add pd_gmac support for rk3399 2016-09-02 17:08:57 -07:00
sun
synopsys dwc_eth_qos: constify ethtool_ops structures 2016-08-31 09:22:31 -07:00
tehuti net: tehuti: fix typo: "eneble" -> "enable" 2016-08-21 15:21:36 -07:00
ti net: ti: cpmac: Fix compiler warning due to type confusion 2016-09-04 11:47:20 -07:00
tile timers, driver/net/ethernet/tile: Initialize the egress timer as pinned 2016-07-07 10:25:14 +02:00
toshiba net: ethernet: tc35815: use phy_ethtool_{get|set}_link_ksettings 2016-07-15 16:41:33 -07:00
tundra net/ethernet: tundra: fix dump_eth_one warning in tsi108_eth 2016-08-08 13:08:21 -07:00
via
wiznet net: ethernet: wiznet: Remove create_workqueue 2016-06-02 12:15:17 -07:00
xilinx net: axienet: constify ethtool_ops structures 2016-08-31 20:45:51 -07:00
xircom ethernet: xircom: fix spelling mistakes on "excessive collisions" 2016-06-27 04:19:14 -04:00
xscale net: ethernet: ixp4xx_eth: use phy_ethtool_{get|set}_link_ksettings 2016-07-04 15:59:52 -07:00
Kconfig net: ena: Add a driver for Amazon Elastic Network Adapters (ENA) 2016-08-12 17:12:08 -07:00
Makefile net: ena: Add a driver for Amazon Elastic Network Adapters (ENA) 2016-08-12 17:12:08 -07:00
dnet.c net: ethernet: dnet: use phy_ethtool_{get|set}_link_ksettings 2016-06-28 05:10:26 -04:00
dnet.h net: ethernet: dnet: use phydev from struct net_device 2016-06-28 05:10:26 -04:00
ec_bhf.c
ethoc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-07-24 00:53:32 -04:00
fealnx.c
jme.c
jme.h
korina.c
lantiq_etop.c net: ethernet: lantiq_etop: use phy_ethtool_{get|set}_link_ksettings 2016-07-04 15:59:51 -07:00
netx-eth.c drivers: net: Don't print unpopulated net_device name 2016-05-17 12:30:19 -04:00