Commit Graph

918672 Commits

Author SHA1 Message Date
Linus Torvalds f85c1598dd Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix sk_psock reference count leak on receive, from Xiyu Yang.

 2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.

 3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
    from Maciej Żenczykowski.

 4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
    Abeni.

 5) Fix netprio cgroup v2 leak, from Zefan Li.

 6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.

 7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.

 8) Various bpf probe handling fixes, from Daniel Borkmann.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
  selftests: mptcp: pm: rm the right tmp file
  dpaa2-eth: properly handle buffer size restrictions
  bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
  bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
  bpf: Restrict bpf_probe_read{, str}() only to archs where they work
  MAINTAINERS: Mark networking drivers as Maintained.
  ipmr: Add lockdep expression to ipmr_for_each_table macro
  ipmr: Fix RCU list debugging warning
  drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
  net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
  tcp: fix error recovery in tcp_zerocopy_receive()
  MAINTAINERS: Add Jakub to networking drivers.
  MAINTAINERS: another add of Karsten Graul for S390 networking
  drivers: ipa: fix typos for ipa_smp2p structure doc
  pppoe: only process PADT targeted at local interfaces
  selftests/bpf: Enforce returning 0 for fentry/fexit programs
  bpf: Enforce returning 0 for fentry/fexit progs
  net: stmmac: fix num_por initialization
  security: Fix the default value of secid_to_secctx hook
  libbpf: Fix register naming in PT_REGS s390 macros
  ...
2020-05-15 13:10:06 -07:00
Linus Torvalds d5dfe4f1b4 Second RDMA 5.7 rc pull request
A few minor bug fixes for user visible defects, and one regression:
 
 - Various bugs from static checkers and syzkaller
 
 - Add missing error checking in mlx4
 
 - Prevent RTNL lock recursion in i40iw
 
 - Fix segfault in cxgb4 in peer abort cases
 
 - Fix a regression added in 5.7 where the IB_EVENT_DEVICE_FATAL could be
   lost, and wasn't delivered to all the FDs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl6+5V8ACgkQOG33FX4g
 mxo/Zw//QlrhDSM4DG4hZsFhFX3i/vT7xbxkNTp/U1+Vue9cKcUlotNOzQ79lQur
 /12QEiDffLZNsy2jCD1nxf/rEo9NDc8iSHPcryQ3aGa2K3XJ3QQ2ZhHBzfFU9GVP
 GQYlGt9uB4t6LEcU4P/ZCdf1nmkfYFvNDfmveVWrPLasABK7WqxCBXaqbLed+0ca
 lGhFBGLdGNJqyK2BkPCtXr9XYTjzZhW0lJqMuex0YD7cIAFfn+qbzvLQheBy/mhH
 o9n28GQbIgpNXvYz2HvUkTfiwDLylFaNVBYVctnm10cbNtLHv2J0bBQQb2ZaXUAM
 xt54AQ2QAMHKC2h6sUqyDAmWKfPEOnxZ9LycYDa0ZaIvm/uK/jiEOXE9XbQeiOKF
 eHiyuDg0EB4AVcYlNSm7roHdbh3rAluHerGe4Kv3efF/b1Zt2mtTW7q/XuROfSeT
 WBnALyl7RurnSG0HY9hyUMak65JwolgqPdBDpzRjqIF5jeKW0nn8MsbGTPGWlcyN
 zLu5IQn/vv+sfISav1cVMRoijboe85+3SOVtMBQOJ2m7EkdzRyRSQh3oG1556n2y
 oqseNSQFKNFA1p4hEeTf8BVEqVX3gj50ykj+2g1HyxH3sB+iE3+EctnHNhfd3IKI
 n6e/67Nm0ub/WM1OXeuXBmW0NvsTt0aPBfsp4s+oq3eICD4YbFE=
 =R5BJ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A few minor bug fixes for user visible defects, and one regression:

   - Various bugs from static checkers and syzkaller

   - Add missing error checking in mlx4

   - Prevent RTNL lock recursion in i40iw

   - Fix segfault in cxgb4 in peer abort cases

   - Fix a regression added in 5.7 where the IB_EVENT_DEVICE_FATAL could
     be lost, and wasn't delivered to all the FDs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/uverbs: Move IB_EVENT_DEVICE_FATAL to destroy_uobj
  RDMA/uverbs: Do not discard the IB_EVENT_DEVICE_FATAL event
  RDMA/iw_cxgb4: Fix incorrect function parameters
  RDMA/core: Fix double put of resource
  IB/core: Fix potential NULL pointer dereference in pkey cache
  IB/hfi1: Fix another case where pq is left on waitlist
  IB/i40iw: Remove bogus call to netdev_master_upper_dev_get()
  IB/mlx4: Test return value of calls to ib_get_cached_pkey
  RDMA/rxe: Always return ERR_PTR from rxe_create_mmap_info()
  i40iw: Fix error handling in i40iw_manage_arp_cache()
2020-05-15 13:06:56 -07:00
Linus Torvalds ce24729667 linux-kselftest-5.7-rc6
This Kselftest update for Linux 5.7-rc6 consists of
 
 - lkdtm runner fixes to prevent dmesg clearing and shellcheck errors
 - ftrace test handling when test module doesn't exist
 - nsfs test fix to replace zero-length array with flexible-array
 - dmabuf-heaps test fix to return clear error value
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl6+0KwACgkQCwJExA0N
 QxyBog/8DME+7YtjZ8TqgeWuxMckDHqVBTLTobd3Wyd2Vsk6W3EWKaDo+R8sqRfN
 VmjExY1PH0+N4JAFuQxR1UAmO0E/YZ/HwGxU6IH4iAoXtVZ0Tc52jzlZkwo0/RwF
 uOXukaEjLNlMva06/CEptxZy4UAWKP1l9DaEYjY48zDC4zlXaPgGjk69MACZhK4E
 2M6mWJuUewf+PobZTGFv/4RPIAihZpx8gDMRBR/hy+RWKh1qNhpNeXQRrwTQm0u3
 5ewOQjP26VRam4pA/e7RsU1b2IR8nbDZFpX3dTEzkme9dJ2hoN3DZLOw0/WXS/aQ
 /Ha5bPBI7RMhq0FUQP1vMErZ6Da/YEkbTxmCpx1kL6vliKqMkHb+epLABTJlGGnD
 i+ENoLeJVhcyQ3vyWq3VNZDGlHYF3KuUftf21sdG4bXDajucnu/lSPvfSKzhMNIG
 c1+AcvTAeGTPHi2dCfdETjg1dakO4exrX9/ABvu3JY1pp9D8DLHZNc/O9iOvG0O5
 6Lp9LhtxCxKdAGJVxKxzXPjwPDSnzNV5sGm1ElFpXzA0Nv17VH/DIL8i6745sQXW
 ik29Z1QNv3QIOs5U1pe5wyY22D51UJOfS7hVWH+pWPveG46ApYeUxqmfNYoG/Vtr
 cMspPzCSgZTAvkswCGfnSahlqCSpJxzuxUEkM4HlPTulrW4jbDk=
 =2cqI
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:

 - lkdtm runner fixes to prevent dmesg clearing and shellcheck errors

 - ftrace test handling when test module doesn't exist

 - nsfs test fix to replace zero-length array with flexible-array

 - dmabuf-heaps test fix to return clear error value

* tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/lkdtm: Use grep -E instead of egrep
  selftests/lkdtm: Don't clear dmesg when running tests
  selftests/ftrace: mark irqsoff_tracer.tc test as unresolved if the test module does not exist
  tools/testing: Replace zero-length array with flexible-array
  kselftests: dmabuf-heaps: Fix confused return value on expected error testing
2020-05-15 12:57:50 -07:00
Linus Torvalds 67e45621af RISC-V Fixes for 5.7-rc6
This consists of a handful of build fixes, all found by Huawei's autobuilder.
 None of these patches should have any functional impact on kernels that build,
 and they're mostly related to various features intermingling with !MMU.  While
 some of these might be better hoisted to generic code, it seems better to have
 the simple fixes in the meanwhile.
 
 As far as I know these are the only outstanding patches for 5.7.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAl6+3AsTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYic+uD/9nfqXgmKUV+X0/Sr4L61fqOPPXAVY1
 k6JprJETUAMQdd2blGvuBcQqr76HQ6SgmBe3mQG2blL0XEyp54RWRAOwd5Ygz96P
 C8pQ+7n6Rmc1kA7U3RUkgGCWpCf6WHyq8wpYeZaPkPYBJz02wi8Czzwx8HHRNc+g
 r2hSlP2WT4UI6q1xWwKtwgQ8OpQXGvNHxG8DNkg0scm8CDWxO253FhbknrhK3Dx+
 2+kbUyqLdS0kQfPxcCooFkb05X4fCIwfbYYamNIfYGh8r8qNcNctw8j4maZ8/PcZ
 +Y+XuHHR/dck6oky9KNGE2K0X+P1yjbzsEb5rTs09GjDxpvtegGHRyngdeQ2GVXA
 8GeVTSpIfe76PcQ1yZ4FJTYZAp6yqIsRE+Brcp//AsWghZp7jnBL3TRJ+aV4u51V
 qU/UL+wIlUv2YIT9Kqe5D2/uki7IDdCqe/IZc0eoY+2seJ+Y5WbLwtz10f/Dfgt7
 6qBYsbjISsIOUqXrGsv4guvQuO0omwz55MikQkKWius5IRe8RkAFZWezBpD5BJ8q
 IyhRvtsIHjCxr/5V4KIw9RMeQXOtt1PI6CELvlRpu5XjoB3Q2M/2E2YzN7xGPGc7
 nuW5V9rAWhxAAta4kwtjb899AqFl3w6wt9c8znoVKzZHYOztF3a4gSnY7s3r1kfp
 klG2AHELH3jdfg==
 =8koH
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A handful of build fixes, all found by Huawei's autobuilder.

  None of these patches should have any functional impact on kernels
  that build, and they're mostly related to various features
  intermingling with !MMU.

  While some of these might be better hoisted to generic code, it seems
  better to have the simple fixes in the meanwhile.

  As far as I know these are the only outstanding patches for 5.7"

* tag 'riscv-for-linus-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: mmiowb: Fix implicit declaration of function 'smp_processor_id'
  riscv: pgtable: Fix __kernel_map_pages build error if NOMMU
  riscv: Make SYS_SUPPORTS_HUGETLBFS depends on MMU
  riscv: Disable ARCH_HAS_DEBUG_VIRTUAL if NOMMU
  riscv: Add pgprot_writecombine/device and PAGE_SHARED defination if NOMMU
  riscv: stacktrace: Fix undefined reference to `walk_stackframe'
  riscv: Fix unmet direct dependencies built based on SOC_VIRT
  riscv: perf: RISCV_BASE_PMU should be independent
  riscv: perf_event: Make some funciton static
2020-05-15 12:47:15 -07:00
David S. Miller 93d43e5868 Merge branch 'mptcp-fix-MP_JOIN-failure-handling'
Paolo Abeni says:

====================
mptcp: fix MP_JOIN failure handling

Currently if we hit an MP_JOIN failure on the third ack, the child socket is
closed with reset, but the request socket is not deleted, causing weird
behaviors.

The main problem is that MPTCP's MP_JOIN code needs to plug it's own
'valid 3rd ack' checks and the current TCP callbacks do not allow that.

This series tries to address the above shortcoming introducing a new MPTCP
specific bit in a 'struct tcp_request_sock' hole, and leveraging that to allow
tcp_check_req releasing the request socket when needed.

The above allows cleaning-up a bit current MPTCP hooking in tcp_check_req().

An alternative solution, possibly cleaner but more invasive, would be
changing the 'bool *own_req' syn_recv_sock() argument into 'int *req_status'
and let MPTCP set it to 'REQ_DROP'.

v1 -> v2:
 - be more conservative about drop_req initialization

RFC -> v1:
 - move the drop_req bit inside tcp_request_sock (Eric)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Paolo Abeni 729cd6436f mptcp: cope better with MP_JOIN failure
Currently, on MP_JOIN failure we reset the child
socket, but leave the request socket untouched.

tcp_check_req will deal with it according to the
'tcp_abort_on_overflow' sysctl value - by default the
req socket will stay alive.

The above leads to inconsistent behavior on MP JOIN
failure, and bad listener overflow accounting.

This patch addresses the issue leveraging the infrastructure
just introduced to ask the TCP stack to drop the req on
failure.

The child socket is not freed anymore by subflow_syn_recv_sock(),
instead it's moved to a dead state and will be disposed by the
next sock_put done by the TCP stack, so that listener overflow
accounting is not affected by MP JOIN failure.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Paolo Abeni 2f8a397d0a inet_connection_sock: factor out destroy helper.
Move the steps to prepare an inet_connection_sock for
forced disposal inside a separate helper. No functional
changes inteded, this will just simplify the next patch.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Paolo Abeni 90bf45134d mptcp: add new sock flag to deal with join subflows
MP_JOIN subflows must not land into the accept queue.
Currently tcp_check_req() calls an mptcp specific helper
to detect such scenario.

Such helper leverages the subflow context to check for
MP_JOIN subflows. We need to deal also with MP JOIN
failures, even when the subflow context is not available
due allocation failure.

A possible solution would be changing the syn_recv_sock()
signature to allow returning a more descriptive action/
error code and deal with that in tcp_check_req().

Since the above need is MPTCP specific, this patch instead
uses a TCP request socket hole to add a MPTCP specific flag.
Such flag is used by the MPTCP syn_recv_sock() to tell
tcp_check_req() how to deal with the request socket.

This change is a no-op for !MPTCP build, and makes the
MPTCP code simpler. It allows also the next patch to deal
correctly with MP JOIN failure.

v1 -> v2:
 - be more conservative on drop_req initialization (Mat)

RFC -> v1:
 - move the drop_req bit inside tcp_request_sock (Eric)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 12:30:13 -07:00
Linus Torvalds 01d8a74803 Fix flush_icache_range() second argument in machine_kexec() to be an
address rather than size.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl6+z98ACgkQa9axLQDI
 XvFKOA//VFEAWOrCWS65DUn4TuonMG+foUp82zxaFZWK1epGCZoIEJQrSG6g5Png
 phL6Gx2oniqj+taLYZWqckMe+7k/yn9Xq+0DwDliMjaY9NB9bldE9J9ZGxpa4jBI
 qIrdaTgtRip1LQvm5BczmANGf86/p9E4nC6qHBJf9Stn9SYxoQuZlOGiwaK8XTe2
 QGn+NQ69ln7TkfUwEixhGI39dOHasrDAGgcUp/pTn2ki8mtpE2lG8+qSxxGTGTIT
 1SaJftER8JSc3VKjXKhOIaSwx1MqwTx0Cx4dTfAGhKzcQLdmXz0GDJjQCzU0weOP
 yHyH1gbJYUmB5X+CSA2OKssxsxcckVNMvTm7Jnn3Pb0f9oHVZQV4UR8cE/G40/hh
 c6OHAJXfmGla5GhNpRUXB5SGJyWDxaIrdZRXbDPCWGOqWxwIJyFWNz5h4kiVsyAe
 OJBtgzCqw3oUkr7KXgz0EECWV+T22+Y5cBbolptpqm7m6JbDQf/iviVJV5tkXnLF
 vkbZEx94Ajy4U4TZkQBWlJFN8M4Z+ttYuGYxebnurNI2Ottb0czGFypx+WXdGItp
 9rBaklIfzsCPoXZH0iHtaRxbDl4avwS46yKndtb2Y6tGoqqSNW4A6rf+S3YxvqBN
 bEBxMqxCuMxPWNPBb3bXpw4QlFo4LLkWQeAvbc59SROep35cPTo=
 =BoxO
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix flush_icache_range() second argument in machine_kexec() to be an
  address rather than size"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: fix the flush_icache_range arguments in machine_kexec
2020-05-15 11:08:46 -07:00
Oleksij Rempel ca1c933bce net: phy: tja11xx: execute cable test on link up
A typical 100Base-T1 link should be always connected. If the link is in
a shot or open state, it is a failure. In most cases, we won't be able
to automatically handle this issue, but we need to log it or notify user
(if possible).

With this patch, the cable will be tested on "ip l s dev .. up" attempt
and send ethnl notification to the user space.

This patch was tested with TJA1102 PHY and "ethtool --monitor" command.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 11:03:03 -07:00
David S. Miller 8e1381049e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-05-15

The following pull-request contains BPF updates for your *net* tree.

We've added 9 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 137 insertions(+), 43 deletions(-).

The main changes are:

1) Fix secid_to_secctx LSM hook default value, from Anders.

2) Fix bug in mmap of bpf array, from Andrii.

3) Restrict bpf_probe_read to archs where they work, from Daniel.

4) Enforce returning 0 for fentry/fexit progs, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:57:21 -07:00
Kevin Lo b0ed0bbfb3 net: phy: broadcom: add support for BCM54811 PHY
The BCM54811 PHY shares many similarities with the already supported BCM54810
PHY but additionally requires some semi-unique configuration.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:56:31 -07:00
David S. Miller d42d118cfc Merge branch 'cxgb4-improve-and-tune-TC-MQPRIO-offload'
Rahul Lakkireddy says:

====================
cxgb4: improve and tune TC-MQPRIO offload

Patch 1 improves the Tx path's credit request and recovery mechanism
when running under heavy load.

Patch 2 adds ability to tune the burst buffer sizes of all traffic
classes to improve performance for <= 1500 MTU, under heavy load.

Patch 3 adds support to track EOTIDs and dump software queue
contexts used by TC-MQPRIO offload.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:08 -07:00
Rahul Lakkireddy 5148e5950c cxgb4: add EOTID tracking and software context dump
Rework and add support for dumping EOTID software context used by
TC-MQPRIO. Also track number of EOTIDs in use.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
Rahul Lakkireddy 4bccfc036a cxgb4: tune burst buffer size for TC-MQPRIO offload
For each traffic class, firmware handles up to 4 * MTU amount of data
per burst cycle. Under heavy load, this small buffer size is a
bottleneck when buffering large TSO packets in <= 1500 MTU case.
Increase the burst buffer size to 8 * MTU when supported.

Also, keep the driver's traffic class configuration API similar to
the firmware API counterpart.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
Rahul Lakkireddy 4f1d97262d cxgb4: improve credits recovery in TC-MQPRIO Tx path
Request credit update for every half credits consumed, including
the current request. Also, avoid re-trying to post packets when there
are no credits left. The credit update reply via interrupt will
eventually restore the credits and will invoke the Tx path again.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:54:07 -07:00
David S. Miller 3430223d39 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-05-15

The following pull-request contains BPF updates for your *net-next* tree.

We've added 37 non-merge commits during the last 1 day(s) which contain
a total of 67 files changed, 741 insertions(+), 252 deletions(-).

The main changes are:

1) bpf_xdp_adjust_tail() now allows to grow the tail as well, from Jesper.

2) bpftool can probe CONFIG_HZ, from Daniel.

3) CAP_BPF is introduced to isolate user processes that use BPF infra and
   to secure BPF networking services by dropping CAP_SYS_ADMIN requirement
   in certain cases, from Alexei.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:43:52 -07:00
DENG Qingfang 0141792f8b net: dsa: mt7530: fix VLAN setup
Allow DSA to add VLAN entries even if VLAN filtering is disabled, so
enabling it will not block the traffic of existent ports in the bridge

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:42:58 -07:00
David S. Miller cd2809cca2 Merge branch 'Implement-classifier-action-terse-dump-mode'
Vlad Buslov says:

====================
Implement classifier-action terse dump mode

Output rate of current upstream kernel TC filter dump implementation if
relatively low (~100k rules/sec depending on configuration). This
constraint impacts performance of software switch implementation that
rely on TC for their datapath implementation and periodically call TC
filter dump to update rules stats. Moreover, TC filter dump output a lot
of static data that don't change during the filter lifecycle (filter
key, specific action details, etc.) which constitutes significant
portion of payload on resulting netlink packets and increases amount of
syscalls necessary to dump all filters on particular Qdisc. In order to
significantly improve filter dump rate this patch sets implement new
mode of TC filter dump operation named "terse dump" mode. In this mode
only parameters necessary to identify the filter (handle, action cookie,
etc.) and data that can change during filter lifecycle (filter flags,
action stats, etc.) are preserved in dump output while everything else
is omitted.

Userspace API is implemented using new TCA_DUMP_FLAGS tlv with only
available flag value TCA_DUMP_FLAGS_TERSE. Internally, new API requires
individual classifier support (new tcf_proto_ops->terse_dump()
callback). Support for action terse dump is implemented in act API and
don't require changing individual action implementations.

The following table provides performance comparison between regular
filter dump and new terse dump mode for two classifier-action profiles:
one minimal config with L2 flower classifier and single gact action and
another heavier config with L2+5tuple flower classifier with
tunnel_key+mirred actions.

 Classifier-action type      |        dump |  terse dump | X improvement
                             | (rules/sec) | (rules/sec) |
-----------------------------+-------------+-------------+---------------
 L2 with gact                |       141.8 |       293.2 |          2.07
 L2+5tuple tunnel_key+mirred |        76.4 |       198.8 |          2.60

Benchmark details: to measure the rate tc filter dump and terse dump
commands are invoked on ingress Qdisc that have one million filters
configured using following commands.

> time sudo tc -s filter show dev ens1f0 ingress >/dev/null

> time sudo tc -s filter show terse dev ens1f0 ingress >/dev/null

Value in results table is calculated by dividing 1000000 total rules by
"real" time reported by time command.

Setup details: 2x Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz, 32GB memory
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:42:51 -07:00
Matthieu Baerts 9a2dbb59eb selftests: mptcp: pm: rm the right tmp file
"$err" is a variable pointing to a temp file. "$out" is not: only used
as a local variable in "check()" and representing the output of a
command line.

Fixes: eedbc68532 (selftests: add PM netlink functional tests)
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:33:56 -07:00
Ioana Ciornei efa6a7d075 dpaa2-eth: properly handle buffer size restrictions
Depending on the WRIOP version, the buffer size on the RX path must by a
multiple of 64 or 256. Handle this restriction properly by aligning down
the buffer size to the necessary value. Also, use the new buffer size
dynamically computed instead of the compile time one.

Fixes: 27c874867c ("dpaa2-eth: Use a single page per Rx buffer")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:30:47 -07:00
Vlad Buslov e7534fd42a selftests: implement flower classifier terse dump tests
Implement two basic tests to verify terse dump functionality of flower
classifier:

- Test that verifies that terse dump works.

- Test that verifies that terse dump doesn't print filter key.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Vlad Buslov 0348451db9 net: sched: cls_flower: implement terse dump support
Implement tcf_proto_ops->terse_dump() callback for flower classifier. Only
dump handle, flags and action data in terse mode.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Vlad Buslov ca44b738e5 net: sched: implement terse dump support in act
Extend tcf_action_dump() with boolean argument 'terse' that is used to
request terse-mode action dump. In terse mode only essential data needed to
identify particular action (action kind, cookie, etc.) and its stats is put
to resulting skb and everything else is omitted. Implement
tcf_exts_terse_dump() helper in cls API that is intended to be used to
request terse dump of all exts (actions) attached to the filter.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Vlad Buslov f8ab1807a9 net: sched: introduce terse dump flag
Add new TCA_DUMP_FLAGS attribute and use it in cls API to request terse
filter output from classifiers with TCA_DUMP_FLAGS_TERSE flag. This option
is intended to be used to improve performance of TC filter dump when
userland only needs to obtain stats and not the whole classifier/action
data. Extend struct tcf_proto_ops with new terse_dump() callback that must
be defined by supporting classifier implementations.

Support of the options in specific classifiers and actions is
implemented in following patches in the series.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:23:11 -07:00
Tobias Waldekranz 2e186a2cf8 net: core: recursively find netdev by device node
The assumption that a device node is associated either with the
netdev's device, or the parent of that device, does not hold for all
drivers. E.g. Freescale's DPAA has two layers of platform devices
above the netdev. Instead, recursively walk up the tree from the
netdev, allowing any parent to match against the sought after node.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-15 10:18:49 -07:00
Linus Torvalds 051e6b7e34 hwmon fixes for v5.7-rc6
Fix ADC access synchronization problem with da9052 driver
 Fix temperature limit and status reporting in nct7904 driver
 Fix drivetemp temperature reporting if SCT is supported
 but SCT data tables are not.
 -----BEGIN PGP SIGNATURE-----
 
 iQIyBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAl6+s2EACgkQyx8mb86f
 mYEJKg/yAmslfyCTM0d6p+O4Vzl7r1PWNjI1d1BxBXJfDxMorSeZqKIlcb5VlvNa
 aNXq61NfQdTv2HyMTiKzSuCCu6OuMUmwPnVwO92U/tfEsxX7UNKtcU+XM7CfyPyf
 USVadR8Uq9JawJRs0Sq5wNRcPxaTC7DIM9MyKLRdirBl/HWgyLhGH2UL+9we2xy8
 fukj060uFxm55QbIQsLhMN1oJ4u2gwvZuyGZdiZ0QRIHrDlacZKrVFRrXmm27qUr
 ZGN3gHKVFRkQU9NQdflait/EJdb/AeVJIjZGvP195SFg1DaRQL7q5Xh6BEipadRt
 vSZAJP4G8YqgYGsqRlIMiyzlSZjvmioCRrUtbcK1ATCsHtz5KhbtUnACmzDiEA0c
 86xEkQ8ucIb+VO2qMLZsjJSFu5SQeQHzIh5scxQSbQBXYmGY9cK0EkvvALlqItM6
 /afBISGoAcsCxyZQ9u4bLhzHS/A6pmRTE2tWcRbFfKMU1bvcxYESdF19P5UFSiix
 gUEdbyMh8lF3jAB/zqcMR+J/Jlm+UGM+gFlBuwdOiu/kdrHT/F36UVcKclVjmkKL
 iKkD37fgm4X+vUo/fPCCU6ISSzsxWBBlb7IetVNsDEGdmHwuivrJN1kNX3ra2pUE
 4aovpn0tcLSjdMWNhqTe3c3+rYIC/DL6ZRIiNpTsO1xfednhUw==
 =YCH3
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - Fix ADC access synchronization problem with da9052 driver

 - Fix temperature limit and status reporting in nct7904 driver

 - Fix drivetemp temperature reporting if SCT is supported but SCT data
   tables are not.

* tag 'hwmon-for-v5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (da9052) Synchronize access with mfd
  hwmon: (nct7904) Fix incorrect range of temperature limit registers
  hwmon: (nct7904) Read all SMI status registers in probe function
  hwmon: (drivetemp) Fix SCT support if SCT data tables are not supported
2020-05-15 10:14:05 -07:00
Linus Torvalds 1742bcd0cb sound fixes for 5.7-rc6
The things look good and calming down; the only change to ALSA core
 is the fix for racy rawmidi buffer accesses spotted by syzkaller,
 and the rest are all small device-specific quirks for HD-audio and
 USB-audio devices.
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAl6+NX0OHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE9P9g//ZE9BbJqlDTyVKToZC1knpIsLe2nBknhsUsZC
 AfYYlThPELB+P5YoCN0YfrTRecSVsCfhckEN7LDXjL30G/8R6z7TUe5as/+3PX1I
 31G31XAm8mQKURttkjjuc+ImzjG1aWSDJHUqH9S9NL4yskZ6ckuD62tOTuMZhofV
 2rjNL8JSwv//OJ0ThQzBvH42i81z1Pji3opqCm6ONEHhJzcv1EfZ20Mx1OcTE2E9
 32lDk3xaekd8Vq+ulYqvV9VmeMEd9ffZn/v01s89vSuSI8tT253e3wf84S7dR4U9
 1bAU13h74CdCGHcxiRiLRaao/s5txljyEmTaTsdWcUqUvsZ54D7lFdMCkcFDbJJZ
 cP9nahKCUK45n2ynUq/NrgNLDJXlyE2QKnGUWABPcP4im/oCllKZFmtGVjb4IWun
 tVsHs1jLBg9mF43XeNZXW4fG7RMnfD3g/r2c6P9pfQg/3nJ7pywqkN0PA+vuw/OC
 Ff5I1HQq5IqyDi30hQJpbKg6YfgZ61zNZTX7lndYXvQveZIveSkcWp6eKuO6iSAn
 u3CdhYEponJ23QJo6XxT4Nh9P0jgsEmweeoCORV14RIMmufDUI4f3QGP1G9C3KPT
 h1+T8hNLuRfE0CcW9ddHYLsn8K7yKJZtC0zsRKivssjK7rlgMYpGyS5NyukidS3h
 xrPvnQw=
 =HlsV
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Things look good and calming down; the only change to ALSA core is the
  fix for racy rawmidi buffer accesses spotted by syzkaller, and the
  rest are all small device-specific quirks for HD-audio and USB-audio
  devices"

* tag 'sound-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek - Limit int mic boost for Thinkpad T530
  ALSA: hda/realtek - Add COEF workaround for ASUS ZenBook UX431DA
  ALSA: hda/realtek: Enable headset mic of ASUS UX581LV with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS UX550GE with ALC295
  ALSA: hda/realtek - Enable headset mic of ASUS GL503VM with ALC295
  ALSA: hda/realtek: Add quirk for Samsung Notebook
  ALSA: rawmidi: Fix racy buffer resize under concurrent accesses
  ALSA: usb-audio: add mapping for ASRock TRX40 Creator
  ALSA: hda/realtek - Fix S3 pop noise on Dell Wyse
  Revert "ALSA: hda/realtek: Fix pop noise on ALC225"
  ALSA: firewire-lib: fix 'function sizeof not defined' error of tracepoints format
  ALSA: usb-audio: Add control message quirk delay for Kingston HyperX headset
2020-05-15 10:06:49 -07:00
Linus Torvalds e7cea79058 drm fixes for v5.7-rc6
i915 (two weeks):
 - Handle idling during i915_gem_evict_something busy loops (Chris)
 - Mark current submissions with a weak-dependency (Chris)
 - Propagate error from completed fences (Chris)
 - Fixes on execlist to avoid GPU hang situation (Chris)
 - Fixes couple deadlocks (Chris)
 - Timeslice preemption fixes (Chris)
 - Fix Display Port interrupt handling on Tiger Lake (Imre)
 - Reduce debug noise around Frame Buffer Compression (Peter)
 - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
 - Avoid dereferencing a dead context (Chris)
 
 tegra:
 - tegra120/4 smmu fixes
 
  amdgpu:
 - Clockgating fixes
 - Fix fbdev with scatter/gather display
 - S4 fix for navi
 - Soft recovery for gfx10
 - Freesync fixes
 - Atomic check cursor fix
 - Add a gfxoff quirk
 - MST fix
 
 amdkfd:
 - Fix GEM reference counting
 
 meson:
 - error code propogation fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJevjILAAoJEAx081l5xIa+eGMQAJ13DoMXrJUdaH8XCsh77kLn
 cQzV/pHFJP8yjHaU1BFebbzcOMTIrNFXn2kncPgCzaPwh8mBoWeYxjMT3eWeJNvy
 CDEbBvPTL/j13Yq0H9jzp3LxAMI/atXyL/ujUI3AAEtDoMaMbJ5ai//4rHxoXeQ5
 UDXNk9VarbTGKXovJK2/Wq0xNMrMspLQi85+uejF6FA6uztgfyCk+Me5WvuRk2vx
 A5OFa3Gl+PhVqjwQ1pVwO8eii33YVosQ4TfnYMrDV1YnjB6Od8oOeUCUIio9lh2x
 3hIlhAR1CELdx66U2BmGPd0ZXiE8CBXZ8Eh8JBoFIms3S88kIKypeZglnJnIV9cA
 ELC4eXiXNViyHwhgq8+h6le3JXdiYJ+PvUVvKXtKw2N9w67pUP8q571jwA4PGnTv
 9iAiXc55K9cBTbIiL4BE9zU/Ap7eMkMMFiQsQeOtXZb8PrhvlfVChDpfcESjh6uR
 9qIg1JyRYiOLZv3UOiR1MJVvjOssX0YzUsO4riw+5W1uaMNCcBFG9d4YphrRJjBK
 ReBu0YbI9ZwhMGldj8iXANdxqV7B2VELBquDg65ev6epFOw40skAqsIxAfAGRXO9
 fI7OWX25TPHqbLWZNBoSyYfMbsbMwfwX+5j7Sg0cF2T/CYXlCd1rJgYye55ifE3b
 wYC4wlYNKj9K0LQs8Rik
 =b/ek
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "As mentioned last week an i915 PR came in late, but I left it, so the
  i915 bits of this cover 2 weeks, which is why it's likely a bit larger
  than usual.

  Otherwise it's mostly amdgpu fixes, one tegra fix, one meson fix.

  i915:
   - Handle idling during i915_gem_evict_something busy loops (Chris)
   - Mark current submissions with a weak-dependency (Chris)
   - Propagate error from completed fences (Chris)
   - Fixes on execlist to avoid GPU hang situation (Chris)
   - Fixes couple deadlocks (Chris)
   - Timeslice preemption fixes (Chris)
   - Fix Display Port interrupt handling on Tiger Lake (Imre)
   - Reduce debug noise around Frame Buffer Compression (Peter)
   - Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
   - Avoid dereferencing a dead context (Chris)

  tegra:
   - tegra120/4 smmu fixes

  amdgpu:
   - Clockgating fixes
   - Fix fbdev with scatter/gather display
   - S4 fix for navi
   - Soft recovery for gfx10
   - Freesync fixes
   - Atomic check cursor fix
   - Add a gfxoff quirk
   - MST fix

  amdkfd:
   - Fix GEM reference counting

  meson:
   - error code propogation fix"

* tag 'drm-fixes-2020-05-15' of git://anongit.freedesktop.org/drm/drm: (29 commits)
  drm/i915: Handle idling during i915_gem_evict_something busy loops
  drm/meson: pm resume add return errno branch
  drm/amd/amdgpu: Update update_config() logic
  drm/amd/amdgpu: add raven1 part to the gfxoff quirk list
  drm/i915: Mark concurrent submissions with a weak-dependency
  drm/i915: Propagate error from completed fences
  drm/i915/gvt: Fix kernel oops for 3-level ppgtt guest
  drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of inheritance.
  drm/amd/display: add basic atomic check for cursor plane
  drm/amd/display: Fix vblank and pageflip event handling for FreeSync
  drm/amdgpu: implement soft_recovery for gfx10
  drm/amdgpu: enable hibernate support on Navi1X
  drm/amdgpu: Use GEM obj reference for KFD BOs
  drm/amdgpu: force fbdev into vram
  drm/amd/powerplay: perform PG ungate prior to CG ungate
  drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
  drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate
  drm/i915/execlists: Track inflight CCID
  drm/i915/execlists: Avoid reusing the same logical CCID
  drm/i915/gem: Remove object_is_locked assertion from unpin_from_display_plane
  ...
2020-05-15 09:59:49 -07:00
Daniel Borkmann ed24a7a852 Merge branch 'bpf-cap'
Alexei Starovoitov says:

====================
v6->v7:
- permit SK_REUSEPORT program type under CAP_BPF as suggested by Marek Majkowski.
  It's equivalent to SOCKET_FILTER which is unpriv.

v5->v6:
- split allow_ptr_leaks into four flags.
- retain bpf_jit_limit under cap_sys_admin.
- fixed few other issues spotted by Daniel.

v4->v5:

Split BPF operations that are allowed under CAP_SYS_ADMIN into combination of
CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN and keep some of them under CAP_SYS_ADMIN.

The user process has to have
- CAP_BPF to create maps, do other sys_bpf() commands and load SK_REUSEPORT progs.
  Note: dev_map, sock_hash, sock_map map types still require CAP_NET_ADMIN.
  That could be relaxed in the future.
- CAP_BPF and CAP_PERFMON to load tracing programs.
- CAP_BPF and CAP_NET_ADMIN to load networking programs.
(or CAP_SYS_ADMIN for backward compatibility).

CAP_BPF solves three main goals:
1. provides isolation to user space processes that drop CAP_SYS_ADMIN and switch to CAP_BPF.
   More on this below. This is the major difference vs v4 set back from Sep 2019.
2. makes networking BPF progs more secure, since CAP_BPF + CAP_NET_ADMIN
   prevents pointer leaks and arbitrary kernel memory access.
3. enables fuzzers to exercise all of the verifier logic. Eventually finding bugs
   and making BPF infra more secure. Currently fuzzers run in unpriv.
   They will be able to run with CAP_BPF.

The patchset is long overdue follow-up from the last plumbers conference.
Comparing to what was discussed at LPC the CAP* checks at attach time are gone.
For tracing progs the CAP_SYS_ADMIN check was done at load time only. There was
no check at attach time. For networking and cgroup progs CAP_SYS_ADMIN was
required at load time and CAP_NET_ADMIN at attach time, but there are several
ways to bypass CAP_NET_ADMIN:
- if networking prog is using tail_call writing FD into prog_array will
  effectively attach it, but bpf_map_update_elem is an unprivileged operation.
- freplace prog with CAP_SYS_ADMIN can replace networking prog

Consolidating all CAP checks at load time makes security model similar to
open() syscall. Once the user got an FD it can do everything with it.
read/write/poll don't check permissions. The same way when bpf_prog_load
command returns an FD the user can do everything (including attaching,
detaching, and bpf_test_run).

The important design decision is to allow ID->FD transition for
CAP_SYS_ADMIN only. What it means that user processes can run
with CAP_BPF and CAP_NET_ADMIN and they will not be able to affect each
other unless they pass FDs via scm_rights or via pinning in bpffs.
ID->FD is a mechanism for human override and introspection.
An admin can do 'sudo bpftool prog ...'. It's possible to enforce via LSM that
only bpftool binary does bpf syscall with CAP_SYS_ADMIN and the rest of user
space processes do bpf syscall with CAP_BPF isolating bpf objects (progs, maps,
links) that are owned by such processes from each other.

Another significant change from LPC is that the verifier checks are split into
four flags. The allow_ptr_leaks flag allows pointer manipulations. The
bpf_capable flag enables all modern verifier features like bpf-to-bpf calls,
BTF, bounded loops, dead code elimination, etc. All the goodness. The
bypass_spec_v1 flag enables indirect stack access from bpf programs and
disables speculative analysis and bpf array mitigations. The bypass_spec_v4
flag disables store sanitation. That allows networking progs with CAP_BPF +
CAP_NET_ADMIN enjoy modern verifier features while being more secure.

Some networking progs may need CAP_BPF + CAP_NET_ADMIN + CAP_PERFMON,
since subtracting pointers (like skb->data_end - skb->data) is a pointer leak,
but the verifier may get smarter in the future.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2020-05-15 17:29:46 +02:00
Alexei Starovoitov 8162600118 selftests/bpf: Use CAP_BPF and CAP_PERFMON in tests
Make all test_verifier test exercise CAP_BPF and CAP_PERFMON

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-4-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Alexei Starovoitov 2c78ee898d bpf: Implement CAP_BPF
Implement permissions as stated in uapi/linux/capability.h
In order to do that the verifier allow_ptr_leaks flag is split
into four flags and they are set as:
  env->allow_ptr_leaks = bpf_allow_ptr_leaks();
  env->bypass_spec_v1 = bpf_bypass_spec_v1();
  env->bypass_spec_v4 = bpf_bypass_spec_v4();
  env->bpf_capable = bpf_capable();

The first three currently equivalent to perfmon_capable(), since leaking kernel
pointers and reading kernel memory via side channel attacks is roughly
equivalent to reading kernel memory with cap_perfmon.

'bpf_capable' enables bounded loops, precision tracking, bpf to bpf calls and
other verifier features. 'allow_ptr_leaks' enable ptr leaks, ptr conversions,
subtraction of pointers. 'bypass_spec_v1' disables speculative analysis in the
verifier, run time mitigations in bpf array, and enables indirect variable
access in bpf programs. 'bypass_spec_v4' disables emission of sanitation code
by the verifier.

That means that the networking BPF program loaded with CAP_BPF + CAP_NET_ADMIN
will have speculative checks done by the verifier and other spectre mitigation
applied. Such networking BPF program will not be able to leak kernel pointers
and will not be able to access arbitrary kernel memory.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-3-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Alexei Starovoitov a17b53c4a4 bpf, capability: Introduce CAP_BPF
Split BPF operations that are allowed under CAP_SYS_ADMIN into
combination of CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN.
For backward compatibility include them in CAP_SYS_ADMIN as well.

The end result provides simple safety model for applications that use BPF:
- to load tracing program types
  BPF_PROG_TYPE_{KPROBE, TRACEPOINT, PERF_EVENT, RAW_TRACEPOINT, etc}
  use CAP_BPF and CAP_PERFMON
- to load networking program types
  BPF_PROG_TYPE_{SCHED_CLS, XDP, SK_SKB, etc}
  use CAP_BPF and CAP_NET_ADMIN

There are few exceptions from this rule:
- bpf_trace_printk() is allowed in networking programs, but it's using
  tracing mechanism, hence this helper needs additional CAP_PERFMON
  if networking program is using this helper.
- BPF_F_ZERO_SEED flag for hash/lru map is allowed under CAP_SYS_ADMIN only
  to discourage production use.
- BPF HW offload is allowed under CAP_SYS_ADMIN.
- bpf_probe_write_user() is allowed under CAP_SYS_ADMIN only.

CAPs are not checked at attach/detach time with two exceptions:
- loading BPF_PROG_TYPE_CGROUP_SKB is allowed for unprivileged users,
  hence CAP_NET_ADMIN is required at attach time.
- flow_dissector detach doesn't check prog FD at detach,
  hence CAP_NET_ADMIN is required at detach time.

CAP_SYS_ADMIN is required to iterate BPF objects (progs, maps, links) via get_next_id
command and convert them to file descriptor via GET_FD_BY_ID command.
This restriction guarantees that mutliple tasks with CAP_BPF are not able to
affect each other. That leads to clean isolation of tasks. For example:
task A with CAP_BPF and CAP_NET_ADMIN loads and attaches a firewall via bpf_link.
task B with the same capabilities cannot detach that firewall unless
task A explicitly passed link FD to task B via scm_rights or bpffs.
CAP_SYS_ADMIN can still detach/unload everything.

Two networking user apps with CAP_SYS_ADMIN and CAP_NET_ADMIN can
accidentely mess with each other programs and maps.
Two networking user apps with CAP_NET_ADMIN and CAP_BPF cannot affect each other.

CAP_NET_ADMIN + CAP_BPF allows networking programs access only packet data.
Such networking progs cannot access arbitrary kernel memory or leak pointers.

bpftool, bpftrace, bcc tools binaries should NOT be installed with
CAP_BPF and CAP_PERFMON, since unpriv users will be able to read kernel secrets.
But users with these two permissions will be able to use these tracing tools.

CAP_PERFMON is least secure, since it allows kprobes and kernel memory access.
CAP_NET_ADMIN can stop network traffic via iproute2.
CAP_BPF is the safest from security point of view and harmless on its own.

Having CAP_BPF and/or CAP_NET_ADMIN is not enough to write into arbitrary map
and if that map is used by firewall-like bpf prog.
CAP_BPF allows many bpf prog_load commands in parallel. The verifier
may consume large amount of memory and significantly slow down the system.

Existing unprivileged BPF operations are not affected.
In particular unprivileged users are allowed to load socket_filter and cg_skb
program types and to create array, hash, prog_array, map-in-map map types.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200513230355.7858-2-alexei.starovoitov@gmail.com
2020-05-15 17:29:41 +02:00
Daniel Borkmann 0ee52c0f6c bpf, bpftool: Allow probing for CONFIG_HZ from kernel config
In Cilium we've recently switched to make use of bpf_jiffies64() for
parts of our tc and XDP datapath since bpf_ktime_get_ns() is more
expensive and high-precision is not needed for our timeouts we have
anyway. Our agent has a probe manager which picks up the json of
bpftool's feature probe and we also use the macro output in our C
programs e.g. to have workarounds when helpers are not available on
older kernels.

Extend the kernel config info dump to also include the kernel's
CONFIG_HZ, and rework the probe_kernel_image_config() for allowing a
macro dump such that CONFIG_HZ can be propagated to BPF C code as a
simple define if available via config. Latter allows to have _compile-
time_ resolution of jiffies <-> sec conversion in our code since all
are propagated as known constants.

Given we cannot generally assume availability of kconfig everywhere,
we also have a kernel hz probe [0] as a fallback. Potentially, bpftool
could have an integrated probe fallback as well, although to derive it,
we might need to place it under 'bpftool feature probe full' or similar
given it would slow down the probing process overall. Yet 'full' doesn't
fit either for us since we don't want to pollute the kernel log with
warning messages from bpf_probe_write_user() and bpf_trace_printk() on
agent startup; I've left it out for the time being.

  [0] https://github.com/cilium/cilium/blob/master/bpf/cilium-probe-kernel-hz.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200513075849.20868-1-daniel@iogearbox.net
2020-05-15 08:18:53 -07:00
Alexei Starovoitov 59df9f1fb4 Merge branch 'restrict-bpf_probe_read'
Daniel Borkmann says:

====================
Small set of fixes in order to restrict BPF helpers for tracing which are
broken on archs with overlapping address ranges as per discussion in [0].
I've targetted this for -bpf tree so they can be routed as fixes. Thanks!

v1 -> v2:
  - switch to reusable %pks, %pus format specifiers (Yonghong)
    - fixate %s on kernel_ds probing for archs with overlapping addr space

      [0] https://lore.kernel.org/bpf/CAHk-=wjJKo0GVixYLmqPn-Q22WFu0xHaBSjKEo7e7Yw72y5SPQ@mail.gmail.com/T/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-15 08:15:07 -07:00
Daniel Borkmann b2a5212fb6 bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
Usage of plain %s conversion specifier in bpf_trace_printk() suffers from the
very same issue as bpf_probe_read{,str}() helpers, that is, it is broken on
archs with overlapping address ranges.

While the helpers have been addressed through work in 6ae08ae3de ("bpf: Add
probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers"), we need
an option for bpf_trace_printk() as well to fix it.

Similarly as with the helpers, force users to make an explicit choice by adding
%pks and %pus specifier to bpf_trace_printk() which will then pick the corresponding
strncpy_from_unsafe*() variant to perform the access under KERNEL_DS or USER_DS.
The %pk* (kernel specifier) and %pu* (user specifier) can later also be extended
for other objects aside strings that are probed and printed under tracing, and
reused out of other facilities like bpf_seq_printf() or BTF based type printing.

Existing behavior of %s for current users is still kept working for archs where it
is not broken and therefore gated through CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.
For archs not having this property we fall-back to pick probing under KERNEL_DS as
a sensible default.

Fixes: 8d3b7dce86 ("bpf: add support for %s specifier to bpf_trace_printk()")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Link: https://lore.kernel.org/bpf/20200515101118.6508-4-daniel@iogearbox.net
2020-05-15 08:10:36 -07:00
Daniel Borkmann 47cc0ed574 bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
Given bpf_probe_read{,str}() BPF helpers are now only available under
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE, we need to add the drop-in
replacements of bpf_probe_read_{kernel,user}_str() to do_refine_retval_range()
as well to avoid hitting the same issue as in 849fa50662 ("bpf/verifier:
refine retval R0 state for bpf_get_stack helper").

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200515101118.6508-3-daniel@iogearbox.net
2020-05-15 08:10:36 -07:00
Daniel Borkmann 0ebeea8ca8 bpf: Restrict bpf_probe_read{, str}() only to archs where they work
Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs
with overlapping address ranges, we should really take the next step to
disable them from BPF use there.

To generally fix the situation, we've recently added new helper variants
bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str().
For details on them, see 6ae08ae3de ("bpf: Add probe_read_{user, kernel}
and probe_read_{user,kernel}_str helpers").

Given bpf_probe_read{,str}() have been around for ~5 years by now, there
are plenty of users at least on x86 still relying on them today, so we
cannot remove them entirely w/o breaking the BPF tracing ecosystem.

However, their use should be restricted to archs with non-overlapping
address ranges where they are working in their current form. Therefore,
move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and
have x86, arm64, arm select it (other archs supporting it can follow-up
on it as well).

For the remaining archs, they can workaround easily by relying on the
feature probe from bpftool which spills out defines that can be used out
of BPF C code to implement the drop-in replacement for old/new kernels
via: bpftool feature probe macro

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net
2020-05-15 08:10:36 -07:00
Dave Airlie 1d2a1eb136 Just one meson patch this time to propagate an error code
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXrz0qQAKCRDj7w1vZxhR
 xSImAQDRwP5tNB3ZZmnJ2ABF2uqT+YTzS5oPW//bxCgnq128FQEA4UxFvgVOXAGI
 HUsEGriyhoYgPMApmKHsFTTCvphlcQY=
 =Igwt
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2020-05-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Just one meson patch this time to propagate an error code

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514073538.wvdtv5s2mt4wdrdj@gilmour.lan
2020-05-15 16:00:57 +10:00
Alexei Starovoitov 5cc5924d83 Merge branch 'xdp-grow-tail'
Jesper Dangaard Brouer says:

====================
V4:
- Fixup checkpatch.pl issues
- Collected more ACKs

V3:
- Fix issue on virtio_net patch spotted by Jason Wang
- Adjust name for variable in mlx5 patch
- Collected more ACKs

V2:
- Fix bug in mlx5 for XDP_PASS case
- Collected nitpicks and ACKs from mailing list

V1:
- Fix bug in dpaa2

XDP have evolved to support several frame sizes, but xdp_buff was not
updated with this information. This have caused the side-effect that
XDP frame data hard end is unknown. This have limited the BPF-helper
bpf_xdp_adjust_tail to only shrink the packet. This patchset address
this and add packet tail extend/grow.

The purpose of the patchset is ALSO to reserve a memory area that can be
used for storing extra information, specifically for extending XDP with
multi-buffer support. One proposal is to use same layout as
skb_shared_info, which is why this area is currently 320 bytes.

When converting xdp_frame to SKB (veth and cpumap), the full tailroom
area can now be used and SKB truesize is now correct. For most
drivers this result in a much larger tailroom in SKB "head" data
area. The network stack can now take advantage of this when doing SKB
coalescing. Thus, a good driver test is to use xdp_redirect_cpu from
samples/bpf/ and do some TCP stream testing.

Use-cases for tail grow/extend:
(1) IPsec / XFRM needs a tail extend[1][2].
(2) DNS-cache responses in XDP.
(3) HAProxy ALOHA would need it to convert to XDP.
(4) Add tail info e.g. timestamp and collect via tcpdump

[1] http://vger.kernel.org/netconf2019_files/xfrm_xdp.pdf
[2] http://vger.kernel.org/netconf2019.html

Examples on howto access the tail area of an XDP packet is shown in the
XDP-tutorial example[3].

[3] https://github.com/xdp-project/xdp-tutorial/blob/master/experiment01-tailgrow/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-05-14 21:50:03 -07:00
Jesper Dangaard Brouer 7ae2e00e8f selftests/bpf: Xdp_adjust_tail add grow tail tests
Extend BPF selftest xdp_adjust_tail with grow tail tests, which is added
as subtest's. The first grow test stays in same form as original shrink
test. The second grow test use the newer bpf_prog_test_run_xattr() calls,
and does extra checking of data contents.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945350567.97035.9632611946765811876.stgit@firesoul
2020-05-14 21:21:57 -07:00
Jesper Dangaard Brouer 68545fb6f2 selftests/bpf: Adjust BPF selftest for xdp_adjust_tail
Current selftest for BPF-helper xdp_adjust_tail only shrink tail.
Make it more clear that this is a shrink test case.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945350058.97035.17280775016196207372.stgit@firesoul
2020-05-14 21:21:57 -07:00
Jesper Dangaard Brouer bc56c919fc bpf: Add xdp.frame_sz in bpf_prog_test_run_xdp().
Update the memory requirements, when adding xdp.frame_sz in BPF test_run
function bpf_prog_test_run_xdp() which e.g. is used by XDP selftests.

Specifically add the expected reserved tailroom, but also allocated a
larger memory area to reflect that XDP frames usually comes in this
format. Limit the provided packet data size to 4096 minus headroom +
tailroom, as this also reflect a common 3520 bytes MTU limit with XDP.

Note that bpf_test_init already use a memory allocation method that clears
memory.  Thus, this already guards against leaking uninit kernel memory.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945349549.97035.15316291762482444006.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer ddb47d518c xdp: Clear grow memory in bpf_xdp_adjust_tail()
Clearing memory of tail when grow happens, because it is too easy
to write a XDP_PASS program that extend the tail, which expose
this memory to users that can run tcpdump.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158945349039.97035.5262100484553494.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer c8741e2bfe xdp: Allow bpf_xdp_adjust_tail() to grow packet size
Finally, after all drivers have a frame size, allow BPF-helper
bpf_xdp_adjust_tail() to grow or extend packet size at frame tail.

Remember that helper/macro xdp_data_hard_end have reserved some
tailroom.  Thus, this helper makes sure that the BPF-prog don't have
access to this tailroom area.

V2: Remove one chicken check and use WARN_ONCE for other

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/158945348530.97035.12577148209134239291.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer d628ee4fef mlx5: Rx queue setup time determine frame_sz for XDP
The mlx5 driver have multiple memory models, which are also changed
according to whether a XDP bpf_prog is attached.

The 'rx_striding_rq' setting is adjusted via ethtool priv-flags e.g.:
 # ethtool --set-priv-flags mlx5p2 rx_striding_rq off

On the general case with 4K page_size and regular MTU packet, then
the frame_sz is 2048 and 4096 when XDP is enabled, in both modes.

The info on the given frame size is stored differently depending on the
RQ-mode and encoded in a union in struct mlx5e_rq union wqe/mpwqe.
In rx striding mode rq->mpwqe.log_stride_sz is either 11 or 12, which
corresponds to 2048 or 4096 (MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ).
In non-striding mode (MLX5_WQ_TYPE_CYCLIC) the frag_stride is stored
in rq->wqe.info.arr[0].frag_stride, for the first fragment, which is
what the XDP case cares about.

To reduce effect on fast-path, this patch determine the frame_sz at
setup time, to avoid determining the memory model runtime. Variable
is named frame0_sz to make it clear that this is only the frame
size of the first fragment.

This mlx5 driver does a DMA-sync on XDP_TX action, but grow is safe
as it have done a DMA-map on the entire PAGE_SIZE. The driver also
already does a XDP length check against sq->hw_mtu on the possible
XDP xmit paths mlx5e_xmit_xdp_frame() + mlx5e_xmit_xdp_frame_mpwqe().

V3+4: Change variable name first_frame_sz to frame0_sz

V2: Fix that frag_size need to be recalc before creating SKB.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Link: https://lore.kernel.org/bpf/158945348021.97035.12295039384250022883.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer 2a637c5b1a xdp: For Intel AF_XDP drivers add XDP frame_sz
Intel drivers implement native AF_XDP zerocopy in separate C-files,
that have its own invocation of bpf_prog_run_xdp(). The setup of
xdp_buff is also handled in separately from normal code path.

This patch update XDP frame_sz for AF_XDP zerocopy drivers i40e, ice
and ixgbe, as the code changes needed are very similar.  Introduce a
helper function xsk_umem_xdp_frame_sz() for calculating frame size.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer d4ecdbf7aa ice: Add XDP frame size to driver
This driver uses different memory models depending on PAGE_SIZE at
compile time. For PAGE_SIZE 4K it uses page splitting, meaning for
normal MTU frame size is 2048 bytes (and headroom 192 bytes). For
larger MTUs the driver still use page splitting, by allocating
order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
4K, driver instead advance its rx_buffer->page_offset with the frame
size "truesize".

For XDP frame size calculations, this mean that in PAGE_SIZE larger
than 4K mode the frame_sz change on a per packet basis. For the page
split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
updated once outside the main NAPI loop.

The default setting in the driver uses build_skb(), which provides
the necessary headroom and tailroom for XDP-redirect in RX-frame
(in both modes).

There is one complication, which is legacy-rx mode (configurable via
ethtool priv-flags). There are zero headroom in this mode, which is a
requirement for XDP-redirect to work. The conversion to xdp_frame
(convert_to_xdp_frame) will detect this insufficient space, and
xdp_do_redirect() call will fail. This is deemed acceptable, as it
allows other XDP actions to still work in legacy-mode. In
legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also
accept that xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945347002.97035.328088795813704587.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer 24104024ce i40e: Add XDP frame size to driver
This driver uses different memory models depending on PAGE_SIZE at
compile time. For PAGE_SIZE 4K it uses page splitting, meaning for
normal MTU frame size is 2048 bytes (and headroom 192 bytes). For
larger MTUs the driver still use page splitting, by allocating
order-1 pages (8192 bytes) for RX frames. For PAGE_SIZE larger than
4K, driver instead advance its rx_buffer->page_offset with the frame
size "truesize".

For XDP frame size calculations, this mean that in PAGE_SIZE larger
than 4K mode the frame_sz change on a per packet basis. For the page
split 4K PAGE_SIZE mode, xdp.frame_sz is more constant and can be
updated once outside the main NAPI loop.

The default setting in the driver uses build_skb(), which provides
the necessary headroom and tailroom for XDP-redirect in RX-frame
(in both modes).

There is one complication, which is legacy-rx mode (configurable via
ethtool priv-flags). There are zero headroom in this mode, which is a
requirement for XDP-redirect to work. The conversion to xdp_frame
(convert_to_xdp_frame) will detect this insufficient space, and
xdp_do_redirect() call will fail. This is deemed acceptable, as it
allows other XDP actions to still work in legacy-mode. In
legacy-mode + larger PAGE_SIZE due to lacking tailroom, we also
accept that xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945346494.97035.12809400414566061815.stgit@firesoul
2020-05-14 21:21:56 -07:00
Jesper Dangaard Brouer 81f3c6283c ixgbevf: Add XDP frame size to VF driver
This patch mirrors the changes to ixgbe in previous patch.

This VF driver doesn't support XDP_REDIRECT, but correct tailroom is
still necessary for BPF-helper xdp_adjust_tail.  In legacy-mode +
larger PAGE_SIZE, due to lacking tailroom, we accept that
xdp_adjust_tail shrink doesn't work.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Link: https://lore.kernel.org/bpf/158945345984.97035.13518286183248025173.stgit@firesoul
2020-05-14 21:21:56 -07:00