Commit Graph

19446 Commits

Author SHA1 Message Date
David S. Miller 14684b9301 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
One conflict in the BPF samples Makefile, some fixes in 'net' whilst
we were converting over to Makefile.target rules in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 11:04:37 -08:00
Linus Torvalds 0058b0a506 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) BPF sample build fixes from Björn Töpel

 2) Fix powerpc bpf tail call implementation, from Eric Dumazet.

 3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet.

 4) Fix crash in ebtables when using dnat target, from Florian Westphal.

 5) Fix port disable handling whne removing bcm_sf2 driver, from Florian
    Fainelli.

 6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski.

 7) Various KCSAN fixes all over the networking, from Eric Dumazet.

 8) Memory leaks in mlx5 driver, from Alex Vesker.

 9) SMC interface refcounting fix, from Ursula Braun.

10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu.

11) Add a TX lock to synchonize the kTLS TX path properly with crypto
    operations. From Jakub Kicinski.

12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano
    Garzarella.

13) Infinite loop in Intel ice driver, from Colin Ian King.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
  ixgbe: need_wakeup flag might not be set for Tx
  i40e: need_wakeup flag might not be set for Tx
  igb/igc: use ktime accessors for skb->tstamp
  i40e: Fix for ethtool -m issue on X722 NIC
  iavf: initialize ITRN registers with correct values
  ice: fix potential infinite loop because loop counter being too small
  qede: fix NULL pointer deref in __qede_remove()
  net: fix data-race in neigh_event_send()
  vsock/virtio: fix sock refcnt holding during the shutdown
  net: ethernet: octeon_mgmt: Account for second possible VLAN header
  mac80211: fix station inactive_time shortly after boot
  net/fq_impl: Switch to kvmalloc() for memory allocation
  mac80211: fix ieee80211_txq_setup_flows() failure path
  ipv4: Fix table id reference in fib_sync_down_addr
  ipv6: fixes rt6_probe() and fib6_nh->last_probe init
  net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
  net: usb: qmi_wwan: add support for DW5821e with eSIM support
  CDC-NCM: handle incomplete transfer of MTU
  nfc: netlink: fix double device reference drop
  NFC: st21nfca: fix double free
  ...
2019-11-08 18:21:05 -08:00
Jiri Pirko f95e6c9c46 selftest: net: add alternative names test
Add a simple test for recently added netdevice alternative names.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-08 14:10:27 -08:00
Amit Cohen 83b2b61e05 selftests: mlxsw: Add test cases for devlink-trap layer 3 exceptions
Test that each supported packet trap exception is triggered under the
right conditions.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:41 -08:00
Amit Cohen f10caf0278 selftests: forwarding: tc_common: Add hitting check
Add an option to check that packets hit the tc filter without providing
the exact number of packets that should hit it.

It is useful while sending many packets in background and checking that
at least one of them hit the tc filter.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:40 -08:00
Amit Cohen 7ce4e76086 selftests: forwarding: devlink: Add functionality for trap exceptions test
Add common part of all the tests - check devlink status to ensure that
packets were trapped.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:40 -08:00
Amit Cohen d3e985c917 selftests: mlxsw: Add test cases for devlink-trap layer 3 drops
Test that each supported packet trap is triggered under the right
conditions and that packets are indeed dropped and not forwarded.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:40 -08:00
Amit Cohen ef7f6b1615 selftests: devlink: Make devlink_trap_cleanup() more generic
Add proto parameter in order to enable the use of devlink_trap_cleanup()
in tests that use IPv6 protocol.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:40 -08:00
Amit Cohen 6b45fe95fd selftests: devlink: Export functions to devlink library
l2_drops_test() is used to check that drop traps are functioning as
intended. Currently it is only used in the layer 2 test, but it is also
useful for the layer 3 test introduced in the subsequent patch.

l2_drops_cleanup() is used to clean configurations and kill mausezahn
proccess.

Export the functions to the common devlink library to allow it to be
re-used by future tests.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 19:51:40 -08:00
David Ahern 2386d74845 selftests: Add source route tests to fib_tests
Add tests to verify routes with source address set are deleted when
source address is deleted.

Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-07 16:16:55 -08:00
Francesco Ruggeri 3c28d99fc6 selftest: net: add some traceroute tests
Added the following traceroute tests.

IPV6:
Verify that in this scenario

       ------------------------ N2
        |                    |
      ------              ------  N3  ----
      | R1 |              | R2 |------|H2|
      ------              ------      ----
        |                    |
       ------------------------ N1
                 |
                ----
                |H1|
                ----

where H1's default route goes through R1 and R1's default route goes
through R2 over N2, traceroute6 from H1 to H2 reports R2's address
on N2 and not N1.

IPV4:
Verify that traceroute from H1 to H2 shows 1.0.1.1 in this scenario

                   1.0.3.1/24
---- 1.0.1.3/24    1.0.1.1/24 ---- 1.0.2.1/24    1.0.2.4/24 ----
|H1|--------------------------|R1|--------------------------|H2|
----            N1            ----            N2            ----

where net.ipv4.icmp_errors_use_inbound_ifaddr is set on R1 and
1.0.3.1/24 and 1.0.1.1/24 are respectively R1's primary and secondary
address on N1.

v2: fixed some typos, and have bridge in R1 instead of R2 in IPV6 test.

Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:35:49 -08:00
Jakub Kicinski 41098af59d selftests/tls: add test for concurrent recv and send
Add a test which spawns 16 threads and performs concurrent
send and recv calls on the same socket.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 17:33:32 -08:00
Linus Torvalds 4dd5815825 Merge branch 'akpm' (patches from Andrew)
Merge more fixes from Andrew Morton:
 "17 fixes"

Mostly mm fixes and one ocfs2 locking fix.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: memcontrol: fix network errors from failing __GFP_ATOMIC charges
  mm/memory_hotplug: fix updating the node span
  scripts/gdb: fix debugging modules compiled with hot/cold partitioning
  mm: slab: make page_cgroup_ino() to recognize non-compound slab pages properly
  MAINTAINERS: update information for "MEMORY MANAGEMENT"
  dump_stack: avoid the livelock of the dump_lock
  zswap: add Vitaly to the maintainers list
  mm/page_alloc.c: ratelimit allocation failure warnings more aggressively
  mm/khugepaged: fix might_sleep() warn with CONFIG_HIGHPTE=y
  mm, vmstat: reduce zone->lock holding time by /proc/pagetypeinfo
  mm, vmstat: hide /proc/pagetypeinfo from normal users
  mm/mmu_notifiers: use the right return code for WARN_ON
  ocfs2: protect extent tree in ocfs2_prepare_inode_for_write()
  mm: thp: handle page cache THP correctly in PageTransCompoundMap
  mm, meminit: recalculate pcpu batch and high limits after init completes
  mm/gup_benchmark: fix MAP_HUGETLB case
  mm: memcontrol: fix NULL-ptr deref in percpu stats flush
2019-11-06 12:02:13 -08:00
Roman Mashak 71c780f119 tc-testing: updated pedit TDC tests
Added tests for u8/u32 clear value, u8/16 retain value, u16/32 invert value,
u8/u16/u32 preserve value and test for negative offsets.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 11:12:39 -08:00
Jakub Kicinski 462ef97526 selftests: devlink: undo changes at the end of resource_test
The netdevsim object is reused by all the tests, but the resource
tests puts it into a broken state (failed reload in a different
namespace). Make sure it's fixed up at the end of that test
otherwise subsequent tests fail.

Fixes: b74c37fd35 ("selftests: netdevsim: add tests for devlink reload with resources")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 11:11:30 -08:00
Jakub Kicinski acceca8d24 selftests: bpf: log direct file writes
Recent changes to netdevsim moved creating and destroying
devices from netlink to sysfs. The sysfs writes have been
implemented as direct writes, without shelling out. This
is faster, but leaves no trace in the logs. Add explicit
logs to make debugging possible.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-06 09:59:58 -08:00
John Hubbard 64801d19eb mm/gup_benchmark: fix MAP_HUGETLB case
The MAP_HUGETLB ("-H" option) of gup_benchmark fails:

  $ sudo ./gup_benchmark -H
  mmap: Invalid argument

This is because gup_benchmark.c is passing in a file descriptor to
mmap(), but the fd came from opening up the /dev/zero file.  This
confuses the mmap syscall implementation, which thinks that, if the
caller did not specify MAP_ANONYMOUS, then the file must be a huge page
file.  So it attempts to verify that the file really is a huge page
file, as you can see here:

ksys_mmap_pgoff()
{
    if (!(flags & MAP_ANONYMOUS)) {
        retval = -EINVAL;
        if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
            goto out_fput; /* THIS IS WHERE WE END UP */

    else if (flags & MAP_HUGETLB) {
        ...proceed normally, /dev/zero is ok here...

...and of course is_file_hugepages() returns "false" for the /dev/zero
file.

The problem is that the user space program, gup_benchmark.c, really just
wants anonymous memory here.  The simplest way to get that is to pass
MAP_ANONYMOUS whenever MAP_HUGETLB is specified, so that's what this
patch does.

Link: http://lkml.kernel.org/r/20191021212435.398153-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-06 08:28:58 -08:00
Roman Mashak 2bceefbe55 tc-testing: added tests with cookie for mpls TC action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 17:49:43 -08:00
David S. Miller 41de23e223 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-11-02

The following pull-request contains BPF updates for your *net* tree.

We've added 6 non-merge commits during the last 6 day(s) which contain
a total of 8 files changed, 35 insertions(+), 9 deletions(-).

The main changes are:

1) Fix ppc BPF JIT's tail call implementation by performing a second pass
   to gather a stable JIT context before opcode emission, from Eric Dumazet.

2) Fix build of BPF samples sys_perf_event_open() usage to compiled out
   unavailable test_attr__{enabled,open} checks. Also fix potential overflows
   in bpf_map_{area_alloc,charge_init} on 32 bit archs, from Björn Töpel.

3) Fix narrow loads of bpf_sysctl context fields with offset > 0 on big endian
   archs like s390x and also improve the test coverage, from Ilya Leoshkevich.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-05 17:38:21 -08:00
Linus Torvalds 7111fa1151 GPIO fixes for the v5.4 series:
- Fix a build error in the tools used for kselftest.
 - A series of reverts to bring the Intel Merrifield back to
   working. We will likely unrevert the reverts for v5.5
   but we can't have v5.4 broken.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl3BjZ8ACgkQQRCzN7AZ
 XXPbTRAAxCBCegC7ZISN413NqmzAc8S9e2pHik/0+UinIiq8Kha1zHEmiWKa/GFv
 NSulFytfwYwi3UVgAMds2lFMpbq6atG60D/FgGhRDeG56RHOs29sYDXOIrWvrad6
 49B2U//2FQPcfMpH5PFT0O+B90cxZRYW/MhljnAuBRPUEF6F9m/BJqHlQYD6eY6v
 Xn005z1N/sQbs/1EBXryNlvfHOtepKXTNC0d+HQnRRTEZsAhid8sc+iTVEvLUZh8
 IttKs2I7rB26swx/HwWyiuWxE3hf5lrlwf3xxMQn2cWyyp08vJNRyiEXsEErHQf6
 9eqt1j4iXARf1fh4KoLCgC6zQgqsLG0BZ9Bf28gVXIRXJZDYV1FOpBfc8YwTk29T
 VhNrafUN0o+72qieXqLLKpMT4gjMnqzBhI06NRXEqA2AwIqQJQd4DHdLNHmiaICJ
 cyfzI4TBTjc3qh2av+D43bESh2UwXYErImvaIT+TqFv1P2T7x2UCFL6WGuQZ/qaB
 z7Gg0vh1gc+yrQnVxOh3vuUpD0x2FXo52k74fSYy55aMQlrj77y3w27hzSZ9Prgs
 zIZtVZbTXXQ+SxtuujzdzZKDJ9gJ0+FWyVCe7MxbgP6vBi+7omZ6TPLn0iAS9GQG
 aqLVGVgaIS0kOVvQVR6aAEonbsZZILns8ib4bpPlGIMNgQTCN2A=
 =ml6d
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "More GPIO fixes! We found a late regression in the Intel Merrifield
  driver. Oh well. We fixed it up.

   - Fix a build error in the tools used for kselftest

   - A series of reverts to bring the Intel Merrifield back to working.

  We will likely unrevert the reverts for v5.5 but we can't have v5.4
  broken"

* tag 'gpio-v5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  Revert "gpio: merrifield: Pass irqchip when adding gpiochip"
  Revert "gpio: merrifield: Restore use of irq_base"
  Revert "gpio: merrifield: Move hardware initialization to callback"
  tools: gpio: Use !building_out_of_srctree to determine srctree
2019-11-05 09:23:08 -08:00
Daniel Borkmann 56c1291ee4 bpf: re-fix skip write only files in debugfs
Commit 5bc60de50d ("selftests: bpf: Don't try to read files without
read permission") got reverted as the fix was not working as expected
and real fix came in via 8101e06941 ("selftests: bpf: Skip write
only files in debugfs"). When bpf-next got merged into net-next, the
test_offload.py had a small conflict. Fix the resolution in ae8a76fb8b
iby not reintroducing 5bc60de50d again.

Fixes: ae8a76fb8b ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-04 11:34:34 -08:00
Linus Torvalds 3a69c9e522 USB fixes for 5.4-rc6
The USB sub-maintainers woke up this past week and sent a bunch of tiny
 fixes.  Here are a lot of small patches that that resolve a bunch of
 reported issues in the USB core, drivers, serial drivers, gadget
 drivers, and of course, xhci :)
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXb7SXg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylCyQCgleuRSWwpH3QVRvCXpT/kxqXPkEQAn0ct2ZOi
 oInjMIDpRJ+EuEithFOI
 =P61y
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "The USB sub-maintainers woke up this past week and sent a bunch of
  tiny fixes. Here are a lot of small patches that that resolve a bunch
  of reported issues in the USB core, drivers, serial drivers, gadget
  drivers, and of course, xhci :)

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (31 commits)
  usb: dwc3: gadget: fix race when disabling ep with cancelled xfers
  usb: cdns3: gadget: Fix g_audio use case when connected to Super-Speed host
  usb: cdns3: gadget: reset EP_CLAIMED flag while unloading
  USB: serial: whiteheat: fix line-speed endianness
  USB: serial: whiteheat: fix potential slab corruption
  USB: gadget: Reject endpoints with 0 maxpacket value
  UAS: Revert commit 3ae62a4209 ("UAS: fix alignment of scatter/gather segments")
  usb-storage: Revert commit 747668dbc0 ("usb-storage: Set virt_boundary_mask to avoid SG overflows")
  usbip: Fix free of unallocated memory in vhci tx
  usbip: tools: Fix read_usb_vudc_device() error path handling
  usb: xhci: fix __le32/__le64 accessors in debugfs code
  usb: xhci: fix Immediate Data Transfer endianness
  xhci: Fix use-after-free regression in xhci clear hub TT implementation
  USB: ldusb: fix control-message timeout
  USB: ldusb: use unsigned size format specifiers
  USB: ldusb: fix ring-buffer locking
  USB: Skip endpoints with 0 maxpacket length
  usb: cdns3: gadget: Don't manage pullups
  usb: dwc3: remove the call trace of USBx_GFLADJ
  usb: gadget: configfs: fix concurrent issue between composite APIs
  ...
2019-11-03 08:25:25 -08:00
David S. Miller ae8a76fb8b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-11-02

The following pull-request contains BPF updates for your *net-next* tree.

We've added 30 non-merge commits during the last 7 day(s) which contain
a total of 41 files changed, 1864 insertions(+), 474 deletions(-).

The main changes are:

1) Fix long standing user vs kernel access issue by introducing
   bpf_probe_read_user() and bpf_probe_read_kernel() helpers, from Daniel.

2) Accelerated xskmap lookup, from Björn and Maciej.

3) Support for automatic map pinning in libbpf, from Toke.

4) Cleanup of BTF-enabled raw tracepoints, from Alexei.

5) Various fixes to libbpf and selftests.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02 15:29:58 -07:00
David S. Miller d31e95585c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-02 13:54:56 -07:00
Daniel Borkmann fa553d9b57 bpf, testing: Add selftest to read/write sockaddr from user space
Tested on x86-64 and Ilya was also kind enough to give it a spin on
s390x, both passing with probe_user:OK there. The test is using the
newly added bpf_probe_read_user() to dump sockaddr from connect call
into .bss BPF map and overrides the user buffer via bpf_probe_write_user():

  # ./test_progs
  [...]
  #17 pkt_md_access:OK
  #18 probe_user:OK
  #19 prog_run_xattr:OK
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/90f449d8af25354e05080e82fc6e2d3179da30ea.1572649915.git.daniel@iogearbox.net
2019-11-02 12:45:08 -07:00
Daniel Borkmann 50f9aa44ca bpf, testing: Convert prog tests to probe_read_{user, kernel}{, _str} helper
Use probe read *_{kernel,user}{,_str}() helpers instead of bpf_probe_read()
or bpf_probe_read_user_str() for program tests where appropriate.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/4a61d4b71ce3765587d8ef5cb93afa18515e5b3e.1572649915.git.daniel@iogearbox.net
2019-11-02 12:39:13 -07:00
Daniel Borkmann 6ae08ae3de bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers
The current bpf_probe_read() and bpf_probe_read_str() helpers are broken
in that they assume they can be used for probing memory access for kernel
space addresses /as well as/ user space addresses.

However, plain use of probe_kernel_read() for both cases will attempt to
always access kernel space address space given access is performed under
KERNEL_DS and some archs in-fact have overlapping address spaces where a
kernel pointer and user pointer would have the /same/ address value and
therefore accessing application memory via bpf_probe_read{,_str}() would
read garbage values.

Lets fix BPF side by making use of recently added 3d7081822f ("uaccess:
Add non-pagefault user-space read functions"). Unfortunately, the only way
to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}()
and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}()
helpers are kept as-is to retain their current behavior.

The two *_user() variants attempt the access always under USER_DS set, the
two *_kernel() variants will -EFAULT when accessing user memory if the
underlying architecture has non-overlapping address ranges, also avoiding
throwing the kernel warning via 00c42373d3 ("x86-64: add warning for
non-canonical user access address dereferences").

Fixes: a5e8c07059 ("bpf: add bpf_probe_read_str helper")
Fixes: 2541517c32 ("tracing, perf: Implement BPF programs attached to kprobes")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net
2019-11-02 12:39:12 -07:00
Toke Høiland-Jørgensen 2f4a32cc83 selftests: Add tests for automatic map pinning
This adds a new BPF selftest to exercise the new automatic map pinning
code.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298209.394725.15420085139296213182.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen 57a00f4164 libbpf: Add auto-pinning of maps when loading BPF objects
This adds support to libbpf for setting map pinning information as part of
the BTF map declaration, to get automatic map pinning (and reuse) on load.
The pinning type currently only supports a single PIN_BY_NAME mode, where
each map will be pinned by its name in a path that can be overridden, but
defaults to /sys/fs/bpf.

Since auto-pinning only does something if any maps actually have a
'pinning' BTF attribute set, we default the new option to enabled, on the
assumption that seamless pinning is what most callers want.

When a map has a pin_path set at load time, libbpf will compare the map
pinned at that location (if any), and if the attributes match, will re-use
that map instead of creating a new one. If no existing map is found, the
newly created map will instead be pinned at the location.

Programs wanting to customise the pinning can override the pinning paths
using bpf_map__set_pin_path() before calling bpf_object__load() (including
setting it to NULL to disable pinning of a particular map).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen 196f8487f5 libbpf: Move directory creation into _pin() functions
The existing pin_*() functions all try to create the parent directory
before pinning. Move this check into the per-object _pin() functions
instead. This ensures consistent behaviour when auto-pinning is
added (which doesn't go through the top-level pin_maps() function), at the
cost of a few more calls to mkdir().

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen 4580b25fce libbpf: Store map pin path and status in struct bpf_map
Support storing and setting a pin path in struct bpf_map, which can be used
for automatic pinning. Also store the pin status so we can avoid attempts
to re-pin a map that has already been pinned (or reused from a previous
pinning).

The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is
called with a NULL path argument (which was previously illegal), it will
(un)pin only those maps that have a pin_path set.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk
2019-11-02 12:35:07 -07:00
Toke Høiland-Jørgensen d1b4574a4b libbpf: Fix error handling in bpf_map__reuse_fd()
bpf_map__reuse_fd() was calling close() in the error path before returning
an error value based on errno. However, close can change errno, so that can
lead to potentially misleading error messages. Instead, explicitly store
errno in the err variable before each goto.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk
2019-11-02 12:35:06 -07:00
Linus Torvalds 1204c70d9d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix free/alloc races in batmanadv, from Sven Eckelmann.

 2) Several leaks and other fixes in kTLS support of mlx5 driver, from
    Tariq Toukan.

 3) BPF devmap_hash cost calculation can overflow on 32-bit, from Toke
    Høiland-Jørgensen.

 4) Add an r8152 device ID, from Kazutoshi Noguchi.

 5) Missing include in ipv6's addrconf.c, from Ben Dooks.

 6) Use siphash in flow dissector, from Eric Dumazet. Attackers can
    easily infer the 32-bit secret otherwise etc.

 7) Several netdevice nesting depth fixes from Taehee Yoo.

 8) Fix several KCSAN reported errors, from Eric Dumazet. For example,
    when doing lockless skb_queue_empty() checks, and accessing
    sk_napi_id/sk_incoming_cpu lockless as well.

 9) Fix jumbo packet handling in RXRPC, from David Howells.

10) Bump SOMAXCONN and tcp_max_syn_backlog values, from Eric Dumazet.

11) Fix DMA synchronization in gve driver, from Yangchun Fu.

12) Several bpf offload fixes, from Jakub Kicinski.

13) Fix sk_page_frag() recursion during memory reclaim, from Tejun Heo.

14) Fix ping latency during high traffic rates in hisilicon driver, from
    Jiangfent Xiao.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (146 commits)
  net: fix installing orphaned programs
  net: cls_bpf: fix NULL deref on offload filter removal
  selftests: bpf: Skip write only files in debugfs
  selftests: net: reuseport_dualstack: fix uninitalized parameter
  r8169: fix wrong PHY ID issue with RTL8168dp
  net: dsa: bcm_sf2: Fix IMP setup for port different than 8
  net: phylink: Fix phylink_dbg() macro
  gve: Fixes DMA synchronization.
  inet: stop leaking jiffies on the wire
  ixgbe: Remove duplicate clear_bit() call
  Documentation: networking: device drivers: Remove stray asterisks
  e1000: fix memory leaks
  i40e: Fix receive buffer starvation for AF_XDP
  igb: Fix constant media auto sense switching when no cable is connected
  net: ethernet: arc: add the missed clk_disable_unprepare
  igb: Enable media autosense for the i350.
  igb/igc: Don't warn on fatal read failures when the device is removed
  tcp: increase tcp_max_syn_backlog max value
  net: increase SOMAXCONN to 4096
  netdevsim: Fix use-after-free during device dismantle
  ...
2019-11-01 17:48:11 -07:00
Jakub Kicinski 8101e06941 selftests: bpf: Skip write only files in debugfs
DebugFS for netdevsim now contains some "action trigger" files
which are write only. Don't try to capture the contents of those.

Note that we can't use os.access() because the script requires
root.

Fixes: 4418f862d6 ("netdevsim: implement support for devlink region and snapshots")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 15:16:01 -07:00
Wei Wang d64479a3e3 selftests: net: reuseport_dualstack: fix uninitalized parameter
This test reports EINVAL for getsockopt(SOL_SOCKET, SO_DOMAIN)
occasionally due to the uninitialized length parameter.
Initialize it to fix this, and also use int for "test_family" to comply
with the API standard.

Fixes: d6a61f80b8 ("soreuseport: test mixed v4/v6 sockets")
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Cc: Craig Gallek <cgallek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 15:11:02 -07:00
Roman Mashak c23fcbbc6a tc-testing: added tests with cookie for conntrack TC action
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-01 14:53:41 -07:00
Jakub Kicinski 75b0bfd2e1 Revert "selftests: bpf: Don't try to read files without read permission"
This reverts commit 5bc60de50d ("selftests: bpf: Don't try to read
files without read permission").

Quoted commit does not work at all, and was never tested.
Script requires root permissions (and tests for them)
and os.access() will always return true for root.

The correct fix is needed in the bpf tree, so let's just
revert and save ourselves the merge conflict.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Pirko <jiri@resnulli.us>
Link: https://lore.kernel.org/bpf/20191101005127.1355-1-jakub.kicinski@netronome.com
2019-11-01 13:13:21 +01:00
Björn Töpel 6bd7cf6657 perf tools: Make usage of test_attr__* optional for perf-sys.h
For users of perf-sys.h outside perf, e.g. samples/bpf/bpf_load.c, it's
convenient not to depend on test_attr__*.

After commit 91854f9a07 ("perf tools: Move everything related to
sys_perf_event_open() to perf-sys.h"), all users of perf-sys.h will
depend on test_attr__enabled and test_attr__open.

This commit enables a user to define HAVE_ATTR_TEST to zero in order
to omit the test dependency.

Fixes: 91854f9a07 ("perf tools: Move everything related to sys_perf_event_open() to perf-sys.h")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: http://lore.kernel.org/bpf/20191001113307.27796-2-bjorn.topel@gmail.com
2019-10-31 21:38:41 +01:00
Alexei Starovoitov 12a8654b2e libbpf: Add support for prog_tracing
Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org
2019-10-31 15:16:59 +01:00
Vlad Buslov 9ae6b78708 tc-testing: implement tests for new fast_init action flag
Add basic tests to verify action creation with new fast_init flag for all
actions that support the flag.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 18:07:51 -07:00
Ilya Leoshkevich 7541c87c9b bpf: Allow narrow loads of bpf_sysctl fields with offset > 0
"ctx:file_pos sysctl:read read ok narrow" works on s390 by accident: it
reads the wrong byte, which happens to have the expected value of 0.
Improve the test by seeking to the 4th byte and expecting 4 instead of
0.

This makes the latent problem apparent: the test attempts to read the
first byte of bpf_sysctl.file_pos, assuming this is the least-significant
byte, which is not the case on big-endian machines: a non-zero offset is
needed.

The point of the test is to verify narrow loads, so we cannot cheat our
way out by simply using BPF_W. The existence of the test means that such
loads have to be supported, most likely because llvm can generate them.
Fix the test by adding a big-endian variant, which uses an offset to
access the least-significant byte of bpf_sysctl.file_pos.

This reveals the final problem: verifier rejects accesses to bpf_sysctl
fields with offset > 0. Such accesses are already allowed for a wide
range of structs: __sk_buff, bpf_sock_addr and sk_msg_md to name a few.
Extend this support to bpf_sysctl by using bpf_ctx_range instead of
offsetof when matching field offsets.

Fixes: 7b146cebe3 ("bpf: Sysctl hook")
Fixes: e1550bfe0d ("bpf: Add file_pos field to bpf_sysctl ctx")
Fixes: 9a1027e525 ("selftests/bpf: Test file_pos field in bpf_sysctl ctx")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028122902.9763-1-iii@linux.ibm.com
2019-10-30 12:49:13 -07:00
Roman Mashak c4917bfc3a tc-testing: fixed two failing pedit tests
Two pedit tests were failing due to incorrect operation
value in matchPattern, should be 'add' not 'val', so fix it.

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-30 12:38:08 -07:00
Ilya Leoshkevich 9ffccb7606 selftests/bpf: Test narrow load from bpf_sysctl.write
There are tests for full and narrows loads from bpf_sysctl.file_pos, but
for bpf_sysctl.write only full load is tested. Add the missing test.

Suggested-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20191029143027.28681-1-iii@linux.ibm.com
2019-10-30 16:24:06 +01:00
Andrii Nakryiko a566e35f1e libbpf: Don't use kernel-side u32 type in xsk.c
u32 is a kernel-side typedef. User-space library is supposed to use __u32.
This breaks Github's projection of libbpf. Do u32 -> __u32 fix.

Fixes: 94ff9ebb49 ("libbpf: Fix compatibility for kernels without need_wakeup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com
2019-10-29 06:38:12 -07:00
Andrii Nakryiko d3a3aa0c59 libbpf: Fix off-by-one error in ELF sanity check
libbpf's bpf_object__elf_collect() does simple sanity check after iterating
over all ELF sections, if checks that .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks for cases when .strtab
ends up being the very last section in ELF.

Fixes: 77ba9a5b48 ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
2019-10-28 20:27:40 -07:00
Magnus Karlsson 94ff9ebb49 libbpf: Fix compatibility for kernels without need_wakeup
When the need_wakeup flag was added to AF_XDP, the format of the
XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the
kernel to take care of compatibility issues arrising from running
applications using any of the two formats. However, libbpf was
not extended to take care of the case when the application/libbpf
uses the new format but the kernel only supports the old
format. This patch adds support in libbpf for parsing the old
format, before the need_wakeup flag was added, and emulating a
set of static need_wakeup flags that will always work for the
application.

v2 -> v3:
* Incorporated code improvements suggested by Jonathan Lemon

v1 -> v2:
* Rebased to bpf-next
* Rewrote the code as the previous version made you blind

Fixes: a4500432c2 ("libbpf: add support for need_wakeup flag in AF_XDP part")
Reported-by: Eloy Degen <degeneloy@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com
2019-10-28 20:25:32 -07:00
GwanYeong Kim 28df0642ab usbip: tools: Fix read_usb_vudc_device() error path handling
This isn't really accurate right. fread() doesn't always
return 0 in error. It could return < number of elements
and set errno.

Signed-off-by: GwanYeong Kim <gy741.kim@gmail.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20191018032223.4644-1-gy741.kim@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-10-28 17:51:06 +01:00
Ilya Leoshkevich e93d99180a selftests/bpf: Restore $(OUTPUT)/test_stub.o rule
`make O=/linux-build kselftest TARGETS=bpf` fails with

	make[3]: *** No rule to make target '/linux-build/bpf/test_stub.o', needed by '/linux-build/bpf/test_verifier'

The same command without the O= part works, presumably thanks to the
implicit rule.

Fix by restoring the explicit $(OUTPUT)/test_stub.o rule.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102110.7545-1-iii@linux.ibm.com
2019-10-28 16:16:10 +01:00
Ilya Leoshkevich 313e7f6fb1 selftest/bpf: Use -m{little, big}-endian for clang
When cross-compiling tests from x86 to s390, the resulting BPF objects
fail to load due to endianness mismatch.

Fix by using BPF-GCC endianness check for clang as well.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102049.7489-1-iii@linux.ibm.com
2019-10-28 16:15:51 +01:00
Linus Torvalds a8a31fdcca Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A set of perf fixes:

  kernel:

   - Unbreak the tracking of auxiliary buffer allocations which got
     imbalanced causing recource limit failures.

   - Fix the fallout of splitting of ToPA entries which missed to shift
     the base entry PA correctly.

   - Use the correct context to lookup the AUX event when unmapping the
     associated AUX buffer so the event can be stopped and the buffer
     reference dropped.

  tools:

   - Fix buildiid-cache mode setting in copyfile_mode_ns() when copying
     /proc/kcore

   - Fix freeing id arrays in the event list so the correct event is
     closed.

   - Sync sched.h anc kvm.h headers with the kernel sources.

   - Link jvmti against tools/lib/ctype.o to have weak strlcpy().

   - Fix multiple memory and file descriptor leaks, found by coverity in
     perf annotate.

   - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found
     by a static analysis tool"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/aux: Fix AUX output stopping
  perf/aux: Fix tracking of auxiliary trace buffer allocation
  perf/x86/intel/pt: Fix base for single entry topa
  perf kmem: Fix memory leak in compact_gfp_flags()
  tools headers UAPI: Sync sched.h with the kernel
  tools headers kvm: Sync kvm.h headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  perf c2c: Fix memory leak in build_cl_output()
  perf tools: Fix mode setting in copyfile_mode_ns()
  perf annotate: Fix multiple memory and file descriptor leaks
  perf tools: Fix resource leak of closedir() on the error paths
  perf evlist: Fix fix for freed id arrays
  perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()
2019-10-27 06:59:34 -04:00