Commit Graph

949714 Commits

Author SHA1 Message Date
Jiri Olsa 193a983c5b tools resolve_btfids: Add size check to get_id function
To make sure we don't crash on malformed symbols.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825192124.710397-2-jolsa@kernel.org
2020-08-25 15:37:41 -07:00
Alexei Starovoitov 2532f849b5 bpf: Disallow BPF_PRELOAD in allmodconfig builds
The CC_CAN_LINK checks that the host compiler can link, but bpf_preload
relies on libbpf which in turn needs libelf to be present during linking.
allmodconfig runs in odd setups with cross compilers and missing host
libraries like libelf. Instead of extending kconfig with every possible
library that bpf_preload might need disallow building BPF_PRELOAD in
such build-only configurations.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-08-25 15:23:46 -07:00
KP Singh cd324d7abb bpf: Add selftests for local_storage
inode_local_storage:

* Hook to the file_open and inode_unlink LSM hooks.
* Create and unlink a temporary file.
* Store some information in the inode's bpf_local_storage during
  file_open.
* Verify that this information exists when the file is unlinked.

sk_local_storage:

* Hook to the socket_post_create and socket_bind LSM hooks.
* Open and bind a socket and set the sk_storage in the
  socket_post_create hook using the start_server helper.
* Verify if the information is set in the socket_bind hook.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-8-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh 30897832d8 bpf: Allow local storage to be used from LSM programs
Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used
in LSM programs. These helpers are not used for tracing programs
(currently) as their usage is tied to the life-cycle of the object and
should only be used where the owning object won't be freed (when the
owning object is passed as an argument to the LSM hook). Thus, they
are safer to use in LSM hooks than tracing. Usage of local storage in
tracing programs will probably follow a per function based whitelist
approach.

Since the UAPI helper signature for bpf_sk_storage expect a bpf_sock,
it, leads to a compilation warning for LSM programs, it's also updated
to accept a void * pointer instead.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-7-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh 8ea636848a bpf: Implement bpf_local_storage for inodes
Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of storage is managed with the life-cycle of the inode.
i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob which are now stackable and can co-exist with other LSMs.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh 450af8d0f6 bpf: Split bpf_local_storage to bpf_sk_storage
A purely mechanical change:

	bpf_sk_storage.c = bpf_sk_storage.c + bpf_local_storage.c
	bpf_sk_storage.h = bpf_sk_storage.h + bpf_local_storage.h

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-5-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh f836a56e84 bpf: Generalize bpf_sk_storage
Refactor the functionality in bpf_sk_storage.c so that concept of
storage linked to kernel objects can be extended to other objects like
inode, task_struct etc.

Each new local storage will still be a separate map and provide its own
set of helpers. This allows for future object specific extensions and
still share a lot of the underlying implementation.

This includes the changes suggested by Martin in:

  https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/

adding new map operations to support bpf_local_storage maps:

* storages for different kernel objects to optionally have different
  memory charging strategy (map_local_storage_charge,
  map_local_storage_uncharge)
* Functionality to extract the storage pointer from a pointer to the
  owning object (map_owner_storage_ptr)

Co-developed-by: Martin KaFai Lau <kafai@fb.com>

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-4-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
KP Singh 4cc9ce4e73 bpf: Generalize caching for sk_storage.
Provide the a ability to define local storage caches on a per-object
type basis. The caches and caching indices for different objects should
not be inter-mixed as suggested in:

  https://lore.kernel.org/bpf/20200630193441.kdwnkestulg5erii@kafai-mbp.dhcp.thefacebook.com/

  "Caching a sk-storage at idx=0 of a sk should not stop an
  inode-storage to be cached at the same idx of a inode."

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-3-kpsingh@chromium.org
2020-08-25 15:00:03 -07:00
KP Singh 1f00d375af bpf: Renames in preparation for bpf_local_storage
A purely mechanical change to split the renaming from the actual
generalization.

Flags/consts:

  SK_STORAGE_CREATE_FLAG_MASK	BPF_LOCAL_STORAGE_CREATE_FLAG_MASK
  BPF_SK_STORAGE_CACHE_SIZE	BPF_LOCAL_STORAGE_CACHE_SIZE
  MAX_VALUE_SIZE		BPF_LOCAL_STORAGE_MAX_VALUE_SIZE

Structs:

  bucket			bpf_local_storage_map_bucket
  bpf_sk_storage_map		bpf_local_storage_map
  bpf_sk_storage_data		bpf_local_storage_data
  bpf_sk_storage_elem		bpf_local_storage_elem
  bpf_sk_storage		bpf_local_storage

The "sk" member in bpf_local_storage is also updated to "owner"
in preparation for changing the type to void * in a subsequent patch.

Functions:

  selem_linked_to_sk			selem_linked_to_storage
  selem_alloc				bpf_selem_alloc
  __selem_unlink_sk			bpf_selem_unlink_storage_nolock
  __selem_link_sk			bpf_selem_link_storage_nolock
  selem_unlink_sk			__bpf_selem_unlink_storage
  sk_storage_update			bpf_local_storage_update
  __sk_storage_lookup			bpf_local_storage_lookup
  bpf_sk_storage_map_free		bpf_local_storage_map_free
  bpf_sk_storage_map_alloc		bpf_local_storage_map_alloc
  bpf_sk_storage_map_alloc_check	bpf_local_storage_map_alloc_check
  bpf_sk_storage_map_check_btf		bpf_local_storage_map_check_btf

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-2-kpsingh@chromium.org
2020-08-25 14:59:58 -07:00
Joe Perches ca65a280fb sunrpc: Avoid comma separated statements
Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:54:19 -07:00
Joe Perches dee847793f ipv6: fib6: Avoid comma separated statements
Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:54:19 -07:00
Joe Perches ac937e1f7d wan: sbni: Avoid comma separated statements
Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:54:19 -07:00
Joe Perches 2d59079ff7 fs_enet: Avoid comma separated statements
Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:54:19 -07:00
Joe Perches e7fee115bf 8390: Avoid comma separated statements
Use semicolons and braces.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 07:54:19 -07:00
Miaohe Lin 343d8c6014 net: clean up codestyle for net/ipv4
This is a pure codestyle cleanup patch. Also add a blank line after
declarations as warned by checkpatch.pl.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:28:02 -07:00
Miaohe Lin fdf1923bf9 net: Remove duplicated midx check against 0
Check midx against 0 is always equal to check midx against sk_bound_dev_if
when sk_bound_dev_if is known not equal to 0 in these case.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:23:59 -07:00
Miaohe Lin 0ce779a9f5 net: Avoid unnecessary inet_addr_type() call when addr is INADDR_ANY
We can avoid unnecessary inet_addr_type() call by check addr against
INADDR_ANY first.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:20:10 -07:00
Miaohe Lin 0316a21116 net: Set ping saddr after we successfully get the ping port
We can defer set ping saddr until we successfully get the ping port. So we
can avoid clear saddr when failed. Since ping_clear_saddr() is not used
anymore now, remove it.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:18:13 -07:00
Raju Rangoju cca852831c cxgb4: add error handlers to LE intr_handler
cxgb4 does not look for HASHTBLMEMCRCERR and CMDTIDERR
bits in LE_DB_INT_CAUSE register, but these are enabled
in LE_DB_INT_ENABLE. So, add error handlers to LE
interrupt handler to emit a warning or alert message
for hash table mem crc and cmd tid errors

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:07:44 -07:00
Miaohe Lin 4718a471f1 netlink: remove duplicated nla_need_padding_for_64bit() check
The need for padding 64bit is implicitly checked by nla_align_64bit(), so
remove this explicit one.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:06:19 -07:00
Miaohe Lin 8b4510d76c net: gain ipv4 mtu when mtu is not locked
When mtu is locked, we should not obtain ipv4 mtu as we return immediately
in this case and leave acquired ipv4 mtu unused.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-25 06:04:39 -07:00
Yonghong Song 0fcdfffe80 selftests/bpf: Enable tc verbose mode for test_sk_assign
Currently test_sk_assign failed verifier with llvm11/llvm12.
During debugging, I found the default verifier output is
truncated like below
  Verifier analysis:

  Skipped 2200 bytes, use 'verb' option for the full verbose log.
  [...]
  off=23,r=34,imm=0) R5=inv0 R6=ctx(id=0,off=0,imm=0) R7=pkt(id=0,off=0,r=34,imm=0) R10=fp0
  80: (0f) r7 += r2
  last_idx 80 first_idx 21
  regs=4 stack=0 before 78: (16) if w3 == 0x11 goto pc+1
when I am using "./test_progs -vv -t assign".

The reason is tc verbose mode is not enabled.

This patched enabled tc verbose mode and the output looks like below
  Verifier analysis:

  0: (bf) r6 = r1
  1: (b4) w0 = 2
  2: (61) r1 = *(u32 *)(r6 +80)
  3: (61) r7 = *(u32 *)(r6 +76)
  4: (bf) r2 = r7
  5: (07) r2 += 14
  6: (2d) if r2 > r1 goto pc+61
   R0_w=inv2 R1_w=pkt_end(id=0,off=0,imm=0) R2_w=pkt(id=0,off=14,r=14,imm=0)
  ...

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200824222807.100200-1-yhs@fb.com
2020-08-24 21:15:13 -07:00
Daniel T. Lee f0c328f8af samples: bpf: Refactor tracepoint tracing programs with libbpf
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing tracepoint tracing programs with libbbpf
bpf loader.

    - Adding a tracepoint event and attaching a bpf program to it was done
    through bpf_program_attach().
    - Instead of using the existing BPF MAP definition, MAP definition
    has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-4-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
Daniel T. Lee 3677d0a131 samples: bpf: Refactor kprobe tracing programs with libbpf
For the problem of increasing fragmentation of the bpf loader programs,
instead of using bpf_loader.o, which is used in samples/bpf, this
commit refactors the existing kprobe tracing programs with libbbpf
bpf loader.

    - For kprobe events pointing to system calls, the SYSCALL() macro in
    trace_common.h was used.
    - Adding a kprobe event and attaching a bpf program to it was done
    through bpf_program_attach().
    - Instead of using the existing BPF MAP definition, MAP definition
    has been refactored with the new BTF-defined MAP format.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-3-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
Daniel T. Lee 35a8b6dd33 samples: bpf: Cleanup bpf_load.o from Makefile
Since commit cc7f641d63 ("samples: bpf: Refactor BPF map performance
test with libbpf") has ommited the removal of bpf_load.o from Makefile,
this commit removes the bpf_load.o rule for targets where bpf_load.o is
not used.

Fixes: cc7f641d63 ("samples: bpf: Refactor BPF map performance test with libbpf")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200823085334.9413-2-danieltimlee@gmail.com
2020-08-24 20:59:35 -07:00
David S. Miller 079f921e9f This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - Drop unused function batadv_hardif_remove_interfaces(),
    by Sven Eckelmann
 
  - delete duplicated words, by Randy Dunlap
 
  - Drop (even more) repeated words in comments, by Sven Eckelmann
 
  - Migrate to linux/prandom.h, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl9D6i0WHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoUU7D/sEZhE7ht0F7tf5hyl8AlCaPpS4
 Yd2IwrI1m4Vc8Z0gd4d3MtdCAilGsLVY32PY/+MQBRktvEZ+Ms9CD+MaVJJiCWnU
 OxhBZxNYeE/omP/nPZgwub/kHcIojGd/1XTy6Gyp2IGroOvAB3BA0acLsZsAoFrR
 /AmRLFFajnYgodn0yuWeomDjOaUoe5n4GCpWutlb6bLRA9t1B2YMWtRj5vXJO+Ht
 EIxBw3Gvz/b7eVsbo0VQMpsOju8OM9f6XZw66NnS2ekEepIBrkJzIi8fa97H7hy6
 Ef6XkwmZhLgQQbsylVZSsynkNoB6cVOPDg/k4/BrErYTMnqAoh2U6v8IFOssmTKX
 Od5FBRqcXnRB2+/sPq7+VodlFXTqSriJGdlwCyGyg7dU4FPPdbdwubdJXWMjwe9A
 7yMrty3VRLmVAwd5JuefNekyzEXXf2Pd/HHhDdgnLJ/LkJAgkswpmbjkz8kNZAwp
 69MbFiC027O4s1l8PdiREjfAcMG1SicQ08LhBA8OV2ZUQ7IcYlg0Hq/UQNTEuJni
 yPFnzA5n/KtbY5sJGA+oYHGfQiNLx3xmM/GNl5YpfYrDSg8srcd+GTvwy2Vzzzp1
 2Q7KSxPHVJDdcLPgyUlfTXp27870/aPY0Q+8r5fKjPgq9Fj4ZAuop7FtY8XSzg6R
 3N4lPNKQS38UUPfjOg==
 =1GXQ
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-for-davem-20200824' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Drop unused function batadv_hardif_remove_interfaces(),
   by Sven Eckelmann

 - delete duplicated words, by Randy Dunlap

 - Drop (even more) repeated words in comments, by Sven Eckelmann

 - Migrate to linux/prandom.h, by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:18:22 -07:00
David S. Miller 64d123fc25 Merge branch 'Add-PTP-support-for-Octeontx2'
Subbaraya Sundeep says:

====================
Add PTP support for Octeontx2

This patchset adds PTP support for Octeontx2 platform.
PTP is an independent coprocessor block from which
CGX block fetches timestamp and prepends it to the
packet before sending to NIX block. Patches are as
follows:

Patch 1: Patch to enable/disable packet timstamping
         in CGX upon mailbox request. It also adjusts
         packet parser (NPC) for the 8 bytes timestamp
         appearing before the packet.

Patch 2: Patch adding PTP pci driver which configures
         the PTP block and hooks up to RVU AF driver.
         It also exposes a mailbox call to adjust PTP
         hardware clock.

Patch 3: Patch adding PTP clock driver for PF netdev.

v8:
 Added missing header file reported by kernel test robot
 in patch 2
v7:
 As per Jesse Brandeburg comments:
 Simplified functions in patch 1
 Replaced magic numbers with macros
 Added Copyrights
 Added code comments wherever required
 Modified commit description of patch 2
v6:
 Resent after net-next is open
v5:
 As suggested by David separated the fix (adding rtnl lock/unlock)
 and submitted to net.
 https://www.spinics.net/lists/netdev/msg669617.html
v4:
 Added rtnl_lock/unlock in otx2_reset to protect against
 network stack ndo_open and close calls
 Added NULL check after ptp_clock_register in otx2_ptp.c
v3:
 Fixed sparse error in otx2_txrx.c
 Removed static inlines in otx2_txrx.c
v2:
 Fixed kernel build robot reported error by
 adding timecounter.h to otx2_common.h
====================

Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:16:28 -07:00
Aleksey Makarov c9c12d339d octeontx2-pf: Add support for PTP clock
This patch adds PTP clock and uses it in Octeontx2
network device. PTP clock uses mailbox calls to
access the hardware counter on the RVU side.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:15:45 -07:00
Aleksey Makarov 4086f2a06a octeontx2-af: Add support for Marvell PTP coprocessor
Precision Timestamping block found on Octeontx2
platform is an independent coprocessor and has
internal PTP hardware clock. Once configured PTP
runs independently and when a packet arrives
CGX hardware block gets the current timestamp
from PTP block and forwards the packet to NIX
by prepending timestamp to the packet.
This patch adds the pci driver for PTP block.
The driver gets registered by AF driver and does
initial configuration and exposes a mailbox function to
read and adjust PTP hardware clock. The mailbox function
is called by AF consumers like netdev drivers or
userspace drivers. Since PTP being a single block
in platform this driver helps in accessing PTP
block by any AF consumer.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:15:45 -07:00
Zyta Szpak 421572175b octeontx2-af: Support to enable/disable HW timestamping
Four new mbox messages ids and handler are added in order to
enable or disable timestamping procedure on tx and rx side.
Additionally when PTP is enabled, the packet parser must skip
over 8 bytes and start analyzing packet data there. To make NPC
profiles work seemlesly PTR_ADVANCE of IKPU is set so that
parsing can be done as before when all data pointers
are shifted by 8 bytes automatically.

Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Zyta Szpak <zyta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:15:45 -07:00
Miaohe Lin 373c15c2e9 net: Use helper macro RT_TOS() in __icmp_send()
Use helper macro RT_TOS() to get tos in __icmp_send().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:12:36 -07:00
Miaohe Lin 7551144978 net: Avoid access icmp_err_convert when icmp code is ICMP_FRAG_NEEDED
There is no need to fetch errno and fatal info from icmp_err_convert when
icmp code is ICMP_FRAG_NEEDED.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:11:43 -07:00
David S. Miller 0caeba3d3c Merge branch 'qed-introduce-devlink-health-support'
Igor Russkikh says:

====================
qed: introduce devlink health support

This is a followup implementation after series

https://patchwork.ozlabs.org/project/netdev/cover/20200514095727.1361-1-irusskikh@marvell.com/

This is an implementation of devlink health infrastructure.

With this we are now able to report HW errors to devlink, and it'll take
its own actions depending on user configuration to capture and store the
dump at the bad moment, and to request the driver to recover the device.

So far we do not differentiate global device failures or specific PCI
function failures. This means that some errors specific to one physical
function will affect an entire device. This is not yet fully designed
and verified, will followup in future.

Solution was verified with artificial HW errors generated, existing
tools for dump analysis could be used.

v7: comments from Jesse and Jakub
 - p2: extra edev check
 - p9: removed extra indents
v6: patch 4: changing serial to board.serial and fw to fw.app
v5: improved patch 4 description
v4:
 - commit message and other fixes after Jiri's comments
 - removed one patch (will send to net)
v3: fix uninit var usage in patch 11
v2: fix #include issue from kbuild test robot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh adc100d098 qede: make driver reliable on unload after failures
In case recovery was not successful, netdev still should be
present. But we should clear cdev if something bad happens
on recovery.

We also check cdev for null on dev close. That could be a case
if recovery was not successful.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh c5c642c55e qed: align adjacent indent
Remove extra indent on some of adjacent declarations.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh 27fed78737 qed: implement devlink dump
Gather and push out full device dump to devlink.
Device dump is the same as with `ethtool -d`, but now its generated
exactly at the moment bad thing happens.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh b228cb1602 qed*: make use of devlink recovery infrastructure
Remove forcible recovery trigger and put it as a normal devlink
callback.

This allows user to enable/disable it via

    devlink health set pci/0000:03:00.0 reporter fw_fatal auto_recover false

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh 4f5a8db27e qed: use devlink logic to report errors
Use devlink_health_report to push error indications.
We implement this in qede via callback function to make it possible
to reuse the same for other drivers sitting on top of qed in future.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:33 -07:00
Igor Russkikh 9524067b9a qed: health reporter init deinit seq
Here we declare health reporter ops (empty for now)
and register these in qed probe and remove callbacks.

This way we get devlink attached to all kind of qed* PCI
device entities: networking or storage offload entity.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:32 -07:00
Igor Russkikh 53916a67c3 qed: implement devlink info request
Here we return existing fw & mfw versions, we also fetch device's
serial number:

~$ sudo ~/iproute2/devlink/devlink  dev info
pci/0000:01:00.1:
  driver qed
  board.serial_number REE1915E44552
  versions:
      running:
        fw.app 8.42.2.0
      stored:
        fw.mgmt 8.52.10.0

MFW and FW are different firmwares on device.
Management is a firmware responsible for link configuration and
various control plane features. Its permanent and resides in NVM.

Running FW (or fastpath FW) is an embedded microprogram implementing
all the packet processing, offloads, etc. This FW is being loaded
on each start by the driver from FW binary blob.

The base device specific structure (qed_dev_info) was not directly
available to the base driver before. Thus, here we create and store
a private copy of this structure in qed_dev root object to
access the data.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:32 -07:00
Igor Russkikh b75d05b2da qed: fix kconfig help entries
This patch replaces stubs in kconfig help entries with an actual description.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:32 -07:00
Igor Russkikh 755f982bb1 qed/qede: make devlink survive recovery
Devlink instance lifecycle was linked to qed_dev object,
that caused devlink to be recreated on each recovery.

Changing it by making higher level driver (qede) responsible for its
life. This way devlink now survives recoveries.

qede now stores devlink structure pointer as a part of its device
object, devlink private data contains a linkage structure,
qed_devlink.

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:32 -07:00
Igor Russkikh 52306dee54 qed: move out devlink logic into a new file
We are extending devlink infrastructure, thus move the existing
stuff into a new file qed_devlink.c

Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 18:01:32 -07:00
Christophe JAILLET 9ab9017948 chelsio: switch from 'pci_' to 'dma_' API
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'free_rx_resources()' and
'alloc_tx_resources()' (sge.c) GFP_KERNEL can be used because it is
already used in these functions.

Moreover, they can only be called from a .ndo_open	function. So it is
guarded by the 'rtnl_lock()', which is a mutex.

While at it, a pr_err message in 'init_one()' has been updated accordingly
(s/consistent/coherent).

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:52:07 -07:00
David S. Miller f6d89dc51e Merge branch 'mlxsw-Misc-updates'
Ido Schimmel says:

====================
mlxsw: Misc updates

This patch set includes various updates for mlxsw.

Patches #1-#4 adjust the default burst size of packet trap policers to
conform to Spectrum-{2,3} requirements. The corresponding selftest is
also adjusted so that it could reliably pass on these platforms.

Patch #5 adjusts a selftest so that it could pass with both old and new
versions of mausezahn.

Patch #6 significantly reduces the runtime of tc-police scale test by
changing the preference and masks of the used tc filters.

Patch #7 prevents the driver from trying to set invalid ethtool link
modes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00
Danielle Ratson 5bf01b571c mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register
The PTYS register is used to report and configure the port type and
speed. Currently, internal bits in the register are used the same way
other bits are used.

Using the internal bits can cause bad parameter firmware errors. For
example, trying to write to internal bit 25 returns:

EMAD reg access failed (tid=53e2bffa00004310,reg_id=5004(ptys),type=write,status=7(bad parameter))

Remove the internal bits from the PTYS register, so that it is no longer
possible to pass them to firmware.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00
Ido Schimmel ffff9c9cb4 selftests: mlxsw: Reduce runtime of tc-police scale test
Currently, the test takes about 626 seconds to complete because of an
inefficient use of the device's TCAM. Reduce the runtime to 202 seconds
by inserting all the flower filters with the same preference and mask,
but with a different key.

In particular, this reduces the deletion of the qdisc (which triggers
the deletion of all the filters) from 66 seconds to 0.2 seconds. This
prevents various netlink requests from user space applications (e.g.,
systemd-networkd) from timing-out because RTNL is not held for too long
anymore.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00
Danielle Ratson 24f54c5225 selftests: forwarding: Fix mausezahn delay parameter in mirror_test()
Currently, mausezahn delay parameter in mirror_test() is specified with
'ms' units.

mausezahn versions before 0.6.5 interpret 'ms' as seconds and therefore
the tests that use mirror_test() take a very long time to complete.

Resolve this by specifying 'msec' units.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00
Ido Schimmel b36cca02dc selftests: mlxsw: Increase burst size for burst test
The current combination of rate and burst size does not adhere to
Spectrum-{2,3} limitation which states that the minimum burst size
should be 40% of the rate.

Increase the burst size in order to honor above mentioned limitation and
avoid intermittent failures of this test case on Spectrum-{2,3}.

Remove the first sub-test case as the variation in number of received
packets is simply too large to reliably test it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00
Ido Schimmel 8e0d8ce4fc selftests: mlxsw: Increase burst size for rate test
The current combination of rate and burst size does not adhere to
Spectrum-{2,3} limitation which states that the minimum burst size
should be 40% of the rate.

Increase the burst size in order to honor above mentioned limitation and
avoid intermittent failures of this test case on Spectrum-{2,3}.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-24 17:36:11 -07:00