David S. Miller c476299312 Merge branch 'skbuff-introduce-skbuff_heads-bulking-and-reusing'
Alexander Lobakin says:

====================
skbuff: introduce skbuff_heads bulking and reusing

Currently, all kinds of skb allocation obtain skbuff_heads
one by one via kmem_cache_alloc().
On the other hand, we have the per-CPU napi_alloc_cache, which stores
skbuff_heads queued up for freeing and flushes them in bulk.

We can use this cache not only for bulk-wiping, but also to obtain
heads for new skbs and avoid unconditional allocations, as well as
for bulk-allocating (like XDP's cpumap code and veth driver already
do).
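
The reuse idea above can be modelled in plain userspace C. This is only
a rough sketch, not the kernel code: struct head_cache, cache_get(),
cache_put() and the backing_*() helpers are hypothetical names, malloc()
and free() stand in for the slab allocator, and the cache capacity of 64
with a half-flush policy is an assumption. Only the refill batch of 16
is taken from the changelog further down.

```c
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 64               /* assumed capacity of the model cache */
#define BULK_SIZE  16               /* refill batch, per the v3 changelog */
#define HALF_SIZE  (CACHE_SIZE / 2) /* assumed flush amount when full */

struct head { char pad[64]; };      /* stand-in for an skbuff_head */

struct head_cache {
	unsigned int count;           /* heads currently cached */
	unsigned long backing_allocs; /* heads taken from the allocator */
	unsigned long backing_frees;  /* heads returned to the allocator */
	void *slots[CACHE_SIZE];
};

/* Stand-in for a bulk slab allocation: fill up to n slots. */
static unsigned int backing_alloc_bulk(struct head_cache *c, void **out,
				       unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		out[i] = malloc(sizeof(struct head));
		if (!out[i])
			break;
		c->backing_allocs++;
	}
	return i;
}

/* Stand-in for a bulk slab free. */
static void backing_free_bulk(struct head_cache *c, void **p, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		free(p[i]);
		c->backing_frees++;
	}
}

/* Take a head from the cache, refilling it by one batch when empty. */
static struct head *cache_get(struct head_cache *c)
{
	if (!c->count)
		c->count = backing_alloc_bulk(c, c->slots, BULK_SIZE);
	if (!c->count)
		return NULL;
	return c->slots[--c->count];
}

/* Queue a head for reuse; flush the upper half once the cache is full. */
static void cache_put(struct head_cache *c, struct head *h)
{
	c->slots[c->count++] = h;
	if (c->count == CACHE_SIZE) {
		backing_free_bulk(c, &c->slots[HALF_SIZE], HALF_SIZE);
		c->count = HALF_SIZE;
	}
}
```

In this model, a loop that repeatedly takes one head and puts it back
touches the backing allocator only for the first batch of 16; every
later request is served from the cache.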

As this might affect latencies, cache pressure and lots of hardware
and driver-dependent stuff, this new feature is mostly optional and
can be used via:
 - a new napi_build_skb() function (as a replacement for build_skb());
 - existing {,__}napi_alloc_skb() and napi_get_frags() functions;
 - __alloc_skb() with passing SKB_ALLOC_NAPI in flags.

iperf3 showed 35-70 Mbps bumps for both TCP and UDP while performing
VLAN NAT on a 1.2 GHz MIPS board. The boost is likely to be bigger
on more powerful hosts and NICs with tens of Mpps.

Note on skbuff_heads from distant slabs or pfmemalloc'ed slabs:
 - kmalloc()/kmem_cache_alloc() itself by default allows allocating
   memory from remote nodes to defragment their slabs. This is
   controlled by sysctl, but given this default, an skbuff_head from a
   remote node is an OK case;
 - The easiest way to check if the slab of skbuff_head is remote or
   pfmemalloc'ed is:

	if (!dev_page_is_reusable(virt_to_head_page(skb)))
		/* drop it */;

   ...*but*, given that most slabs are built of compound pages,
   virt_to_head_page() will hit its unlikely branch on every single
   call. This check cost at least 20 Mbps in test scenarios, so it
   seems better to _not_ do it.

Since v5 [4]:
 - revert flags-to-bool conversion and simplify flags testing in
   __alloc_skb() (Alexander Duyck).

Since v4 [3]:
 - rebase on top of net-next and address kernel build robot issue;
 - reorder checks a bit in __alloc_skb() to make new condition even
   more harmless.

Since v3 [2]:
 - make the feature mostly optional, so driver developers could
   decide whether to use it or not (Paolo Abeni).
   This reuses the old flag for __alloc_skb() and introduces
   a new napi_build_skb();
 - reduce bulk-allocation size from 32 to 16 elements (also Paolo).
   This equals the value of XDP's devmap and veth batch processing
   (which were tested a lot) and should be sane enough;
 - don't waste cycles on explicit in_serving_softirq() check.

Since v2 [1]:
 - also cover {,__}alloc_skb() and {,__}build_skb() cases (became handy
   after the changes that pass tiny skb requests to the kmalloc layer);
 - cover the cache with KASAN instrumentation (suggested by Eric
   Dumazet, with help from Dmitry Vyukov);
 - completely drop redundant __kfree_skb_flush() (also Eric);
 - lots of code cleanups;
 - expand the commit message with NUMA and pfmemalloc points (Jakub).

Since v1 [0]:
 - use one unified cache instead of two separate ones to greatly
   simplify the logic and reduce hotpath overhead (Edward Cree);
 - new: also recycle GRO_MERGED_FREE skbs instead of freeing them
   immediately;
 - correct performance numbers after optimizations and performing
   lots of tests for different use cases.

[0] https://lore.kernel.org/netdev/20210111182655.12159-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210113133523.39205-1-alobakin@pm.me
[2] https://lore.kernel.org/netdev/20210209204533.327360-1-alobakin@pm.me
[3] https://lore.kernel.org/netdev/20210210162732.80467-1-alobakin@pm.me
[4] https://lore.kernel.org/netdev/20210211185220.9753-1-alobakin@pm.me
====================

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-13 14:32:04 -08:00
Documentation dt-bindings: net: xilinx_axienet: add xlnx,switch-x-sgmii attribute 2021-02-12 17:38:53 -08:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
arch dts: marvell: add CM3 SRAM memory to cp11x ethernet device tree 2021-02-11 14:50:23 -08:00
block block-5.11-2021-02-05 2021-02-06 14:40:27 -08:00
certs .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
crypto X.509: Fix crash caused by NULL pointer 2021-01-20 11:33:51 -08:00
drivers net: axienet: Support dynamic switching between 1000BaseX and SGMII 2021-02-12 17:38:53 -08:00
fs nilfs2: make splice write available again 2021-02-10 11:19:58 -08:00
include skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing 2021-02-13 14:32:04 -08:00
init init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov 2021-02-05 11:03:47 -08:00
ipc Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
net skbuff: queue NAPI_MERGED_FREE skbs into NAPI cache instead of freeing 2021-02-13 14:32:04 -08:00
samples bpf: Rename BPF_XADD and prepare to encode other atomics in .imm 2021-01-14 18:34:29 -08:00
scripts kallsyms: fix nonconverging kallsyms table with lld 2021-02-05 17:53:28 +09:00
security cap: fix conversions on getxattr 2021-01-28 10:22:48 +01:00
sound ALSA: hda/via: Apply the workaround generically for Clevo machines 2021-01-26 18:05:03 +01:00
tools selftests: tc: Add generic mpls matching support for tc-flower 2021-02-12 17:13:52 -08:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt KVM/arm64 fixes for 5.11, take #2 2021-01-25 18:52:01 -05:00
.clang-format clang-format: Update with the latest for_each macro list 2021-01-29 15:00:23 +01:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap MAINTAINERS: update Andrey Ryabinin's email address 2021-02-09 17:26:44 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: dccp: move Gerrit Renker to CREDITS 2021-01-14 10:53:49 -08:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS net: broadcom: rename BCM4908 driver & update DT binding 2021-02-11 15:04:17 -08:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-10 13:30:12 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.