Go to file
David S. Miller 4a54877ee7 Merge branch 'sk_buff-add-extension-infrastructure'
Florian Westphal says:

====================
sk_buff: add extension infrastructure

TL;DR:
 - objdiff shows no change if CONFIG_XFRM=n && BR_NETFILTER=n
 - small size reduction when one or both options are set
 - no changes in ipsec performance

 Changes since v1:
 - Allocate entire extension space from a kmem_cache.
 - Avoid atomic_dec_and_test operation on skb_ext_put() for refcnt == 1 case.
   (similar to kfree_skbmem() fclone_ref use).

This adds an optional extension infrastructure, with ispec (xfrm) and
bridge netfilter as first users.

The third (future) user is Multipath TCP which is still out-of-tree.
MPTCP needs to map logical mptcp sequence numbers to the tcp sequence
numbers used by individual subflows.

This DSS mapping is read/written from tcp option space on receive and
written to tcp option space on transmitted tcp packets that are part of
and MPTCP connection.

Extending skb_shared_info or adding a private data field to skb fclones
doesn't work for incoming skb, so a different DSS propagation method would
be required for the receive side.

mptcp has same requirements as secpath/bridge netfilter:

1. extension memory is released when the sk_buff is free'd.
2. data is shared after cloning an skb (clone inherits extension)
3. adding extension to an skb will COW the extension buffer if needed.

Two new members are added to sk_buff:
1. 'active_extensions' byte (filling a hole), telling which extensions
   are available for this skb.
   This has two purposes.
   a) avoids the need to initialize the pointer.
   b) allows to "delete" an extension by clearing its bit
   value in ->active_extensions.

   While it would be possible to store the active_extensions byte
   in the extension struct instead of sk_buff, there is one problem
   with this:
    When an extension has to be disabled, we can always clear the
    bit in skb->active_extensions.  But in case it would be stored in the
    extension buffer itself, we might have to COW it first, if
    we are dealing with a cloned skb.  On kmalloc failure we would
    be unable to turn an extension off.
2. extension pointer, located at the end of the sk_buff.
   If the active_extensions byte is 0, the pointer is undefined,
   it is not initialized on skb allocation.

This adds extra code to skb clone and free paths (to deal with
refcount/free of extension area) but this replaces similar code that
manages skb->nf_bridge and skb->sp structs in the followup patches of
the series.

It is possible to add support for extensions that are not preseved on
clones/copies:

1. define a bitmask of all extensions that need copy/cow on clone
2. change __skb_ext_copy() to check
   ->active_extensions & SKB_EXT_PRESERVE_ON_CLONE
3. set clone->active_extensions to 0 if test is false.

This isn't done here because all extensions that get added here
need the copy/cow semantics.

Last patch converts skb->sp, secpath information gets stored as
new SKB_EXT_SEC_PATH, so the 'sp' pointer is removed from skbuff.

Extra code added to skb clone and free paths (to deal with refcount/free
of extension area) replaces the existing code that does the same for
skb->nf_bridge and skb->secpath.

I don't see any other in-tree users that could benefit from this
infrastructure, it doesn't make sense to add an extension just for the sake
of a single flag bit (like skb->nf_trace).

Adding a new extension is a good fit if all of the following are true:

1. Data is related to the skb/packet aggregate
2. Data should be freed when the skb is free'd
3. Data is not going to be relevant/needed in normal case (udp, tcp,
   forwarding workloads, ...)
4. There are no fancy action(s) needed on clone/free, such as callbacks
   into kernel modules.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-19 11:21:45 -08:00
Documentation net: switch secpath to use skb extension infrastructure 2018-12-19 11:21:38 -08:00
LICENSES This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-12-10 18:00:43 -08:00
block blk-mq: punt failed direct issue to dispatch list 2018-12-07 08:16:11 -07:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: user - Disable statistics interface 2018-12-07 13:56:08 +08:00
drivers xfrm: prefer secpath_set over secpath_dup 2018-12-19 11:21:38 -08:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs a fix for smb3 direct i/o, a fix for CIFS DFS for stable and a minor cifs Kconfig fix 2018-12-09 10:15:13 -08:00
include net: switch secpath to use skb extension infrastructure 2018-12-19 11:21:38 -08:00
init initramfs: clean old path before creating a hardlink 2018-11-30 14:56:14 -08:00
ipc ipc: IPCMNI limit check for semmni 2018-10-31 08:54:14 -07:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2018-12-10 18:00:43 -08:00
lib test_rhashtable: remove semaphore usage 2018-12-18 15:12:53 -08:00
mm dax fixes 4.20-rc6 2018-12-09 09:54:04 -08:00
net net: switch secpath to use skb extension infrastructure 2018-12-19 11:21:38 -08:00
samples samples: bpf: fix: seg fault with NULL pointer arg 2018-12-03 23:58:03 +01:00
scripts Fixes for stackleak 2018-12-07 13:13:07 -08:00
security net: use skb_sec_path helper in more places 2018-12-19 11:21:37 -08:00
sound ALSA: hda/realtek: Fix mic issue on Acer AIO Veriton Z4860G/Z6860G 2018-12-05 16:39:59 +01:00
tools Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2018-12-18 08:49:48 -08:00
usr initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
virt Revert "mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks" 2018-10-26 16:25:19 -07:00
.clang-format page cache: Convert find_get_pages_contig to XArray 2018-10-21 10:46:34 -04:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap mailmap: Update email for Punit Agrawal 2018-11-05 10:02:11 +00:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: change Sparse's maintainer 2018-11-25 09:17:43 -08:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS MAINTAINERS: Add a maintainer for Microsemi switches 2018-12-18 21:51:34 -08:00
Makefile Linux 4.20-rc6 2018-12-09 15:31:00 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.