v3: added previously removed sock_put() to the tun_release() callback, because
sk_release_kernel() doesn't drop the socket reference.
v2: sk_release_kernel() used for socket release. Dummy tun_release() is
required for sk_release_kernel() ---> sock_release() ---> sock->ops->release()
call.
TUN was designed to destroy it's socket on network namesapce shutdown. But this
will never happen for persistent device, because it's socket holds network
namespace.
This patch removes of holding network namespace by TUN socket and replaces it
by creating socket in init_net and then changing it's net it to desired one. On
shutdown socket is moved back to init_net prior to final put.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Per discussion with Ben Hutchings and David Miller, go through and
remove assignments of "N/A" to fw_version in various drivers'
.get_drvinfo routines. While there clean-up some use of bare
constants and such.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
v2: add couple missing conversions in drivers
split unexporting netdev_fix_features()
implemented %pNF
convert sock::sk_route_(no?)caps
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert some remaining straglers' .get_drvinfo routines to use strlcpy
rather than strcpy/strncpy.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs. There are a handful of drivers that violate this assumption of
course, and need to be fixed up. This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 8d8fc29d02 changed the behavior of slave
devices in regards to netpoll. Specifically it created a mutually exclusive
relationship between being a slave and a netpoll-capable device. This creates
problems for KVM because guests relied on needing netconsole active on a slave
device to a bridge. Ideally libvirtd could just attach netconsole to the bridge
device instead, but thats currently infeasible, because while the bridge device
supports netpoll, it requires that all slave interface also support it, but the
tun/tap driver currently does not. The most direct solution is to teach tun/tap
to support netpoll, which is implemented by the patch below.
I've not tested this yet, but its pretty straightforward.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Rik van Riel <riel@redhat.com>
CC: Rik van Riel <riel@redhat.com>
CC: Maxim Krasnyansky <maxk@qualcomm.com>
CC: Cong Wang <amwang@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Rik van Riel <riel@redhat.com>
Tested-by: Rik van Riel <riel@redhat.com>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.
No feature negotiation is needed as old driver just ignore this flag.
Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Perf shows a relatively high rate (about 8%) race in
spin_lock_irqsave() when doing netperf between external host and
guest. It's mainly becuase the lock contention between the
tun_do_read() and tun_xmit_skb(), so this patch do not put self into
waitqueue to reduce this kind of race. After this patch, it drops to
4%.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current standard practice is to not mark most functions as inline
and let compiler decide instead.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tun driver allocates skb's to hold data from user and then passes
the data into the network stack as received data. Most network devices
allocate the receive skb with routines like dev_alloc_skb() that reserves
additional space for use by network protocol stack but tun does not.
Because of the lack of padding, when the packet is passed through bridge
netfilter a new skb has to be allocated.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Semicolons are not necessary after switch/while/for/if braces
so remove them.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Force dev_alloc_name() to be called from register_netdevice() by
dev_get_valid_name(). That allows to remove multiple explicit
dev_alloc_name() calls.
The possibility to call dev_alloc_name in advance remains.
This also fixes veth creation regresion caused by
84c49d8c3e
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This updates the network drivers so that they don't access the
ethtool_cmd::speed field directly, but use ethtool_cmd_speed()
instead.
For most of the drivers, these changes are purely cosmetic and don't
fix any problem, such as for those 1GbE/10GbE drivers that indirectly
call their own ethtool get_settings()/mii_ethtool_gset(). The changes
are meant to enforce code consistency and provide robustness with
future larger throughputs, at the expense of a few CPU cycles for each
ethtool operation.
All drivers compiled with make allyesconfig ion x86_64 have been
updated.
Tested: make allyesconfig on x86_64 + e1000e/bnx2x work
Signed-off-by: David Decotigny <decot@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes offload setting behaviour to what I think is correct:
- offloads set via ethtool mean what admin wants to use (by default
he wants 'em all)
- offloads set via ioctl() mean what userspace is expecting to get
(this limits which admin wishes are granted)
- TUN_NOCHECKSUM is ignored, as it might cause broken packets when
forwarded (ip_summed == CHECKSUM_UNNECESSARY means that checksum
was verified, not that it can be ignored)
If TUN_NOCHECKSUM is implemented, it should set skb->csum_* and
skb->ip_summed (= CHECKSUM_PARTIAL) for known protocols and let others
be verified by kernel when necessary.
TUN_NOCHECKSUM handling was introduced by commit
f43798c27684ab925adde7d8acc34c78c6e50df8:
tun: Allow GSO using virtio_net_hdr
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the current logging forms with pr_fmt.
Convert DBG macro to tun_debug, use netdev_printk as well.
Add printf verification when TUN_DEBUG not defined.
Miscellaneous comment typo fix.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.
Occurences found by `grep -r` on net/, drivers/net, include/
[ Move features and vlan_features next to each other in
struct netdev, as per Eric Dumazet's suggestion -DaveM ]
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace skb->csum_start - skb_headroom(skb) with skb_checksum_start_offset().
Note for usb/smsc95xx: skb->data - skb->head == skb_headroom(skb).
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, only ethtool can get accurate link state of a tap device.
With this patch, IFF_RUNNING and IF_OPER_UP/DOWN are kept up to date as
well.
Signed-off-by: Nolan Leake <nolan@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are still some LRO cards that cause GSO errors in tun,
and BUG on this is an unfriendly way to tell the admin
to disable LRO.
Further, experience shows we might have more GSO bugs lurking.
See https://bugzilla.kernel.org/show_bug.cgi?id=16413
as a recent example.
dumping a packet will make it easier to figure it out.
Replace BUG with warning+dump+drop the packet to make
GSO errors in tun less critical and easier to debug.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Alex Unigovsky <unik@compot.ru>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (63 commits)
drivers/net/usb/asix.c: Fix pointer cast.
be2net: Bug fix to avoid disabling bottom half during firmware upgrade.
proc_dointvec: write a single value
hso: add support for new products
Phonet: fix potential use-after-free in pep_sock_close()
ath9k: remove VEOL support for ad-hoc
ath9k: change beacon allocation to prefer the first beacon slot
sock.h: fix kernel-doc warning
cls_cgroup: Fix build error when built-in
macvlan: do proper cleanup in macvlan_common_newlink() V2
be2net: Bug fix in init code in probe
net/dccp: expansion of error code size
ath9k: Fix rx of mcast/bcast frames in PS mode with auto sleep
wireless: fix sta_info.h kernel-doc warnings
wireless: fix mac80211.h kernel-doc warnings
iwlwifi: testing the wrong variable in iwl_add_bssid_station()
ath9k_htc: rare leak in ath9k_hif_usb_alloc_tx_urbs()
ath9k_htc: dereferencing before check in hif_usb_tx_cb()
rt2x00: Fix rt2800usb TX descriptor writing.
rt2x00: Fix failed SLEEP->AWAKE and AWAKE->SLEEP transitions.
...
This adds:
alias: devname:<name>
to some common kernel modules, which will allow the on-demand loading
of the kernel module when the device node is accessed.
Ideally all these modules would be compiled-in, but distros seems too
much in love with their modularization that we need to cover the common
cases with this new facility. It will allow us to remove a bunch of pretty
useless init scripts and modprobes from init scripts.
The static device node aliases will be carried in the module itself. The
program depmod will extract this information to a file in the module directory:
$ cat /lib/modules/2.6.34-00650-g537b60d-dirty/modules.devname
# Device nodes to trigger on-demand module loading.
microcode cpu/microcode c10:184
fuse fuse c10:229
ppp_generic ppp c108:0
tun net/tun c10:200
dm_mod mapper/control c10:235
Udev will pick up the depmod created file on startup and create all the
static device nodes which the kernel modules specify, so that these modules
get automatically loaded when the device node is accessed:
$ /sbin/udevd --debug
...
static_dev_create_from_modules: mknod '/dev/cpu/microcode' c10:184
static_dev_create_from_modules: mknod '/dev/fuse' c10:229
static_dev_create_from_modules: mknod '/dev/ppp' c108:0
static_dev_create_from_modules: mknod '/dev/net/tun' c10:200
static_dev_create_from_modules: mknod '/dev/mapper/control' c10:235
udev_rules_apply_static_dev_perms: chmod '/dev/net/tun' 0666
udev_rules_apply_static_dev_perms: chmod '/dev/fuse' 0666
A few device nodes are switched to statically allocated numbers, to allow
the static nodes to work. This might also useful for systems which still run
a plain static /dev, which is completely unsafe to use with any dynamic minor
numbers.
Note:
The devname aliases must be limited to the *common* and *single*instance*
device nodes, like the misc devices, and never be used for conceptually limited
systems like the loop devices, which should rather get fixed properly and get a
control node for losetup to talk to, instead of creating a random number of
device nodes in advance, regardless if they are ever used.
This facility is to hide the mess distros are creating with too modualized
kernels, and just to hide that these modules are not compiled-in, and not to
paper-over broken concepts. Thanks! :)
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Ian Kent <raven@themaw.net>
Signed-Off-By: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch makes tun update its socket classid every time we
inject a packet into the network stack. This is so that any
updates made by the admin to the process writing packets to
tun is effected.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
switch and while statements don't need semicolons at end of statement
[ Fixup minor conflicts with recent wimax merge... -DaveM ]
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes from drivers/net/ all the unnecessary
return; statements that precede the last closing brace of
void functions.
It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.
It also does not remove null void functions with return.
Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'
with some cleanups by hand.
Compile tested x86 allmodconfig only.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that core network takes care of trans_start updates, dont do it
in drivers themselves, if possible. Drivers can avoid one cache miss
(on dev->trans_start) in their start_xmit() handler.
Exceptions are NETIF_F_LLTX drivers
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
virtio added mergeable buffers mode where 2 bytes of extra info is put
after vnet header but before actual data (tun does not need this data).
In hindsight, it would have been better to add the new info *before* the
packet: as it is, users need a lot of tricky code to skip the extra 2
bytes in the middle of the iovec, and in fact applications seem to get
it wrong, and only work with specific iovec layout. The fact we might
need to split iovec also means we might in theory overflow iovec max
size.
This patch adds a simpler way for applications to handle this,
and future proofs the interface against further extensions,
by making the size of the virtio net header configurable
from userspace. As a result, tun driver will simply
skip the extra 2 bytes on both input and output.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
sk_callback_lock rwlock actually protects sk->sk_sleep pointer, so we
need two atomic operations (and associated dirtying) per incoming
packet.
RCU conversion is pretty much needed :
1) Add a new structure, called "struct socket_wq" to hold all fields
that will need rcu_read_lock() protection (currently: a
wait_queue_head_t and a struct fasync_struct pointer).
[Future patch will add a list anchor for wakeup coalescing]
2) Attach one of such structure to each "struct socket" created in
sock_alloc_inode().
3) Respect RCU grace period when freeing a "struct socket_wq"
4) Change sk_sleep pointer in "struct sock" by sk_wq, pointer to "struct
socket_wq"
5) Change sk_sleep() function to use new sk->sk_wq instead of
sk->sk_sleep
6) Change sk_has_sleeper() to wq_has_sleeper() that must be used inside
a rcu_read_lock() section.
7) Change all sk_has_sleeper() callers to :
- Use rcu_read_lock() instead of read_lock(&sk->sk_callback_lock)
- Use wq_has_sleeper() to eventually wakeup tasks.
- Use rcu_read_unlock() instead of read_unlock(&sk->sk_callback_lock)
8) sock_wake_async() is modified to use rcu protection as well.
9) Exceptions :
macvtap, drivers/net/tun.c, af_unix use integrated "struct socket_wq"
instead of dynamically allocated ones. They dont need rcu freeing.
Some cleanups or followups are probably needed, (possible
sk_callback_lock conversion to a spinlock for example...).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define a new function to return the waitqueue of a "struct sock".
static inline wait_queue_head_t *sk_sleep(struct sock *sk)
{
return sk->sk_sleep;
}
Change all read occurrences of sk_sleep by a call to this function.
Needed for a future RCU conversion. sk_sleep wont be a field directly
available.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following situation was observed in the field:
tap1 sends packets, tap2 does not consume them, as a result
tap1 can not be closed. This happens because
tun/tap devices can hang on to skbs undefinitely.
As noted by Herbert, possible solutions include a timeout followed by a
copy/change of ownership of the skb, or always copying/changing
ownership if we're going into a hostile device.
This patch implements the second approach.
Note: one issue still remaining is that since skbs
keep reference to tun socket and tun socket has a
reference to tun device, we won't flush backlog,
instead simply waiting for all skbs to get transmitted.
At least this is not user-triggerable, and
this was not reported in practice, my assumption is
other devices besides tap complete an skb
within finite time after it has been queued.
A possible solution for the second issue
would not to have socket reference the device,
instead, implement dev->destructor for tun, and
wait for all skbs to complete there, but this
needs some thought, probably too risky for 2.6.34.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Yan Vugenfirer <yvugenfi@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds Linux Socket Filter support to
tun driver.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.
Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Tun device looks similar to a packet socket
in that both pass complete frames from/to userspace.
This patch fills in enough fields in the socket underlying tun driver
to support sendmsg/recvmsg operations, and message flags
MSG_TRUNC and MSG_DONTWAIT, and exports access to this socket
to modules. Regular read/write behaviour is unchanged.
This way, code using raw sockets to inject packets
into a physical device, can support injecting
packets into host network stack almost without modification.
First user of this interface will be vhost virtualization
accelerator.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using macro tun_sk is more clear and shorter. However tun.c has tun_sk,
but doesn't use it.
Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tun driver is the only code in the kernel that operates
on a character device with struct ifreq. Change the driver
to handle the conversion itself so we can contain the
remaining ifreq handling in the socket layer.
This also fixes a bug in the handling of invalid ioctl
numbers on an unbound tun device. The driver treats this
as a TUNSETIFF in native mode, but there is no way for
the generic compat_ioctl() function to emulate this
behaviour. Possibly the driver was only doing this
accidentally anyway, but if any code relies on this
misfeature, it now also works in compat mode.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lock_kernel/unlock_kernel() in cycle_kernel_lock() which is called
in tun_chr_open() is not serializing against anything and safe to
remove.
tun_chr_fasync() is serialized by get/put_tun() and fasync_helper()
has no dependency on BKL. The modification of tun->flags is racy with
and without the BKL so removing it does not make it worse.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 2b980dbd77
("lsm: Add hooks to the TUN driver") tun_set_iff doesn't
return -EINVAL though neither IFF_TUN nor IFF_TAP is set.
Signed-off-by: Kusanagi Kouichi <slash@ma.neweb.ne.jp>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows subsytems to provide devtmpfs with non-default permissions
for the device node. Instead of the default mode of 0600, null, zero,
random, urandom, full, tty, ptmx now have a mode of 0666, which allows
non-privileged processes to access standard device nodes in case no
other userspace process applies the expected permissions.
This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complain.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits)
netxen: update copyright
netxen: fix tx timeout recovery
netxen: fix file firmware leak
netxen: improve pci memory access
netxen: change firmware write size
tg3: Fix return ring size breakage
netxen: build fix for INET=n
cdc-phonet: autoconfigure Phonet address
Phonet: back-end for autoconfigured addresses
Phonet: fix netlink address dump error handling
ipv6: Add IFA_F_DADFAILED flag
net: Add DEVTYPE support for Ethernet based devices
mv643xx_eth.c: remove unused txq_set_wrr()
ucc_geth: Fix hangs after switching from full to half duplex
ucc_geth: Rearrange some code to avoid forward declarations
phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs
drivers/net/phy: introduce missing kfree
drivers/net/wan: introduce missing kfree
net: force bridge module(s) to be GPL
Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded
...
Fixed up trivial conflicts:
- arch/x86/include/asm/socket.h
converted to <asm-generic/socket.h> in the x86 tree. The generic
header has the same new #define's, so that works out fine.
- drivers/net/tun.c
fix conflict between 89f56d1e9 ("tun: reuse struct sock fields") that
switched over to using 'tun->socket.sk' instead of the redundantly
available (and thus removed) 'tun->sk', and 2b980dbd ("lsm: Add hooks
to the TUN driver") which added a new 'tun->sk' use.
Noted in 'next' by Stephen Rothwell.
As tun always has an embeedded struct sock,
use sk and sk_receive_queue fields instead of
duplicating them in tun_struct.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are all drivers that don't touch real hardware.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TUN driver lacks any LSM hooks which makes it difficult for LSM modules,
such as SELinux, to enforce access controls on network traffic generated by
TUN users; this is particularly problematic for virtualization apps such as
QEMU and KVM. This patch adds three new LSM hooks designed to control the
creation and attachment of TUN devices, the hooks are:
* security_tun_dev_create()
Provides access control for the creation of new TUN devices
* security_tun_dev_post_create()
Provides the ability to create the necessary socket LSM state for newly
created TUN devices
* security_tun_dev_attach()
Provides access control for attaching to existing, persistent TUN devices
and the ability to update the TUN device's socket LSM state as necessary
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Eric Paris <eparis@parisplace.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Morris <jmorris@namei.org>
As it is, parts of the ioctl runs under the RTNL and parts of
it do not. The unlocked section is still protected by the BKL,
but there can be subtle races. For example, Eric Biederman and
Paul Moore observed that if two threads tried to create two tun
devices on the same file descriptor, then unexpected results
may occur.
As there isn't anything in the ioctl that is expected to sleep
indefinitely, we can prevent this from occurring by extending
the RTNL lock coverage.
This also allows to get rid of the BKL.
Finally, I changed tun_get_iff to take a tun device in order to
avoid calling tun_put which would dead-lock as it also tries to
take the RTNL lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Allow setting UFO on tap device and handle UFO packets.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
---------------------------------------------------------
Signed-off-by: David S. Miller <davem@davemloft.net>