High-availability Seamless Redundancy ("HSR") provides instant failover
redundancy for Ethernet networks. It requires a special network topology where
all nodes are connected in a ring (each node having two physical network
interfaces). It is suited for applications that demand high availability and
very short reaction time.
HSR acts on the Ethernet layer, using a registered Ethernet protocol type to
send special HSR frames in both directions over the ring. The driver creates
virtual network interfaces that can be used just like any ordinary Linux
network interface, for IP/TCP/UDP traffic etc. All nodes in the network ring
must be HSR capable.
This code is a "best effort" to comply with the HSR standard as described in
IEC 62439-3:2010 (HSRv0).
Signed-off-by: Arvid Brodin <arvid.brodin@xdin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_flags=flags/mask
Bitwise match on TCP flags. The flags and mask are 16-bit num‐
bers written in decimal or in hexadecimal prefixed by 0x. Each
1-bit in mask requires that the corresponding bit in port must
match. Each 0-bit in mask causes the corresponding bit to be
ignored.
TCP protocol currently defines 9 flag bits, and additional 3
bits are reserved (must be transmitted as zero), see RFCs 793,
3168, and 3540. The flag bits are, numbering from the least
significant bit:
0: FIN No more data from sender.
1: SYN Synchronize sequence numbers.
2: RST Reset the connection.
3: PSH Push function.
4: ACK Acknowledgement field significant.
5: URG Urgent pointer field significant.
6: ECE ECN Echo.
7: CWR Congestion Windows Reduced.
8: NS Nonce Sum.
9-11: Reserved.
12-15: Not matchable, must be zero.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
This patch adds a separate ioctl for delivering syncpoint base number
to user space. If the syncpoint does not have an associated base, the
function returns -ENXIO.
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The gr3d engine renders images bottom-up. Allow buffers that are used
for 3D content to be marked as such and implement support in the display
controller to present them properly.
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The gr2d and gr3d engines work more efficiently on buffers with a tiled
memory layout. Allow created buffers to be marked as tiled so that the
display controller can scan them out properly.
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made. This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction. The user creates
the device and can add and remove VFIO groups to it via file
descriptors. When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a kvm ioctl which states which system functionality kvm emulates.
The format used is that of CPUID and we return the corresponding CPUID
bits set for which we do emulate functionality.
Make sure ->padding is being passed on clean from userspace so that we
can use it for something in the future, after the ioctl gets cast in
stone.
s/kvm_dev_ioctl_get_supported_cpuid/kvm_dev_ioctl_get_cpuid/ while at
it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This work contains a lightweight BPF-based traffic classifier that can
serve as a flexible alternative to ematch-based tree classification, i.e.
now that BPF filter engine can also be JITed in the kernel. Naturally, tc
actions and policies are supported as well with cls_bpf. Multiple BPF
programs/filter can be attached for a class, or they can just as well be
written within a single BPF program, that's really up to the user how he
wishes to run/optimize the code, e.g. also for inversion of verdicts etc.
The notion of a BPF program's return/exit codes is being kept as follows:
0: No match
-1: Select classid given in "tc filter ..." command
else: flowid, overwrite the default one
As a minimal usage example with iproute2, we use a 3 band prio root qdisc
on a router with sfq each as leave, and assign ssh and icmp bpf-based
filters to band 1, http traffic to band 2 and the rest to band 3. For the
first two bands we load the bytecode from a file, in the 2nd we load it
inline as an example:
echo 1 > /proc/sys/net/core/bpf_jit_enable
tc qdisc del dev em1 root
tc qdisc add dev em1 root handle 1: prio bands 3 priomap 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
tc qdisc add dev em1 parent 1:1 sfq perturb 16
tc qdisc add dev em1 parent 1:2 sfq perturb 16
tc qdisc add dev em1 parent 1:3 sfq perturb 16
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/ssh.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/icmp.bpf flowid 1:1
tc filter add dev em1 parent 1: bpf run bytecode-file /etc/tc/http.bpf flowid 1:2
tc filter add dev em1 parent 1: bpf run bytecode "`bpfc -f tc -i misc.ops`" flowid 1:3
BPF programs can be easily created and passed to tc, either as inline
'bytecode' or 'bytecode-file'. There are a couple of front-ends that can
compile opcodes, for example:
1) People familiar with tcpdump-like filters:
tcpdump -iem1 -ddd port 22 | tr '\n' ',' > /etc/tc/ssh.bpf
2) People that want to low-level program their filters or use BPF
extensions that lack support by libpcap's compiler:
bpfc -f tc -i ssh.ops > /etc/tc/ssh.bpf
ssh.ops example code:
ldh [12]
jne #0x800, drop
ldb [23]
jneq #6, drop
ldh [20]
jset #0x1fff, drop
ldxb 4 * ([14] & 0xf)
ldh [%x + 14]
jeq #0x16, pass
ldh [%x + 16]
jne #0x16, drop
pass: ret #-1
drop: ret #0
It was chosen to load bytecode into tc, since the reverse operation,
tc filter list dev em1, is then able to show the exact commands again.
Possible follow-up work could also include a small expression compiler
for iproute2. Tested with the help of bmon. This idea came up during
the Netfilter Workshop 2013 in Copenhagen. Also thanks to feedback from
Eric Dumazet!
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.
When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.
Reported-by: Victor Kaplansky <victork@il.ibm.com>
Tested-by: Victor Kaplansky <victork@il.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: michael@ellerman.id.au
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Since the parsed sec= flavor is now stored in nfs_server->auth_info,
we no longer need an nfs_server flag to determine if a sec= option was
used.
This flag has not been completely removed because it is still needed for
the (old but still supported) non-text parsed mount options ABI
compatability.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
VPE is a block which consists of a single memory to memory path which
can perform chrominance up/down sampling, de-interlacing, scaling, and
color space conversion of raster or tiled YUV420 coplanar, YUV422
coplanar or YUV422 interleaved video formats.
We create a mem2mem driver based primarily on the mem2mem-testdev
example. The de-interlacer, scaler and color space converter are all
bypassed for now to keep the driver simple. Chroma up/down sampler
blocks are implemented, so conversion beteen different YUV formats is
possible.
Each mem2mem context allocates a buffer for VPE MMR values which it will
use when it gets access to the VPE HW via the mem2mem queue, it also
allocates a VPDMA descriptor list to which configuration and data
descriptors are added.
Based on the information received via v4l2 ioctls for the source and
destination queues, the driver configures the values for the MMRs, and
stores them in the buffer. There are also some VPDMA parameters like
frame start and line mode which needs to be configured, these are
configured by direct register writes via the VPDMA helper functions.
The driver's device_run() mem2mem op will add each descriptor based on
how the source and destination queues are set up for the given ctx, once
the list is prepared, it's submitted to VPDMA, these descriptors when
parsed by VPDMA will upload MMR registers, start DMA of video buffers on
the various input and output clients/ports.
When the list is parsed completely(and the DMAs on all the output ports
done), an interrupt is generated which we use to notify that the source
and destination buffers are done. The rest of the driver is quite
similar to other mem2mem drivers, we use the multiplane v4l2 ioctls as
the HW support coplanar formats.
Signed-off-by: Archit Taneja <archit@ti.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
To use DFS in IBSS mode, userspace is required to react to radar events.
It can inform nl80211 that it is capable of doing so by adding a
NL80211_ATTR_HANDLE_DFS attribute when joining the IBSS.
This attribute is supplied to let the kernelspace know that the
userspace application can and will handle radar events, e.g. by
intiating channel switches to a valid channel. DFS channels may
only be used if this attribute is supplied and the driver supports
it. Driver support will be checked even if a channel without DFS
will be initially joined, as a DFS channel may be chosen later.
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
[fix attribute name in commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The helper is for user applications, and it is just a copy of
the kernel helper: mtd_type_is_nand();
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
In current code, the MTD_NANDFLASH is used to represent both the SLC and
MLC. It is confusing to us.
By adding an explicit comment about these two macros, this patch makes it
clear that:
MTD_NANDFLASH : stands for SLC NAND,
MTD_MLCNANDFLASH : stands for MLC NAND (including TLC).
Signed-off-by: Huang Shijie <b32955@freescale.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
- Further work on the dmaengine helpers, including support for
configuring the parameters for DMA by reading the capabilities of the
DMA controller which removes some guesswork and magic numbers fromm
drivers.
- A refresh of the documentation.
- Conversions of many drivers to direct regmap API usage in order to
allow the ASoC level register I/O code to be removed, this will
hopefully be completed by v3.14.
- Support for using async register I/O in DAPM, reducing the time taken
to implement power transitions on systems that support it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSajLdAAoJELSic+t+oim9MVEQAJ3t7df5K9R/OynjhKiEFxpP
cBWo306CegZ5oO17UqG+SReJkOWgUI8zIUkNC818suTjtgyhv4WUBx1QgXG8akO5
arHZEQGyReLxgWbnO5ScP7BJt5ZYldfQWN+NPnNlzwvVA8R4xChvAwuHL+kUSSYW
DrOb0ag/Gtn2jQo3o9GbZb5c3UhZqoMg/pQSoVtnvG/O8N/xR0yoeXGsdJv1su6g
OKhCJTRWU8v3FONatR2wWXnSrCBOeJ2Ec7YUJil1FQQdENYZfV3AOsFHxmqsyG2O
Xj2P7CioY0JY7dtFcKjrXgsnjvgZmVdVsdegTJPWS9RjunjyupvSyhMhZYkoA60j
V7RxyIbHAx7hILQqCYYhlOczYHom4MSwAGGt7y7T3oKt0432RvIjE2fP7sTGaqD8
wzuVYuVl4km03xX9g9abF6xjyDE6e+4wun+d8kSvOosvd/nF47gkXUXEvPZh0Ley
013e5fHNDaNF4uaSVXE169JyVxVnHP6nXJDRWZakXsryGXGUpn0quIzobf6fb6XE
fY5Q3QoyP5rHdSMIvGN5Gi76KsHF5CWILWqcWLEVPLnaf9gJmrp3IypmF1c8i7VE
CrcTim5mhNePEX56skRaHhpYHmsxYApSAzxNAA/t3cJ2rtwb87jMM4jOcjHi/war
emSVe5lXkcwv/lU/Pa0N
=rVsK
-----END PGP SIGNATURE-----
Merge tag 'asoc-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next
ASoC: Updates for v3.13
- Further work on the dmaengine helpers, including support for
configuring the parameters for DMA by reading the capabilities of the
DMA controller which removes some guesswork and magic numbers fromm
drivers.
- A refresh of the documentation.
- Conversions of many drivers to direct regmap API usage in order to
allow the ASoC level register I/O code to be removed, this will
hopefully be completed by v3.14.
- Support for using async register I/O in DAPM, reducing the time taken
to implement power transitions on systems that support it.
Conflicts:
drivers/net/usb/qmi_wwan.c
include/net/dst.h
Trivial merge conflicts, both were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Sorry I let so much accumulate, I was in Buffalo and wanted a few
things to cook in my tree for a while before sending to you. Anyways,
it's a lot of little things as usual at this stage in the game"
1) Make bonding MAINTAINERS entry reflect reality, from Andy
Gospodarek.
2) Fix accidental sock_put() on timewait mini sockets, from Eric
Dumazet.
3) Fix crashes in l2tp due to mis-handling of ipv4 mapped ipv6
addresses, from François CACHEREUL.
4) Fix heap overflow in __audit_sockaddr(), from the eagle eyed Dan
Carpenter.
5) tcp_shifted_skb() doesn't take handle FINs properly, from Eric
Dumazet.
6) SFC driver bug fixes from Ben Hutchings.
7) Fix TX packet scheduling wedge after channel change in ath9k driver,
from Felix Fietkau.
8) Fix user after free in BPF JIT code, from Alexei Starovoitov.
9) Source address selection test is reversed in
__ip_route_output_key(), fix from Jiri Benc.
10) VLAN and CAN layer mis-size netlink attributes, from Marc
Kleine-Budde.
11) Fix permission checks in sysctls to use current_euid() instead of
current_uid(). From Eric W Biederman.
12) IPSEC policies can go away while a timer is still pending for them,
add appropriate ref-counting to fix, from Steffen Klassert.
13) Fix mis-programming of FDR and RMCR registers on R8A7740 sh_eth
chips, from Nguyen Hong Ky and Simon Horman.
14) MLX4 forgets to DMA unmap pages on RX, fix from Amir Vadai.
15) IPV6 GRE tunnel MTU upper limit is miscalculated, from Oussama
Ghorbel.
16) Fix typo in fq_change(), we were assigning "initial quantum" to
"quantum". From Eric Dumazet.
17) Set a more appropriate sk_pacing_rate for non-TCP sockets, otherwise
FQ packet scheduler does not pace those flows properly. Also from
Eric Dumazet.
18) rtlwifi miscalculates packet pointers, from Mark Cave-Ayland.
19) l2tp_xmit_skb() can be called from process context, not just softirq
context, so we must always make sure to BH disable around it. From
Eric Dumazet.
20) On qdisc reset, we forget to purge the RB tree of SKBs in netem
packet scheduler. From Stephen Hemminger.
21) Fix info leak in farsync WAN driver ioctl() handler, from Dan
Carpenter and Salva Peiró.
22) Fix PHY reset and other issues in dm9000 driver, from Nikita
Kiryanov and Michael Abbott.
23) When hardware can do SCTP crc32 checksums, we accidently don't
disable the csum offload when IPSEC transformations have been
applied. From Fan Du and Vlad Yasevich.
24) Tail loss probing in TCP leaves the socket in the wrong congestion
avoidance state. From Yuchung Cheng.
25) In CPSW driver, enable NAPI before interrupts are turned on, from
Markus Pargmann.
26) Integer underflow and dual-assignment in YAM hamradio driver, from
Dan Carpenter.
27) If we are going to mangle a packet in tcp_set_skb_tso_segs() we must
unclone it. This fixes various hard to track down crashes in
drivers where the SKBs ->gso_segs was changing right from underneath
the driver during TX queueing. From Eric Dumazet.
28) Fix the handling of VLAN IDs, and in particular the special IDs 0
and 4095, in the bridging layer. From Toshiaki Makita.
29) Another info leak, this time in wanxl WAN driver, from Salva Peiró.
30) Fix race in socket credential passing, from Daniel Borkmann.
31) WHen NETLABEL is disabled, we don't validate CIPSO packets properly,
from Seif Mazareeb.
32) Fix identification of fragmented frames in ipv4/ipv6 UDP
Fragmentation Offload output paths, from Jiri Pirko.
33) Virtual Function fixes in bnx2x driver from Yuval Mintz and Ariel
Elior.
34) When we removed the explicit neighbour pointer from ipv6 routes a
slight regression was introduced for users such as IPVS, xt_TEE, and
raw sockets. We mix up the users requested destination address with
the routes assigned nexthop/gateway. From Julian Anastasov and
Simon Horman.
35) Fix stack overruns in rt6_probe(), the issue is that can end up
doing two full packet xmit paths at the same time when emitting
neighbour discovery messages. From Hannes Frederic Sowa.
36) davinci_emac driver doesn't handle IFF_ALLMULTI correctly, from
Mariusz Ceier.
37) Make sure to set TCP sk_pacing_rate after the first legitimate RTT
sample, from Neal Cardwell.
38) Wrong netlink attribute passed to xfrm_replay_verify_len(), from
Steffen Klassert.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits)
ax88179_178a: Add VID:DID for Samsung USB Ethernet Adapter
ax88179_178a: Correct the RX error definition in RX header
Revert "bridge: only expire the mdb entry when query is received"
tcp: initialize passive-side sk_pacing_rate after 3WHS
davinci_emac.c: Fix IFF_ALLMULTI setup
mac802154: correct a typo in ieee802154_alloc_device() prototype
ipv6: probe routes asynchronous in rt6_probe
netfilter: nf_conntrack: fix rt6i_gateway checks for H.323 helper
ipv6: fill rt6i_gateway with nexthop address
ipv6: always prefer rt6i_gateway if present
bnx2x: Set NETIF_F_HIGHDMA unconditionally
bnx2x: Don't pretend during register dump
bnx2x: Lock DMAE when used by statistic flow
bnx2x: Prevent null pointer dereference on error flow
bnx2x: Fix config when SR-IOV and iSCSI are enabled
bnx2x: Fix Coalescing configuration
bnx2x: Unlock VF-PF channel on MAC/VLAN config error
bnx2x: Prevent an illegal pointer dereference during panic
bnx2x: Fix Maximum CoS estimation for VFs
drivers: net: cpsw: fix kernel warn during iperf test with interrupt pacing
...
Collect mega flow mask stats. ovs-dpctl show command can be used to
display them for debugging and performance tuning.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
This adds support for the Armada 510 display subsystem found on the
Marvell Dove devices. This IP is re-used across several different Marvell
SoCs with various tweaks, and this driver has been structured to allow
the other IPs to re-use the bulk of this code; further work in this area
is expected from interested parties.
This has been extensively tested on the SolidRun Cubox platform and
appears to work well there.
[airlied: update for api changes merged previous to this]
The create_flow/destroy_flow uverbs and the associated extensions to
the user-kernel verbs ABI are under review and are too experimental to
freeze at this point.
So userspace is not exposed to experimental features and an uinstable
ABI, temporarily disable this for v3.12 (with a Kconfig option behind
staging to reenable it if desired).
The feature will be enabled after proper cleanup for v3.13.
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Link: http://marc.info/?i=cover.1381351016.git.ydroneaud@opteya.com
Link: http://marc.info/?i=cover.1381177342.git.ydroneaud@opteya.com
[ Add a Kconfig option to reenable these verbs. - Roland ]
Signed-off-by: Roland Dreier <roland@purestorage.com>
Pavel Roskin reported that DRM_IOCTL_MODE_GETCONNECTOR was overwritting
the 4 bytes beyond the end of its structure with a 32-bit userspace
running on a 64-bit kernel. This is due to the padding gcc inserts as
the drm_mode_get_connector struct includes a u64 and its size is not a
natural multiple of u64s.
64-bit kernel:
sizeof(drm_mode_get_connector)=80, alignof=8
sizeof(drm_mode_get_encoder)=20, alignof=4
sizeof(drm_mode_modeinfo)=68, alignof=4
32-bit userspace:
sizeof(drm_mode_get_connector)=76, alignof=4
sizeof(drm_mode_get_encoder)=20, alignof=4
sizeof(drm_mode_modeinfo)=68, alignof=4
Fortuituously we can insert explicit padding to the tail of our
structures without breaking ABI.
Reported-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
As a start point for further development, this is an incomplete driver
for DICE devices:
- only playback (so no clock source except the bus clock)
- only 44.1 kHz
- no MIDI
- recovery after bus reset is slow
- hwdep device is created, but not actually implemented
Contains compilation fixes by Stefan Richter.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
This moves the kvmppc_ops callbacks to be a per VM entity. This
enables us to select HV and PR mode when creating a VM. We also
allow both kvm-hv and kvm-pr kernel module to be loaded. To
achieve this we move /dev/kvm ownership to kvm.ko module. Depending on
which KVM mode we select during VM creation we take a reference
count on respective module
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[agraf: fix coding style]
Signed-off-by: Alexander Graf <agraf@suse.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJSXeGaAAoJEEtpOizt6ddyeyYH/AnWdKGUELjxC0lIBDkTitnD
znyzSxqXG6z1Z6d+EYI3XCL1eB3dtyOBSJsZj45adG4HXGkCmGqosgDzivGO6GcI
yhjYgXGhP8ZvIwky1ijbVQODaEE70SEYqKwyCpU4rLJw2uRkbfRaxTrpgnusL8Bg
RG37uaOS/sasLoNxCe5GEUjm8BFGbvZGVAjcL7yJTPBw5qd7GYBxndFSTILa2iRQ
ikoBD0bUVhoaBUqSNQenoNllUBwDpFJF1HiEXKMJkUIxX/FggrSvRp8A/MAWDBw0
6Ef1P8Pt/hMfMQpOOeu8QFWM2s+smh2rTkO/O9mqi/tSvEf5YcZHMAl48B8OR88=
=tJ2u
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next
Updates for KVM/ARM including cpu=host and Cortex-A7 support
This patch adds a batch support to nfnetlink. Basically, it adds
two new control messages:
* NFNL_MSG_BATCH_BEGIN, that indicates the beginning of a batch,
the nfgenmsg->res_id indicates the nfnetlink subsystem ID.
* NFNL_MSG_BATCH_END, that results in the invocation of the
ss->commit callback function. If not specified or an error
ocurred in the batch, the ss->abort function is invoked
instead.
The end message represents the commit operation in nftables, the
lack of end message results in an abort. This patch also adds the
.call_batch function that is only called from the batch receival
path.
This patch adds atomic rule updates and dumps based on
bitmask generations. This allows to atomically commit a set of
rule-set updates incrementally without altering the internal
state of existing nf_tables expressions/matches/targets.
The idea consists of using a generation cursor of 1 bit and
a bitmask of 2 bits per rule. Assuming the gencursor is 0,
then the genmask (expressed as a bitmask) can be interpreted
as:
00 active in the present, will be active in the next generation.
01 inactive in the present, will be active in the next generation.
10 active in the present, will be deleted in the next generation.
^
gencursor
Once you invoke the transition to the next generation, the global
gencursor is updated:
00 active in the present, will be active in the next generation.
01 active in the present, needs to zero its future, it becomes 00.
10 inactive in the present, delete now.
^
gencursor
If a dump is in progress and nf_tables enters a new generation,
the dump will stop and return -EBUSY to let userspace know that
it has to retry again. In order to invalidate dumps, a global
genctr counter is increased everytime nf_tables enters a new
generation.
This new operation can be used from the user-space utility
that controls the firewall, eg.
nft -f restore
The rule updates contained in `file' will be applied atomically.
cat file
-----
add filter INPUT ip saddr 1.1.1.1 counter accept #1
del filter INPUT ip daddr 2.2.2.2 counter drop #2
-EOF-
Note that the rule 1 will be inactive until the transition to the
next generation, the rule 2 will be evicted in the next generation.
There is a penalty during the rule update due to the branch
misprediction in the packet matching framework. But that should be
quickly resolved once the iteration over the commit list that
contain rules that require updates is finished.
Event notification happens once the rule-set update has been
committed. So we skip notifications is case the rule-set update
is aborted, which can happen in case that the rule-set is tested
to apply correctly.
This patch squashed the following patches from Pablo:
* nf_tables: atomic rule updates and dumps
* nf_tables: get rid of per rule list_head for commits
* nf_tables: use per netns commit list
* nfnetlink: add batch support and use it from nf_tables
* nf_tables: all rule updates are transactional
* nf_tables: attach replacement rule after stale one
* nf_tables: do not allow deletion/replacement of stale rules
* nf_tables: remove unused NFTA_RULE_FLAGS
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a new rule attribute NFTA_RULE_POSITION which is
used to store the position of a rule relatively to the others.
By providing the create command and specifying the position, the
rule is inserted after the rule with the handle equal to the
provided position.
Regarding notification, the position attribute specifies the
handle of the previous rule to make sure we don't point to any
stale rule in notifications coming from the commit path.
This patch includes the following fix from Pablo:
* nf_tables: fix rule deletion event reporting
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch generalizes the NAT expression to support both IPv4 and IPv6
using the existing IPv4/IPv6 NAT infrastructure. This also adds the
NAT chain type for IPv6.
This patch collapses the following patches that were posted to the
netfilter-devel mailing list, from Tomasz:
* nf_tables: Change NFTA_NAT_ attributes to better semantic significance
* nf_tables: Split IPv4 NAT into NAT expression and IPv4 NAT chain
* nf_tables: Add support for IPv6 NAT expression
* nf_tables: Add support for IPv6 NAT chain
* nf_tables: Fix up build issue on IPv6 NAT support
And, from Pablo Neira Ayuso:
* fix missing dependencies in nft_chain_nat
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch allows you to temporarily disable an entire table.
You can change the state of a dormant table via NFT_MSG_NEWTABLE
messages. Using this operation you can wake up a table, so their
chains are registered.
This provides atomicity at chain level. Thus, the rule-set of one
chain is applied at once, avoiding any possible intermediate state
in every chain. Still, the chains that belongs to a table are
registered consecutively. This also allows you to have inactive
tables in the kernel.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the x_tables compatibility layer. This allows you
to use existing x_tables matches and targets from nf_tables.
This compatibility later allows us to use existing matches/targets
for features that are still missing in nf_tables. We can progressively
replace them with native nf_tables extensions. It also provides the
userspace compatibility software that allows you to express the
rule-set using the iptables syntax but using the nf_tables kernel
components.
In order to get this compatibility layer working, I've done the
following things:
* add NFNL_SUBSYS_NFT_COMPAT: this new nfnetlink subsystem is used
to query the x_tables match/target revision, so we don't need to
use the native x_table getsockopt interface.
* emulate xt structures: this required extending the struct nft_pktinfo
to include the fragment offset, which is already obtained from
ip[6]_tables and that is used by some matches/targets.
* add support for default policy to base chains, required to emulate
x_tables.
* add NFTA_CHAIN_USE attribute to obtain the number of references to
chains, required by x_tables emulation.
* add chain packet/byte counters using per-cpu.
* support 32-64 bits compat.
For historical reasons, this patch includes the following patches
that were posted in the netfilter-devel mailing list.
From Pablo Neira Ayuso:
* nf_tables: add default policy to base chains
* netfilter: nf_tables: add NFTA_CHAIN_USE attribute
* nf_tables: nft_compat: private data of target and matches in contiguous area
* nf_tables: validate hooks for compat match/target
* nf_tables: nft_compat: release cached matches/targets
* nf_tables: x_tables support as a compile time option
* nf_tables: fix alias for xtables over nftables module
* nf_tables: add packet and byte counters per chain
* nf_tables: fix per-chain counter stats if no counters are passed
* nf_tables: don't bump chain stats
* nf_tables: add protocol and flags for xtables over nf_tables
* nf_tables: add ip[6]t_entry emulation
* nf_tables: move specific layer 3 compat code to nf_tables_ipv[4|6]
* nf_tables: support 32bits-64bits x_tables compat
* nf_tables: fix compilation if CONFIG_COMPAT is disabled
From Patrick McHardy:
* nf_tables: move policy to struct nft_base_chain
* nf_tables: send notifications for base chain policy changes
From Alexander Primak:
* nf_tables: remove the duplicate NF_INET_LOCAL_OUT
From Nicolas Dichtel:
* nf_tables: fix compilation when nf-netlink is a module
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch converts built-in tables/chains to chain types that
allows you to deploy customized table and chain configurations from
userspace.
After this patch, you have to specify the chain type when
creating a new chain:
add chain ip filter output { type filter hook input priority 0; }
^^^^ ------
The existing chain types after this patch are: filter, route and
nat. Note that tables are just containers of chains with no specific
semantics, which is a significant change with regards to iptables.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the new netlink API for maintaining nf_tables sets
independently of the ruleset. The API supports the following operations:
- creation of sets
- deletion of sets
- querying of specific sets
- dumping of all sets
- addition of set elements
- removal of set elements
- dumping of all set elements
Sets are identified by name, each table defines an individual namespace.
The name of a set may be allocated automatically, this is mostly useful
in combination with the NFT_SET_ANONYMOUS flag, which destroys a set
automatically once the last reference has been released.
Sets can be marked constant, meaning they're not allowed to change while
linked to a rule. This allows to perform lockless operation for set
types that would otherwise require locking.
Additionally, if the implementation supports it, sets can (as before) be
used as maps, associating a data value with each key (or range), by
specifying the NFT_SET_MAP flag and can be used for interval queries by
specifying the NFT_SET_INTERVAL flag.
Set elements are added and removed incrementally. All element operations
support batching, reducing netlink message and set lookup overhead.
The old "set" and "hash" expressions are replaced by a generic "lookup"
expression, which binds to the specified set. Userspace is not aware
of the actual set implementation used by the kernel anymore, all
configuration options are generic.
Currently the implementation selection logic is largely missing and the
kernel will simply use the first registered implementation supporting the
requested operation. Eventually, the plan is to have userspace supply a
description of the data characteristics and select the implementation
based on expected performance and memory use.
This patch includes the new 'lookup' expression to look up for element
matching in the set.
This patch includes kernel-doc descriptions for this set API and it
also includes the following fixes.
From Patrick McHardy:
* netfilter: nf_tables: fix set element data type in dumps
* netfilter: nf_tables: fix indentation of struct nft_set_elem comments
* netfilter: nf_tables: fix oops in nft_validate_data_load()
* netfilter: nf_tables: fix oops while listing sets of built-in tables
* netfilter: nf_tables: destroy anonymous sets immediately if binding fails
* netfilter: nf_tables: propagate context to set iter callback
* netfilter: nf_tables: add loop detection
From Pablo Neira Ayuso:
* netfilter: nf_tables: allow to dump all existing sets
* netfilter: nf_tables: fix wrong type for flags variable in newelem
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds nftables which is the intended successor of iptables.
This packet filtering framework reuses the existing netfilter hooks,
the connection tracking system, the NAT subsystem, the transparent
proxying engine, the logging infrastructure and the userspace packet
queueing facilities.
In a nutshell, nftables provides a pseudo-state machine with 4 general
purpose registers of 128 bits and 1 specific purpose register to store
verdicts. This pseudo-machine comes with an extensible instruction set,
a.k.a. "expressions" in the nftables jargon. The expressions included
in this patch provide the basic functionality, they are:
* bitwise: to perform bitwise operations.
* byteorder: to change from host/network endianess.
* cmp: to compare data with the content of the registers.
* counter: to enable counters on rules.
* ct: to store conntrack keys into register.
* exthdr: to match IPv6 extension headers.
* immediate: to load data into registers.
* limit: to limit matching based on packet rate.
* log: to log packets.
* meta: to match metainformation that usually comes with the skbuff.
* nat: to perform Network Address Translation.
* payload: to fetch data from the packet payload and store it into
registers.
* reject (IPv4 only): to explicitly close connection, eg. TCP RST.
Using this instruction-set, the userspace utility 'nft' can transform
the rules expressed in human-readable text representation (using a
new syntax, inspired by tcpdump) to nftables bytecode.
nftables also inherits the table, chain and rule objects from
iptables, but in a more configurable way, and it also includes the
original datatype-agnostic set infrastructure with mapping support.
This set infrastructure is enhanced in the follow up patch (netfilter:
nf_tables: add netlink set API).
This patch includes the following components:
* the netlink API: net/netfilter/nf_tables_api.c and
include/uapi/netfilter/nf_tables.h
* the packet filter core: net/netfilter/nf_tables_core.c
* the expressions (described above): net/netfilter/nft_*.c
* the filter tables: arp, IPv4, IPv6 and bridge:
net/ipv4/netfilter/nf_tables_ipv4.c
net/ipv6/netfilter/nf_tables_ipv6.c
net/ipv4/netfilter/nf_tables_arp.c
net/bridge/netfilter/nf_tables_bridge.c
* the NAT table (IPv4 only):
net/ipv4/netfilter/nf_table_nat_ipv4.c
* the route table (similar to mangle):
net/ipv4/netfilter/nf_table_route_ipv4.c
net/ipv6/netfilter/nf_table_route_ipv6.c
* internal definitions under:
include/net/netfilter/nf_tables.h
include/net/netfilter/nf_tables_core.h
* It also includes an skeleton expression:
net/netfilter/nft_expr_template.c
and the preliminary implementation of the meta target
net/netfilter/nft_meta_target.c
It also includes a change in struct nf_hook_ops to add a new
pointer to store private data to the hook, that is used to store
the rule list per chain.
This patch is based on the patch from Patrick McHardy, plus merged
accumulated cleanups, fixes and small enhancements to the nftables
code that has been done since 2009, which are:
From Patrick McHardy:
* nf_tables: adjust netlink handler function signatures
* nf_tables: only retry table lookup after successful table module load
* nf_tables: fix event notification echo and avoid unnecessary messages
* nft_ct: add l3proto support
* nf_tables: pass expression context to nft_validate_data_load()
* nf_tables: remove redundant definition
* nft_ct: fix maxattr initialization
* nf_tables: fix invalid event type in nf_tables_getrule()
* nf_tables: simplify nft_data_init() usage
* nf_tables: build in more core modules
* nf_tables: fix double lookup expression unregistation
* nf_tables: move expression initialization to nf_tables_core.c
* nf_tables: build in payload module
* nf_tables: use NFPROTO constants
* nf_tables: rename pid variables to portid
* nf_tables: save 48 bits per rule
* nf_tables: introduce chain rename
* nf_tables: check for duplicate names on chain rename
* nf_tables: remove ability to specify handles for new rules
* nf_tables: return error for rule change request
* nf_tables: return error for NLM_F_REPLACE without rule handle
* nf_tables: include NLM_F_APPEND/NLM_F_REPLACE flags in rule notification
* nf_tables: fix NLM_F_MULTI usage in netlink notifications
* nf_tables: include NLM_F_APPEND in rule dumps
From Pablo Neira Ayuso:
* nf_tables: fix stack overflow in nf_tables_newrule
* nf_tables: nft_ct: fix compilation warning
* nf_tables: nft_ct: fix crash with invalid packets
* nft_log: group and qthreshold are 2^16
* nf_tables: nft_meta: fix socket uid,gid handling
* nft_counter: allow to restore counters
* nf_tables: fix module autoload
* nf_tables: allow to remove all rules placed in one chain
* nf_tables: use 64-bits rule handle instead of 16-bits
* nf_tables: fix chain after rule deletion
* nf_tables: improve deletion performance
* nf_tables: add missing code in route chain type
* nf_tables: rise maximum number of expressions from 12 to 128
* nf_tables: don't delete table if in use
* nf_tables: fix basechain release
From Tomasz Bursztyka:
* nf_tables: Add support for changing users chain's name
* nf_tables: Change chain's name to be fixed sized
* nf_tables: Add support for replacing a rule by another one
* nf_tables: Update uapi nftables netlink header documentation
From Florian Westphal:
* nft_log: group is u16, snaplen u32
From Phil Oester:
* nf_tables: operational limit match
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds support for the pair of LCD controllers on the Marvell
Armada 510 SoCs. This driver supports:
- multiple contiguous scanout buffers for video and graphics
- shm backed cacheable buffer objects for X pixmaps for Vivante GPU
acceleration
- dual lcd0 and lcd1 crt operation
- video overlay on each LCD crt via DRM planes
- page flipping of the main scanout buffers
- DRM prime for buffer export/import
This driver is trivial to extend to other Armada SoCs.
Included in this commit is the core driver with no output support; output
support is platform and encoder driver dependent.
Tested-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The information of the peer's supported channels and supported operating
classes are required for the driver to perform TDLS off channel
operations. This commit enhances the function nl80211_(new)set_station
to pass this information of the peer to the driver.
Signed-off-by: Sunil Dutt <c_duttus@qti.qualcomm.com>
[return errors for malformed tuples]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
It is incorrect to refer to this as 11d as 802.11d was just a
proposed amendment, 802.11d was merged to the standard so
use proper terminology.
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Conflicts:
include/linux/netdevice.h
net/core/sock.c
Trivial merge issues.
Removal of "extern" for functions declaration in netdevice.h
at the same time "const" was added to an argument.
Two parallel line additions in net/core/sock.c
Signed-off-by: David S. Miller <davem@davemloft.net>
The names are prefixed incorrectly on the documentation.
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
[also remove spurious blank line]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch enables support for OSPM suspend and resume in the MIC
driver. During a host suspend event, the driver performs an
orderly shutdown of the cards if they are online. Upon resume, any
cards that were previously online before suspend are rebooted.
The driver performs an orderly shutdown of the card primarily to
ensure that applications in the card are terminated and mounted
devices are safely un-mounted before the card is powered down in
the event of an OSPM suspend.
The driver makes use of the MIC daemon to accomplish OSPM suspend
and resume. The driver registers a PM notifier per MIC device.
The devices get notified synchronously during PM_SUSPEND_PREPARE and
PM_POST_SUSPEND phases.
During the PM_SUSPEND_PREPARE phase, the driver performs one of the
following three tasks.
1) If the card is 'offline', the driver sets the card to a
'suspended' state and returns.
2) If the card is 'online', the driver initiates card shutdown by
setting the card state to suspending. This notifies the MIC
daemon which invokes shutdown and sets card state to 'suspended'.
The driver returns after the shutdown is complete.
3) If the card is already being shutdown, possibly by a host user
space application, the driver sets the card state to 'suspended'
and returns after the shutdown is complete.
During the PM_POST_SUSPEND phase, the driver simply notifies the
daemon and returns. The daemon boots those cards that were previously
online during the suspend phase.
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pablo Neira Ayuso says:
====================
The following patchset contains Netfilter updates for your net-next tree,
mostly ipset improvements and enhancements features, they are:
* Don't call ip_nest_end needlessly in the error path from me, suggested
by Pablo Neira Ayuso, from Jozsef Kadlecsik.
* Fixed sparse warnings about shadowed variable and missing rcu annotation
and fix of "may be used uninitialized" warnings, also from Jozsef.
* Renamed simple macro names to avoid namespace issues, reported by David
Laight, again from Jozsef.
* Use fix sized type for timeout in the extension part, and cosmetic
ordering of matches and targets separatedly in xt_set.c, from Jozsef.
* Support package fragments for IPv4 protos without ports from Anders K.
Pedersen. For example this allows a hash:ip,port ipset containing the
entry 192.168.0.1,gre:0 to match all package fragments for PPTP VPN
tunnels to/from the host. Without this patch only the first package
fragment (with fragment offset 0) was matched.
* Introduced a new operation to get both setname and family, from Jozsef.
ip[6]tables set match and SET target need to know the family of the set
in order to reject adding rules which refer to a set with a non-mathcing
family. Currently such rules are silently accepted and then ignored
instead of generating an error message to the user.
* Reworked extensions support in ipset types from Jozsef. The approach of
defining structures with all variations is not manageable as the
number of extensions grows. Therefore a blob for the extensions is
introduced, somewhat similar to conntrack. The support of extensions
which need a per data destroy function is added as well.
* When an element timed out in a list:set type of set, the garbage
collector skipped the checking of the next element. So the purging
was delayed to the next run of the gc, fixed by Jozsef.
* A small Kconfig fix: NETFILTER_NETLINK cannot be selected and
ipset requires it.
* hash:net,net type from Oliver Smith. The type provides the ability to
store pairs of subnets in a set.
* Comment for ipset entries from Oliver Smith. This makes possible to
annotate entries in a set with comments, for example:
ipset n foo hash:net,net comment
ipset a foo 10.0.0.0/21,192.168.1.0/24 comment "office nets A and B"
* Fix of hash types resizing with comment extension from Jozsef.
* Fix of new extensions for list:set type when an element is added
into a slot from where another element was pushed away from Jozsef.
* Introduction of a common function for the listing of the element
extensions from Jozsef.
* Net namespace support for ipset from Vitaly Lavrov.
* hash:net,port,net type from Oliver Smith, which makes possible
to store the triples of two subnets and a protocol, port pair in
a set.
* Get xt_TCPMSS working with net namespace, by Gao feng.
* Use the proper net netnamespace to allocate skbs, also by Gao feng.
* A couple of cleanups for the conntrack SIP helper, by Holger
Eitzenberger.
* Extend cttimeout to allow setting default conntrack timeouts via
nfnetlink, so we can get rid of all our sysctl/proc interfaces in
the future for timeout tuning, from me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a generic qualifier for transaction events, as a new sample
type that returns a flag word. This is particularly useful
for qualifying aborts: to distinguish aborts which happen
due to asynchronous events (like conflicts caused by another
CPU) versus instructions that lead to an abort.
The tuning strategies are very different for those cases,
so it's important to distinguish them easily and early.
Since it's inconvenient and inflexible to filter for this
in the kernel we report all the events out and allow
some post processing in user space.
The flags are based on the Intel TSX events, but should be fairly
generic and mostly applicable to other HTM architectures too. In addition
to various flag words there's also reserved space to report an
program supplied abort code. For TSX this is used to distinguish specific
classes of aborts, like a lock busy abort when doing lock elision.
Flags:
Elision and generic transactions (ELISION vs TRANSACTION)
(HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION)
Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC)
Retryable transaction (RETRY)
Conflicts with other threads (CONFLICT)
Transaction write capacity overflow (CAPACITY WRITE)
Transaction read capacity overflow (CAPACITY READ)
Transactions implicitely aborted can also return an abort code.
This can be used to signal specific events to the profiler. A common
case is abort on lock busy in a RTM eliding library (code 0xff)
To handle this case we include the TSX abort code
Common example aborts in TSX would be:
- Data conflict with another thread on memory read.
Flags: TRANSACTION|ASYNC|CONFLICT
- executing a WRMSR in a transaction. Flags: TRANSACTION|SYNC
- HLE transaction in user space is too large
Flags: ELISION|SYNC|CAPACITY-WRITE
The only flag that is somewhat TSX specific is ELISION.
This adds the perf core glue needed for reporting the new flag word out.
v2: Add MEM/MISC
v3: Move transaction to the end
v4: Separate capacity-read/write and remove misc
v5: Remove _SAMPLE. Move abort flags to 32bit. Rename
transaction to txn
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds two new hash policy modes which use skb_flow_dissect:
3 - Encapsulated layer 2+3
4 - Encapsulated layer 3+4
There should be a good improvement for tunnel users in those modes.
It also changes the old hash functions to:
hash ^= (__force u32)flow.dst ^ (__force u32)flow.src;
hash ^= (hash >> 16);
hash ^= (hash >> 8);
Where hash will be initialized either to L2 hash, that is
SRCMAC[5] XOR DSTMAC[5], or to flow->ports which should be extracted
from the upper layer. Flow's dst and src are also extracted based on the
xmit policy either directly from the buffer or by using skb_flow_dissect,
but in both cases if the protocol is IPv6 then dst and src are obtained by
ipv6_addr_hash() on the real addresses. In case of a non-dissectable
packet, the algorithms fall back to L2 hashing.
The bond_set_mode_ops() function is now obsolete and thus deleted
because it was used only to set the proper hash policy. Also we trim a
pointer from struct bonding because we no longer need to keep the hash
function, now there's only a single hash function - bond_xmit_hash that
works based on bond->params.xmit_policy.
The hash function and skb_flow_dissect were suggested by Eric Dumazet.
The layer names were suggested by Andy Gospodarek, because I suck at
semantics.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal sent patch to add tc user simple actions to iproute2
but required header was not being exported.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For implementing CPU=host, we need a mechanism for querying
preferred VCPU target type on underlying Host.
This patch implements KVM_ARM_PREFERRED_TARGET vm ioctl which
returns struct kvm_vcpu_init instance containing information
about preferred VCPU target type and target specific features
available for it.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/usb/qmi_wwan.c
drivers/net/wireless/brcm80211/brcmfmac/dhd_bus.h
include/net/netfilter/nf_conntrack_synproxy.h
include/net/secure_seq.h
The conflicts are of two varieties:
1) Conflicts with Joe Perches's 'extern' removal from header file
function declarations. Usually it's an argument signature change
or a function being added/removed. The resolutions are trivial.
2) Some overlapping changes in qmi_wwan.c and be.h, one commit adds
a new value, another changes an existing value. That sort of
thing.
Signed-off-by: David S. Miller <davem@davemloft.net>
Default timeouts are currently set via proc/sysctl interface, the
typical pattern is a file name like:
/proc/sys/net/netfilter/nf_conntrack_PROTOCOL_timeout_STATE
This results in one entry per default protocol state timeout.
This patch simplifies this by allowing to set default protocol
timeouts via cttimeout netlink interface.
This should allow us to get rid of the existing proc/sysctl code
in the midterm.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The kernel shouldn't accept invalid modes, just say No.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This allows us to use fewer bits in the mode structure, leaving room for
future work while allowing more stereo layouts types than we could have
ever dreamt of.
I also exposed the previously private DRM_MODE_FLAG_3D_MASK to set in
stone that we are using 5 bits for the stereo layout enum, reserving 32
values.
Even with that reservation, we gain 3 bits from the previous encoding.
The code adding the mandatory stereo modes needeed to be adapted as it was
relying or being able to or stereo layouts together.
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This capability allows user space to control the delivery of modes with
the 3D flags set. This is to not play games with current user space
users not knowing anything about stereo 3D flags and that could try
to set a mode with one or several of those bits set.
So, the plan is to remove the stereo modes from the list of modes we
give to DRM clients by default, and let them through if we are being
told otherwise.
stereo_allowed is bound to the drm_file structure to make it a
per-client setting, not a global one.
v2: Replace clearing 3D flags by discarding the stereo modes now that
they are regular modes.
v3: SET_CAP -> SET_CLIENT_CAP rename (Chris Wilson)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HDMI 1.4a defines a few layouts that we'd like to expose. This commits
add new modeinfo flags that can be used to list the supported stereo
layouts (when querying the list of modes) and to set a given stereo 3D
mode (when setting a mode).
v2: Add a drm_mode_is_stereo() helper
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This ioctl can be used to turn some knobs in a DRM driver. The client
can ask the DRM core for an alternate view of the reality: it can be
useful to be able to instruct the core that the DRM client can handle
new functionnality that would otherwise break current ABI.
v2: Rename to ioctl from SET_CAP to SET_CLIENT_CAP (Chris Wilson)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's a tiny bit more logical to find the different capabilities you can
use with the GET_CAP ioctl next to the structure rather than putting
them at the end of the file.
v2: Tab align the litterals (David Herrmann)
v3: Make it clearer that DRM_PRIME_CAP_EXPORT/IMPORT are flags of
DRM_CAP_PRIME.
v4: Rebase on top of latest bits (DRM_CAP_ASYNC_PAGE_FLIP was
introduced)
Reviewed-by: David Herrmann <dh.herrmann@gmail.com> (for v2)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm-intel-next-2013-09-21:
- clock state handling rework from Ville
- l3 parity handling fixes for hsw from Ben
- some more watermark improvements from Ville
- ban badly behaved context from Mika
- a few vlv improvements from Jesse
- VGA power domain handling from Ville
drm-intel-next-2013-09-06:
- Basic mipi dsi support from Jani. Not yet converted over to drm_bridge
since that was too fresh, but the porting is in progress already.
- More vma patches from Ben, this time the code to convert the execbuffer
code. Now that the shrinker recursion bug is tracked down we can move
ahead here again. Yay!
- Optimize hw context switching to not generate needless interrupts (Chris
Wilson). Also some shuffling for the oustanding request allocation.
- Opregion support for SWSCI, although not yet fully wired up (we need a
bit of runtime D3 support for that apparently, due to Windows design
deficiencies), from Jani Nikula.
- A few smaller changes all over.
[airlied: merge conflict fix in i9xx_set_pipeconf]
* tag 'drm-intel-next-2013-09-21-merged' of git://people.freedesktop.org/~danvet/drm-intel: (119 commits)
drm/i915: assume all GM45 Acer laptops use inverted backlight PWM
drm/i915: cleanup a min_t() cast
drm/i915: Pull intel_init_power_well() out of intel_modeset_init_hw()
drm/i915: Add POWER_DOMAIN_VGA
drm/i915: Refactor power well refcount inc/dec operations
drm/i915: Add intel_display_power_{get, put} to request power for specific domains
drm/i915: Change i915_request power well handling
drm/i915: POSTING_READ IPS_CTL before waiting for the vblank
drm/i915: don't disable ERR_INT on the IRQ handler
drm/i915/vlv: disable rc6p and rc6pp residency reporting on BYT
drm/i915/vlv: honor i915_enable_rc6 boot param on VLV
drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF
drm/i915: Do remaps for all contexts
drm/i915: Keep a list of all contexts
drm/i915: Make l3 remapping use the ring
drm/i915: Add second slice l3 remapping
drm/i915: Fix HSW parity test
drm/i915: dump crtc timings from the pipe config
drm/i915: register backlight device also when backlight class is a module
drm/i915: write D_COMP using the mailbox
...
Conflicts:
drivers/gpu/drm/i915/intel_display.c
This adds the core support for having comments on ipset entries.
The comments are stored as standard null-terminated strings in
dynamically allocated memory after being passed to the kernel. As a
result of this, code has been added to the generic destroy function to
iterate all extensions and call that extension's destroy task if the set
has that extension activated, and if such a task is defined.
Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
ip[6]tables set match and SET target need to know the family of the set
in order to reject adding rules which refer to a set with a non-mathcing
family. Currently such rules are silently accepted and then ignored
instead of generating a clear error message to the user, which is not
helpful.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
We need/want the mei fixes in here so we can apply other updates that
are depending on them.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull drm fixes from Dave Airlie:
"Nothing too major, radeon still has some dpm changes for off by
default.
Radeon, intel, msm:
- radeon: a few more dpm fixes (still off by default), uvd fixes
- i915: runtime warn backtrace and regression fix
- msm: iommu changes fallout"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (27 commits)
drm/msm: use drm_gem_dumb_destroy helper
drm/msm: deal with mach/iommu.h removal
drm/msm: Remove iommu include from mdp4_kms.c
drm/msm: Odd PTR_ERR usage
drm/i915: Fix up usage of SHRINK_STOP
drm/radeon: fix hdmi audio on DCE3.0/3.1 asics
drm/i915: preserve pipe A quirk in i9xx_set_pipeconf
drm/i915/tv: clear adjusted_mode.flags
drm/i915/dp: increase i2c-over-aux retry interval on AUX DEFER
drm/radeon/cik: fix overflow in vram fetch
drm/radeon: add missing hdmi callbacks for rv6xx
drm/i915: Use a temporary va_list for two-pass string handling
drm/radeon/uvd: lower msg&fb buffer requirements on UVD3
drm/radeon: disable tests/benchmarks if accel is disabled
drm/radeon: don't set default clocks for SI when DPM is disabled
drm/radeon/dpm/ci: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/si: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/ni: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/btc: filter clocks based on voltage/clk dep tables
drm/radeon/dpm: fetch the max clk from voltage dep tables helper
...
As mentioned in commit afe4fd0624 ("pkt_sched: fq: Fair Queue packet
scheduler"), this patch adds a new socket option.
SO_MAX_PACING_RATE offers the application the ability to cap the
rate computed by transport layer. Value is in bytes per second.
u32 val = 1000000;
setsockopt(sockfd, SOL_SOCKET, SO_MAX_PACING_RATE, &val, sizeof(val));
To be effectively paced, a flow must use FQ packet scheduler.
Note that a packet scheduler takes into account the headers for its
computations. The effective payload rate depends on MSS and retransmits
if any.
I chose to make this pacing rate a SOL_SOCKET option instead of a
TCP one because this can be used by other protocols.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
More radeon fixes for 3.12. Kind of all over the place: UVD, DPM,
tiling, etc.
* 'drm-fixes-3.12' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix hdmi audio on DCE3.0/3.1 asics
drm/radeon/cik: fix overflow in vram fetch
drm/radeon: add missing hdmi callbacks for rv6xx
drm/radeon/uvd: lower msg&fb buffer requirements on UVD3
drm/radeon: disable tests/benchmarks if accel is disabled
drm/radeon: don't set default clocks for SI when DPM is disabled
drm/radeon/dpm/ci: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/si: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/ni: filter clocks based on voltage/clk dep tables
drm/radeon/dpm/btc: filter clocks based on voltage/clk dep tables
drm/radeon/dpm: fetch the max clk from voltage dep tables helper
drm/radeon: fix missed variable sized access
drm/radeon: Make r100_cp_ring_info() and radeon_ring_gfx() safe (v2)
drm/radeon/cik: Add tiling mode index for 1D tiled depth/stencil surfaces
drm/radeon/cik: Fix encoding of number of banks in tiling configuration info
drm/radeon/cik: Fix printing of client name on VM protection fault
drm/radeon: additional gcc fixes for radeon_atombios.c
drm/radeon: avoid UVD corruption on AGP cards using GPU gart
The following warning from mic_ioctl.h is fixed via this patch:
found __[us]{8,16,32,64} type without #include <linux/types.h>
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* pci/misc:
PCI: Remove unused PCI_MSIX_FLAGS_BIRMASK definition
PCI: acpiphp_ibm: Convert to dynamic debug
PCI: acpiphp: Convert to dynamic debug
PCI: Remove Intel Haswell D3 delays
PCI: Pass type, width, and prefetchability for window alignment
PCI: Document reason for using pci_is_root_bus()
PCI: Use pci_is_root_bus() to check for root bus
PCI: Remove unused "is_pcie" from pci_dev structure
PCI: Update pci_find_slot() description in pci.txt
[SCSI] qla2xxx: Use standard PCIe Capability Link register field names
PCI: Fix comment typo, remove unnecessary !! in pci_is_pcie()
PCI: Drop "setting latency timer" messages
PCI_MSIX_FLAGS_BIRMASK has been replaced by PCI_MSIX_TABLE_BIR for better
readability. Now no one uses it, remove it. No functional change.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch introduces the host "Virtio over PCIe" interface for
Intel MIC. It allows creating user space backends on the host and instantiating
virtio devices for them on the Intel MIC card. It uses the existing VRINGH
infrastructure in the kernel to access virtio rings from the host. A character
device per MIC is exposed with IOCTL, mmap and poll callbacks. This allows the
user space backend to:
(a) add/remove a virtio device via a device page.
(b) map (R/O) virtio rings and device page to user space.
(c) poll for availability of data.
(d) copy a descriptor or entire descriptor chain to/from the card.
(e) modify virtio configuration.
(f) handle virtio device reset.
The buffers are copied over using CPU copies for this initial patch
and host initiated MIC DMA support is planned for future patches.
The avail and desc virtio rings are in host memory and the used ring
is in card memory to maximize writes across PCIe for performance.
Co-author: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch enables the following features:
a) Boots and shuts down the card via sysfs entries.
b) Allocates and maps a device page for communication with the
card driver and updates the device page address via scratchpad
registers.
c) Provides sysfs entries for shutdown status, kernel command line,
ramdisk and log buffer information.
Co-author: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Caz Yokoyama <Caz.Yokoyama@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Harshavardhan R Kharche <harshavardhan.r.kharche@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Acked-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
Reviewed-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In order to send and receive ISO7816 APDUs to and from NFC embedded
secure elements, we define a specific netlink command.
On a typical SE use case, host applications will send very few APDUs
(Less than 10) per transaction. This is why we decided to go for a
simple netlink API. Defining another NFC socket protocol for such low
traffic would have been overengineered.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSQMORAAoJEHm+PkMAQRiGj14H/1bjhtfNjPdX7MVQAzA+WpwX
s7h1IQu2Si9S5S1lBiM2sBTOssVcmfheO9x4yqm7JNOD1RnssWKOM3q+zVOLstwd
GD3gluJPeraD5EyYSqEJ9ILPQ3gbxb4wOlT0Z291TW6E8XhLRr0RTOJPksRsgvLH
Ckm9uJh6ArS6ZXfXiaDQfd+xHAQJkUfW6nMSA0g9ZO9C6KIDRvcbUmrY3m4HhfIk
mK0TXCBs+AXGDIjTEB8JgIQL/5y1Qn0c4R+2uTU/4YWwyLvJTV1e44kGoleukMMT
6Pw/TNlUEN161dbSaqCyF3sfXHDYQ5valycI2PDgitMtPSxbzsU1VDizS8+daRg=
=lEmF
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc2' into drm-intel-next
Backmerge Linux 3.12-rc2 to prep for a bunch of -next patches:
- Header cleanup in intel_drv.h, both changed in -fixes and my current
-next pile.
- Cursor handling cleanup for -next which depends upon the cursor
handling fix merged into -rc2.
All just trivial conflicts of the "changed adjacent lines" type:
drivers/gpu/drm/i915/i915_gem.c
drivers/gpu/drm/i915/intel_display.c
drivers/gpu/drm/i915/intel_drv.h
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
pci_is_pcie() and pcie_capability_clear_and_set_word() make it trivial
to set the PCIe Completion Timeout, so just fold the
csio_set_pcie_completion_timeout() function into its caller.
[bhelgaas: changelog, fold csio_set_pcie_completion_timeout() into caller]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Naresh Kumar Inna <naresh@chelsio.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jesper Juhl <jj@chaosbits.net>
This file is copied to the source code of user space applications (in
this case can-utils) and so it makes sense to mention explicitly their
copyright.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
These files are copied to the source code of user space applications (in
this case can-utils) and so it makes sense to mention explicitly their
copyright. I added the terms of C code that was introduced in the same
commit as these headers.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Acked-by: Urs Thuermann <urs.thuermann@volkswagen.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
CIK uses a different index for 1D DST surfaces compared to SI. Expose
the new index so libdrm_radeon can use it properly for userspace
drivers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
HTB already can deal with 64bit rates, we only have to add two new
attributes so that tc can use them to break the current 32bit ABI
barrier.
TCA_HTB_RATE64 : class rate (in bytes per second)
TCA_HTB_CEIL64 : class ceil (in bytes per second)
This allows us to setup HTB on 40Gbps links, as 32bit limit is
actually ~34Gbps
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Solve the problems around the broken definition of perf_event_mmap_page::
cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
fixed by:
860f085b74 ("perf: Fix broken union in 'struct perf_event_mmap_page'")
The problem with the fix (merged in v3.12-rc1 and not yet released
officially), noticed by Vince Weaver is that the new behavior is
not detectable by new user-space, and that due to the reuse of the
field names it's easy to mis-compile a binary if old headers are used
on a new kernel or new headers are used on an old kernel.
To solve all that make this change explicit, detectable and self-contained,
by iterating the ABI the following way:
- Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not
confuse old user-space binaries. RDPMC will be marked as unavailable
to old binaries but that's within the ABI, this is a capability bit.
- Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new
libraries can reliably detect that bit 0 is deprecated and perma-zero
without having to check the kernel version.
- Use bits 2, 3, 4 for the newly defined, correct functionality:
cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts */
cap_user_time : 1, /* The time_* fields are used */
cap_user_time_zero : 1, /* The time_zero field is used */
- Rename all the bitfield names in perf_event.h to be different from the
old names, to make sure it's not possible to mis-compile it
accidentally with old assumptions.
The 'size' field can then be used in the future to add new fields and it
will act as a natural ABI version indicator as well.
Also adjust tools/perf/ userspace for the new definitions, noticed by
Adrian Hunter.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Also-Fixed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For some mysterious reason the sample_id field of PERF_RECORD_MMAP went AWOL.
Reported-by: Vince Weaver <vince@deater.net>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Certain HSW SKUs have a second bank of L3. This L3 remapping has a
separate register set, and interrupt from the first "slice". A slice is
simply a term to define some subset of the GPU's l3 cache. This patch
implements both the interrupt handler, and ability to communicate with
userspace about this second slice.
v2: Remove redundant check about non-existent slice.
Change warning about interrupts of unknown slices to WARN_ON_ONCE
Handle the case where we get 2 slice interrupts concurrently, and switch
the tracking of interrupts to be non-destructive (all Ville)
Don't enable/mask the second slice parity interrupt for ivb/vlv (even
though all docs I can find claim it's rsvd) (Ville + Bryan)
Keep BYT excluded from L3 parity
v3: Fix the slice = ffs to be decremented by one (found by Ville). When
I initially did my testing on the series, I was using 1-based slice
counting, so this code was correct. Not sure why my simpler tests that
I've been running since then didn't pick it up sooner.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Without the following patch I have problems compiling code using
the new PERF_EVENT_IOC_ID ioctl(). It looks like u64 was used
instead of __u64
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1309171450380.11444@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the architecture gone, any references to it are no longer needed.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Pull input update from Dmitry Torokhov:
"The only change is David Hermann's new EVIOCREVOKE evdev ioctl that
allows safely passing file descriptors to input devices to session
processes and later being able to stop delivery of events through
these fds so that inactive sessions will no longer receive user input
that does not belong to them"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: evdev - add EVIOCREVOKE ioctl
Pull vfs pile 4 from Al Viro:
"list_lru pile, mostly"
This came out of Andrew's pile, Al ended up doing the merge work so that
Andrew didn't have to.
Additionally, a few fixes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (42 commits)
super: fix for destroy lrus
list_lru: dynamically adjust node arrays
shrinker: Kill old ->shrink API.
shrinker: convert remaining shrinkers to count/scan API
staging/lustre/libcfs: cleanup linux-mem.h
staging/lustre/ptlrpc: convert to new shrinker API
staging/lustre/obdclass: convert lu_object shrinker to count/scan API
staging/lustre/ldlm: convert to shrinkers to count/scan API
hugepage: convert huge zero page shrinker to new shrinker API
i915: bail out earlier when shrinker cannot acquire mutex
drivers: convert shrinkers to new count/scan API
fs: convert fs shrinkers to new scan/count API
xfs: fix dquot isolation hang
xfs-convert-dquot-cache-lru-to-list_lru-fix
xfs: convert dquot cache lru to list_lru
xfs: rework buffer dispose list tracking
xfs-convert-buftarg-lru-to-generic-code-fix
xfs: convert buftarg LRU to generic code
fs: convert inode and dentry shrinking to be node aware
vmscan: per-node deferred work
...
Pull btrfs updates from Chris Mason:
"This is against 3.11-rc7, but was pulled and tested against your tree
as of yesterday. We do have two small incrementals queued up, but I
wanted to get this bunch out the door before I hop on an airplane.
This is a fairly large batch of fixes, performance improvements, and
cleanups from the usual Btrfs suspects.
We've included Stefan Behren's work to index subvolume UUIDs, which is
targeted at speeding up send/receive with many subvolumes or snapshots
in place. It closes a long standing performance issue that was built
in to the disk format.
Mark Fasheh's offline dedup work is also here. In this case offline
means the FS is mounted and active, but the dedup work is not done
inline during file IO. This is a building block where utilities are
able to ask the FS to dedup a series of extents. The kernel takes
care of verifying the data involved really is the same. Today this
involves reading both extents, but we'll continue to evolve the
patches"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
Btrfs: optimize key searches in btrfs_search_slot
Btrfs: don't use an async starter for most of our workers
Btrfs: only update disk_i_size as we remove extents
Btrfs: fix deadlock in uuid scan kthread
Btrfs: stop refusing the relocation of chunk 0
Btrfs: fix memory leak of uuid_root in free_fs_info
btrfs: reuse kbasename helper
btrfs: return btrfs error code for dev excl ops err
Btrfs: allow partial ordered extent completion
Btrfs: convert all bug_ons in free-space-cache.c
Btrfs: add support for asserts
Btrfs: adjust the fs_devices->missing count on unmount
Btrf: cleanup: don't check for root_refs == 0 twice
Btrfs: fix for patch "cleanup: don't check the same thing twice"
Btrfs: get rid of one BUG() in write_all_supers()
Btrfs: allocate prelim_ref with a slab allocater
Btrfs: pass gfp_t to __add_prelim_ref() to avoid always using GFP_ATOMIC
Btrfs: fix race conditions in BTRFS_IOC_FS_INFO ioctl
Btrfs: fix race between removing a dev and writing sbs
Btrfs: remove ourselves from the cluster list under lock
...
Pull CIFS fixes from Steve French:
"CIFS update including case insensitive file name matching improvements
for UTF-8 to Unicode, various small cifs fixes, SMB2/SMB3 leasing
improvements, support for following SMB2 symlinks, SMB3 packet signing
improvements"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6: (25 commits)
CIFS: Respect epoch value from create lease context v2
CIFS: Add create lease v2 context for SMB3
CIFS: Move parsing lease buffer to ops struct
CIFS: Move creating lease buffer to ops struct
CIFS: Store lease state itself rather than a mapped oplock value
CIFS: Replace clientCanCache* bools with an integer
[CIFS] quiet sparse compile warning
cifs: Start using per session key for smb2/3 for signature generation
cifs: Add a variable specific to NTLMSSP for key exchange.
cifs: Process post session setup code in respective dialect functions.
CIFS: convert to use le32_add_cpu()
CIFS: Fix missing lease break
CIFS: Fix a memory leak when a lease break comes
cifs: add winucase_convert.pl to Documentation/ directory
cifs: convert case-insensitive dentry ops to use new case conversion routines
cifs: add new case-insensitive conversion routines that are based on wchar_t's
[CIFS] Add Scott to list of cifs contributors
cifs: Move and expand MAX_SERVER_SIZE definition
cifs: Expand max share name length to 256
cifs: Move string length definitions to uapi
...
This series reworks our current object cache shrinking infrastructure in
two main ways:
* Noticing that a lot of users copy and paste their own version of LRU
lists for objects, we put some effort in providing a generic version.
It is modeled after the filesystem users: dentries, inodes, and xfs
(for various tasks), but we expect that other users could benefit in
the near future with little or no modification. Let us know if you
have any issues.
* The underlying list_lru being proposed automatically and
transparently keeps the elements in per-node lists, and is able to
manipulate the node lists individually. Given this infrastructure, we
are able to modify the up-to-now hammer called shrink_slab to proceed
with node-reclaim instead of always searching memory from all over like
it has been doing.
Per-node lru lists are also expected to lead to less contention in the lru
locks on multi-node scans, since we are now no longer fighting for a
global lock. The locks usually disappear from the profilers with this
change.
Although we have no official benchmarks for this version - be our guest to
independently evaluate this - earlier versions of this series were
performance tested (details at
http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no
visible performance regressions while yielding a better qualitative
behavior in NUMA machines.
With this infrastructure in place, we can use the list_lru entry point to
provide memcg isolation and per-memcg targeted reclaim. Historically,
those two pieces of work have been posted together. This version presents
only the infrastructure work, deferring the memcg work for a later time,
so we can focus on getting this part tested. You can see more about the
history of such work at http://lwn.net/Articles/552769/
Dave Chinner (18):
dcache: convert dentry_stat.nr_unused to per-cpu counters
dentry: move to per-sb LRU locks
dcache: remove dentries from LRU before putting on dispose list
mm: new shrinker API
shrinker: convert superblock shrinkers to new API
list: add a new LRU list type
inode: convert inode lru list to generic lru list code.
dcache: convert to use new lru list infrastructure
list_lru: per-node list infrastructure
shrinker: add node awareness
fs: convert inode and dentry shrinking to be node aware
xfs: convert buftarg LRU to generic code
xfs: rework buffer dispose list tracking
xfs: convert dquot cache lru to list_lru
fs: convert fs shrinkers to new scan/count API
drivers: convert shrinkers to new count/scan API
shrinker: convert remaining shrinkers to count/scan API
shrinker: Kill old ->shrink API.
Glauber Costa (7):
fs: bump inode and dentry counters to long
super: fix calculation of shrinkable objects for small numbers
list_lru: per-node API
vmscan: per-node deferred work
i915: bail out earlier when shrinker cannot acquire mutex
hugepage: convert huge zero page shrinker to new shrinker API
list_lru: dynamically adjust node arrays
This patch:
There are situations in very large machines in which we can have a large
quantity of dirty inodes, unused dentries, etc. This is particularly true
when umounting a filesystem, where eventually since every live object will
eventually be discarded.
Dave Chinner reported a problem with this while experimenting with the
shrinker revamp patchset. So we believe it is time for a change. This
patch just moves int to longs. Machines where it matters should have a
big long anyway.
Signed-off-by: Glauber Costa <glommer@openvz.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
device-mapper device. This dm-stats code required the reintroduction of
a div64_u64_rem() helper, but as a separate method that doesn't slow
down div64_u64() -- especially on 32-bit systems.
Allow the error target to replace request-based DM devices
(e.g. multipath) in addition to bio-based DM devices.
Various other small code fixes and improvements to thin-provisioning, DM
cache and the DM ioctl interface.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJSLyNnAAoJEMUj8QotnQNaXVEIAKA1l43enaGiROBZEZXgAGUY
1JUsnHES4ujyn/jtT39jPTQf9AW/rS4FUCrZiXG2aaNHXo7+7cdVoBHAiWc7mXad
budBSqn47W7WDyFlQarKwsuYFcdLnqdnieRDMXQ1cN5dl4Rx61LclnsylQd4SSS0
lznXkfOTquetDSuEPOuUHJDZufdacw3PpxWbTKGJld40fd7YZfGWQoG0ek1OeqqL
fA30DTlYnkFyhheLCjFcDY6H55Rt7QpBWOUAa2XXLR6GLfk5iFK99autjWk2xTPT
nppRwQrw9VH+HdW0jGLU+LRs1Y3nxwT9OBLWt9wav87Smdg/7jQAjwde9eKbO2k=
=3ooH
-----END PGP SIGNATURE-----
Merge tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"Add the ability to collect I/O statistics on user-defined regions of a
device-mapper device. This dm-stats code required the reintroduction
of a div64_u64_rem() helper, but as a separate method that doesn't
slow down div64_u64() -- especially on 32-bit systems.
Allow the error target to replace request-based DM devices (e.g.
multipath) in addition to bio-based DM devices.
Various other small code fixes and improvements to thin-provisioning,
DM cache and the DM ioctl interface"
* tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm stripe: silence a couple sparse warnings
dm: add statistics support
dm thin: always return -ENOSPC if no_free_space is set
dm ioctl: cleanup error handling in table_load
dm ioctl: increase granularity of type_lock when loading table
dm ioctl: prevent rename to empty name or uuid
dm thin: set pool read-only if breaking_sharing fails block allocation
dm thin: prefix pool error messages with pool device name
dm: allow error target to replace bio-based and request-based targets
math64: New separate div64_u64_rem helper
dm space map: optimise sm_ll_dec and sm_ll_inc
dm btree: prefetch child nodes when walking tree for a dm_btree_del
dm btree: use pop_frame in dm_btree_del to cleanup code
dm cache: eliminate holes in cache structure
dm cache: fix stacking of geometry limits
dm thin: fix stacking of geometry limits
dm thin: add data block size limits to Documentation
dm cache: add data block size limits to code and Documentation
dm cache: document metadata device is exclussive to a cache
dm: stop using WQ_NON_REENTRANT
For 3.12-rc1 there are a number of bugfixes in addition to work to ease usage
of shared code between libxfs and the kernel, the rest of the work to enable
project and group quotas to be used simultaneously, performance optimisations
in the log and the CIL, directory entry file type support, fixes for log space
reservations, some spelling/grammar cleanups, and the addition of user
namespace support.
- introduce readahead to log recovery
- add directory entry file type support
- fix a number of spelling errors in comments
- introduce new Q_XGETQSTATV quotactl for project quotas
- add USER_NS support
- log space reservation rework
- CIL optimisations
- kernel/userspace libxfs rework
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJSLeikAAoJENaLyazVq6ZOciEP/3tc850sQsPlNwP9aqd1l2Wk
S1RJ8i+MUQ2W/PlbswCXvdUCT8DIwXWxL31tGvi8vtaLhh6t8ICSZwqNil+/GCIJ
BErVvY4oXhEMHhlbIRRvpxblTfJGiYy3puUEz9VI0yDdUVnC33+DuEeLTQ/0mibo
/UUqKFmM3KYpOc8vIQvH5K5i8PkjtMt9yge0k4l9COD30gtY2okkaD4b1voOsKc+
5YFqulq7zcXBUYti+EFCQeV8aUBTGEPN4PJRdcS12/ylzsTzZivAOO+QREu7qBW8
x+Gj8fOC+yYWCttmJlfa1n8taxge3ndEuzKN97nvvfQgjvvunMvwJ499skryYVdB
EcPnBnpDUQuz/y7exKBT9uROK817vZBtfHzSova29ayQSWC+qDpNE4xXeDIqeCtT
CPxdHuWMOvIdZg41E4x7je0elaZl8EAZ8hycc2WuRhtukEkIdE1O8aD7IVrMYee8
kg+aVHG5nmYRInO1WuMinbtiCzwvVoBJToWM3y4cbfgW0dILASRyL53HDd+eCr1j
kOpPIVgXlBZgiPMmdYahWxyVVWcE7zyex0w4frzWVlJMZ4lP5brppD6qfQg1JwOB
z21Y95F5C2GxSyN/Lwps0G6jujHrpe6GVeYK7uKCtnqTD83nSShv5Naln7pQ3AUs
qUMsqmJob4+bwt94Xgbx
=V4s4
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs
Pull xfs updates from Ben Myers:
"For 3.12-rc1 there are a number of bugfixes in addition to work to
ease usage of shared code between libxfs and the kernel, the rest of
the work to enable project and group quotas to be used simultaneously,
performance optimisations in the log and the CIL, directory entry file
type support, fixes for log space reservations, some spelling/grammar
cleanups, and the addition of user namespace support.
- introduce readahead to log recovery
- add directory entry file type support
- fix a number of spelling errors in comments
- introduce new Q_XGETQSTATV quotactl for project quotas
- add USER_NS support
- log space reservation rework
- CIL optimisations
- kernel/userspace libxfs rework"
* tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs: (112 commits)
xfs: XFS_MOUNT_QUOTA_ALL needed by userspace
xfs: dtype changed xfs_dir2_sfe_put_ino to xfs_dir3_sfe_put_ino
Fix wrong flag ASSERT in xfs_attr_shortform_getvalue
xfs: finish removing IOP_* macros.
xfs: inode log reservations are too small
xfs: check correct status variable for xfs_inobt_get_rec() call
xfs: inode buffers may not be valid during recovery readahead
xfs: check LSN ordering for v5 superblocks during recovery
xfs: btree block LSN escaping to disk uninitialised
XFS: Assertion failed: first <= last && last < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568
xfs: fix bad dquot buffer size in log recovery readahead
xfs: don't account buffer cancellation during log recovery readahead
xfs: check for underflow in xfs_iformat_fork()
xfs: xfs_dir3_sfe_put_ino can be static
xfs: introduce object readahead to log recovery
xfs: Simplify xfs_ail_min() with list_first_entry_or_null()
xfs: Register hotcpu notifier after initialization
xfs: add xfs sb v4 support for dirent filetype field
xfs: Add write support for dirent filetype field
xfs: Add read-only support for dirent filetype field
...
an external user interface exported to allow other modules to hold
references to VFIO groups, a fix to test for extended config space
on PCIe and PCI-x, and new hot reset interfaces for PCI devices
which allows the user to do PCI bus/slot resets when all of the
devices affected by the reset are owned by the user.
For this last feature, the PCI bus reset interface, I depend on
changes already merged from Bjorn's PCI pull request. I therefore
merged my tree up to commit cb3e433, which I think was the correct
action, but as Stephen Rothwell noted, I failed to provide a commit
message indicating why the merge was required. Sorry for that.
Thanks,
Alex
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJSLKr0AAoJECObm247sIsiouMP/1rym/JDjSsjIONvjnQEH5hk
aZy3gGjqvSBBSN2aWqYkEgsSmgh4lSdDAbSBpY6vfz4Q0h7OXhSZcZ5oG2eGECia
rN6GvfI/g2mYBCHJbS30bEkpljx9ssMZi7n9dnWOWEkzXSTC/uLHx9DCUiHdczPF
SEeEHbxAnCTL5S5SQgUwhjPI813eVPjm94KDZK6pHcd14GGtM/piGjJNkyfzymRZ
0l7lu95YaY70qL1Bk1EZ2TlsqWtYeKfDXsp465U3uBj5eXslMKpxXBn/XndRWu25
ZnjRitF/izMZdcBgIM4uVK/yDhdAI2qav1rtkVirqp9Qrs9BgJXxUpO3p0e/GhFl
KZgq3DnE1aUfJlyu+YGmjIKWDT+CBrkkxidRwYoaBCs3if/1E19zUH5Gfuuimy/D
rpET3HPuJlPhxyE2vH9u6EVTybBx4wlvDP43FyhetriQjwv+SuPP2oNWD96L/qBY
BHcSMHfkyq90Lp6tjCeTISkwOb66qqkRUlzycPiB67BgrIea8yn14lKpE7GLcgmO
XSLLRvv6tFD2VcqEWQZNLoLo8XqAngvgStbOXxiTFpSQD4h2Z0TbpSc9qdVZlgIt
JBMA0BuDj+gzCDePUlsuM5L1EVOaFuKSHiZA6MUewDBqiiKoWPPsDKsFpKt2jFv4
Mp47JcWkDBg1DCC8nQ7f
=FAuJ
-----END PGP SIGNATURE-----
Merge tag 'vfio-v3.12-rc0' of git://github.com/awilliam/linux-vfio
Pull VFIO update from Alex Williamson:
"VFIO updates include safer default file flags for VFIO device fds, an
external user interface exported to allow other modules to hold
references to VFIO groups, a fix to test for extended config space on
PCIe and PCI-x, and new hot reset interfaces for PCI devices which
allows the user to do PCI bus/slot resets when all of the devices
affected by the reset are owned by the user.
For this last feature, the PCI bus reset interface, I depend on
changes already merged from Bjorn's PCI pull request. I therefore
merged my tree up to commit cb3e433, which I think was the correct
action, but as Stephen Rothwell noted, I failed to provide a commit
message indicating why the merge was required. Sorry for that.
Thanks, Alex"
* tag 'vfio-v3.12-rc0' of git://github.com/awilliam/linux-vfio:
vfio: fix documentation
vfio-pci: PCI hot reset interface
vfio-pci: Test for extended config space
vfio-pci: Use fdget() rather than eventfd_fget()
vfio: Add O_CLOEXEC flag to vfio device fd
vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
vfio: add external user support
MAX_SERVER_SIZE has been moved to cifs_mount.h and renamed
CIFS_NI_MAXHOST for clarity. It has been expanded to 1024 as the
previous value of 16 was very short.
Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The old max share name length limit was 80 due to Windows NET SHARE
command not allowing more than that. However, share names can be much
longer. This is a more reasonable maximum share name length.
Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The max string length definitions for user name, domain name, password,
and share name have been moved into their own header file in uapi so the
mount helper can use autoconf to define them instead of keeping the
kernel side and userland side definitions in sync manually. The names
have also been standardized with a "CIFS" prefix and "LEN" suffix.
Signed-off-by: Scott Lovenberg <scott.lovenberg@gmail.com>
Reviewed-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Pull NVM Express driver update from Matthew Wilcox.
* git://git.infradead.org/users/willy/linux-nvme:
NVMe: Merge issue on character device bring-up
NVMe: Handle ioremap failure
NVMe: Add pci suspend/resume driver callbacks
NVMe: Use normal shutdown
NVMe: Separate controller init from disk discovery
NVMe: Separate queue alloc/free from create/delete
NVMe: Group pci related actions in functions
NVMe: Disk stats for read/write commands only
NVMe: Bring up cdev on set feature failure
NVMe: Fix checkpatch issues
NVMe: Namespace IDs are unsigned
NVMe: Update nvme_id_power_state with latest spec
NVMe: Split header file into user-visible and kernel-visible pieces
NVMe: Call nvme_process_cq from submission path
NVMe: Remove "process_cq did something" message
NVMe: Return correct value from interrupt handler
NVMe: Disk IO statistics
NVMe: Restructure MSI / MSI-X setup
NVMe: Use kzalloc instead of kmalloc+memset
Pull security subsystem updates from James Morris:
"Nothing major for this kernel, just maintenance updates"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (21 commits)
apparmor: add the ability to report a sha1 hash of loaded policy
apparmor: export set of capabilities supported by the apparmor module
apparmor: add the profile introspection file to interface
apparmor: add an optional profile attachment string for profiles
apparmor: add interface files for profiles and namespaces
apparmor: allow setting any profile into the unconfined state
apparmor: make free_profile available outside of policy.c
apparmor: rework namespace free path
apparmor: update how unconfined is handled
apparmor: change how profile replacement update is done
apparmor: convert profile lists to RCU based locking
apparmor: provide base for multiple profiles to be replaced at once
apparmor: add a features/policy dir to interface
apparmor: enable users to query whether apparmor is enabled
apparmor: remove minimum size check for vmalloc()
Smack: parse multiple rules per write to load2, up to PAGE_SIZE-1 bytes
Smack: network label match fix
security: smack: add a hash table to quicken smk_find_entry()
security: smack: fix memleak in smk_write_rules_list()
xattr: Constify ->name member of "struct xattr".
...
If we have multiple sessions on a system, we normally don't want
background sessions to read input events. Otherwise, it could capture
passwords and more entered by the user on the foreground session. This is
a real world problem as the recent XMir development showed:
http://mjg59.dreamwidth.org/27327.html
We currently rely on sessions to release input devices when being
deactivated. This relies on trust across sessions. But that's not given on
usual systems. We therefore need a way to control which processes have
access to input devices.
With VTs the kernel simply routed them through the active /dev/ttyX. This
is not possible with evdev devices, though. Moreover, we want to avoid
routing input-devices through some dispatcher-daemon in userspace (which
would add some latency).
This patch introduces EVIOCREVOKE. If called on an evdev fd, this revokes
device-access irrecoverably for that *single* open-file. Hence, once you
call EVIOCREVOKE on any dup()ed fd, all fds for that open-file will be
rather useless now (but still valid compared to close()!). This allows us
to pass fds directly to session-processes from a trusted source. The
source keeps a dup()ed fd and revokes access once the session-process is
no longer active.
Compared to the EVIOCMUTE proposal, we can avoid the CAP_SYS_ADMIN
restriction now as there is no way to revive the fd again. Hence, a user
is free to call EVIOCREVOKE themself to kill the fd.
Additionally, this ioctl allows multi-layer access-control (again compared
to EVIOCMUTE which was limited to one layer via CAP_SYS_ADMIN). A middle
layer can simply request a new open-file from the layer above and pass it
to the layer below. Now each layer can call EVIOCREVOKE on the fds to
revoke access for all layers below, at the expense of one fd per layer.
There's already ongoing experimental user-space work which demonstrates
how it can be used:
http://lists.freedesktop.org/archives/systemd-devel/2013-August/012897.html
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull input updates from Dmitry Torokhov:
"A new driver for slidebar on Ideapad laptops and a bunch of assorted
driver fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (32 commits)
Input: add SYN_MAX and SYN_CNT constants
Input: max11801_ts - convert to devm
Input: egalax-ts - fix typo and improve text
Input: MAINTAINERS - change maintainer for cyttsp driver
Input: cyttsp4 - kill 'defined but not used' compiler warnings
Input: add driver for slidebar on Lenovo IdeaPad laptops
Input: omap-keypad - set up irq type from DT
Input: omap-keypad - enable wakeup capability for keypad.
Input: omap-keypad - clear interrupts on open
Input: omap-keypad - convert to threaded IRQ
Input: omap-keypad - use bitfiled instead of hardcoded values
Input: cyttsp4 - remove useless NULL test from cyttsp4_watchdog_timer()
Input: wacom - fix error return code in wacom_probe()
Input: as5011 - fix error return code in as5011_probe()
Input: keyboard, serio - simplify use of devm_ioremap_resource
Input: tegra-kbc - simplify use of devm_ioremap_resource
Input: htcpen - fix incorrect placement of __initdata
Input: qt1070 - add power management ops
Input: wistron_btns - add MODULE_DEVICE_TABLE
Input: wistron_btns - mark the Medion MD96500 keymap as tested
...
This reverts commits 61e00655e9, 73f8645db1 and 8e22ecb603c8:
"Input: introduce BTN/ABS bits for drums and guitars"
"HID: wiimote: add support for Guitar-Hero drums"
"HID: wiimote: add support for Guitar-Hero guitars"
The extra new ABS_xx values resulted in ABS_MAX no longer being a
power-of-two, which broke the comparison logic. It also caused the
ioctl numbers to overflow into the next byte, causing problems for that.
We'll try again for 3.13.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>