1. Assorted changes
1.1 allow more feature bits for the guest
1.2 Store breaking event address on program interrupts
2. Interrupt handling rework
2.1 Fix copy_to_user while holding a spinlock (cc stable)
2.2 Rework floating interrupts to follow the priorities
2.3 Allow to inject all local interrupts via new ioctl
2.4 allow to get/set the full local irq state, e.g. for migration
and introspection
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJVGvEUAAoJEBF7vIC1phx82tEP/3KrwsDRs+buiBqyv9k+qCFV
v+R94gReBB5ggfbGfUYgBJMR2/4XQ+0jcZ55jfBCC4osOq6Juw/8HIj2nSgbQHmz
F9Go0n8IqJ3DnqPTc0KYdFZ7kqDvMV5ME3XJrFiAHv1TUL9H/KpZArkcVIwD2NOo
w01AVrCDY4bTajYqKShzGFymQl1K5vTGGvgxhh4kAHct4Nt5N5HFmyROm0RrsFZx
Sycx4t177O7zhCN2tv5Zy8iWaEvzHAESoXkhZ2cJ6t+FXii2Eov5IgyyfYRXBfbm
YACyvlFD087UdFGTt85ggPVS/S/5hn9xXmVHuIimHeyZU7CXCN5vYPcn+ZyksYr5
uA8+/2OPAgcaeDa2f7nCjl8jmcLR3hkQ0n/urA+pPYAZANJoFDfiGOr/kVk6aKff
JTGSFUjNK891/IGEsdrSk2p64U5xMd8LFa3Il++kZT91gc2nrZOHNz5FGlXlkLdJ
sADeNFWhoprEt/2P4aX6W2j26L8G874XkldDSjrS41U8L55+IiEm09r8oAWgfc5A
pryeDaN4nSjFC+HOtlPkcVkAcsswiI6nHIm3+/XFetCq+v4pnVKFMHWsTeEjiQgQ
H5aV9mfEKTJaCPrAJMsj8ZsKq0usG+BeRNqpIvxPAQB8fyl3jw9iu+RHeY1xWsTg
BRHB/+CGYIxDu4XdRexv
=Rrx5
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-20150331' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Features and fixes for 4.1 (kvm/next)
1. Assorted changes
1.1 allow more feature bits for the guest
1.2 Store breaking event address on program interrupts
2. Interrupt handling rework
2.1 Fix copy_to_user while holding a spinlock (cc stable)
2.2 Rework floating interrupts to follow the priorities
2.3 Allow to inject all local interrupts via new ioctl
2.4 allow to get/set the full local irq state, e.g. for migration
and introspection
Fixes page refcounting issues in our Stage-2 page table management code,
fixes a missing unlock in a gicv3 error path, and fixes a race that can
cause lost interrupts if signals are pending just prior to entering the
guest.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVBs95AAoJEEtpOizt6ddyAfoH/Rwj2T38ZDikImPpgfeFrmJs
ZlWC+Z3akwjVHPv308/MsKdyashtA7OjiMp3DOheMFMYJay/ecgY/92vFCc6uh5S
LDoXCbp+Pneth6C6bbU2Gw+aoCD07ZYCn9PeFq40MfpQUhCEGWhx41OFzHppqOZx
e+jodHRE+sBVTFUtbz+HubAfcM46f/8bP7682CEKsVZPeTSiHyeojdZEglfB37MG
ar/iC1/cyO/097vWaBqv1t1WZoHbWmMrDlzo5X+AtayVXFNdv4Ztw0Rz2kRhnLB8
8GXYawoSQoTN8FX1oyTyr5YWcWD7wDTzhcHsHS1xZHhvrdLCEcFrHeEWkuUlYjU=
=YS6j
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-fixes-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into 'kvm-next'
Fixes for KVM/ARM for 4.0-rc5.
Fixes page refcounting issues in our Stage-2 page table management code,
fixes a missing unlock in a gicv3 error path, and fixes a race that can
cause lost interrupts if signals are pending just prior to entering the
guest.
This patch adds support to migrate vcpu interrupts. Two new vcpu ioctls
are added which get/set the complete status of pending interrupts in one
go. The ioctls are marked as available with the new capability
KVM_CAP_S390_IRQ_STATE.
We can not use a ONEREG, as the number of pending local interrupts is not
constant and depends on the number of CPUs.
To retrieve the interrupt state we add an ioctl KVM_S390_GET_IRQ_STATE.
Its input parameter is a pointer to a struct kvm_s390_irq_state which
has a buffer and length. For all currently pending interrupts, we copy
a struct kvm_s390_irq into the buffer and pass it to userspace.
To store interrupt state into a buffer provided by userspace, we add an
ioctl KVM_S390_SET_IRQ_STATE. It passes a struct kvm_s390_irq_state into
the kernel and injects all interrupts contained in the buffer.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
We have introduced struct kvm_s390_irq a while ago which allows to
inject all kinds of interrupts as defined in the Principles of
Operation.
Add ioctl to inject interrupts with the extended struct kvm_s390_irq
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
data to be passed on from syndrome decoding all the way down to the
VGIC register handlers. Now as we switch the MMIO handling to be
routed through the KVM MMIO bus, it does not make sense anymore to
use that structure already from the beginning. So we keep the data in
local variables until we put them into the kvm_io_bus framework.
Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC private
structure. On that way we replace the data buffer in that structure
with a pointer pointing to a single location in a local variable, so
we get rid of some copying on the way.
With all of the virtual GIC emulation code now being registered with
the kvm_io_bus, we can remove all of the old MMIO handling code and
its dispatching functionality.
I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
because that touches a lot of code lines without any good reason.
This is based on an original patch by Nikolay.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Using the framework provided by the recent vgic.c changes, we
register a kvm_io_bus device on mapping the virtual GICv3 resources.
The distributor mapping is pretty straight forward, but the
redistributors need some more love, since they need to be tagged with
the respective redistributor (read: VCPU) they are connected with.
We use the kvm_io_bus framework to register one devices per VCPU.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Now that the code is in place for KVM to support MIPS SIMD Architecutre
(MSA) in MIPS guests, wire up the new KVM_CAP_MIPS_MSA capability.
For backwards compatibility, the capability must be explicitly enabled
in order to detect or make use of MSA from the guest.
The capability is not supported if the hardware supports MSA vector
partitioning, since the extra support cannot be tested yet and it
extends the state that the userland program would have to save.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Now that the code is in place for KVM to support FPU in MIPS KVM guests,
wire up the new KVM_CAP_MIPS_FPU capability.
For backwards compatibility, the capability must be explicitly enabled
in order to detect or make use of the FPU from the guest.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Using the framework provided by the recent vgic.c changes we register
a kvm_io_bus device when initializing the virtual GICv2.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Currently we use a lot of VGIC specific code to do the MMIO
dispatching.
Use the previous reworks to add kvm_io_bus style MMIO handlers.
Those are not yet called by the MMIO abort handler, also the actual
VGIC emulator function do not make use of it yet, but will be enabled
with the following patches.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
iodev.h contains definitions for the kvm_io_bus framework. This is
needed both by the generic KVM code in virt/kvm as well as by
architecture specific code under arch/. Putting the header file in
virt/kvm and using local includes in the architecture part seems at
least dodgy to me, so let's move the file into include/kvm, so that a
more natural "#include <kvm/iodev.h>" can be used by all of the code.
This also solves a problem later when using struct kvm_io_device
in arm_vgic.h.
Fixing up the FSF address in the GPL header and a wrong include path
on the way.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This is needed in e.g. ARM vGIC emulation, where the MMIO handling
depends on the VCPU that does the access.
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
1. Fixes
2. Implement access register mode in KVM
3. Provide a userspace post handler for the STSI instruction
4. Provide an interface for compliant memory accesses
5. Provide an interface for getting/setting the guest storage key
6. Fixup for the vector facility patches: do not announce the
vector facility in the guest for old QEMUs.
1-5 were initially shown as RFC in
http://www.spinics.net/lists/kvm/msg114720.html
some small review changes
- added some ACKs
- have the AR mode patches first
- get rid of unnecessary AR_INVAL define
- typos and language
6. two new patches
The two new patches fixup the vector support patches that were
introduced in the last pull request for QEMU versions that dont
know about vector support and guests that do. (We announce the
facility bit, but dont enable the facility so vector aware guests
will crash on vector instructions).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJVCV7NAAoJEBF7vIC1phx8ymcP/RovYmBGd7e6jBZLx4fooc97
DFuMkEdNT3bkA0x/L+SgYcExFkoAUX5KPK74mHOTmmBZHopX1AMDKQyEDacjNWBb
9CJdPffJWKKjFtC7KwrkgnDKBrOsmNLdWsLtl8aEIAxxKznvLXYsrvMrBYqdRkUh
nJsjaQueKP8AbzSLsG9N3Yilps2988VMo+wArfw0jVCCO+sWNZnYYsMXwRgyQsPb
K5+0Co/cw1wfnzy1hUWqpRWs26JLIcewLnMx9Ycoaap1V0A59t8J/9xPDHHODX3d
2wXlJsiNmoJj/kqakT4xlNTS0q7Tn2iLlJ1NNUADScR3zP7twXs17H2g8hGSVRiM
xBd0s671m4eSZB5Bk3LSf0PPLbOmTnCB1qYpXhd56an3MGbYIzRtcmU9LvbY3SzH
yxsVUGww1uvYN6A5RABDjqrbmnl9eQ4HruNUnA/fHLS6sDtYbmi7ln4bV3eE1sPa
0r7lPYKWdvyN0FO3Rb9Qwnjhd3F/uaLTWjptdLXapRLO0fD8adiYqzWmbXMZrgcD
BU1CNejIIYP/GZDOQanJoVQJWd9akUW/s6QiDMJRc87KaL13/cCWoicPi1j62ygj
gueYjz1KfKh6hoVvviTl2NgyXke2qs+bIKDrRV6VfdIgfeSN50lUxdJm61RdDA9d
IPrxUJDy8YdzH7rcMPXP
=GqOj
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-20150318' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into queue
KVM: s390: Features and fixes for 4.1 (kvm/next)
1. Fixes
2. Implement access register mode in KVM
3. Provide a userspace post handler for the STSI instruction
4. Provide an interface for compliant memory accesses
5. Provide an interface for getting/setting the guest storage key
6. Fixup for the vector facility patches: do not announce the
vector facility in the guest for old QEMUs.
1-5 were initially shown as RFC in
http://www.spinics.net/lists/kvm/msg114720.html
some small review changes
- added some ACKs
- have the AR mode patches first
- get rid of unnecessary AR_INVAL define
- typos and language
6. two new patches
The two new patches fixup the vector support patches that were
introduced in the last pull request for QEMU versions that dont
know about vector support and guests that do. (We announce the
facility bit, but dont enable the facility so vector aware guests
will crash on vector instructions).
The following point:
2. per-CPU pvclock time info is updated if the
underlying CPU changes.
Is not true anymore since "KVM: x86: update pvclock area conditionally,
on cpu migration".
Add task migration notification back.
Problem noticed by Andy Lutomirski.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
CC: stable@kernel.org # 3.11+
Provide the KVM_S390_GET_SKEYS and KVM_S390_SET_SKEYS ioctl which can be used
to get/set guest storage keys. This functionality is needed for live migration
of s390 guests that use storage keys.
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The Store System Information (STSI) instruction currently collects all
information it relays to the caller in the kernel. Some information,
however, is only available in user space. An example of this is the
guest name: The kernel always sets "KVMGuest", but user space knows the
actual guest name.
This patch introduces a new exit, KVM_EXIT_S390_STSI, guarded by a
capability that can be enabled by user space if it wants to be able to
insert such data. User space will be provided with the target buffer
and the requested STSI function code.
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
On s390, we've got to make sure to hold the IPTE lock while accessing
logical memory. So let's add an ioctl for reading and writing logical
memory to provide this feature for userspace, too.
The maximum transfer size of this call is limited to 64kB to prevent
that the guest can trigger huge copy_from/to_user transfers. QEMU
currently only requests up to one or two pages so far, so 16*4kB seems
to be a reasonable limit here.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When a VCPU is no longer running, we currently check to see if it has a
timer scheduled in the future, and if it does, we schedule a host
hrtimer to notify is in case the timer expires while the VCPU is still
not running. When the hrtimer fires, we mask the guest's timer and
inject the timer IRQ (still relying on the guest unmasking the time when
it receives the IRQ).
This is all good and fine, but when migration a VM (checkpoint/restore)
this introduces a race. It is unlikely, but possible, for the following
sequence of events to happen:
1. Userspace stops the VM
2. Hrtimer for VCPU is scheduled
3. Userspace checkpoints the VGIC state (no pending timer interrupts)
4. The hrtimer fires, schedules work in a workqueue
5. Workqueue function runs, masks the timer and injects timer interrupt
6. Userspace checkpoints the timer state (timer masked)
At restore time, you end up with a masked timer without any timer
interrupts and your guest halts never receiving timer interrupts.
Fix this by only kicking the VCPU in the workqueue function, and sample
the expired state of the timer when entering the guest again and inject
the interrupt and mask the timer only then.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Migrating active interrupts causes the active state to be lost
completely. This implements some additional bitmaps to track the active
state on the distributor and export this to user space.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
There is an interesting bug in the vgic code, which manifests itself
when the KVM run loop has a signal pending or needs a vmid generation
rollover after having disabled interrupts but before actually switching
to the guest.
In this case, we flush the vgic as usual, but we sync back the vgic
state and exit to userspace before entering the guest. The consequence
is that we will be syncing the list registers back to the software model
using the GICH_ELRSR and GICH_EISR from the last execution of the guest,
potentially overwriting a list register containing an interrupt.
This showed up during migration testing where we would capture a state
where the VM has masked the arch timer but there were no interrupts,
resulting in a hung test.
Cc: Marc Zyngier <marc.zyngier@arm.com>
Reported-by: Alex Bennee <alex.bennee@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
1. Several Fixes and enhancements
---------------------------------
- These 3 patches have cc stable:
b75f4c9 KVM: s390: Zero out current VMDB of STSI before including level3 data.
261520d KVM: s390: fix handling of write errors in the tpi handler
15462e3 KVM: s390: reinjection of irqs can fail in the tpi handler
2. SIMD support the kernel part (introduced with z13)
-----------------------------------------------------
- two KVM-generic changes in kvm.h:
1. New capability that can be enabled: KVM_CAP_S390_VECTOR_REGISTERS
2. increased padding size for sync regs in struct kvm_run to clarify that
sync regs can be larger than 1k. This is fine as this is the last
element in the structure.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJU+ab+AAoJEBF7vIC1phx8jMMQAJ1lXYJUbnOSJ7GD1aqQe4b3
pGW9rjQtZ1sZ6tH45sQeHzHg9tbD5ZHvSwMRuW2itLZXtusUXYajkR3TdHiR2bcR
karOqEducjPR6UCLdyfCAKkssZAkPyTXMnj3PRr9vpISvw53W5ZKxulIMtqSbMP0
Cl5eP1x+qz24csDV7u/S0k4Kccol43XfmVqVtZm2yVVt1qXIM1qqbUoJBTGECLHC
Ux9BbtTYC4lZl8Q/m9hyCD6YV7flhSKPKN6VbyAnRIM1LQiLCEAmppWJIbTbTI+3
m+fM52ue/QlkogGCQ7Mg9+lJkwBKfNCwneSPds4/M41Atwqz5m66L/D1rIlfD4Bh
HbnMB9FaRXIeyDAgdG/pemZQeOsnDHiJBHphQ0Q5TPl1lLbWjHFweG8uQ9vGMizf
Diy5Rk16MBVbgjYBf8Jy7ZXHisBN+gbOkzXD5j28+j3OGdzhVilFQIJu2oqv5wzW
r30ogxmdGH7ljh6bWJKj7YXZx0C3oDF/8HRkf4Fevh064c66S0jTLd8wb4YqGSy5
3qfNqovnvaSjPyKLVajmSOeFIVwtSnylFTqmTOxsC5o786BWgU0Mh6fhKGyUQlcs
PMmn5B5VrLLb8YZtX0XOWkeA45KXXeT19qb2WaB/VZSSgb4fG7W/oqqQ6LrZl0sc
pEBYwbsFvkFG+04aIYN3
=bzDF
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-20150306' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into queue
KVM: s390: Features and Fixes for 4.1 (kvm/next)
1. Several Fixes and enhancements
---------------------------------
- These 3 patches have cc stable:
b75f4c9 KVM: s390: Zero out current VMDB of STSI before including level3 data.
261520d KVM: s390: fix handling of write errors in the tpi handler
15462e3 KVM: s390: reinjection of irqs can fail in the tpi handler
2. SIMD support the kernel part (introduced with z13)
-----------------------------------------------------
- two KVM-generic changes in kvm.h:
1. New capability that can be enabled: KVM_CAP_S390_VECTOR_REGISTERS
2. increased padding size for sync regs in struct kvm_run to clarify that
sync regs can be larger than 1k. This is fine as this is the last
element in the structure.
Introduce __KVM_HAVE_ARCH_INTC_INITIALIZED define and
associated kvm_arch_intc_initialized function. This latter
allows to test whether the virtual interrupt controller is initialized
and ready to accept virtual IRQ injection. On some architectures,
the virtual interrupt controller is dynamically instantiated, justifying
that kind of check.
The new function can now be used by irqfd to check whether the
virtual interrupt controller is ready on KVM_IRQFD request. If not,
KVM_IRQFD returns -EAGAIN.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We can definitely decide at run-time whether to use the GIC and timers
or not, and the extra code and data structures that we allocate space
for is really negligable with this config option, so I don't think it's
worth the extra complexity of always having to define stub static
inlines. The !CONFIG_KVM_ARM_VGIC/TIMER case is pretty much an untested
code path anyway, so we're better off just getting rid of it.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
kvm_kvfree() provides exactly the same functionality as the
new common kvfree() function - so let's simply replace the
kvm function with the common function.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Pull networking fixes from David Miller:
1) nft_compat accidently truncates ethernet protocol to 8-bits, from
Arturo Borrero.
2) Memory leak in ip_vs_proc_conn(), from Julian Anastasov.
3) Don't allow the space required for nftables rules to exceed the
maximum value representable in the dlen field. From Patrick
McHardy.
4) bcm63xx_enet can accidently leave interrupts permanently disabled
due to errors in the NAPI polling exit logic. Fix from Nicolas
Schichan.
5) Fix OOPSes triggerable by the ping protocol module, due to missing
address family validations etc. From Lorenzo Colitti.
6) Don't use RCU locking in sleepable context in team driver, from Jiri
Pirko.
7) xen-netback miscalculates statistic offset pointers when reporting
the stats to userspace. From David Vrabel.
8) Fix a leak of up to 256 pages per VIF destroy in xen-netaback, also
from David Vrabel.
9) ip_check_defrag() cannot assume that skb_network_offset(),
particularly when it is used by the AF_PACKET fanout defrag code.
From Alexander Drozdov.
10) gianfar driver doesn't query OF node names properly when trying to
determine the number of hw queues available. Fix it to explicitly
check for OF nodes named queue-group. From Tobias Waldekranz.
11) MID field in macb driver should be 12 bits, not 16. From Punnaiah
Choudary Kalluri.
12) Fix unintentional regression in traceroute due to timestamp socket
option changes. Empty ICMP payloads should be allowed in
non-timestamp cases. From Willem de Bruijn.
13) When devices are unregistered, we have to get rid of AF_PACKET
multicast list entries that point to it via ifindex. Fix from
Francesco Ruggeri.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
tipc: fix bug in link failover handling
net: delete stale packet_mclist entries
net: macb: constify macb configuration data
MAINTAINERS: add Marc Kleine-Budde as co maintainer for CAN networking layer
MAINTAINERS: linux-can moved to github
can: kvaser_usb: Read all messages in a bulk-in URB buffer
can: kvaser_usb: Avoid double free on URB submission failures
can: peak_usb: fix missing ctrlmode_ init for every dev
can: add missing initialisations in CAN related skbuffs
ip: fix error queue empty skb handling
bgmac: Clean warning messages
tcp: align tcp_xmit_size_goal() on tcp_tso_autosize()
net: fec: fix unbalanced clk disable on driver unbind
net: macb: Correct the MID field length value
net: gianfar: correctly determine the number of queue groups
ipv4: ip_check_defrag should not assume that skb_network_offset is zero
net: bcmgenet: properly disable password matching
net: eth: xgene: fix booting with devicetree
bnx2x: Force fundamental reset for EEH recovery
xen-netback: refactor xenvif_handle_frag_list()
...
A collection of driver specific fixes to which the usual comments about
them being important if you see them mostly apply (except for the
comment fix). The pl022 one is particularly nasty for anyone affected
by it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJU+tr0AAoJECTWi3JdVIfQ4HIH/2p3AR8VJ1NzmqKslFUaC7SQ
CT6iIiV+1gT+Q/2CLtTwY04gVJrmbO85pl4aotefxuCsb8YFGPCEo3f0lYU/3XwK
ZQuC/7LFpWCqQCtSxoat9XQBHoFkWMrFDdsesQJLg9F46bCx/vVUuMaPrTXwSPLG
DA6isoNZgEBJeKAxKhOdwT/nJUrVJhNwEX8fa/vuISnde4ckVuX+34O60V0N0/S2
7hEw3LQFZW0IPsnkmEygd5ATonK/+s7BXLwoAZWJGpZeWB1YsBUiHV7fLunj6gVy
DMbKI3Fp1Yy/q0h6J+DzzbLvQxj0WTAX8EUz8PCh2QYRvUNiKeJdbKLbLUjGZgA=
=Lpqm
-----END PGP SIGNATURE-----
Merge tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A collection of driver specific fixes to which the usual comments
about them being important if you see them mostly apply (except for
the comment fix). The pl022 one is particularly nasty for anyone
affected by it"
* tag 'spi-v4.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: pl022: Fix race in giveback() leading to driver lock-up
spi: dw-mid: avoid potential NULL dereference
spi: img-spfi: Verify max spfi transfer length
spi: fix a typo in comment.
spi: atmel: Fix interrupt setup for PDC transfers
spi: dw: revisit FIFO size detection again
spi: dw-pci: correct number of chip selects
drivers: spi: ti-qspi: wait for busy bit clear before data write/read
* Fix regression in with omapdss when using i2c displays
* Fix possible null deref in fbmon
* Check kalloc return value in AMBA CLCD
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU/ZY5AAoJEPo9qoy8lh71SQgP/R+w3TCTH4/+bu0D7f7Mebqu
50Jt16eevaNLIXy5Ib4nHAoR6dpkxZO681lkadJHV6uOPSObKPsthLRXbpa95E5G
iX6FeFS4GfS3y9xPDKxKIZBNFjp+w8U2Bkp5V6LAQvlE/oZTQ76eN0nCp25Kb6/h
7mORq9ZdBI4LDG2dgNNZLwXk+n4DX49P5E3SqG2LmbGmZ3gNWyHckU07BB2kgHs0
mCPMoiiheK8RvY64LtXZW1bdAkOYHA+lK4vxkp4bNZUr4ibs3SlUI2eHF4WnRc/6
wYMRnnltyTHdIznBVvnaka9ktLNWJ4eRiA8A7Msd890GL6hqCp+UopfNb/jPgSS8
WbpLVJG4jr2yE5bP6S60GimnN9N8XbJArYHs8gK0iacaFOs96SXRFs2KVM1mwV8P
NnzZKaa+TzNJhCb0ud7zvMXwuNJcEO+TmqnHZRF05h3L3eZ/9LpC8zjXveoJF3GX
3RzI5rn5ZSaXKDjGOKiMaS8AH2aZPcJeJJN6XEozpGMgolSepETS1gL8eXcAl8PE
Wm0Ff4sKZO+/2H9ZYVkI6e7TdHzJQyJPU9Wi9t3iaoEzks3K7QEbCiqQOWPkelFs
RQXZSQ4HP79lsCYlwj91xu/MH/eyNcF/TVV9fIvlkdOcTAJvpGU1ra9PZFxJlWI4
Kfu9afF+MpVAhSd4IxTk
=GHOG
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- Fix regression in with omapdss when using i2c displays
- Fix possible null deref in fbmon
- Check kalloc return value in AMBA CLCD
* tag 'fbdev-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
OMAPDSS: fix regression with display sysfs files
video: fbdev: fix possible null dereference
video: ARM CLCD: Add missing error check for devm_kzalloc
Pull workqueue fix from Tejun Heo:
"One fix patch for a subtle livelock condition which can happen on
PREEMPT_NONE kernels involving two racing cancel_work calls. Whoever
comes in the second has to wait for the previous one to finish. This
was implemented by making the later one block for the same condition
that the former would be (work item completion) and then loop and
retest; unfortunately, depending on the wake up order, the later one
could lock out the former one to finish by busy looping on the cpu.
This is fixed by implementing explicit wait mechanism. Work item
might not belong anywhere at this point and there's remote possibility
of thundering herd problem. I originally tried to use bit_waitqueue
but it didn't work for static work items on modules. It's currently
using single wait queue with filtering wake up function and exclusive
wakeup. If this ever becomes a problem, which is not very likely, we
can try to figure out a way to piggy back on bit_waitqueue"
* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix hang involving racing cancel[_delayed]_work_sync()'s for PREEMPT_NONE
Here's a round of USB fixes for 4.0-rc3.
Nothing major, the usual gadget, xhci and usb-serial fixes and a few new
device ids as well.
All have been in linux-next successfully.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlT8Q2oACgkQMUfUDdst+ym9cgCgloBq7GqYw5lnW+zVy6fmyS3U
zHMAoMYPLjpUuO4tHfXt46NxVHIMzGsg
=TMtd
-----END PGP SIGNATURE-----
Merge tag 'usb-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here's a round of USB fixes for 4.0-rc3.
Nothing major, the usual gadget, xhci and usb-serial fixes and a few
new device ids as well.
All have been in linux-next successfully"
* tag 'usb-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (36 commits)
xhci: Workaround for PME stuck issues in Intel xhci
xhci: fix reporting of 0-sized URBs in control endpoint
usb: ftdi_sio: Add jtag quirk support for Cyber Cortex AV boards
USB: ch341: set tty baud speed according to tty struct
USB: serial: cp210x: Adding Seletek device id's
USB: pl2303: disable break on shutdown
USB: mxuport: fix null deref when used as a console
USB: serial: clean up bus probe error handling
USB: serial: fix port attribute-creation race
USB: serial: fix tty-device error handling at probe
USB: serial: fix potential use-after-free after failed probe
USB: console: add dummy __module_get
USB: ftdi_sio: add PIDs for Actisense USB devices
Revert "USB: serial: make bulk_out_size a lower limit"
cdc-acm: Add support for Denso cradle CU-321
usb-storage: support for more than 8 LUNs
uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS539
USB: usbfs: don't leak kernel data in siginfo
xhci: Clear the host side toggle manually when endpoint is 'soft reset'
xhci: Allocate correct amount of scratchpad buffers
...
Here are some tty and serial driver fixes for 4.0-rc3.
Along with the atime fix that you know about, here are some other serial
driver bugfixes as well. Most notable is a wait_until_sent bugfix that
was traced back to being around since before 2.6.12 that Johan has fixed
up.
All have been in linux-next successfully.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlT8RCYACgkQMUfUDdst+yk62QCgycxS4giC2hyRver3dyvaNR6g
zYYAn2w0uRndW+AqP4Tls54isRz6owpF
=gA2k
-----END PGP SIGNATURE-----
Merge tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some tty and serial driver fixes for 4.0-rc3.
Along with the atime fix that you know about, here are some other
serial driver bugfixes as well. Most notable is a wait_until_sent
bugfix that was traced back to being around since before 2.6.12 that
Johan has fixed up.
All have been in linux-next successfully"
* tag 'tty-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
TTY: fix tty_wait_until_sent maximum timeout
TTY: fix tty_wait_until_sent on 64-bit machines
USB: serial: fix infinite wait_until_sent timeout
TTY: bfin_jtag_comm: remove incorrect wait_until_sent operation
net: irda: fix wait_until_sent poll timeout
serial: uapi: Declare all userspace-visible io types
serial: core: Fix iotype userspace breakage
serial: sprd: Fix missing spin_unlock in sprd_handle_irq()
console: Fix console name size mismatch
tty: fix up atime/mtime mess, take four
serial: 8250_dw: Fix get_mctrl behaviour
serial:8250:8250_pci: delete unneeded quirk entries
serial:8250:8250_pci: fix redundant entry report for WCH_CH352_2S
Change email address for 8250_pci
serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO"
Revert "tty/serial: of_serial: add DT alias ID handling"
ioctl(TIOCGSERIAL|TIOCSSERIAL) report and can change the port->iotype.
UART drivers use the UPIO_* definitions, but the uapi header defines
parallel values and userspace uses these parallel values for ioctls;
thus the userspace values are definitive.
Define UPIO_* iotypes in terms of the uapi defines, SERIAL_IO_*;
extend the uapi defines to include all values in use by the serial
core.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ffb1a8193 ("serial: core: Add big-endian iotype")
re-numbered userspace-dependent values; ioctl(TIOCSSERIAL) can
assign the port iotype (which is expected to match the selected
i/o accessors), so iotype values must not be changed.
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Reviewed-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull drm fixes from Dave Airlie:
"Radeon, imx, msm, and i915 fixes.
The msm, imx and i915 ones are fairly run of the mill.
Radeon had some DP audio and posting reads for irq fixes, along with a
fix for 32-bit kernels with new cards, we were using unsigned long to
represent GPU side memory space, but since that changed size on 32 vs
64 cards with lots of VRAM failed, so the change has no effect on
x86-64, just moves to using uint64_t instead"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (35 commits)
drm/msm: kexec fixes
drm/msm/mdp5: fix cursor blending
drm/msm/mdp5: fix cursor ROI
drm/msm/atomic: Don't leak atomic commit object when commit fails
drm/msm/mdp5: Avoid flushing registers when CRTC is disabled
drm/msm: update generated headers (add 6th lm.base entry)
drm/msm/mdp5: fixup "drm/msm: fix fallout of atomic dpms changes"
drm/ttm: device address space != CPU address space
drm/mm: Support 4 GiB and larger ranges
drm/i915: gen4: work around hang during hibernation
drm/i915: Check for driver readyness before handling an underrun interrupt
drm/radeon: fix interlaced modes on DCE8
drm/radeon: fix DRM_IOCTL_RADEON_CS oops
drm/radeon: do a posting read in cik_set_irq
drm/radeon: do a posting read in si_set_irq
drm/radeon: do a posting read in evergreen_set_irq
drm/radeon: do a posting read in r600_set_irq
drm/radeon: do a posting read in rs600_set_irq
drm/radeon: do a posting read in r100_set_irq
radeon/audio: fix DP audio on DCE6
...
- Fix ACPI resources management problems introduced by the recent
rework of the code in question (Jiang Liu) and a build issue
introduced by those changes (Joachim Nilsson).
- Fix a recent suspend-to-idle regression on systems where entering
idle states causes local timers to stop, prevent suspend-to-idle
from crashing in restricted configurations (no cpuidle driver,
cpuidle disabled etc.) and clean up the idle loop somewhat while
at it (Rafael J Wysocki).
- Fix build problem in the cpufreq ppc driver (Geert Uytterhoeven).
- Allow the ACPI backlight driver module to be loaded if ACPI is
disabled which helps the i915 driver in those configurations
(stable-candidate) and change the code to help debug unusual use
cases (Chris Wilson).
- Wakeup IRQ management changes in v3.18 caused some drivers on the
at91 platform to trigger a warning from the IRQ core related to
an unexpected combination of interrupt action handler flags.
However, on at91 a timer IRQ is shared with some other devices
(including system wakeup ones) and that leads to the unusual
combination of flags in question. To make it possible to avoid
the warning introduce a new interrupt action handler flag (which
can be used by drivers to indicate the special case to the core)
and rework the problematic at91 drivers to use it and work as
expected during system suspend/resume. From Boris Brezillon,
Rafael J Wysocki and Mark Rutland.
- Clean up the generic power domains subsystem's debugfs interface
(Kevin Hilman).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJU+cBpAAoJEILEb/54YlRxb+8P+weKzn3Lim4R86ZkYjUjSr+P
Y+1d9CvQETsGMqaJRssBQ8npSaXqGF7kDjj3a4WIONxrgIs9k/7wZmNtTDYC2C7T
flxQQunlaHrELFqguowSq2pLxDTbWIe1lF7vtPwv/Xn7bOd755NrnAPgITseuxh5
ggoZg4gWnfHL6THnnOY8Dw6ZciCe7/lxfdAQavL+0xYybvG8/0/Urn+CsA/Q4Oz7
S9g7OLuK5LOlgE8f14TvLykHCVrluGKXMaulDUqx0z4DqOS+OP+Dp65bLGAf6faE
kYmfnJfN5vcfARxvBHyYCKuQAviMxhbS3R4fqO15SbRws4hLHL7IEmuuBAuEbPES
oIXLR2OBHAWeyiStHxEOZ0yxwhK2KjCOks/dPPPGtK2ZF4PAmCsOk0cxh6WdnzH3
g50Tg5ebPFjnyT8OCFNFm1g1pAoKjt2RuN8OGcKwChYjek3Yk5fCrkty7jkJYtQE
xcfXwaDPwolZbo3X0yGrchbqJYmOU16Kuu1U20L80uL/1TxmzlF27pUyLj4BbJxW
co+cxumF4WA6lixfNOcVil4PEBgh3lhCD5FzkGOiE0CI/l3omVdmR40nPN++IllD
O7QxFVGxSRZfEeIP0ujjB6rwxJ8JsK3vwlUngommby7KFtssh9/VZ8l4FbjXnDXl
qLVbX2fxxSD3j8U9aEov
=nc5T
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes for recent regressions (ACPI resources management,
suspend-to-idle), stable-candidate fixes (ACPI backlight), fixes
related to the wakeup IRQ management changes made in v3.18, other
fixes (suspend-to-idle, cpufreq ppc driver) and a couple of cleanups
(suspend-to-idle, generic power domains, ACPI backlight).
Specifics:
- Fix ACPI resources management problems introduced by the recent
rework of the code in question (Jiang Liu) and a build issue
introduced by those changes (Joachim Nilsson).
- Fix a recent suspend-to-idle regression on systems where entering
idle states causes local timers to stop, prevent suspend-to-idle
from crashing in restricted configurations (no cpuidle driver,
cpuidle disabled etc.) and clean up the idle loop somewhat while at
it (Rafael J Wysocki).
- Fix build problem in the cpufreq ppc driver (Geert Uytterhoeven).
- Allow the ACPI backlight driver module to be loaded if ACPI is
disabled which helps the i915 driver in those configurations
(stable-candidate) and change the code to help debug unusual use
cases (Chris Wilson).
- Wakeup IRQ management changes in v3.18 caused some drivers on the
at91 platform to trigger a warning from the IRQ core related to an
unexpected combination of interrupt action handler flags. However,
on at91 a timer IRQ is shared with some other devices (including
system wakeup ones) and that leads to the unusual combination of
flags in question.
To make it possible to avoid the warning introduce a new interrupt
action handler flag (which can be used by drivers to indicate the
special case to the core) and rework the problematic at91 drivers
to use it and work as expected during system suspend/resume. From
Boris Brezillon, Rafael J Wysocki and Mark Rutland.
- Clean up the generic power domains subsystem's debugfs interface
(Kevin Hilman)"
* tag 'pm+acpi-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
genirq / PM: describe IRQF_COND_SUSPEND
tty: serial: atmel: rework interrupt and wakeup handling
watchdog: at91sam9: request the irq with IRQF_NO_SUSPEND
cpuidle / sleep: Use broadcast timer for states that stop local timer
clk: at91: implement suspend/resume for the PMC irqchip
rtc: at91rm9200: rework wakeup and interrupt handling
rtc: at91sam9: rework wakeup and interrupt handling
PM / wakeup: export pm_system_wakeup symbol
genirq / PM: Add flag for shared NO_SUSPEND interrupt lines
ACPI / video: Propagate the error code for acpi_video_register
ACPI / video: Load the module even if ACPI is disabled
PM / Domains: cleanup: rename gpd -> genpd in debugfs interface
cpufreq: ppc: Add missing #include <asm/smp.h>
x86/PCI/ACPI: Relax ACPI resource descriptor checks to work around BIOS bugs
x86/PCI/ACPI: Ignore resources consumed by host bridge itself
cpuidle: Clean up fallback handling in cpuidle_idle_call()
cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
idle / sleep: Avoid excessive disabling and enabling interrupts
PCI: versatile: Update for list_for_each_entry() API change
genirq / PM: better describe IRQF_NO_SUSPEND semantics
Highlights include:
- Fix a regression in the NFSv4 open state recovery code
- Fix a regression in the NFSv4 close code
- Fix regressions and side-effects of the loop-back mounted NFS fixes
in 3.18, that cause the NFS read() syscall to return EBUSY.
- Fix regressions around the readdirplus code and how it interacts with
the VFS lazy unmount changes that went into v3.18.
- Fix issues with out-of-order RPC call replies replacing updated
attributes with stale ones (particularly after a truncate()).
- Fix an underflow checking issue with RPC/RDMA credits
- Fix a number of issues with the NFSv4 delegation return/free code.
- Fix issues around stale NFSv4.1 leases when doing a mount
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU+SJ6AAoJEGcL54qWCgDyVgQQAKNsF/O2O9ip2uAHZRZvM6TS
Ev8c3Spj/FmRI1tlCcGi1zZ8uCSwvPQz3uAN0vOTMocUWjokT5sAgN5yBIxHasem
6YK7jxs9WHiP7MGyReaAFJwG/W6LZndnlNqPWPs9KiaWwKVXIsQ3uFm/Y0lr90Fi
ew16DQm0DUd4Yvv42WJR9ay7UwUPvT7wmaGIVVK2hjQqr2lx02jspt5kfrC+vsMU
OYU/0YDofb2ajs5krbah6tUHf3VDnSVrXP6if66IrukCM9S4AvowpnMJQ5QJALh+
cPlqHDm2ZzuIecpqZEgYLM73wQ2q+KBXTlDLcgYg6LjnqBEivwO9RDn6tfCwKTcS
tCFohQc9iDOj9rTZ9EQlQME6u/FdxWncpxyTsMyBk7FlcLsOQRqio/FXZhbyJGuH
gvIdIW3fPseijtejpYkxgabe6JL9NRvjv3SnOay7xHs9Vn4tRkFF2mkkQDZOG6HT
atkxQp8kB3m9gMoeAmTLdTcJkcFdk6AKnNrcyJaa1GW4msmMuGZtq/6vayKJsBZb
OIw788bDSOVpcVR+6SAC24/dutcl+WJHlSJShqvIrTKejBxPCc7IVSeg63FJsWTO
sxfXdUr3wVJ1ooDFCspeBj+3zAimIq7qDmRRs85ekgEtxrgdUldhg/VwbDjvLjmb
whXFpiCS7Ii0fVYtZvjz
=qMB7
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- Fix a regression in the NFSv4 open state recovery code
- Fix a regression in the NFSv4 close code
- Fix regressions and side-effects of the loop-back mounted NFS fixes
in 3.18, that cause the NFS read() syscall to return EBUSY.
- Fix regressions around the readdirplus code and how it interacts
with the VFS lazy unmount changes that went into v3.18.
- Fix issues with out-of-order RPC call replies replacing updated
attributes with stale ones (particularly after a truncate()).
- Fix an underflow checking issue with RPC/RDMA credits
- Fix a number of issues with the NFSv4 delegation return/free code.
- Fix issues around stale NFSv4.1 leases when doing a mount"
* tag 'nfs-for-4.0-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits)
NFSv4.1: Clear the old state by our client id before establishing a new lease
NFSv4: Fix a race in NFSv4.1 server trunking discovery
NFS: Don't write enable new pages while an invalidation is proceeding
NFS: Fix a regression in the read() syscall
NFSv4: Ensure we skip delegations that are already being returned
NFSv4: Pin the superblock while we're returning the delegation
NFSv4: Ensure we honour NFS_DELEGATION_RETURNING in nfs_inode_set_delegation()
NFSv4: Ensure that we don't reap a delegation that is being returned
NFS: Fix stateid used for NFS v4 closes
NFSv4: Don't call put_rpccred() under the rcu_read_lock()
NFS: Don't require a filehandle to refresh the inode in nfs_prime_dcache()
NFSv3: Use the readdir fileid as the mounted-on-fileid
NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()
NFSv4: Set a barrier in the update_changeattr() helper
NFS: Fix nfs_post_op_update_inode() to set an attribute barrier
NFS: Remove size hack in nfs_inode_attrs_need_update()
NFSv4: Add attribute update barriers to delegreturn and pNFS layoutcommit
NFS: Add attribute update barriers to NFS writebacks
NFS: Set an attribute barrier on all updates
NFS: Add attribute update barriers to nfs_setattr_update_inode()
...
Define and allocate space for both the host and guest views of
the vector registers for a given vcpu. The 32 vector registers
occupy 128 bits each (512 bytes total), but architecturally are
paired with 512 additional bytes of reserved space for future
expansion.
The kvm_sync_regs structs containing the registers are union'ed
with 1024 bytes of padding in the common kvm_run struct. The
addition of 1024 bytes of new register information clearly exceeds
the existing union, so an expansion of that padding is required.
When changing environments, we need to appropriately save and
restore the vector registers viewed by both the host and guest,
into and out of the sync_regs space.
The floating point registers overlay the upper half of vector
registers 0-15, so there's a bit of data duplication here that
needs to be carefully avoided.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Don't truncate ethernet protocol type to u8 in nft_compat, from
Arturo Borrero.
2) Fix several problems in the addition/deletion of elements in nf_tables.
3) Fix module refcount leak in ip_vs_sync, from Julian Anastasov.
4) Fix a race condition in the abort path in the nf_tables transaction
infrastructure. Basically aborted rules can show up as active rules
until changes are unrolled, oneliner from Patrick McHardy.
5) Check for overflows in the data area of the rule, also from Patrick.
6) Fix off-by-one in the per-rule user data size field. This introduces
a new nft_userdata structure that is placed at the beginning of the
user data area that contains the length to save some bits from the
rule and we only need one bit to indicate its presence, from Patrick.
7) Fix rule replacement error path, the replaced rule is deleted on
error instead of leaving it in place. This has been fixed by relying
on the abort path to undo the incomplete replacement.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
* suspend-to-idle:
cpuidle / sleep: Use broadcast timer for states that stop local timer
cpuidle: Clean up fallback handling in cpuidle_idle_call()
cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
idle / sleep: Avoid excessive disabling and enabling interrupts
Commit 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state. If the cpuidle driver in use does not provide
the new ->enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 3810631332.
Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.
Fixes: 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.
try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation. In that case, try_to_grab_pending() returns -ENOENT. In
this case, __cancel_work_timer() currently invokes flush_work(). The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping
Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority. Let's say task A just got woken
up from flush_work() by the completion of the target work item. If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing. This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.
task A task B worker
executing work
__cancel_work_timer()
try_to_grab_pending()
set work CANCELING
flush_work()
block for work completion
completion, wakes up A
__cancel_work_timer()
while (forever) {
try_to_grab_pending()
-ENOENT as work is being canceled
flush_work()
false as work is no longer executing
}
This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.
Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
v3: bit_waitqueue() can't be used for work items defined in vmalloc
area. Switched to custom wake function which matches the target
work item and exclusive wait and wakeup.
v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
the target bit waitqueue has wait_bit_queue's on it. Use
DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu
Vizoso.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rabin Vincent <rabin.vincent@axis.com>
Cc: Tomeu Vizoso <tomeu.vizoso@gmail.com>
Cc: stable@vger.kernel.org
Tested-by: Jesper Nilsson <jesper.nilsson@axis.com>
Tested-by: Rabin Vincent <rabin.vincent@axis.com>
We need to store device offsets in 64 bit as the device
address space may be larger than the CPU's.
Fixes GPU init failures on radeons with 4GB or more of
vram on 32 bit kernels. We put vram at the start of the
GPU's address space so the gart aperture starts at 4 GB
causing all GPU addresses in the gart aperture to get
truncated.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=89072
[airlied: fix warning on nouveau build]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: thellstrom@vmware.com
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The current implementation is limited by the number of addresses that
fit into an unsigned long. This causes problems on 32-bit Tegra where
unsigned long is 32-bit but drm_mm is used to manage an IOVA space of
4 GiB. Given the 32-bit limitation, the range is limited to 4 GiB - 1
(or 4 GiB - 4 KiB for page granularity).
This commit changes the start and size of the range to be an unsigned
64-bit integer, thus allowing much larger ranges to be supported.
[airlied: fix i915 warnings and coloring callback]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
fixupo
It currently is required that all users of NO_SUSPEND interrupt
lines pass the IRQF_NO_SUSPEND flag when requesting the IRQ or the
WARN_ON_ONCE() in irq_pm_install_action() will trigger. That is
done to warn about situations in which unprepared interrupt handlers
may be run unnecessarily for suspended devices and may attempt to
access those devices by mistake. However, it may cause drivers
that have no technical reasons for using IRQF_NO_SUSPEND to set
that flag just because they happen to share the interrupt line
with something like a timer.
Moreover, the generic handling of wakeup interrupts introduced by
commit 9ce7a25849 (genirq: Simplify wakeup mechanism) only works
for IRQs without any NO_SUSPEND users, so the drivers of wakeup
devices needing to use shared NO_SUSPEND interrupt lines for
signaling system wakeup generally have to detect wakeup in their
interrupt handlers. Thus if they happen to share an interrupt line
with a NO_SUSPEND user, they also need to request that their
interrupt handlers be run after suspend_device_irqs().
In both cases the reason for using IRQF_NO_SUSPEND is not because
the driver in question has a genuine need to run its interrupt
handler after suspend_device_irqs(), but because it happens to
share the line with some other NO_SUSPEND user. Otherwise, the
driver would do without IRQF_NO_SUSPEND just fine.
To make it possible to specify that condition explicitly, introduce
a new IRQ action handler flag for shared IRQs, IRQF_COND_SUSPEND,
that, when set, will indicate to the IRQ core that the interrupt
user is generally fine with suspending the IRQ, but it also can
tolerate handler invocations after suspend_device_irqs() and, in
particular, it is capable of detecting system wakeup and triggering
it as appropriate from its interrupt handler.
That will allow us to work around a problem with a shared timer
interrupt line on at91 platforms.
Link: http://marc.info/?l=linux-kernel&m=142252777602084&w=2
Link: http://marc.info/?t=142252775300011&r=1&w=2
Link: https://lkml.org/lkml/2014/12/15/552
Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
The NFT_USERDATA_MAXLEN is defined to 256, however we only have a u8
to store its size. Introduce a struct nft_userdata which contains a
length field and indicate its presence using a single bit in the rule.
The length field of struct nft_userdata is also a u8, however we don't
store zero sized data, so the actual length is udata->len + 1.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull networking fixes from David Miller:
1) If an IPVS tunnel is created with a mixed-family destination
address, it cannot be removed. Fix from Alexey Andriyanov.
2) Fix module refcount underflow in netfilter's nft_compat, from Pablo
Neira Ayuso.
3) Generic statistics infrastructure can reference variables sitting on
a released function stack, therefore use dynamic allocation always.
Fix from Ignacy Gawędzki.
4) skb_copy_bits() return value test is inverted in ip_check_defrag().
5) Fix network namespace exit in openvswitch, we have to release all of
the per-net vports. From Pravin B Shelar.
6) Fix signedness bug in CAIF's cfpkt_iterate(), from Dan Carpenter.
7) Fix rhashtable grow/shrink behavior, only expand during inserts and
shrink during deletes. From Daniel Borkmann.
8) Netdevice names with semicolons should never be allowed, because
they serve as a separator. From Matthew Thode.
9) Use {,__}set_current_state() where appropriate, from Fabian
Frederick.
10) Revert byte queue limits support in r8169 driver, it's causing
regressions we can't figure out.
11) tcp_should_expand_sndbuf() erroneously uses tp->packets_out to
measure packets in flight, properly use tcp_packets_in_flight()
instead. From Neal Cardwell.
12) Fix accidental removal of support for bluetooth in CSR based Intel
wireless cards. From Marcel Holtmann.
13) We accidently added a behavioral change between native and compat
tasks, wrt testing the MSG_CMSG_COMPAT bit. Just ignore it if the
user happened to set it in a native binary as that was always the
behavior we had. From Catalin Marinas.
14) Check genlmsg_unicast() return valud in hwsim netlink tx frame
handling, from Bob Copeland.
15) Fix stale ->radar_required setting in mac80211 that can prevent
starting new scans, from Eliad Peller.
16) Fix memory leak in nl80211 monitor, from Johannes Berg.
17) Fix race in TX index handling in xen-netback, from David Vrabel.
18) Don't enable interrupts in amx-xgbe driver until all software et al.
state is ready for the interrupt handler to run. From Thomas
Lendacky.
19) Add missing netlink_ns_capable() checks to rtnl_newlink(), from Eric
W Biederman.
20) The amount of header space needed in macvtap was not calculated
properly, fix it otherwise we splat past the beginning of the
packet. From Eric Dumazet.
21) Fix bcmgenet TCP TX perf regression, from Jaedon Shin.
22) Don't raw initialize or mod timers, use setup_timer() and
mod_timer() instead. From Vaishali Thakkar.
23) Fix software maintained statistics in bcmgenet and systemport
drivers, from Florian Fainelli.
24) DMA descriptor updates in sh_eth need proper memory barriers, from
Ben Hutchings.
25) Don't do UDP Fragmentation Offload on RAW sockets, from Michal
Kubecek.
26) Openvswitch's non-masked set actions aren't constructed properly
into netlink messages, fix from Joe Stringer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
openvswitch: Fix serialization of non-masked set actions.
gianfar: Reduce logging noise seen due to phy polling if link is down
ibmveth: Add function to enable live MAC address changes
net: bridge: add compile-time assert for cb struct size
udp: only allow UFO for packets from SOCK_DGRAM sockets
sh_eth: Really fix padding of short frames on TX
Revert "sh_eth: Enable Rx descriptor word 0 shift for r8a7790"
sh_eth: Fix RX recovery on R-Car in case of RX ring underrun
sh_eth: Ensure proper ordering of descriptor active bit write/read
net/mlx4_en: Disbale GRO for incoming loopback/selftest packets
net/mlx4_core: Fix wrong mask and error flow for the update-qp command
net: systemport: fix software maintained statistics
net: bcmgenet: fix software maintained statistics
rxrpc: don't multiply with HZ twice
rxrpc: terminate retrans loop when sending of skb fails
net/hsr: Fix NULL pointer dereference and refcnt bugs when deleting a HSR interface.
net: pasemi: Use setup_timer and mod_timer
net: stmmac: Use setup_timer and mod_timer
net: 8390: axnet_cs: Use setup_timer and mod_timer
net: 8390: pcnet_cs: Use setup_timer and mod_timer
...
When invalidating the page cache for a regular file, we want to first
sync all dirty data to disk and then call invalidate_inode_pages2().
The latter relies on nfs_launder_page() and nfs_release_page() to deal
respectively with dirty pages, and unstable written pages.
When commit 9590544694 ("NFS: avoid deadlocks with loop-back mounted
NFS filesystems.") changed the behaviour of nfs_release_page(), then it
made it possible for invalidate_inode_pages2() to fail with an EBUSY.
Unfortunately, that error is then propagated back to read().
Let's therefore work around the problem for now by protecting the call
to sync the data and invalidate_inode_pages2() so that they are atomic
w.r.t. the addition of new writes.
Later on, we can revisit whether or not we still need nfs_launder_page()
and nfs_release_page().
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>