This patch removes duplicate macro useage in events_base.c.
It also fixes gcc warning:
variable ‘col’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
RIP: e030:device_offline+0x9/0xb0
Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
Call Trace:
handle_vcpu_hotplug_event+0xb5/0xc0
xenwatch_thread+0x80/0x140
? wait_woken+0x80/0x80
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x40/0x40
ret_from_fork+0x3a/0x50
This happens because handle_vcpu_hotplug_event is called twice. In the
first iteration cpu_present is still true, in the second iteration
cpu_present is false which causes get_cpu_device to return NULL.
In case of cpu#0, cpu_online is apparently always true.
Fix this crash by checking if the cpu can be hotplugged, which is false
for a cpu that was just removed.
Also check if the cpu was actually offlined by device_remove, otherwise
leave the cpu_present state as it is.
Rearrange to code to do all work with device_hotplug_lock held.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Scrubbing pages on initial balloon down can take some time, especially
in nested virtualization case (nested EPT is slow). When HVM/PVH guest is
started with memory= significantly lower than maxmem=, all the extra
pages will be scrubbed before returning to Xen. But since most of them
weren't used at all at that point, Xen needs to populate them first
(from populate-on-demand pool). In nested virt case (Xen inside KVM)
this slows down the guest boot by 15-30s with just 1.5GB needed to be
returned to Xen.
Add runtime parameter to enable/disable it, to allow initially disabling
scrubbing, then enable it back during boot (for example in initramfs).
Such usage relies on assumption that a) most pages ballooned out during
initial boot weren't used at all, and b) even if they were, very few
secrets are in the guest at that time (before any serious userspace
kicks in).
Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
enabled by default), controlling default value for the new runtime
switch.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
When guest receives a sysrq request from the host it acknowledges it by
writing '\0' to control/sysrq xenstore node. This, however, make xenstore
watch fire again but xenbus_scanf() fails to parse empty value with "%c"
format string:
sysrq: SysRq : Emergency Sync
Emergency Sync complete
xen:manage: Error -34 reading sysrq code in control/sysrq
Ignore -ERANGE the same way we already ignore -ENOENT, empty value in
control/sysrq is totally legal.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CONFIG_BATMAN_ADV_DEBUGFS is disabled by default because debugfs is not
supported for batman-adv interfaces in any non-default netns. Any remaining
users of this interface should still be informed about the deprecation and
the generic netlink alternative.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
The !CONFIG_GENERIC_IOMAP version of ioport_map uses MMIO_UPPER_LIMIT to
prevent users from making I/O accesses outside the expected I/O range -
however it erroneously treats MMIO_UPPER_LIMIT as a mask which is
contradictory to its other users.
The introduction of CONFIG_INDIRECT_PIO, which subtracts an arbitrary
amount from IO_SPACE_LIMIT to form MMIO_UPPER_LIMIT, results in ioport_map
mangling the given port rather than capping it.
We address this by aligning more closely with the CONFIG_GENERIC_IOMAP
implementation of ioport_map by using the comparison operator and
returning NULL where the port exceeds MMIO_UPPER_LIMIT. Though note that
we preserve the existing behavior of masking with IO_SPACE_LIMIT such that
we don't break existing buggy drivers that somehow rely on this masking.
Fixes: 5745392e0c ("PCI: Apply the new generic I/O management on PCI IO hosts")
Reported-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJbmj3VAAoJEFKgDEdIgJTyKjAP/ie5PLfZa0A5Epy/JEMFnII3
ISkEH4DxA2Ymxy6jNLIJMAH67OWJUNmIaIyjSINdiBw+r6i4oS5iLcLdo2chsPaJ
KUbxdMJ2p46b2zhNvx6COFe6FghVhrtIX4RIZN5ZuWF4ChIP2bMK7/cA4uFtJXeI
X/Ge6SpYZ4jnSlnw5jSdLCmC/fP6oEALD9r77j454K/TWNAYHFStmsKkjbrBMDlg
Ja56qfHNdCs+8IoIWONYKPOUiE325OGRjRSH7vE2uC+BecRpt/H6BxAxZIaMstgj
CeAdTiVvbCF8wbqvuVj0TkQU2hzNFzcPf0YaT07wPJl1ClSgTKCt/bkcOqOcpLQm
n9+4WfqHsVEmWlBHENuxmHm3jA2p11mWB4R/NqvvZCHifS6gnKv9P4RYrlbSD4KB
yVba9FF81yotQSO2G76QzuZ1MFjqxNkii5MDGsAGye1iZOWHHHCy3S23AYVXwfJX
K7RP3sZ2Gora6cTJnsLvJBbPHi7EZoraVzLZUen+ig2slPDsWoCM0gghvB/Kce0G
ih9zMGMhjLOd54QOlGHlfH67BO1pxle5PJAcraqcctOep4pr+pj/h1GDgsMfF0kI
+wxk9F+FIC6vtkCd+a/tDxc7C/4ObeiYQp6RGQGm5vw4/9uYhkxu4MX1+ltqHEVl
hzBOQCd4p2EI/pAMPDm1
=Dj1C
-----END PGP SIGNATURE-----
Merge tag 'printk-for-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk fix from Petr Mladek:
"Revert a commit that caused "quiet", "debug", and "loglevel" early
parameters to be ignored for early boot messages"
* tag 'printk-for-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
Revert "printk: make sure to print log on console."
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCW5qpOgAKCRDh3BK/laaZ
PDCQAQCIKLg0aLeWOkfUO76mBjlp5srKgJfrqpFoyuozD6l2fQEAl/W2x9NOduV+
PK4sCYMT8SpI0hMrbv9P4zZ683kmaA8=
=RnZU
-----END PGP SIGNATURE-----
Merge tag 'ovl-fixes-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"This fixes a regression in the recent file stacking update, reported
and fixed by Amir Goldstein. The fix is fairly trivial, but involves
adding a fadvise() f_op and the associated churn in the vfs. As
discussed on -fsdevel, there are other possible uses for this method,
than allowing proper stacking for overlays.
And there's one other fix for a syzkaller detected oops"
* tag 'ovl-fixes-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix oopses in ovl_fill_super() failure paths
ovl: add ovl_fadvise()
vfs: implement readahead(2) using POSIX_FADV_WILLNEED
vfs: add the fadvise() file operation
Documentation/filesystems: update documentation of file_operations
ovl: fix GPF in swapfile_activate of file from overlayfs over xfs
ovl: respect FIEMAP_FLAG_SYNC flag
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlubFpAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoYnEADQYssR8J6leHkYgrAT4zhuhNLALknzHj49
F/s6U3npw1CkUOyNEJ52KScVDEv+bvKIjxXbpVOB1qXqtZWNQ9HkEwAEQWVg11T9
xfbvqDyk/qywfx/cKt+Fko8wcuCX4F4tDX8io9RWeosWg/W5oOPJ8vjh54j0kUXK
xzY947eNorYqjwDYUj56JsVkHObTA8PWg578tgnbYNQtXvJMhWLL4YlfqsbtTMfb
SuQ/AVRhsFPfCMX+0Qj5vu08UwrZ7VDkPKpqUpKfQPqB5VLpKbbCItMt32rCujj9
Ptmo81kEVxZaj9vqqst2wbUp3RVAN5SzcuXoj20GGP2HpWEwBAe8kxgsOVzEgOhx
dNPX2XjfsN0FGN0nPXbYSN0IPeyoxDSgImWRwrdZCaSTXm/96apnzPXNXt0kWh4L
3RP82Y655AtDEfZwVCkUFwqUIgyYm978vgGn/3zf682mBeQImon0AMAJk3QPNj7f
Re8acaPyix39D0+RYfeEHzO9UM1PQ2QF9IX9DwnSrHfoEWEnjPlaEZIbO5fGPKUV
ny+nGoadWqosOUz6/OJuo+MDzE7BlbN25KTN4xcLQlaKxblLJNoPhbgafR1AloTi
DkBXztwAYC9n49D+eiclHplJG0xcWmKyAj3FrINJsW1wkiH2LBjZVC4SX0NJh9Ei
RsFc6NdSDQ==
=1Jo5
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180913' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Three fixes that should go into this series. This contains:
- Increase number of policies supported by blk-cgroup.
With blk-iolatency, we now have four in kernel, but we had a hard
limit of three...
- Fix regression in null_blk, where the zoned supported broke
queue_mode=0 (bio based).
- NVMe pull request, with a single fix for an issue in the rdma code"
* tag 'for-linus-20180913' of git://git.kernel.dk/linux-block:
null_blk: fix zoned support for non-rq based operation
blk-cgroup: increase number of supported policies
nvmet-rdma: fix possible bogus dereference under heavy load
asynchronous crypto hadsh API.
- Fix to both DM crypt and DM integrity targets to discontinue using
CRYPTO_TFM_REQ_MAY_SLEEP because its use of GFP_KERNEL can lead to
deadlock by recursing back into a filesystem.
- Various DM raid fixes related to reshape and rebuild races.
- Fix for DM thin-provisioning to avoid data corruption that was a
side-effect of needing to abort DM thin metadata transaction due to
running out of metadata space. Fix is to reserve a small amount of
metadata space so that once it is used the DM thin-pool can finish its
active transaction before switching to read-only mode.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJbmq1sAAoJEMUj8QotnQNa9hwIANBzgefmIoBA8hCqlS7Nzogh
moHAbaUONYxAVksJ5s4Divpa4RE8jSZ8AiVJizbLj94XxqnpA5x4pmhtUs3OdWxq
4FW8fZvb2L1IMeshajMsO+rKrL6y8uwYnRI5zSJRKxhV2frj1VZ4ioMEXMGrwqjO
jwNYFWglX5G3eIohUIDxgXki/WkacndQlsy2T6Utx0idz50usOt1pNON/GPhRsXd
yWbvxQWuaZlm6+TF1ygsrXr1nMwRwDw0pdjY9sRQ3hHmuYQ24Wgkwf5Ggi5US7VZ
iwgK8dppR8zWqwRyxRYahJvQvwcnC18mewFAmuDQfvPXqvUHtuhWaxzE4D9c3L0=
=yIoD
-----END PGP SIGNATURE-----
Merge tag 'for-4.19/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- DM verity fix for crash due to using vmalloc'd buffers with the
asynchronous crypto hadsh API.
- Fix to both DM crypt and DM integrity targets to discontinue using
CRYPTO_TFM_REQ_MAY_SLEEP because its use of GFP_KERNEL can lead to
deadlock by recursing back into a filesystem.
- Various DM raid fixes related to reshape and rebuild races.
- Fix for DM thin-provisioning to avoid data corruption that was a
side-effect of needing to abort DM thin metadata transaction due to
running out of metadata space. Fix is to reserve a small amount of
metadata space so that once it is used the DM thin-pool can finish
its active transaction before switching to read-only mode.
* tag 'for-4.19/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin metadata: try to avoid ever aborting transactions
dm raid: bump target version, update comments and documentation
dm raid: fix RAID leg rebuild errors
dm raid: fix rebuild of specific devices by updating superblock
dm raid: fix stripe adding reshape deadlock
dm raid: fix reshape race on small devices
dm: disable CRYPTO_TFM_REQ_MAY_SLEEP to fix a GFP_KERNEL recursion deadlock
dm verity: fix crash on bufio buffer that was allocated with vmalloc
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbmxVCAAoJEAx081l5xIa+H0EP/2F5S+VeVX/GQB3hs5xdNubj
fBGWorXefXRySPOtS0SdSjHnPnJCUgWnHnKOGrBAxtqRaImnqZLra9HztdV7WbBI
mzhfw39cekWX6q2/5WoTP2IxaujVMEyCFX9lgW+96msFktuwvmNGA6QP9MtOfyfV
oBzjrtoQSJYHx7uzDDiAzghMKm6EGbkDjqHnOuO3HGNc2N+mPlNd4cIaReAehOkG
r1PqOn26CRi9y/bsDgERLEiQURS4kHLz6QaGotJtMVrQblFuNwwk4eBEL9FjkGur
f9jNnRAW9MXk/lPkXjWCwPV5wYcHW7XpldL0pZKTnL/Od6Fm956Ecoda72cRquMe
4Rt+LOlZueNEcY4WyDnT9i5kYdE18TYvYNW5SpWXT2ZwhHmRgsblZsO+nTufndGT
+GxGAalsLB8wSVh0m3h2TdJeULINzHoN287i6KXZOoc0tR4yXh3Bek3k5K1Sajiv
3Ehp4mzOEBcut7qR9P8KNoOtt5lExp8LwxHcpwbl+rYeFUhNUr1ffuLBV59bVLt7
sluZQ9+UVHu72XSXdXRtB9WeyKPToZBkh2+rwad4SPLpUKwJuyXJim/IHKfGt0O9
BeilehpFVerP4RyB7dK3PRRzKS0zQmfVlr5x1WDXwir1RHDjWdoyLvHFzwn5OKSH
rIGf2J64oreok1UrCKyN
=NZvZ
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2018-09-14' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This is the general drm fixes pull for rc4.
i915:
- Two GVT fixes (one for the mm reference issue you pointed out)
- Gen 2 video playback fix
- IPS timeout error suppression on Broadwell
amdgpu:
- Small memory leak
- SR-IOV reset
- locking fix
- updated SDMA golden registers
nouveau:
- Remove some leftover debugging"
* tag 'drm-fixes-2018-09-14' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/devinit: fix warning when PMU/PRE_OS is missing
drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk
drm/i915/overlay: Allocate physical registers from stolen
drm/amdgpu: move PSP init prior to IH in gpu reset
drm/amdgpu: Fix SDMA hang in prt mode v2
drm/amdgpu: fix amdgpu_mn_unlock() in the CS error path
drm/i915/bdw: Increase IPS disable timeout to 100ms
drm/i915/gvt: Fix the incorrect length of child_device_config issue
drm/i915/gvt: Fix life cycle reference on KVM mm
- A complicated IRQ fix for the MSM driver (see commit).
- Fix the group/function check in the Ingenic driver.
- Deal with a possible NULL pointer dereference in the Madera
driver.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbmlOCAAoJEEEQszewGV1zXTAP+gLLyF9Txaa4t65wGYnbafoi
6DgGHOCvgxvro8M1vlWViDLmGdHGvMqSA0kHdpur5H+91tHIsHFTvZwiUtOQrwiG
nNJVK+ijNPLnVQNALqFbxasDCLs3FPQU+8KsQfQ/L4K3hz848+B/3Rqb/zxur/rY
miGPgivvXqdKr//o+lh9ekK+xrc9Je1PMUoRbXaZWBVMNqRB38NnRpkcFmTnfYUS
VGn6gXhJ33pajQCQOJLXppRP0z7hN5L1g8W2JOmZucZdRZjTVRxv99dUiFLxneEX
r9mvAS4W0pQLZOSsmOFCc/R64W3Znr8sQaJjlH6La76zazNCE8wGYhOgFnfQgHoH
z08WRSdd34xXGjzI0ipOHS0NdvM2V8tQQTSAzlE8qc5ItNSbyAmmHCrj9iodAQ7F
B1N4/YQTfly8vlnO8jRWF1E3AJ6zcwLu8Irh4MiBqUPxSF9SsQvDJIoQsD0HpsT3
bWl6dUmr96NhVwzuatITIfX8NHhR3YPTWgSir4ri4ybRuLTrA0iOH8UBfdegL0gM
xLfAAQt1VjU4ZN2s9b+IzXjsB0N/TPCbxDFlLOGgxn1/hdU4e8+2oD6R9Ba98jLx
e2DQ9D8raJo76069yQ0wzu92zFj5oEY71praWA3hDBa+HOzgx9xeryb9WHEzuYkR
iCjRSnCUO3Mbf6EukIy3
=g3Rc
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- A complicated IRQ fix for the MSM driver (see commit)
- Fix the group/function check in the Ingenic driver
- Deal with a possible NULL pointer dereference in the Madera driver
* tag 'pinctrl-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: madera: Fix possible NULL pointer with pdata config
pinctrl: ingenic: Fix group & function error checking
pinctrl: msm: Really mask level interrupts to prevent latching
Pull percpu maintainership update from Tejun Heo:
"This updates the MAINTAINERS file to transfer the percpu tree
maintainership to Dennis Zhou.
Dennis rewrote a good portion of the percpu allocator, knows most of
percpu related code, is already listed as a co-maintainer, has been
reliable, and now sits right behind me. I'll keep reviewing and
involved with percpu stuff and am sure that Dennis will soon make a
better maintainer than I ever was"
* 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
MAINTAINERS: Make Dennis the percpu tree maintainer
Pull hexagon fixes from Richard Kuo:
"Some fixes for compile warnings"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel:
hexagon: modify ffs() and fls() to return int
arch/hexagon: fix kernel/dma.c build warning
One fix for the zcrypt driver to correctly handle incomplete
encryption/decryption operations.
A cleanup for the aqmask/apmask parsing to avoid variable
length arrays on the stack.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJbmkYUAAoJEDjwexyKj9rgrugH/21Uf1S0mkkRfDeYxD6lJva6
zNmEZ3V+GXc8L/0CisFWQOcIU1fO+jozp9HPGkQxeTvAuqIfVhBRVoMMxiTRaZb3
xdSBel8EvAGxlZq/6eq6fU59HPWGm+N53rC9J5MMQQqgpSmq8F2QeO5CoidflRh8
bdio9cliLsjPu+3P2JU3noolhb/f577J3dgP4gKARRpOfh8vUI7NLU3Mham+7886
ASjG8s/zr9spPnrErJusloOLDJt4M94J8KrIbB/WAT1wZv7GxClaGsCoyCuk4cDt
TV8zIgMK9TChedwMOO0T8WMaxq+XJV+iI0dC1eZ7xNUlaIBw+UbifxCG335L3E0=
=dHq1
-----END PGP SIGNATURE-----
Merge tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- One fix for the zcrypt driver to correctly handle incomplete
encryption/decryption operations.
- A cleanup for the aqmask/apmask parsing to avoid variable length
arrays on the stack.
* tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: remove VLA usage from the AP bus
s390/crypto: Fix return code checking in cbc_paes_crypt()
Jann Horn points out that the vmacache_flush_all() function is not only
potentially expensive, it's buggy too. It also happens to be entirely
unnecessary, because the sequence number overflow case can be avoided by
simply making the sequence number be 64-bit. That doesn't even grow the
data structures in question, because the other adjacent fields are
already 64-bit.
So simplify the whole thing by just making the sequence number overflow
case go away entirely, which gets rid of all the complications and makes
the code faster too. Win-win.
[ Oleg Nesterov points out that the VMACACHE_FULL_FLUSHES statistics
also just goes away entirely with this ]
Reported-by: Jann Horn <jannh@google.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the quest to remove all stack VLA usage from the kernel[1], this
removes the VLA used for the emac xaht registers size. Since the size
of registers can only ever be 4 or 8, as detected in emac_init_config(),
the max can be hardcoded and a runtime test added for robustness.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christian Lamparter <chunkeey@gmail.com>
Cc: Ivan Mikhaylov <ivan@de.ibm.com>
Cc: netdev@vger.kernel.org
Co-developed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a IPS timeout error suppression on Broadwell and GVT bucked with
"Most critical one is to fix KVM's mm reference when we access guest memory,
issue was raised by Linus [1], and another one with virtual opregion fix."
[1] - https://lists.freedesktop.org/archives/intel-gvt-dev/2018-August/004130.html
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbmEJvAAoJEPpiX2QO6xPKNNMIAItkqjJpqsGZq0Ii1/ph3D9U
gX3Jj4H7ZoAoE/c+wx2nNADjmn8RhV7PnKrGFuXmLkRJaNJllXA8U7h7YfWqRBT9
jQ0RIuFdbfiRrwHmqtEZ2q3XP4HCWbdSIigimO/Lk/A2l+sCY8oA+9bGB7IRxeRV
e162GARNsxN0a8D+5i+KR0mrTezSoKOvYFtUFp76UKUDAKrK85XNQsn1TR8ZMXGC
fIi03y6ZM66I+bzIsq15HEj/jrRcaXm7xjd94/HetqeVJaEOEs2ztfaex+v3yDnC
meWZKWCJ5x9noqHTO3XdHJGlKnfFQxopZsINNiqpqwOKpH7oaHWyAnnUrafJfhA=
=/IQQ
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-fixes-2018-09-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
This contains a regression fix for video playbacks on gen 2 hardware,
a IPS timeout error suppression on Broadwell and GVT bucked with
"Most critical one is to fix KVM's mm reference when we access guest memory,
issue was raised by Linus [1], and another one with virtual opregion fix."
[1] - https://lists.freedesktop.org/archives/intel-gvt-dev/2018-August/004130.html
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180911223229.GA30328@intel.com
As reported by Reobert O'Callahan, since Viro's commit to kill
dev_ifsioc() we attempt to copy too much data in compat mode,
which may lead to EFAULT when the 32-bit version of struct ifreq
sits at/near the end of a page boundary, and the next page isn't
mapped.
Fix this by passing the approprate compat/non-compat size to copy
and using that, as before the dev_ifsioc() removal. This works
because only the embedded "struct ifmap" has different size, and
this is only used in SIOCGIFMAP/SIOCSIFMAP which has a different
handler. All other parts of the union are naturally compatible.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199469.
Fixes: bf4405737f ("kill dev_ifsioc()")
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "fallthru" with a proper "fall through" annotation.
This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace "fallthru" with a proper "fall through" annotation.
This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dennis rewrote a significant portion of the percpu allocator and has
shown that he can respond in a timely and helpful manner when issues
are reported against percpu allocator.
Let's make Dennis the percpu tree maintainer.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
When splitting a GSO segment that consists of encapsulated packets, the
skb->mac_len of the segments can end up being set wrong, causing packet
drops in particular when using act_mirred and ifb interfaces in
combination with a qdisc that splits GSO packets.
This happens because at the time skb_segment() is called, network_header
will point to the inner header, throwing off the calculation in
skb_reset_mac_len(). The network_header is subsequently adjust by the
outer IP gso_segment handlers, but they don't set the mac_len.
Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6
gso_segment handlers, after they modify the network_header.
Many thanks to Eric Dumazet for his help in identifying the cause of
the bug.
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When splitting a GSO segment that consists of encapsulated packets, the
skb->mac_len of the segments can end up being set wrong, causing packet
drops in particular when using act_mirred and ifb interfaces in
combination with a qdisc that splits GSO packets.
This happens because at the time skb_segment() is called, network_header
will point to the inner header, throwing off the calculation in
skb_reset_mac_len(). The network_header is subsequently adjust by the
outer IP gso_segment handlers, but they don't set the mac_len.
Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6
gso_segment handlers, after they modify the network_header.
Many thanks to Eric Dumazet for his help in identifying the cause of
the bug.
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2018-09-13
A few Bluetooth fixes for the 4.19-rc series:
- Fixed rw_semaphore leak in hci_ldisc
- Fixed local Out-of-Band pairing data handling
Let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca says:
====================
tls: don't leave keys in kernel memory
There are a few places where the RX/TX key for a TLS socket is copied
to kernel memory. This series clears those memory areas when they're no
longer needed.
v2: add union tls_crypto_context, following Vakul Garg's comment
swap patch 2 and 3, using new union in patch 3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This contains key material in crypto_send_aes_gcm_128 and
crypto_recv_aes_gcm_128.
Introduce union tls_crypto_context, and replace the two identical
unions directly embedded in struct tls_context with it. We can then
use this union to clean up the memory in the new tls_ctx_free()
function.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to copy the key to an on-stack buffer before calling
crypto_aead_setkey().
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update 'confirmed' timestamp when ARP packet is received. It shouldn't
affect locktime logic and anyway entry can be confirmed by any higher-layer
protocol. Thus it makes sense to confirm it when ARP packet is received.
Fixes: 77d7123342 ("neighbour: update neigh timestamps iff update is effective")
Signed-off-by: Vasily Khoruzhick <vasilykh@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fix addresses https://bugzilla.kernel.org/show_bug.cgi?id=201071
Commit 5025f7f7d5 wrongly relied on __dev_change_flags to notify users of
dev flag changes in the case when dev->rtnl_link_state = RTNL_LINK_INITIALIZED.
Fix it by indicating flag changes explicitly to __dev_notify_flags.
Fixes: 5025f7f7d5 ("rtnetlink: add rtnl_link_state check in rtnl_configure_link")
Reported-By: Liam mcbirnie <liam.mcbirnie@boeing.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fields ->dev and ->next of struct ipddp_route may be copied to
userspace on the SIOCFINDIPDDPRT ioctl. This is only accessible
to CAP_NET_ADMIN though. Let's manually copy the relevant fields
instead of using memcpy().
BugLink: http://blog.infosectcbr.com.au/2018/09/linux-kernel-infoleaks.html
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the Device Tree is not providing the per-port interrupts, do not fail
during b53_srab_irq_enable() but instead bail out gracefully. The SRAB driver
is used on the BCM5301X (Northstar) platforms which do not yet have the SRAB
interrupts wired up.
Fixes: 16994374a6 ("net: dsa: b53: Make SRAB driver manage port interrupts")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang says:
====================
vhost_net TX batching
This series tries to batch submitting packets to underlayer socket
through msg_control during sendmsg(). This is done by:
1) Doing userspace copy inside vhost_net
2) Build XDP buff
3) Batch at most 64 (VHOST_NET_BATCH) XDP buffs and submit them once
through msg_control during sendmsg().
4) Underlayer sockets can use XDP buffs directly when XDP is enalbed,
or build skb based on XDP buff.
For the packet that can not be built easily with XDP or for the case
that batch submission is hard (e.g sndbuf is limited). We will go for
the previous slow path, passing iov iterator to underlayer socket
through sendmsg() once per packet.
This can help to improve cache utilization and avoid lots of indirect
calls with sendmsg(). It can also co-operate with the batching support
of the underlayer sockets (e.g the case of XDP redirection through
maps).
Testpmd(txonly) in guest shows obvious improvements:
Test /+pps%
XDP_DROP on TAP /+44.8%
XDP_REDIRECT on TAP /+29%
macvtap (skb) /+26%
Netperf TCP_STREAM TX from guest shows obvious improvements on small
packet:
size/session/+thu%/+normalize%
64/ 1/ +2%/ 0%
64/ 2/ +3%/ +1%
64/ 4/ +7%/ +5%
64/ 8/ +8%/ +6%
256/ 1/ +3%/ 0%
256/ 2/ +10%/ +7%
256/ 4/ +26%/ +22%
256/ 8/ +27%/ +23%
512/ 1/ +3%/ +2%
512/ 2/ +19%/ +14%
512/ 4/ +43%/ +40%
512/ 8/ +45%/ +41%
1024/ 1/ +4%/ 0%
1024/ 2/ +27%/ +21%
1024/ 4/ +38%/ +73%
1024/ 8/ +15%/ +24%
2048/ 1/ +10%/ +7%
2048/ 2/ +16%/ +12%
2048/ 4/ 0%/ +2%
2048/ 8/ 0%/ +2%
4096/ 1/ +36%/ +60%
4096/ 2/ -11%/ -26%
4096/ 4/ 0%/ +14%
4096/ 8/ 0%/ +4%
16384/ 1/ -1%/ +5%
16384/ 2/ 0%/ +2%
16384/ 4/ 0%/ -3%
16384/ 8/ 0%/ +4%
65535/ 1/ 0%/ +10%
65535/ 2/ 0%/ +8%
65535/ 4/ 0%/ +1%
65535/ 8/ 0%/ +3%
Please review.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements XDP batching for vhost_net. The idea is first to
try to do userspace copy and build XDP buff directly in vhost. Instead
of submitting the packet immediately, vhost_net will batch them in an
array and submit every 64 (VHOST_NET_BATCH) packets to the under layer
sockets through msg_control of sendmsg().
When XDP is enabled on the TUN/TAP, TUN/TAP can process XDP inside a
loop without caring GUP thus it can do batch map flushing. When XDP is
not enabled or not supported, the underlayer socket need to build skb
and pass it to network core. The batched packet submission allows us
to do batching like netif_receive_skb_list() in the future.
This saves lots of indirect calls for better cache utilization. For
the case that we can't so batching e.g when sndbuf is limited or
packet size is too large, we will go for usual one packet per
sendmsg() way.
Doing testpmd on various setups gives us:
Test /+pps%
XDP_DROP on TAP /+44.8%
XDP_REDIRECT on TAP /+29%
macvtap (skb) /+26%
Netperf tests shows obvious improvements for small packet transmission:
size/session/+thu%/+normalize%
64/ 1/ +2%/ 0%
64/ 2/ +3%/ +1%
64/ 4/ +7%/ +5%
64/ 8/ +8%/ +6%
256/ 1/ +3%/ 0%
256/ 2/ +10%/ +7%
256/ 4/ +26%/ +22%
256/ 8/ +27%/ +23%
512/ 1/ +3%/ +2%
512/ 2/ +19%/ +14%
512/ 4/ +43%/ +40%
512/ 8/ +45%/ +41%
1024/ 1/ +4%/ 0%
1024/ 2/ +27%/ +21%
1024/ 4/ +38%/ +73%
1024/ 8/ +15%/ +24%
2048/ 1/ +10%/ +7%
2048/ 2/ +16%/ +12%
2048/ 4/ 0%/ +2%
2048/ 8/ 0%/ +2%
4096/ 1/ +36%/ +60%
4096/ 2/ -11%/ -26%
4096/ 4/ 0%/ +14%
4096/ 8/ 0%/ +4%
16384/ 1/ -1%/ +5%
16384/ 2/ 0%/ +2%
16384/ 4/ 0%/ -3%
16384/ 8/ 0%/ +4%
65535/ 1/ 0%/ +10%
65535/ 2/ 0%/ +8%
65535/ 4/ 0%/ +1%
65535/ 8/ 0%/ +3%
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implement TUN_MSG_PTR msg_control type. This type allows
the caller to pass an array of XDP buffs to tuntap through ptr field
of the tun_msg_control. Tap will build skb through those XDP buffers.
This will avoid lots of indirect calls thus improves the icache
utilization and allows to do XDP batched flushing when doing XDP
redirection.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implement TUN_MSG_PTR msg_control type. This type allows
the caller to pass an array of XDP buffs to tuntap through ptr field
of the tun_msg_control. If an XDP program is attached, tuntap can run
XDP program directly. If not, tuntap will build skb and do a fast
receiving since part of the work has been done by vhost_net.
This will avoid lots of indirect calls thus improves the icache
utilization and allows to do XDP batched flushing when doing XDP
redirection.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces to a new tun/tap specific msg_control:
#define TUN_MSG_UBUF 1
#define TUN_MSG_PTR 2
struct tun_msg_ctl {
int type;
void *ptr;
};
This allows us to pass different kinds of msg_control through
sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF) which will
be used by the existed vhost_net zerocopy code. The second is XDP
buff, which allows vhost_net to pass XDP buff to TUN. This could be
used to implement accepting an array of XDP buffs from vhost_net in
the following patches.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch split out XDP logic into a single function. This make it to
be reused by XDP batching path in the following patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we're sure not to go native XDP, there's no need for several things
like bh and rcu stuffs. So this patch introduces a helper to build skb
and hold page refcnt. When we found we will go through skb path, build
skb directly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's no need to duplicate page get logic in each action. So this
patch tries to get page and calculate the offset before processing XDP
actions (except for XDP_DROP), and undo them when meet errors (we
don't care the performance on errors). This will be used for factoring
out XDP logic.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch move the bh enabling a little bit earlier, this will be
used for factoring out the core XDP logic of tuntap.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>