The "by8" implementation introduced in commit 22cddcc7df ("crypto: aes
- AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it
handles counter block overflows differently. It only accounts the right
most 32 bit as a counter -- not the whole block as all other
implementations do. This makes it fail the cryptomgr test #4 that
specifically tests this corner case.
As we're quite late in the release cycle, just disable the "by8" variant
for now.
Reported-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The following bug can be triggered by hot adding and removing a large number of
xen domain0's vcpus repeatedly:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath
Last level cache shared mask is built during CPU up and the
build_sched_domain() routine takes advantage of it to setup
the sched domain CPU topology.
However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domainCPU topology.
This patch fix it by releasing the llc_shared_mask correctly
during CPU disable.
Yasuaki also reported that this can happen on real hardware:
https://lkml.org/lkml/2014/7/22/1018
His case is here:
==
Here is an example on my system.
My system has 4 sockets and each socket has 15 cores and HT is
enabled. In this case, each core of sockes is numbered as
follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119
Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
It means that last level cache of Socket#2 is shared with
CPU#30-44 and 90-104.
When hot-removing socket#2 and #3, each core of sockets is
numbered as follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
remains having 0x3fff80000001fffc0000000.
After that, when hot-adding socket#2 and #3, each core of
sockets is numbered as follows:
| CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119
Then llc_shared_mask of CPU#30 becomes
0x3fff8000fffffffc0000000. It means that last level cache of
Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
the wrong value.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A number of people are reporting seeing the "setup_efi_pci() failed!"
error message in what used to be a quiet boot,
https://bugzilla.kernel.org/show_bug.cgi?id=81891
The message isn't all that helpful because setup_efi_pci() can return a
non-success error code for a variety of reasons, not all of them fatal.
Let's drop the return code from setup_efi_pci*() altogether, since
there's no way to process it in any meaningful way outside of the inner
__setup_efi_pci*() functions.
Reported-by: Darren Hart <dvhart@linux.intel.com>
Reported-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Ulf Winkelvos <ulf@winkelvos.de>
Cc: Andre Müller <andre.muller@web.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
There is possibility with misconfigured pins that interrupt occurs instantly
after setting irq_set_chained_handler() in gpiochip_set_chained_irqchip().
Now if handler gets called before irq_set_handler_data() the handler gets
NULL handler data.
Fix this by moving irq_set_handler_data() call before
irq_set_chained_handler() in gpiochip_set_chained_irqchip().
Cc: Stable <stable@vger.kernel.org> # 3.15+
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
GPIO direction flags are not getting set because
an 'if' statement is the wrong way around.
Cc: Stable <stable@vger.kernel.org> # 3.15+
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
If the ccp is built as a built-in module, then ccp-crypto (whether
built as a module or a built-in module) will be able to load and
it will register its crypto algorithms. If the system does not have
a CCP this will result in -ENODEV being returned whenever a command
is attempted to be queued by the registered crypto algorithms.
Add an API, ccp_present(), that checks for the presence of a CCP
on the system. The ccp-crypto module can use this to determine if it
should register it's crypto alogorithms.
Cc: stable@vger.kernel.org
Reported-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Scot Doyle <lkml14@scotdoyle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
since commit 31964ffebb ("tty: serial: msm: Remove direct access to GSBI")'
serial hangs if earlyprintk are enabled.
This hang is noticed only when the GSBI driver is probed and all the
earlyprintks before gsbi probe are seen on the console.
The reason why it hangs is because GSBI driver disables hclk in its
probe function without realizing that the serial IP might be in use by
a bootconsole. As gsbi driver disables the clock in probe the
bootconsole locks up.
Turning off hclk's could be dangerous if there are system components
like earlyprintk using the hclk.
This patch fixes the issue by delegating the clock management to
probe and remove functions in gsbi rather than disabling the clock in probe.
More detailed problem description can be found here:
http://www.spinics.net/lists/linux-arm-msm/msg10589.html
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
- Fixes for the new memory region re-registration support
- iSER initiator error path fixes
- Grab bag of small fixes for the qib and ocrdma hardware drivers
- Larger set of fixes for mlx4, especially in RoCE mode
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJUIexdAAoJEENa44ZhAt0hP10QAJztxlS2a8U3JCJzthwSYxlI
ohT9487iLk1uEcj4Z3i7w2ERRUzXaHbRTktNHFjwfRb8x2qMUgT2PfD6/30sQ250
nJAk3FRFNipxKkJSfmcc3+O4r91i4F+CaN8DGypaBDHcupeD2drKocl/Iu5MIvkG
e5CzLlS7i/xrWKmgYP4bIqqFZsqQ+2rJrYBDybuLZSaZNd0PTDE3yCDihfOcsxjn
TeOCVbm5895fPRtxzeCGHy8bXbYYN9vItuhtHC+sntYtbhNJhjpmP+1yD6M2SoZR
34sGd7AA1j1H6ATmanzeW2aALkFYPIuGihDbbnRQlDG1v09lEPfP2GtfLxoQ9Ibo
nfe2rsthzV6Qh2xcXjn6KicgV7bb6aSUXEK24zKx7O3MkOvHkOC/JIIrd9dFe+uj
R7pUd3XlAk8SBhTQ4gLub06Dl7ynzSRArwcdMTHp30LvtnjJZoQR67WGGrsdwlIW
MV43105i7iLCcdaSd0ihKnR6OFlSh13Z0wpu+B386bwxkHxjFJXkVHxOJir/iAk9
cW4RXbA/ic7nwIjes4GbMNDOvdJO2tDcg9KGSgiDY3kC5GksPqfxXYVDlMB2rFoE
PhfQ8TOcbZYTmlcKLMpMIFXP484VPhWQJeYWPOf9KGS6aW5QRNPsPCmAvaoSXWLs
GVSlvjbE6O7MgonqG1Jh
=Kpm1
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
"Last late set of InfiniBand/RDMA fixes for 3.17:
- fixes for the new memory region re-registration support
- iSER initiator error path fixes
- grab bag of small fixes for the qib and ocrdma hardware drivers
- larger set of fixes for mlx4, especially in RoCE mode"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
IB/mlx4: Fix VF mac handling in RoCE
IB/mlx4: Do not allow APM under RoCE
IB/mlx4: Don't update QP1 in native mode
IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
IB/core: When marshaling uverbs path, clear unused fields
IB/mlx4: Avoid executing gid task when device is being removed
IB/mlx4: Fix lockdep splat for the iboe lock
IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
IB/mlx4: Reorder steps in RoCE GID table initialization
IB/mlx4: Don't duplicate the default RoCE GID
IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
IB/iser: Bump version to 1.4.1
IB/iser: Allow bind only when connection state is UP
IB/iser: Fix RX/TX CQ resource leak on error flow
RDMA/ocrdma: Use right macro in query AH
RDMA/ocrdma: Resolve L2 address when creating user AH
mlx4: Correct error flows in rereg_mr
IB/qib: Correct reference counting in debugfs qp_stats
IPoIB: Remove unnecessary port query
...
One fix is about a buggy compuation in PCM API function Clemens
spotted out, but the impact must be really small as no one
really uses it in user-space side. The rest are a trivial fix
for a HD-audio model and a USB-audio device-specific regression
fix, so all look fairly safe to apply.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJUIeYoAAoJEGwxgFQ9KSmkxQAP/jnhL0sGaVlqPP8UkxlQAEFo
KuY+luKFttLDU/IixGkjpraVyvAG8pKwRkrMLLxy5N6WaX7vAbn45m714xlgRwaS
EnREMes7hMnyH8KLyjxsSfnrKdGdXYET+3y2JcKAchcWK0UMwWuwptXZRaUFZl+X
9J0dDtuzEi8Lt/UeBkzMQXNQxHdJHFucNFKPe6oWHovV0f3AjgCsWHzpDyyGvAKQ
dEAymw5AJF+oiAwdv+GEsW40jXAmRfstE7PadcyaxNlpHBzrXL7oWZfnOf1e0A2X
H+g/Imj+XSk3L5HRL638amkvb/FmGdydD+vqeO4LCxt6kZKn53rPRNF/gSEU6NEF
Ms0BHAji+PRcXcex7tW+RX7SxARJji7fTEFD4tv6P/Q9hwN2zqKM9qbKDZmdt4pz
Cl3ldUv6Xi4PV7onLZVRW2Fv7XbgDcnzGzBYIWjkYp9TotqWxcEHOKIY/4Gdfjt4
SNZbJ8ESfPRcjRT8m+e9XayOpgFi3iwYMs+26vAULopPeFrQqMf7wN22qZF+A1En
iHRdLKpnBnuGrmFVVFo2KjTpWf5bTQQFoDEjTm/FZQzkwn1u+RWqjH/c+KN6wYAq
ewPGOYr8IBtFx7EkzbyIwUAsAdastA6kzsKlm0LjaN22QPZa/dpuBg2uNKiBBcuG
mAwoS/dG/lQhwkh/cGYo
=GQ9M
-----END PGP SIGNATURE-----
Merge tag 'sound-3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"One fix is about a buggy computation in PCM API function Clemens
spotted out, but the impact must be really small as no one really uses
it in user-space side.
The rest are a trivial fix for a HD-audio model and a USB-audio
device-specific regression fix, so all look fairly safe to apply"
* tag 'sound-3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: snd-usb-caiaq: Fix LED commands for Kore controller
ALSA: pcm: fix fifo_size frame calculation
ALSA: hda - Add fixup model name lookup for Lemote A1205
Pull final block fixes from Jens Axboe:
"This week and last we've been fixing some corner cases related to
blk-mq, mostly. I ended up pulling most of that out of for-linus
yesterday, which is why the branch looks fresh. The rest were
postponed for 3.18.
This pull request contains:
- Fix from Christoph, avoiding a stack overflow when FUA insertion
would recursive infinitely.
- Fix from David Hildenbrand on races between the timeout handler and
uninitialized requests. Fixes a real issue that virtio_blk has run
into.
- A few fixes from me:
- Ensure that request deadline/timeout is ordered before the
request is marked as started.
- A potential oops on out-of-memory, when we scale the queue
depth of the device and retry.
- A hang fix on requeue from SCSI, where the hardware queue
would be stopped when we attempt to re-run it (and hence
nothing would happen, stalling progress).
- A fix for commit 2da78092, where the cleanup path was moved
to RCU, but a debug might_sleep() was inadvertently left in
the code. This causes warnings for people"
* 'for-linus' of git://git.kernel.dk/linux-block:
genhd: fix leftover might_sleep() in blk_free_devt()
blk-mq: use blk_mq_start_hw_queues() when running requeue work
blk-mq: fix potential oops on out-of-memory in __blk_mq_alloc_rq_maps()
blk-mq: avoid infinite recursion with the FUA flag
blk-mq: Avoid race condition with uninitialized requests
blk-mq: request deadline must be visible before marking rq as started
Pull parisc fixes from Helge Deller:
"We avoid using -mfast-indirect-calls for 64bit kernel builds to
prevent building an unbootable kernel due to latest gcc changes.
In the pdc_stable/firmware-access driver we fix a few possible stack
overflows and we now call secure_computing_strict() instead of
secure_computing() which fixes upcoming SECCOMP patches in the
for-next trees"
* 'parisc-3.17-7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
parisc: pdc_stable.c: Avoid potential stack overflows
parisc: pdc_stable.c: Cleaning up unnecessary use of memset in conjunction with strncpy
parisc: ptrace: use secure_computing_strict()
This reverts commit f23cf8bd5c ("efi/x86: efistub: Move shared
dependencies to <asm/efi.h>") as well as the x86 parts of commit
f4f75ad574 ("efi: efistub: Convert into static library").
The road leading to these two reverts is long and winding.
The above two commits were merged during the v3.17 merge window and
turned the common EFI boot stub code into a static library. This
necessitated making some symbols global in the x86 boot stub which
introduced new entries into the early boot GOT.
The problem was that we weren't fixing up the newly created GOT entries
before invoking the EFI boot stub, which sometimes resulted in hangs or
resets. This failure was reported by Maarten on his Macbook pro.
The proposed fix was commit 9cb0e39423 ("x86/efi: Fixup GOT in all
boot code paths"). However, that caused issues for Linus when booting
his Sony Vaio Pro 11. It was subsequently reverted in commit
f3670394c2.
So that leaves us back with Maarten's Macbook pro not booting.
At this stage in the release cycle the least risky option is to revert
the x86 EFI boot stub to the pre-merge window code structure where we
explicitly #include efi-stub-helper.c instead of linking with the static
library. The arm64 code remains unaffected.
We can take another swing at the x86 parts for v3.18.
Conflicts:
arch/x86/include/asm/efi.h
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Tested-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Leif Lindholm <leif.lindholm@linaro.org> [arm64]
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
In spite of what the GCC manual says, the -mfast-indirect-calls has
never been supported in the 64-bit parisc compiler. Indirect calls have
always been done using function descriptors irrespective of the
-mfast-indirect-calls option.
Recently, it was noticed that a function descriptor was always requested
when the -mfast-indirect-calls option was specified. This caused
problems when the option was used in application code and doesn't make
any sense because the whole point of the option is to avoid using a
function descriptor for indirect calls.
Fixing this broke 64-bit kernel builds.
I will fix GCC but for now we need the attached change. This results in
the same kernel code as before.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Commit acbf74a763 ("vxlan: Refactor vxlan driver to make use of the common UDP tunnel functions." introduced a bug in vxlan_xmit_one()
function, causing it to transmit Vxlan packets without proper
Vxlan header inserted. The change was not needed in the first
place. Revert it.
Reported-by: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"select NET" in drivers/scsi/Kconfig
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUIb7QAAoJEKurIx+X31iBdCgP/A6gtnOlkybSlxLGjATlObI6
bqYGixJd5+Po0bHJE/6Q2gepTBWRADCizexfzA8tDsuJbrQdpwpJ/4BCxIQMbQHe
qWlQS/d07U5TbUh1HDQrBdseMUgJsqqUTEaUw/zuvfHHCDE4W08PoGn+KlJJv4RU
ytV3Liph+NLaJbkEFftVcW+uIJ8Qsibd6k37oPA5AWsOyn4NiOBv2twaz20Kcowa
MvzutUfogLAqbCl5Bh/7xEREuKdnmzlz2b+/Zc5/TgWLMjqJuJHJqdEJKm4Nag4D
jbqw2Ji7hkX6eUL33MlmySVZqKlvLvqvaFdSQMB3wwTV5zAR19wz9ZXCgvanB+i4
aRn1zFpQWB51kQSUuy9jyaQ+8n+DPGeF1WLJrWKEVSV+SfKeOntnn224DqZirGIe
UZea73MPMEjMGbQh477EwR5gaOTwaqQgNo403qIU8Iee8i5KVJp3tdAIf1zOUDed
c/eziXvKGygwr0nyBwJzwDFE15TLUzkkv4IKnJje1RRT1ElTrsOMuGpl6EORy/Lm
XabBCmiyU3l4Aj9+oqmxkDR22SOJ0fFBdeWKeufktfeIyRbW3TOsgSwKiMswAF1E
jA2y3d7WhT76ceFZhQZ9kdT/6nz4CMpqOragUbMjw9bqCXjk5nAFMoZYXAHxSw4l
ToG7whl3TYXs7OK4PIhe
=GdXa
-----END PGP SIGNATURE-----
Merge tag 'please-pull-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 defconfig update from Tony Luck:
"Need to rebuild defconfig files to cope with removal of "select NET"
in drivers/scsi/Kconfig"
* tag 'please-pull-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] refresh arch/ia64/configs/* using "make savedefconfig"
Add support for two more processors to fam15h_power driver.
Also fix a bug in the same driver to only report the power level
on chips which actually support reporting it.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUIZ6JAAoJEMsfJm/On5mBD6UQAIZsZr17e5tqBRBQVpT67Uhc
8bQOL4bdxmkCF9YDCDFZ+R2tbd49T7zzitOdlFFbPv0/t9gQctZzW7YQ+plVh/+w
eTm40imyh+Le6MDXxqgZdCZR1Y5/J15mTYGdCUQ5ld3p9yuJFbSoI7rxUfIeH5ER
V7D/TZbf3U3m3kFhz5OalnBFNWMKhImGxOgfd/ulABZgvwbnwZCtersvjlYDHOsI
5SpBFCLvH0QD6khVGUgAT85XuHZZi+em3mhLAV4gpYO0NQDLotL/JONO5QslFMVQ
bdNzTX54gRN7xZKp4kQUJllsc25htavVw531Yw9HUo0K9pPcjGruuGpcawm4zruY
4VNWdULm3MWdkS6LYCCRlKd2zW89ibcP1Ehwf+biD4Ucnx4J7QYeZ5pLl5oSG5Sk
3w2KSIM627PfL724wvZz2vvOV0CB62wB6TbX5XSHeN7jP+7dH2cTFAJycVfBBZLv
nCXmvshLqQcjboW/XLrLAfu5r2nRWcZH30uL+D/VSI5o9kGEHj1d9xPu1BiWgpov
xBe+mDWpHWFC6Qx74BqRwS2sCcLO3IMXXdcIOyV5jLeCHCIwJApypwfqsCeKfGPn
hKjJqv8bfRhprMEEG2iFqYswTF3BM5QpDHt1LiTE059OWiBhI4z7Qn2VRUV0aDrh
GAXPYFOhQb0atjYskMJ3
=i4EB
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Fix a resource leak in tmp103 driver
- Add support for two more processors to fam15h_power driver
- Also fix a bug in the same driver to only report the power level on
chips which actually support reporting it
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (tmp103) Fix resource leak bug in tmp103 temperature sensor driver
hwmon: (fam15h_power) Add support for two more processors
hwmon: (fam15h_power) Make actual power reporting conditional
Prompted by a change to drivers/scsi/Kconfig which used to do a
"select NET" but now does a "depends on NET". This meant that some
configurations ended up without CONFIG_NET=y
Signed-off-by Tony Luck <tony.luck@intel.com>
In order to make TCP more resilient in presence of reorders, we need
to allow coalescing to happen when skbs from out of order queue are
transferred into receive queue. LRO/GRO can be completely canceled
in some pathological cases, like per packet load balancing on aggregated
links.
I had to move tcp_try_coalesce() up in the file above tcp_ofo_queue()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current ICMP rate limiting uses inetpeer cache, which is an RBL tree
protected by a lock, meaning that hosts can be stuck hard if all cpus
want to check ICMP limits.
When say a DNS or NTP server process is restarted, inetpeer tree grows
quick and machine comes to its knees.
iptables can not help because the bottleneck happens before ICMP
messages are even cooked and sent.
This patch adds a new global limitation, using a token bucket filter,
controlled by two new sysctl :
icmp_msgs_per_sec - INTEGER
Limit maximal number of ICMP packets sent per second from this host.
Only messages whose type matches icmp_ratemask are
controlled by this limit.
Default: 1000
icmp_msgs_burst - INTEGER
icmp_msgs_per_sec controls number of ICMP packets sent per second,
while icmp_msgs_burst controls the burst size of these packets.
Default: 50
Note that if we really want to send millions of ICMP messages per
second, we might extend idea and infra added in commit 04ca6973f7
("ip: make IP identifiers less predictable") :
add a token bucket in the ip_idents hash and no longer rely on inetpeer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Will Deacon pointed out, that the currently used opcode for filling holes,
that is 0xe7ffffff, seems not robust enough ...
$ echo 0xffffffe7 | xxd -r > test.bin
$ arm-linux-gnueabihf-objdump -m arm -D -b binary test.bin
...
0: e7ffffff udf #65535 ; 0xffff
... while for Thumb, it ends up as ...
0: ffff e7ff vqshl.u64 q15, <illegal reg q15.5>, #63
... which is a bit fragile. The ARM specification defines some *permanently*
guaranteed undefined instruction (UDF) space, for example for ARM in ARMv7-AR,
section A5.4 and for Thumb in ARMv7-M, section A5.2.6.
Similarly, ptrace, kprobes, kgdb, bug and uprobes make use of such instruction
as well to trap. Given mentioned section from the specification, we can find
such a universe as (where 'x' denotes 'don't care'):
ARM: xxxx 0111 1111 xxxx xxxx xxxx 1111 xxxx
Thumb: 1101 1110 xxxx xxxx
We therefore should use a more robust opcode that fits both. Russell King
suggested that we can even reuse a single 32-bit word, that is, 0xe7fddef1
which will fault if executed in ARM *or* Thumb mode as done in f928d4f2a8
("ARM: poison the vectors page"). That will still hold our requirements:
$ echo 0xf1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -m arm -D -b binary test.bin
...
0: e7fddef1 udf #56801 ; 0xdde1
$ echo 0xf1defde7f1defde7f1defde7 | xxd -r > test.bin
$ arm-unknown-linux-gnueabi-objdump -marm -Mforce-thumb -D -b binary test.bin
...
0: def1 udf #241 ; 0xf1
2: e7fd b.n 0x0
4: def1 udf #241 ; 0xf1
6: e7fd b.n 0x4
8: def1 udf #241 ; 0xf1
a: e7fd b.n 0x8
So on ARM 0xe7fddef1 conforms to the above UDF pattern, and the low 16 bit
likewise correspond to UDF in Thumb case. The 0xe7fd part is an unconditional
branch back to the UDF instruction.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
yesterday's pull request. Normally I would have waited for
some other patches to pile up, but since 3.17 might be short
here it is.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJUIXPRAAoJEBvWZb6bTYbyySEP/AkPjfNGYqwBbM8GUJ4tt4gR
C+xbiO+xPr2qCwfi36DQtL0UPwJHWSq7SXaDMvSqMo22FjnFcVaGuQcGAPno/8ZA
tLBe1km9HIPlEIV3vpoe8PPpj9cuZ86+YOCuPIqK5fC7l6Ops0dhCOjf88tmPVQ4
yhodpJ1Lt/sPBUWb6pNfk0iWD+qSbfUWtwzv89oudEvLcLiAcPSBdbvnxVS3bmGm
qbL8pvCOozK5GJbl0+cYWCoEPBP5ekqGvwvGdEBTx+4qv2S2htzUX30UA2VYy5IR
jMXVrJbvSW9FXQdBgr0Q4ql6evOVjL+5LpwgRUC6tuC6r1rMP+nXyHKS35HC1i8W
FYr62B/LZTm4vyDHsmsiEl43VLAcF7kmXufQT62vJg+ifeA1MAMIJra7ZDx8WbsD
HDqM+CeaZrF3p4okRrktbecQdeFFyg4wOasHRTO7SETkbP7i1cS0Mp8rRbX3CnJq
0UM8STe+hViXR7uYZEbjlbGKkszDS49fstJIaNm9JPJm+S/V5/MZFelNWgPp/+kF
xpQUxtoSaFnqgBXpRZ7t2Y2zGeZkMWn/P84S23/7K1TfRPCsUpgFbiY26rGW9l4v
r8gz7v+V1gCzWYVRuEzolFrea6A1ik2sspzeDuZOrf+QYwMyyUdEQ/NfCm032wba
CYL0V2M/dJNmZnZRPP9/
=ZkSE
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull another kvm fix from Paolo Bonzini:
"Another fix for 3.17 arrived at just the wrong time, after I had sent
yesterday's pull request. Normally I would have waited for some other
patches to pile up, but since 3.17 might be short here it is"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm/arm64: KVM: Fix unaligned access bug on gicv2 access
Conflicts:
arch/mips/net/bpf_jit.c
drivers/net/can/flexcan.c
Both the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull cgroup fix from Tejun Heo:
"One late fix for cgroup.
I was waiting for another set of fixes for a long-standing obscure
cpuset bug but am not sure whether they'll be ready before v3.17
release. This one is a simple fix for a mutex unlock balance bug in
an allocation failure path in pidlist_array_load().
The bug was introduced in v3.14 and the fix is tagged for -stable"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix unbalanced locking
In the brcmf_count_20mhz_channels function we are looping through a list
of channels received from firmware. Since the index of the first channel
is 0 the condition leads to an off by one bug. This is causing us to hit
the WARN_ON_ONCE(1) calls in the brcmu_d11n_decchspec function, which is
how I discovered the bug.
Introduced by:
commit b48d891676
("brcmfmac: rework wiphy structure setup")
Acked-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Emil Goode <emilgoode@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Otherwise we may fail to init the second compute ring.
Noticed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Check the correct bit for audio. Seems like a copy-paste error from the
start:
commit 9ed109a7b4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Apr 24 23:54:52 2014 +0200
drm/i915: Track has_audio in the pipe config
Reported-by: Martin Andersen <martin.x.andersen@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82756
Cc: stable@vger.kernel.org # 3.16+
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When the audio stream is paused or suspended we stop the sDMA and when it
is unpaused/resumed we start the channel without reconfiguring it.
The omap_dma_stop() clears the link configuration when we pause the dma, but
it is not setting it back on start. This will result only one audio buffer
to be played back and the DMA will stop, since the linking is disabled.
We need to restore the CLINK_CTRL register in case of resume.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add mb() call to resume path to ensure the necessary barrier.
Resume can happen after waking up from suspend for example.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Ring init and cleanup are not balanced because we re-init the rings on
resume without having cleaned them up on suspend. This leads to the
driver leaking the parser's hash tables with a kmemleak signature such
as this:
unreferenced object 0xffff880405960980 (size 32):
comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s)
hex dump (first 32 bytes):
d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F.............
98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(.............
backtrace:
[<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0
[<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915]
[<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915]
[<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915]
[<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915]
[<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915]
[<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915]
[<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm]
[<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm]
[<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915]
[<ffffffff81436725>] local_pci_probe+0x45/0xa0
[<ffffffff81437a69>] pci_device_probe+0xd9/0x130
[<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0
[<ffffffff815252d3>] __driver_attach+0x93/0xa0
[<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0
This patch extends the current convention of checking whether a
resource is already allocated before allocating it during ring init.
Longer term it might make sense to only init the rings once.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794
Tested-by: Kari Suvanto <kari.tj.suvanto@gmail.com>
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This change adds support for the Linux PTP Hardware clock and timestamping
functionality provided by the hardware. There are actually two cases that
this timestamping is meant to support.
The first case would be an ordinary clock scenario. In this configuration
the host interface does not have access to BAR 4. However all of the host
interfaces should be locked into the same boundary clock region and as such
they are all on the same clock anyway. With this being the case they can
synchronize among themselves and only need to adjust the offset since they
are all on the same clock with the same frequency.
The second case is a boundary clock scenario. This is a special case and
would require both BAR 4 access, and a means of presenting a netdev per
boundary region. The current plan is to use DSA at some point in the
future to provide these interfaces, but the DSA portion is still under
development.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the messaging support needed to support PTP. In the case
of Tx timestamps it is necessary for the Switch Management entity to return
the frames via the mailbox as the host interface cannot know which port the
timestamp will be delivered to. In addition there is only one clock on the
entire switch, as such the entity that has BAR 4 access is the only one who
can actually update the frequency as it is the only one with access.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds limited debugfs support for the driver. Most of the
functionality needed for dumping registers is already provided via ethtool.
The only thing we saw that we really neeed was the ability to dump the
descriptor rings so as such this patch will add a fm10k directory containing a
listing of directories each one with a unique PCI Bus, Device, and Function
number. Each of those BDF directories will have a list of q_vectors, and
the q_vectors will contain a file for each of the Rx/Tx rings that are a part
of the vector. For example:
# ls -RD /sys/kernel/debug/fm10k/
/sys/kernel/debug/fm10k/:
0000:01:00.0
/sys/kernel/debug/fm10k/0000:01:00.0:
q_vector.000 q_vector.001 q_vector.002 q_vector.003
/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000:
rx_ring.000 tx_ring.000
/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.001:
rx_ring.001 tx_ring.001
/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.002:
rx_ring.002 tx_ring.002
/sys/kernel/debug/fm10k/0000:01:00.0/q_vector.003:
rx_ring.003 tx_ring.003
# cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/rx_ring.000
DES DATA RSS STATERR LENGTH VLAN DGLORT SGLORT TIMESTAMP
---------------------------------------------------------------------------
000 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x13951807dc4fedf0
001 0x00000000 0x00000000 0x00000003 0x002a 0x0000 0x0000 0x0000 0x1395180906c9f2c8
002 0x3731c000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
003 0x3731d000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
004 0xaab3a000 0x00000000 0x00000000 0x0000 0x0000 0x0000 0x0000 0x0000000000000000
...
# cat /sys/kernel/debug/fm10k/0000:01:00.0/q_vector.000/tx_ring.000
DES BUFFER_ADDRESS LENGTH VLAN MSS HDRLEN FLAGS
---------------------------------------------------------
000 0x00000000aa8a1002 0x005a 0x0000 0x0000 0x0000 0xc0
001 0x00000000aa8a2002 0x005a 0x0000 0x0000 0x0000 0xc0
002 0x000000006bc13202 0x004e 0x0000 0x0000 0x0000 0xc0
003 0x000000006bc13c02 0x002a 0x0000 0x0000 0x0000 0xe1
004 0x000000006bc13602 0x0062 0x0000 0x0000 0x0000 0xc0
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for management of the limited QOS features of the
FM10000 interface. Specifically we can support up to 8 traffic classes,
however the part only provides 1 Rx and 1 Tx FIFO in the host interface and
as a result this can lead to head-of-line blocking on Rx. This can be
avoided by setting PFC only for priorities that cannot afford to drop
frames.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch combines the recently added VF messaging and configuration
functionality with the interfaces provided by the kernel to allow for
configuration and management of SR-IOV.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds a set of functions to fm10k_pf.c which allows for
configuring the VF via a set of standardized TLV messages.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch provides the functions necessary to configure the VF making use
of the same API pointers as the PF.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for the PF <-> VF mailbox. It functions similar to
the PF <-> SM mailbox however there are several modifications made to
improve the reliability of the mailbox itself. In addition the PF/VF
mailbox is much smaller an only supports a total size of 16 DWORDs vs the
1024 DWORDS provided for the PF/SM mailbox.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for L2 MACVLAN by making use of the fact that the
RRC provides a unique tag per filter called a Global Resource Tag, or GLORT.
In the case of this offload what I have done is assigned a linear block of
these so that each GLORT represents one of the MACVLAN netdevs. By doing
this I can share the Rx queues and Tx queues for all of the MACVLAN netdevs
while allowing them to be demuxed in the Rx cleanup path.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for basic offloads including TSO, Tx checksum, Rx
checksum, Rx hash, and the same features applied to VXLAN/NVGRE tunnels.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch takes the driver from supporting a single queue to supporting
multiple queues. The upper queue limit for the PF is 128 queues and the
upper limit for the VF is (128 / num_vfs) rounded down to nearest power of 2.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add PCI power management and error handling to allow the device to support
suspend/resume and recovery of any PCIe errors. The fm10k devices do not
support wake on LAN, and there is no plan to add this as a feature.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds basic ethtool support to the device to allow for configuration.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the transmit and receive fastpath and interrupt handlers.
With this code in place the network device is now able to send and receive
frames over the network interface using a single queue.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for allocating, configuring, and freeing Tx/Rx ring
resources. With these changes in place the descriptor queues are in a
state where they are ready to transmit or receive if provided buffers.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for the service task. The service task takes care
of all processes that cannot be done in interrupt context such as resets,
stats updates, TC prio updates, and checking for hung or detached devices.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the defines and structures necessary to support both Tx
and Rx descriptor rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch set adds interrupt support for the fm10k interfaces. The
interfaces themselves only support MSI-X, so neither MSI or legacy
interrupts are used.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for brining the interface up/down. This is still primitive yet
as we have not yet added support for the descriptor queues.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for L2 filtering.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>