The pch_gbe driver includes a 'copybreak' parameter which appears to
have been copied from the e1000e driver but is entirely unused. Remove
the dead code.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct net are effectively allocated from order-1 pages on x86,
with one object per slab, meaning that the 13 low order bits
of their addresses are zero.
Once shifted by L1_CACHE_SHIFT, this leaves 7 zero-bits,
meaning that net_hash_mix() does not help spreading
objects on various hash tables.
For example, TCP listen table has 32 buckets, meaning that
all netns use the same bucket for port 80 or port 443.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In update_vf():
cftree_remove(cl);
update_cfmin(cl->cl_parent);
the cl_cfmin of cl->cl_parent is intentionally updated to 0
when that parent only has one child. And if this parent is
root qdisc, we could end up, in hfsc_schedule_watchdog(),
that we can't decide the next schedule time for qdisc watchdog.
But it seems safe that we can just skip it, as this watchdog is
not always scheduled anyway.
Thanks to Marco for testing all the cases, nothing is broken.
Reported-by: Marco Berizzi <pupilla@libero.it>
Tested-by: Marco Berizzi <pupilla@libero.it>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
random_ether_addr is a #define for eth_random_addr which is
generally preferred in kernel code by ~3:1
Convert the uses of random_ether_addr to enable removing the #define
Miscellanea:
o Convert &vfmac[0] to equivalent vfmac and avoid unnecessary line wrap
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
net: dccp: fixes around rx_tstamp_last_feedback
This patch series fix some issues with rx_tstamp_last_feedback.
- Switch to monotonic clock.
- Avoid potential overflows on fast hosts/networks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.
Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a sparse warning about using an incorrect type in
argument 2 of ocelot_write_rix(), as an u32 was expected but a __be32
was given. The conversion to u32 is forced, which is safe as the value
will be written as-is in the hardware without any modification.
Fixes: 08d02364b1 ("net: mscc: fix the injection header")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can led to
various issues.
This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.
Fixes: 8d5047cf9c ("net: mvneta: Convert to be 64 bits compatible")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent patch updated e1000 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent patch updated e100 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently documentation file was converted to rst. The document title
has the incorrect heading adornment. From kernel docs:
* Please stick to this order of heading adornments:
1. ``=`` with overline for document title::
==============
Document title
==============
Add overline heading adornment to document title.
Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently documentation file was converted to rst. The document title
has the incorrect heading adornment. From kernel docs:
* Please stick to this order of heading adornments:
1. ``=`` with overline for document title::
==============
Document title
==============
Add overline heading adornment to document title.
Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The GPIO subsystem provides dummy GPIO consumer functions if GPIOLIB is
not enabled. Hence drivers that depend on GPIOLIB, but use GPIO consumer
functionality only, can still be compiled if GPIOLIB is not enabled.
Relax the dependency on GPIOLIB if COMPILE_TEST is enabled, where
appropriate.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.
Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.
Fixes: fc4eba58b4 ("ipv6: make unsolicited report intervals configurable for mld")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), this will lead NULL
dereferencing. Fixing by checking sock pointer before.
Fixes: bab632d69e ("vhost: vhost TX zero-copy support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current logic incorrectly calculates the LLC ID from the APIC ID.
Unless specified otherwise, the LLC ID should be calculated by removing
the Core and Thread ID bits from the least significant end of the APIC
ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h
PPR document.
[ bp: Improve commit message. ]
Fixes: 68091ee7ac ("Calculate last level cache ID from number of sharing threads")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.
The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:
CPU1 CPU2
cgwb_bdi_unregister()
cgwb_kill(*slot);
cgwb_release()
queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
wb = list_first_entry(&bdi->wb_list, ...)
spin_unlock_irq(&cgwb_lock);
wb_shutdown(wb);
...
kfree_rcu(wb, rcu);
wb_shutdown(wb); -> oops use-after-free
We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().
Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).
However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.
Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.
This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.
Fixes: f992b4dfd5 ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We inspect __kpti_forced early on as part of the cpufeature enable
callback which remaps the swapper page table using non-global entries.
Ensure that __kpti_forced has been updated to reflect the kpti=
command-line option before we start using it.
Fixes: ea1e3de85e ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0")
Cc: <stable@vger.kernel.org> # 4.16.x-
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622100820.29616-1-geert@linux-m68k.org
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.
However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods. This may break code that relies on
receiving back a non-zero value.
jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:
- include/linux/jiffies.h does #error if HZ >= 12288,
- kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).
Broken since forever.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_PSW-BPB, its paragraph
number should now be 8.18.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.
This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.
While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
rest was obvious from function/variable naming. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Pull NVMe fixes from Christoph:
"Various relatively small fixes, mostly to fix error handling of various
sorts."
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-pci: limit max IO size and segments to avoid high order allocations
nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
nvme-fc: release io queues to allow fast fail
nvmet: reset keep alive timer in controller enable
nvme-rdma: don't override opts->queue_size
nvme-rdma: Fix command completion race at error recovery
nvme-rdma: fix possible free of a non-allocated async event buffer
nvme-rdma: fix possible double free condition when failing to create a controller
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
-----BEGIN PGP SIGNATURE-----
iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlss7fgVHG1hcmMuenlu
Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjosQAKG9+NIpcKx15k6Rzw+WcpEwk3IY
yx47MR9ndgfypQGrJ1bczbaP0xzMW/snErtN2iKGv/q8QdaKI0fj63XFkDMcfp+i
RdsIrWyXkhATbFvChKvp2jUWPoDKAO+eUYpwKk60QqT/NEzo6JoLyRXcGwIruHoZ
pzJEZXnmNUTwgAWnJlLGVCJ17lDVd39Gp9/z97MzOyjFCVpfJ61HsEWAtjmiCQr5
opHcdJ5KBW7XOU4uRqM4QEIruO3Hq56L9VJJWfRf2gdnFQowyMhKNErvuuSBpeBF
jHA+1nEWbxph97aEEYcnXDQt7wh/o7hZ7kGdGqmW03gJWZdMhXeivlsavXihkxIq
s3rUGRFhkIJ2+RwsmhSchlkkR9RFTItAAoMsRPwxAGcLA/2CBIPjdUFqbIse8O/h
mWvXVrYa6uVQQzxJgtbtiOFFVdBOSxr9uE4eq+8vKNkS+3UTs0vUgueB6QtERNtM
L1uyLGfyZVGuRvvIo5YTGO5vmRQdsBuPi4YtkJpdtks3TDZBw2bsf38ybyvPc963
0orsS065JiFbh/aCtcMUnD8nPkdfpWVEmSa470bsSNK0f7Wa311fawW+/wGGpGdh
2Jz/CIm5a29bIE+aR9k9B4QJIvXrX6uCiobfEHjQGJMC1DOJPVFvQhMn2JMnFc/F
LlulbKhiYbdhxAnS
=nzFZ
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-for-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/arm fixes for 4.18, take #1
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
Some injection testing resulted in the following console log:
mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
Kernel panic - not syncing: Machine check from unknown source
This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".
The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.
It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.
We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).
[ bp: Remove unneeded severity assignment. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.
However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.
Which leads to MCE records with an empty status value:
mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}
In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.
Tony: read the rest of the MCA bank to populate the struct mce fully.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
Function irq_desc_get_msi_desc() is not referenced in the kernel (and does
not seem to have been referenced since e39758e0ea, 3 years ago), so
delete it.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <marc.zyngier@arm.com>
Cc: <will.deacon@arm.com>
Cc: <kstewart@linuxfoundation.org>
Cc: <julien.thierry@arm.com>
Cc: <andrew@lunn.ch>
Cc: <trivial@kernel.org>
Link: https://lkml.kernel.org/r/1529667333-92959-1-git-send-email-john.garry@huawei.com
Enabling LPIs was made a lot stricter recently, by checking that they are
disabled before enabling them. By doing so, the CPU hotplug case was missed
altogether, which leaves LPIs enabled on hotplug off (expecting the CPU to
eventually come back), and won't write a different value anyway on hotplug
on.
So skip that check if that particular case is detected
Fixes: 6eb486b66a ("irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lkml.kernel.org/r/20180622095254.5906-8-marc.zyngier@arm.com
Similarily to the SYNC operation, it must be verified that the VPE
targetted by a VLPI is backed by a valid collection in the GIC driver data
structures.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-7-marc.zyngier@arm.com
It is possible, under obscure circumstances, to convince the ITS driver to
emit a SYNC operation that targets a collection that is not bound to any
redistributor (and the target_address field is zero) because the
corresponding CPU has not been seen yet (the system has been booted with
max_cpus="something small").
If the ITS is using the linear CPU number as the target, this is not a big
deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the
physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up
asking the ITS to write to the physical address zero, which is not exactly
a good idea (there has been report of the ITS locking up). This should of
course never happen, but hey, this is SW...
In order to avoid the above disaster, let's track which collections have
been actually initialized, and let's not generate a SYNC if the collection
hasn't been properly bound to a redistributor. Take this opportunity to
spit our a warning, in the hope that someone may report the issue if it
arrises again.
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
On a NUMA system, if an ITS is local to an offline node, the ITS driver may
pick an offline CPU to bind the LPI. In this case, pick an online CPU (and
the first one will do).
But on some systems, binding an LPI to non-local node CPU may cause
deadlock (see Cavium erratum 23144). In this case, just fail the activate
and return an error code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.com
On failing to allocate the required SPIs, the actual number of interrupts
should be freed and not its log2 value.
Fixes: de337ee301 ("irqchip/gic-v2m: Add PCI Multi-MSI support")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-4-marc.zyngier@arm.com
The ls-scfs-msi driver is not dealing with the effective affinity
as it should. Let's fix that, and make it clear that the effective
affinity is restricted to a single CPU. Also prevent the driver from
messing with the internals of the affinity setting infrastructure.
Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-3-marc.zyngier@arm.com
Debug is missing the IRQCHIP_SUPPORTS_LEVEL_MSI debug entry, making debugfs
slightly less useful.
Take this opportunity to also add a missing comment in the definition of
IRQCHIP_SUPPORTS_LEVEL_MSI.
Fixes: 6988e0e0d2 ("genirq/msi: Limit level-triggered MSI to platform devices")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-2-marc.zyngier@arm.com
When building perf with W=1 the following warning triggers:
CC kernel/events/ring_buffer.o
kernel/events/ring_buffer.c:105:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
static bool __always_inline
^~~~~~
...
Move the inline keyword to the beginning of the function declaration.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: trival@kernel.org
Link: http://lkml.kernel.org/r/20180308202856.9378-1-malat@debian.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAlssqQcACgkQnJ2qBz9k
QNl2XwgAhtb2LvBhWKh3OkBVczICgc9eQho03KiHJkEcR1OCkgjOJhVBDrQmJbBa
tAuMmUcnTpLJG8CI5TyR0G4hNbttE2LuTu+6RV67hXOjhiYAQ5P9wYLyUqfZf/01
myrPEewr9qqx3h2htqufiLQIyO1M4FeM37VqdH7vZhQOb+B+FUw7JB9a/HbCpNh/
7NUDf2GbLtLjK+2Xh0ttXvfWjgbLjC4wMmaPaa5+Nabn+URtvX9aHgQ/dYrOjQ16
7oD5K5x3k3sOT6Ix7xRGLHgE6Xl6MlTtxpyt5ldb96RwWO/GAuMhSCBd14nS2hmx
fEx5N2vbs3v8Ux+ZdMauzAKZI6Fz2g==
=oFo+
-----END PGP SIGNATURE-----
Merge tag 'for_v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf, quota, ext2 fixes from Jan Kara:
"UDF:
- fix an oops due to corrupted disk image
- two small cleanups
quota:
- a fixfor lru handling
- cleanup
ext2:
- a warning about a deprecated mount option"
* tag 'for_v4.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Drop unused arguments of udf_delete_aext()
udf: Provide function for calculating dir entry length
udf: Detect incorrect directory size
ext2: add warning when specifying nocheck option
quota: Cleanup list iteration in dqcache_shrink_scan()
quota: reclaim least recently used dquots
Commit:
79832f0b5f ("efi/libstub/tpm: Initialize pointer variables to zero for mixed mode")
fixes a problem with the tpm code on mixed mode (64-bit kernel on 32-bit UEFI),
where 64-bit pointer variables are not fully initialized by the 32-bit EFI code.
A similar problem applies to the efi_physical_addr_t variables which
are written by the ->get_event_log() EFI call. Even though efi_physical_addr_t
is 64-bit everywhere, it seems that some 32-bit UEFI implementations only
fill in the lower 32 bits when passed a pointer to an efi_physical_addr_t
to fill.
This commit initializes these to 0 to, to ensure the upper 32 bits are
0 in mixed mode. This fixes recent kernels sometimes hanging during
early boot on mixed mode UEFI systems.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org> # v4.16+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180622064222.11633-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 910f8befdf ("xen/pirq: fix error path cleanup when binding
MSIs") fixed a couple of errors in error cleanup path of
xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to
__unbind_from_irq() with an unbound irq, which would result in
triggering the BUG_ON there.
Since there is really no reason for the BUG_ON (xen_free_irq() can
operate on unbound irqs) we can remove it.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
For passing arbitrary data from user land to the Xen hypervisor the
Xen tools today are using mlock()ed buffers. Unfortunately the kernel
might change access rights of such buffers for brief periods of time
e.g. for page migration or compaction, leading to access faults in the
hypervisor, as the hypervisor can't use the locks of the kernel.
In order to solve this problem add a new device node to the Xen privcmd
driver to easily allocate hypercall buffers via mmap(). The memory is
allocated in the kernel and just mapped into user space. Marked as
VM_IO the user mapping will not be subject to page migration et al.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
On Intel platforms (Skylake and newer), ASPM support in r8169 is the
last missing puzzle to let CPU's Package C-State reaches PC8. Without
ASPM support, the CPU cannot reach beyond PC3. PC8 can save additional
~3W in comparison with PC3 on a Coffee Lake platform, Dell G3 3779.
This is based on the work from Chunhao Lin <hau@realtek.com>.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable or disable ASPM should be done in PCI core instead of in the
device driver.
Commit ba04c7c93b ("r8169: disable ASPM") uses
pci_disable_link_state() to disable ASPM, but it's not the best way to
do it. If the device really wants to disable ASPM, we can use a quirk in
PCI core to prevent the PCI core from setting ASPM before probe.
Let's remove pci_disable_link_state() for now. Use PCI core quirks if
any regression happens.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit makes BBR use only the MSS (without any headers) to
calculate pacing rates when internal TCP-layer pacing is used.
This is necessary to achieve the correct pacing behavior in this case,
since tcp_internal_pacing() uses only the payload length to calculate
pacing delays.
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>