Commit Graph

766959 Commits

Author SHA1 Message Date
Paul Burton 41fd60fa74 net: pch_gbe: Remove PCH_GBE_MAC_IFOP_RGMII define
The pch_gbe driver currently presumes that the PHY is connected using
RGMII, and would need further work to support other buses. It includes a
define which is always set that conditionalises some of the
RGMII-specific code regardless. Remove it. If we do ever support
different MII buses then preprocessor defines won't be the best way to
select between them anyway.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton c63ebdf01a net: pch_gbe: Remove pch_gbe_hal_setup_init_funcs
The pch_gbe driver calls a pch_gbe_hal_setup_init_funcs function which
ultimately sets the value of one field in struct pch_gbe_phy_info in a
convoluted way.

This patch removes pch_gbe_hal_setup_init_funcs in favor of inlining it,
and in turn its callee pch_gbe_plat_init_function_pointers, into the
single caller pch_gbe_sw_init.

With this pch_gbe_api.c & pch_gbe_api.h are essentially empty, so they
are removed & inclusions of the latter replaced with pch_gbe_phy.h which
was previously being included via pch_gbe_api.h.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton b02c38a23a net: pch_gbe: Remove get_bus_info HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the get_bus_info abstraction. Its single
implementation (pch_gbe_plat_get_bus_info) only sets values within a
struct pch_gbe_bus_info which is never used, so we simply remove the
call to it in pch_gbe_probe & remove struct pch_gbe_bus_info entirely.

Now that struct pch_gbe_functions is empty we remove it entirely too.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton 3ef594b0e4 net: pch_gbe: Remove init_hw HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the init_hw abstraction in favor of inlining its
single implementation (pch_gbe_plat_init_hw) into its single caller
(pch_gbe_reset).

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton c96a0f7431 net: pch_gbe: Remove {read,write}_phy_reg HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the read_phy_reg & write_phy_reg abstractions in
favor of calling pch_gbe_phy_read_reg_miic & pch_gbe_phy_write_reg_miic
directly.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton 7dbe38aed0 net: pch_gbe: Remove reset_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the reset_phy abstraction in favor of calling
pch_gbe_phy_hw_reset directly.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton 66dde2b0aa net: pch_gbe: Remove sw_reset_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the sw_reset_phy abstraction, which it turns out is
never even used. Its one implementation, which is already called
directly within the same translation unit, can therefore be made static
and removed from the pch_gbe_phy.h header.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton 9c020d7b05 net: pch_gbe: Remove read_mac_addr HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the read_mac_addr abstraction in favor of calling
pch_gbe_mac_read_mac_addr directly. Since this is defined in the same
translation unit as all of its callers, we can make it static & remove
it from the pch_gbe.h header.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton ac6c0e0aa4 net: pch_gbe: Remove power_{up,down}_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.

This patch removes the power_up_phy & power_down_phy abstractions in
favor of calling pch_phy_power_up & pch_phy_power_down directly.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Paul Burton 33bfdeaa76 net: pch_gbe: Remove unused copybreak parameter
The pch_gbe driver includes a 'copybreak' parameter which appears to
have been copied from the e1000e driver but is entirely unused. Remove
the dead code.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 20:52:08 +09:00
Eric Dumazet 5424ea2739 netns: get more entropy from net_hash_mix()
struct net are effectively allocated from order-1 pages on x86,
with one object per slab, meaning that the 13 low order bits
of their addresses are zero.

Once shifted by L1_CACHE_SHIFT, this leaves 7 zero-bits,
meaning that net_hash_mix() does not help spreading
objects on various hash tables.

For example, TCP listen table has 32 buckets, meaning that
all netns use the same bucket for port 80 or port 443.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:59:56 +09:00
Cong Wang 35b42da69e net_sched: remove a bogus warning in hfsc
In update_vf():

  cftree_remove(cl);
  update_cfmin(cl->cl_parent);

the cl_cfmin of cl->cl_parent is intentionally updated to 0
when that parent only has one child. And if this parent is
root qdisc, we could end up, in hfsc_schedule_watchdog(),
that we can't decide the next schedule time for qdisc watchdog.
But it seems safe that we can just skip it, as this watchdog is
not always scheduled anyway.

Thanks to Marco for testing all the cases, nothing is broken.

Reported-by: Marco Berizzi <pupilla@libero.it>
Tested-by: Marco Berizzi <pupilla@libero.it>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:58:46 +09:00
Joe Perches 6c1f0a1ffb net: drivers/net: Convert random_ether_addr to eth_random_addr
random_ether_addr is a #define for eth_random_addr which is
generally preferred in kernel code by ~3:1

Convert the uses of random_ether_addr to enable removing the #define

Miscellanea:

o Convert &vfmac[0] to equivalent vfmac and avoid unnecessary line wrap

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:49:14 +09:00
David S. Miller 2ca4eb8574 Merge branch 'dccp-fixes-around-rx_tstamp_last_feedback'
Eric Dumazet says:

====================
net: dccp: fixes around rx_tstamp_last_feedback

This patch series fix some issues with rx_tstamp_last_feedback.

- Switch to monotonic clock.
- Avoid potential overflows on fast hosts/networks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:44 +09:00
Eric Dumazet 0ce4e70ff0 net: dccp: switch rx_tstamp_last_feedback to monotonic clock
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.

Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:44 +09:00
Eric Dumazet 74174fe563 net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
On fast hosts or malicious bots, we trigger a DCCP_BUG() which
seems excessive.

syzbot reported :

BUG: delta (-6195) <= 0 at net/dccp/ccids/ccid3.c:628/ccid3_hc_rx_send_feedback()
CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.18.0-rc1+ #112
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 ccid3_hc_rx_send_feedback net/dccp/ccids/ccid3.c:628 [inline]
 ccid3_hc_rx_packet_recv.cold.16+0x38/0x71 net/dccp/ccids/ccid3.c:793
 ccid_hc_rx_packet_recv net/dccp/ccid.h:185 [inline]
 dccp_deliver_input_to_ccids+0xf0/0x280 net/dccp/input.c:180
 dccp_rcv_established+0x87/0xb0 net/dccp/input.c:378
 dccp_v4_do_rcv+0x153/0x180 net/dccp/ipv4.c:654
 sk_backlog_rcv include/net/sock.h:914 [inline]
 __sk_receive_skb+0x3ba/0xd80 net/core/sock.c:517
 dccp_v4_rcv+0x10f9/0x1f58 net/dccp/ipv4.c:875
 ip_local_deliver_finish+0x2eb/0xda0 net/ipv4/ip_input.c:215
 NF_HOOK include/linux/netfilter.h:287 [inline]
 ip_local_deliver+0x1e9/0x750 net/ipv4/ip_input.c:256
 dst_input include/net/dst.h:450 [inline]
 ip_rcv_finish+0x823/0x2220 net/ipv4/ip_input.c:396
 NF_HOOK include/linux/netfilter.h:287 [inline]
 ip_rcv+0xa18/0x1284 net/ipv4/ip_input.c:492
 __netif_receive_skb_core+0x2488/0x3680 net/core/dev.c:4628
 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693
 process_backlog+0x219/0x760 net/core/dev.c:5373
 napi_poll net/core/dev.c:5771 [inline]
 net_rx_action+0x7da/0x1980 net/core/dev.c:5837
 __do_softirq+0x2e8/0xb17 kernel/softirq.c:284
 run_ksoftirqd+0x86/0x100 kernel/softirq.c:645
 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
 kthread+0x345/0x410 kernel/kthread.c:240
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:46:43 +09:00
Geert Uytterhoeven e020797b7d net: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.

Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.

This simplifies the dependencies, and allows to improve compile-testing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:44:30 +09:00
Geert Uytterhoeven 935c5e3eaf MAINTAINERS: Add file patterns for dsa device tree bindings
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:43:00 +09:00
Antoine Tenart c2cd650b5b net: mscc: make sparse happy
This patch fixes a sparse warning about using an incorrect type in
argument 2 of ocelot_write_rix(), as an u32 was expected but a __be32
was given. The conversion to u32 is forced, which is safe as the value
will be written as-is in the hardware without any modification.

Fixes: 08d02364b1 ("net: mscc: fix the injection header")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:42:02 +09:00
Antoine Tenart 271f7ff5aa net: mvneta: fix the Rx desc DMA address in the Rx path
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can led to
various issues.

This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.

Fixes: 8d5047cf9c ("net: mvneta: Convert to be 64 bits compatible")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:40:15 +09:00
Tobin C. Harding 805f16a5f1 Documentation: e1000: Fix docs build error
Recent patch updated e1000 docs to rst format.  Docs build (`make
htmldocs`) is currently failing due to this file with error:

        (SEVERE/4) Unexpected section title.

This is because a section of the file is indented 2 spaces.  Build error
can be cleared by aligning the text with column 0.  While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.

Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.

Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding 3b0c3ebe2a Documentation: e100: Fix docs build error
Recent patch updated e100 docs to rst format.  Docs build (`make
htmldocs`) is currently failing due to this file with error:

	(SEVERE/4) Unexpected section title.

This is because a section of the file is indented 2 spaces.  Build error
can be cleared by aligning the text with column 0.  While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.

Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.

Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding 3be40e5476 Documentation: e1000: Use correct heading adornment
Recently documentation file was converted to rst.  The document title
has the incorrect heading adornment.  From kernel docs:

	* Please stick to this order of heading adornments:

	  1. ``=`` with overline for document title::

	       ==============
	       Document title
	       ==============

Add  overline heading adornment to document title.

Fixes commit (228046e761 Documentation: e1000: Update kernel documentation)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Tobin C. Harding 32e6996ca6 Documentation: e100: Use correct heading adornment
Recently documentation file was converted to rst.  The document title
has the incorrect heading adornment.  From kernel docs:

	* Please stick to this order of heading adornments:

	  1. ``=`` with overline for document title::

	       ==============
	       Document title
	       ==============

Add  overline heading adornment to document title.

Fixes commit (85d63445f4 Documentation: e100: Update the Intel 10/100 driver doc)

CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:37:37 +09:00
Geert Uytterhoeven d55207e37a net: phy: Allow compile test of GPIO consumers if !GPIOLIB
The GPIO subsystem provides dummy GPIO consumer functions if GPIOLIB is
not enabled. Hence drivers that depend on GPIOLIB, but use GPIO consumer
functionality only, can still be compiled if GPIOLIB is not enabled.

Relax the dependency on GPIOLIB if COMPILE_TEST is enabled, where
appropriate.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:33:56 +09:00
Hangbin Liu 6c6da92808 ipv6: mcast: fix unsolicited report interval after receiving querys
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.

Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.

Fixes: fc4eba58b4 ("ipv6: make unsolicited report intervals configurable for mld")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:27:45 +09:00
Jason Wang b8f1f65882 vhost_net: validate sock before trying to put its fd
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), this will lead NULL
dereferencing. Fixing by checking sock pointer before.

Fixes: bab632d69e ("vhost: vhost TX zero-copy support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-23 10:23:49 +09:00
Suravee Suthikulpanit 964d978433 x86/CPU/AMD: Fix LLC ID bit-shift calculation
The current logic incorrectly calculates the LLC ID from the APIC ID.

Unless specified otherwise, the LLC ID should be calculated by removing
the Core and Thread ID bits from the least significant end of the APIC
ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h
PPR document.

[ bp: Improve commit message. ]

Fixes: 68091ee7ac ("Calculate last level cache ID from number of sharing threads")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1528915390-30533-1-git-send-email-suravee.suthikulpanit@amd.com
2018-06-22 21:21:49 +02:00
Thomas Gleixner 7731b8bc94 Merge branch 'linus' into x86/urgent
Required to queue a dependent fix.
2018-06-22 21:20:35 +02:00
Jan Kara 3ee7e8697d bdi: Fix another oops in wb_workfn()
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.

The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:

CPU1                            CPU2
                                cgwb_bdi_unregister()
                                  cgwb_kill(*slot);

cgwb_release()
  queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
                                  wb = list_first_entry(&bdi->wb_list, ...)
                                  spin_unlock_irq(&cgwb_lock);
  wb_shutdown(wb);
  ...
  kfree_rcu(wb, rcu);
                                  wb_shutdown(wb); -> oops use-after-free

We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().

Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-22 12:08:07 -06:00
Geert Uytterhoeven 0ae52ddf5b lightnvm: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.

Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.

This simplifies the dependencies, and allows to improve compile-testing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-22 12:07:11 -06:00
Will Deacon 784e0300fe rseq: Avoid infinite recursion when delivering SIGSEGV
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).

However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.

Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: peterz@infradead.org
Cc: paulmck@linux.vnet.ibm.com
Cc: boqun.feng@gmail.com
Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
2018-06-22 19:04:22 +02:00
Will Deacon 71c8fc0c96 arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.

This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.

Fixes: f992b4dfd5 ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-22 17:23:40 +01:00
Will Deacon b5b7dd647f arm64: kpti: Use early_param for kpti= command-line option
We inspect __kpti_forced early on as part of the cpufeature enable
callback which remaps the swapper page table using non-global entries.

Ensure that __kpti_forced has been updated to reflect the kpti=
command-line option before we start using it.

Fixes: ea1e3de85e ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0")
Cc: <stable@vger.kernel.org> # 4.16.x-
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-06-22 17:23:26 +01:00
Geert Uytterhoeven 48e315618d MAINTAINERS: Add file patterns for x86 device tree bindings
Submitters of device tree binding documentation may forget to CC
the subsystem maintainer if this is missing.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: devicetree@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622100820.29616-1-geert@linux-m68k.org
2018-06-22 17:58:32 +02:00
Geert Uytterhoeven abcbcb80cd time: Make sure jiffies_to_msecs() preserves non-zero time periods
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.

However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods.  This may break code that relies on
receiving back a non-zero value.

jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:

  - include/linux/jiffies.h does #error if HZ >= 12288,
  - kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).

Broken since forever.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
2018-06-22 17:48:36 +02:00
Vitaly Kuznetsov 2ddc649810 KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
KVM_CAP_HYPERV_TLBFLUSH collided with KVM_CAP_S390_PSW-BPB, its paragraph
number should now be 8.18.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-06-22 17:30:20 +02:00
Marc Orr 0447378a4a kvm: vmx: Nested VM-entry prereqs for event inj.
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.

This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.

While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
 rest was obvious from function/variable naming. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-06-22 16:46:26 +02:00
Jens Axboe f9da9d0786 Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph:

"Various relatively small fixes, mostly to fix error handling of various
 sorts."

* 'nvme-4.18' of git://git.infradead.org/nvme:
  nvme-pci: limit max IO size and segments to avoid high order allocations
  nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
  nvme-fc: release io queues to allow fast fail
  nvmet: reset keep alive timer in controller enable
  nvme-rdma: don't override opts->queue_size
  nvme-rdma: Fix command completion race at error recovery
  nvme-rdma: fix possible free of a non-allocated async event buffer
  nvme-rdma: fix possible double free condition when failing to create a controller
2018-06-22 08:45:29 -06:00
Radim Krčmář 5f9077cbae KVM/arm fixes for 4.18, take #1
- Lazy FPSIMD switching fixes
 - Really disable compat ioctls on architectures that don't want it
 - Disable compat on arm64 (it was never implemented...)
 - Rely on architectural requirements for GICV on GICv3
 - Detect bad alignments in unmap_stage2_range
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlss7fgVHG1hcmMuenlu
 Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjosQAKG9+NIpcKx15k6Rzw+WcpEwk3IY
 yx47MR9ndgfypQGrJ1bczbaP0xzMW/snErtN2iKGv/q8QdaKI0fj63XFkDMcfp+i
 RdsIrWyXkhATbFvChKvp2jUWPoDKAO+eUYpwKk60QqT/NEzo6JoLyRXcGwIruHoZ
 pzJEZXnmNUTwgAWnJlLGVCJ17lDVd39Gp9/z97MzOyjFCVpfJ61HsEWAtjmiCQr5
 opHcdJ5KBW7XOU4uRqM4QEIruO3Hq56L9VJJWfRf2gdnFQowyMhKNErvuuSBpeBF
 jHA+1nEWbxph97aEEYcnXDQt7wh/o7hZ7kGdGqmW03gJWZdMhXeivlsavXihkxIq
 s3rUGRFhkIJ2+RwsmhSchlkkR9RFTItAAoMsRPwxAGcLA/2CBIPjdUFqbIse8O/h
 mWvXVrYa6uVQQzxJgtbtiOFFVdBOSxr9uE4eq+8vKNkS+3UTs0vUgueB6QtERNtM
 L1uyLGfyZVGuRvvIo5YTGO5vmRQdsBuPi4YtkJpdtks3TDZBw2bsf38ybyvPc963
 0orsS065JiFbh/aCtcMUnD8nPkdfpWVEmSa470bsSNK0f7Wa311fawW+/wGGpGdh
 2Jz/CIm5a29bIE+aR9k9B4QJIvXrX6uCiobfEHjQGJMC1DOJPVFvQhMn2JMnFc/F
 LlulbKhiYbdhxAnS
 =nzFZ
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-for-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

KVM/arm fixes for 4.18, take #1

- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
2018-06-22 14:56:19 +02:00
Zhenzhong Duan 0218c76626 x86/microcode/intel: Fix memleak in save_microcode_patch()
Free useless ucode_patch entry when it's replaced.

[ bp: Drop the memfree_patch() two-liner. ]

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Srinivas REDDY Eeda <srinivas.eeda@oracle.com>
Link: http://lkml.kernel.org/r/888102f0-fd22-459d-b090-a1bd8a00cb2b@default
2018-06-22 14:42:59 +02:00
Tony Luck 40c36e2741 x86/mce: Fix incorrect "Machine check from unknown source" message
Some injection testing resulted in the following console log:

  mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
  mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
  mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
  mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
  mce: [Hardware Error]: Run the above through 'mcelog --ascii'
  Kernel panic - not syncing: Machine check from unknown source

This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".

The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.

It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.

We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).

[ bp: Remove unneeded severity assignment. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
2018-06-22 14:35:50 +02:00
Borislav Petkov 1f74c8a647 x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.

However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.

Which leads to MCE records with an empty status value:

  mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
  mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}

In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.

Tony: read the rest of the MCA bank to populate the struct mce fully.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
2018-06-22 14:35:50 +02:00
John Garry bed9df97b3 irqdesc: Delete irq_desc_get_msi_desc()
Function irq_desc_get_msi_desc() is not referenced in the kernel (and does
not seem to have been referenced since e39758e0ea, 3 years ago), so
delete it.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <marc.zyngier@arm.com>
Cc: <will.deacon@arm.com>
Cc: <kstewart@linuxfoundation.org>
Cc: <julien.thierry@arm.com>
Cc: <andrew@lunn.ch>
Cc: <trivial@kernel.org>
Link: https://lkml.kernel.org/r/1529667333-92959-1-git-send-email-john.garry@huawei.com
2018-06-22 14:22:02 +02:00
Marc Zyngier 82f499c881 irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
Enabling LPIs was made a lot stricter recently, by checking that they are
disabled before enabling them. By doing so, the CPU hotplug case was missed
altogether, which leaves LPIs enabled on hotplug off (expecting the CPU to
eventually come back), and won't write a different value anyway on hotplug
on.

So skip that check if that particular case is detected

Fixes: 6eb486b66a ("irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lkml.kernel.org/r/20180622095254.5906-8-marc.zyngier@arm.com
2018-06-22 14:22:02 +02:00
Marc Zyngier 205e065d91 irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
Similarily to the SYNC operation, it must be verified that the VPE
targetted by a VLPI is backed by a valid collection in the GIC driver data
structures.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-7-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Marc Zyngier 83559b47cd irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
It is possible, under obscure circumstances, to convince the ITS driver to
emit a SYNC operation that targets a collection that is not bound to any
redistributor (and the target_address field is zero) because the
corresponding CPU has not been seen yet (the system has been booted with
max_cpus="something small").

If the ITS is using the linear CPU number as the target, this is not a big
deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the
physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up
asking the ITS to write to the physical address zero, which is not exactly
a good idea (there has been report of the ITS locking up). This should of
course never happen, but hey, this is SW...

In order to avoid the above disaster, let's track which collections have
been actually initialized, and let's not generate a SYNC if the collection
hasn't been properly bound to a redistributor.  Take this opportunity to
spit our a warning, in the hope that someone may report the issue if it
arrises again.

Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Yang Yingliang c1797b11a0 irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
On a NUMA system, if an ITS is local to an offline node, the ITS driver may
pick an offline CPU to bind the LPI.  In this case, pick an online CPU (and
the first one will do).

But on some systems, binding an LPI to non-local node CPU may cause
deadlock (see Cavium erratum 23144).  In this case, just fail the activate
and return an error code.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.com
2018-06-22 14:22:01 +02:00
Marc Zyngier cbaf45a6be irqchip/gic-v2m: Fix SPI release on error path
On failing to allocate the required SPIs, the actual number of interrupts
should be freed and not its log2 value.

Fixes: de337ee301 ("irqchip/gic-v2m: Add PCI Multi-MSI support")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-4-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00
Marc Zyngier 893fbfff97 irqchip/ls-scfg-msi: Fix MSI affinity handling
The ls-scfs-msi driver is not dealing with the effective affinity
as it should. Let's fix that, and make it clear that the effective
affinity is restricted to a single CPU. Also prevent the driver from
messing with the internals of the affinity setting infrastructure.

Reported-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20180622095254.5906-3-marc.zyngier@arm.com
2018-06-22 14:22:00 +02:00