While hacking on kTLS, I ran into the following panic from an
unprivileged netserver / netperf TCP session:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0
Oops: 0010 [#1] SMP KASAN PTI
CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
RIP: 0010: (null)
Code: Bad RIP value.
RSP: 0018:ffff88036abcf740 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26
RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280
RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48
R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280
R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350
FS: 00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? tls_sw_poll+0xa4/0x160 [tls]
? sock_poll+0x20a/0x680
? do_select+0x77b/0x11a0
? poll_schedule_timeout.constprop.12+0x130/0x130
? pick_link+0xb00/0xb00
? read_word_at_a_time+0x13/0x20
? vfs_poll+0x270/0x270
? deref_stack_reg+0xad/0xe0
? __read_once_size_nocheck.constprop.6+0x10/0x10
[...]
Debugging further, it turns out that calling into ctx->sk_poll() is
invalid since sk_poll itself is NULL which was saved from the original
TCP socket in order for tls_sw_poll() to invoke it.
Looks like the recent conversion from poll to poll_mask callback started
in 1525242310 ("net: add support for ->poll_mask in proto_ops") missed
to eventually convert kTLS, too: TCP's ->poll was converted over to the
->poll_mask in commit 2c7d3daceb ("net/tcp: convert to ->poll_mask")
and therefore kTLS wrongly saved the ->poll old one which is now NULL.
Convert kTLS over to use ->poll_mask instead. Also instead of POLLIN |
POLLRDNORM use the proper EPOLLIN | EPOLLRDNORM bits as the case in
tcp_poll_mask() as well that is mangled here.
Fixes: 2c7d3daceb ("net/tcp: convert to ->poll_mask")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Watson <davejwatson@fb.com>
Tested-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree:
1) Reject non-null terminated helper names from xt_CT, from Gao Feng.
2) Fix KASAN splat due to out-of-bound access from commit phase, from
Alexey Kodanev.
3) Missing conntrack hook registration on IPVS FTP helper, from Julian
Anastasov.
4) Incorrect skbuff allocation size in bridge nft_reject, from Taehee Yoo.
5) Fix inverted check on packet xmit to non-local addresses, also from
Julian.
6) Fix ebtables alignment compat problems, from Alin Nastac.
7) Hook mask checks are not correct in xt_set, from Serhey Popovych.
8) Fix timeout listing of element in ipsets, from Jozsef.
9) Cap maximum timeout value in ipset, also from Jozsef.
10) Don't allow family option for hash:mac sets, from Florent Fourcot.
11) Restrict ebtables to work with NFPROTO_BRIDGE targets only, this
Florian.
12) Another bug reported by KASAN in the rbtree set backend, from
Taehee Yoo.
13) Missing __IPS_MAX_BIT update doesn't include IPS_OFFLOAD_BIT.
From Gao Feng.
14) Missing initialization of match/target in ebtables, from Florian
Westphal.
15) Remove useless nft_dup.h file in include path, from C. Labbe.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When pskb_trim_rcsum fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling pskb_trim_rcsum.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPVS setups with local client and remote tunnel server need
to create exception for the local virtual IP. What we do is to
change PMTU from 64KB (on "lo") to 1460 in the common case.
Suggested-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 45e4fd2668 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Fixes: 7343ff31eb ("ipv6: Don't create clones of host routes.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix several bpfilter/UMH bugs, in particular make the UMH build not
depend upon X86 specific Kconfig symbols. From Alexei Starovoitov.
2) Fix handling of modified context pointer in bpf verifier, from
Daniel Borkmann.
3) Kill regression in ifdown/ifup sequences for hv_netvsc driver, from
Dexuan Cui.
4) When the bonding primary member name changes, we have to re-evaluate
the bond->force_primary setting, from Xiangning Yu.
5) Eliminate possible padding beyone end of SKB in cdc_ncm driver, from
Bjørn Mork.
6) RX queue length reported for UDP sockets in procfs and socket diag
are inaccurate, from Paolo Abeni.
7) Fix br_fdb_find_port() locking, from Petr Machata.
8) Limit sk_rcvlowat values properly in TCP, from Soheil Hassas
Yeganeh.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (23 commits)
tcp: limit sk_rcvlowat by the maximum receive buffer
net: phy: dp83822: use BMCR_ANENABLE instead of BMSR_ANEGCAPABLE for DP83620
socket: close race condition between sock_close() and sockfs_setattr()
net: bridge: Fix locking in br_fdb_find_port()
udp: fix rx queue len reported by diag and proc interface
cdc_ncm: avoid padding beyond end of skb
net/sched: act_simple: fix parsing of TCA_DEF_DATA
net: fddi: fix a possible null-ptr-deref
net: aquantia: fix unsigned numvecs comparison with less than zero
net: stmmac: fix build failure due to missing COMMON_CLK dependency
bpfilter: fix race in pipe access
bpf, xdp: fix crash in xdp_umem_unaccount_pages
xsk: Fix umem fill/completion queue mmap on 32-bit
tools/bpf: fix selftest get_cgroup_id_user
bpfilter: fix OUTPUT_FORMAT
umh: fix race condition
net: mscc: ocelot: Fix uninitialized error in ocelot_netdevice_event()
bonding: re-evaluate force_primary when the primary slave name changes
ip_tunnel: Fix name string concatenate in __ip_tunnel_create()
hv_netvsc: Fix a network regression after ifdown/ifup
...
Subsystem:
- rework of the rtc-test driver which allows to test the core more thoroughly
- rtc_set_alarm() now fails early when alarms are not supported
Drivers:
- mktime is now replaced by mktime64
- RTC range added for 88pm80x, ab-b5ze-s3, at91rm9200, brcmstb-waketimer,
ds1685, ftrtc010, ls1x, mxc_v2, rx8581, sprd, st-lpc, tps6586x, tps65910 and
vr41xx
- Fixed a possible race condition in probe functions
- pxa: fix the probe function that is broken since v4.3
- stm32: now supports stm32mp1
-----BEGIN PGP SIGNATURE-----
iQIyBAABCgAdFiEEXx9Viay1+e7J/aM4AyWl4gNJNJIFAlsdkVAACgkQAyWl4gNJ
NJJMfA/3YzFFxsZZdcf84e3LWMwgA12c/YNM24nlQ3S+Fo23bAerGZyKEroBAaiq
HVL7j6OwYkVrGJHbqvq7J0UhI0J9Fjbtp8suj7Cj5wBKOG3wUeTkpzBiHZN42WBB
PpPC97z9HRTVjxAOmWC0wbbf622ZBOZyEti3kMVh5DwER+8iNoPJWUS6nmZdOVqR
PjT/c79WCT3q7n2j9t+ZjQfVOqPlqTTty3WuCpYDu3ce3W7uUO/cISc3M4HA4A5d
dw6gDcd9WQcf4qZESjlci84pn4Ktha317fX5QlkaKM2ul3x33652pbH8Yv6ynZsq
ZlmIyE9vSWThBWUj7R4lo9/y2IsVL5FtMRIN5bvG6ms/tPuZGeX/qAZEBgMggN+r
PgFY5U+k/1WkOeSMd4OpuE9g308wzR3xGIhtuiJOa006hvHNvyMunIeMURDWjceW
fh1uu1eUQqf4yKt8ceB9s38pYcPrvtEOh9006VcHMp/JJpoOjIn93jdsaxCmlUZc
poDAYgH+RudVaaMZ4VvZjzlrD/diSwjh51MpBf0ImdQut4ehfZdGna4WOzddenAT
1nsVKRp/qxR0b9kolQYCpSVsJKHME4pZPEKY0f5UyCZgEy/l3SkMPVOkjXlAzZAd
ZX0l857UGeVbWP5sRDTc9J1sw2QAVO2oBsSOEeK0z9kvbuz/uQ==
=I5F0
-----END PGP SIGNATURE-----
Merge tag 'rtc-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Setting the supported range from drivers for RTCs failing soon has
started. A few fixes are developed along the way. Some drivers have
been switched to SPDX by their maintainers.
Subsystem:
- rework of the rtc-test driver which allows to test the core more
thoroughly
- rtc_set_alarm() now fails early when alarms are not supported
Drivers:
- mktime() is now replaced by mktime64()
- RTC range added for 88pm80x, ab-b5ze-s3, at91rm9200,
brcmstb-waketimer, ds1685, ftrtc010, ls1x, mxc_v2, rx8581, sprd,
st-lpc, tps6586x, tps65910 and vr41xx
- fixed a possible race condition in probe functions
- pxa: fix the probe function that is broken since v4.3
- stm32: now supports stm32mp1"
* tag 'rtc-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (78 commits)
rtc: pxa: fix probe function
rtc: cros-ec: Switch to SPDX identifier.
rtc: cros-ec: Make license text and module license match.
rtc: ensure rtc_set_alarm fails when alarms are not supported
rtc: test: remove alarm support from the first device
rtc: test: convert to devm_rtc_allocate_device
rtc: ftrtc010: let the core handle range
rtc: ftrtc010: handle dates after 2106
rtc: ftrtc010: switch to devm_rtc_allocate_device
rtc: mrst: switch to devm functions
rtc: sunxi: fix possible race condition
rtc: test: remove irq sysfs file
rtc: test: emulate alarms using timers
rtc: test: store time as an offset to system time
rtc: test: allow registering many devices
rtc: test: remove useless proc info
rtc: ds1685: Add range
rtc: ds1685: fix possible race condition
rtc: sprd: Add new RTC power down check method
rtc: sun6i: Fix bit_idx value for clk_register_gate
...
- The UBI on-disk format header file is now dual licensed
- New way to detect Fastmap problems during runtime
- Bugfix for Fastmap
- Minor updates for UBIFS (spelling, comments, vm_fault_t, ...)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAlsdi6AWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wVimEADam9dNKs7hVAlSR5G8fgoFMiPI
PgP/cQ719PAYh8Zub0w8fIIvtSCaVZmPX/K40w4KAgP/jdQ/v4QD3ZYdHO3vn3/M
6WI/cEB/Cn5lJp78tt14cGlqG3+xnlwyc9EEeiMCPDWmoSf66J7fwuL3xRq4dDet
CBISYrF8ZMe6BJohT9kAYeqYoYLzALUmZoOx1DGhX7b2SEaLphCyW0A0rkxoHbYG
sxAfUiR4/bdaFfqLAmEMgoovIHGYmOdnMJ3Hk+uUVrPEfBzxDQuGfpevDJ1u99hS
D4qdvY3+VAcTpTK0XJUBgPaVpkYYMS5nfHtpObbyOT/kBmgCloCAmtvVcznxUwJ7
R+8UBZH/SrxUw+R9JCP1hFgb/PlEAE0+Vsc2jgvqWOQ8xnOqojG5FdqAueGanX+l
/on2/HNoZp5C8x15+cB9SiQwfdoZrH7vr9uzi1lqaWSV+ZMDpvG/QFMX1QX1gLl9
TFeQJBojE41y+hjTnnUQZdy1oPaisffU44ymKlf1FR2SdzW6d7wIRBd0jnB+7Pjz
KocSAA2kqDUcCjUtLFSSVfE7kUL9wwJQUfUxeQFViTGDvAnLcmy1a6mqQ1K5YWq9
e5mpMQRWzoq0XrPyaVI5EEhvaqWalJwIa6GA7ZksOvS4W1pqFLLyFOKkHejUPou8
lxcw0dDsyZdmnDRuMA==
=b0lt
-----END PGP SIGNATURE-----
Merge tag 'upstream-4.18-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
- the UBI on-disk format header file is now dual licensed
- new way to detect Fastmap problems during runtime
- bugfix for Fastmap
- minor updates for UBIFS (spelling, comments, vm_fault_t, ...)
* tag 'upstream-4.18-rc1' of git://git.infradead.org/linux-ubifs:
mtd: ubi: Update ubi-media.h to dual license
ubi: fastmap: Detect EBA mismatches on-the-fly
ubi: fastmap: Check each mapping only once
ubi: fastmap: Correctly handle interrupted erasures in EBA
ubi: fastmap: Cancel work upon detach
ubifs: lpt: Fix wrong pnode number range in comment
ubifs: gc: Fix typo
ubifs: log: Some spelling fixes
ubifs: Spelling fix someting -> something
ubifs: journal: Remove wrong comment
ubifs: remove set but never used variable
ubifs, xattr: remove misguided quota flags
fs: ubifs: Adding new return type vm_fault_t
The user-provided value to setsockopt(SO_RCVLOWAT) can be
larger than the maximum possible receive buffer. Such values
mute POLLIN signals on the socket which can stall progress
on the socket.
Limit the user-provided value to half of the maximum receive
buffer, i.e., half of sk_rcvbuf when the receive buffer size
is set by the user, or otherwise half of sysctl_tcp_rmem[2].
Fixes: d1361840f8 ("tcp: fix SO_RCVLOWAT and RCVBUF autotuning")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
xfcp, hisi_sas, cxlflash, qla2xxx. In the absence of Nic, we're also
taking target updates which are mostly minor except for the tcmu
refactor. The only real core change to worry about is the removal of
high page bouncing (in sas, storvsc and iscsi). This has been well
tested and no problems have shown up so far.
Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWx1pbCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUucAP42pccS
ziKyiOizuxv9fZ4Q+nXd1A9zhI5tqqpkHjcQegEA40qiZSi3EKGKR8W0UpX7Ntmo
tqrZJGojx9lnrAM2RbQ=
=NMXg
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates to the usual drivers: ufs, qedf, mpt3sas, lpfc,
xfcp, hisi_sas, cxlflash, qla2xxx.
In the absence of Nic, we're also taking target updates which are
mostly minor except for the tcmu refactor.
The only real core change to worry about is the removal of high page
bouncing (in sas, storvsc and iscsi). This has been well tested and no
problems have shown up so far"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (268 commits)
scsi: lpfc: update driver version to 12.0.0.4
scsi: lpfc: Fix port initialization failure.
scsi: lpfc: Fix 16gb hbas failing cq create.
scsi: lpfc: Fix crash in blk_mq layer when executing modprobe -r lpfc
scsi: lpfc: correct oversubscription of nvme io requests for an adapter
scsi: lpfc: Fix MDS diagnostics failure (Rx < Tx)
scsi: hisi_sas: Mark PHY as in reset for nexus reset
scsi: hisi_sas: Fix return value when get_free_slot() failed
scsi: hisi_sas: Terminate STP reject quickly for v2 hw
scsi: hisi_sas: Add v2 hw force PHY function for internal ATA command
scsi: hisi_sas: Include TMF elements in struct hisi_sas_slot
scsi: hisi_sas: Try wait commands before before controller reset
scsi: hisi_sas: Init disks after controller reset
scsi: hisi_sas: Create a scsi_host_template per HW module
scsi: hisi_sas: Reset disks when discovered
scsi: hisi_sas: Add LED feature for v3 hw
scsi: hisi_sas: Change common allocation mode of device id
scsi: hisi_sas: change slot index allocation mode
scsi: hisi_sas: Introduce hisi_sas_phy_set_linkrate()
scsi: hisi_sas: fix a typo in hisi_sas_task_prep()
...
DP83620 register set is compatible with the DP83848, but it also supports
100base-FX. When the hardware is configured such as that fiber mode is
enabled, autonegotiation is not possible.
The chip, however, doesn't expose this information via BMSR_ANEGCAPABLE.
Instead, this bit is always set high, even if the particular hardware
configuration makes it so that auto negotiation is not possible [1]. Under
these circumstances, the phy subsystem keeps trying for autonegotiation to
happen, without success.
Hereby, we inspect BMCR_ANENABLE bit after genphy_config_init, which on
reset is set to 0 when auto negotiation is disabled, and so we use this
value instead of BMSR_ANEGCAPABLE.
[1] https://e2e.ti.com/support/interface/ethernet/f/903/p/697165/2571170
Signed-off-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fchownat() doesn't even hold refcnt of fd until it figures out
fd is really needed (otherwise is ignored) and releases it after
it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().
As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().
sock_release() is called in many places, only the sock_close()
path matters here. And fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.
Fixes: 86741ec254 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQHHBAABCgAxFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlsdWXsTHHNtZnJlbmNo
QGdtYWlsLmNvbQAKCRCKLL1wB3JPUXiyC/wJdnwM+R2JGtj1ygZqPfS6/xWbCI0F
uXfn9a09nrMVbM2CQCDChiqoU2yvg9cKqc2d/UhssuwJYdfNNYSVe5MDkYYCADRn
WSvCQ4+ZDQEl65pcjVKcOlhwl73cIVQmb6yaxMA/scZx+ikfe67EgJEEJZHJPhFO
WJnMA5IUMAQFw5KxBdKtuVL0ixHBYvtrtQ9HAyPq0GTmCsTyb0GcbeMYaqRwpCMS
R4NXyTr+s68UUi+iM9r+I6nYqml5L2EBFjrSwRbu7e6PIRF5fEMnKzj0WSmbgsSG
KY2b/YWSskN5p5dkU+cLETXfjOjg8ugROJzz4LYkS/dUn+AJpg7IOwUNfr/a0yO5
dHMwpa9SltPOslcHZgnBUsfqLHZSP0y/QODVhhc80uapxmllA4CHOc+lgG5i0VlB
mTS4SQNRPw2NINoGCr+C65ghcJJyf6ivP6J3PoCjobo76yte8TiyauxjuDIJgY8T
h2Wc/fWgN9lc9IUcpvedUjxoacG+rMSopQc=
=yibA
-----END PGP SIGNATURE-----
Merge tag '4.18-fixes-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
- one smb3 (ACL related) fix for stable
- one SMB3 security enhancement (when mounting -t smb3 forbid less
secure dialects)
- some RDMA and compounding fixes
* tag '4.18-fixes-smb3' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix a buffer leak in smb2_query_symlink
smb3: do not allow insecure cifs mounts when using smb3
CIFS: Fix NULL ptr deref
CIFS: fix encryption in SMB3.1.1
CIFS: Pass page offset for encrypting
CIFS: Pass page offset for calculating signature
CIFS: SMBD: Support page offset in memory registration
CIFS: SMBD: Support page offset in RDMA recv
CIFS: SMBD: Support page offset in RDMA send
CIFS: When sending data on socket, pass the correct page offset
CIFS: Introduce helper function to get page offset and length in smb_rqst
CIFS: Calculate the correct request length based on page offset and tail size
cifs: For SMB2 security informaion query, check for minimum sized security descriptor instead of sizeof FileAllInformation class
CIFS: Fix signing for SMB2/3
Pull restartable sequence support from Thomas Gleixner:
"The restartable sequences syscall (finally):
After a lot of back and forth discussion and massive delays caused by
the speculative distraction of maintainers, the core set of
restartable sequences has finally reached a consensus.
It comes with the basic non disputed core implementation along with
support for arm, powerpc and x86 and a full set of selftests
It was exposed to linux-next earlier this week, so it does not fully
comply with the merge window requirements, but there is really no
point to drag it out for yet another cycle"
* 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq/selftests: Provide Makefile, scripts, gitignore
rseq/selftests: Provide parametrized tests
rseq/selftests: Provide basic percpu ops test
rseq/selftests: Provide basic test
rseq/selftests: Provide rseq library
selftests/lib.mk: Introduce OVERRIDE_TARGETS
powerpc: Wire up restartable sequences system call
powerpc: Add syscall detection for restartable sequences
powerpc: Add support for restartable sequences
x86: Wire up restartable sequence system call
x86: Add support for restartable sequences
arm: Wire up restartable sequences system call
arm: Add syscall detection for restartable sequences
arm: Add restartable sequences support
rseq: Introduce restartable sequences system call
uapi/headers: Provide types_32_64.h
Pull x86 updates and fixes from Thomas Gleixner:
- Fix the (late) fallout from the vector management rework causing
hlist corruption and irq descriptor reference leaks caused by a
missing sanity check.
The straight forward fix triggered another long standing issue to
surface. The pre rework code hid the issue due to being way slower,
but now the chance that user space sees an EBUSY error return when
updating irq affinities is way higher, though quite a bunch of
userspace tools do not handle it properly despite the fact that EBUSY
could be returned for at least 10 years.
It turned out that the EBUSY return can be avoided completely by
utilizing the existing delayed affinity update mechanism for irq
remapped scenarios as well. That's a bit more error handling in the
kernel, but avoids fruitless fingerpointing discussions with tool
developers.
- Decouple PHYSICAL_MASK from AMD SME as its going to be required for
the upcoming Intel memory encryption support as well.
- Handle legacy device ACPI detection properly for newer platforms
- Fix the wrong argument ordering in the vector allocation tracepoint
- Simplify the IDT setup code for the APIC=n case
- Use the proper string helpers in the MTRR code
- Remove a stale unused VDSO source file
- Convert the microcode update lock to a raw spinlock as its used in
atomic context.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/intel_rdt: Enable CMT and MBM on new Skylake stepping
x86/apic/vector: Print APIC control bits in debugfs
genirq/affinity: Defer affinity setting if irq chip is busy
x86/platform/uv: Use apic_ack_irq()
x86/ioapic: Use apic_ack_irq()
irq_remapping: Use apic_ack_irq()
x86/apic: Provide apic_ack_irq()
genirq/migration: Avoid out of line call if pending is not set
genirq/generic_pending: Do not lose pending affinity update
x86/apic/vector: Prevent hlist corruption and leaks
x86/vector: Fix the args of vector_alloc tracepoint
x86/idt: Simplify the idt_setup_apic_and_irq_gates()
x86/platform/uv: Remove extra parentheses
x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SME
x86: Mark native_set_p4d() as __always_inline
x86/microcode: Make the late update update_lock a raw lock for RT
x86/mtrr: Convert to use strncpy_from_user() helper
x86/mtrr: Convert to use match_string() helper
x86/vdso: Remove unused file
x86/i8237: Register device based on FADT legacy boot flag
Pull x86 pti updates from Thomas Gleixner:
"Three small commits updating the SSB mitigation to take the updated
AMD mitigation variants into account"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Switch the selection of mitigation from CPU vendor to CPU features
x86/bugs: Add AMD's SPEC_CTRL MSR usage
x86/bugs: Add AMD's variant of SSB_NO
Pull more perf tooling updates from Thomas Gleixner:
"Perf tool updates and fixes:
perf stat:
- Display user and system time for workload targets (Jiri Olsa)
perf record:
- Enable arbitrary event names thru name= modifier (Alexey Budankov)
PowerPC:
- Add a python script for hypervisor call statistics (Ravi Bangoria)
Intel PT: (Adrian Hunter)
- Fix sync_switch INTEL_PT_SS_NOT_TRACING
- Fix decoding to accept CBR between FUP and corresponding TIP
- Fix MTC timing after overflow
- Fix "Unexpected indirect branch" error
perf test:
- record+probe_libc_inet_pton:
- To get the symbol table for dynamic shared objects on ubuntu we
need to pass the -D/--dynamic command line option, unlike with
the fedora distros (Arnaldo Carvalho de Melo)
- code-reading:
- Fix perf_env setup for PTI entry trampolines (Adrian Hunter)
- kmod-path:
- Add tests for vdso32 and vdsox32 (Adrian Hunter)
- Use header file util/debug.h (Thomas Richter)
perf annotate:
- Make the various UI backends (stdio, TUI, gtk) use more
consistently structs with annotation options as specified by the
user (Arnaldo Carvalho de Melo)
- Move annotation specific knobs from the symbol_conf global kitchen
sink to the annotation option structs (Arnaldo Carvalho de Melo)
perf script:
- Add more PMU fields to python scripts event handler dict (Jin Yao)
Core:
- Fix misleading error for some unparsable events mentioning PMUs
when those are not involved in the problem (Jiri Olsa)
- Consider BSS symbols when processing /proc/kallsyms ('B' and 'b')
(Arnaldo Carvalho de Melo)
- Be more robust when trying to use per-symbol histograms, checking
for unlikely but possible cases where the space for the histograms
wasn't allocated, print a debug message for such cases (Arnaldo
Carvalho de Melo)
- Fix symbol and object code resolution for vdso32 and vdsox32
(Adrian Hunter)
- No need to check for null when passing pointers to foo__get() style
refcount grabbing helpers, just like in the kernel and with free(),
its safe to pass a NULL pointer to avoid having to check it before
each and every foo__get() call (Arnaldo Carvalho de Melo)
- Remove some dead code (quote.[ch]) (Arnaldo Carvalho de Melo)
- Remove some needless globals, making them local (Arnaldo Carvalho
de Melo)
- Reduce usage of symbol_conf.use_callchain, using other means of
finding out if callchains are in use or available for specific
events, as we evolved this codebase to allow requesting callchains
for just a subset of the monitored events. In time it will help
polish recording and showing mixed sets accross the various tools:
perf record -e cycles/call-graph=fp/,cache-misses/call-graph=dwarf/,instructions'
(Arnaldo Carvalho de Melo)
- Consider PTI entry trampolines in map__rip_2objdump() (Adrian
Hunter)"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
perf script python: Add dict fields introduction to Documentation
perf script python: Add more PMU fields to event handler dict
perf script python: Move dsoname code to a new function
perf symbols: Add BSS symbols when reading from /proc/kallsyms
perf annnotate: Make __symbol__inc_addr_samples handle src->histograms == NULL
perf intel-pt: Fix "Unexpected indirect branch" error
perf intel-pt: Fix MTC timing after overflow
perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIP
perf intel-pt: Fix sync_switch INTEL_PT_SS_NOT_TRACING
perf script powerpc: Python script for hypervisor call statistics
perf test record+probe_libc_inet_pton: Ask 'nm' for dynamic symbols
perf map: Consider PTI entry trampolines in rip_2objdump()
perf test code-reading: Fix perf_env setup for PTI entry trampolines
perf tools: Fix pmu events parsing rule
perf stat: Display user and system time
perf record: Enable arbitrary event names thru name= modifier
perf tools: Fix symbol and object code resolution for vdso32 and vdsox32
perf tests kmod-path: Add tests for vdso32 and vdsox32
perf hists: Check if a hist_entry has callchains before using them
perf hists: Introduce hist_entry__has_callchain() method
...
Pull irq fixes from Thomas Gleixner:
"Two small fixlets:
- Add the missing iomu mapping call in the Freescale/NXP/Qualcomm/
whoever owns it now/ SCFG MSI irqchip driver. Otherwise IRQs wont
work at all.
- Fix a SMP=n build warning in the STM32 irq chip driver"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/ls-scfg-msi: Map MSIs in the iommu
irqchip/stm32: Fix non-SMP build warning
Pull core fixes from Thomas Gleixner:
"A small set of core updates:
- Make objtool cope with GCC8 oddities some more
- Remove a stale local_irq_save/restore sequence in the signal code
along with the stale comment in the RCU code. The underlying issue
which led to this has been solved long time ago, but nobody cared
to cleanup the hackarounds"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
signal: Remove no longer required irqsave/restore
rcu: Update documentation of rcu_read_unlock()
objtool: Fix GCC 8 cold subfunction detection for aliased functions
Commit a841796f11 ("signal: align __lock_task_sighand() irq disabling and
RCU") introduced a rcu read side critical section with interrupts
disabled. The changelog suggested that a better long-term fix would be "to
make rt_mutex_unlock() disable irqs when acquiring the rt_mutex structure's
->wait_lock".
This long-term fix has been made in commit b4abf91047 ("rtmutex: Make
wait_lock irq safe") for a different reason.
Therefore revert commit a841796f11 ("signal: align >
__lock_task_sighand() irq disabling and RCU") as the interrupt disable
dance is not longer required.
The change was tested on the base of b4abf91047 ("rtmutex: Make wait_lock
irq safe") with a four hour run of rcutorture scenario TREE03 with lockdep
enabled as suggested by Paul McKenney.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: bigeasy@linutronix.de
Link: https://lkml.kernel.org/r/20180525090507.22248-3-anna-maria@linutronix.de
Since commit b4abf91047 ("rtmutex: Make wait_lock irq safe") the
explanation in rcu_read_unlock() documentation about irq unsafe rtmutex
wait_lock is no longer valid.
Remove it to prevent kernel developers reading the documentation to rely on
it.
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: bigeasy@linutronix.de
Link: https://lkml.kernel.org/r/20180525090507.22248-2-anna-maria@linutronix.de
Merge proc_cmdline simplifications.
This re-writes the get_mm_cmdline() logic to be rather simpler than it
used to be, and makes the semantics for "cmdline goes past the end of
the original area" more natural.
You _can_ use prctl(PR_SET_MM) to just point your command line somewhere
else entirely, but the traditional model is to just edit things in place
and that still needs to continue to work. At least this way the code
makes some sense.
* proc-cmdline:
fs/proc: simplify and clarify get_mm_cmdline() function
fs/proc: re-factor proc_pid_cmdline_read() a bit
Use the error code EUCLEAN for filesystem errors because other
filesystems use this code too.
[ And remove unused EMEMERROR - Linus ]
Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bq27xxx: Add BQ27426 support
* ab8500: Drop AB8540/9540 support
* Introduced new usb_type property
* Properly document the power-supply ABI
* misc. cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE72YNB0Y/i3JqeVQT2O7X88g7+poFAlscFLIACgkQ2O7X88g7
+pplkA/+OU3V/sFfbHka6RdoIVeEJCqtQnd1d3VimbKmSpCZQ9W5SnJoc/Dzy3+q
Di0mrFb9nB77H4ShMVVc/rAy4flv7copnI2SJDZ1psx4pX4LS7AwToAlQY1Cr/VR
QBStyZ4aJrvHtdPYggZADAwiU3/vBtI14n28/8TdGlsHFszVbcr2IhiUBYiVgb++
jvQCQyKbxgMfou08wV2Sg/moKXGFh+1HtobRZDyr4Su3A1Nyr6Epyg8gBADi2ZgS
nWaGfYfpI9gaA7g00y/OvqTZroJD+WKAToecl5frgtZ+zHZAQsVGJth9f+LrYXlz
7bERZf94L8MwRNz1Zl8nJ+zHfOmMjEFerHCI81+6wO+klcAgv5AQTcP11a+oZWiQ
5Isq6yt6meg2B4XfBX2EcnXtztnvgb0+lb0KaRUhiQ/5BzupsyIBw8vlJmilGP61
DmL63WrXSb2XChnAkfLbLiKXethY4/y7WjHxPv7esJqy1X6tpFURHwoopv/HWX6N
ctsqOp1D3pl+pBQvOZ4g5oWMTwmlu1uUsKXaZpOYx7+a8CIfJxPVXB+S/D3C5YsU
U1KodhtmhXp8FmbyH1OufARXk9G9rievgOLAPTfsjet06rwpLdy6iLeztvjZIo+T
H5pvkD+1o4P6TiAgsPTnOK9YXGL/Pr4rb2voiu7sKLRAfBr65ww=
=UB16
-----END PGP SIGNATURE-----
Merge tag 'for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
- bq27xxx: Add BQ27426 support
- ab8500: Drop AB8540/9540 support
- Introduced new usb_type property
- Properly document the power-supply ABI
- misc. cleanups and fixes
* tag 'for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
MAINTAINERS: add entry for LEGO MINDSTORMS EV3
power: supply: ab8500_charger: fix spelling mistake: "faile" -> "failed"
power: supply: axp288_fuel_gauge: Remove polling from the driver
power: supply: axp288_fuelguage: Do not bind when the fg function is not used
power: supply: axp288_charger: Do not bind when the charge function is not used
power: supply: axp288_charger: Support 3500 and 4000 mA input current limit
power: supply: s3c-adc-battery: fix driver data initialization
power: supply: charger-manager: Verify polling interval only when polling requested
power: supply: sysfs: Use enum to specify property
power: supply: ab8500: Drop AB8540/9540 support
power: supply: ab8500_fg: fix spelling mistake: "Disharge" -> "Discharge"
power: supply: simplify getting .drvdata
power: supply: bq27xxx: Add support for BQ27426
gpio-poweroff: Use gpiod_set_value_cansleep
* use new vm_fault_t return type
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE72YNB0Y/i3JqeVQT2O7X88g7+poFAlscE5EACgkQ2O7X88g7
+poOIg/8D3KjWeF8+T/3rhBtTccxGPJveK9oWTlUDm7omjj1s7Hm/fuCBO9oFdyb
IJ+QFIzMhZEsYDPw9Pk7lPDM9MygZ6jdmEDQBj60HMPambDfehm6OdM41F7RcwKv
d1YDy0O3ocCtn7zKi8utYSmjxZV8YHVilaf1HuTgFwQW8w2Aeno9znfVs6ikRdC/
hgCBui88+LcsJaewAVL3dtYOZaxdDvQc1p8Dew1zEVdqAELlx6R2H0LJm4n52lPu
vAvIzCqhEbr3XRy20giFcgLB8Zc7qka054KUdSNhReYl6R7sQ5lSKPYJKW1sonN2
NS4gM8lCbxXVcdd86ysn9KIOryHpZTMl63fHpEr8kjUV568HOwNET9ILjyupU6Sh
DNmHmk/g11h2wqEAk2CvAi7991j5PRT5EuubTdW4g2cDjQpq2EYo/Yar89O3H0k+
H1lYX/K8vPsKTf1K28NfUKce7urEdhl+HZlApy5JU/6AnR35r1XvYcVQVTZ1tsbf
gQaQzkjm/mB7Q97G8bwhQAnu4WdPLwpk4FVkGN9V2MyCUfQOeS4WmB5UEiaIjH8N
U/liszuXfkuFm8VjbYz+fuVDroAephfd869g3WEB9P28JihO7xaVQYOOQOWhwYUT
KGAWbzCD1h7HndiGlJBwY0YKQ/umxa1Gc0YN2P2kd72iSvkqtn8=
=ElNN
-----END PGP SIGNATURE-----
Merge tag 'hsi-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi
Pull HSI update from Sebastian Reichel:
"Just one patch for the HSI subsystem this time: use the new vm_fault_t
return type"
* tag 'hsi-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
hsi: clients: Change return type to vm_fault_t
general cleanups, but nothing too major. The majority of the diff goes to
two SoCs, Actions Semi and Qualcomm. A brand new driver is introduced for
Actions Semi so it takes up some lines to add all the different types, and
the Qualcomm diff is there because we add support for two SoCs and it's quite
a bit of data.
Otherwise the big driver updates are on TI Davinci and Amlogic platforms. And
then the long tail of driver updates for various fixes and stuff follows
after that.
Core:
- debugfs cleanups removing error checking and an unused provider API
- Removal of a clk init typedef that isn't used
- Usage of match_string() to simplify parent string name matching
- OF clk helpers moved to their own file (linux/of_clk.h)
- Make clk warnings more readable across kernel versions
New Drivers:
- Qualcomm SDM845 GCC and Video clk controllers
- Qualcomm MSM8998 GCC
- Actions Semi S900 SoC support
- Nuvoton npcm750 microcontroller clks
- Amlogic axg AO clock controller
Removed Drivers:
- Deprecated Rockchip clk-gate driver
Updates:
- debugfs functions stopped checking return values
- Support for the MSIOF module clocks on Rensas R-Car M3-N
- Support for the new Rensas RZ/G1C and R-Car E3 SoCs
- Qualcomm GDSC, RCG, and PLL updates for clk changes in new SoCs
- Berlin and Amlogic SPDX tagging
- Usage of of_clk_get_parent_count() in more places
- Proper implementation of the CDEV1/2 clocks on Tegra20
- Allwinner H6 PRCM clock support and R40 EMAC support
- Add critical flag to meson8b's fdiv2 as temporary fixup for ethernet
- Round closest support for meson's mpll driver
- Support for meson8b nand clocks and gxbb video decoder clocks
- Mediatek mali clks
- STM32MP1 fixes
- Uniphier LD11/LD20 stream demux system clock
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAlsWxugACgkQrQKIl8bk
lSVs2A/9HOMsWeiYx1MESrXw6N2UknWeqeT/b1v8L/VOiptJg+OTExPbzmSylngv
AXJAfIkCpguSMh9b310pA3DAzk5docmbQ4zL977yY+KXmOcDooCd34aG5a+tB3ie
ugC8T2bQLrJdMp3hsqaKZsYzqe7LoW2NJgoliXDMA/QUBLpvHq+fcu2zOawingTA
GNc3LGqP5Op7p09aPK30gtQNqLK5qGpHASa/AY7Y0PXlUeTZ8rmF06fcEAg5shkC
CT57Zy2rSFB2RorEJarYXDPLRHMw/jxXtpMVXEy7zuz/3ajvvRiZDHv75+NaBru9
hDt1rzslzexEN4fYzj4AtGYRKyBrHbDaxG1qdIWPWVyoE0CEb+dZ1gH7/Ski5r+s
z5D28NogC0T0sey6yWssyG3RLvkPJ5nxUhL++siHm1lbyo16LmhB1+nFvxrlzmBB
0V1xqEa7feYpD+JD66lJFb5ornHLwGtVYBpeiY+hrDR3ddWEe1IxaYGR2p9nHwSS
Us/ZQdHIYBVEqoo3+BWnTn+HSQzmd/sqHqWnLlVWUHoomm5nXx18PeS87vFbcPv9
dMr+FFJ3Elubzcy5UZJPfNw+pb+teE7tYGQkQ3nbLRxT1YZOoIJZJDqNKxM1cgne
6c/VXJMEyBBn/w7Iru/3eWCZVQJGlmYS47DFDzduFvd3LMfmKIM=
=KK/v
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"This time we have a good set of changes to the core framework that do
some general cleanups, but nothing too major. The majority of the diff
goes to two SoCs, Actions Semi and Qualcomm. A brand new driver is
introduced for Actions Semi so it takes up some lines to add all the
different types, and the Qualcomm diff is there because we add support
for two SoCs and it's quite a bit of data.
Otherwise the big driver updates are on TI Davinci and Amlogic
platforms. And then the long tail of driver updates for various fixes
and stuff follows after that.
Core:
- debugfs cleanups removing error checking and an unused provider API
- Removal of a clk init typedef that isn't used
- Usage of match_string() to simplify parent string name matching
- OF clk helpers moved to their own file (linux/of_clk.h)
- Make clk warnings more readable across kernel versions
New Drivers:
- Qualcomm SDM845 GCC and Video clk controllers
- Qualcomm MSM8998 GCC
- Actions Semi S900 SoC support
- Nuvoton npcm750 microcontroller clks
- Amlogic axg AO clock controller
Removed Drivers:
- Deprecated Rockchip clk-gate driver
Updates:
- debugfs functions stopped checking return values
- Support for the MSIOF module clocks on Rensas R-Car M3-N
- Support for the new Rensas RZ/G1C and R-Car E3 SoCs
- Qualcomm GDSC, RCG, and PLL updates for clk changes in new SoCs
- Berlin and Amlogic SPDX tagging
- Usage of of_clk_get_parent_count() in more places
- Proper implementation of the CDEV1/2 clocks on Tegra20
- Allwinner H6 PRCM clock support and R40 EMAC support
- Add critical flag to meson8b's fdiv2 as temporary fixup for ethernet
- Round closest support for meson's mpll driver
- Support for meson8b nand clocks and gxbb video decoder clocks
- Mediatek mali clks
- STM32MP1 fixes
- Uniphier LD11/LD20 stream demux system clock"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (134 commits)
clk: qcom: Export clk_fabia_pll_configure()
clk: bcm: Update and add Stingray clock entries
dt-bindings: clk: Update Stingray binding doc
clk-si544: Properly round requested frequency to nearest match
clk: ingenic: jz4770: Add 150us delay after enabling VPU clock
clk: ingenic: jz4770: Enable power of AHB1 bus after ungating VPU clock
clk: ingenic: jz4770: Modify C1CLK clock to disable CPU clock stop on idle
clk: ingenic: jz4770: Change OTG from custom to standard gated clock
clk: ingenic: Support specifying "wait for clock stable" delay
clk: ingenic: Add support for clocks whose gate bit is inverted
clk: use match_string() helper
clk: bcm2835: use match_string() helper
clk: Return void from debug_init op
clk: remove clk_debugfs_add_file()
clk: tegra: no need to check return value of debugfs_create functions
clk: davinci: no need to check return value of debugfs_create functions
clk: bcm2835: no need to check return value of debugfs_create functions
clk: no need to check return value of debugfs_create functions
clk: imx6: add EPIT clock support
clk: mvebu: use correct bit for 98DX3236 NAND
...
Pull MD updates from Shaohua Li:
"A few fixes of MD for this merge window. Mostly bug fixes:
- raid5 stripe batch fix from Amy
- Read error handling for raid1 FailFast device from Gioh
- raid10 recovery NULL pointer dereference fix from Guoqing
- Support write hint for raid5 stripe cache from Mariusz
- Fixes for device hot add/remove from Neil and Yufen
- Improve flush bio scalability from Xiao"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
MD: fix lock contention for flush bios
md/raid5: Assigning NULL to sh->batch_head before testing bit R5_Overlap of a stripe
md/raid1: add error handling of read error from FailFast device
md: fix NULL dereference of mddev->pers in remove_and_add_spares()
raid5: copy write hint from origin bio to stripe
md: fix two problems with setting the "re-add" device state.
raid10: check bio in r10buf_pool_free to void NULL pointer dereference
md: fix an error code format and remove unsed bio_sector
Pull sparc updates from David Miller:
- a FPE signal fix that was also merged upstream
- privileged ADI driver from Tom Hromatka
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: fix compat siginfo ABI regression
selftests: sparc64: char: Selftest for privileged ADI driver
char: sparc64: Add privileged ADI driver
Pull IDE updates from David Miller:
"Primarily IRQ disabling avoidance changes from Sebastian Andrzej
Siewior"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: don't enable/disable interrupts in force threaded-IRQ mode
ide: don't disable interrupts during kmap_atomic()
ide: Handle irq disabling consistently
alim15x3: move irq-restore before pci_dev_put()
Here is the big staging and IIO driver update for 4.18-rc1.
It was delayed as I wanted to make sure the final driver deletions did
not cause any major merge issues, and all now looks good.
There are a lot of patches here, just over 1000. The diffstat summary
shows the major changes here:
1007 files changed, 16828 insertions(+), 227770 deletions(-)
Because of this, we might be close to shrinking the overall kernel
source code size for two releases in a row.
There was loads of work in this release cycle, primarily:
- tons of ks7010 driver cleanups
- lots of mt7621 driver fixes and cleanups
- most driver cleanups
- wilc1000 fixes and cleanups
- lots and lots of IIO driver cleanups and new additions
- debugfs cleanups for all staging drivers
- lots of other staging driver cleanups and fixes, the shortlog
has the full details.
but the big user-visable things here are the removal of 3 chunks of
code:
- ncpfs and ipx were removed on schedule, no one has cared about
this code since it moved to staging last year, and if it needs
to come back, it can be reverted.
- lustre file system is removed. I've ranted at the lustre
developers about once a year for the past 5 years, with no
real forward progress at all to clean things up and get the
code into the "real" part of the kernel. Given that the
lustre developers continue to work on an external tree and try
to port those changes to the in-kernel tree every once in a
while, this whole thing really really is not working out at
all. So I'm deleting it so that the developers can spend the
time working in their out-of-tree location and get things
cleaned up properly to get merged into the tree correctly at a
later date.
Because of these file removals, you will have merge issues on some of
these files (2 in the ipx code, 1 in the ncpfs code, and 1 in the
atomisp driver). Just delete those files, it's a simple merge :)
All of this has been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWxvjGQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymoEwCbBYnyUl3cwCszIJ3L3/zvUWpmqIgAn1DDsAim
dM4lmKg6HX/JBSV4GAN0
=zdta
-----END PGP SIGNATURE-----
Merge tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO updates from Greg KH:
"Here is the big staging and IIO driver update for 4.18-rc1.
It was delayed as I wanted to make sure the final driver deletions did
not cause any major merge issues, and all now looks good.
There are a lot of patches here, just over 1000. The diffstat summary
shows the major changes here:
1007 files changed, 16828 insertions(+), 227770 deletions(-)
Because of this, we might be close to shrinking the overall kernel
source code size for two releases in a row.
There was loads of work in this release cycle, primarily:
- tons of ks7010 driver cleanups
- lots of mt7621 driver fixes and cleanups
- most driver cleanups
- wilc1000 fixes and cleanups
- lots and lots of IIO driver cleanups and new additions
- debugfs cleanups for all staging drivers
- lots of other staging driver cleanups and fixes, the shortlog has
the full details.
but the big user-visable things here are the removal of 3 chunks of
code:
- ncpfs and ipx were removed on schedule, no one has cared about this
code since it moved to staging last year, and if it needs to come
back, it can be reverted.
- lustre file system is removed.
I've ranted at the lustre developers about once a year for the past
5 years, with no real forward progress at all to clean things up
and get the code into the "real" part of the kernel.
Given that the lustre developers continue to work on an external
tree and try to port those changes to the in-kernel tree every once
in a while, this whole thing really really is not working out at
all. So I'm deleting it so that the developers can spend the time
working in their out-of-tree location and get things cleaned up
properly to get merged into the tree correctly at a later date.
Because of these file removals, you will have merge issues on some of
these files (2 in the ipx code, 1 in the ncpfs code, and 1 in the
atomisp driver). Just delete those files, it's a simple merge :)
All of this has been in linux-next for a while with no reported
problems"
* tag 'staging-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (1011 commits)
staging: ipx: delete it from the tree
ncpfs: remove uapi .h files
ncpfs: remove Documentation
ncpfs: remove compat functionality
staging: ncpfs: delete it
staging: lustre: delete the filesystem from the tree.
staging: vc04_services: no need to save the log debufs dentries
staging: vc04_services: vchiq_debugfs_log_entry can be a void *
staging: vc04_services: remove struct vchiq_debugfs_info
staging: vc04_services: move client dbg directory into static variable
staging: vc04_services: remove odd vchiq_debugfs_top() wrapper
staging: vc04_services: no need to check debugfs return values
staging: mt7621-gpio: reorder includes alphabetically
staging: mt7621-gpio: change gc_map to don't use pointers
staging: mt7621-gpio: use GPIOF_DIR_OUT and GPIOF_DIR_IN macros instead of custom values
staging: mt7621-gpio: change 'to_mediatek_gpio' to make just a one line return
staging: mt7621-gpio: dt-bindings: update documentation for #interrupt-cells property
staging: mt7621-gpio: update #interrupt-cells for the gpio node
staging: mt7621-gpio: dt-bindings: complete documentation for the gpio
staging: mt7621-dts: add missing properties to gpio node
...
New stepping of Skylake has fixes for cache occupancy and memory
bandwidth monitoring.
Update the code to enable these by default on newer steppings.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: stable@vger.kernel.org # v4.14
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Link: https://lkml.kernel.org/r/20180608160732.9842-1-tony.luck@intel.com
A recent commit reused the original request flags for the flush
queue handling. However, for some of the kick flush cases, the
original request was already completed. This caused a use after
free, if blk-mq wasn't used.
Fixes: 84fca1b0c4 ("block: pass failfast and driver-specific flags to flush requests")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* DAX broke a fundamental assumption of truncate of file mapped pages.
The truncate path assumed that it is safe to disconnect a pinned page
from a file and let the filesystem reclaim the physical block. With DAX
the page is equivalent to the filesystem block. Introduce
dax_layout_busy_page() to enable filesystems to wait for pinned DAX
pages to be released. Without this wait a filesystem could allocate
blocks under active device-DMA to a new file.
* DAX arranges for the block layer to be bypassed and uses
dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
However, the memcpy_mcsafe() facility is available through the pmem
block driver. In order to safely handle media errors, via the DAX
block-layer bypass, introduce copy_to_iter_mcsafe().
* Fix cache management policy relative to the ACPI NFIT Platform
Capabilities Structure to properly elide cache flushes when they are not
necessary. The table indicates whether CPU caches are power-fail
protected. Clarify that a deep flush is always performed on
REQ_{FUA,PREFLUSH} requests.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbGxI7AAoJEB7SkWpmfYgCDjsP/2Lcibu9Kf4tKIzuInsle6iE
6qP29qlkpHVTpDKbhvIxTYTYL9sMU0DNUrpPCJR/EYdeyztLWDFC5EAT1wF240vf
maV37s/uP331jSC/2VJnKWzBs2ztQxmKLEIQCxh6aT0qs9cbaOvJgB/WlVu+qtsl
aGJFLmb6vdQacp31noU5plKrMgMA1pADyF5qx9I9K2HwowHE7T368ZEFS/3S//c3
LXmpx/Nfq52sGu/qbRbu6B1CTJhIGhmarObyQnvBYoKntK1Ov4e8DS95wD3EhNDe
FuRkOCUKhjl6cFy7QVWh1ct1bFm84ny+b4/AtbpOmv9l/+0mveJ7e+5mu8HQTifT
wYiEe2xzXJ+OG/xntv8SvlZKMpjP3BqI0jYsTutsjT4oHrciiXdXM186cyS+BiGp
KtFmWyncQJgfiTq6+Hj5XpP9BapNS+OYdYgUagw9ZwzdzptuGFYUMSVOBrYrn6c/
fwqtxjubykJoW0P3pkIoT91arFSea7nxOKnGwft06imQ7TwR4ARsI308feQ9itJq
2P2e7/20nYMsw2aRaUDDA70Yu+Lagn1m8WL87IybUGeUDLb1BAkjphAlWa6COJ+u
PhvAD2tvyM9m0c7O5Mytvz7iWKG6SVgatoAyOPkaeplQK8khZ+wEpuK58sO6C1w8
4GBvt9ri9i/Ww/A+ppWs
=4bfw
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This adds a user for the new 'bytes-remaining' updates to
memcpy_mcsafe() that you already received through Ingo via the
x86-dax- for-linus pull.
Not included here, but still targeting this cycle, is support for
handling memory media errors (poison) consumed via userspace dax
mappings.
Summary:
- DAX broke a fundamental assumption of truncate of file mapped
pages. The truncate path assumed that it is safe to disconnect a
pinned page from a file and let the filesystem reclaim the physical
block. With DAX the page is equivalent to the filesystem block.
Introduce dax_layout_busy_page() to enable filesystems to wait for
pinned DAX pages to be released. Without this wait a filesystem
could allocate blocks under active device-DMA to a new file.
- DAX arranges for the block layer to be bypassed and uses
dax_direct_access() + copy_to_iter() to satisfy read(2) calls.
However, the memcpy_mcsafe() facility is available through the pmem
block driver. In order to safely handle media errors, via the DAX
block-layer bypass, introduce copy_to_iter_mcsafe().
- Fix cache management policy relative to the ACPI NFIT Platform
Capabilities Structure to properly elide cache flushes when they
are not necessary. The table indicates whether CPU caches are
power-fail protected. Clarify that a deep flush is always performed
on REQ_{FUA,PREFLUSH} requests"
* tag 'libnvdimm-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
dax: Use dax_write_cache* helpers
libnvdimm, pmem: Do not flush power-fail protected CPU caches
libnvdimm, pmem: Unconditionally deep flush on *sync
libnvdimm, pmem: Complete REQ_FLUSH => REQ_PREFLUSH
acpi, nfit: Remove ecc_unit_size
dax: dax_insert_mapping_entry always succeeds
libnvdimm, e820: Register all pmem resources
libnvdimm: Debug probe times
linvdimm, pmem: Preserve read-only setting for pmem devices
x86, nfit_test: Add unit test for memcpy_mcsafe()
pmem: Switch to copy_to_iter_mcsafe()
dax: Report bytes remaining in dax_iomap_actor()
dax: Introduce a ->copy_to_iter dax operation
uio, lib: Fix CONFIG_ARCH_HAS_UACCESS_MCSAFE compilation
xfs, dax: introduce xfs_break_dax_layouts()
xfs: prepare xfs_break_layouts() for another layout type
xfs: prepare xfs_break_layouts() to be called with XFS_MMAPLOCK_EXCL
mm, fs, dax: handle layout changes to pinned dax mappings
mm: fix __gup_device_huge vs unmap
mm: introduce MEMORY_DEVICE_FS_DAX and CONFIG_DEV_PAGEMAP_OPS
...
Callers of br_fdb_find() need to hold the hash lock, which
br_fdb_find_port() doesn't do. However, since br_fdb_find_port() is not
doing any actual FDB manipulation, the hash lock is not really needed at
all. So convert to br_fdb_find_rcu(), surrounded by rcu_read_lock() /
_unlock() pair.
The device pointer copied from inside the FDB entry is then kept alive
by the RTNL lock, which br_fdb_find_port() asserts.
Fixes: 4d4fd36126 ("net: bridge: Publish bridge accessor functions")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 6b229cf77d ("udp: add batching to udp_rmem_release()")
the sk_rmem_alloc field does not measure exactly anymore the
receive queue length, because we batch the rmem release. The issue
is really apparent only after commit 0d4a6608f6 ("udp: do rmem bulk
free even if the rx sk queue is empty"): the user space can easily
check for an empty socket with not-0 queue length reported by the 'ss'
tool or the procfs interface.
We need to use a custom UDP helper to report the correct queue length,
taking into account the forward allocation deficit.
Reported-by: trevor.francis@46labs.com
Fixes: 6b229cf77d ("UDP: add batching to udp_rmem_release()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
end with '\0' character.
v2: fix errors in the commit message, thanks Hangbin Liu
Fixes: fa1b1cff3d ("net_cls_act: Make act_simple use of netlink policy.")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bp->SharedMemAddr is set to NULL while bp->SharedMemSize lesser-or-equal 0,
then memset will trigger null-ptr-deref.
fix it by replacing pci_alloc_consistent with dma_zalloc_coherent.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Colin Ian King <colin.king@canonical.com>
This was originally mistakenly submitted to net-next. Resubmitting to net.
The comparison of numvecs < 0 is always false because numvecs is a u32
and hence the error return from a failed call to pci_alloc_irq_vectores
is never detected. Fix this by using the signed int ret to handle the
error return and assign numvecs to err.
Detected by CoverityScan, CID#1468650 ("Unsigned compared against 0")
Fixes: a09bd81b54 ("net: aquantia: Limit number of vectors to actually allocated irqs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlsa4sQQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpqPNEADbZby01Q6i+dTZYIosz5+/gq8gSkCcpQ/T
krK/f2MlwD7Rdog1BnGNNP5XOqK8pKIGdARL1FQKpViii6xGIoOc2F4VK+vO44yR
LI+BeeOM6rWNOAoBO4CqeZz/Fv5IYi7KURWogYhZMqrxBqT2OeD9MMowm5NulBix
YZ2ttFWiTJScJttJCDPE6cu9EjHDeK63Nr7+UU80k3atU4eUpUp1mRFGmtaYWulq
l3KaENCwm00WCVqM4i/gVWr2AkgTZqAAyeCx7IrPsrQrCMEhxEpMnU52e2kXSxhM
Qx6FLNEOjzARuBDurtfJE74usQcW2xDLzT8fh2UStnPpt6S/JX6f9GMBVk0G7I8B
8COF4DF+bzdbhhz2SiZaTFOmDML5H1iQ8t6lTTms0Bnq29mE3E4QFom8lO+2BxN3
g6PFhvYaOkhTVtV5BPXpXs9xZBLHrv5G/JopXsZh0RF1kpiova+nfA1K2uJPFpJ0
NcHuMZKmIG3uBqY3fj5Ul+zuVhZ/1v8B69zWoSWafLrk+VRdcEAniuY2E6SsQFP5
gV4GNja85S53DnlIVwEUXPYMiY6opiwP53yMNMvkB/FdzaQB5Ehdif2fhZu64QmE
TtqbHtAuV0VZ3z4GrJ3XNbV6Np4wMOhYls4lTkZsnqNNO2sw/eoTYcmwxLDEYOQw
uQ9rhZh4IQ==
=N3BP
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20180608' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes for this merge window, where some of them should go in
sooner rather than later, hence a new pull this week. This pull
request contains:
- Set of NVMe fixes, mostly follow up cleanups/fixes to the queue
changes, but also teardown/removal and misc changes (Christop/Dan/
Johannes/Sagi/Steve).
- Two lightnvm fixes for issues that showed up in this window
(Colin/Wei).
- Failfast/driver flags inheritance for flush requests (Hannes).
- The md device put sanitization and fix (Kent).
- dm bio_set inheritance fix (me).
- nbd discard granularity fix (Josef).
- nbd consistency in command printing (Kevin).
- Loop recursion validation fix (Ted).
- Partition overlap check (Wang)"
[ .. and now my build is warning-free again thanks to the md fix - Linus ]
* tag 'for-linus-20180608' of git://git.kernel.dk/linux-block: (22 commits)
nvme: cleanup double shift issue
nvme-pci: make CMB SQ mod-param read-only
nvme-pci: unquiesce dead controller queues
nvme-pci: remove HMB teardown on reset
nvme-pci: queue creation fixes
nvme-pci: remove unnecessary completion doorbell check
nvme-pci: remove unnecessary nested locking
nvmet: filter newlines from user input
nvme-rdma: correctly check for target keyed sgl support
nvme: don't hold nvmf_transports_rwsem for more than transport lookups
nvmet: return all zeroed buffer when we can't find an active namespace
md: Unify mddev destruction paths
dm: use bioset_init_from_src() to copy bio_set
block: add bioset_init_from_src() helper
block: always set partition number to '0' in blk_partition_remap()
block: pass failfast and driver-specific flags to flush requests
nbd: set discard_alignment to the granularity
nbd: Consistently use request pointer in debug messages.
block: add verifier for cmdline partition
lightnvm: pblk: fix resource leak of invalid_bitmap
...
Quite a lot of core work this time around, though not 100% successful.
We gained support for runtime mode changes thanks to David Collins and
improved support for write only regulators (ones where we can't read
back the configuration) from Douglas Anderson.
There's been quite a bit of work from Linus Walleij on converting from
specfying GPIOs by numbers to descriptors. Sadly the testing turned out
to be less good than we had hoped and so a lot of this had to be
reverted.
We also have the start of updates to use coupled regulators from Maciej
Purski, unfortunately there are further problems there so the last
couple of patches have been reverted.
We also have new drivers for BD71837 and SY8106A devices, SAW regulators
on Qualcomm SPMI and dropped support for some preproduction chips
that never made it to market from the AB8500 driver.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlsarcETHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0B4HB/9MFV/MK7Hw2hsVCX3qWTiH4tJ/X0MG
tGz1PfmbH0CJ9ly5g+rvCoT+1/s7BydKzi1cd1RHnimBv2U1XagwSny3LNEcJs2Q
pmhpGViakkQI/Y2h+u/j0Jk1nE+jTiKk+1ozUB7YnPekrGyQlf7TMhvKOLTvLKyX
56jdNxcW0MgSnXV2N6y4NpWhgvrQwvKjacTxV5iX7WP2rnK2WNFeG7Q859buhtI0
znRi+tO5hZsw5T44ickdPfotZn+i5o7MYLCkkaA2h1EwtpbYwVINfUjp3KtuyFhH
3B9GmCsjjN2Z2eInnkpzWfVXK2S1Vlp6+ka2FSs+U/4rVdd3Bw2KkblS
=Ftua
-----END PGP SIGNATURE-----
Merge tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"Quite a lot of core work this time around, though not 100% successful.
We gained support for runtime mode changes thanks to David Collins and
improved support for write only regulators (ones where we can't read
back the configuration) from Douglas Anderson.
There's been quite a bit of work from Linus Walleij on converting from
specfying GPIOs by numbers to descriptors. Sadly the testing turned
out to be less good than we had hoped and so a lot of this had to be
reverted.
We also have the start of updates to use coupled regulators from
Maciej Purski, unfortunately there are further problems there so the
last couple of patches have been reverted.
We also have new drivers for BD71837 and SY8106A devices, SAW
regulators on Qualcomm SPMI and dropped support for some preproduction
chips that never made it to market from the AB8500 driver"
* tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (57 commits)
regulator: gpio: Revert
ARM: pxa, regulator: fix building ezx e680
regulator: Revert coupled regulator support again
regulator: wm8994: Fix shared GPIOs
regulator: max77686: Fix shared GPIOs
regulator: bd71837: BD71837 PMIC regulator driver
regulator: bd71837: Devicetree bindings for BD71837 regulators
regulator: gpio: Get enable GPIO using GPIO descriptor
regulator: fixed: Convert to use GPIO descriptor only
regulator: s2mps11: Fix boot on Odroid XU3
dt-bindings: qcom_spmi: Document SAW support
regulator: qcom_spmi: Add support for SAW
regulator: tps65090: Pass descriptor instead of GPIO number
regulator: s5m8767: Pass descriptor instead of GPIO number
regulator: pfuze100: Delete reference to ena_gpio
regulator: max8952: Pass descriptor instead of GPIO number
regulator: lp8788-ldo: Pass descriptor instead of GPIO number
regulator: lm363x: Pass descriptor instead of GPIO number
regulator: max8973: Pass descriptor instead of GPIO number
regulator: mc13xxx-core: Switch to SPDX identifier
...
The problem here is that set_bit() and test_bit() take a bit number so
we should be passing 0 but instead we're passing (1 << 0) which leads to
a double shift. It doesn't cause a runtime bug in the current code
because it's done consistently and we only set that one bit.
I decided to just re-use NVME_AER_NOTICE_NS_CHANGED instead of
introducing a new define for this.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
A controller reset after a run time change of the CMB module parameter
breaks the driver. An 'on -> off' will have the driver use NULL for the
host memory queue, and 'off -> on' will use mismatched queue depth between
the device and the host.
We could fix both, but there isn't really a good reason to change this
at run time anyway, compared to at module load time, so this patch makes
parameter read-only after after modprobe.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch ensures the nvme namsepace request queues are not quiesced
on a surprise removal. It's possible the queues were previously killed
in a failed reset, so the queues need to be unquiesced to ensure all
requests are flushed to completion.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The controller is required to disable its host memory buffer use on
controller reset. We don't need to submit an admin command to delete it,
so this patch skips sending that command so we don't need to worry about
handling a timeout.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We've been ignoring NVMe error status on queue creations. Fortunately they
are uncommon, but we should handle these anyway. This patch adds checks
for the a positive error return value that indicates an NVMe status.
If we do see a negative return, the controller isn't usable, so this
patch returns immediately in since we can't unwind that failure.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The nvme pci driver never unmaps the doorbell registers while the requests
are active, so we can always safely update the completion queue head.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The nvme pci driver no longer handles completions under the cq lock,
so the nested locking is not necessary.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>