With support for dual mode in the driver, this mode becomes
dead code. Remove reverse_ini_mode from code.
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Add switch to allow both Initiator Mode & Target
mode to operate at the same time.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Currently code performs a full scan of the fabric for
every RSCN. Its an expensive process in a noisy large SAN.
This patch optimizes expensive fabric discovery process by
scanning switch for the affected port when RSCN is received.
Currently Initiator Mode code makes login/logout decision without
knowledge of target mode. This causes driver and firmware to go
out-of-sync. This framework synchronizes both initiator mode
personality and target mode personality in making login/logout
decision.
This patch adds following capabilities in the driver
- Send Notification Acknowledgement asynchronously.
- Update session/fcport state asynchronously.
- Create a session or fcport struct asynchronously.
- Send GNL asynchronously. The command will ask FW to
provide a list of FC Port entries FW knows about.
- Send GPDB asynchronously. The command will ask FW to
provide detail data of an FC Port FW knows about or
perform ADISC to verify the state of the session.
- Send GPNID asynchronously. The command will ask switch
to provide WWPN for provided NPort ID.
- Send GPSC asynchronously. The command will ask switch
to provide registered port speed for provided WWPN.
- Send GIDPN asynchronously. The command will ask the
switch to provide Nport ID for provided WWPN.
- In driver unload path, schedule all session for deletion
and wait for deletion to complete before allowing driver
unload to proceed.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Current code merges qla_tgt_sess and fc_port structure
into single fc_port structure representing same I-T nexus.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Updated code with d_id from s_id for better readability and
clarity.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Callback for sp->done expects scsi_qla_host is passed in as argument,
Instead qla_hw_data is passed in.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
During initial implementation, tape support was included but not
enabled by default on target. So far, we don't see any target
customer requesting this support. Since this code is not being
used actively, we want to remove it and we will add back if there
are any request in future for SRR support.
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Trace flags are useful during debugging crash dumps
using crash utility. These trace flags makes it easier
to understand various states a command has successfully
completed.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds internal LIO sgl limit since the driver already
sets a max transfer limit on transport layer of 1MB to the client.
Cc: stable@vger.kernel.org
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses a long standing bug where the commit phase
of COMPARE_AND_WRITE would result in a se_cmd->cmd_kref reference
leak if se_cmd->scsi_status returned non SAM_STAT_GOOD.
This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.
To address this bug, compare_and_write_post() has been changed
to drop the incorrect !cmd->scsi_status conditional that was
preventing *post_ret = 1 for being set during non SAM_STAT_GOOD
status.
This patch has been tested with SAM_STAT_CHECK_CONDITION status
from normal target_complete_cmd() callback path, as well as the
incoming __target_execute_cmd() submission failure path when
se_cmd->execute_cmd() returns non zero status.
Reported-by: Donald White <dew@datera.io>
Cc: Donald White <dew@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch addresses a long-standing bug with multi-session
(eg: iscsi-target + iser-target) se_node_acl dynamic free
withini transport_deregister_session().
This bug is caused when a storage endpoint is configured with
demo-mode (generate_node_acls = 1 + cache_dynamic_acls = 1)
initiators, and initiator login creates a new dynamic node acl
and attaches two sessions to it.
After that, demo-mode for the storage instance is disabled via
configfs (generate_node_acls = 0 + cache_dynamic_acls = 0) and
the existing dynamic acl is never converted to an explicit ACL.
The end result is dynamic acl resources are released twice when
the sessions are shutdown in transport_deregister_session().
If the storage instance is not changed to disable demo-mode,
or the dynamic acl is converted to an explict ACL, or there
is only a single session associated with the dynamic ACL,
the bug is not triggered.
To address this big, move the release of dynamic se_node_acl
memory into target_complete_nacl() so it's only freed once
when se_node_acl->acl_kref reaches zero.
(Drop unnecessary list_del_init usage - HCH)
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug where incoming task management requests
can be explicitly aborted during an active LUN_RESET, but who's
struct work_struct are canceled in-flight before execution.
This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
for target_tmr_work() never getting invoked and the aborted TMR
waiting indefinately within transport_wait_for_tasks().
To address this case, perform a CMD_T_ABORTED check early in
transport_generic_handle_tmr(), and invoke the normal path via
transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
blocked waiting for CMD_T_STOP in transport_wait_for_tasks().
Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
into transport_generic_handle_tmr() so the existing check in
core_tmr_drain_tmr_list() avoids attempting abort the incoming
se_tmr_req->task_cmd->work if it has already been queued into
se_device->tmr_wq.
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds the missing target_complete_cmd() SCSI status
parameter change in target_xcopy_do_work(), that was originally
missing in commit 926317de33.
It correctly propigates up the correct SCSI status during
EXTENDED_COPY exception cases, instead of always using the
hardcoded SAM_STAT_CHECK_CONDITION from original code.
This is required for ESX host environments that expect to
hit SAM_STAT_RESERVATION_CONFLICT for certain scenarios,
and SAM_STAT_CHECK_CONDITION results in non-retriable
status for these cases.
Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
After the v4.2+ RCU conversion to se_node_acl->lun_entry_hlist,
a BUG_ON() was added in core_enable_device_list_for_node() to
detect when the located orig->se_lun_acl contains an existing
se_lun_acl pointer reference.
However, this scenario can happen when a dynamically generated
NodeACL is being converted to an explicit NodeACL, when the
explicit NodeACL contains a different LUN mapping than the
default provided by the WWN endpoint.
So instead of triggering BUG_ON(), go ahead and fail instead
following the original pre RCU conversion logic.
Reported-by: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Cc: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pull irq fixes from Thomas Gleixner:
- Prevent double activation of interrupt lines, which causes problems
on certain interrupt controllers
- Handle the fallout of the above because x86 (ab)uses the activation
function to reconfigure interrupts under the hood.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Make irq activate operations symmetric
irqdomain: Avoid activating interrupts more than once
Fix a regression that prevented migration between hosts with different
XSAVE features even if the missing features were not used by the guest
(for stable).
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJYlf83AAoJEED/6hsPKofoQI8H/2Y9v5FkIMUeLVPf5nskcomw
pV/IqqMJEQ0sEp0+fkGhk15nykrVpXfOdqgGD8FI9Xk8rlkTEcUSGMGvfXrIk0ir
fzX27ASWrHvyjso+6XZzarSUhMFiBljU+NDcqWgjAeYEA1H+fxtxcomx+KiC1D1H
Q3kYMWTDQ0q/QU0q/4ohVM0gfVIunmVjoJaMK3tlrPP+w4MgMu2WALi0BlZKyugZ
fcVxzgGxPKoxAfXoFHohS7jKhLX9rF8MJoSH2NxInguajpMtf76Jw+YOr10yWtR2
ESY/5JXb4KLE94cwM3XiDghYg2ak/zphTFxBbPHmSxY3nim7QahRyuiMQFr3VN8=
=0UcD
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Radim Krčmář:
"Fix a regression that prevented migration between hosts with different
XSAVE features even if the missing features were not used by the guest
(for stable)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: do not save guest-unsupported XSAVE state
Here are two bugfixes that resolve some reported issues. One in the
firmware loader, that should fix the much-reported problem of crashes
with it. The other is a hyperv fix for a reported regression.
Both have been in linux-next for a week or so with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWJWsGA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylWmwCgjvg9SImQDY2FKYNAOhQnBh9gtXUAn0Gux/KD
yzqEsG5BOmjD3YcYGsx6
=VzHo
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are two bugfixes that resolve some reported issues. One in the
firmware loader, that should fix the much-reported problem of crashes
with it. The other is a hyperv fix for a reported regression.
Both have been in linux-next for a week or so with no reported issues"
* tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()
firmware: fix NULL pointer dereference in __fw_load_abort()
Here are a few small IIO and one staging driver fix for 4.10-rc7. They
fix some reported issues with the drivers.
All of them have been in linux-next for a week or so with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWJW6xQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynvjACgsAba59pU0bEDsUmtzgF4WoPYX3sAoLgB5I16
MHXQKHRl//uQtYboSufC
=CM0c
-----END PGP SIGNATURE-----
Merge tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are a few small IIO and one staging driver fix for 4.10-rc7. They
fix some reported issues with the drivers.
All of them have been in linux-next for a week or so with no reported
issues"
* tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: greybus: timesync: validate platform state callback
iio: dht11: Use usleep_range instead of msleep for start signal
iio: adc: palmas_gpadc: retrieve a valid iio_dev in suspend/resume
iio: health: max30100: fixed parenthesis around FIFO count check
iio: health: afe4404: retrieve a valid iio_dev in suspend/resume
iio: health: afe4403: retrieve a valid iio_dev in suspend/resume
Here are some small USB fixes for some reported issues, and the usual
number of new device ids for 4.10-rc7.
All of these, except the last new device id, have been in linux-next for
a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWJW8Iw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynTtQCfTZyPCHsDudlzuJeqrigE2AsfRfYAnR7OQiZK
6GgUHc8ulHGyF/Vuib3A
=dZOf
-----END PGP SIGNATURE-----
Merge tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for some reported issues, and the usual
number of new device ids for 4.10-rc7.
All of these, except the last new device id, have been in linux-next
for a while with no reported issues"
* tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: pl2303: add ATEN device ID
usb: gadget: f_fs: Assorted buffer overflow checks.
USB: Add quirk for WORLDE easykey.25 MIDI keyboard
usb: musb: Fix external abort on non-linefetch for musb_irq_work()
usb: musb: Fix host mode error -71 regression
USB: serial: option: add device ID for HP lt2523 (Novatel E371)
USB: serial: qcserial: add Dell DW5570 QDL
A single fix this time: a fix for a virtqueue removal bug which only
appears to affect S390, but which results in the queue hanging forever
thus causing the machine to fail shutdown.
Signed-off-by: James E. J. Bottomley <jejb@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIbBAABAgAGBQJYlPNHAAoJEAVr7HOZEZN4OsgP9Rlu6udoNn5If6lPly77hjsQ
ukiXZNtPNOuxeiCrUmTMBm69588/XNyuVrrpP7pccujQhX+5sv2qd2Ph4uoaXLeK
zq695/Y/ejAVRhORCJNibA+EQ6Dr4+DGEm+Iitifa1ILO/npaf5hCzNfdY7Ln3pb
cUu8FhXQkFkKOwhNovtOzkB6lXDobh3pZKBxYOsK4Ea5f1CSB+Sjdr/Xl4l141Ei
3eN+flX9VLX8pV6mJ7xQEoWCYrqjgh7l0PYSgX011S2Qniw8sgwI91XsNABZP3oJ
Ceu+COJPt3fRYcJugBYvAJB0pFUyxPh8rC0NL6nJLBcWVm5hJoaHX96/I5hgx+r2
9ZH4lLOiIyyEZQxz31qe73YzGkBe6lBNxJMjRcP3o5MXw+GDsUhZfuwqnX/Zc6EH
o7R4cW1o08HTgZcE3pKAwhTzzZ5IxMe4pkUiVBxb2TgUMvKfeX9dRBW4YStgRLKC
EHBQ89g1DSWbP15a4OX45sNYCSYPvq+HyNQCFzXXIhELVsEd7VyCyMK2i8E/ccAu
UwusYLDpX1QH56IpYNMgwoTJeCjI9HeOTGf7EWtJSMUTa/rrYSFZwcEA6xHxVPco
o3GqJMID84sg9fOCvToW8tKbl38Smkse9r24FhqBdiZRRJXsCogPCgt2Fa9ZRcPx
oNy87IN7k4K+bL6BAJw=
=1j2t
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"A single fix this time: a fix for a virtqueue removal bug which only
appears to affect S390, but which results in the queue hanging forever
thus causing the machine to fail shutdown"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: virtio_scsi: Reject commands when virtqueue is broken
ARM DMA fix revert
vhost endian-ness fix
MAINTAINERS: email address change for Amit
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYlPj5AAoJECgfDbjSjVRp2L0IALFxjzTaXDy+39y3zTkMu97r
r2Mm2CiduZJ3XrCRFKWnZleA3yKpE1zNkZlpdVV252tG0YC7oHdtdE3Ctu7x8gCv
25rH7nEbQTF5NcRh/Ur2h1oR1PGXT/CuIkEQCH8FxUWa1anbJC0Y6dpd+VSd4wWV
eQMqh/1775IdH7XeYbWvgOi3FK0ox9RclcxzRzUqEcVxL3MkZaKzPh7Qh2dGokLA
vF/ao5fchepXtUbyDwdIjvkc9bQlEjcXhch7Zz+aep+iwfEfZqB7Ku4yDmXrGTuw
URFlRen83zFMfu2Xd10hVL1JukR8TWxuxcQx8yzYEOqe9uF8LAq8hsXAgV72VmM=
=xnLA
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost fixes from Michael S. Tsirkin:
"Last minute fixes:
- ARM DMA fix revert
- vhost endian-ness fix
- MAINTAINERS: email address change for Amit"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
MAINTAINERS: update email address for Amit Shah
vhost: fix initialization for vq->is_le
Revert "vring: Force use of DMA API for ARM-based systems with legacy devices"
Merge fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm, fs: check for fatal signals in do_generic_file_read()
fs: break out of iomap_file_buffered_write on fatal signals
base/memory, hotplug: fix a kernel oops in show_valid_zones()
mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
jump label: pass kbuild_cflags when checking for asm goto support
shmem: fix sleeping from atomic context
kasan: respect /proc/sys/kernel/traceoff_on_warning
zswap: disable changing params if init fails
do_generic_file_read() can be told to perform a large request from
userspace. If the system is under OOM and the reading task is the OOM
victim then it has an access to memory reserves and finishing the full
request can lead to the full memory depletion which is dangerous. Make
sure we rather go with a short read and allow the killed task to
terminate.
Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion. He has tracked
this down to the following path
__alloc_pages_nodemask+0x436/0x4d0
alloc_pages_current+0x97/0x1b0
__page_cache_alloc+0x15d/0x1a0 mm/filemap.c:728
pagecache_get_page+0x5a/0x2b0 mm/filemap.c:1331
grab_cache_page_write_begin+0x23/0x40 mm/filemap.c:2773
iomap_write_begin+0x50/0xd0 fs/iomap.c:118
iomap_write_actor+0xb5/0x1a0 fs/iomap.c:190
? iomap_write_end+0x80/0x80 fs/iomap.c:150
iomap_apply+0xb3/0x130 fs/iomap.c:79
iomap_file_buffered_write+0x68/0xa0 fs/iomap.c:243
? iomap_write_end+0x80/0x80
xfs_file_buffered_aio_write+0x132/0x390 [xfs]
? remove_wait_queue+0x59/0x60
xfs_file_write_iter+0x90/0x130 [xfs]
__vfs_write+0xe5/0x140
vfs_write+0xc7/0x1f0
? syscall_trace_enter+0x1d0/0x380
SyS_write+0x58/0xc0
do_syscall_64+0x6c/0x200
entry_SYSCALL64_slow_path+0x25/0x25
the oom victim has access to all memory reserves to make a forward
progress to exit easier. But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request. We need to
check for fatal signals and back off with a short write instead.
As the iomap_apply delegates all the work down to the actor we have to
hook into those. All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there. dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly. Other callers like iomap_page_mkwrite work on a
single page or iomap_fiemap_actor do not allocate memory based on the
given len.
Fixes: 68a9f5e700 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
systems is desribed below. 0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
[1] 'Commit bdee237c03 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "fix a kernel oops when reading sysfs valid_zones", v2.
A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory. [1] When the start address of a
memory block is not backed by struct page, i.e. a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops. This issue was observed on multiple x86-64 systems with
more than 64GiB of memory. This patch-set fixes this issue.
Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.
Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).
Note for stable kernels: The memory block size change was made by commit
bdee237c03 ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9. However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887f4 ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.
So, I recommend that we backport it up to 4.4.
[1] 'Commit bdee237c03 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
This patch (of 2):
test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'. Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.
Fix it by properly setting 'sec_end_pfn' to the next section pfn.
Also make sure that this function returns 1 only when the range belongs
to a zone.
Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some versions of ARM GCC compiler such as Android toolchain throws in a
'-fpic' flag by default. This causes the gcc-goto check script to fail
although some config would have '-fno-pic' flag in the KBUILD_CFLAGS.
This patch passes the KBUILD_CFLAGS to the check script so that the
script does not rely on the default config from different compilers.
Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.com
Signed-off-by: David Lin <dtwlin@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Michal Marek <mmarek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After much waiting I finally reproduced a KASAN issue, only to find my
trace-buffer empty of useful information because it got spooled out :/
Make kasan_report honour the /proc/sys/kernel/traceoff_on_warning
interface.
Link: http://lkml.kernel.org/r/20170125164106.3514-1-aryabinin@virtuozzo.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add zswap_init_failed bool that prevents changing any of the module
params, if init_zswap() fails, and set zswap_enabled to false. Change
'enabled' param to a callback, and check zswap_init_failed before
allowing any change to 'enabled', 'zpool', or 'compressor' params.
Any driver that is built-in to the kernel will not be unloaded if its
init function returns error, and its module params remain accessible for
users to change via sysfs. Since zswap uses param callbacks, which
assume that zswap has been initialized, changing the zswap params after
a failed initialization will result in WARNING due to the param
callbacks expecting a pool to already exist. This prevents that by
immediately exiting any of the param callbacks if initialization failed.
This was reported here:
https://marc.info/?l=linux-mm&m=147004228125528&w=4
And fixes this WARNING:
[ 429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60
The warning is just noise, and not serious. However, when init fails,
zswap frees all its percpu dstmem pages and its kmem cache. The kmem
cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
the percpu dstmem pages are definitely a problem, as they're used as
temporary buffer for compressed pages before copying into place in the
zpool.
If the user does get zswap enabled after an init failure, then zswap
will likely Oops on the first page it tries to compress (or worse, start
corrupting memory).
Fixes: 90b0fc26d5 ("zswap: change zpool/compressor at runtime")
Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reported-by: Marcin Miroslaw <marcin@mejor.pl>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Three changes here, two run of the mill driver specific fixes and a
change from Mark Rutland which reverts some new device specific ACPI
binding code which was added during the merge window as there are
concerns about this sending the wrong signal about usage of regulators
in ACPI systems.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAliUbfoTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0A9tB/4zf8o0ueo5kT2+15FZBozyY9iKMZl6
daIGxXdJlHjUoCawoq00az3SxELPx0ydq+Cl2A1/lpJAwy0RZ/K1NnIC/bddI9xD
m9DsgictpVqrl/XF6+9WIutXq4FTGQVWD7VbkG0pP/MF80tEzskTTNwe9uGjgeeu
tJAF0ksYC0wA8pG1ukTyAU5zthv6Vr4VSTq8ETpVkpwMiE7nfLtDlf468xg8L8ng
4JAgZA0AsEOWnDRQvc7gCFEmn41rl0WfQNnf/CdnjnrefVpFoW7+paU6a8mgGRqD
+8hiNaqvgjgGfICQV6eFpGoP//9jRvisEOxl255ZATXEKZ5fjdBOKd3T
=7XMg
-----END PGP SIGNATURE-----
Merge tag 'regulator-fix-v4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Three changes here: two run of the mill driver specific fixes and a
change from Mark Rutland which reverts some new device specific ACPI
binding code which was added during the merge window as there are
concerns about this sending the wrong signal about usage of regulators
in ACPI systems"
* tag 'regulator-fix-v4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: fixed: Revert support for ACPI interface
regulator: axp20x: AXP806: Fix dcdcb being set instead of dcdce
regulator: twl6030: fix range comparison, allowing vsel = 59
I'm leaving my job at Red Hat, this email address will stop working next week.
Update it to one that I will have access to later.
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, under certain circumstances vhost_init_is_le does just a part
of the initialization job, and depends on vhost_reset_is_le being called
too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
when vq->private_data is NULL. This is not only counter intuitive, but
also real a problem because it breaks vhost_net. The bug was introduced to
vhost_net with commit 2751c9882b ("vhost: cross-endian support for
legacy devices"). The symptom is corruption of the vq's used.idx field
(virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
shutdown on a vq with pending descriptors.
Let us make sure the outcome of vhost_init_is_le never depend on the state
it is actually supposed to initialize, and fix virtio_net by removing the
reset from vhost_vq_init_access.
With the above, there is no reason for vhost_reset_is_le to do just half
of the job. Let us make vhost_reset_is_le reinitialize is_le.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Michael A. Tebolt <miket@us.ibm.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: commit 2751c9882b ("vhost: cross-endian support for legacy devices")
Cc: <stable@vger.kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Michael A. Tebolt <miket@us.ibm.com>
This reverts commit c7070619f3.
This has been shown to regress on some ARM systems:
by forcing on DMA API usage for ARM systems, we have inadvertently
kicked open a hornets' nest in terms of cache-coherency. Namely that
unless the virtio device is explicitly described as capable of coherent
DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
assume it is non-coherent. This turns out to cause a big problem for the
likes of QEMU and kvmtool, which generate virtio-mmio devices in their
guest DTs but neglect to add the often-overlooked "dma-coherent"
property; as a result, we end up with the guest making non-cacheable
accesses to the vring, the host doing so cacheably, both talking past
each other and things going horribly wrong.
We are working on a safer work-around.
Fixes: c7070619f3 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
One more device ID for pl2303.
Signed-off-by: Johan Hovold <johan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIuBAABCAAYBQJYlKr/ERxqb2hhbkBrZXJuZWwub3JnAAoJEEEN5E/e4bSVMPMP
/2aaW+WBJxFVHtIZUESNWCuhe3CvUUrLmpKfOwuRCruvQ9C7vwRCcAW6tG8CjLVL
/Byq1K2RMiGq2XFIB5bVVz0XkXEBSX8tkXgp+I7M9Ajmixo4F6kpmE7RHQ3gAbwO
q3EqLz1hvHZ8nX53iHLgouMSPtHolS5o8ofd0HmKfAG60MnW0HKDGl0PBfEycZjf
K30cU76fQhixMosyc5bA3DZxhIprnktauWSGY61RmudGzxAiFUMlJrHT8RQZpRmK
x9VF5MJIGdkGQM73RW2uPpnXImbhIDZPTqKrWWfpp/+0dT/Qy/OpEzTS+K2HNSAS
79I8j2uPJ1c+BXCHvrpUHGc8zIxVNJ64pVrteCZdQHaWrTFZpLAAFRG31ctRR5do
hHzGBepXFr1QPGoebH65bDl3BeORoiNQkfPhAZi7kOucw9HwP9kFcN8DlvaWQ9J1
7cNOyankXwcgwkgcsUZm2SwrusXCJXchxur9MrDciBU2NdDP33A2E5pSe6vEqyhR
Vs2R3NWdV5tJErMYVLJY1MgZ7oRHgSsj3ldAE7ce3HIyzJqrxxyHI8/0/ncDAKon
lukp3S+XVKnqp+ZIJpcWTHXKfsXZLrpH5twD6qY4IrsQGF+E3dVzeLFbXh2pK6KD
nGpxS5dNh1lSMyrQF2wRp+jAKEaJMvx+9QrFFGjr6cSz
=YjOy
-----END PGP SIGNATURE-----
Merge tag 'usb-serial-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus
Johan writes:
USB-serial fixes for v4.10-rc7
One more device ID for pl2303.
Signed-off-by: Johan Hovold <johan@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYk70aAAoJEAx081l5xIa+yNMQAKj/mRRLbF5LTHtHDSwkiFkL
0PQYJUI9GHIc5zGFlQc9YJ+Nk2cQSEybCD6fSd88JofNffE+PJW2cDU17ggAeeVW
AqKacwadLduaK286TNWhX+c1tkWPtbvNWGzLSNTZdIAqTGFH4FvJFtRevLLEVMjY
jXE/yfkqRH+IbogsHMRuwzPZLxgpT41AtMrC39ndIOn0znfdaQz6gdYzspH2X0lo
w698AxD6cOFSTovQDG6H7cbhbo8sSqDVF6a38cAH4DwtczraVfLYPbBkAApHBocA
trPZXro5GO5uJbV9IMxsBjmBdap3T7R7HAWYH5blJy9x+M3U0UrAsZC5AGM38xRG
pXspq8ewDVW3jQ6Ob7Gg7ksmaymVLqUjQMgohYH3DrdRokehHYheO5DyOZqfJJOs
w1kybnNp16a8Qwu16IbbdOUhNfEedzmCeJKWg/g2KDqJlsdmF/4VUaOQBqgeeNta
2SZQ12v/Xv5aj8XocFj2exICuwq04DnnXHBU2HpEOZ7bHEiVy7mBNxap17+NbklY
cyd0cwzXJ6WvrEmmjC5IAd569wKoC1RDOXOMKrKiPKgRbGaofhui6klBVLqT+6Nt
yKM1uevZlxiumuEup6r6yDVyxjjhO6Ej07r9HsBRrnkTSkm8qkuvvPVICRfOF8gL
8VrH5uUFFqbZ6+IO15xq
=hRhh
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Another fixes pull for v4.10, it's a bit big due to the backport of
the VMA fixes for i915 that should fix the oops on shutdown problems
that you've worked around.
There are also two drm core connector registration fixes, a bunch of
nouveau regression fixes and two AMD fixes"
* tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl
drm/amdgpu/si: fix crash on headless asics
drm/i915: Track pinned vma in intel_plane_state
drm/atomic: Unconditionally call prepare_fb.
drm/atomic: Fix double free in drm_atomic_state_default_clear
drm/nouveau/kms/nv50: request vblank events for commits that send completion events
drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215
drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m
drm/nouveau/disp/mcp7x: disable dptmds workaround
drm/nouveau: prevent userspace from deleting client object
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
drm: Don't race connector registration
drm: prevent double-(un)registration for connectors
The main change is we're reverting the initial stack protector support we
merged this cycle. It turns out to not work on toolchains built with libc
support, and fixing it will be need to wait for another release.
And the rest are all fairly minor:
- Some pasemi machines were not booting due to a missing error check in
prom_find_boot_cpu().
- In EEH we were checking a pointer rather than the bool it pointed to.
- The clang build was broken by a BUILD_BUG_ON() we added.
- The radix (Power9 only) version of map_kernel_page() was broken if our
memory size was a multiple of 2MB, which it generally isn't.
Thanks to:
Darren Stevens, Gavin Shan, Reza Arbab.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYlGHSAAoJEFHr6jzI4aWAVy4QALSpVLv0900hMtn+5MBhhgad
GxHI3ViJUV+K3gDpagl+Ya++WiPd9ffiK9FqS4Xe/r1r2pgSoNU8SC736Jomb1wL
cCx34E73SLMTfnhm/lVq+a4e+nRnk633C8gdsU4Py+FFOfz7FkeQX/gBGbv+ffng
oVO6Susqo/dKirVo6+BCgVjSLdYezzxw31SVNGYUYz4MFxWUHK+bi/l0FIDQGhjD
mTzV2cPhfjYTAHQtY43in6plQ91Pd2CR5HxL3Bw9DnFhS7zFMBnWPYYWh+kdT/vf
wNS1cFJoGHjXCaXt/eXsk3GPC696bDWGSwpenQewTQdb8yJ5MWP7Jv9KeBN8603x
LSE6o9/RATv7+pJJ9UPP+vLwSXtx2bi1FJlOU4Nbxi5YhfXZncCltMANCNuVix1l
2x5Qhs8GJNt7nqxhv5GwvAmEaKZet1Ucgo+p/9D08bfKk8loN+evRx6fjT5CymE7
s1INIbkzxppebTBjawDA7w6kWXjmgrskWvqjrPUfu91Ro8KmeBVS4t0pXY9D/i8N
hCEMbXNL5qrWTRScOxP3k5cQsOPnQEz6F7AkecYjWfwGG9ZvdcztuhTj1qYRjVaF
9Xk7OWu52/EamFdZ5MGuiVIEp8Vbqy8/0IJ/bdTOEMgjsaziD6LyRNZ6Qoyxg+bf
kFEqmzuKgmz3D4aTklnh
=XUAi
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"The main change is we're reverting the initial stack protector support
we merged this cycle. It turns out to not work on toolchains built
with libc support, and fixing it will be need to wait for another
release.
And the rest are all fairly minor:
- Some pasemi machines were not booting due to a missing error check
in prom_find_boot_cpu()
- In EEH we were checking a pointer rather than the bool it pointed
to
- The clang build was broken by a BUILD_BUG_ON() we added.
- The radix (Power9 only) version of map_kernel_page() was broken if
our memory size was a multiple of 2MB, which it generally isn't
Thanks to: Darren Stevens, Gavin Shan, Reza Arbab"
* tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Use the correct pointer when setting a 2MB pte
powerpc: Fix build failure with clang due to BUILD_BUG_ON()
powerpc: Revert the initial stack protector support
powerpc/eeh: Fix wrong flag passed to eeh_unfreeze_pe()
powerpc: Add missing error check to prom_find_boot_cpu()
Merge kcrctab entry fixes from Ard Biesheuvel:
"This is a followup to [0] 'modversions: redefine kcrctab entries as
relative CRC pointers', but since relative CRC pointers do not work in
modules, and are actually only needed by powerpc with
CONFIG_RELOCATABLE=y, I have made it a Kconfig selectable feature
instead.
First it introduces the MODULE_REL_CRCS Kconfig symbol, and adds the
kbuild handling of it, i.e., modpost, genksyms and kallsyms.
Then it switches all architectures to 32-bit CRC entries in kcrctab,
where all architectures except powerpc with CONFIG_RELOCATABLE=y use
absolute ELF symbol references as before"
[0] http://marc.info/?l=linux-arch&m=148493613415294&w=2
* emailed patches from Ard Biesheuvel:
module: unify absolute krctab definitions for 32-bit and 64-bit
modversions: treat symbol CRCs as 32 bit quantities
kbuild: modversions: add infrastructure for emitting relative CRCs
The function order_base_2() is defined (according to the comment block)
as returning zero on input zero, but subsequently passes the input into
roundup_pow_of_two(), which is explicitly undefined for input zero.
This has gone unnoticed until now, but optimization passes in GCC 7 may
produce constant folded function instances where a constant value of
zero is passed into order_base_2(), resulting in link errors against the
deliberately undefined '____ilog2_NaN'.
So update order_base_2() to adhere to its own documented interface.
[ See
http://marc.info/?l=linux-kernel&m=147672952517795&w=2
and follow-up discussion for more background. The gcc "optimization
pass" is really just broken, but now the GCC trunk problem seems to
have escaped out of just specially built daily images, so we need to
work around it in mainline. - Linus ]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Saving unsupported state prevents migration when the new host does not
support a XSAVE feature of the original host, even if the feature is not
exposed to the guest.
We've masked host features with guest-visible features before, with
4344ee981e ("KVM: x86: only copy XSAVE state for the supported
features") and dropped it when implementing XSAVES. Do it again.
Fixes: df1daba7d1 ("KVM: x86: support XSAVES usage in the host")
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The previous patch introduced a separate inline asm version of the
krcrctab declaration template for use with 64-bit architectures, which
cannot refer to ELF symbols using 32-bit quantities.
This declaration should be equivalent to the C one for 32-bit
architectures, but just in case - unify them in a separate patch, which
can simply be dropped if it turns out to break anything.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdb ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This add the kbuild infrastructure that will allow architectures to emit
vmlinux symbol CRCs as 32-bit offsets to another location in the kernel
where the actual value is stored. This works around problems with CRCs
being mistaken for relocatable symbols on kernels that self relocate at
runtime (i.e., powerpc with CONFIG_RELOCATABLE=y)
For the kbuild side of things, this comes down to the following:
- introducing a Kconfig symbol MODULE_REL_CRCS
- adding a -R switch to genksyms to instruct it to emit the CRC symbols
as references into the .rodata section
- making modpost distinguish such references from absolute CRC symbols
by the section index (SHN_ABS)
- making kallsyms disregard non-absolute symbols with a __crc_ prefix
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>