linux

Commit Graph

Author	SHA1	Message	Date
Qiujun Huang	57aa9f294b	USB: serial: io_edgeport: fix slab-out-of-bounds read in edge_interrupt_callback Fix slab-out-of-bounds read in the interrupt-URB completion handler. The boundary condition should be (length - 1) as we access data[position + 1]. Reported-and-tested-by: syzbot+37ba33391ad5f3935bbd@syzkaller.appspotmail.com Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org>	2020-03-26 10:22:15 +01:00
Simran Singhal	ecc11b42c7	staging: rtl8723bs: hal: Compress return logic Simplify function returns by merging assignment and return into one command line. Found with Coccinelle @@ local idexpression ret; expression e; @@ -ret = +return e; -return ret; Signed-off-by: Simran Singhal <singhalsimran0@gmail.com> Link: https://lore.kernel.org/r/20200325214312.GA1936@simran-Inspiron-5558 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-26 08:37:38 +01:00
Simran Singhal	1b590af9fa	staging: rtl8723bs: rtw_cmd: Compress lines for immediate return Compress two lines into a single line if immediate return statement is found. It also removes variable cmd_obj as it is no longer needed. It is done using script Coccinelle. And coccinelle uses following semantic patch for this compression function: @@ expression ret; identifier f; @@ -ret = +return f(...); -return ret; Signed-off-by: Simran Singhal <singhalsimran0@gmail.com> Link: https://lore.kernel.org/r/20200325212253.GA8175@simran-Inspiron-5558 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-26 08:37:38 +01:00
Simran Singhal	f465b0a4e5	staging: rtl8723bs: rtw_efuse: Compress lines for immediate return Compress two lines into a single line if immediate return statement is found. It is done using script Coccinelle. And coccinelle uses following semantic patch for this compression function: @@ expression ret; identifier f; @@ -ret = +return f(...); -return ret; Signed-off-by: Simran Singhal <singhalsimran0@gmail.com> Link: https://lore.kernel.org/r/20200325205418.GA29149@simran-Inspiron-5558 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-26 08:37:38 +01:00
Ajay Singh	bd864252cf	staging: wilc1000: remove label from examples in DT binding documentation Remove labels and not relevant property from DT binding documentation examples as suggested in [1]. 1. https://patchwork.ozlabs.org/patch/1252837 Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Link: https://lore.kernel.org/r/20200325164234.14146-1-ajay.kathat@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-03-26 08:37:38 +01:00
Leonard Crestez	8400ab8896	clk: imx: Align imx sc clock parent msg structs to 4 The imx SC api strongly assumes that messages are composed out of 4-bytes words but some of our message structs have odd sizeofs. This produces many oopses with CONFIG_KASAN=y. Fix by marking with __aligned(4). Fixes: `666aed2d13` ("clk: imx: scu: add set parent support") Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Link: https://lkml.kernel.org/r/aad021e432b3062c142973d09b766656eec18fde.1582216144.git.leonard.crestez@nxp.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2020-03-25 18:46:05 -07:00
Leonard Crestez	a0ae04a256	clk: imx: Align imx sc clock msg structs to 4 The imx SC api strongly assumes that messages are composed out of 4-bytes words but some of our message structs have odd sizeofs. This produces many oopses with CONFIG_KASAN=y. Fix by marking with __aligned(4). Fixes: `fe37b48204` ("clk: imx: add scu clock common part") Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Link: https://lkml.kernel.org/r/10e97a04980d933b2cfecb6b124bf9046b6e4f16.1582216144.git.leonard.crestez@nxp.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>	2020-03-25 18:45:59 -07:00
Stephen Boyd	4e93430120	clk: Pass correct arguments to __clk_hw_register_gate() I copy/pasted these macros and forgot to update the argument names and where they're passed to. Fix it so that these macros make sense. Reported-by: Maxime Ripard <maxime@cerno.tech> Fixes: `194efb6e26` ("clk: gate: Add support for specifying parents via DT/pointers") Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lkml.kernel.org/r/20200325022257.148244-1-sboyd@kernel.org Tested-by: Maxime Ripard <mripard@kernel.org>	2020-03-25 17:38:23 -07:00
Jens Axboe	01bb12fce7	Merge branch 'nvme-5.7-rc1' of git://git.infradead.org/nvme into for-5.7/drivers Pull 5.7 NVMe updates from Keith. * 'nvme-5.7-rc1' of git://git.infradead.org/nvme: (42 commits) nvme: cleanup namespace identifier reporting in nvme_init_ns_head nvme: rename __nvme_find_ns_head to nvme_find_ns_head nvme: refactor nvme_identify_ns_descs error handling nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl nvme: Fix controller creation races with teardown flow nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl nvme: Fix ctrl use-after-free during sysfs deletion nvme-pci: Re-order nvme_pci_free_ctrl nvme: Remove unused return code from nvme_delete_ctrl_sync nvme: Use nvme_state_terminal helper nvme: release ida resources nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO nvmet-tcp: optimize tcp stack TX when data digest is used nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow nvme-multipath: do not reset on unknown status nvmet-rdma: allocate RW ctxs according to mdts nvmet-rdma: Implement get_mdts controller op nvmet: Add get_mdts op for controllers nvme-pci: properly print controller address ...	2020-03-25 16:20:51 -06:00
Linus Torvalds	1b649e0bca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from David Miller: 1) Fix deadlock in bpf_send_signal() from Yonghong Song. 2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan. 3) Add missing locking in iwlwifi mvm code, from Avraham Stern. 4) Fix MSG_WAITALL handling in rxrpc, from David Howells. 5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong Wang. 6) Fix producer race condition in AF_PACKET, from Willem de Bruijn. 7) cls_route removes the wrong filter during change operations, from Cong Wang. 8) Reject unrecognized request flags in ethtool netlink code, from Michal Kubecek. 9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from Doug Berger. 10) Don't leak ct zone template in act_ct during replace, from Paul Blakey. 11) Fix flushing of offloaded netfilter flowtable flows, also from Paul Blakey. 12) Fix throughput drop during tx backpressure in cxgb4, from Rahul Lakkireddy. 13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric Dumazet. 14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well, also from Eric Dumazet. 15) Restrict macsec to ethernet devices, from Willem de Bruijn. 16) Fix reference leak in some ethtool _SET handlers, from Michal Kubecek. 17) Fix accidental disabling of MSI for some r8169 chips, from Heiner Kallweit. git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits) net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build net: ena: Add PCI shutdown handler to allow safe kexec selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED selftests/net: add missing tests to Makefile r8169: re-enable MSI on RTL8168c net: phy: mdio-bcm-unimac: Fix clock handling cxgb4/ptp: pass the sign of offset delta in FW CMD net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop net: cbs: Fix software cbs to consider packet sending time net/mlx5e: Do not recover from a non-fatal syndrome net/mlx5e: Fix ICOSQ recovery flow with Striding RQ net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset net/mlx5e: Enhance ICOSQ WQE info fields net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure selftests: netfilter: add nfqueue test case netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress netfilter: nft_fwd_netdev: validate family and chain type netfilter: nft_set_rbtree: Detect partial overlaps on insertion netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start() netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion ...	2020-03-25 13:58:05 -07:00
Linus Torvalds	1dfb642b10	GPIO fixes for the v5.6 series: - One core quirk by myself to fix the .irq_disable() semantics when the gpiolib core takes over this callback. - The rest is an elaborate series of 4 patches fixing Intel laptop ACPI wakeup quirks. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl57qSIACgkQQRCzN7AZ XXMmqxAAiggbsjiMkufxHxDqKl0tLRkbQwlsnz5VQlH0XCPBYSRidbIfQdiDFTbD yoBENIYeW/9aVlb0Q7a2T5H4zy1Ye3MiqObZt97ECr7DyxwNrAbPS/IgOFmsjSmu C4HZgX6/fQpBlJnKVgMBwzdgKShX4Fb4l1gFnY7uRRjBGVkp5ZArqJALJ747zFR0 yj/kpA+hGgO24AJspF+QaLFfKkjMXKv2zgaa3HEtsYqUBH/3yBZigLKJs3koXHYB 3c7EbWxnpYLiPi/T51aN7ZbbNX7Ag6wW6rLm/6460MCYDRSOB+7fCJCZc2KZJomr 4ntq1HHxSrjOiqce+MSvc+WLI6pqjN73tyWJwhFgBfwr32Ara5FR6CO0KJT9M5d+ 0i8Na0Dyp2LlZM530ygOOn48p+tiU6NVoR/BXUnXBBuYtFsOmfTtrQGh++Uw+rcJ wraFAsSzzr7JmXzh+tUEeGGdI4EYy1T6+DppEfSbyty9W6YnWvqHP9xcyuicUyi2 g9Btl0hRhSqQZozDYjROPQox1qjUKHLnI66n/W9pVUN61GlFlSH/38yRAEpD0DZW GlzSEkfL5SN5DQjaAOC2z0z+kfFSGdtCYMOKkyl+cKGbLCRc43PrzHB3oH+wF1Mu 0J+Hilrk9y4sGTX67jIQPV/t1qhoorRFzihuUkHBdGt//XCqtI4= =+dEG -----END PGP SIGNATURE----- Merge tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: - One core quirk by myself to fix the .irq_disable() semantics when the gpiolib core takes over this callback. - The rest is an elaborate series of four patches fixing Intel laptop ACPI wakeup quirks. * tag 'gpio-v5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 CHT + AXP288 model gpiolib: acpi: Add quirk to ignore EC wakeups on HP x2 10 BYT + AXP288 model gpiolib: acpi: Rework honor_wakeup option into an ignore_wake option gpiolib: acpi: Correct comment for HP x2 10 honor_wakeup quirk gpiolib: Fix irq_disable() semantics	2020-03-25 13:52:36 -07:00
David S. Miller	2910594fd3	wireless-drivers fixes for v5.6 Fourth, and last, set of fixes for v5.6. Just two important fixes to iwlwifi regressions. iwlwifi * fix GEO_TX_POWER_LIMIT command on certain devices which caused firmware to crash during initialisation * add back device ids for three devices which were accidentally removed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJee7S6AAoJEG4XJFUm622bcH4IAICV95Y79p9loKZVwXBrguiK 5D9rBNZcSc0Yo9AwZkhatRAJYiLu19hw2qra3YsQnaWpjESmnZtV6/ZASDcpOGSP NaI4TMLt57dhnpqhpglvGeM9PlUbaoqX7Sl5LbhnSQoKkk7rppnk90KqbXl7SDba 4eG7+i4iMc69uw4BttVh/lfgAH0YyYpgStWi9ccoQ2ip9SrljFZwmCF7q+3/25OR Hyty44U90BBrsi+LrRuhutR2LuR+TfXEWmtXWRzOpnwmicY8sCpeEVCMeeVy59TR y66sjlhM5w+N1mZbzBrqUZJLdMGInljx4G4DIvvi8jbD6rVLZc8zAeZMLjO24ZM= =CFbE -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.6 Fourth, and last, set of fixes for v5.6. Just two important fixes to iwlwifi regressions. iwlwifi * fix GEO_TX_POWER_LIMIT command on certain devices which caused firmware to crash during initialisation * add back device ids for three devices which were accidentally removed ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 13:12:26 -07:00
Christoph Hellwig	43fcd9e1ea	nvme: cleanup namespace identifier reporting in nvme_init_ns_head Lift the common namespace identifier reporting between the shared namespace and new nshead cases into common code. This also means one less lock is held while doing I/O. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Christoph Hellwig	026d2ef752	nvme: rename __nvme_find_ns_head to nvme_find_ns_head There is no non __-prefixed version, so make the name a little more readable. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Christoph Hellwig	fb314eb0cb	nvme: refactor nvme_identify_ns_descs error handling Move the handling of an error into the function from the caller, and only do it for an actual error on the admin command itself, not the command parsing, as that should be enough to deal with devices claiming a bogus version compliance. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	bea54ef53f	nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl The transition to LIVE state should not fail in case of a new controller. Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the resources may leads to NULL dereference at teardown flow (e.g., IO tagset, admin_q, connect_q). Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	96135862df	nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl The transition to LIVE state should not fail in case of a new controller. Moving to DELETING state before nvme_tcp_create_ctrl() allocates all the resources may leads to NULL dereference at teardown flow (e.g., IO tagset, admin_q, connect_q). Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	ce1518139e	nvme: Fix controller creation races with teardown flow Calling nvme_sysfs_delete() when the controller is in the middle of creation may cause several bugs. If the controller is in NEW state we remove delete_controller file and don't delete the controller. The user will not be able to use nvme disconnect command on that controller again, although the controller may be active. Other bugs may happen if the controller is in the middle of create_ctrl callback and nvme_do_delete_ctrl() starts. For example, freeing I/O tagset at nvme_do_delete_ctrl() before it was allocated at create_ctrl callback. To fix all those races don't allow the user to delete the controller before it was fully created. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	726612b6b8	nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl Put the ctrl reference count at nvme_uninit_ctrl as opposed to nvme_init_ctrl which takes it. This decrease the reference count at the core layer instead of decreasing it on each transport separately. Also move the call of nvme_uninit_ctrl at PCI driver after calling to nvme_release_prp_pools and nvme_dev_unmap, in order to put the reference count after using the dev. This is safe because those functions use nvme_dev which is freed only later at nvme_pci_free_ctrl. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	b780d7415a	nvme: Fix ctrl use-after-free during sysfs deletion In case nvme_sysfs_delete() is called by the user before taking the ctrl reference count, the ctrl may be freed during the creation and cause the bug. Take the reference as soon as the controller is externally visible, which is done by cdev_device_add() in nvme_init_ctrl(). Also take the reference count at the core layer instead of taking it on each transport separately. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:56 +09:00
Israel Rukshin	253fd4ac80	nvme-pci: Re-order nvme_pci_free_ctrl Destroy the resources in the same order like in nvme_probe error flow to improve code readability. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Israel Rukshin	6721c18a06	nvme: Remove unused return code from nvme_delete_ctrl_sync The return code of nvme_delete_ctrl_sync is never used, so change it to void. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Israel Rukshin	e7c43feae2	nvme: Use nvme_state_terminal helper Improve code readability. Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Max Gurtovoy	f41cfd5d0a	nvme: release ida resources ida instances allocate some internal memory in addition to the base 'struct ida'. Use ida_destroy() to release that memory at module_exit(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
masahiro31.yamada@kioxia.com	c225b61031	nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO Currently 32 bit application gets ENOTTY when it calls compat_ioctl with NVME_IOCTL_SUBMIT_IO in 64 bit kernel. The cause is that the results of sizeof(struct nvme_user_io), which is used to define NVME_IOCTL_SUBMIT_IO, are not same between 32 bit compiler and 64 bit compiler. * 32 bit: the result of sizeof nvme_user_io is 44. * 64 bit: the result of sizeof nvme_user_io is 48. 64 bit compiler seems to add 32 bit padding for multiple of 8 bytes. This patch adds a compat_ioctl handler. The handler replaces NVME_IOCTL_SUBMIT_IO32 with NVME_IOCTL_SUBMIT_IO in case 32 bit application calls compat_ioctl for submit in 64 bit kernel. Then, it calls nvme_ioctl as usual. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Masahiro Yamada (KIOXIA) <masahiro31.yamada@kioxia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Sagi Grimberg	e90d172b11	nvmet-tcp: optimize tcp stack TX when data digest is used If we have a 4-byte data digest to send to the wire, but we have more data to send, set MSG_MORE to tell the stack that more is coming. Reviewed-by: Mark Wunderlich <mark.wunderlich@intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Takashi Iwai	8d8a50e20d	nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
John Meneghini	764e933209	nvme-multipath: do not reset on unknown status The nvme multipath error handling defaults to controller reset if the error is unknown. There are, however, no existing nvme status codes that indicate a reset should be used, and resetting causes unnecessary disruption to the rest of IO. Change nvme's error handling to first check if failover should happen. If not, let the normal error handling take over rather than reset the controller. Based-on-a-patch-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: John Meneghini <johnm@netapp.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:55 +09:00
Max Gurtovoy	c363f249e7	nvmet-rdma: allocate RW ctxs according to mdts Current nvmet-rdma code allocates MR pool budget based on queue size, assuming both host and target use the same "max_pages_per_mr" count. After limiting the mdts value for RDMA controllers, we know the factor of maximum MR's per IO operation. Thus, make sure MR pool will be sufficient for the required IO depth and IO size. That is, say host's SQ size is 100, then the MR pool budget allocated currently at target will also be 100 MRs. But 100 IO WRITE Requests with 256 sg_count(IO size above 1MB) require 200 MRs when target's "max_pages_per_mr" is 128. Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>	2020-03-26 04:51:55 +09:00
Max Gurtovoy	ec6d20e16c	nvmet-rdma: Implement get_mdts controller op Set the maximal data transfer size to be 1MB (currently mdts is unlimited). This will allow calculating the amount of MR's that one ctrl should allocate to fulfill it's capabilities. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>	2020-03-26 04:51:55 +09:00
Max Gurtovoy	02cb00e233	nvmet: Add get_mdts op for controllers Some transports, such as RDMA, would like to set the Maximum Data Transfer Size (MDTS) according to device/port/ctrl characteristics. This will enable the transport to set the optimal MDTS according to controller needs and device capabilities. Add a new nvmet transport op that is called during ctrl identification. This will not effect transports that don't implement this option. The return value of the new op is according to the NVMe spec definition for MDTS. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com>	2020-03-26 04:51:54 +09:00
Max Gurtovoy	2db24e4a22	nvme-pci: properly print controller address Align PCI address print with fabrics address that is printed with newline character. Before: [root@server40 linux]# cat /sys/class/nvme/nvme2/address 0000:0b:00.0[root@server40 linux]# After: [root@server40 linux]# cat /sys/class/nvme/nvme2/address 0000:0b:00.0 [root@server40 linux]# Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>	2020-03-26 04:51:54 +09:00
Sagi Grimberg	761ad26c45	nvme-tcp: break from io_work loop if recv failed If we failed to receive data from the socket, don't try to further process it, we will for sure be handling a queue error at this point. While no issue was seen with the current behavior thus far, its safer to cease socket processing if we detected an error. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:51:45 +09:00
Sagi Grimberg	5ff4e11264	nvme-tcp: move send failure to nvme_tcp_try_send Consolidate the request failure handling code to where it is being fetched (nvme_tcp_try_send). Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:50:21 +09:00
Sagi Grimberg	9cda34e374	nvmet-tcp: fix maxh2cdata icresp parameter MAXH2CDATA is not zero based. Also no reason to limit ourselves to 1M transfers as we can do more easily. Make this an arbitrary limit of 16M. Reported-by: Wenhua Liu <liuw@vmware.com> Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:49:35 +09:00
Sagi Grimberg	40510a639e	nvme-tcp: optimize queue io_cpu assignment for multiple queue maps Currently, queue io_cpu assignment is done sequentially for default, read and poll queues based on queue id. This causes miss-alignment between context of CPU initiating I/O and the I/O worker thread processing queued requests or completions. Change to modify queue io_cpu assignment to take into account queue maps offset. Each queue io_cpu will start at zero for each queue map. This essentially aligns read/poll queues to start over the same range as default queues. Testing performed by Mark with: - ram device (nvmet) - single CPU core (pinned) - 100% 4k reads - engine io_uring (not using sq_thread option) - hipri flag set Micro-benchmark results show a net gain of: - increase of 18%-29% in IOPs - reduction of 16%-22% in average latency - reduction of 7%-23% in 99.99% latency Baseline: ======== QDepth/Batch \| IOPs [k] \| Avg. Lat [us] \| 99.99% Lat [us] ----------------------------------------------------------------- 1/1 \| 32.4 \| 30.11 \| 50.94 32/8 \| 179 \| 168.20 \| 371 CPU alignment: ============= QDepth/Batch \| IOPs [k] \| Avg. Lat [us] \| 99.99% Lat [us] ----------------------------------------------------------------- 1/1 \| 38.5 \| 25.18 \| 39.16 32/8 \| 231 \| 130.75 \| 343 Reported-by: Mark Wunderlich <mark.wunderlich@intel.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:48:06 +09:00
Keith Busch	fa059b856a	nvme-pci: Simplify nvme_poll_irqdisable The timeout handler can use the existing nvme_poll() if it needs to check a polled queue, allowing nvme_poll_irqdisable() to handle only irq driven queues for the remaining callers. Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:48:06 +09:00
Keith Busch	324b494c28	nvme-pci: Remove two-pass completions Completion handling had been done in two steps: find all new completions under a lock, then handle those completions outside the lock. This was done to make the locked section as short as possible so that other threads using the same lock wait less time. The driver no longer shares locks during completion, and is in fact lockless for interrupt driven queues, so the optimization no longer serves its original purpose. Replace the two-pass completion queue handler with a single pass that completes entries immediately. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:48:06 +09:00
Keith Busch	bf392a5dc0	nvme-pci: Remove tag from process cq The only user for tagged completion was for timeout handling. That user, though, really only cares if the timed out command is completed, which we can safely check within the timeout handler. Remove the tag check to simplify completion handling. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:48:06 +09:00
Alexey Dobriyan	e2a366a4b0	nvme-pci: slimmer CQ head update Update CQ head with pre-increment operator. This saves subtraction of 1 and a few registers. Also update phase with "^= 1". This generates only one RMW instruction. ffffffff815ba150 <nvme_update_cq_head>: ffffffff815ba150: 0f b7 47 70 movzx eax,WORD PTR [rdi+0x70] ffffffff815ba154: 83 c0 01 add eax,0x1 ffffffff815ba157: 66 89 47 70 mov WORD PTR [rdi+0x70],ax ffffffff815ba15b: 66 3b 47 68 cmp ax,WORD PTR [rdi+0x68] ffffffff815ba15f: 74 01 je ffffffff815ba162 <nvme_update_cq_head+0x12> ffffffff815ba161: c3 ret ffffffff815ba162: 31 c0 xor eax,eax ffffffff815ba164: 80 77 74 01 ===> xor BYTE PTR [rdi+0x74],0x1 ffffffff815ba168: 66 89 47 70 mov WORD PTR [rdi+0x70],ax ffffffff815ba16c: c3 ret add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-119 (-119) Function old new delta nvme_poll 690 678 -12 nvme_dev_disable 1230 1177 -53 nvme_irq 613 559 -54 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2020-03-26 04:48:06 +09:00
Amit Engel	6d525f9755	nvmet: check ncqr & nsqr for set-features cmd For set feature command when setting up NVME_FEAT_NUM_QUEUES, check Number of I/O Completion Queues Requested (NCQR) and Number of I/O Submission Queues Requested (NSQR) before we proceed, for invalid values (i.e. 65535) return an appropriate NVMe invalid field status. Signed-off-by: Amit Engel <Amit.Engel@dell.com> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:47:03 +09:00
Josh Triplett	3e98c2443f	nvme: Check for readiness more quickly, to speed up boot time After initialization, nvme_wait_ready checks for readiness every 100ms, even though the drive may be ready far sooner than that. This delays system boot by hundreds of milliseconds. Reduce the delay, checking for readiness every millisecond instead. Boot-time tests on an AWS c5.12xlarge: Before: [ 0.546936] initcall nvme_init+0x0/0x5b returned 0 after 37 usecs ... [ 0.764178] nvme nvme0: 2/0/0 default/read/poll queues [ 0.768424] nvme0n1: p1 [ 0.774132] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null) [ 0.774146] VFS: Mounted root (ext4 filesystem) on device 259:1. ... [ 0.788141] Run /sbin/init as init process After: [ 0.537088] initcall nvme_init+0x0/0x5b returned 0 after 37 usecs ... [ 0.543457] nvme nvme0: 2/0/0 default/read/poll queues [ 0.548473] nvme0n1: p1 [ 0.554339] EXT4-fs (nvme0n1p1): mounted filesystem with ordered data mode. Opts: (null) [ 0.554344] VFS: Mounted root (ext4 filesystem) on device 259:1. ... [ 0.567931] Run /sbin/init as init process Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:47:03 +09:00
Rupesh Girase	94d2e705b6	nvme: log additional message for controller status Log the controller status to know more about issue if it lies within kernel nvme subsytem or controller is unhealthy. Signed-off-by: Rupesh Girase <rgirase@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulakrni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:47:03 +09:00
Chaitanya Kulkarni	ad95a613ea	nvme: code cleanup nvme_identify_ns_desc() The function nvme_identify_ns_desc() has 3 levels of nesting which make error message to exceeded > 80 char per line which is not aligned with the kernel code standards and rest of the NVMe subsystem code. Add a helper function to move the processing of the log when the command is successful by reducing the nesting and keeping the code < 80 char per line. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:45:25 +09:00
Jean Delvare	228914504c	nvme: Don't deter users from enabling hwmon support I see no good reason for the "If unsure, say N" advice in the description of the NVME_HWMON configuration option. It is not dangerous, it does not select any other option, and has a fairly low overhead. As the option is already not enabled by default, further suggesting hesitant users to not enable it is not useful anyway. Unlike some other options where the description alone may not be sufficient for users to make a decision, NVME_HWMON is pretty simple to grasp in my opinion, so just let the user do what they want. Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:45:25 +09:00
Sagi Grimberg	45fb19f766	nvme: expose hostid via sysfs for fabrics controllers We allow userspace to connect with a custom hostid which is useful for certain use-cases. However there is is no way to tell what is the hostid used to connect to a given controller. Expose this so userspace can correlate controllers based on hostid. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:45:12 +09:00
Sagi Grimberg	76171c6cdf	nvme: expose hostnqn via sysfs for fabrics controllers We allow userspace to connect with a custom hostnqn which is useful for certain use-cases. However there is no way to tell what is the hostnqn used to connect to a given controller. Expose this so userspace can correlate controllers based on hostnqn. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>	2020-03-26 04:44:35 +09:00
Pablo Neira Ayuso	2c64605b59	net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build net/netfilter/nft_fwd_netdev.c: In function ‘nft_fwd_netdev_eval’: net/netfilter/nft_fwd_netdev.c:32:10: error: ‘struct sk_buff’ has no member named ‘tc_redirected’ pkt->skb->tc_redirected = 1; ^~ net/netfilter/nft_fwd_netdev.c:33:10: error: ‘struct sk_buff’ has no member named ‘tc_from_ingress’ pkt->skb->tc_from_ingress = 1; ^~ To avoid a direct dependency with tc actions from netfilter, wrap the redirect bits around CONFIG_NET_REDIRECT and move helpers to include/linux/skbuff.h. Turn on this toggle from the ifb driver, the only existing client of these bits in the tree. This patch adds skb_set_redirected() that sets on the redirected bit on the skbuff, it specifies if the packet was redirect from ingress and resets the timestamp (timestamp reset was originally missing in the netfilter bugfix). Fixes: `bcfabee1af` ("netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress") Reported-by: noreply@ellerman.id.au Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 12:24:33 -07:00
Guilherme G. Piccoli	428c491332	net: ena: Add PCI shutdown handler to allow safe kexec Currently ENA only provides the PCI remove() handler, used during rmmod for example. This is not called on shutdown/kexec path; we are potentially creating a failure scenario on kexec: (a) Kexec is triggered, no shutdown() / remove() handler is called for ENA; instead pci_device_shutdown() clears the master bit of the PCI device, stopping all DMA transactions; (b) Kexec reboot happens and the device gets enabled again, likely having its FW with that DMA transaction buffered; then it may trigger the (now invalid) memory operation in the new kernel, corrupting kernel memory area. This patch aims to prevent this, by implementing a shutdown() handler quite similar to the remove() one - the difference being the handling of the netdev, which is unregistered on remove(), but following the convention observed in other drivers, it's only detached on shutdown(). This prevents an odd issue in AWS Nitro instances, in which after the 2nd kexec the next one will fail with an initrd corruption, caused by a wild DMA write to invalid kernel memory. The lspci output for the adapter present in my instance is: 00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network Adapter (ENA) [1d0f:ec20] Suggested-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> Acked-by: Sameeh Jubran <sameehj@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 12:03:29 -07:00
Hangbin Liu	c085dbfb1c	selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED The lib files should not be defined as TEST_PROGS, or we will run them in run_kselftest.sh. Also remove ethtool_lib.sh exec permission. Fixes: `81573b18f2` ("selftests/net/forwarding: add Makefile to install tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-25 12:01:18 -07:00

... 3 4 5 6 7 ...

904926 Commits All Branches Search

904926 Commits

All Branches