linux

Commit Graph

Author	SHA1	Message	Date
Adrian Hunter	1b6599a9d8	perf intel-pt: Fix sample timestamp wrt non-taken branches The sample timestamp is updated to ensure that the timestamp represents the time of the sample and not a branch that the decoder is still walking towards. The sample timestamp is updated when the decoder returns, but the decoder does not return for non-taken branches. Update the sample timestamp then also. Note that commit `3f04d98e97` ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: `3f04d98e97` ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:24 -03:00
Adrian Hunter	61b6e08dc8	perf intel-pt: Fix improved sample timestamp The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. The intel_pt_sample_time() function decides which is which, but was not handling TNT packets exactly correctly. In the case of TNT, the timestamp applies to the first branch, so the decoder must first walk to that branch. That means intel_pt_sample_time() should return true for TNT, and this patch makes that change. However, if the first branch is a non-taken branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false for subsequent taken branches in the same TNT packet. To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to distinguish the cases. Note that commit `3f04d98e97` ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: `3f04d98e97` ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Adrian Hunter	7ba8fa20e2	perf intel-pt: Fix instructions sampling rate The timestamp used to determine if an instruction sample is made, is an estimate based on the number of instructions since the last known timestamp. A consequence is that it might go backwards, which results in extra samples. Change it so that a sample is only made when the timestamp goes forwards. Note this does not affect a sampling period of 0 or sampling periods specified as a count of instructions. Example: Before: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 10 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 8 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 6 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 4 instructions:u: 7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16423 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222734: 12731 instructions:u: 7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so) ... After: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16479 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: `f4aa081949` ("perf tools: Add Intel PT decoder") Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Kan Liang	6466ec14aa	perf regs x86: Add X86 specific arch__intr_reg_mask() XMM registers can be collected on Icelake and later platforms. Add specific arch__intr_reg_mask(), which creating an event to check if the kernel and hardware can collect XMM registers. Test on Skylake which doesn't support XMM registers collection. There is nothing changed. #perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names #perf record -I [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ] #perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_intr: 0xff0fff Test on Icelake which support XMM registers collection. #perf record -I? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15 Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names #perf record -I [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ] #perf evlist -v cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|REGS_INTR, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff Committer notes: Don't set attr.sample_period as a named struct init, as it is part of an unnamed union in 'struct perf_event_attr', and doing so breaks the build on older gcc versions, such as: gcc version 4.1.2 20080704 (Red Hat 4.1.2-55) gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC) arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask': arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer cc1: warnings being treated as errors arch/x86/util/perf_regs.c:279: warning: missing braces around initializer arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>') Signed-off-by: Kan Liang <kan.liang@linux.intel.com> [ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ] Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:23 -03:00
Kan Liang	af785e75bf	perf parse-regs: Add generic support for arch__intr/user_reg_mask() There may be different register mask for use with intr or user on some platforms, e.g. Icelake. Add weak functions arch__intr_reg_mask() and arch__user_reg_mask() to return intr and user register mask respectively. Check mask before printing or comparing the register name. Generic code always return PERF_REGS_MASK. No functional change. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1557865174-56264-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-05-16 14:17:12 -03:00
David S. Miller	2407a88a13	Merge branch 'rhashtable-Fix-sparse-warnings' Herbert Xu says: ==================== rhashtable: Fix sparse warnings This patch series fixes all the sparse warnings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-16 09:45:20 -07:00
Herbert Xu	e9458a4e33	rhashtable: Fix cmpxchg RCU warnings As cmpxchg is a non-RCU mechanism it will cause sparse warnings when we use it for RCU. This patch adds explicit casts to silence those warnings. This should probably be moved to RCU itself in future. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-16 09:45:20 -07:00
Herbert Xu	ba6306e3f6	rhashtable: Remove RCU marking from rhash_lock_head The opaque type rhash_lock_head should not be marked with __rcu because it can never be dereferenced. We should apply the RCU marking when we turn it into a pointer which can be dereferenced. This patch does exactly that. This fixes a number of sparse warnings as well as getting rid of some unnecessary RCU checking. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-16 09:45:20 -07:00
Linus Torvalds	8c05f3b965	ARM development updates: - more unified assembly conversions for clang - drop obsolete -mauto-it assembler option - remove arm_memory_present in preference to the generic version - remove unused asm/limits.h header - vdso linker update We tried to make the assembler warn if unified syntax was not used, but unfortunately older versions of GCC warn, so the commit had to be reverted. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIVAwUAXNyRJfTnkBvkraxkAQLy9w/+IZ17YCke7L1Lrk/cbV72RbMPEK4LOJD/ H2JP7GNJPa7JXsAohOELtYav6xsOxERtDkHmplcsRkl519kEfKU7wgjFNGoVUHpk s+urcDIkijdNMXHvyBY4pSy54b8OmUAdeGnpgkMfQ+2gmKSTd9iXXOFfAvi8qDGI F13xXCL3XbbC/9tPmSFS04jmseaDIE5qW9GhrQ3n4qmwx5tx3e6IuWZ1G2UPqAly vjIuRKR2PrGW2ORoPklJ1VPjPjV5eaxP9+YPhCMMIH5MtCOM8K+me0ldbZvrSqu1 b1ovjuWrLdFXBVRUI3yRM3aAM83ZuV3jLWF7+2QtDZCAMMND9FnCOTq4cvHW0Hno HVkYmUFJA5nV4Z62K9buR65gA3gFTmbDBO3NupoXw1KdIUONv/vHt9dA1E6tA4iE oaRK60Uoi2a2FzFp+jsDDNGKJP9eMJS3oDmgVa2UOI0L+oyPsZXM95g7+CJ2Yr5G +oLUfDhhWyrYVTjXd6ZZSCqbQgrbrSslxN0nSjcFb/tO7K5zcAWGWRW/KDxjVlvO ElJXgxNa6RvTT6Lt+zDLEFRxWiBfADQYSoQYXrBLSuEG85/7HJg/9gTbbHFEGfBw IqHvBut/wwArNBkkz2OFjyn0z3vrIr5PwdXs2K49ywlfHHvDxGSEovvYv7kFTuTR hVBSmogtt7U= =719N -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM updates from Russell King: "ARM development updates: - more unified assembly conversions for clang - drop obsolete -mauto-it assembler option - remove arm_memory_present in preference to the generic version - remove unused asm/limits.h header - vdso linker update We tried to make the assembler warn if unified syntax was not used, but unfortunately older versions of GCC warn, so the commit had to be reverted" * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: Revert "ARM: 8846/1: warn if divided syntax assembler is used" ARM: 8858/1: vdso: use $(LD) instead of $(CC) to link VDSO ARM: 8855/1: remove unused <asm/limits.h> ARM: 8850/1: use memblocks_present ARM: 8854/1: drop -mauto-it ARM: 8846/1: warn if divided syntax assembler is used ARM: 8853/1: drop WASM to work around LLVM issue ARM: 8852/1: uaccess: use unified assembler language syntax ARM: 8851/1: add TUSERCOND() macro for conditional postfix	2019-05-16 09:41:54 -07:00
Linus Torvalds	ab02888e39	ARM: SoC defconfig updates - Mostly the usual churn due to options being reordered or not added in the right locations. - Some various enabling of new drivers, etc. ... i.e. the usual updates, nothing particularly sticks out. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlzc/AEPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3+AIQAIuxZAxekCCrUqxmxNMLLSOEP396YH3eRMp4 GXnHrQWwwAbjxwvuYlX62RBfYvEfXqwDlq3TnUMuEcIqnEzendF8NxXJLi8tw7J2 gg/O6Hi6PMdl0MkmHJscmXGR+grvofANpOHhoxNg5opC47lIVzTfMs4Lb/M4g01h tlDrkkQq8/ybUeMsSa/WT2o1Jt1j3fR6kzoTh7SduG+OaYIojSlgnkdcFGRfUfJr CVb1pXu5lZG32sQRt61WfbtjiztZGrEX/TxOOZqyVo1zgtDxdIRqr/Y5i81icQv/ vSwPJiGVzZ2+gMRqREJJY1Ckn4vcGpz7ZN++gtGCFMRaVob5AW33zUDhlkPt7ynv uKVuasNfrN/xPdFdqhMo21vXcBV1cOSDLWachAYWaLzzOvMyREYKiRaq377yQMQ5 TQFIRxFHY1vfOnssNEOCu2Og5XEJ3fH9tN0o5Pi3qrpxWibBlp0biGrS7Q8AyjYa wSHsaQFC3Z9gNGdcO7smm361HGXBR8KW1ZE0iC8birezhI3Wb16hLzAN0jvXs4Sg lF6ruJYrLqeaclKeY+vGw/c9x5knw1wpdN43E7+b0Olh7qSDvZBjfD+hfREpQHJP DL2PUtI3MjWCzjWVUeO+kc6KdYCzspBQ/lnn9yJKF0otbWxz1rLyTNEJ3YmZbbS6 OunyMN/N =IKfA -----END PGP SIGNATURE----- Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC defconfig updates from Olof Johansson: "Mostly the usual churn due to options being reordered or not added in the right locations. Some various enabling of new drivers, etc. ... i.e. the usual updates, nothing particularly sticks out" * tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (23 commits) arm64: defconfig: Update UFSHCD for Hi3660 soc ARM: multi_v7_defconfig: Enable support for STPMIC1 ARM: multi_v7_defconfig: Enable missing drivers for supported Chromebooks arm64: defconfig: enable mv-xor driver ARM: Enable Trusted Foundations for multiplatform ARM v7 ARM: tegra: Enable Trusted Foundations by default ARM: tegra: Update default configuration for v5.1-rc1 ARM: multi_v7_defconfig: Update for moved options ARM: multi_v7_defconfig: Update for dropped options ARM: shmobile: Enable USB [EO]HCI HCD PLATFORM support in shmobile_defconfig ARM: shmobile: Enable PHY_RCAR_GEN3_USB2 in shmobile_defconfig ARM: qcom_defconfig: add options for LG Nexus 5 phone arm64: defconfig: include the Agilex platform to the arm64 defconfig arm64: defconfig: Add PWM Fan support arm64: defconfig: Enable Tegra HDA support ARM: multi_v7_defconfig: Enable support for CFI NOR FLASH ARM: shmobile: defconfig: Enable support for CFI NOR FLASH ARM: shmobile: defconfig: Refresh for v5.1-rc1 ARM: multi_v7_defconfig: enable the Amlogic Meson ADC and eFuse drivers arm64: defconfig: enable fpga and service layer ...	2019-05-16 09:35:26 -07:00
David Howells	d8076bdb56	uapi: Wire up the mount API syscalls on non-x86 arches [ver #2 ] Wire up the mount API syscalls on non-x86 arches. Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-05-16 12:23:45 -04:00
David Howells	9c8ad7a2ff	uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2 ] Fix the syscall numbering of the mount API syscalls so that the numbers match between i386 and x86_64 and that they're in the common numbering scheme space. Fixes: `a07b200047` ("vfs: syscall: Add open_tree(2) to reference or clone a mount") Fixes: `2db154b3ea` ("vfs: syscall: Add move_mount(2) to move mounts around") Fixes: `24dcb3d90a` ("vfs: syscall: Add fsopen() to prepare for superblock creation") Fixes: `ecdab150fd` ("vfs: syscall: Add fsconfig() for configuring and managing a context") Fixes: `93766fbd26` ("vfs: syscall: Add fsmount() to create a mount for a superblock") Fixes: `cf3cba4a42` ("vfs: syscall: Add fspick() to select a superblock for reconfiguration") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-05-16 12:23:45 -04:00
Christian Brauner	1cdc415f10	uapi, fsopen: use square brackets around "fscontext" [ver #2 ] Make the name of the anon inode fd "[fscontext]" instead of "fscontext". This is minor but most core-kernel anon inode fds already carry square brackets around their name: [eventfd] [eventpoll] [fanotify] [io_uring] [pidfd] [signalfd] [timerfd] [userfaultfd] For the sake of consistency lets do the same for the fscontext anon inode fd that comes with the new mount api. Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-05-16 12:23:45 -04:00
Linus Torvalds	dc413a90ed	ARM: SoC-related driver updates Various driver updates for platforms and a couple of the small driver subsystems we merge through our tree: Among the larger pieces: - Power management improvements for TI am335x and am437x (RTC suspend/wake) - Misc new additions for Amlogic (socinfo updates) - ZynqMP FPGA manager - Nvidia improvements for reset/powergate handling - PMIC wrapper for Mediatek MT8516 - Misc fixes/improvements for ARM SCMI, TEE, NXP i.MX SCU drivers -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlzc+9QPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3o3sQAIJ2SZnITy/ycvkbhKe+V/806P+aoqMpbZDw 7ldBQFoIMQqVIoeSSeml+9B86ZGyK4CGTgnvsfAI/Zt2fZSHczjqLP5InbEnvB5M 4naf0nSjSlkb5F4p24wXQ7WTI8IO45SwqG4hCi/WW6MakxN21cwdMWHBn+TRZWQu +AlJdwyDFJoMRXcq8xvLHOBNVAqD3LyvlECbLKqn3+UPwwYw0Ti1dsLwaMLOYDbc o/1dC2O8111kg2DgO0OM4Tl7jdbpmGA5MeixbVnmu3t4b2s26trG33eXqK2yWqaV XigD85R74GAq/wmgnzjdiNaIgZjlPPitVYaTE4L6Od39zMgXemnsqMlh/byPeO2y JvRRLEIciNay9q9uq+8H2zRWwa2wLqAewjssTTMM0RJNQWUtonVCkD8DAx4GLDof 6Ej42XGbtxnqpf0g854mBJ4zaPfZLN4xK//1Llx9HkM8mhLZLJ7BQvgvW1JzniSa XKnmjqK7SySiJ4bbjn+aFk5EkX7Oh5aXno18tVNKXdxc8nWoEw4PHMUmCCHOFPye /1oxc95Ux8P/lV+B0ZjiI0yTAX/IpDkEszAYmgdy6pWh1hXnYUr/Rpm7cGUG8kzk SbtyB8JOI/DFQ7QMDfPp6e6bcB8zTbUuF9H2MXwPN5TqGzP/mya88DC5Iv1jY4jc 0oWv/uhj =YSfu -----END PGP SIGNATURE----- Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC-related driver updates from Olof Johansson: "Various driver updates for platforms and a couple of the small driver subsystems we merge through our tree: Among the larger pieces: - Power management improvements for TI am335x and am437x (RTC suspend/wake) - Misc new additions for Amlogic (socinfo updates) - ZynqMP FPGA manager - Nvidia improvements for reset/powergate handling - PMIC wrapper for Mediatek MT8516 - Misc fixes/improvements for ARM SCMI, TEE, NXP i.MX SCU drivers" * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (57 commits) soc: aspeed: fix Kconfig soc: add aspeed folder and misc drivers spi: zynqmp: Fix build break soc: imx: Add generic i.MX8 SoC driver MAINTAINERS: Update email for Qualcomm SoC maintainer memory: tegra: Fix a typos for "fdcdwr2" mc client Revert "ARM: tegra: Restore memory arbitration on resume from LP1 on Tegra30+" memory: tegra: Replace readl-writel with mc_readl-mc_writel memory: tegra: Fix integer overflow on tick value calculation memory: tegra: Fix missed registers values latching ARM: tegra: cpuidle: Handle tick broadcasting within cpuidle core on Tegra20/30 optee: allow to work without static shared memory soc/tegra: pmc: Move powergate initialisation to probe soc/tegra: pmc: Remove reset sysfs entries on error soc/tegra: pmc: Fix reset sources and levels soc: amlogic: meson-gx-pwrc-vpu: Add support for G12A soc: amlogic: meson-gx-pwrc-vpu: Fix power on/off register bitmask fpga manager: Adding FPGA Manager support for Xilinx zynqmp dt-bindings: fpga: Add bindings for ZynqMP fpga driver firmware: xilinx: Add fpga API's ...	2019-05-16 09:19:14 -07:00
Linus Torvalds	e8a1d70117	ARM: Device-tree updates Besides new bindings and additional descriptions of hardware blocks for various SoCs and boards, the main new contents here is: SoCs: - Intel Agilex (SoCFPGA) - NXP i.MX8MM (Quad Cortex-A53 with media/graphics focus) New boards: - Allwinner: + RerVision H3-DVK (H3) + Oceanic 5205 5inMFD (H6) + Beelink GS2 (H6) + Orange Pi 3 (H6) - Rockchip: + Orange Pi RK3399 + Nanopi NEO4 + Veyron-Mighty Chromebook variant - Amlogic: + SEI Robotics SEI510 - ST Micro: + stm32mp157a discovery1 + stm32mp157c discovery2 - NXP: + Eckelmann ci4x10 (i.MX6DL) + i.MX8MM EVK (i.MX8MM) + ZII i.MX7 RPU2 (i.MX7) + ZII SPB4 (VF610) + Zii Ultra (i.MX8M) + TQ TQMa7S (i.MX7Solo) + TQ TQMa7D (i.MX7Dual) + Kobo Aura (i.MX50) + Menlosystems M53 (i.MX53)j - Nvidia: + Jetson Nano (Tegra T210) -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlzc+0QPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx32MkP/RBivO4AJpznRbqULmStzZL5y24bKzlt/vO8 6QXr95fTuqJ+0e+oNTVBN4pYMT0yrnMh4PGesEhcu5SEL0fc1kS8UPhkC45FbcLu KG+51oLQyiedQrFAG7aT9JdZgtqbfkeGeieJl4LOKHiXy0uNQY0i4VsxrnSeRfuA 9Geq4sO0hwDUE8OwjZDddeURJmBulshgZtYGZRceKhO3NYRTwOYFcVsijAY2tfCu VE4v231bs+gCaDzD90y3HBRCmK1UdUXWQzrud44EV9seJ3yskXFU6YOuKhecXtEk jHjLaIZ5zss7cHjlRdkGb8B6TavBuvaQi8hTB7qScvRSWKTiUmAo3vCuyHNJZroV rG8g1CbYgyG8/B1KjjU/kvdYdl82z3+K27UZHoAM5lKfEvIyAlWd4gmAri/0qR1A LoMDYmvtsIXg7ZMnmfuLJc5luU7zUPjlXMyA/E6wZ6Q5AzDphkpfqir7/9eb8A0p bCiyitfy6N0jB9lm51wAKIl/0poMDDEzsH/VpVz6iziDwpoUXoL5ujTwIijQL6Li 0dLJssBSU0ElX2GOICu5OgpVwK9aZnlMC7eG0Uq49pgvQIz8czQcTE2tv9jtGxmz 1T0JB2ilvJnDSunnYek3xiAB1gU8I7cdwjtkMvyPho1Gqd6fFKAChvWFbSIkVdjz CGqrSXjF =lMVy -----END PGP SIGNATURE----- Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM Device-tree updates from Olof Johansson: "Besides new bindings and additional descriptions of hardware blocks for various SoCs and boards, the main new contents here is: SoCs: - Intel Agilex (SoCFPGA) - NXP i.MX8MM (Quad Cortex-A53 with media/graphics focus) New boards: - Allwinner: + RerVision H3-DVK (H3) + Oceanic 5205 5inMFD (H6) + Beelink GS2 (H6) + Orange Pi 3 (H6) - Rockchip: + Orange Pi RK3399 + Nanopi NEO4 + Veyron-Mighty Chromebook variant - Amlogic: + SEI Robotics SEI510 - ST Micro: + stm32mp157a discovery1 + stm32mp157c discovery2 - NXP: + Eckelmann ci4x10 (i.MX6DL) + i.MX8MM EVK (i.MX8MM) + ZII i.MX7 RPU2 (i.MX7) + ZII SPB4 (VF610) + Zii Ultra (i.MX8M) + TQ TQMa7S (i.MX7Solo) + TQ TQMa7D (i.MX7Dual) + Kobo Aura (i.MX50) + Menlosystems M53 (i.MX53)j - Nvidia: + Jetson Nano (Tegra T210)" * tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (593 commits) arm64: dts: bitmain: Add UART pinctrl support for Sophon Edge arm64: dts: bitmain: Add pinctrl support for BM1880 SoC arm64: dts: bitmain: Add GPIO Line names for Sophon Edge board arm64: dts: bitmain: Add GPIO support for BM1880 SoC ARM: dts: gemini: Indent DIR-685 partition table dt-bindings: hwmon (pwm-fan) Remove dead "cooling-*-state" properties ARM: dts: qcom-apq8064: Set 'cxo_board' as ref clock of the DSI PHY arm64: dts: msm8998: thermal: Restrict thermal zone name length to under 20 arm64: dts: msm8998: thermal: Fix number of supported sensors arm64: dts: msm8998-mtp: thermal: Remove skin and battery thermal zones arm64: dts: exynos: Move fixed-clocks out of soc arm64: dts: exynos: Move pmu and timer nodes out of soc ARM: dts: s5pv210: Fix camera clock provider on Goni board ARM: dts: exynos: Properly override node to use MDMA0 on Universal C210 ARM: dts: exynos: Move fixed-clocks out of soc on Exynos3250 ARM: dts: exynos: Remove unneeded address/size cells from fixed-clock on Exynos3250 ARM: dts: exynos: Move pmu and timer nodes out of soc arm64: dts: rockchip: fix IO domain voltage setting of APIO5 on rockpro64 arm64: dts: db820c: Add sound card support arm64: dts: apq8096-db820c: Add HDMI display support ...	2019-05-16 08:38:17 -07:00
Linus Torvalds	22c58fd70c	ARM: SoC platform updates SoC updates, mostly refactorings and cleanups of old legacy platforms. Major themes this release: - Conversion of ixp4xx to a modern platform (drivers, DT, bindings) - Moving some of the ep93xx headers around to get it closer to multiplatform enabled. - Cleanups of Davinci This tag also contains a few patches that were queued up as fixes before 5.1 but I didn't get sent in before release. -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlzc+sMPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3ygQP/3mxLFGJxgHk6m/41V4Tepv9F2ZZ3BW4Lcp7 vZtr6xiyhZXzIHOGzqQ4VGllfWhMWnjzZZe3iruSBY1gpJU7D4x054T3xVsIDs9F EIcbBm5fE0O0bdijfk7V8vBu7LOIP/KYdaD1n9WDhW0Hy4wTXN8NNLSKEU5Lq15p oz/A3QP5GcwhGAqaHyxx445La9yEKKWAsc2cOCRCdvfw6+n1GpoE6TI1YGjDvqbw xd73mIwXb0l0f7jhCV7OPyZ3t/aQgTD3ddr4gHUGNa8sSWmD5nupSVxj23FkbGby ejqJMxOfHpJJGIL/sxmR3+cFBYxyE+JNmrEq/kDW5ncWs/LY91juJxR1dkQKs6Mj 4Y9CWruftDz34DlFs/J33hF/rdZ73O91ldk7zqND41Fi5aLrIKvZBJlTuqyZ0tGV YNRxsjWF953h8TXimDV0KvBgO4+E8d5ype/kIYtEGYO9DVmXQGMxFx2Gt2I/NfoH 5tCtVFwDPpMxJShpXHLMzUT8sQL3mytg5L/MIPTGx+zAtDwx/qTLEEAElffG29oI vdzgJR0lrG/zzqQh25/M80UZYMdOrwtjAB42C+jAvlfQ0C4DtvSH+8OdcROOgj0b GbAJbTdHYTD6OpoxhSuRii7zzNxw+i7pQj+uLSt8s8ZReGkUk5a2wpRpoVoV2WxK RJHkMK95 =pUeO -----END PGP SIGNATURE----- Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC platform updates from Olof Johansson: "SoC updates, mostly refactorings and cleanups of old legacy platforms. Major themes this release: - Conversion of ixp4xx to a modern platform (drivers, DT, bindings) - Moving some of the ep93xx headers around to get it closer to multiplatform enabled. - Cleanups of Davinci This also contains a few patches that were queued up as fixes before 5.1 but I didn't get sent in before release" * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (123 commits) ARM: debug-ll: add default address for digicolor ARM: u300: regulator: add MODULE_LICENSE() ARM: ep93xx: move private headers out of mach/* ARM: ep93xx: move pinctrl interfaces into include/linux/soc ARM: ep93xx: keypad: stop using mach/platform.h ARM: ep93xx: move network platform data to separate header ARM: stm32: add AMBA support for stm32 family MAINTAINERS: update arch/arm/mach-davinci ARM: rockchip: add missing of_node_put in rockchip_smp_prepare_pmu ARM: dts: Add queue manager and NPE to the IXP4xx DTSI soc: ixp4xx: qmgr: Add DT probe code soc: ixp4xx: qmgr: Add DT bindings for IXP4xx qmgr soc: ixp4xx: npe: Add DT probe code soc: ixp4xx: Add DT bindings for IXP4xx NPE soc: ixp4xx: qmgr: Pass resources soc: ixp4xx: Remove unused functions soc: ixp4xx: Uninline several functions soc: ixp4xx: npe: Pass addresses as resources ARM: ixp4xx: Turn the QMGR into a platform device ARM: ixp4xx: Turn the NPE into a platform device ...	2019-05-16 08:31:32 -07:00
David Howells	fd711586bb	afs: Fix double inc of vnode->cb_break When __afs_break_callback() clears the CB_PROMISED flag, it increments vnode->cb_break to trigger a future refetch of the status and callback - however it also calls afs_clear_permits(), which also increments vnode->cb_break. Fix this by removing the increment from afs_clear_permits(). Whilst we're at it, fix the conditional call to afs_put_permits() as the function checks to see if the argument is NULL, so the check is redundant. Fixes: `be080a6f43` ("afs: Overhaul permit caching"); Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	a58823ac45	afs: Fix application of status and callback to be under same lock When applying the status and callback in the response of an operation, apply them in the same critical section so that there's no race between checking the callback state and checking status-dependent state (such as the data version). Fix this by: (1) Allocating a joint {status,callback} record (afs_status_cb) before calling the RPC function for each vnode for which the RPC reply contains a status or a status plus a callback. A flag is set in the record to indicate if a callback was actually received. (2) These records are passed into the RPC functions to be filled in. The afs_decode_status() and yfs_decode_status() functions are removed and the cb_lock is no longer taken. (3) xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() no longer update the vnode. (4) xdr_decode_AFSCallBack() and xdr_decode_YFSCallBack() no longer update the vnode. (5) vnodes, expected data-version numbers and callback break counters (cb_break) no longer need to be passed to the reply delivery functions. Note that, for the moment, the file locking functions still need access to both the call and the vnode at the same time. (6) afs_vnode_commit_status() is now given the cb_break value and the expected data_version and the task of applying the status and the callback to the vnode are now done here. This is done under a single taking of vnode->cb_lock. (7) afs_pages_written_back() is now called by afs_store_data() rather than by the reply delivery function. afs_pages_written_back() has been moved to before the call point and is now given the first and last page numbers rather than a pointer to the call. (8) The indicator from YFS.RemoveFile2 as to whether the target file actually got removed (status.abort_code == VNOVNODE) rather than merely dropping a link is now checked in afs_unlink rather than in xdr_decode_YFSFetchStatus(). Supplementary fixes: () afs_cache_permit() now gets the caller_access mask from the afs_status_cb object rather than picking it out of the vnode's status record. afs_fetch_status() returns caller_access through its argument list for this purpose also. () afs_inode_init_from_status() now uses a write lock on cb_lock rather than a read lock and now sets the callback inside the same critical section. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	c7226e407b	afs: Fix lock-wait/callback-break double locking __afs_break_callback() holds vnode->lock around its call of afs_lock_may_be_available() - which also takes that lock. Fix this by not taking the lock in __afs_break_callback(). Also, there's no point checking the granted_locks and pending_locks queues; it's sufficient to check lock_state, so move that check out of afs_lock_may_be_available() into __afs_break_callback() to replace the queue checks. Fixes: `e8d6c55412` ("AFS: implement file locking") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	4571577f16	afs: Always get the reply time Always ask for the reply time from AF_RXRPC as it's used to calculate the callback expiry time and lock expiry times, so it's needed by most FS operations. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	d9052dda8a	afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set Don't invalidate the callback promise on a directory if the AFS_VNODE_DIR_VALID flag is not set (which indicates that the directory contents are invalid, due to edit failure, callback break, page reclaim). The directory will be reloaded next time the directory is accessed, so clearing the callback flag at this point may race with a reload of the directory and cancel it's recorded callback promise. Fixes: `f3ddee8dc4` ("afs: Fix directory handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	87182759cd	afs: Fix order-1 allocation in afs_do_lookup() afs_do_lookup() will do an order-1 allocation to allocate status records if there are more than 39 vnodes to stat. Fix this by allocating an array of {status,callback} records for each vnode we want to examine using vmalloc() if larger than a page. This not only gets rid of the order-1 allocation, but makes it easier to grow beyond 50 records for YFS servers. It also allows us to move to {status,callback} tuples for other calls too and makes it easier to lock across the application of the status and the callback to the vnode. Fixes: `5cf9dd55a0` ("afs: Prospectively look up extra files when doing a single lookup") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	781070551c	afs: Fix calculation of callback expiry time Fix the calculation of the expiry time of a callback promise, as obtained from operations like FS.FetchStatus and FS.FetchData. The time should be based on the timestamp of the first DATA packet in the reply and the calculation needs to turn the ktime_t timestamp into a time64_t. Fixes: `c435ee3455` ("afs: Overhaul the callback handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	ffba718e93	afs: Get rid of afs_call::reply[] Replace the afs_call::reply[] array with a bunch of typed members so that the compiler can use type-checking on them. It's also easier for the eye to see what's going on. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	3b05e528cb	afs: Make dynamic root population wait uninterruptibly for proc_cells_lock Make dynamic root population wait uninterruptibly for proc_cells_lock. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	fefb2483dc	afs: Don't pass the vnode pointer through into the inline bulk status op Don't pass the vnode pointer through into the inline bulk status op. We want to process the status records outside of it anyway. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:21 +01:00
David Howells	20b8391fff	afs: Make some RPC operations non-interruptible Make certain RPC operations non-interruptible, including: () Set attributes () Store data We don't want to get interrupted during a flush on close, flush on unlock, writeback or an inode update, leaving us in a state where we still need to do the writeback or update. () Extend lock () Release lock We don't want to get lock extension interrupted as the file locks on the server are time-limited. Interruption during lock release is less of an issue since the lock is time-limited, but it's better to complete the release to avoid a several-minute wait to recover it. Setting the lock isn't a problem if it's interrupted since we can just return to the user and tell them they were interrupted - at which point they can elect to retry. (*) Silly unlink We want to remove silly unlink files if we can, rather than leaving them for the salvager to clear up. Note that whilst these calls are no longer interruptible, they do have timeouts on them, so if the server stops responding the call will fail with something like ETIME or ECONNRESET. Without this, the following: kAFS: Unexpected error from FS.StoreData -512 appears in dmesg when a pending store data gets interrupted and some processes may just hang. Additionally, make the code that checks/updates the server record ignore failure due to interruption if the main call is uninterruptible and if the server has an address list. The next op will check it again since the expiration time on the old list has past. Fixes: `d2ddc776a4` ("afs: Overhaul volume and server record caching and fileserver rotation") Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	b960a34b73	rxrpc: Allow the kernel to mark a call as being non-interruptible Allow kernel services using AF_RXRPC to indicate that a call should be non-interruptible. This allows kafs to make things like lock-extension and writeback data storage calls non-interruptible. If this is set, signals will be ignored for operations on that call where possible - such as waiting to get a call channel on an rxrpc connection. It doesn't prevent UDP sendmsg from being interrupted, but that will be handled by packet retransmission. rxrpc_kernel_recv_data() isn't affected by this since that never waits, preferring instead to return -EAGAIN and leave the waiting to the caller. Userspace initiated calls can't be set to be uninterruptible at this time. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	0ab4c95948	afs: Fix error propagation from server record check/update afs_check/update_server_record() should be setting fc->error rather than fc->ac.error as they're called from within the cursor iteration function. afs_fs_cursor::error is where the error code of the attempt to call the operation on multiple servers is integrated and is the final result, whereas afs_addr_cursor::error is used to hold the error from individual iterations of the call loop. (Note there's also an afs_vl_cursor which also wraps afs_addr_cursor for accessing VL servers rather than file servers). Fix this by setting fc->error in the afs_check/update_server_record() so that any error incurred whilst talking to the VL server correctly propagates to the final result. This results in: kAFS: Unexpected error from FS.StoreData -512 being seen, even though the store-data op is non-interruptible. The error is actually coming from the server record update getting interrupted. Fixes: `d2ddc776a4` ("afs: Overhaul volume and server record caching and fileserver rotation") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
David Howells	94f699c9cd	afs: Fix the maximum lifespan of VL and probe calls If an older AFS server doesn't support an operation, it may accept the call and then sit on it forever, happily responding to pings that make kafs think that the call is still alive. Fix this by setting the maximum lifespan of Volume Location service calls in particular and probe calls in general so that they don't run on endlessly if they're not supported. Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 16:25:20 +01:00
Linus Torvalds	a455eda33f	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal soc updates from Eduardo Valentin: - thermal core has a new devm_* API for registering cooling devices. I took the entire series, that is why you see changes on drivers/hwmon in this pull (Guenter Roeck) - rockchip thermal driver gains support to PX30 SoC (Elaine Zhang) - the generic-adc thermal driver now considers the lookup table DT property as optional (Jean-Francois Dagenais) - Refactoring of tsens thermal driver (Amit Kucheria) - Cleanups on cpu cooling driver (Daniel Lezcano) - broadcom thermal driver dropped support to ACPI (Srinath Mannam) - tegra thermal driver gains support to OC hw throttle and GPU throtle (Wei Ni) - Fixes in several thermal drivers. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: (59 commits) hwmon: (pwm-fan) Use devm_thermal_of_cooling_device_register hwmon: (npcm750-pwm-fan) Use devm_thermal_of_cooling_device_register hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register hwmon: (gpio-fan) Use devm_thermal_of_cooling_device_register hwmon: (aspeed-pwm-tacho) Use devm_thermal_of_cooling_device_register thermal: rcar_gen3_thermal: Fix to show correct trip points number thermal: rcar_thermal: update calculation formula for R-Car Gen3 SoCs thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power thermal: rockchip: Support the PX30 SoC in thermal driver dt-bindings: rockchip-thermal: Support the PX30 SoC compatible thermal: rockchip: fix up the tsadc pinctrl setting error thermal: broadcom: Remove ACPI support thermal: Fix build error of missing devm_ioremap_resource on UM thermal/drivers/cpu_cooling: Remove pointless field thermal/drivers/cpu_cooling: Add Software Package Data Exchange (SPDX) thermal/drivers/cpu_cooling: Fixup the header and copyright thermal/drivers/cpu_cooling: Remove pointless test in power2state() thermal: rcar_gen3_thermal: disable interrupt in .remove thermal: rcar_gen3_thermal: fix interrupt type thermal: Introduce devm_thermal_of_cooling_device_register ...	2019-05-16 07:56:57 -07:00
Jackie Liu	7a102d9044	block/bio-integrity: use struct_size() in kmalloc() Use the new struct_size() helper to keep code simple. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-16 08:48:48 -06:00
David Howells	bbd172e316	rxrpc: Provide kernel interface to set max lifespan on a call Provide an interface to set max lifespan on a call from inside of the kernel without having to call kernel_sendmsg(). Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 15:48:30 +01:00
David Howells	51eba99970	afs: Fix "kAFS: AFS vnode with undefined type 0" Under some circumstances afs_select_fileserver() can return without setting an error in fc->error. The problem is in the no_more_servers segment where the accumulated errors from attempts to contact various servers are integrated into an afs_error-type variable 'e'. The resultant error code is, however, then abandoned. Fix this by getting the error out of e.error and putting it in 'error' so that the next part will store it into fc->error. Not doing this causes a report like the following: kAFS: AFS vnode with undefined type 0 kAFS: A=0 m=0 s=0 v=0 kAFS: vnode 20000025:1:1 because the code following the server selection loop then sees what it thinks is a successful invocation because fc.error is 0. However, it can't apply the status record because it's all zeros. The report is followed on the first instance with a trace looking something like: dump_stack+0x67/0x8e afs_inode_init_from_status.isra.2+0x21b/0x487 afs_fetch_status+0x119/0x1df afs_iget+0x130/0x295 afs_get_tree+0x31d/0x595 vfs_get_tree+0x1f/0xe8 fc_mount+0xe/0x36 afs_d_automount+0x328/0x3c3 follow_managed+0x109/0x20a lookup_fast+0x3bf/0x3f8 do_last+0xc3/0x6a4 path_openat+0x1af/0x236 do_filp_open+0x51/0xae ? _raw_spin_unlock+0x24/0x2d ? __alloc_fd+0x1a5/0x1b7 do_sys_open+0x13b/0x1e8 do_syscall_64+0x7d/0x1b3 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: `4584ae96ae` ("afs: Fix missing net error handling") Signed-off-by: David Howells <dhowells@redhat.com>	2019-05-16 15:48:20 +01:00
Aneesh Kumar K.V	6457f42eb3	powerpc/mm: Drop VM_BUG_ON in get_region_id() We call get_region_id() without validating the ea value. That means with a wrong ea value we hit the BUG as below. kernel BUG at arch/powerpc/include/asm/book3s/64/hash.h:129! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries CPU: 0 PID: 3937 Comm: access_tests Not tainted 5.1.0 .... NIP [c00000000007ba20] do_slb_fault+0x70/0x320 LR [c00000000000896c] data_access_slb_common+0x15c/0x1a0 Fix this by removing the VM_BUG_ON. All callers make sure the returned region id is valid and error out otherwise. Fixes: `0034d395f8` ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") Reported-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-05-17 00:40:01 +10:00
Linus Torvalds	cc7ce90153	drm i915, amdgpu, nouveau, msm, panfrost, bridge, pl111 fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJc3MkNAAoJEAx081l5xIa+DpkP/2VhwTcnyumBi/cxG9/D/vRN W0tZDaiR3jvdz5J0lbg03fJnzgbBIFREjS9/W3PmVbsAvzMfUTok9trlw1rht82a NLbfK/Ge8e+OqmSRJ5kqMblixoWyqmM03/3/ms/nSfd86qywYpLy3ldZHK+WKdJq BV3topSeuP6dUj35v4L7y69DZ0Q6ignOe8udjkJgUyq9HCU7PxXxiV1K/9wUeFq1 EL736LTnvm7djzseS0kOSIB2oguRnP+6czYkVvMwP2bxuDayklOjAL6s489pvX6e 36FkFXFksj7vvi3s9JfKQ6gmlE4GbWvFevI0cXzB+G9mWr53nrV+iinRvx9neA3C YG263vD14Brm6Nfr4PCMAI7PgNLas9y0cpzc6LwjpJdWTmPiY/UhHmYbk81ZGEC5 s6Du8vt2PHvkbsyWdNglpPztZzGTKihCIoAtoDzH0TQTLVvoUsr8CKEk/w7wUtuQ NZSoDnlxlRVVRx07H8/aSYzwDjY9R0J3zCPRCrkDhoPy2tGFJd4xv/d2c1uVV19v VVqUZLHNhXdSJiu0RNRmyJktgQgnMsONRqaY6qsJq9O8sQrMVU5V+BRk9iBm/m21 xSw1nXJZsPYIqvxgF6+mmr98bIktu47htguV+Zp4rCUzAfO0YxBO5++vqBrLRPax T97d8gQopoX5nnBI4pzN =EfK4 -----END PGP SIGNATURE----- Merge tag 'drm-next-2019-05-16' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "A bunch of fixes for the merge window closure, doesn't seem to be anything too major or serious in there. It does add TU117 turing modesetting to nouveau but it's just an enable for preexisting code. amdgpu: - gpu reset at load crash fix - ATPX hotplug fix for when dGPU is off - SR-IOV fixes radeon: - r5xx pll fixes i915: - GVT (MCHBAR, buffer alignment, misc warnings fixes) - Fixes for newly enabled semaphore code - Geminilake disable framebuffer compression - HSW edp fast modeset fix - IRQ vs RCU race fix nouveau: - Turing modesetting fixes - TU117 support msm: - SDM845 bringup fixes panfrost: - static checker fixes pl111: - spinlock init fix. bridge: - refresh rate register fix for adv7511" * tag 'drm-next-2019-05-16' of git://anongit.freedesktop.org/drm/drm: (36 commits) drm/msm: Upgrade gxpd checks to IS_ERR_OR_NULL drm/msm/dpu: Remove duplicate header drm/pl111: Initialize clock spinlock early drm/msm: correct attempted NULL pointer dereference in debugfs drm/msm: remove resv fields from msm_gem_object struct drm/nouveau: fix duplication of nv50_head_atom struct drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration drm/nouveau/core: initial support for boards with TU117 chipset drm/nouveau/core: allow detected chipset to be overridden drm/nouveau/kms/gf119-gp10x: push HeadSetControlOutputResource() mthd when encoders change drm/nouveau/kms/nv50-: fix bug preventing non-vsync'd page flips drm/nouveau/kms/gv100-: fix spurious window immediate interlocks drm/bridge: adv7511: Fix low refresh rate selection drm/panfrost: Add missing _fini() calls in panfrost_device_fini() drm/panfrost: Only put sync_out if non-NULL drm/i915: Seal races between async GPU cancellation, retirement and signaling drm/i915: Fix fastset vs. pfit on/off on HSW EDP transcoder drm/i915/fbc: disable framebuffer compression on GeminiLake drm/amdgpu/psp: move psp version specific function pointers to early_init drm/radeon: prefer lower reference dividers ...	2019-05-16 07:22:42 -07:00
Jackie Liu	fdb288a679	io_uring: use wait_event_interruptible for cq_wait conditional wait The previous patch has ensured that io_cqring_events contain smp_rmb memory barriers, Now we can use wait_event_interruptible to keep the code simple. Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-16 08:10:27 -06:00
Jackie Liu	dc6ce4bc2b	io_uring: adjust smp_rmb inside io_cqring_events Whenever smp_rmb is required to use io_cqring_events, keep smp_rmb inside the function io_cqring_events. Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-16 08:10:25 -06:00
Roman Penyaev	2bbcd6d3b3	io_uring: fix infinite wait in khread_park() on io_finish_async() This fixes couple of races which lead to infinite wait of park completion with the following backtraces: [20801.303319] Call Trace: [20801.303321] ? __schedule+0x284/0x650 [20801.303323] schedule+0x33/0xc0 [20801.303324] schedule_timeout+0x1bc/0x210 [20801.303326] ? schedule+0x3d/0xc0 [20801.303327] ? schedule_timeout+0x1bc/0x210 [20801.303329] ? preempt_count_add+0x79/0xb0 [20801.303330] wait_for_completion+0xa5/0x120 [20801.303331] ? wake_up_q+0x70/0x70 [20801.303333] kthread_park+0x48/0x80 [20801.303335] io_finish_async+0x2c/0x70 [20801.303336] io_ring_ctx_wait_and_kill+0x95/0x180 [20801.303338] io_uring_release+0x1c/0x20 [20801.303339] __fput+0xad/0x210 [20801.303341] task_work_run+0x8f/0xb0 [20801.303342] exit_to_usermode_loop+0xa0/0xb0 [20801.303343] do_syscall_64+0xe0/0x100 [20801.303349] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [20801.303380] Call Trace: [20801.303383] ? __schedule+0x284/0x650 [20801.303384] schedule+0x33/0xc0 [20801.303386] io_sq_thread+0x38a/0x410 [20801.303388] ? __switch_to_asm+0x40/0x70 [20801.303390] ? wait_woken+0x80/0x80 [20801.303392] ? _raw_spin_lock_irqsave+0x17/0x40 [20801.303394] ? io_submit_sqes+0x120/0x120 [20801.303395] kthread+0x112/0x130 [20801.303396] ? kthread_create_on_node+0x60/0x60 [20801.303398] ret_from_fork+0x35/0x40 o kthread_park() waits for park completion, so io_sq_thread() loop should check kthread_should_park() along with khread_should_stop(), otherwise if kthread_park() is called before prepare_to_wait() the following schedule() never returns: CPU#0 CPU#1 io_sq_thread_stop(): io_sq_thread(): while(!kthread_should_stop() && !ctx->sqo_stop) { ctx->sqo_stop = 1; kthread_park() prepare_to_wait(); if (kthread_should_stop() { } schedule(); <<< nobody checks park flag, <<< so schedule and never return o if the flag ctx->sqo_stop is observed by the io_sq_thread() loop it is quite possible, that kthread_should_park() check and the following kthread_parkme() is never called, because kthread_park() has not been yet called, but few moments later is is called and waits there for park completion, which never happens, because kthread has already exited: CPU#0 CPU#1 io_sq_thread_stop(): io_sq_thread(): ctx->sqo_stop = 1; while(!kthread_should_stop() && !ctx->sqo_stop) { <<< observe sqo_stop and exit the loop } if (kthread_should_park()) kthread_parkme(); <<< never called, since was <<< never parked kthread_park() <<< waits forever for park completion In the current patch we quit the loop by only kthread_should_park() check (kthread_park() is synchronous, so kthread_should_stop() is never observed), and we abandon ->sqo_stop flag, since it is racy. At the end of the io_sq_thread() we unconditionally call parmke(), since we've exited the loop by the park flag. Signed-off-by: Roman Penyaev <rpenyaev@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-05-16 08:10:24 -06:00
Sheetal Singala	8454fca4f5	dm: fix a couple brace coding style issues Signed-off-by: Sheetal Singala <2396sheetal@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2019-05-16 10:09:21 -04:00
Milan Broz	f710126cfc	dm crypt: print device name in integrity error message This message should better identify the DM device with the integrity failure. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2019-05-16 10:09:20 -04:00
Milan Broz	7a1cd7238f	dm crypt: move detailed message into debug level The information about tag size should not be printed without debug info set. Also print device major:minor in the error message to identify the device instance. Also use rate limiting and debug level for info about used crypto API implementaton. This is important because during online reencryption the existing message saturates syslog (because we are moving hotzone across the whole device). Cc: stable@vger.kernel.org Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2019-05-16 10:09:20 -04:00
Jens Axboe	47ca23c117	Merge branch 'nvme-5.2' of git://git.infradead.org/nvme into for-5.2/block-post Pull NVMe fixes from Christoph * 'nvme-5.2' of git://git.infradead.org/nvme: nvme: validate cntlid during controller initialisation nvme: change locking for the per-subsystem controller list nvme: trace all async notice events nvme: fix typos in nvme status code values nvme-fabrics: remove unused argument nvme-multipath: avoid crash on invalid subsystem cntlid enumeration nvme-fc: use separate work queue to avoid warning nvme-rdma: remove redundant reference between ib_device and tagset nvme-pci: mark expected switch fall-through nvme-pci: add known admin effects to augument admin effects log page nvme-pci: init shadow doorbell after each reset	2019-05-16 08:07:57 -06:00
Helen Koike	0f41fcf788	dm ioctl: fix hang in early create error condition The dm_early_create() function (which deals with "dm-mod.create=" kernel command line option) calls dm_hash_insert() who gets an extra reference to the md object. In case of failure, this reference wasn't being released, causing dm_destroy() to hang, thus hanging the whole boot process. Fix this by calling __hash_remove() in the error path. Fixes: `6bbc923dfc` ("dm: add support to directly boot to a mapped device") Cc: stable@vger.kernel.org Signed-off-by: Helen Koike <helen.koike@collabora.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2019-05-16 09:52:06 -04:00
Filipe Manana	4e9845eff5	Btrfs: tree-checker: detect file extent items with overlapping ranges Having file extent items with ranges that overlap each other is a serious issue that leads to all sorts of corruptions and crashes (like a BUG_ON() during the course of __btrfs_drop_extents() when it traims file extent items). Therefore teach the tree checker to detect such cases. This is motivated by a recently fixed bug (race between ranged full fsync and writeback or adjacent ranges). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:33:51 +02:00
Filipe Manana	0c713cbab6	Btrfs: fix race between ranged fsync and writeback of adjacent ranges When we do a full fsync (the bit BTRFS_INODE_NEEDS_FULL_SYNC is set in the inode) that happens to be ranged, which happens during a msync() or writes for files opened with O_SYNC for example, we can end up with a corrupt log, due to different file extent items representing ranges that overlap with each other, or hit some assertion failures. When doing a ranged fsync we only flush delalloc and wait for ordered exents within that range. If while we are logging items from our inode ordered extents for adjacent ranges complete, we end up in a race that can make us insert the file extent items that overlap with others we logged previously and the assertion failures. For example, if tree-log.c:copy_items() receives a leaf that has the following file extents items, all with a length of 4K and therefore there is an implicit hole in the range 68K to 72K - 1: (257 EXTENT_ITEM 64K), (257 EXTENT_ITEM 72K), (257 EXTENT_ITEM 76K), ... It copies them to the log tree. However due to the need to detect implicit holes, it may release the path, in order to look at the previous leaf to detect an implicit hole, and then later it will search again in the tree for the first file extent item key, with the goal of locking again the leaf (which might have changed due to concurrent changes to other inodes). However when it locks again the leaf containing the first key, the key corresponding to the extent at offset 72K may not be there anymore since there is an ordered extent for that range that is finishing (that is, somewhere in the middle of btrfs_finish_ordered_io()), and it just removed the file extent item but has not yet replaced it with a new file extent item, so the part of copy_items() that does hole detection will decide that there is a hole in the range starting from 68K to 76K - 1, and therefore insert a file extent item to represent that hole, having a key offset of 68K. After that we now have a log tree with 2 different extent items that have overlapping ranges: 1) The file extent item copied before copy_items() released the path, which has a key offset of 72K and a length of 4K, representing the file range 72K to 76K - 1. 2) And a file extent item representing a hole that has a key offset of 68K and a length of 8K, representing the range 68K to 76K - 1. This item was inserted after releasing the path, and overlaps with the extent item inserted before. The overlapping extent items can cause all sorts of unpredictable and incorrect behaviour, either when replayed or if a fast (non full) fsync happens later, which can trigger a BUG_ON() when calling btrfs_set_item_key_safe() through __btrfs_drop_extents(), producing a trace like the following: [61666.783269] ------------[ cut here ]------------ [61666.783943] kernel BUG at fs/btrfs/ctree.c:3182! [61666.784644] invalid opcode: 0000 [#1] PREEMPT SMP (...) [61666.786253] task: ffff880117b88c40 task.stack: ffffc90008168000 [61666.786253] RIP: 0010:btrfs_set_item_key_safe+0x7c/0xd2 [btrfs] [61666.786253] RSP: 0018:ffffc9000816b958 EFLAGS: 00010246 [61666.786253] RAX: 0000000000000000 RBX: 000000000000000f RCX: 0000000000030000 [61666.786253] RDX: 0000000000000000 RSI: ffffc9000816ba4f RDI: ffffc9000816b937 [61666.786253] RBP: ffffc9000816b998 R08: ffff88011dae2428 R09: 0000000000001000 [61666.786253] R10: 0000160000000000 R11: 6db6db6db6db6db7 R12: ffff88011dae2418 [61666.786253] R13: ffffc9000816ba4f R14: ffff8801e10c4118 R15: ffff8801e715c000 [61666.786253] FS: 00007f6060a18700(0000) GS:ffff88023f5c0000(0000) knlGS:0000000000000000 [61666.786253] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [61666.786253] CR2: 00007f6060a28000 CR3: 0000000213e69000 CR4: 00000000000006e0 [61666.786253] Call Trace: [61666.786253] __btrfs_drop_extents+0x5e3/0xaad [btrfs] [61666.786253] ? time_hardirqs_on+0x9/0x14 [61666.786253] btrfs_log_changed_extents+0x294/0x4e0 [btrfs] [61666.786253] ? release_extent_buffer+0x38/0xb4 [btrfs] [61666.786253] btrfs_log_inode+0xb6e/0xcdc [btrfs] [61666.786253] ? lock_acquire+0x131/0x1c5 [61666.786253] ? btrfs_log_inode_parent+0xee/0x659 [btrfs] [61666.786253] ? arch_local_irq_save+0x9/0xc [61666.786253] ? btrfs_log_inode_parent+0x1f5/0x659 [btrfs] [61666.786253] btrfs_log_inode_parent+0x223/0x659 [btrfs] [61666.786253] ? arch_local_irq_save+0x9/0xc [61666.786253] ? lockref_get_not_zero+0x2c/0x34 [61666.786253] ? rcu_read_unlock+0x3e/0x5d [61666.786253] btrfs_log_dentry_safe+0x60/0x7b [btrfs] [61666.786253] btrfs_sync_file+0x317/0x42c [btrfs] [61666.786253] vfs_fsync_range+0x8c/0x9e [61666.786253] SyS_msync+0x13c/0x1c9 [61666.786253] entry_SYSCALL_64_fastpath+0x18/0xad A sample of a corrupt log tree leaf with overlapping extents I got from running btrfs/072: item 14 key (295 108 200704) itemoff 2599 itemsize 53 extent data disk bytenr 0 nr 0 extent data offset 0 nr 458752 ram 458752 item 15 key (295 108 659456) itemoff 2546 itemsize 53 extent data disk bytenr 4343541760 nr 770048 extent data offset 606208 nr 163840 ram 770048 item 16 key (295 108 663552) itemoff 2493 itemsize 53 extent data disk bytenr 4343541760 nr 770048 extent data offset 610304 nr 155648 ram 770048 item 17 key (295 108 819200) itemoff 2440 itemsize 53 extent data disk bytenr 4334788608 nr 4096 extent data offset 0 nr 4096 ram 4096 The file extent item at offset 659456 (item 15) ends at offset 823296 (659456 + 163840) while the next file extent item (item 16) starts at offset 663552. Another different problem that the race can trigger is a failure in the assertions at tree-log.c:copy_items(), which expect that the first file extent item key we found before releasing the path exists after we have released path and that the last key we found before releasing the path also exists after releasing the path: $ cat -n fs/btrfs/tree-log.c 4080 if (need_find_last_extent) { 4081 /* btrfs_prev_leaf could return 1 without releasing the path */ 4082 btrfs_release_path(src_path); 4083 ret = btrfs_search_slot(NULL, inode->root, &first_key, 4084 src_path, 0, 0); 4085 if (ret < 0) 4086 return ret; 4087 ASSERT(ret == 0); (...) 4103 if (i >= btrfs_header_nritems(src_path->nodes[0])) { 4104 ret = btrfs_next_leaf(inode->root, src_path); 4105 if (ret < 0) 4106 return ret; 4107 ASSERT(ret == 0); 4108 src = src_path->nodes[0]; 4109 i = 0; 4110 need_find_last_extent = true; 4111 } (...) The second assertion implicitly expects that the last key before the path release still exists, because the surrounding while loop only stops after we have found that key. When this assertion fails it produces a stack like this: [139590.037075] assertion failed: ret == 0, file: fs/btrfs/tree-log.c, line: 4107 [139590.037406] ------------[ cut here ]------------ [139590.037707] kernel BUG at fs/btrfs/ctree.h:3546! [139590.038034] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [139590.038340] CPU: 1 PID: 31841 Comm: fsstress Tainted: G W 5.0.0-btrfs-next-46 #1 (...) [139590.039354] RIP: 0010:assfail.constprop.24+0x18/0x1a [btrfs] (...) [139590.040397] RSP: 0018:ffffa27f48f2b9b0 EFLAGS: 00010282 [139590.040730] RAX: 0000000000000041 RBX: ffff897c635d92c8 RCX: 0000000000000000 [139590.041105] RDX: 0000000000000000 RSI: ffff897d36a96868 RDI: ffff897d36a96868 [139590.041470] RBP: ffff897d1b9a0708 R08: 0000000000000000 R09: 0000000000000000 [139590.041815] R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000013 [139590.042159] R13: 0000000000000227 R14: ffff897cffcbba88 R15: 0000000000000001 [139590.042501] FS: 00007f2efc8dee80(0000) GS:ffff897d36a80000(0000) knlGS:0000000000000000 [139590.042847] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [139590.043199] CR2: 00007f8c064935e0 CR3: 0000000232252002 CR4: 00000000003606e0 [139590.043547] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [139590.043899] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [139590.044250] Call Trace: [139590.044631] copy_items+0xa3f/0x1000 [btrfs] [139590.045009] ? generic_bin_search.constprop.32+0x61/0x200 [btrfs] [139590.045396] btrfs_log_inode+0x7b3/0xd70 [btrfs] [139590.045773] btrfs_log_inode_parent+0x2b3/0xce0 [btrfs] [139590.046143] ? do_raw_spin_unlock+0x49/0xc0 [139590.046510] btrfs_log_dentry_safe+0x4a/0x70 [btrfs] [139590.046872] btrfs_sync_file+0x3b6/0x440 [btrfs] [139590.047243] btrfs_file_write_iter+0x45b/0x5c0 [btrfs] [139590.047592] __vfs_write+0x129/0x1c0 [139590.047932] vfs_write+0xc2/0x1b0 [139590.048270] ksys_write+0x55/0xc0 [139590.048608] do_syscall_64+0x60/0x1b0 [139590.048946] entry_SYSCALL_64_after_hwframe+0x49/0xbe [139590.049287] RIP: 0033:0x7f2efc4be190 (...) [139590.050342] RSP: 002b:00007ffe743243a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [139590.050701] RAX: ffffffffffffffda RBX: 0000000000008d58 RCX: 00007f2efc4be190 [139590.051067] RDX: 0000000000008d58 RSI: 00005567eca0f370 RDI: 0000000000000003 [139590.051459] RBP: 0000000000000024 R08: 0000000000000003 R09: 0000000000008d60 [139590.051863] R10: 0000000000000078 R11: 0000000000000246 R12: 0000000000000003 [139590.052252] R13: 00000000003d3507 R14: 00005567eca0f370 R15: 0000000000000000 (...) [139590.055128] ---[ end trace 193f35d0215cdeeb ]--- So fix this race between a full ranged fsync and writeback of adjacent ranges by flushing all delalloc and waiting for all ordered extents to complete before logging the inode. This is the simplest way to solve the problem because currently the full fsync path does not deal with ranges at all (it assumes a full range from 0 to LLONG_MAX) and it always needs to look at adjacent ranges for hole detection. For use cases of ranged fsyncs this can make a few fsyncs slower but on the other hand it can make some following fsyncs to other ranges do less work or no need to do anything at all. A full fsync is rare anyway and happens only once after loading/creating an inode and once after less common operations such as a shrinking truncate. This is an issue that exists for a long time, and was often triggered by generic/127, because it does mmap'ed writes and msync (which triggers a ranged fsync). Adding support for the tree checker to detect overlapping extents (next patch in the series) and trigger a WARN() when such cases are found, and then calling btrfs_check_leaf_full() at the end of btrfs_insert_file_extent() made the issue much easier to detect. Running btrfs/072 with that change to the tree checker and making fsstress open files always with O_SYNC made it much easier to trigger the issue (as triggering it with generic/127 is very rare). CC: stable@vger.kernel.org # 3.16+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:31:13 +02:00
Filipe Manana	ebb929060a	Btrfs: avoid fallback to transaction commit during fsync of files with holes When we are doing a full fsync (bit BTRFS_INODE_NEEDS_FULL_SYNC set) of a file that has holes and has file extent items spanning two or more leafs, we can end up falling to back to a full transaction commit due to a logic bug that leads to failure to insert a duplicate file extent item that is meant to represent a hole between the last file extent item of a leaf and the first file extent item in the next leaf. The failure (EEXIST error) leads to a transaction commit (as most errors when logging an inode do). For example, we have the two following leafs: Leaf N: ----------------------------------------------- \| ..., ..., ..., (257, FILE_EXTENT_ITEM, 64K) \| ----------------------------------------------- The file extent item at the end of leaf N has a length of 4Kb, representing the file range from 64K to 68K - 1. Leaf N + 1: ----------------------------------------------- \| (257, FILE_EXTENT_ITEM, 72K), ..., ..., ... \| ----------------------------------------------- The file extent item at the first slot of leaf N + 1 has a length of 4Kb too, representing the file range from 72K to 76K - 1. During the full fsync path, when we are at tree-log.c:copy_items() with leaf N as a parameter, after processing the last file extent item, that represents the extent at offset 64K, we take a look at the first file extent item at the next leaf (leaf N + 1), and notice there's a 4K hole between the two extents, and therefore we insert a file extent item representing that hole, starting at file offset 68K and ending at offset 72K - 1. However we don't update the value of last_extent, which is used to represent the end offset (plus 1, non-inclusive end) of the last file extent item inserted in the log, so it stays with a value of 68K and not with a value of 72K. Then, when copy_items() is called for leaf N + 1, because the value of last_extent is smaller then the offset of the first extent item in the leaf (68K < 72K), we look at the last file extent item in the previous leaf (leaf N) and see it there's a 4K gap between it and our first file extent item (again, 68K < 72K), so we decide to insert a file extent item representing the hole, starting at file offset 68K and ending at offset 72K - 1, this insertion will fail with -EEXIST being returned from btrfs_insert_file_extent() because we already inserted a file extent item representing a hole for this offset (68K) in the previous call to copy_items(), when processing leaf N. The -EEXIST error gets propagated to the fsync callback, btrfs_sync_file(), which falls back to a full transaction commit. Fix this by adjusting *last_extent after inserting a hole when we had to look at the next leaf. Fixes: `4ee3fad34a` ("Btrfs: fix fsync after hole punching when using no-holes feature") Cc: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:31:13 +02:00
Qu Wenruo	14ae4ec1ee	btrfs: extent-tree: Fix a bug that btrfs is unable to add pinned bytes Commit `ddf30cf03f` ("btrfs: extent-tree: Use btrfs_ref to refactor add_pinned_bytes()") refactored add_pinned_bytes(), but during that refactor, there are two callers which add the pinned bytes instead of subtracting. That refactor misses those two caller, causing incorrect pinned bytes calculation and resulting unexpected ENOSPC error. Fix it by adding a new parameter @sign to restore the original behavior. Reported-by: kernel test robot <rong.a.chen@intel.com> Fixes: `ddf30cf03f` ("btrfs: extent-tree: Use btrfs_ref to refactor add_pinned_bytes()") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:31:13 +02:00
Tobin C. Harding	e32773357d	btrfs: sysfs: don't leak memory when failing add fsid A failed call to kobject_init_and_add() must be followed by a call to kobject_put(). Currently in the error path when adding fs_devices we are missing this call. This could be fixed by calling btrfs_sysfs_remove_fsid() if btrfs_sysfs_add_fsid() returns an error or by adding a call to kobject_put() directly in btrfs_sysfs_add_fsid(). Here we choose the second option because it prevents the slightly unusual error path handling requirements of kobject from leaking out into btrfs functions. Add a call to kobject_put() in the error path of kobject_add_and_init(). This causes the release method to be called if kobject_init_and_add() fails. open_tree() is the function that calls btrfs_sysfs_add_fsid() and the error code in this function is already written with the assumption that the release method is called during the error path of open_tree() (as seen by the call to btrfs_sysfs_remove_fsid() under the fail_fsdev_sysfs label). Cc: stable@vger.kernel.org # v4.4+ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tobin C. Harding <tobin@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:31:12 +02:00
Tobin C. Harding	450ff83488	btrfs: sysfs: Fix error path kobject memory leak If a call to kobject_init_and_add() fails we must call kobject_put() otherwise we leak memory. Calling kobject_put() when kobject_init_and_add() fails drops the refcount back to 0 and calls the ktype release method (which in turn calls the percpu destroy and kfree). Add call to kobject_put() in the error path of call to kobject_init_and_add(). Cc: stable@vger.kernel.org # v4.4+ Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tobin C. Harding <tobin@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-05-16 14:31:01 +02:00

... 4 5 6 7 8 ...

839762 Commits All Branches Search

839762 Commits

All Branches