mirror of https://gitee.com/openkylin/linux.git
30 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | 3d8dfe75ef |
arm64 updates for 5.1:
- Pseudo NMI support for arm64 using GICv3 interrupt priorities - uaccess macros clean-up (unsafe user accessors also merged but reverted, waiting for objtool support on arm64) - ptrace regsets for Pointer Authentication (ARMv8.3) key management - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the riscv maintainers) - arm64/perf updates: PMU bindings converted to json-schema, unused variable and misleading comment removed - arm64/debug fixes to ensure checking of the triggering exception level and to avoid the propagation of the UNKNOWN FAR value into the si_code for debug signals - Workaround for Fujitsu A64FX erratum 010001 - lib/raid6 ARM NEON optimisations - NR_CPUS now defaults to 256 on arm64 - Minor clean-ups (documentation/comments, Kconfig warning, unused asm-offsets, clang warnings) - MAINTAINERS update for list information to the ARM64 ACPI entry -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlyCl0cACgkQa9axLQDI XvEyKxAAiogBZLbyhcy8bTUHVzVoJE0FyAkdO2wWnnaff2Ohkhy1Y/npv33IeK2q RknxqDIx2DUUVPJNRZGoI/WwBtTZdKaAnW4rIKG84yC1eAkFcd96WQasaZzcp1qY HmvbJiYXM0bh+0J7i3Wgry/QzOkrltJFJW2kp6Wd5aFE+R1WyWyxT6d+Fp0J3vlA bT70jlpBK6LXEOmmBS+04Ml02+8MvaGxIl8EInBHSfDLRLErj5E8n41rRHKUiSWz maWI+kVoLYwOE68xiZlDftUBEeQpUSWgg2nxeK+640QSl1wJmVcRcY9nm6TZeMG2 AiZTR9a7cP5rrdSN5suUmb7d4AMMVlVMisGDlwb+9oCxeTRDzg0uwACaVgHfPqQr UeBdHbL9nStN7uBH23H8L9mKk+tqpFmk0sgzdrKejOwysAiqWV8aazb/Na3qnVRl J1B5opxMnGOsjXmHvtG/tiZl281Uwz5ZmzfLmIY3gUZgUgdA3511Egp0ry5y1dzJ SkYC4Hmzb2ybQvXGIDDa3OzCwXXiqyqKsO+O8Egg1k4OIwbp3w+NHE7gKeA+dMgD gjN7zEalCUi46Q28xiCPEb+88BpQ18czIWGQLb9mAnmYeZPjqqenXKXuRHr4lgVe jPURJ/vqvFEglZJN1RDuQHKzHEcm5f2XE566sMZYdSoeiUCb0QM= =2U56 -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Pseudo NMI support for arm64 using GICv3 interrupt priorities - uaccess macros clean-up (unsafe user accessors also merged but reverted, waiting for objtool support on arm64) - ptrace regsets for Pointer Authentication (ARMv8.3) key management - inX() ordering w.r.t. delay() on arm64 and riscv (acks in place by the riscv maintainers) - arm64/perf updates: PMU bindings converted to json-schema, unused variable and misleading comment removed - arm64/debug fixes to ensure checking of the triggering exception level and to avoid the propagation of the UNKNOWN FAR value into the si_code for debug signals - Workaround for Fujitsu A64FX erratum 010001 - lib/raid6 ARM NEON optimisations - NR_CPUS now defaults to 256 on arm64 - Minor clean-ups (documentation/comments, Kconfig warning, unused asm-offsets, clang warnings) - MAINTAINERS update for list information to the ARM64 ACPI entry * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits) arm64: mmu: drop paging_init comments arm64: debug: Ensure debug handlers check triggering exception level arm64: debug: Don't propagate UNKNOWN FAR into si_code for debug signals Revert "arm64: uaccess: Implement unsafe accessors" arm64: avoid clang warning about self-assignment arm64: Kconfig.platforms: fix warning unmet direct dependencies lib/raid6: arm: optimize away a mask operation in NEON recovery routine lib/raid6: use vdupq_n_u8 to avoid endianness warnings arm64: io: Hook up __io_par() for inX() ordering riscv: io: Update __io_[p]ar() macros to take an argument asm-generic/io: Pass result of I/O accessor to __io_[p]ar() arm64: Add workaround for Fujitsu A64FX erratum 010001 arm64: Rename get_thread_info() arm64: Remove documentation about TIF_USEDFPU arm64: irqflags: Fix clang build warnings arm64: Enable the support of pseudo-NMIs arm64: Skip irqflags tracing for NMI in IRQs disabled context arm64: Skip preemption when exiting an NMI arm64: Handle serror in NMI context irqchip/gic-v3: Allow interrupts to be set as pseudo-NMI ... |
|
Zhang Lei | 3e32131abc |
arm64: Add workaround for Fujitsu A64FX erratum 010001
On the Fujitsu-A64FX cores ver(1.0, 1.1), memory access may cause an undefined fault (Data abort, DFSC=0b111111). This fault occurs under a specific hardware condition when a load/store instruction performs an address translation. Any load/store instruction, except non-fault access including Armv8 and SVE might cause this undefined fault. The TCR_ELx.NFD1 bit is used by the kernel when CONFIG_RANDOMIZE_BASE is enabled to mitigate timing attacks against KASLR where the kernel address space could be probed using the FFR and suppressed fault on SVE loads. Since this erratum causes spurious exceptions, which may corrupt the exception registers, we clear the TCR_ELx.NFDx=1 bits when booting on an affected CPU. Signed-off-by: Zhang Lei <zhang.lei@jp.fujitsu.com> [Generated MIDR value/mask for __cpu_setup(), removed spurious-fault handler and always disabled the NFDx bits on affected CPUs] Signed-off-by: James Morse <james.morse@arm.com> Tested-by: zhang.lei <zhang.lei@jp.fujitsu.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
|
Samuel Holland | c950ca8c35 |
clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability
The Allwinner A64 SoC is known[1] to have an unstable architectural timer, which manifests itself most obviously in the time jumping forward a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a timer frequency of 24 MHz, implying that the time went slightly backward (and this was interpreted by the kernel as it jumping forward and wrapping around past the epoch). Investigation revealed instability in the low bits of CNTVCT at the point a high bit rolls over. This leads to power-of-two cycle forward and backward jumps. (Testing shows that forward jumps are about twice as likely as backward jumps.) Since the counter value returns to normal after an indeterminate read, each "jump" really consists of both a forward and backward jump from the software perspective. Unless the kernel is trapping CNTVCT reads, a userspace program is able to read the register in a loop faster than it changes. A test program running on all 4 CPU cores that reported jumps larger than 100 ms was run for 13.6 hours and reported the following: Count | Event -------+--------------------------- 9940 | jumped backward 699ms 268 | jumped backward 1398ms 1 | jumped backward 2097ms 16020 | jumped forward 175ms 6443 | jumped forward 699ms 2976 | jumped forward 1398ms 9 | jumped forward 356516ms 9 | jumped forward 357215ms 4 | jumped forward 714430ms 1 | jumped forward 3578440ms This works out to a jump larger than 100 ms about every 5.5 seconds on each CPU core. The largest jump (almost an hour!) was the following sequence of reads: 0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000 Note that the middle bits don't necessarily all read as all zeroes or all ones during the anomalous behavior; however the low 10 bits checked by the function in this patch have never been observed with any other value. Also note that smaller jumps are much more common, with backward jumps of 2048 (2^11) cycles observed over 400 times per second on each core. (Of course, this is partially explained by lower bits rolling over more frequently.) Any one of these could have caused the 95 year time skip. Similar anomalies were observed while reading CNTPCT (after patching the kernel to allow reads from userspace). However, the CNTPCT jumps are much less frequent, and only small jumps were observed. The same program as before (except now reading CNTPCT) observed after 72 hours: Count | Event -------+--------------------------- 17 | jumped backward 699ms 52 | jumped forward 175ms 2831 | jumped forward 699ms 5 | jumped forward 1398ms Further investigation showed that the instability in CNTPCT/CNTVCT also affected the respective timer's TVAL register. The following values were observed immediately after writing CNVT_TVAL to 0x10000000: CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000 0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000 0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000 0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000 The pattern of errors in CNTV_TVAL seemed to depend on exactly which value was written to it. For example, after writing 0x10101010: CNTVCT | CNTV_TVAL | CNTV_CVAL | CNTV_TVAL Error --------------------+------------+--------------------+----------------- 0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000 0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000 0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000 0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000 0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000 0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000 I was also twice able to reproduce the issue covered by Allwinner's workaround[4], that writing to TVAL sometimes fails, and both CVAL and TVAL are left with entirely bogus values. One was the following values: CNTVCT | CNTV_TVAL | CNTV_CVAL --------------------+------------+-------------------------------------- 0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past) Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> ======================================================================== Because the CPU can read the CNTPCT/CNTVCT registers faster than they change, performing two reads of the register and comparing the high bits (like other workarounds) is not a workable solution. And because the timer can jump both forward and backward, no pair of reads can distinguish a good value from a bad one. The only way to guarantee a good value from consecutive reads would be to read _three_ times, and take the middle value only if the three values are 1) each unique and 2) increasing. This takes at minimum 3 counter cycles (125 ns), or more if an anomaly is detected. However, since there is a distinct pattern to the bad values, we can optimize the common case (1022/1024 of the time) to a single read by simply ignoring values that match the error pattern. This still takes no more than 3 cycles in the worst case, and requires much less code. As an additional safety check, we still limit the loop iteration to the number of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods. For the TVAL registers, the simple solution is to not use them. Instead, read or write the CVAL and calculate the TVAL value in software. Although the manufacturer is aware of at least part of the erratum[4], there is no official name for it. For now, use the kernel-internal name "UNKNOWN1". [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9 [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/ [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26 [4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272 Acked-by: Maxime Ripard <maxime.ripard@bootlin.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> |
|
Linus Torvalds | 5694cecdb0 |
arm64 festive updates for 4.21
In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJcE4TmAAoJELescNyEwWM0Nr0H/iaU7/wQSzHyNXtZoImyKTul Blu2ga4/EqUrTU7AVVfmkl/3NBILWlgQVpY6tH6EfXQuvnxqD7CizbHyLdyO+z0S B5PsFUH2GLMNAi48AUNqGqkgb2knFbg+T+9IimijDBkKg1G/KhQnRg6bXX32mLJv Une8oshUPBVJMsHN1AcQknzKariuoE3u0SgJ+eOZ9yA2ZwKxP4yy1SkDt3xQrtI0 lojeRjxcyjTP1oGRNZC+BWUtGOT35p7y6cGTnBd/4TlqBGz5wVAJUcdoxnZ6JYVR O8+ob9zU+4I0+SKt80s7pTLqQiL9rxkKZ5joWK1pr1g9e0s5N5yoETXKFHgJYP8= =sYdt -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 festive updates from Will Deacon: "In the end, we ended up with quite a lot more than I expected: - Support for ARMv8.3 Pointer Authentication in userspace (CRIU and kernel-side support to come later) - Support for per-thread stack canaries, pending an update to GCC that is currently undergoing review - Support for kexec_file_load(), which permits secure boot of a kexec payload but also happens to improve the performance of kexec dramatically because we can avoid the sucky purgatory code from userspace. Kdump will come later (requires updates to libfdt). - Optimisation of our dynamic CPU feature framework, so that all detected features are enabled via a single stop_machine() invocation - KPTI whitelisting of Cortex-A CPUs unaffected by Meltdown, so that they can benefit from global TLB entries when KASLR is not in use - 52-bit virtual addressing for userspace (kernel remains 48-bit) - Patch in LSE atomics for per-cpu atomic operations - Custom preempt.h implementation to avoid unconditional calls to preempt_schedule() from preempt_enable() - Support for the new 'SB' Speculation Barrier instruction - Vectorised implementation of XOR checksumming and CRC32 optimisations - Workaround for Cortex-A76 erratum #1165522 - Improved compatibility with Clang/LLD - Support for TX2 system PMUS for profiling the L3 cache and DMC - Reflect read-only permissions in the linear map by default - Ensure MMIO reads are ordered with subsequent calls to Xdelay() - Initial support for memory hotplug - Tweak the threshold when we invalidate the TLB by-ASID, so that mremap() performance is improved for ranges spanning multiple PMDs. - Minor refactoring and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (125 commits) arm64: kaslr: print PHYS_OFFSET in dump_kernel_offset() arm64: sysreg: Use _BITUL() when defining register bits arm64: cpufeature: Rework ptr auth hwcaps using multi_entry_cap_matches arm64: cpufeature: Reduce number of pointer auth CPU caps from 6 to 4 arm64: docs: document pointer authentication arm64: ptr auth: Move per-thread keys from thread_info to thread_struct arm64: enable pointer authentication arm64: add prctl control for resetting ptrauth keys arm64: perf: strip PAC when unwinding userspace arm64: expose user PAC bit positions via ptrace arm64: add basic pointer authentication support arm64/cpufeature: detect pointer authentication arm64: Don't trap host pointer auth use to EL2 arm64/kvm: hide ptrauth from guests arm64/kvm: consistently handle host HCR_EL2 flags arm64: add pointer authentication register bits arm64: add comments about EC exception levels arm64: perf: Treat EXCLUDE_EL* bit definitions as unsigned arm64: kpti: Whitelist Cortex-A CPUs that don't implement the CSV3 field arm64: enable per-task stack canaries ... |
|
Marc Zyngier | a457b0f7f5 |
arm64: Add configuration/documentation for Cortex-A76 erratum 1165522
Now that the infrastructure to handle erratum 1165522 is in place, let's make it a selectable option and add the required documentation. Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Catalin Marinas | ce8c80c536 |
arm64: Add workaround for Cortex-A76 erratum 1286807
On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual address for a cacheable mapping of a location is being accessed by a core while another core is remapping the virtual address to a new physical page using the recommended break-before-make sequence, then under very rare circumstances TLBI+DSB completes before a read using the translation being invalidated has been observed by other observers. The workaround repeats the TLBI+DSB operation and is shared with the Qualcomm Falkor erratum 1009 Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
|
Marc Zyngier | e03a4e5bb7 |
arm64: Add silicon-errata.txt entry for ARM erratum 1188873
Document that we actually work around ARM erratum 1188873
Fixes:
|
|
Suzuki K Poulose | ece1397cbc |
arm64: Add work around for Arm Cortex-A55 Erratum 1024718
Some variants of the Arm Cortex-55 cores (r0p0, r0p1, r1p0) suffer from an erratum 1024718, which causes incorrect updates when DBM/AP bits in a page table entry is modified without a break-before-make sequence. The work around is to skip enabling the hardware DBM feature on the affected cores. The hardware Access Flag management features is not affected. There are some other cores suffering from this errata, which could be added to the midr_list to trigger the work around. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: ckadabi@codeaurora.org Reviewed-by: Dave Martin <dave.martin@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Linus Torvalds | 0aebc6a440 |
arm64 updates for 4.16:
- Security mitigations: - variant 2: invalidating the branch predictor with a call to secure firmware - variant 3: implementing KPTI for arm64 - 52-bit physical address support for arm64 (ARMv8.2) - arm64 support for RAS (firmware first only) and SDEI (software delegated exception interface; allows firmware to inject a RAS error into the OS) - Perf support for the ARM DynamIQ Shared Unit PMU - CPUID and HWCAP bits updated for new floating point multiplication instructions in ARMv8.4 - Removing some virtual memory layout printks during boot - Fix initial page table creation to cope with larger than 32M kernel images when 16K pages are enabled -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlpwxDMACgkQa9axLQDI XvF55BAAniMpxPXnYNfv6l7/4O8eKo1lJIaG1wbej4JRZ/rT3K4Z3OBXW1dKHO8d /PTbVmZ90IqIGROkoDrE+6xyjjn9yK3uuW4ytN2zQkBa8VFaHAnHlX+zKQcuwy9f yxwiHk+C7vK5JR7mpXTazjRknsUv1MPtlTt7DQrSdq0KRDJVDNFC+grmbew2rz0X cjQDqZqgzuFyrKxdiQVjDmc3zH9NsNBhDo0hlGHf2jK6bGJsAPtI8M2JcLrK8ITG Ye/dD7BJp1mWD8ff0BPaMxu24qfAMNLH8f2dpTa986/H78irVz7i/t5HG0/1+5Jh EE4OFRTKZ59Qgyo1zWcaJvdp8YjiaX/L4PWJg8CxM5OhP9dIac9ydcFQfWzpKpUs xyZfmK6XliGFReAkVOOf5tEqFUDhMtsqhzPYmbmU1lp61wmSYIZ8CTenpWWCJSRO NOGyG1X2uFBvP69+iPNlfTGz1r7tg1URY5iO8fUEIhY8LrgyORkiqw4OvPEgnMXP Ngy+dXhyvnps2AAWbSX0O4puRlTgEYLT5KaMLzH/+gWsXATT0rzUCD/aOwUQq/Y7 SWXZHkb3jpmOZZnzZsLL2MNzEIPCFBwSUE9fSv4dA9d/N6tUmlmZALJjHkfzCDpj +mPsSmAMTj72kUYzm0b5GCtOu/iQ2kDWOZjOM1m4+v/B+f7JoEE= =iEjP -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The main theme of this pull request is security covering variants 2 and 3 for arm64. I expect to send additional patches next week covering an improved firmware interface (requires firmware changes) for variant 2 and way for KPTI to be disabled on unaffected CPUs (Cavium's ThunderX doesn't work properly with KPTI enabled because of a hardware erratum). Summary: - Security mitigations: - variant 2: invalidate the branch predictor with a call to secure firmware - variant 3: implement KPTI for arm64 - 52-bit physical address support for arm64 (ARMv8.2) - arm64 support for RAS (firmware first only) and SDEI (software delegated exception interface; allows firmware to inject a RAS error into the OS) - perf support for the ARM DynamIQ Shared Unit PMU - CPUID and HWCAP bits updated for new floating point multiplication instructions in ARMv8.4 - remove some virtual memory layout printks during boot - fix initial page table creation to cope with larger than 32M kernel images when 16K pages are enabled" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits) arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm arm64: Turn on KPTI only on CPUs that need it arm64: Branch predictor hardening for Cavium ThunderX2 arm64: Run enable method for errata work arounds on late CPUs arm64: Move BP hardening to check_and_switch_context arm64: mm: ignore memory above supported physical address size arm64: kpti: Fix the interaction between ASID switching and software PAN KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA KVM: arm64: Handle RAS SErrors from EL2 on guest exit KVM: arm64: Handle RAS SErrors from EL1 on guest exit KVM: arm64: Save ESR_EL2 on guest SError KVM: arm64: Save/Restore guest DISR_EL1 KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. KVM: arm/arm64: mask/unmask daif around VHE guests arm64: kernel: Prepare for a DISR user arm64: Unconditionally enable IESB on exception entry/return for firmware-first arm64: kernel: Survive corrected RAS errors notified by SError arm64: cpufeature: Detect CPU RAS Extentions arm64: sysreg: Move to use definitions for all the SCTLR bits arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early ... |
|
Stephen Boyd | bb48711800 |
arm64: cpu_errata: Add Kryo to Falkor 1003 errata
The Kryo CPUs are also affected by the Falkor 1003 errata, so
we need to do the same workaround on Kryo CPUs. The MIDR is
slightly more complicated here, where the PART number is not
always the same when looking at all the bits from 15 to 4. Drop
the lower 8 bits and just look at the top 4 to see if it's '2'
and then consider those as Kryo CPUs. This covers all the
combinations without having to list them all out.
Fixes:
|
|
Shanker Donthineni | 932b50c7c1 |
arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter 4K and next 4K. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occur: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disabling its stage-1 translation if its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Marc Zyngier | 5c9a882e94 |
irqchip/gic-v3-its: Workaround HiSilicon Hip07 redistributor addressing
The ITSes on the Hip07 (as present in the Huawei D05) are broken when it comes to addressing the redistributors, and need to be explicitely told to address the VLPI page instead of the redistributor base address. So let's add yet another quirk, fixing up the target address in the command stream. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> |
|
Linus Torvalds | fb4e3beeff |
IOMMU Updates for Linux v4.13
This update comes with: * Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM * Some Errata workarounds for ARM SMMU implemenations * Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before * Support for amd_iommu=off when booting with kexec. Problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel * The Rockchip IOMMU driver is now available on ARM64 * Align the return value of the iommu_ops->device_group call-backs to not miss error values * Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT * Various other small cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZZgddAAoJECvwRC2XARrjurgQANO338GIBr2ZkA0oectidDpZ Y4yu7W9RH6NyhupJG/Xooya7daBWFjbaA1AVJ3ZZNlMERh69AmehVfRUfVMzF2w+ buma58HQgiJWN1zFD8xdeMzYKms9P77whA88C/9QvrK/klB3LipWP2SC0yvvvyxJ mMCDpgt+D+CGnIDqbRuyLDQoRu3yjAkAvYb6OzL8DPJVP1Y5oLffGwGnHzJbJnOf eWJwYHM5ai0uF/Qqy6RNNekacObjVaOLihjugGvokH6ipXfOrSSNriXW9pZiWR5m S91898YTP3KuWWsJM+N93UAjvc6pL9PqL/OvbB9zdYpzu+5PtUpFXHYcOebKyEEO 4j9CaRzubsWFTFjbYItJnR4WgXQRf4NKOGfTfHMHA+dY8aODYnlXNVdQDAA2aFgn TUBvHq5xb0zZ3nbPwtTDyW06oDMVfBBarLx2yFI1aQSSh+eg/GtIi5KP28gyFZNz 4gWj0q3g/e3y7WEwNbYV7L3TS0d/p8VUYFtUp7PUCddnWoY+4cJzgidub5xIViZD Ql0nZzga9pXXIE/kE5Pf74WqrG7JJzZsvK2ABy4+XGrMq6RclJf+0pXbSqiXDpXL quw8t0oXw0ZEeavQ31Za8mjXBvo5ocM5iintl1wrl2BujHEO3oKqbGsIOaRcLnlN Ukehbl4OEKzZpD3oLPPk =pmBf -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "This update comes with: - Support for lockless operation in the ARM io-pgtable code. This is an important step to solve the scalability problems in the common dma-iommu code for ARM - Some Errata workarounds for ARM SMMU implemenations - Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver. The code suffered from very high flush rates, with the new implementation the flush rate is down to ~1% of what it was before - Support for amd_iommu=off when booting with kexec. The problem here was that the IOMMU driver bailed out early without disabling the iommu hardware, if it was enabled in the old kernel - The Rockchip IOMMU driver is now available on ARM64 - Align the return value of the iommu_ops->device_group call-backs to not miss error values - Preempt-disable optimizations in the Intel VT-d and common IOVA code to help Linux-RT - Various other small cleanups and fixes" * tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits) iommu/vt-d: Constify intel_dma_ops iommu: Warn once when device_group callback returns NULL iommu/omap: Return ERR_PTR in device_group call-back iommu: Return ERR_PTR() values from device_group call-backs iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device() iommu/vt-d: Don't disable preemption while accessing deferred_flush() iommu/iova: Don't disable preempt around this_cpu_ptr() iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126 iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701) iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74 ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE iommu/arm-smmu-v3: Remove io-pgtable spinlock iommu/arm-smmu: Remove io-pgtable spinlock iommu/io-pgtable-arm-v7s: Support lockless operation iommu/io-pgtable-arm: Support lockless operation iommu/io-pgtable: Introduce explicit coherency iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap ... |
|
Geetha Sowjanya | f935448acf |
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
Cavium ThunderX2 SMMU doesn't support MSI and also doesn't have unique irq lines for gerror, eventq and cmdq-sync. New named irq "combined" is set as a errata workaround, which allows to share the irq line by register single irq handler for all the interrupts. Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Geetha sowjanya <gakula@caviumnetworks.com> [will: reworked irq equality checking and added SPI check] Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
shameer | 99caf177f6 |
iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum 161010701)
HiSilicon SMMUv3 on Hip06/Hip07 platforms doesn't support CMD_PREFETCH command. The dt based support for this quirk is already present in the driver(hisilicon,broken-prefetch-cmd). This adds ACPI support for the quirk using the IORT smmu model number. Signed-off-by: shameer <shameerali.kolothum.thodi@huawei.com> Signed-off-by: hanjun <guohanjun@huawei.com> [will: rewrote patch] Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Linu Cherian | e5b829de05 |
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74
Cavium ThunderX2 SMMU implementation doesn't support page 1 register space and PAGE0_REGS_ONLY option is enabled as an errata workaround. This option when turned on, replaces all page 1 offsets used for EVTQ_PROD/CONS, PRIQ_PROD/CONS register access with page 0 offsets. SMMU resource size checks are now based on SMMU option PAGE0_REGS_ONLY, since resource size can be either 64k/128k. For this, arm_smmu_device_dt_probe/acpi_probe has been moved before platform_get_resource call, so that SMMU options are set beforehand. Signed-off-by: Linu Cherian <linu.cherian@cavium.com> Signed-off-by: Geetha Sowjanya <geethasowjanya.akula@cavium.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
David Daney | 690a341577 |
arm64: Add workaround for Cavium Thunder erratum 30115
Some Cavium Thunder CPUs suffer a problem where a KVM guest may inadvertently cause the host kernel to quit receiving interrupts. Use the Group-0/1 trapping in order to deal with it. [maz]: Adapted patch to the Group-0/1 trapping, reworked commit log Tested-by: Alexander Graf <agraf@suse.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> |
|
Marc Zyngier | eeb1efbcb8 |
arm64: cpu_errata: Add capability to advertise Cortex-A73 erratum 858921
In order to work around Cortex-A73 erratum 858921 in a subsequent patch, add the required capability that advertise the erratum. As the configuration option it depends on is not present yet, this has no immediate effect. Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> |
|
Shanker Donthineni | 90922a2d03 |
irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065
On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware implementation uses 16Bytes for Interrupt Translation Entry (ITE), but reports an incorrect value of 8Bytes in GITS_TYPER.ITTE_size. It might cause kernel memory corruption depending on the number of MSI(x) that are configured and the amount of memory that has been allocated for ITEs in its_create_device(). This patch fixes the potential memory corruption by setting the correct ITE size to 16Bytes. Cc: stable@vger.kernel.org Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> |
|
Christopher Covington | 38fd94b027 |
arm64: Work around Falkor erratum 1003
The Qualcomm Datacenter Technologies Falkor v1 CPU may allocate TLB entries using an incorrect ASID when TTBRx_EL1 is being updated. When the erratum is triggered, page table entries using the new translation table base address (BADDR) will be allocated into the TLB using the old ASID. All circumstances leading to the incorrect ASID being cached in the TLB arise when software writes TTBRx_EL1[ASID] and TTBRx_EL1[BADDR], a memory operation is in the process of performing a translation using the specific TTBRx_EL1 being written, and the memory operation uses a translation table descriptor designated as non-global. EL2 and EL3 code changing the EL1&0 ASID is not subject to this erratum because hardware is prohibited from performing translations from an out-of-context translation regime. Consider the following pseudo code. write new BADDR and ASID values to TTBRx_EL1 Replacing the above sequence with the one below will ensure that no TLB entries with an incorrect ASID are used by software. write reserved value to TTBRx_EL1[ASID] ISB write new value to TTBRx_EL1[BADDR] ISB write new value to TTBRx_EL1[ASID] ISB When the above sequence is used, page table entries using the new BADDR value may still be incorrectly allocated into the TLB using the reserved ASID. Yet this will not reduce functionality, since TLB entries incorrectly tagged with the reserved ASID will never be hit by a later instruction. Based on work by Shanker Donthineni <shankerd@codeaurora.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Ding Tianhong | 6e01398fe4 |
arm64: arch_timer: document Hisilicon erratum 161010101
Now that we have a workaround for Hisilicon erratum 161010101, notes this in the arm64 silicon-errata document. The new config option is too long to fit in the existing kconfig column, so this is widened to accomodate it. At the same time, an existing whitespace error is corrected, and the existing pattern of a line space between vendors is enforced for recent additions. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> [Mark: split patch, reword commit message, rework table] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Christopher Covington | d9ff80f83e |
arm64: Work around Falkor erratum 1009
During a TLB invalidate sequence targeting the inner shareable domain, Falkor may prematurely complete the DSB before all loads and stores using the old translation are observed. Instruction fetches are not subject to the conditions of this erratum. If the original code sequence includes multiple TLB invalidate instructions followed by a single DSB, onle one of the TLB instructions needs to be repeated to work around this erratum. While the erratum only applies to cases in which the TLBI specifies the inner-shareable domain (*IS form of TLBI) and the DSB is ISH form or stronger (OSH, SYS), this changes applies the workaround overabundantly-- to local TLBI, DSB NSH sequences as well--for simplicity. Based on work by Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Christopher Covington <cov@codeaurora.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Linus Torvalds | 7af8a0f808 |
arm64 updates for 4.9:
- Support for execute-only page permissions - Support for hibernate and DEBUG_PAGEALLOC - Support for heterogeneous systems with mismatches cache line sizes - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug) - arm64 PMU perf updates, including cpumasks for heterogeneous systems - Set UTS_MACHINE for building rpm packages - Yet another head.S tidy-up - Some cleanups and refactoring, particularly in the NUMA code - Lots of random, non-critical fixes across the board -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJX7k31AAoJELescNyEwWM0XX0H/iOaWCfKlWOhvBsStGUCsLrK XryTzQT2KjdnLKf3jwP+1ateCuBR5ROurYxoDCX5/7mD63c5KiI338Vbv61a1lE1 AAwjt1stmQVUg/j+kqnuQwB/0DYg+2C8se3D3q5Iyn7zc19cDZJEGcBHNrvLMufc XgHrgHgl/rzBDDlHJXleknDFge/MfhU5/Q1vJMRRb4JYrpAtmIokzCO75CYMRcCT ND2QbmppKtsyuFPGUTVbAFzJlP6dGKb3eruYta7/ct5d0pJQxav3u98D2yWGfjdM YaYq1EmX5Pol7rWumqLtk0+mA9yCFcKLLc+PrJu20Vx0UkvOq8G8Xt70sHNvZU8= =gdPM -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "It's a bit all over the place this time with no "killer feature" to speak of. Support for mismatched cache line sizes should help people seeing whacky JIT failures on some SoCs, and the big.LITTLE perf updates have been a long time coming, but a lot of the changes here are cleanups. We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer workaround is acked by Russell, the DT/OF bits are acked by Rob, the arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and jump_label by Peter (all CC'd). Summary: - Support for execute-only page permissions - Support for hibernate and DEBUG_PAGEALLOC - Support for heterogeneous systems with mismatches cache line sizes - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug) - arm64 PMU perf updates, including cpumasks for heterogeneous systems - Set UTS_MACHINE for building rpm packages - Yet another head.S tidy-up - Some cleanups and refactoring, particularly in the NUMA code - Lots of random, non-critical fixes across the board" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits) arm64: tlbflush.h: add __tlbi() macro arm64: Kconfig: remove SMP dependence for NUMA arm64: Kconfig: select OF/ACPI_NUMA under NUMA config arm64: fix dump_backtrace/unwind_frame with NULL tsk arm/arm64: arch_timer: Use archdata to indicate vdso suitability arm64: arch_timer: Work around QorIQ Erratum A-008585 arm64: arch_timer: Add device tree binding for A-008585 erratum arm64: Correctly bounds check virt_addr_valid arm64: migrate exception table users off module.h and onto extable.h arm64: pmu: Hoist pmu platform device name arm64: pmu: Probe default hw/cache counters arm64: pmu: add fallback probe table MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry arm64: Improve kprobes test for atomic sequence arm64/kvm: use alternative auto-nop arm64: use alternative auto-nop arm64: alternative: add auto-nop infrastructure arm64: lse: convert lse alternatives NOP padding to use __nops arm64: barriers: introduce nops and __nops macros for NOP sequences arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s ... |
|
Scott Wood | f6dc1576cd |
arm64: arch_timer: Work around QorIQ Erratum A-008585
Erratum A-008585 says that the ARM generic timer counter "has the potential to contain an erroneous value for a small number of core clock cycles every time the timer value changes". Accesses to TVAL (both read and write) are also affected due to the implicit counter read. Accesses to CVAL are not affected. The workaround is to reread TVAL and count registers until successive reads return the same value. Writes to TVAL are replaced with an equivalent write to CVAL. The workaround is to reread TVAL and count registers until successive reads return the same value, and when writing TVAL to retry until counter reads before and after the write return the same value. The workaround is enabled if the fsl,erratum-a008585 property is found in the timer node in the device tree. This can be overridden with the clocksource.arm_arch_timer.fsl-a008585 boot parameter, which allows KVM users to enable the workaround until a mechanism is implemented to automatically communicate this information. This erratum can be found on LS1043A and LS2080A. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Scott Wood <oss@buserror.net> [will: renamed read macro to reflect that it's not usually unstable] Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Marc Zyngier | 674e701270 |
arm64: Document workaround for Cortex-A72 erratum #853709
We already have a workaround for Cortex-A57 erratum #852523, but Cortex-A72 r0p0 to r0p2 do suffer from the same issue (known as erratum #853709). Let's document the fact that we already handle this. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> |
|
Ganapatrao Kulkarni | fbf8f40e16 |
irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144
The erratum fixes the hang of ITS SYNC command by avoiding inter node io and collections/cpu mapping on thunderx dual-socket platform. This fix is only applicable for Cavium's ThunderX dual-socket platform. Reviewed-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Signed-off-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> |
|
Robin Murphy | f0cfffc48c |
iommu/arm-smmu: Work around MMU-500 prefetch errata
MMU-500 erratum #841119 is tickled by a particular set of circumstances interacting with the next-page prefetcher. Since said prefetcher is quite dumb and actually detrimental to performance in some cases (by causing unwanted TLB evictions for non-sequential access patterns), we lose very little by turning it off, and what we gain is a guarantee that the erratum is never hit. As a bonus, the same workaround will also prevent erratum #826419 once v7 short descriptor support is implemented. CC: Catalin Marinas <catalin.marinas@arm.com> CC: Will Deacon <will.deacon@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Tirumalesh Chalamarla | 1bd37a6835 |
iommu/arm-smmu: Workaround for ThunderX erratum #27704
Due to erratum #27704, the CN88xx SMMUv2 implementation supports only shared ASID and VMID numberspaces. This patch ensures that ASID and VMIDs are unique across all SMMU instances on affected Cavium systems. Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com> Signed-off-by: Akula Geethasowjanya <Geethasowjanya.Akula@caviumnetworks.com> [will: commit message, comments and formatting] Signed-off-by: Will Deacon <will.deacon@arm.com> |
|
Andrew Pinski | 104a0c02e8 |
arm64: Add workaround for Cavium erratum 27456
On ThunderX T88 pass 1.x through 2.1 parts, broadcast TLBI instructions may cause the icache to become corrupted if it contains data for a non-current ASID. This patch implements the workaround (which invalidates the local icache when switching the mm) by using code patching. Signed-off-by: Andrew Pinski <apinski@cavium.com> Signed-off-by: David Daney <david.daney@cavium.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
|
Will Deacon | 9cb9c9e5ba |
arm64: Documentation: add list of software workarounds for errata
It's not immediately obvious which hardware errata are worked around in the Linux kernel for an arbitrary kernel tree, so add a file to keep track of what we're working around. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> |