This pull requests contains a consolidation of the generic no-IOMMU code,
a well as the glue code for swiotlb. All the code is based on the x86
implementation with hooks to allow all architectures that aren't cache
coherent to use it. The x86 conversion itself has been deferred because
the x86 maintainers were a little busy in the last months.
-----BEGIN PGP SIGNATURE-----
iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlpxcVoLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYN/Lw/+Je9teM4NPQ8lU/ncbJN/bUzCFGJ6dFt2eVX/6xs3
sfl8vBdeHt6CBM02rRNecEr31z3+orjQes5JnlEJFYeG3jumV0zCPw/zbxqjzbJ1
3n6cckLxbxzy8Ca1G/BVjHLAUX5eWp1ujn/Q4d03VKVQZhJvFYlqDbP3TrNVx7xn
k86u37p/o+ngjwX66UdZ3C4iIBF8zqy6n2kkpv4HUQtHHzPwEvliN39eNilovb56
iGOzjDX1UWHAu4xCTVnPHSG4fA4XU41NWzIN3DIVPE25lYSISSl9TFAdR8GeZA0G
0Yj6sW53pRSoUwco1ocoS44/FgrPOB5/vHIL06pABvicXBiomje1QylqcK7zAczk
esjkfPEZrmZuu99GtqFyDNKEvKKdy+aBGaTZ3y+NxsuBs+0xS2Owz1IE4Tk28xaw
xh7zn+CVdk2fJh6ZIdw5Eu9b9VN08UriqDmDzO/ylDlcNGcDi7wcxiSTEkHJ1ON/
g9nletV6f3egL0wljDcOnhCJCHTvmWEeq3z8lE55QzPzSH0hHpnGQ2WD0tKrroxz
kjOZp0TdXa4F5iysOHe2xl2sftOH0zIkBQJ+oBcK12mTaLu21+yeuCggQXJ/CBdk
1Ol7l9g9T0TDuZPfiTHt5+6jmECQs92LElWA8x7uF7Fpix3BpnafWaaSMSsosF3F
D1Y=
=Nrl9
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping updates from Christoph Hellwig:
"Except for a runtime warning fix from Christian this is all about
consolidation of the generic no-IOMMU code, a well as the glue code
for swiotlb.
All the code is based on the x86 implementation with hooks to allow
all architectures that aren't cache coherent to use it.
The x86 conversion itself has been deferred because the x86
maintainers were a little busy in the last months"
* tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping: (57 commits)
MAINTAINERS: add the iommu list for swiotlb and xen-swiotlb
arm64: use swiotlb_alloc and swiotlb_free
arm64: replace ZONE_DMA with ZONE_DMA32
mips: use swiotlb_{alloc,free}
mips/netlogic: remove swiotlb support
tile: use generic swiotlb_ops
tile: replace ZONE_DMA with ZONE_DMA32
unicore32: use generic swiotlb_ops
ia64: remove an ifdef around the content of pci-dma.c
ia64: clean up swiotlb support
ia64: use generic swiotlb_ops
ia64: replace ZONE_DMA with ZONE_DMA32
swiotlb: remove various exports
swiotlb: refactor coherent buffer allocation
swiotlb: refactor coherent buffer freeing
swiotlb: wire up ->dma_supported in swiotlb_dma_ops
swiotlb: add common swiotlb_map_ops
swiotlb: rename swiotlb_free to swiotlb_exit
x86: rename swiotlb_dma_ops
powerpc: rename swiotlb_dma_ops
...
- Security mitigations:
- variant 2: invalidating the branch predictor with a call to secure firmware
- variant 3: implementing KPTI for arm64
- 52-bit physical address support for arm64 (ARMv8.2)
- arm64 support for RAS (firmware first only) and SDEI (software
delegated exception interface; allows firmware to inject a RAS error
into the OS)
- Perf support for the ARM DynamIQ Shared Unit PMU
- CPUID and HWCAP bits updated for new floating point multiplication
instructions in ARMv8.4
- Removing some virtual memory layout printks during boot
- Fix initial page table creation to cope with larger than 32M kernel
images when 16K pages are enabled
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlpwxDMACgkQa9axLQDI
XvF55BAAniMpxPXnYNfv6l7/4O8eKo1lJIaG1wbej4JRZ/rT3K4Z3OBXW1dKHO8d
/PTbVmZ90IqIGROkoDrE+6xyjjn9yK3uuW4ytN2zQkBa8VFaHAnHlX+zKQcuwy9f
yxwiHk+C7vK5JR7mpXTazjRknsUv1MPtlTt7DQrSdq0KRDJVDNFC+grmbew2rz0X
cjQDqZqgzuFyrKxdiQVjDmc3zH9NsNBhDo0hlGHf2jK6bGJsAPtI8M2JcLrK8ITG
Ye/dD7BJp1mWD8ff0BPaMxu24qfAMNLH8f2dpTa986/H78irVz7i/t5HG0/1+5Jh
EE4OFRTKZ59Qgyo1zWcaJvdp8YjiaX/L4PWJg8CxM5OhP9dIac9ydcFQfWzpKpUs
xyZfmK6XliGFReAkVOOf5tEqFUDhMtsqhzPYmbmU1lp61wmSYIZ8CTenpWWCJSRO
NOGyG1X2uFBvP69+iPNlfTGz1r7tg1URY5iO8fUEIhY8LrgyORkiqw4OvPEgnMXP
Ngy+dXhyvnps2AAWbSX0O4puRlTgEYLT5KaMLzH/+gWsXATT0rzUCD/aOwUQq/Y7
SWXZHkb3jpmOZZnzZsLL2MNzEIPCFBwSUE9fSv4dA9d/N6tUmlmZALJjHkfzCDpj
+mPsSmAMTj72kUYzm0b5GCtOu/iQ2kDWOZjOM1m4+v/B+f7JoEE=
=iEjP
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The main theme of this pull request is security covering variants 2
and 3 for arm64. I expect to send additional patches next week
covering an improved firmware interface (requires firmware changes)
for variant 2 and way for KPTI to be disabled on unaffected CPUs
(Cavium's ThunderX doesn't work properly with KPTI enabled because of
a hardware erratum).
Summary:
- Security mitigations:
- variant 2: invalidate the branch predictor with a call to
secure firmware
- variant 3: implement KPTI for arm64
- 52-bit physical address support for arm64 (ARMv8.2)
- arm64 support for RAS (firmware first only) and SDEI (software
delegated exception interface; allows firmware to inject a RAS
error into the OS)
- perf support for the ARM DynamIQ Shared Unit PMU
- CPUID and HWCAP bits updated for new floating point multiplication
instructions in ARMv8.4
- remove some virtual memory layout printks during boot
- fix initial page table creation to cope with larger than 32M kernel
images when 16K pages are enabled"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (104 commits)
arm64: Fix TTBR + PAN + 52-bit PA logic in cpu_do_switch_mm
arm64: Turn on KPTI only on CPUs that need it
arm64: Branch predictor hardening for Cavium ThunderX2
arm64: Run enable method for errata work arounds on late CPUs
arm64: Move BP hardening to check_and_switch_context
arm64: mm: ignore memory above supported physical address size
arm64: kpti: Fix the interaction between ASID switching and software PAN
KVM: arm64: Emulate RAS error registers and set HCR_EL2's TERR & TEA
KVM: arm64: Handle RAS SErrors from EL2 on guest exit
KVM: arm64: Handle RAS SErrors from EL1 on guest exit
KVM: arm64: Save ESR_EL2 on guest SError
KVM: arm64: Save/Restore guest DISR_EL1
KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2.
KVM: arm/arm64: mask/unmask daif around VHE guests
arm64: kernel: Prepare for a DISR user
arm64: Unconditionally enable IESB on exception entry/return for firmware-first
arm64: kernel: Survive corrected RAS errors notified by SError
arm64: cpufeature: Detect CPU RAS Extentions
arm64: sysreg: Move to use definitions for all the SCTLR bits
arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
...
When booting a kernel without 52-bit PA support (e.g. a kernel with 4k
pages) on a system with 52-bit memory, the kernel will currently try to
use the 52-bit memory and crash. Fix this by ignoring any memory higher
than what the kernel supports.
Fixes: f77d281713 ("arm64: enable 52-bit physical address support")
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Printing kernel addresses should be done in limited circumstances, mostly
for debugging purposes. Printing out the virtual memory layout at every
kernel bootup doesn't really fall into this category so delete the prints.
There are other ways to get the same information.
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64 uses ZONE_DMA for allocations below 32-bits. These days we
name the zone for that ZONE_DMA32, which will allow to use the
dma-direct and generic swiotlb code as-is, so rename it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
The high_memory global variable is used by
cma_declare_contiguous(.) before it is defined.
We don't notice this as we compute __pa(high_memory - 1), and it looks
like we're processing a VA from the direct linear map.
This problem becomes apparent when we flip the kernel virtual address
space and the linear map is moved to the bottom of the kernel VA space.
This patch moves the initialisation of high_memory before it used.
Cc: <stable@vger.kernel.org>
Fixes: f7426b983a ("mm: cma: adjust address limit to avoid hitting low/high memory boundary")
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as a ELF file.
A user space tool, like kexec-tools, is responsible for allocating
a separate region for the core's ELF header within crash kdump kernel
memory and filling it in when executing kexec_load().
Then, its location will be advertised to crash dump kernel via a new
device-tree property, "linux,elfcorehdr", and crash dump kernel preserves
the region for later use with reserve_elfcorehdr() at boot time.
On crash dump kernel, /proc/vmcore will access the primary kernel's memory
with copy_oldmem_page(), which feeds the data page-by-page by ioremap'ing
it since it does not reside in linear mapping on crash dump kernel.
Meanwhile, elfcorehdr_read() is simple as the region is always mapped.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Since arch_kexec_protect_crashkres() removes a mapping for crash dump
kernel image, the loaded data won't be preserved around hibernation.
In this patch, helper functions, crash_prepare_suspend()/
crash_post_resume(), are additionally called before/after hibernation so
that the relevant memory segments will be mapped again and preserved just
as the others are.
In addition, to minimize the size of hibernation image, crash_is_nosave()
is added to pfn_is_nosave() in order to recognize only the pages that hold
loaded crash dump kernel image as saveable. Hibernation excludes any pages
that are marked as Reserved and yet "nosave."
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
"crashkernel=" kernel parameter specifies the size (and optionally
the start address) of the system ram to be used by crash dump kernel.
reserve_crashkernel() will allocate and reserve that memory at boot time
of primary kernel.
The memory range will be exposed to userspace as a resource named
"Crash kernel" in /proc/iomem.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Reviewed-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Crash dump kernel uses only a limited range of available memory as System
RAM. On arm64 kdump, This memory range is advertised to crash dump kernel
via a device-tree property under /chosen,
linux,usable-memory-range = <BASE SIZE>
Crash dump kernel reads this property at boot time and calls
memblock_cap_memory_range() to limit usable memory which are listed either
in UEFI memory map table or "memory" nodes of a device tree blob.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Geoff Levand <geoff@infradead.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Errata workarounds for Qualcomm's Falkor CPU
- Qualcomm L2 Cache PMU driver
- Qualcomm SMCCC firmware quirk
- Support for DEBUG_VIRTUAL
- CPU feature detection for userspace via MRS emulation
- Preliminary work for the Statistical Profiling Extension
- Misc cleanups and non-critical fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJYpIxqAAoJELescNyEwWM0xdwH/AsTYAXPZDMdRnrQUyV0Fd2H
/9pMzww6dHXEmCMKkImf++otUD6S+gTCJTsj7kEAXT5sZzLk27std5lsW7R9oPjc
bGQMalZy+ovLR1gJ6v072seM3In4xph/qAYOpD8Q0AfYCLHjfMMArQfoLa8Esgru
eSsrAgzVAkrK7XHi3sYycUjr9Hac9tvOOuQ3SaZkDz4MfFIbI4b43+c1SCF7wgT9
tQUHLhhxzGmgxjViI2lLYZuBWsIWsE+algvOe1qocvA9JEIXF+W8NeOuCjdL8WwX
3aoqYClC+qD/9+/skShFv5gM5fo0/IweLTUNIHADXpB6OkCYDyg+sxNM+xnEWQU=
=YrPg
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
- Errata workarounds for Qualcomm's Falkor CPU
- Qualcomm L2 Cache PMU driver
- Qualcomm SMCCC firmware quirk
- Support for DEBUG_VIRTUAL
- CPU feature detection for userspace via MRS emulation
- Preliminary work for the Statistical Profiling Extension
- Misc cleanups and non-critical fixes
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (74 commits)
arm64/kprobes: consistently handle MRS/MSR with XZR
arm64: cpufeature: correctly handle MRS to XZR
arm64: traps: correctly handle MRS/MSR with XZR
arm64: ptrace: add XZR-safe regs accessors
arm64: include asm/assembler.h in entry-ftrace.S
arm64: fix warning about swapper_pg_dir overflow
arm64: Work around Falkor erratum 1003
arm64: head.S: Enable EL1 (host) access to SPE when entered at EL2
arm64: arch_timer: document Hisilicon erratum 161010101
arm64: use is_vmalloc_addr
arm64: use linux/sizes.h for constants
arm64: uaccess: consistently check object sizes
perf: add qcom l2 cache perf events driver
arm64: remove wrong CONFIG_PROC_SYSCTL ifdef
ARM: smccc: Update HVC comment to describe new quirk parameter
arm64: do not trace atomic operations
ACPI/IORT: Fix the error return code in iort_add_smmu_platform_device()
ACPI/IORT: Fix iort_node_get_id() mapping entries indexing
arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA
perf: xgene: Include module.h
...
Commit b67a8b29df introduced logic to skip swiotlb allocation when all memory
is DMA accessible anyway.
While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
to allocate memory when there is no CMA memory available. The swiotlb code is
called to ensure that we at least try get_free_pages().
Without initialization, swiotlb allocation code tries to access io_tlb_list
which is NULL. That results in a stack trace like this:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[...]
[<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
[<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
[<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
[<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
[<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
[<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
[<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
[<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
[<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
[<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
[<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
[...]
Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
option. This patch configures the swiotlb code to use that if we decide not to
initialize the swiotlb framework.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jisheng Zhang <jszhang@marvell.com>
CC: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
__pa_symbol is technically the marcro that should be used for kernel
symbols. Switch to this as a pre-requisite for DEBUG_VIRTUAL which
will do bounds checking.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Convert the flag swiotlb_force from an int to an enum, to prepare for
the advent of more possible values.
Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.
The kernel-side printk code and the userspace demsg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefix (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffff000000000000 - 0xffff000008000000 ( 128 MB)
vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000 (129022 GB)
.text : 0xffff000008080000 - 0xffff0000088b0000 ( 8384 KB)
.rodata : 0xffff0000088b0000 - 0xffff000008c50000 ( 3712 KB)
.init : 0xffff000008c50000 - 0xffff000008d50000 ( 1024 KB)
.data : 0xffff000008d50000 - 0xffff000008e25200 ( 853 KB)
.bss : 0xffff000008e25200 - 0xffff000008e6bec0 ( 284 KB)
fixed : 0xffff7dfffe7fd000 - 0xffff7dfffec00000 ( 4108 KB)
PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000 ( 16 MB)
vmemmap : 0xffff7e0000000000 - 0xffff800000000000 ( 2048 GB maximum)
0xffff7e0000000000 - 0xffff7e0026000000 ( 608 MB actual)
memory : 0xffff800000000000 - 0xffff800980000000 ( 38912 MB)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There is only fixup_init() in mm.h , and it is only called
in free_initmem(), so move the codes from fixup_init() into
free_initmem(), then drop fixup_init() and mm.h.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
These objects are set during initialization, thereafter are read only.
Previously I only want to mark vdso_pages, vdso_spec, vectors_page and
cpu_ops as __read_mostly from performance point of view. Then inspired
by Kees's patch[1] to apply more __ro_after_init for arm, I think it's
better to mark them as __ro_after_init. What's more, I find some more
objects are also read only after init. So apply __ro_after_init to all
of them.
This patch also removes global vdso_pagelist and tries to clean up
vdso_spec[] assignment code.
[1] http://www.spinics.net/lists/arm-kernel/msg523188.html
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When booting an ACPI enabled kernel with 'mem=x', there is the
possibility that ACPI data regions from the firmware will lie above the
memory limit. Ordinarily these will be removed by
memblock_enforce_memory_limit(.).
Unfortunately, this means that these regions will then be mapped by
acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
accessess will then provoke alignment faults.
In this patch we adopt memblock_mem_limit_remove_map instead, and this
preserves these ACPI data regions (marked NOMAP) thus ensuring that
these regions are not mapped as device memory.
For example, below is an alignment exception observed on ARM platform
when booting the kernel with 'acpi=on mem=8G':
...
Unable to handle kernel paging request at virtual address ffff0000080521e7
pgd = ffff000008aa0000
[ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
Internal error: Oops: 96000021 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
PC is at acpi_ns_lookup+0x520/0x734
LR is at acpi_ns_lookup+0x4a4/0x734
pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
sp : ffff800001efb8b0
x29: ffff800001efb8c0 x28: 000000000000001b
x27: 0000000000000001 x26: 0000000000000000
x25: ffff800001efb9e8 x24: ffff000008a10000
x23: 0000000000000001 x22: 0000000000000001
x21: ffff000008724000 x20: 000000000000001b
x19: ffff0000080521e7 x18: 000000000000000d
x17: 00000000000038ff x16: 0000000000000002
x15: 0000000000000007 x14: 0000000000007fff
x13: ffffff0000000000 x12: 0000000000000018
x11: 000000001fffd200 x10: 00000000ffffff76
x9 : 000000000000005f x8 : ffff000008725fa8
x7 : ffff000008a8df70 x6 : ffff000008a8df70
x5 : ffff000008a8d000 x4 : 0000000000000010
x3 : 0000000000000010 x2 : 000000000000000c
x1 : 0000000000000006 x0 : 0000000000000000
...
acpi_ns_lookup+0x520/0x734
acpi_ds_load1_begin_op+0x174/0x4fc
acpi_ps_build_named_op+0xf8/0x220
acpi_ps_create_op+0x208/0x33c
acpi_ps_parse_loop+0x204/0x838
acpi_ps_parse_aml+0x1bc/0x42c
acpi_ns_one_complete_parse+0x1e8/0x22c
acpi_ns_parse_table+0x8c/0x128
acpi_ns_load_table+0xc0/0x1e8
acpi_tb_load_namespace+0xf8/0x2e8
acpi_load_tables+0x7c/0x110
acpi_init+0x90/0x2c0
do_one_initcall+0x38/0x12c
kernel_init_freeable+0x148/0x1ec
kernel_init+0x10/0xec
ret_from_fork+0x10/0x40
Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
---[ end trace 03381e5eb0a24de4 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
With 'efi=debug', we can see those ACPI regions loaded by firmware on
that board as:
efi: 0x0083ff185000-0x0083ff1b4fff [Reserved | | | | | | | | |WB|WT|WC|UC]*
efi: 0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory| | | | | | | | |WB|WT|WC|UC]*
efi: 0x0083ff223000-0x0083ff224fff [ACPI Memory NVS | | | | | | | | |WB|WT|WC|UC]*
Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.com
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Dennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As Kees Cook notes in the ARM counterpart of this patch [0]:
The _etext position is defined to be the end of the kernel text code,
and should not include any part of the data segments. This interferes
with things that might check memory ranges and expect executable code
up to _etext.
In particular, Kees is referring to the HARDENED_USERCOPY patch set [1],
which rejects attempts to call copy_to_user() on kernel ranges containing
executable code, but does allow access to the .rodata segment. Regardless
of whether one may or may not agree with the distinction, it makes sense
for _etext to have the same meaning across architectures.
So let's put _etext where it belongs, between .text and .rodata, and fix
up existing references to use __init_begin instead, which unlike _end_rodata
includes the exception and notes sections as well.
The _etext references in kaslr.c are left untouched, since its references
to [_stext, _etext) are meant to capture potential jump instruction targets,
and so disregarding .rodata is actually an improvement here.
[0] http://article.gmane.org/gmane.linux.kernel/2245084
[1] http://thread.gmane.org/gmane.linux.kernel.hardened.devel/2502
Reported-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We currently open-code extracting the NUMA node of a memblock region,
which requires an ifdef to cater for !CONFIG_NUMA builds where the
memblock_region::nid field does not exist.
The generic memblock_get_region_node helper is intended to cater for
this. For CONFIG_HAVE_MEMBLOCK_NODE_MAP, builds this returns reg->nid,
and for for !CONFIG_HAVE_MEMBLOCK_NODE_MAP builds this is a static
inline that returns 0. Note that for arm64,
CONFIG_HAVE_MEMBLOCK_NODE_MAP is selected iff CONFIG_NUMA is.
This patch makes use of memblock_get_region_node to simplify the arm64
code. At the same time, we can move the nid variable definition into the
loop, as this is the only place it is used.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
we only initialize swiotlb when swiotlb_force is true or not all system
memory is DMA-able, this trivial optimization saves us 64MB when
swiotlb is not necessary.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Show the bss segment information as with text and data in Virtual
memory kernel layout.
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Each line with single pr_cont() in Virtual kernel memory layout,
or the dump of the kernel memory layout in dmesg is not aligned
when PRINTK_TIME enabled, due to the missing time stamps.
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Attempt to get the memory and CPU NUMA node via of_numa. If that
fails, default the dummy NUMA node and map all memory and CPUs to node
0.
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This moves the vmemmap region right below PAGE_OFFSET, aka the start
of the linear region, and redefines its size to be a power of two.
Due to the placement of PAGE_OFFSET in the middle of the address space,
whose size is a power of two as well, this guarantees that virt to
page conversions and vice versa can be implemented efficiently, by
masking and shifting rather than ordinary arithmetic.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The implementation of free_initmem_default() expects __init_begin
and __init_end to be covered by the linear mapping, which is no
longer the case. So open code it instead, using addresses that are
explicitly translated from kernel virtual to linear virtual.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Instead of going out of our way to relocate the initrd if it turns out
to occupy memory that is not covered by the linear mapping, just add the
initrd to the linear mapping. This puts the burden on the bootloader to
pass initrd= and mem= options that are mutually consistent.
Note that, since the placement of the linear region in the PA space is
also dependent on the placement of the kernel Image, which may reside
anywhere in memory, we may still end up with a situation where the initrd
and the kernel Image are simply too far apart to be covered by the linear
region.
Since we now leave it up to the bootloader to pass the initrd in memory
that is guaranteed to be accessible by the kernel, add a mention of this to
the arm64 boot protocol specification as well.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
After choosing memstart_addr to be the highest multiple of
ARM64_MEMSTART_ALIGN less than or equal to the first usable physical memory
address, we clip the memblocks to the maximum size of the linear region.
Since the kernel may be high up in memory, we take care not to clip the
kernel itself, which means we have to clip some memory from the bottom if
this occurs, to ensure that the distance between the first and the last
usable physical memory address can be covered by the linear region.
However, we fail to update memstart_addr if this clipping from the bottom
occurs, which means that we may still end up with virtual addresses that
wrap into the userland range. So increment memstart_addr as appropriate to
prevent this from happening.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The printk() implementation has a limit of LOG_LINE_MAX (== 1024 - 32)
buffer per call which the arm64 mem_init() breaches when printing the
virtual memory layout with CONFIG_KASAN enabled. The result is that the
last line is no longer printed. This patch splits the call into a
pr_notice() + additional pr_cont() calls.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture requires
break-before-make in such cases to avoid TLB conflicts but that's not
always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked to
the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of
the vmalloc space, allowing the kernel to be loaded (nearly) anywhere
in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is provided
by UEFI (efi_get_random_bytes() patches merged via the arm64 tree,
acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but
actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this allows
uaccess functions (get_user etc.) to be implemented using LDTR/STTR
instructions. Such instructions, when run by the kernel, perform
unprivileged accesses adding an extra level of protection. The
set_fs() macro is used to "upgrade" such instruction to privileged
accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the sigcontext
information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW6u95AAoJEGvWsS0AyF7xMyoP/3x2O6bgreSQ84BdO4JChN4+
RQ9OVdX8u2ItO9sgaCY2AA6KoiBuEjGmPl/XRuK0I7DpODTtRjEXQHuNNhz8AelC
hn4AEVqamY6Z5BzHFIjs8G9ydEbq+OXcKWEdwSsBhP/cMvI7ss3dps1f5iNPT5Vv
50E/kUz+aWYy7pKlB18VDV7TUOA3SuYuGknWV8+bOY5uPb8hNT3Y3fHOg/EuNNN3
DIuYH1V7XQkXtF+oNVIGxzzJCXULBE7egMcWAm1ydSOHK0JwkZAiL7OhI7ceVD0x
YlDxBnqmi4cgzfBzTxITAhn3OParwN6udQprdF1WGtFF6fuY2eRDSH/L/iZoE4DY
OulL951OsBtF8YC3+RKLk908/0bA2Uw8ftjCOFJTYbSnZBj1gWK41VkCYMEXiHQk
EaN8+2Iw206iYIoyvdjGCLw7Y0oakDoVD9vmv12SOaHeQljTkjoN8oIlfjjKTeP7
3AXj5v9BDMDVh40nkVayysRNvqe48Kwt9Wn0rhVTLxwdJEiFG/OIU6HLuTkretdN
dcCNFSQrRieSFHpBK9G0vKIpIss1ZwLm8gjocVXH7VK4Mo/TNQe4p2/wAF29mq4r
xu1UiXmtU3uWxiqZnt72LOYFCarQ0sFA5+pMEvF5W+NrVB0wGpXhcwm+pGsIi4IM
LepccTgykiUBqW5TRzPz
=/oS+
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Here are the main arm64 updates for 4.6. There are some relatively
intrusive changes to support KASLR, the reworking of the kernel
virtual memory layout and initial page table creation.
Summary:
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture
requires break-before-make in such cases to avoid TLB conflicts but
that's not always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked
to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom
of the vmalloc space, allowing the kernel to be loaded (nearly)
anywhere in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is
provided by UEFI (efi_get_random_bytes() patches merged via the
arm64 tree, acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c
but actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this
allows uaccess functions (get_user etc.) to be implemented using
LDTR/STTR instructions. Such instructions, when run by the kernel,
perform unprivileged accesses adding an extra level of protection.
The set_fs() macro is used to "upgrade" such instruction to
privileged accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the
sigcontext information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits)
arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
arm64: kasan: Use actual memory node when populating the kernel image shadow
arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
arm64: Fix misspellings in comments.
arm64: efi: add missing frame pointer assignment
arm64: make mrs_s prefixing implicit in read_cpuid
arm64: enable CONFIG_DEBUG_RODATA by default
arm64: Rework valid_user_regs
arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
arm64: KVM: Move kvm_call_hyp back to its original localtion
arm64: mm: treat memstart_addr as a signed quantity
arm64: mm: list kernel sections in order
arm64: lse: deal with clobbered IP registers after branch via PLT
arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR
arm64: kconfig: add submenu for 8.2 architectural features
arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot
arm64: Add support for Half precision floating point
arm64: Remove fixmap include fragility
arm64: Add workaround for Cavium erratum 27456
arm64: mm: Mark .rodata as RO
...
Commit 8439e62a15 ("arm64: mm: use bit ops rather than arithmetic in
pa/va translations") changed the boundary check against PAGE_OFFSET from
an arithmetic comparison to a bit test. This means we now silently assume
that PAGE_OFFSET is a power of 2 that divides the kernel virtual address
space into two equal halves. So make that assumption explicit.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit c031a4213c ("arm64: kaslr: randomize the linear region")
implements randomization of the linear region, by subtracting a random
multiple of PUD_SIZE from memstart_addr. This causes the virtual mapping
of system RAM to move upwards in the linear region, and at the same time
causes memstart_addr to assume a value which may be negative if the offset
of system RAM in the physical space is smaller than its offset relative to
PAGE_OFFSET in the virtual space.
Since memstart_addr is effectively an offset now, redefine its type as s64
so that expressions involving shifting or division preserve its sign.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In the boot log, instead of listing .init first, list .text, .rodata,
.init and .data in the same order they appear in memory
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit dd006da216 ("arm64: mm: increase VA range of identity map") made
some changes to the memory mapping code to allow physical memory to reside
at an offset that exceeds the size of the virtual mapping.
However, since the size of the vmemmap area is proportional to the size of
the VA area, but it is populated relative to the physical space, we may
end up with the struct page array being mapped outside of the vmemmap
region. For instance, on my Seattle A0 box, I can see the following output
in the dmesg log.
vmemmap : 0xffffffbdc0000000 - 0xffffffbfc0000000 ( 8 GB maximum)
0xffffffbfc0000000 - 0xffffffbfd0000000 ( 256 MB actual)
We can fix this by deciding that the vmemmap region is not a projection of
the physical space, but of the virtual space above PAGE_OFFSET, i.e., the
linear region. This way, we are guaranteed that the vmemmap region is of
sufficient size, and we can even reduce the size by half.
Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently the .rodata section is actually still executable when DEBUG_RODATA
is enabled. This changes that so the .rodata is actually read only, no execute.
It also adds the .rodata section to the mem_init banner.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added vm_struct vmlinux_rodata in map_kernel()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Remove the unnecessary boundary check since there is a huge
gap between user and kernel address that they would never overlap.
(arm64 does not have enough levels of page tables to cover 64-bit
virtual address)
See Documentation/arm64/memory.txt
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), and entropy has been
provided by the bootloader, randomize the placement of RAM inside the
linear region if sufficient space is available. For instance, on a 4KB
granule/3 levels kernel, the linear region is 256 GB in size, and we can
choose any 1 GB aligned offset that is far enough from the top of the
address space to fit the distance between the start of the lowest memblock
and the top of the highest memblock.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.
This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.
Special care needs to be taken for dealing with memory limits passed
via mem=, since the generic implementation clips memory top down, which
may clip the kernel image itself if it is loaded high up in memory. To
deal with this case, we simply add back the memory covering the kernel
image, which may result in more memory to be retained than was passed
as a mem= parameter.
Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Before deferring the assignment of memstart_addr in a subsequent patch, to
the moment where all memory has been discovered and possibly clipped based
on the size of the linear region and the presence of a mem= command line
parameter, we need to ensure that memstart_addr is not used to perform __va
translations before it is assigned.
One such use is in the generic early DT discovery of the initrd location,
which is recorded as a virtual address in the globals initrd_start and
initrd_end. So wire up the generic support to declare the initrd addresses,
and implement it without __va() translations, and perform the translation
after memstart_addr has been assigned.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This moves the module area to right before the vmalloc area, and moves
the kernel image to the base of the vmalloc area. This is an intermediate
step towards implementing KASLR, which allows the kernel image to be
located anywhere in the vmalloc area.
Since other subsystems such as hibernate may still need to refer to the
kernel text or data segments via their linears addresses, both are mapped
in the linear region as well. The linear alias of the text region is
mapped read-only/non-executable to prevent inadvertent modification or
execution.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we treat the alternatives separately from other data that's
only used during initialisation, using separate .altinstructions and
.altinstr_replacement linker sections. These are freed for general
allocation separately from .init*. This is problematic as:
* We do not remove execute permissions, as we do for .init, leaving the
memory executable.
* We pad between them, making the kernel Image bianry up to PAGE_SIZE
bytes larger than necessary.
This patch moves the two sections into the contiguous region used for
.init*. This saves some memory, ensures that we remove execute
permissions, and allows us to remove some code made redundant by this
reorganisation.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Take the new memblock attribute MEMBLOCK_NOMAP into account when
deciding whether a certain region is or should be covered by the
kernel direct mapping.
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
These functions/variables are not needed after booting, so mark them
as __init or __initdata.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Trying to build with CONFIG_ZONE_DMA=n leaves visible references
to the now-undefined ZONE_DMA, resulting in a syntax error.
Hide the references behind an #ifdef instead of using IS_ENABLED.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
It is not needed after booting, this patch moves the
free_initrd_mem() function to the __init section.
This patch also make keep_initrd __initdata, to reduce kernel
size.
Signed-off-by: Wang Long <long.wanglong@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The memmap freeing code in free_unused_memmap() computes the end of
each memblock by adding the memblock size onto the base. However,
if SPARSEMEM is enabled then the value (start) used for the base
may already have been rounded downwards to work out which memmap
entries to free after the previous memblock.
This may cause memmap entries that are in use to get freed.
In general, you're not likely to hit this problem unless there
are at least 2 memblocks and one of them is not aligned to a
sparsemem section boundary. Note that carve-outs can increase
the number of memblocks by splitting the regions listed in the
device tree.
This problem doesn't occur with SPARSEMEM_VMEMMAP, because the
vmemmap code deals with freeing the unused regions of the memmap
instead of requiring the arch code to do it.
This patch gets the memblock base out of the memblock directly when
computing the block end address to ensure the correct value is used.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently, the FDT blob needs to be in the same 512 MB region as
the kernel, so that it can be mapped into the kernel virtual memory
space very early on using a minimal set of statically allocated
translation tables.
Now that we have early fixmap support, we can relax this restriction,
by moving the permanent FDT mapping to the fixmap region instead.
This way, the FDT blob may be anywhere in memory.
This also moves the vetting of the FDT to mmu.c, since the early
init code in head.S does not handle mapping of the FDT anymore.
At the same time, fix up some comments in head.S that have gone stale.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This splits off the reservation of the memory occupied by the FDT
binary itself from the processing of the memory reservations it
contains. This is necessary because the physical address of the FDT,
which is needed to perform the reservation, may not be known to the
FDT driver core, i.e., it may be mapped outside the linear direct
mapping, in which case __pa() returns a bogus value.
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add support for memtest command line option.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With commit 3690951fc6 (arm64: Use swiotlb late initialisation), the
swiotlb buffer size is limited to MAX_ORDER_NR_PAGES. However, there are
platforms with 32-bit only devices that require bounce buffering via
swiotlb. This patch changes the swiotlb initialisation to an early 64MB
memblock allocation. In order to get the swiotlb buffer correctly
allocated (via memblock_virt_alloc_low_nopanic), this patch also defines
ARCH_LOW_ADDRESS_LIMIT to the maximum physical address capable of 32-bit
DMA.
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- reimplementation of the virtual remapping of UEFI Runtime Services in
a way that is stable across kexec
- emulation of the "setend" instruction for 32-bit tasks (user
endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
accordingly)
- compat_sys_call_table implemented in C (from asm) and made it a
constant array together with sys_call_table
- export CPU cache information via /sys (like other architectures)
- DMA API implementation clean-up in preparation for IOMMU support
- macros clean-up for KVM
- dropped some unnecessary cache+tlb maintenance
- CONFIG_ARM64_CPU_SUSPEND clean-up
- defconfig update (CPU_IDLE)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJU25v3AAoJEGvWsS0AyF7xYjcP/j8ESvs+z0BPgeJ6XREfOnCh
cp+w/1rJ5BafJ5RRkibrciwTNOIJS4FGMivWyURtoh430lS0Rh7fxZ3Ouna3xjrT
Nf7AxenWoA8Lo6wHh+FlNUeGk3iWfX6WwA2tYrbKudK+LBJ1wHjwpE7cWQO0FgwJ
aFDahu+QD5/u45p/VcVctMtiEDvOxBdO8gfat6r+YkLm7pbRxQkZnpA/JE4Gps1p
Td5jvMNH9pXI5pffSbeR9Q+vs/r0yqKLXQg01Eb2bZgGDgwf9yzADrHuaKamZt35
X5flmLiTGC6swJCJvUkZC1Nuue33bXcvW5+vgvar+MNGyXsxv+B/wARLqGhiWhQZ
nLGwFpuNu6wdY9tGHb/XR8khcewkw1/lRH1hHKhchrmRyUqHvXcPgC5tamjLrY8C
BV3BAeQvRho8OKwWUmbXIlyON1vPux6CJdj4D/A5NL+qph2WHeVWJCXg6nVFx0Wc
Eb3bXbI4QRwTFL7pGRF8RyZJBAQtgYhQMKWMW2GHgUgn+r1EixG73BZoSwvpHrrw
FOR9AVNfVBqmNON8xiIb3DN4EViq76EF0jrsZh5I9EoWS2w5qtk60kJQgXE+M4EE
vOlmh3dhEVfCN2SxOn0bgoQmTulyjqGauTSSJKQbIBuinPFveukrJfGNFIWt0SZs
f38FBMo6sgU4VG85B+Fr
=X5x/
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"arm64 updates for 3.20:
- reimplementation of the virtual remapping of UEFI Runtime Services
in a way that is stable across kexec
- emulation of the "setend" instruction for 32-bit tasks (user
endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
accordingly)
- compat_sys_call_table implemented in C (from asm) and made it a
constant array together with sys_call_table
- export CPU cache information via /sys (like other architectures)
- DMA API implementation clean-up in preparation for IOMMU support
- macros clean-up for KVM
- dropped some unnecessary cache+tlb maintenance
- CONFIG_ARM64_CPU_SUSPEND clean-up
- defconfig update (CPU_IDLE)
The EFI changes going via the arm64 tree have been acked by Matt
Fleming. There is also a patch adding sys_*stat64 prototypes to
include/linux/syscalls.h, acked by Andrew Morton"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (47 commits)
arm64: compat: Remove incorrect comment in compat_siginfo
arm64: Fix section mismatch on alloc_init_p[mu]d()
arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros
arm64: mm: use *_sect to check for section maps
arm64: drop unnecessary cache+tlb maintenance
arm64:mm: free the useless initial page table
arm64: Enable CPU_IDLE in defconfig
arm64: kernel: remove ARM64_CPU_SUSPEND config option
arm64: make sys_call_table const
arm64: Remove asm/syscalls.h
arm64: Implement the compat_sys_call_table in C
syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64
compat: Declare compat_sys_sigpending and compat_sys_sigprocmask prototypes
arm64: uapi: expose our struct ucontext to the uapi headers
smp, ARM64: Kill SMP single function call interrupt
arm64: Emulate SETEND for AArch32 tasks
arm64: Consolidate hotplug notifier for instruction emulation
arm64: Track system support for mixed endian EL0
arm64: implement generic IOMMU configuration
arm64: Combine coherent and non-coherent swiotlb dma_ops
...
PCI IO space was intended to be 16MiB, at 32MiB below MODULES_VADDR, but
commit d1e6dc91b5 ("arm64: Add architectural support for PCI")
extended this to cover the full 32MiB. The final 8KiB of this 32MiB is
also allocated for the fixmap, allowing for potential clashes between
the two.
This change was masked by assumptions in mem_init and the page table
dumping code, which assumed the I/O space to be 16MiB long through
seaparte hard-coded definitions.
This patch changes the definition of the PCI I/O space allocation to
live in asm/memory.h, along with the other VA space allocations. As the
fixmap allocation depends on the number of fixmap entries, this is moved
below the PCI I/O space allocation. Both the fixmap and PCI I/O space
are guarded with 2MB of padding. Sites assuming the I/O space was 16MiB
are moved over use new PCI_IO_{START,END} definitions, which will keep
in sync with the size of the IO space (now restored to 16MiB).
As a useful side effect, the use of the new PCI_IO_{START,END}
definitions prevents a build issue in the dumping code due to a (now
redundant) missing include of io.h for PCI_IOBASE.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
[catalin.marinas@arm.com: reorder FIXADDR and PCI_IO address_markers_idx enum]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add page protections for arm64 similar to those in arm.
This is for security reasons to prevent certain classes
of exploits. The current method:
- Map all memory as either RWX or RW. We round to the nearest
section to avoid creating page tables before everything is mapped
- Once everything is mapped, if either end of the RWX section should
not be X, we split the PMD and remap as necessary
- When initmem is to be freed, we change the permissions back to
RW (using stop machine if necessary to flush the TLB)
- If CONFIG_DEBUG_RODATA is set, the read only sections are set
read only.
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When booting with EFI, we acquire the EFI memory map after parsing the
early params. This unfortuantely renders the option useless as we call
memblock_enforce_memory_limit (which uses memblock_remove_range behind
the scenes) before we've added any memblocks. We end up removing
nothing, then adding all of memory later when efi_init calls
reserve_regions.
Instead, we can log the limit and apply this later when we do the rest
of the memblock work in memblock_init, which should work regardless of
the presence of EFI. At the same time we may as well move the early
parameter into arm64's mm/init.c, close to arm64_memblock_init.
Any memory which must be mapped (e.g. for use by EFI runtime services)
must be mapped explicitly reather than relying on the linear mapping,
which may be truncated as a result of a mem= option passed on the kernel
command line.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch partially reverts commit 421520ba98
(only the arm64 part). There is no guarantee that the boot-loader places other
images like dtb in a different page than initrd start/end, especially when the
kernel is built with 64KB pages. When this happens, such pages must not be
freed. The free_reserved_area() already takes care of rounding up "start" and
rounding down "end" to avoid freeing partially used pages.
Cc: <stable@vger.kernel.org> # 3.17+
Reported-by: Peter Maydell <Peter.Maydell@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
With a blatant copy of some x86 bits we introduce the alternative
runtime patching "framework" to arm64.
This is quite basic for now and we only provide the functions we need
at this time.
This is connected to the newly introduced feature bits.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- eBPF JIT compiler for arm64
- CPU suspend backend for PSCI (firmware interface) with standard idle
states defined in DT (generic idle driver to be merged via a different
tree)
- Support for CONFIG_DEBUG_SET_MODULE_RONX
- Support for unmapped cpu-release-addr (outside kernel linear mapping)
- set_arch_dma_coherent_ops() implemented and bus notifiers removed
- EFI_STUB improvements when base of DRAM is occupied
- Typos in KGDB macros
- Clean-up to (partially) allow kernel building with LLVM
- Other clean-ups (extern keyword, phys_addr_t usage)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUNB6NAAoJEGvWsS0AyF7x22sP/1qPQvFoY71fSqTZmSY+kfgW
UMXhDFZOd+khD2TPHWptbgBRDElTQjRPHyISv/8ILKwDNoMlUDLlYkp1XPLM/nlB
ea9ou2GX8iktqgM2JF5r4vk1hjH6JqEGOUHyWKZc7ibphTVm3dhg3nWL1A4peOUG
0UyX79kl8BLAaggLSUhjtUz1GMpSNlb6Pc1ForUXaPMayBlOcVoOzh1ir7b5wb3e
IvotUY1gv+opE9uK0QPr1AJSfpCogPEfQ2TSCP8MQZjxkrEz69n0HaFvdy60rwf4
DaJiqBoQ5MSP3Bw+qvoYgyz+tfiPFAvEF+O3YQ5x3LBTteoooriFYH4mL7DsicAs
2WLor/342mHykE0bOc44/gNl8B/xaZNzvO2ezLYrjVGsiY2QHTZ7fXB8arPUvQSS
RUXVfHmcv4qthZjI17rgreBKvsfeFIMighSfvMJnVhGqDSvB8abjiPwZjzqB91Bq
pu5MDitNgR3k3ctwzRaS6JtH2CluVFv97xIS4VaD/hm3JnS5NPeTXFou3Gb3lvon
d/wXOIB3vY8FDMIt+BMCQPzWiU0liZ/sN7p1bsOmkgZ1wLOZ0nmsaHF09PDRGbtA
vifopwaw9qtNlcVrTB/rDBCDaT0Ds/mTYD/a3+ch5CYUeLmQmfW/vBMfq/3gUt65
JdI/nTVXawbl2CpBWw36
=SAfQ
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- eBPF JIT compiler for arm64
- CPU suspend backend for PSCI (firmware interface) with standard idle
states defined in DT (generic idle driver to be merged via a
different tree)
- Support for CONFIG_DEBUG_SET_MODULE_RONX
- Support for unmapped cpu-release-addr (outside kernel linear mapping)
- set_arch_dma_coherent_ops() implemented and bus notifiers removed
- EFI_STUB improvements when base of DRAM is occupied
- Typos in KGDB macros
- Clean-up to (partially) allow kernel building with LLVM
- Other clean-ups (extern keyword, phys_addr_t usage)
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (51 commits)
arm64: Remove unneeded extern keyword
ARM64: make of_device_ids const
arm64: Use phys_addr_t type for physical address
aarch64: filter $x from kallsyms
arm64: Use DMA_ERROR_CODE to denote failed allocation
arm64: Fix typos in KGDB macros
arm64: insn: Add return statements after BUG_ON()
arm64: debug: don't re-enable debug exceptions on return from el1_dbg
Revert "arm64: dmi: Add SMBIOS/DMI support"
arm64: Implement set_arch_dma_coherent_ops() to replace bus notifiers
of: amba: use of_dma_configure for AMBA devices
arm64: dmi: Add SMBIOS/DMI support
arm64: Correct ftrace calls to aarch64_insn_gen_branch_imm()
arm64:mm: initialize max_mapnr using function set_max_mapnr
setup: Move unmask of async interrupts after possible earlycon setup
arm64: LLVMLinux: Fix inline arm64 assembly for use with clang
arm64: pageattr: Correctly adjust unaligned start addresses
net: bpf: arm64: fix module memory leak when JIT image build fails
arm64: add PSCI CPU_SUSPEND based cpu_suspend support
arm64: kernel: introduce cpu_init_idle CPU operation
...
This patch extends the start and end address of initrd to be page aligned,
so that we can free all memory including the un-page aligned head or tail
page of initrd, if the start or end address of initrd are not page
aligned, the page can't be freed by free_initrd_mem() function.
Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Initializing max_mapnr using set_max_mapnr() helper function instead
of direct reference. Also not adding PHYS_PFN_OFFSET to max_pfn,
since it already contains it.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit 86c8b27a01cf:
"arm64: ignore DT memreserve entries when booting in UEFI mode
prevents early_init_fdt_scan_reserved_mem() from being called for
arm64 kernels booting via UEFI. This was done because the kernel
will use the UEFI memory map to determine reserved memory regions.
That approach has problems in that early_init_fdt_scan_reserved_mem()
also reserves the FDT itself and any node-specific reserved memory.
By chance of some kernel configs, the FDT may be overwritten before
it can be unflattened and the kernel will fail to boot. More subtle
problems will result if the FDT has node specific reserved memory
which is not really reserved.
This patch has the UEFI stub remove the memory reserve map entries
from the FDT as it does with the memory nodes. This allows
early_init_fdt_scan_reserved_mem() to be called unconditionally
so that the other needed reservations are made.
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
UEFI provides its own method for marking regions to reserve, via the
memory map which is also used to initialise memblock. So when using the
UEFI memory map, ignore any memreserve entries present in the DT.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Changes include:
- Context tracking support (NO_HZ_FULL) which narrowly missed 3.16
- vDSO layout rework following Andy's work on x86
- TEXT_OFFSET fuzzing for bootloader testing
- /proc/cpuinfo tidy-up
- Preliminary work to support 48-bit virtual addresses, but this is
currently disabled until KVM has been ported to use it (the patches
do, however, bring some nice clean-up)
- Boot-time CPU sanity checks (especially useful on heterogenous
systems)
- Support for syscall auditing
- Support for CC_STACKPROTECTOR
- defconfig updates
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJT3qkzAAoJEC379FI+VC/ZxwEP/3uYs9glDLTd1hmVFr1cRutg
j4m1Kc7RCO+zpbYCXJLAQLPjwjOaUWPZUeZPQZib6bO+4sTqFYe9vsaqRyvn/bxM
BaQhytpyxymfG8m3rmXaI97TzBwnRB2oQ0k36rsjMwG/VQMLf9kVuEwURoAHF07l
RyMK2sAwE0/8XIJZQFNo5SAbkO52EiHlehdlTzCXGWWOWdHDyVfks/k6YhIS991r
0W9Y0ghHaMz+mAumTSq7jzPQa3aF3GjTp0W7gJjk/PRBDHfPisphEO36zsA0yHtE
3uvEH0kUQK/ve4ZUQiNvuEZCSqalPFag6j5Z8BnFtafa66J5h414CGPAfER6Kz7+
KGpoEve+7Rpvvb1S4T0tTMg7HoGrvqc5wKS3uFxfoGooGUcUOchSkYiVTBMDJSKn
QlJbb1QSvuNFGhcKntTOe1QMT+x0w9urq/e+QfnQrZ/m5Er7J3qCZzeOfA2JFTjQ
sB24yjzAz5a5VwbKbuB2b4gDILY9oYNe94HFP08o/rJfANnL0dpP1Oyl0b12ILsI
a69EMdpaeEQo8703KLIlzfW6u92PqYs6UkYvya8o27FAvmNvDfB/PffjgVsOAHFi
Qc+dpYbnzNfwJgG9w0qhJ+MR8g5fiBYHqNpfGOY+g5M50j0hZUX9comoWw1xkl0X
HlvG7xzrTF7/VbWEtZ2o
=6XMc
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Once again, Catalin's off on holiday and I'm looking after the arm64
tree. Please can you pull the following arm64 updates for 3.17?
Note that this branch also includes the new GICv3 driver (merged via a
stable tag from Jason's irqchip tree), since there is a fix for older
binutils on top.
Changes include:
- context tracking support (NO_HZ_FULL) which narrowly missed 3.16
- vDSO layout rework following Andy's work on x86
- TEXT_OFFSET fuzzing for bootloader testing
- /proc/cpuinfo tidy-up
- preliminary work to support 48-bit virtual addresses, but this is
currently disabled until KVM has been ported to use it (the patches
do, however, bring some nice clean-up)
- boot-time CPU sanity checks (especially useful on heterogenous
systems)
- support for syscall auditing
- support for CC_STACKPROTECTOR
- defconfig updates"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (55 commits)
arm64: add newline to I-cache policy string
Revert "arm64: dmi: Add SMBIOS/DMI support"
arm64: fpsimd: fix a typo in fpsimd_save_partial_state ENDPROC
arm64: don't call break hooks for BRK exceptions from EL0
arm64: defconfig: enable devtmpfs mount option
arm64: vdso: fix build error when switching from LE to BE
arm64: defconfig: add virtio support for running as a kvm guest
arm64: gicv3: Allow GICv3 compilation with older binutils
arm64: fix soft lockup due to large tlb flush range
arm64/crypto: fix makefile rule for aes-glue-%.o
arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL
arm64: Fix barriers used for page table modifications
arm64: Add support for 48-bit VA space with 64KB page configuration
arm64: asm/pgtable.h pmd/pud definitions clean-up
arm64: Determine the vmalloc/vmemmap space at build time based on VA_BITS
arm64: Clean up the initial page table creation in head.S
arm64: Remove asm/pgtable-*level-types.h files
arm64: Remove asm/pgtable-*level-hwdef.h files
arm64: Convert bool ARM64_x_LEVELS to int ARM64_PGTABLE_LEVELS
arm64: mm: Implement 4 levels of translation tables
...
Rather than guessing what the maximum vmmemap space should be, this
patch allows the calculation based on the VA_BITS and sizeof(struct
page). The vmalloc space extends to the beginning of the vmemmap space.
Since the virtual kernel memory layout now depends on the build
configuration, this patch removes the detailed description in
Documentation/arm64/memory.txt in favour of information printed during
kernel booting.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Jungseok Lee <jungseoklee85@gmail.com>
ZONE_DMA is created to allow 32-bit only devices to access memory in the
absence of an IOMMU. On systems where the memory starts above 4GB, it is
expected that some devices have a DMA offset hardwired to be able to
access the bottom of the memory. Linux currently supports DT bindings
for the DMA offsets but they are not (easily) available early during
boot.
This patch tries to guess a DMA offset and assumes that ZONE_DMA
corresponds to the 32-bit mask above the start of DRAM.
Fixes: 2d5a5612bc (arm64: Limit the CMA buffer to 32-bit if ZONE_DMA)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Tested-by: Anup Patel <anup.patel@linaro.org>
Currently we place swapper_pg_dir and idmap_pg_dir below the kernel
image, between PHYS_OFFSET and (PHYS_OFFSET + TEXT_OFFSET). However,
bootloaders may use portions of this memory below the kernel and we do
not parse the memory reservation list until after the MMU has been
enabled. As such we may clobber some memory a bootloader wishes to have
preserved.
To enable the use of all of this memory by bootloaders (when the
required memory reservations are communicated to the kernel) it is
necessary to move our initial page tables elsewhere. As we currently
have an effectively unbound requirement for memory at the end of the
kernel image for .bss, we can place the page tables here.
This patch moves the initial page table to the end of the kernel image,
after the BSS. As they do not consist of any initialised data they will
be stripped from the kernel Image as with the BSS. The BSS clearing
routine is updated to stop at __bss_stop rather than _end so as to not
clobber the page tables, and memory reservations made redundant by the
new organisation are removed.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
When the CMA buffer is allocated, it is too early to know whether
devices will require ZONE_DMA memory. This patch limits the CMA buffer
to (DMA_BIT_MASK(32) + 1) if CONFIG_ZONE_DMA is enabled.
In addition, it computes the dma_to_phys(DMA_BIT_MASK(32)) before the
increment (no current functional change).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Move the /memreserve/ processing and dtb memory reservations into
early_init_fdt_scan_reserved_mem. This converts arm, arm64, and powerpc
as they are the only users of early_init_fdt_scan_reserved_mem.
memblock_reserve is safe to call on the same region twice, so the
reservation check for the dtb in powerpc 32-bit reservations is safe to
remove.
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Michal Simek <michal.simek@xilinx.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Tested-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Stephen Chivers <schivers@csc.com>
Updates to devicetree core code. This branch contains the following notable changes:
* Add reserved memory binding
* Make struct device_node a kobject and remove legacy /proc/device-tree
* ePAPR conformance fixes
* Update in-kernel DTC copy to version v1.4.0
* Preparation changes for dynamic device tree overlays
* minor bug fixes and documentation changes
The most significant change in this branch is the conversion of struct
device_node to be a kobject that is exposed via sysfs and removal of the
old /proc/device-tree code. This simplifies the device tree handling
code and tightens up the lifecycle on device tree nodes.
[updated: added fix for dangling select PROC_DEVICETREE]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJTOyNwAAoJEMWQL496c2LNZY0QAIreUrpo3/hKRau61EDPXkOA
UFRyPUHD0k/dNXWWDbTfvKH/nAfzdVwejhePqEWiODiFOFkq7JyQlMKPA+CZuZj0
ygN4215A1yj/hDf6JRD5Zn4WGpawDt9InlbZSps6P5dd8voV5t5dz6uzz+Y7uqaK
CAjTDlBSmxEen5vRHiHQgKv74au/+b9yfSURjPQVWg46+wl3WJwjsdzerphm4unW
tpEr8zkIsm51mqqAx4penIuiovh7+L2J5v4BFeg8o+kaZEuZpVxLHJPOuBd5hdom
zeqEIj3AqHTh5suYIHe4aAbZ2wMP3kYGgkPGwfWLnwLyULxalcCtGZeaCi9nwTFj
Fdj+7f17ocrt5mif0f5Deufi1LqJsDjhY6G9p7HuV7Y9hsMILpJIUoGENPji+TWj
BA4L45eaPmNYdKJytEtFD7F2WnXeHZ6fDtYho/39DWW+Bt16IFX85T199irhxGG4
byN6LRaahk2UeycSXkQHAlWOQHqzBcJJAkQLN2iahzyYRr9Dy+VI2E9clm53m49O
YQYcONdUlMYrtfRwJpbB9XHM0HgZUvg0LT5z/iHQs9uJtoo33Oj+zxFixyZLQ9Dq
qyLqQWEpV9gFLAo9tpf56gffkLiJRsHkX4UJ6oTtj4DY1WWU9H81jjCvv/7flzp/
8ZyyZzANQf1DZ9kqO2v+
=lyA5
-----END PGP SIGNATURE-----
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree changes from Grant Likely:
"Updates to devicetree core code. This branch contains the following
notable changes:
- add reserved memory binding
- make struct device_node a kobject and remove legacy
/proc/device-tree
- ePAPR conformance fixes
- update in-kernel DTC copy to version v1.4.0
- preparatory changes for dynamic device tree overlays
- minor bug fixes and documentation changes
The most significant change in this branch is the conversion of struct
device_node to be a kobject that is exposed via sysfs and removal of
the old /proc/device-tree code. This simplifies the device tree
handling code and tightens up the lifecycle on device tree nodes.
[updated: added fix for dangling select PROC_DEVICETREE]"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux: (29 commits)
dt: Remove dangling "select PROC_DEVICETREE"
of: Add support for ePAPR "stdout-path" property
of: device_node kobject lifecycle fixes
of: only scan for reserved mem when fdt present
powerpc: add support for reserved memory defined by device tree
arm64: add support for reserved memory defined by device tree
of: add missing major vendors
of: add vendor prefix for SMSC
of: remove /proc/device-tree
of/selftest: Add self tests for manipulation of properties
of: Make device nodes kobjects so they show up in sysfs
arm: add support for reserved memory defined by device tree
drivers: of: add support for custom reserved memory drivers
drivers: of: add initialization code for dynamic reserved memory
drivers: of: add initialization code for static reserved memory
of: document bindings for reserved-memory nodes
Revert "of: fix of_update_property()"
kbuild: dtbs_install: new make target
ARM: mvebu: Allows to get the SoC ID even without PCI enabled
of: Allows to use the PCI translator without the PCI core
...
Since arm64 does not support ISA, there is no need for early swiotlb
initialisation. This patch switches the DMA mapping code to
swiotlb_tlb_late_init_with_default_size(). A side effect of this is that
GFP_DMA is used for the swiotlb buffer and devices with a 32-bit
coherent mask are correctly supported.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On arm64 we do not have two DMA zones, so it does not make sense to
implement ZONE_DMA32. This patch changes ZONE_DMA32 with ZONE_DMA, the
latter covering 32-bit dma address space to honour GFP_DMA allocations.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64 bit targets need the features CMA provides. Add the appropriate
hooks, header files, and Kconfig to allow this to happen.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Remove unnecessary prom.h include in preparation to make prom.h optional.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
In order to unify the initrd scanning for DT across architectures, make
arm64 use initrd_start and initrd_end instead of the physical addresses.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit. These systems need the ability to specify the
initrd location using 64-bit numbers.
This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.
There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"
More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690https://lkml.org/lkml/2012/9/13/544
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it. With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().
With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This could be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit f483a853b0 ("arm64: mm: fix booting on systems with no memory
below 4GB") sets max_dma32 to the minimum of the maximum pfn and
MAX_DMA32_PFN. This value is later used as the base of the NORMAL zone,
which is incorrect when MAX_DMA32_PFN is below the minimum pfn (i.e. all
memory is above 4GB).
This patch fixes the problem by ensuring that max_dma32 is always set to
the end of the DMA32 zone.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Booting on a system with all of its memory above the 4GB boundary breaks
for two reasons:
(1) We still try to create a non-empty DMA32 zone
(2) no-bootmem limits allocations to 0xffffffff
This patch fixes these issues for ARM64.
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Following commit 74838b7 (swiotlb: add the late swiotlb initialization
function with iotlb memory) the swiotlb_init_with_default_size() is a
static function. This patch changes the arm64 code to call
swiotlb_init() instead and use the default size of 64MB. It is assumed
that AArch64 platforms have enough RAM to afford the pre-allocated
swiotlb memory. It also removes the #ifdef around this call since
CONFIG_SWIOTLB is always enabled.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch contains the initialisation of the memory blocks, MMU
attributes and the memory map. Only five memory types are defined:
Device nGnRnE (equivalent to Strongly Ordered), Device nGnRE (classic
Device memory), Device GRE, Normal Non-cacheable and Normal Cacheable.
Cache policies are supported via the memory attributes register
(MAIR_EL1) and only affect the Normal Cacheable mappings.
This patch also adds the SPARSEMEM_VMEMMAP initialisation.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>