=================================
Kernel Memory Layout on ARM Linux
=================================

                Russell King <rmk@arm.linux.org.uk>

                     November 17, 2005 (2.6.15)

This document describes the virtual memory layout which the Linux
kernel uses for ARM processors.  It indicates which regions are
free for platforms to use, and which are used by generic code.

The ARM CPU is capable of addressing a maximum of 4GB virtual memory
space, and this must be shared between user space processes, the
kernel, and hardware devices.

As the ARM architecture matures, it becomes necessary to reserve
certain regions of VM space for new facilities; therefore
this document may reserve more VM space over time.

=============== =============== ===============================================
Start           End             Use
=============== =============== ===============================================
ffff8000        ffffffff        copy_user_page / clear_user_page use.
                                For SA11xx and XScale, this is used to
                                set up a minicache mapping.

ffff4000        ffffffff        cache aliasing on ARMv6 and later CPUs.

ffff1000        ffff7fff        Reserved.
                                Platforms must not use this address range.

ffff0000        ffff0fff        CPU vector page.
                                The CPU vectors are mapped here if the
                                CPU supports vector relocation (control
                                register V bit).

fffe0000        fffeffff        XScale cache flush area.  This is used
                                in proc-xscale.S to flush the whole data
                                cache.  (XScale does not have TCM.)

fffe8000        fffeffff        DTCM mapping area for platforms with
                                DTCM mounted inside the CPU.

fffe0000        fffe7fff        ITCM mapping area for platforms with
                                ITCM mounted inside the CPU.

ffc80000        ffefffff        Fixmap mapping region.  Addresses provided
                                by fix_to_virt() will be located here.

ffc00000        ffc7ffff        Guard region.

ff800000        ffbfffff        Permanent, fixed read-only mapping of the
                                firmware-provided DT blob.

fee00000        feffffff        Mapping of PCI I/O space.  This is a static
                                mapping within the vmalloc space.

VMALLOC_START   VMALLOC_END-1   vmalloc() / ioremap() space.
                                Memory returned by vmalloc/ioremap will
                                be dynamically placed in this region.
                                Machine-specific static mappings are also
                                located here through iotable_init().
                                VMALLOC_START is based upon the value
                                of the high_memory variable, and VMALLOC_END
                                is equal to 0xff800000.

PAGE_OFFSET     high_memory-1   Kernel direct-mapped RAM region.
                                This maps the platform's RAM, and typically
                                maps all platform RAM in a 1:1 relationship.

PKMAP_BASE      PAGE_OFFSET-1   Permanent kernel mappings.
                                One way of mapping HIGHMEM pages into kernel
                                space.

MODULES_VADDR   MODULES_END-1   Kernel module space.
                                Kernel modules inserted via insmod are
                                placed here using dynamic mappings.

TASK_SIZE       MODULES_VADDR-1 KASan shadow memory when KASan is in use.
                                The range from MODULES_VADDR to the top
                                of the memory is shadowed here with 1 bit
                                per byte of memory.

00001000        TASK_SIZE-1     User space mappings.
                                Per-thread mappings are placed here via
                                the mmap() system call.

00000000        00000fff        CPU vector page / null pointer trap.
                                CPUs which do not support vector remapping
                                place their vector page here.  NULL pointer
                                dereferences by both the kernel and user
                                space are also caught via this mapping.
=============== =============== ===============================================
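
The KASan row can be made concrete with a little arithmetic.  The sketch
below is illustrative only: it assumes the usual KASan address translation
(shift the address right by 3, then add a per-configuration offset) and
uses values that are believed to match the default 3G/1G split
(PAGE_OFFSET of 0xc0000000, modules 16MB below it, shadow offset of
0x9f000000).  These constants are assumptions here, not values stated in
the table, and should be checked against the kernel configuration in use.

```python
# Sketch of the KASan shadow translation, under assumed constants.
KASAN_SHADOW_SCALE_SHIFT = 3  # one shadow byte covers 8 bytes (1 bit per byte)

# Assumed values for the default 3G user / 1G kernel split.
PAGE_OFFSET = 0xC0000000
MODULES_VADDR = PAGE_OFFSET - 0x01000000   # assumed: modules sit 16MB below
KASAN_SHADOW_OFFSET = 0x9F000000           # assumed Kconfig value for this split


def kasan_shadow_addr(addr, shadow_offset=KASAN_SHADOW_OFFSET):
    """Return the shadow address that tracks the given kernel address."""
    return (addr >> KASAN_SHADOW_SCALE_SHIFT) + shadow_offset


# Shadow of the lowest and highest shadowed kernel addresses.
shadow_first = kasan_shadow_addr(MODULES_VADDR)
shadow_last = kasan_shadow_addr(0xFFFFFFFF)
shadow_size = shadow_last - shadow_first + 1

print(hex(shadow_first), hex(shadow_last), hex(shadow_size))
```

Under these assumptions the shadow ends immediately below MODULES_VADDR,
which is consistent with the TASK_SIZE to MODULES_VADDR-1 range given in
the table.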

Please note that mappings which collide with the above areas may result
in a non-bootable kernel, or may cause the kernel to (eventually) panic
at run time.
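
A collision of this kind can be checked for before a static mapping is
chosen.  The following is a minimal sketch, not kernel code: the
``FIXED_REGIONS`` list and the ``colliding_regions()`` helper are
hypothetical names, and only the rows of the table with fixed numeric
addresses are included (the symbolic rows such as VMALLOC_START depend on
the platform and configuration).

```python
# Fixed-address regions copied from the table above (start, end inclusive).
# FIXED_REGIONS and colliding_regions() are illustrative names only.
FIXED_REGIONS = [
    (0xFFFF8000, 0xFFFFFFFF, "copy_user_page / clear_user_page"),
    (0xFFFF1000, 0xFFFF7FFF, "Reserved"),
    (0xFFFF0000, 0xFFFF0FFF, "CPU vector page"),
    (0xFFC80000, 0xFFEFFFFF, "Fixmap mapping region"),
    (0xFFC00000, 0xFFC7FFFF, "Guard region"),
    (0xFF800000, 0xFFBFFFFF, "DT blob mapping"),
    (0xFEE00000, 0xFEFFFFFF, "PCI I/O space"),
]


def colliding_regions(start, size):
    """Return the names of fixed regions that [start, start + size) overlaps."""
    if size <= 0:
        raise ValueError("size must be positive")
    end = start + size - 1
    # Two inclusive ranges overlap when each one starts before the other ends.
    return [name for lo, hi, name in FIXED_REGIONS if start <= hi and end >= lo]


# An 8KB mapping straddling the guard region / fixmap boundary.
print(colliding_regions(0xFFC7F000, 0x2000))
```

A proposed mapping is acceptable with respect to these rows only when the
helper returns an empty list.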

Since future CPUs may impact the kernel mapping layout, user programs
must not access any memory which is not mapped inside their 0x00001000
to TASK_SIZE address range.  If they wish to access these areas, they
must set up their own mappings using open() and mmap().
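
The open() and mmap() pattern looks roughly as follows from user space.
This is a sketch under stated assumptions: a temporary regular file stands
in for whatever file or device node a real program would map (which node
that is depends on the platform and is not specified by this document),
and ``read_via_mmap()`` and ``demo()`` are hypothetical helper names.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, length):
    """Open path, map its first `length` bytes read-only, and return them."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # The kernel chooses where the mapping lands inside the
        # process's own user space address range.
        with mmap.mmap(fd, length, access=mmap.ACCESS_READ) as mapping:
            return mapping[:length]
    finally:
        os.close(fd)


def demo():
    """Map a small temporary file; a stand-in for a real file or device."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"ARM")
        path = tmp.name
    try:
        return read_via_mmap(path, 3)
    finally:
        os.unlink(path)


print(demo())
```

The program never picks a fixed kernel-side address itself; it only asks
for a mapping and uses the address range it is given.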