linux/arch/parisc/kernel/module.c

992 lines
27 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0-or-later
/* Kernel dynamically loadable module help for PARISC.
*
* The best reference for this stuff is probably the Processor-
* Specific ELF Supplement for PA-RISC:
* http://ftp.parisc-linux.org/docs/arch/elf-pa-hp.pdf
*
* Linux/PA-RISC Project (http://www.parisc-linux.org/)
* Copyright (C) 2003 Randolph Chung <tausq at debian . org>
* Copyright (C) 2008 Helge Deller <deller@gmx.de>
*
* Notes:
* - PLT stub handling
* On 32bit (and sometimes 64bit) and with big kernel modules like xfs or
* ipv6 the relocation types R_PARISC_PCREL17F and R_PARISC_PCREL22F may
* fail to reach their PLT stub if we only create one big stub array for
* all sections at the beginning of the core or init section.
* Instead we now insert individual PLT stub entries directly in front of
* of the code sections where the stubs are actually called.
* This reduces the distance between the PCREL location and the stub entry
* so that the relocations can be fulfilled.
* While calculating the final layout of the kernel module in memory, the
* kernel module loader calls arch_mod_section_prepend() to request the
* to be reserved amount of memory in front of each individual section.
*
* - SEGREL32 handling
* We are not doing SEGREL32 handling correctly. According to the ABI, we
* should do a value offset, like this:
* if (in_init(me, (void *)val))
* val -= (uint32_t)me->init_layout.base;
* else
* val -= (uint32_t)me->core_layout.base;
* However, SEGREL32 is used only for PARISC unwind entries, and we want
* those entries to have an absolute address, and not just an offset.
*
* The unwind table mechanism has the ability to specify an offset for
* the unwind table; however, because we split off the init functions into
* a different piece of memory, it is not possible to do this using a
* single offset. Instead, we use the above hack for now.
*/
#include <linux/moduleloader.h>
#include <linux/elf.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
#include <linux/ftrace.h>
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/bug.h>
#include <linux/mm.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
#include <asm/pgtable.h>
#include <asm/unwind.h>
#include <asm/sections.h>
#define RELOC_REACHABLE(val, bits) \
(( ( !((val) & (1<<((bits)-1))) && ((val)>>(bits)) != 0 ) || \
( ((val) & (1<<((bits)-1))) && ((val)>>(bits)) != (((__typeof__(val))(~0))>>((bits)+2)))) ? \
0 : 1)
#define CHECK_RELOC(val, bits) \
if (!RELOC_REACHABLE(val, bits)) { \
printk(KERN_ERR "module %s relocation of symbol %s is out of range (0x%lx in %d bits)\n", \
me->name, strtab + sym->st_name, (unsigned long)val, bits); \
return -ENOEXEC; \
}
/* Maximum number of GOT entries. We use a long displacement ldd from
* the bottom of the table, which has a maximum signed displacement of
* 0x3fff; however, since we're only going forward, this becomes
* 0x1fff, and thus, since each GOT entry is 8 bytes long we can have
* at most 1023 entries.
* To overcome this 14bit displacement with some kernel modules, we'll
* use instead the unusal 16bit displacement method (see reassemble_16a)
* which gives us a maximum positive displacement of 0x7fff, and as such
* allows us to allocate up to 4095 GOT entries. */
#define MAX_GOTS 4095
/* three functions to determine where in the module core
* or init pieces the location is */
static inline int in_init(struct module *me, void *loc)
{
return (loc >= me->init_layout.base &&
loc <= (me->init_layout.base + me->init_layout.size));
}
static inline int in_core(struct module *me, void *loc)
{
return (loc >= me->core_layout.base &&
loc <= (me->core_layout.base + me->core_layout.size));
}
static inline int in_local(struct module *me, void *loc)
{
return in_init(me, loc) || in_core(me, loc);
}
#ifndef CONFIG_64BIT
struct got_entry {
Elf32_Addr addr;
};
struct stub_entry {
Elf32_Word insns[2]; /* each stub entry has two insns */
};
#else
struct got_entry {
Elf64_Addr addr;
};
struct stub_entry {
Elf64_Word insns[4]; /* each stub entry has four insns */
};
#endif
/* Field selection types defined by hppa */
#define rnd(x) (((x)+0x1000)&~0x1fff)
/* fsel: full 32 bits */
#define fsel(v,a) ((v)+(a))
/* lsel: select left 21 bits */
#define lsel(v,a) (((v)+(a))>>11)
/* rsel: select right 11 bits */
#define rsel(v,a) (((v)+(a))&0x7ff)
/* lrsel with rounding of addend to nearest 8k */
#define lrsel(v,a) (((v)+rnd(a))>>11)
/* rrsel with rounding of addend to nearest 8k */
#define rrsel(v,a) ((((v)+rnd(a))&0x7ff)+((a)-rnd(a)))
#define mask(x,sz) ((x) & ~((1<<(sz))-1))
/* The reassemble_* functions prepare an immediate value for
insertion into an opcode. pa-risc uses all sorts of weird bitfields
in the instruction to hold the value. */
static inline int sign_unext(int x, int len)
{
int len_ones;
len_ones = (1 << len) - 1;
return x & len_ones;
}
static inline int low_sign_unext(int x, int len)
{
int sign, temp;
sign = (x >> (len-1)) & 1;
temp = sign_unext(x, len-1);
return (temp << 1) | sign;
}
static inline int reassemble_14(int as14)
{
return (((as14 & 0x1fff) << 1) |
((as14 & 0x2000) >> 13));
}
static inline int reassemble_16a(int as16)
{
int s, t;
/* Unusual 16-bit encoding, for wide mode only. */
t = (as16 << 1) & 0xffff;
s = (as16 & 0x8000);
return (t ^ s ^ (s >> 1)) | (s >> 15);
}
static inline int reassemble_17(int as17)
{
return (((as17 & 0x10000) >> 16) |
((as17 & 0x0f800) << 5) |
((as17 & 0x00400) >> 8) |
((as17 & 0x003ff) << 3));
}
static inline int reassemble_21(int as21)
{
return (((as21 & 0x100000) >> 20) |
((as21 & 0x0ffe00) >> 8) |
((as21 & 0x000180) << 7) |
((as21 & 0x00007c) << 14) |
((as21 & 0x000003) << 12));
}
static inline int reassemble_22(int as22)
{
return (((as22 & 0x200000) >> 21) |
((as22 & 0x1f0000) << 5) |
((as22 & 0x00f800) << 5) |
((as22 & 0x000400) >> 8) |
((as22 & 0x0003ff) << 3));
}
void *module_alloc(unsigned long size)
{
/* using RWX means less protection for modules, but it's
* easier than trying to map the text, data, init_text and
* init_data correctly */
return __vmalloc_node_range(size, 1, VMALLOC_START, VMALLOC_END,
GFP_KERNEL,
mm: vmalloc: pass additional vm_flags to __vmalloc_node_range() For instrumenting global variables KASan will shadow memory backing memory for modules. So on module loading we will need to allocate memory for shadow and map it at address in shadow that corresponds to the address allocated in module_alloc(). __vmalloc_node_range() could be used for this purpose, except it puts a guard hole after allocated area. Guard hole in shadow memory should be a problem because at some future point we might need to have a shadow memory at address occupied by guard hole. So we could fail to allocate shadow for module_alloc(). Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into __vmalloc_node_range(). Add new parameter 'vm_flags' to __vmalloc_node_range() function. Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Konstantin Serebryany <kcc@google.com> Cc: Dmitry Chernenkov <dmitryc@google.com> Signed-off-by: Andrey Konovalov <adech.fo@gmail.com> Cc: Yuri Gribov <tetra2005@gmail.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-14 06:40:07 +08:00
PAGE_KERNEL_RWX, 0, NUMA_NO_NODE,
__builtin_return_address(0));
}
#ifndef CONFIG_64BIT
static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
{
return 0;
}
static inline unsigned long count_fdescs(const Elf_Rela *rela, unsigned long n)
{
return 0;
}
static inline unsigned long count_stubs(const Elf_Rela *rela, unsigned long n)
{
unsigned long cnt = 0;
for (; n > 0; n--, rela++)
{
switch (ELF32_R_TYPE(rela->r_info)) {
case R_PARISC_PCREL17F:
case R_PARISC_PCREL22F:
cnt++;
}
}
return cnt;
}
#else
static inline unsigned long count_gots(const Elf_Rela *rela, unsigned long n)
{
unsigned long cnt = 0;
for (; n > 0; n--, rela++)
{
switch (ELF64_R_TYPE(rela->r_info)) {
case R_PARISC_LTOFF21L:
case R_PARISC_LTOFF14R:
case R_PARISC_PCREL22F:
cnt++;
}
}
return cnt;
}
static inline unsigned long count_fdescs(const Elf_Rela *rela, unsigned long n)
{
unsigned long cnt = 0;
for (; n > 0; n--, rela++)
{
switch (ELF64_R_TYPE(rela->r_info)) {
case R_PARISC_FPTR64:
cnt++;
}
}
return cnt;
}
static inline unsigned long count_stubs(const Elf_Rela *rela, unsigned long n)
{
unsigned long cnt = 0;
for (; n > 0; n--, rela++)
{
switch (ELF64_R_TYPE(rela->r_info)) {
case R_PARISC_PCREL22F:
cnt++;
}
}
return cnt;
}
#endif
void module_arch_freeing_init(struct module *mod)
{
kfree(mod->arch.section);
mod->arch.section = NULL;
}
/* Additional bytes needed in front of individual sections */
unsigned int arch_mod_section_prepend(struct module *mod,
unsigned int section)
{
/* size needed for all stubs of this section (including
* one additional for correct alignment of the stubs) */
return (mod->arch.section[section].stub_entries + 1)
* sizeof(struct stub_entry);
}
#define CONST
int module_frob_arch_sections(CONST Elf_Ehdr *hdr,
CONST Elf_Shdr *sechdrs,
CONST char *secstrings,
struct module *me)
{
unsigned long gots = 0, fdescs = 0, len;
unsigned int i;
len = hdr->e_shnum * sizeof(me->arch.section[0]);
me->arch.section = kzalloc(len, GFP_KERNEL);
if (!me->arch.section)
return -ENOMEM;
for (i = 1; i < hdr->e_shnum; i++) {
const Elf_Rela *rels = (void *)sechdrs[i].sh_addr;
unsigned long nrels = sechdrs[i].sh_size / sizeof(*rels);
unsigned int count, s;
if (strncmp(secstrings + sechdrs[i].sh_name,
".PARISC.unwind", 14) == 0)
me->arch.unwind_section = i;
if (sechdrs[i].sh_type != SHT_RELA)
continue;
/* some of these are not relevant for 32-bit/64-bit
* we leave them here to make the code common. the
* compiler will do its thing and optimize out the
* stuff we don't need
*/
gots += count_gots(rels, nrels);
fdescs += count_fdescs(rels, nrels);
/* XXX: By sorting the relocs and finding duplicate entries
* we could reduce the number of necessary stubs and save
* some memory. */
count = count_stubs(rels, nrels);
if (!count)
continue;
/* so we need relocation stubs. reserve necessary memory. */
/* sh_info gives the section for which we need to add stubs. */
s = sechdrs[i].sh_info;
/* each code section should only have one relocation section */
WARN_ON(me->arch.section[s].stub_entries);
/* store number of stubs we need for this section */
me->arch.section[s].stub_entries += count;
}
/* align things a bit */
me->core_layout.size = ALIGN(me->core_layout.size, 16);
me->arch.got_offset = me->core_layout.size;
me->core_layout.size += gots * sizeof(struct got_entry);
me->core_layout.size = ALIGN(me->core_layout.size, 16);
me->arch.fdesc_offset = me->core_layout.size;
me->core_layout.size += fdescs * sizeof(Elf_Fdesc);
me->arch.got_max = gots;
me->arch.fdesc_max = fdescs;
return 0;
}
#ifdef CONFIG_64BIT
static Elf64_Word get_got(struct module *me, unsigned long value, long addend)
{
unsigned int i;
struct got_entry *got;
value += addend;
BUG_ON(value == 0);
got = me->core_layout.base + me->arch.got_offset;
for (i = 0; got[i].addr; i++)
if (got[i].addr == value)
goto out;
BUG_ON(++me->arch.got_count > me->arch.got_max);
got[i].addr = value;
out:
pr_debug("GOT ENTRY %d[%lx] val %lx\n", i, i*sizeof(struct got_entry),
value);
return i * sizeof(struct got_entry);
}
#endif /* CONFIG_64BIT */
#ifdef CONFIG_64BIT
static Elf_Addr get_fdesc(struct module *me, unsigned long value)
{
Elf_Fdesc *fdesc = me->core_layout.base + me->arch.fdesc_offset;
if (!value) {
printk(KERN_ERR "%s: zero OPD requested!\n", me->name);
return 0;
}
/* Look for existing fdesc entry. */
while (fdesc->addr) {
if (fdesc->addr == value)
return (Elf_Addr)fdesc;
fdesc++;
}
BUG_ON(++me->arch.fdesc_count > me->arch.fdesc_max);
/* Create new one */
fdesc->addr = value;
fdesc->gp = (Elf_Addr)me->core_layout.base + me->arch.got_offset;
return (Elf_Addr)fdesc;
}
#endif /* CONFIG_64BIT */
enum elf_stub_type {
ELF_STUB_GOT,
ELF_STUB_MILLI,
ELF_STUB_DIRECT,
};
static Elf_Addr get_stub(struct module *me, unsigned long value, long addend,
enum elf_stub_type stub_type, Elf_Addr loc0, unsigned int targetsec)
{
struct stub_entry *stub;
int __maybe_unused d;
/* initialize stub_offset to point in front of the section */
if (!me->arch.section[targetsec].stub_offset) {
loc0 -= (me->arch.section[targetsec].stub_entries + 1) *
sizeof(struct stub_entry);
/* get correct alignment for the stubs */
loc0 = ALIGN(loc0, sizeof(struct stub_entry));
me->arch.section[targetsec].stub_offset = loc0;
}
/* get address of stub entry */
stub = (void *) me->arch.section[targetsec].stub_offset;
me->arch.section[targetsec].stub_offset += sizeof(struct stub_entry);
/* do not write outside available stub area */
BUG_ON(0 == me->arch.section[targetsec].stub_entries--);
#ifndef CONFIG_64BIT
/* for 32-bit the stub looks like this:
* ldil L'XXX,%r1
* be,n R'XXX(%sr4,%r1)
*/
//value = *(unsigned long *)((value + addend) & ~3); /* why? */
stub->insns[0] = 0x20200000; /* ldil L'XXX,%r1 */
stub->insns[1] = 0xe0202002; /* be,n R'XXX(%sr4,%r1) */
stub->insns[0] |= reassemble_21(lrsel(value, addend));
stub->insns[1] |= reassemble_17(rrsel(value, addend) / 4);
#else
/* for 64-bit we have three kinds of stubs:
* for normal function calls:
* ldd 0(%dp),%dp
* ldd 10(%dp), %r1
* bve (%r1)
* ldd 18(%dp), %dp
*
* for millicode:
* ldil 0, %r1
* ldo 0(%r1), %r1
* ldd 10(%r1), %r1
* bve,n (%r1)
*
* for direct branches (jumps between different section of the
* same module):
* ldil 0, %r1
* ldo 0(%r1), %r1
* bve,n (%r1)
*/
switch (stub_type) {
case ELF_STUB_GOT:
d = get_got(me, value, addend);
if (d <= 15) {
/* Format 5 */
stub->insns[0] = 0x0f6010db; /* ldd 0(%dp),%dp */
stub->insns[0] |= low_sign_unext(d, 5) << 16;
} else {
/* Format 3 */
stub->insns[0] = 0x537b0000; /* ldd 0(%dp),%dp */
stub->insns[0] |= reassemble_16a(d);
}
stub->insns[1] = 0x53610020; /* ldd 10(%dp),%r1 */
stub->insns[2] = 0xe820d000; /* bve (%r1) */
stub->insns[3] = 0x537b0030; /* ldd 18(%dp),%dp */
break;
case ELF_STUB_MILLI:
stub->insns[0] = 0x20200000; /* ldil 0,%r1 */
stub->insns[1] = 0x34210000; /* ldo 0(%r1), %r1 */
stub->insns[2] = 0x50210020; /* ldd 10(%r1),%r1 */
stub->insns[3] = 0xe820d002; /* bve,n (%r1) */
stub->insns[0] |= reassemble_21(lrsel(value, addend));
stub->insns[1] |= reassemble_14(rrsel(value, addend));
break;
case ELF_STUB_DIRECT:
stub->insns[0] = 0x20200000; /* ldil 0,%r1 */
stub->insns[1] = 0x34210000; /* ldo 0(%r1), %r1 */
stub->insns[2] = 0xe820d002; /* bve,n (%r1) */
stub->insns[0] |= reassemble_21(lrsel(value, addend));
stub->insns[1] |= reassemble_14(rrsel(value, addend));
break;
}
#endif
return (Elf_Addr)stub;
}
#ifndef CONFIG_64BIT
int apply_relocate_add(Elf_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me)
{
int i;
Elf32_Rela *rel = (void *)sechdrs[relsec].sh_addr;
Elf32_Sym *sym;
Elf32_Word *loc;
Elf32_Addr val;
Elf32_Sword addend;
Elf32_Addr dot;
Elf_Addr loc0;
unsigned int targetsec = sechdrs[relsec].sh_info;
//unsigned long dp = (unsigned long)$global$;
register unsigned long dp asm ("r27");
pr_debug("Applying relocate section %u to %u\n", relsec,
targetsec);
for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
/* This is where to make the change */
loc = (void *)sechdrs[targetsec].sh_addr
+ rel[i].r_offset;
/* This is the start of the target section */
loc0 = sechdrs[targetsec].sh_addr;
/* This is the symbol it is referring to */
sym = (Elf32_Sym *)sechdrs[symindex].sh_addr
+ ELF32_R_SYM(rel[i].r_info);
if (!sym->st_value) {
printk(KERN_WARNING "%s: Unknown symbol %s\n",
me->name, strtab + sym->st_name);
return -ENOENT;
}
//dot = (sechdrs[relsec].sh_addr + rel->r_offset) & ~0x03;
dot = (Elf32_Addr)loc & ~0x03;
val = sym->st_value;
addend = rel[i].r_addend;
#if 0
#define r(t) ELF32_R_TYPE(rel[i].r_info)==t ? #t :
pr_debug("Symbol %s loc 0x%x val 0x%x addend 0x%x: %s\n",
strtab + sym->st_name,
(uint32_t)loc, val, addend,
r(R_PARISC_PLABEL32)
r(R_PARISC_DIR32)
r(R_PARISC_DIR21L)
r(R_PARISC_DIR14R)
r(R_PARISC_SEGREL32)
r(R_PARISC_DPREL21L)
r(R_PARISC_DPREL14R)
r(R_PARISC_PCREL17F)
r(R_PARISC_PCREL22F)
"UNKNOWN");
#undef r
#endif
switch (ELF32_R_TYPE(rel[i].r_info)) {
case R_PARISC_PLABEL32:
/* 32-bit function address */
/* no function descriptors... */
*loc = fsel(val, addend);
break;
case R_PARISC_DIR32:
/* direct 32-bit ref */
*loc = fsel(val, addend);
break;
case R_PARISC_DIR21L:
/* left 21 bits of effective address */
val = lrsel(val, addend);
*loc = mask(*loc, 21) | reassemble_21(val);
break;
case R_PARISC_DIR14R:
/* right 14 bits of effective address */
val = rrsel(val, addend);
*loc = mask(*loc, 14) | reassemble_14(val);
break;
case R_PARISC_SEGREL32:
/* 32-bit segment relative address */
/* See note about special handling of SEGREL32 at
* the beginning of this file.
*/
*loc = fsel(val, addend);
break;
case R_PARISC_SECREL32:
/* 32-bit section relative address. */
*loc = fsel(val, addend);
break;
case R_PARISC_DPREL21L:
/* left 21 bit of relative address */
val = lrsel(val - dp, addend);
*loc = mask(*loc, 21) | reassemble_21(val);
break;
case R_PARISC_DPREL14R:
/* right 14 bit of relative address */
val = rrsel(val - dp, addend);
*loc = mask(*loc, 14) | reassemble_14(val);
break;
case R_PARISC_PCREL17F:
/* 17-bit PC relative address */
/* calculate direct call offset */
val += addend;
val = (val - dot - 8)/4;
if (!RELOC_REACHABLE(val, 17)) {
/* direct distance too far, create
* stub entry instead */
val = get_stub(me, sym->st_value, addend,
ELF_STUB_DIRECT, loc0, targetsec);
val = (val - dot - 8)/4;
CHECK_RELOC(val, 17);
}
*loc = (*loc & ~0x1f1ffd) | reassemble_17(val);
break;
case R_PARISC_PCREL22F:
/* 22-bit PC relative address; only defined for pa20 */
/* calculate direct call offset */
val += addend;
val = (val - dot - 8)/4;
if (!RELOC_REACHABLE(val, 22)) {
/* direct distance too far, create
* stub entry instead */
val = get_stub(me, sym->st_value, addend,
ELF_STUB_DIRECT, loc0, targetsec);
val = (val - dot - 8)/4;
CHECK_RELOC(val, 22);
}
*loc = (*loc & ~0x3ff1ffd) | reassemble_22(val);
break;
case R_PARISC_PCREL32:
/* 32-bit PC relative address */
*loc = val - dot - 8 + addend;
break;
default:
printk(KERN_ERR "module %s: Unknown relocation: %u\n",
me->name, ELF32_R_TYPE(rel[i].r_info));
return -ENOEXEC;
}
}
return 0;
}
#else
int apply_relocate_add(Elf_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
struct module *me)
{
int i;
Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
Elf64_Sym *sym;
Elf64_Word *loc;
Elf64_Xword *loc64;
Elf64_Addr val;
Elf64_Sxword addend;
Elf64_Addr dot;
Elf_Addr loc0;
unsigned int targetsec = sechdrs[relsec].sh_info;
pr_debug("Applying relocate section %u to %u\n", relsec,
targetsec);
for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
/* This is where to make the change */
loc = (void *)sechdrs[targetsec].sh_addr
+ rel[i].r_offset;
/* This is the start of the target section */
loc0 = sechdrs[targetsec].sh_addr;
/* This is the symbol it is referring to */
sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
+ ELF64_R_SYM(rel[i].r_info);
if (!sym->st_value) {
printk(KERN_WARNING "%s: Unknown symbol %s\n",
me->name, strtab + sym->st_name);
return -ENOENT;
}
//dot = (sechdrs[relsec].sh_addr + rel->r_offset) & ~0x03;
dot = (Elf64_Addr)loc & ~0x03;
loc64 = (Elf64_Xword *)loc;
val = sym->st_value;
addend = rel[i].r_addend;
#if 0
#define r(t) ELF64_R_TYPE(rel[i].r_info)==t ? #t :
printk("Symbol %s loc %p val 0x%Lx addend 0x%Lx: %s\n",
strtab + sym->st_name,
loc, val, addend,
r(R_PARISC_LTOFF14R)
r(R_PARISC_LTOFF21L)
r(R_PARISC_PCREL22F)
r(R_PARISC_DIR64)
r(R_PARISC_SEGREL32)
r(R_PARISC_FPTR64)
"UNKNOWN");
#undef r
#endif
switch (ELF64_R_TYPE(rel[i].r_info)) {
case R_PARISC_LTOFF21L:
/* LT-relative; left 21 bits */
val = get_got(me, val, addend);
pr_debug("LTOFF21L Symbol %s loc %p val %llx\n",
strtab + sym->st_name,
loc, val);
val = lrsel(val, 0);
*loc = mask(*loc, 21) | reassemble_21(val);
break;
case R_PARISC_LTOFF14R:
/* L(ltoff(val+addend)) */
/* LT-relative; right 14 bits */
val = get_got(me, val, addend);
val = rrsel(val, 0);
pr_debug("LTOFF14R Symbol %s loc %p val %llx\n",
strtab + sym->st_name,
loc, val);
*loc = mask(*loc, 14) | reassemble_14(val);
break;
case R_PARISC_PCREL22F:
/* PC-relative; 22 bits */
pr_debug("PCREL22F Symbol %s loc %p val %llx\n",
strtab + sym->st_name,
loc, val);
val += addend;
/* can we reach it locally? */
if (in_local(me, (void *)val)) {
/* this is the case where the symbol is local
* to the module, but in a different section,
* so stub the jump in case it's more than 22
* bits away */
val = (val - dot - 8)/4;
if (!RELOC_REACHABLE(val, 22)) {
/* direct distance too far, create
* stub entry instead */
val = get_stub(me, sym->st_value,
addend, ELF_STUB_DIRECT,
loc0, targetsec);
} else {
/* Ok, we can reach it directly. */
val = sym->st_value;
val += addend;
}
} else {
val = sym->st_value;
if (strncmp(strtab + sym->st_name, "$$", 2)
== 0)
val = get_stub(me, val, addend, ELF_STUB_MILLI,
loc0, targetsec);
else
val = get_stub(me, val, addend, ELF_STUB_GOT,
loc0, targetsec);
}
pr_debug("STUB FOR %s loc %px, val %llx+%llx at %llx\n",
strtab + sym->st_name, loc, sym->st_value,
addend, val);
val = (val - dot - 8)/4;
CHECK_RELOC(val, 22);
*loc = (*loc & ~0x3ff1ffd) | reassemble_22(val);
break;
case R_PARISC_PCREL32:
/* 32-bit PC relative address */
*loc = val - dot - 8 + addend;
break;
case R_PARISC_PCREL64:
/* 64-bit PC relative address */
*loc64 = val - dot - 8 + addend;
break;
case R_PARISC_DIR64:
/* 64-bit effective address */
*loc64 = val + addend;
break;
case R_PARISC_SEGREL32:
/* 32-bit segment relative address */
/* See note about special handling of SEGREL32 at
* the beginning of this file.
*/
*loc = fsel(val, addend);
break;
case R_PARISC_SECREL32:
/* 32-bit section relative address. */
*loc = fsel(val, addend);
break;
case R_PARISC_FPTR64:
/* 64-bit function address */
if(in_local(me, (void *)(val + addend))) {
*loc64 = get_fdesc(me, val+addend);
pr_debug("FDESC for %s at %llx points to %llx\n",
strtab + sym->st_name, *loc64,
((Elf_Fdesc *)*loc64)->addr);
} else {
/* if the symbol is not local to this
* module then val+addend is a pointer
* to the function descriptor */
pr_debug("Non local FPTR64 Symbol %s loc %p val %llx\n",
strtab + sym->st_name,
loc, val);
*loc64 = val + addend;
}
break;
default:
printk(KERN_ERR "module %s: Unknown relocation: %Lu\n",
me->name, ELF64_R_TYPE(rel[i].r_info));
return -ENOEXEC;
}
}
return 0;
}
#endif
static void
register_unwind_table(struct module *me,
const Elf_Shdr *sechdrs)
{
unsigned char *table, *end;
unsigned long gp;
if (!me->arch.unwind_section)
return;
table = (unsigned char *)sechdrs[me->arch.unwind_section].sh_addr;
end = table + sechdrs[me->arch.unwind_section].sh_size;
gp = (Elf_Addr)me->core_layout.base + me->arch.got_offset;
pr_debug("register_unwind_table(), sect = %d at 0x%p - 0x%p (gp=0x%lx)\n",
me->arch.unwind_section, table, end, gp);
me->arch.unwind = unwind_table_add(me->name, 0, gp, table, end);
}
static void
deregister_unwind_table(struct module *me)
{
if (me->arch.unwind)
unwind_table_remove(me->arch.unwind);
}
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
int i;
unsigned long nsyms;
const char *strtab = NULL;
const Elf_Shdr *s;
char *secstrings;
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
int symindex = -1;
Elf_Sym *newptr, *oldptr;
Elf_Shdr *symhdr = NULL;
#ifdef DEBUG
Elf_Fdesc *entry;
u32 *addr;
entry = (Elf_Fdesc *)me->init;
printk("FINALIZE, ->init FPTR is %p, GP %lx ADDR %lx\n", entry,
entry->gp, entry->addr);
addr = (u32 *)entry->addr;
printk("INSNS: %x %x %x %x\n",
addr[0], addr[1], addr[2], addr[3]);
printk("got entries used %ld, gots max %ld\n"
"fdescs used %ld, fdescs max %ld\n",
me->arch.got_count, me->arch.got_max,
me->arch.fdesc_count, me->arch.fdesc_max);
#endif
register_unwind_table(me, sechdrs);
/* haven't filled in me->symtab yet, so have to find it
* ourselves */
for (i = 1; i < hdr->e_shnum; i++) {
if(sechdrs[i].sh_type == SHT_SYMTAB
&& (sechdrs[i].sh_flags & SHF_ALLOC)) {
int strindex = sechdrs[i].sh_link;
parisc: add dynamic ftrace This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06 04:32:22 +08:00
symindex = i;
/* FIXME: AWFUL HACK
* The cast is to drop the const from
* the sechdrs pointer */
symhdr = (Elf_Shdr *)&sechdrs[i];
strtab = (char *)sechdrs[strindex].sh_addr;
break;
}
}
pr_debug("module %s: strtab %p, symhdr %p\n",
me->name, strtab, symhdr);
if(me->arch.got_count > MAX_GOTS) {
printk(KERN_ERR "%s: Global Offset Table overflow (used %ld, allowed %d)\n",
me->name, me->arch.got_count, MAX_GOTS);
return -EINVAL;
}
kfree(me->arch.section);
me->arch.section = NULL;
/* no symbol table */
if(symhdr == NULL)
return 0;
oldptr = (void *)symhdr->sh_addr;
newptr = oldptr + 1; /* we start counting at 1 */
nsyms = symhdr->sh_size / sizeof(Elf_Sym);
pr_debug("OLD num_symtab %lu\n", nsyms);
for (i = 1; i < nsyms; i++) {
oldptr++; /* note, count starts at 1 so preincrement */
if(strncmp(strtab + oldptr->st_name,
".L", 2) == 0)
continue;
if(newptr != oldptr)
*newptr++ = *oldptr;
else
newptr++;
}
nsyms = newptr - (Elf_Sym *)symhdr->sh_addr;
pr_debug("NEW num_symtab %lu\n", nsyms);
symhdr->sh_size = nsyms * sizeof(Elf_Sym);
/* find .altinstructions section */
secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
void *aseg = (void *) s->sh_addr;
char *secname = secstrings + s->sh_name;
if (!strcmp(".altinstructions", secname))
/* patch .altinstructions */
apply_alternatives(aseg, aseg + s->sh_size, me->name);
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
#ifdef CONFIG_DYNAMIC_FTRACE
parisc: add dynamic ftrace This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06 04:32:22 +08:00
/* For 32 bit kernels we're compiling modules with
* -ffunction-sections so we must relocate the addresses in the
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
* ftrace callsite section.
parisc: add dynamic ftrace This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06 04:32:22 +08:00
*/
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
if (symindex != -1 && !strcmp(secname, FTRACE_CALLSITE_SECTION)) {
int err;
parisc: add dynamic ftrace This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06 04:32:22 +08:00
if (s->sh_type == SHT_REL)
err = apply_relocate((Elf_Shdr *)sechdrs,
strtab, symindex,
s - sechdrs, me);
else if (s->sh_type == SHT_RELA)
err = apply_relocate_add((Elf_Shdr *)sechdrs,
strtab, symindex,
s - sechdrs, me);
if (err)
return err;
}
module/ftrace: handle patchable-function-entry When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Helge Deller <deller@gmx.de> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Torsten Duwe <duwe@suse.de> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Sven Schnelle <svens@stackframe.org> Tested-by: Torsten Duwe <duwe@suse.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jessica Yu <jeyu@kernel.org> Cc: linux-parisc@vger.kernel.org
2019-10-17 01:17:11 +08:00
#endif
parisc: add dynamic ftrace This patch implements dynamic ftrace for PA-RISC. The required mcount call sequences can get pretty long, so instead of patching the whole call sequence out of the functions, we are using -fpatchable-function-entry from gcc. This puts a configurable amount of NOPS before/at the start of the function. Taking do_sys_open() as example, which would look like this when the call is patched out: 1036b248: 08 00 02 40 nop 1036b24c: 08 00 02 40 nop 1036b250: 08 00 02 40 nop 1036b254: 08 00 02 40 nop 1036b258 <do_sys_open>: 1036b258: 08 00 02 40 nop 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) When ftrace gets enabled for this function the kernel will patch these NOPs to: 1036b248: 10 19 57 20 <address of ftrace> 1036b24c: 6f c1 00 80 stw,ma r1,40(sp) 1036b250: 48 21 3f d1 ldw -18(r1),r1 1036b254: e8 20 c0 02 bv,n r0(r1) 1036b258 <do_sys_open>: 1036b258: e8 3f 1f df b,l,n .-c,r1 1036b25c: 08 03 02 41 copy r3,r1 1036b260: 6b c2 3f d9 stw rp,-14(sp) 1036b264: 08 1e 02 43 copy sp,r3 1036b268: 6f c1 01 00 stw,ma r1,80(sp) So the first NOP in do_sys_open() will be patched to jump backwards into some minimal trampoline code which pushes a stackframe, saves r1 which holds the return address, loads the address of the real ftrace function, and branches to that location. For 64 Bit things are getting a bit more complicated (and longer) because we must make sure that the address of ftrace location is 8 byte aligned, and the offset passed to ldd for fetching the address is 8 byte aligned as well. Note that gcc has a bug which misplaces the function label, and needs a patch to make dynamic ftrace work. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90751 for details. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-06-06 04:32:22 +08:00
}
2010-10-06 02:29:27 +08:00
return 0;
}
void module_arch_cleanup(struct module *mod)
{
deregister_unwind_table(mod);
}
#ifdef CONFIG_64BIT
void *dereference_module_function_descriptor(struct module *mod, void *ptr)
{
unsigned long start_opd = (Elf64_Addr)mod->core_layout.base +
mod->arch.fdesc_offset;
unsigned long end_opd = start_opd +
mod->arch.fdesc_count * sizeof(Elf64_Fdesc);
if (ptr < (void *)start_opd || ptr >= (void *)end_opd)
return ptr;
return dereference_function_descriptor(ptr);
}
#endif