of_get_property() is used inside the loop, but then the reference to the
node is dropped before dereferencing the prop pointer, which could by then
point to junk if the node has been freed.
Instead use of_property_read_u32() to actually read the property
value before dropping the reference.
of_property_read_u32() requires at least one cell (u32) to be present,
which is stricter than the old logic which would happily dereference a
property of any size. However we believe all device trees in the wild
have at least one cell.
Skiboot may produce memory nodes with more than one cell, but that is
OK, of_property_read_u32() will return the first one.
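
As a rough userspace illustration of the ordering (not the kernel code
itself), the sketch below copies the value out while the reference is
still held and only then drops it. struct toy_node and the toy_* helpers
are hypothetical stand-ins for a refcounted device_node and the of_* API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device tree node. */
struct toy_node {
	int refcount;
	uint32_t *prop;		/* property data owned by the node */
};

static struct toy_node *toy_node_new(uint32_t value)
{
	struct toy_node *np = malloc(sizeof(*np));

	if (!np)
		return NULL;
	np->prop = malloc(sizeof(*np->prop));
	if (!np->prop) {
		free(np);
		return NULL;
	}
	np->refcount = 1;
	*np->prop = value;
	return np;
}

/* Drop a reference; the node and its property die with the last one. */
static void toy_node_put(struct toy_node *np)
{
	if (--np->refcount == 0) {
		free(np->prop);
		free(np);
	}
}

/* Copy the value out while the reference is held, then drop it. */
static int toy_read_then_put(struct toy_node *np, uint32_t *out)
{
	int rc = -1;

	assert(np != NULL && out != NULL);
	if (np->prop) {
		*out = *np->prop;
		rc = 0;
	}
	toy_node_put(np);	/* np and np->prop must not be touched after this */
	return rc;
}
```

The caller only ever sees the copied value, so nothing is dereferenced
after the put.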
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[mpe: Expand change log with device tree details]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The TUNE_CELL option allows you to build a kernel that runs on multiple
CPUs but is tuned (ie. optimised) to run on Cell CPUs. Nowadays no one
is building a distro in that fashion, and any users who are building
custom kernels for their Cell machines are better off building with
CONFIG_CELL_CPU, which builds a kernel that only runs on Cell and
therefore can be optimised even more aggressively.
Dropping the option also avoids confusing other users, who are presented
with an option to tune for Cell when they are not building for a Cell
CPU at all.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
EBB (Event Based Branches) are currently only available on POWER8, so we
should skip them on other CPUs.
I've found that at least one test loops forever on 970MP (cycles_with_freeze_test).
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
[mpe: Minor change log editing, add skip to cpu_event_vs_ebb_test]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When building against older kernel headers, currently the tm-syscall
test fails to build because PPC_FEATURE2_HTM_NOSC is not defined.
Tweak the test so that if PPC_FEATURE2_HTM_NOSC is not defined it still
builds, but prints a warning at run time and marks the test as skipped.
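
A minimal sketch of that pattern, assuming a hypothetical TOY_SKIP return
value and a hypothetical toy_tm_syscall_test() wrapper (the real selftest
harness names and values are not shown here):

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: illustrative "skipped" code, not the real harness value. */
#define TOY_SKIP 4

static int toy_tm_syscall_test(unsigned long hwcap2)
{
#ifdef PPC_FEATURE2_HTM_NOSC
	/* Headers know the feature bit: skip only if the CPU lacks it. */
	if (!(hwcap2 & PPC_FEATURE2_HTM_NOSC))
		return TOY_SKIP;
	return 0;	/* the real test body would run here */
#else
	/* Old headers: still builds, but warns and skips at run time. */
	(void)hwcap2;
	fprintf(stderr, "PPC_FEATURE2_HTM_NOSC not defined, skipping test\n");
	return TOY_SKIP;
#endif
}
```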
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This list has gotten too long. Split it into individual lines and sort
them, so in future we can add new entries more cleanly.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The kernel log buffer is often much longer than the size of a terminal,
so paginate its output.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The paca display is already more than 24 lines, which can be problematic
if you have an old school 80x24 terminal, or more likely you are on a
virtual terminal which does not scroll for whatever reason.
This patch adds a new command "#", which takes a single (hex) numeric
argument: lines per page. It will cause the output of "dp" and "dpa"
to be broken into pages, if necessary.
Sample output:
0:mon> # 10
0:mon> dp1
paca for cpu 0x1 @ c00000000fdc0480:
possible = yes
present = yes
online = yes
lock_token = 0x8000 (0x8)
paca_index = 0x1 (0xa)
kernel_toc = 0xc000000000eb2400 (0x10)
kernelbase = 0xc000000000000000 (0x18)
kernel_msr = 0xb000000000001032 (0x20)
emergency_sp = 0xc00000003ffe8000 (0x28)
mc_emergency_sp = 0xc00000003ffe4000 (0x2e0)
in_mce = 0x0 (0x2e8)
data_offset = 0x7f170000 (0x30)
hw_cpu_id = 0x8 (0x38)
cpu_start = 0x1 (0x3a)
kexec_state = 0x0 (0x3b)
[Hit a key (a:all, q:truncate, any:next page)]
0:mon>
__current = 0xc00000007e696620 (0x290)
kstack = 0xc00000007e6ebe30 (0x298)
stab_rr = 0xb (0x2a0)
saved_r1 = 0xc00000007ef37860 (0x2a8)
trap_save = 0x0 (0x2b8)
soft_enabled = 0x0 (0x2ba)
irq_happened = 0x1 (0x2bb)
io_sync = 0x0 (0x2bc)
irq_work_pending = 0x0 (0x2bd)
nap_state_lost = 0x0 (0x2be)
0:mon>
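
The line counting behind such a pager can be sketched in userspace as
below. This is only an illustration: struct toy_pager and both helpers
are hypothetical names, and treating 0 as "pagination off" is an
assumption. Note the argument is parsed as hex, so "# 10" in the sample
above means 16 lines per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_pager {
	unsigned long lines_per_page;	/* assumption: 0 means "do not paginate" */
	unsigned long lines_printed;	/* lines printed on the current page */
};

/* Parse the argument to "#" as hex, so "10" means 16 lines per page. */
static unsigned long toy_parse_lines(const char *arg)
{
	return strtoul(arg, NULL, 16);
}

/* Account for one printed line; true means "pause for a key now". */
static bool toy_pager_line(struct toy_pager *p)
{
	assert(p != NULL);
	if (p->lines_per_page == 0)
		return false;
	if (++p->lines_printed < p->lines_per_page)
		return false;
	p->lines_printed = 0;	/* start counting the next page */
	return true;
}
```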
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[mpe: Use bool, make some variables static]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
of_get_next_parent can be used to simplify the while() loop and
avoid the need of a temp variable.
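
For illustration, a userspace model of the resulting loop shape.
struct toy_node and the toy_* helpers are hypothetical;
toy_get_next_parent() is modelled on of_get_next_parent(), which takes a
reference on the parent and drops the one on the node passed in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device tree node. */
struct toy_node {
	struct toy_node *parent;
	int refcount;
	int is_target;
};

static struct toy_node *toy_get(struct toy_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

static void toy_put(struct toy_node *np)
{
	if (np)
		np->refcount--;
}

/* Take a reference on the parent and drop the one on the child. */
static struct toy_node *toy_get_next_parent(struct toy_node *np)
{
	struct toy_node *parent;

	if (!np)
		return NULL;
	parent = toy_get(np->parent);
	toy_put(np);
	return parent;
}

/* Walk upwards with no temp variable; returns a referenced node or NULL. */
static struct toy_node *toy_find_ancestor(struct toy_node *start)
{
	struct toy_node *np = toy_get(start);

	while (np && !np->is_target)
		np = toy_get_next_parent(np);
	return np;
}
```

Only the node that is returned still holds a reference when the loop
ends.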
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
of_get_next_parent can be used to simplify the while() loop and
avoid the need of a temp variable.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 3c8464a9b1 ("powerpc: Delete old PrPMC 280/2800 support")
we got rid of most of the C code, and the Makefile/Kconfig hooks, but
it seems I left the platform's DTS file orphaned in the tree as well as
the boot code. Here we get rid of them both.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CONFIG_INPUT_KEYBDEV does not exist and no additional keyboard-specific
options are needed to get the keyboard working.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
.exit.text is discarded at run time and there are some references from
that to .exit.data, so we need to discard .exit.data at run time as well.
Fixes these errors:
`.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o
`.exit.data' referenced in section `.exit.text' of drivers/built-in.o: defined in discarded section `.exit.data' of drivers/built-in.o
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
No need to have two atomic operations (update and fetch/check) when
decreasing PE's number of passed devices as one atomic operation
is enough.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When adding a vPHB in cxl_pci_vphb_add(), we allocate a pci_controller
struct using pcibios_alloc_controller(). However, we don't free it in
cxl_pci_vphb_remove(), causing a leak.
Call pcibios_free_controller() in cxl_pci_vphb_remove() to free the vPHB
data structure correctly.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Export pcibios_free_controller(), so it can be used by the cxl module to
free virtual PHBs.
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch provides individual system call numbers for the following
System V IPC system calls, on PowerPC, so that they do not need to be
multiplexed:
* semop, semget, semctl, semtimedop
* msgsnd, msgrcv, msgget, msgctl
* shmat, shmdt, shmget, shmctl
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Now that pseries selects PCI_MSI && PCI, EEH will always be true, and
therefore CONFIG_PSERIES_MSI will always be true. So drop it, and move
msi.o to obj-y.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Make it entirely clear in the Makefile that we always build the pci
related files by moving them to obj-y.
Note that CONFIG_EEH is now always enabled on pseries, because it
depends on PSERIES && PCI.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Now that we always have CONFIG_PCI=y for pseries, we can stop guarding
code with CONFIG_PCI ifdefs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The pseries build with PCI=n looks to have been broken for at least 5
years, and no one's noticed or cared.
Following the obvious breakages backward, the first commit I can find
that builds is the parent of 2eb4afb69f ("powerpc/pci: Move pseries
code into pseries platform specific area") from April 2009.
A distro would never ship a PCI=n kernel, so it is only useful for folks
building custom kernels. Also on KVM the virtio devices appear on PCI,
so it would only be useful if you were building kernels specifically to
run on PowerVM and with no PCI devices.
The added code complexity, and testing load (which we've clearly not
been doing), is not justified by the small reduction in kernel size for
such a niche use case.
So just make PCI non-optional on pseries.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
My recent commit d2036f30cf ("scripts/kconfig/Makefile: Allow
KBUILD_DEFCONFIG to be a target"), contained a bug in that when it
checks if KBUILD_DEFCONFIG is a file it forgets to prepend $(srctree) to
the path.
This causes the build to fail when building out of tree (with O=), and
when the value of KBUILD_DEFCONFIG is 'defconfig'. In that case we will
fail to find the 'defconfig' file, because we look in the build
directory not $(srctree), and so we will call Make again with
'defconfig' as the target. From there we loop infinitely calling 'make
defconfig' again and again.
The fix is simple: we need to look for the file under $(srctree).
Fixes: d2036f30cf ("scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target")
Reported-by: Olof Johansson <olof@lixom.net>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We need to properly identify whether a hugepage is an explicit or
a transparent hugepage in follow_huge_addr(). We used to depend
on the hugepage shift argument to do that, but in some cases that can
give the wrong result. For example:
On finding a transparent hugepage we set hugepage shift to PMD_SHIFT.
But we can end up clearing the thp pte, via pmdp_huge_get_and_clear.
We do prevent reusing the pfn page via the usage of
kick_all_cpus_sync(). But that happens after we updated the pte to 0.
Hence in follow_huge_addr() we can find the hugepage shift set, but the
transparent huge page check fails for a thp pte.
NOTE: We fixed a variant of this race against thp split in commit
691e95fd73 ("powerpc/mm/thp: Make page table walk safe against thp
split/collapse").
Without this patch, we may hit the BUG_ON(flags & FOLL_GET) in
follow_page_mask occasionally.
In the long term, we may want to switch ppc64 64k page size config to
enable CONFIG_ARCH_WANT_GENERAL_HUGETLB.
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
After commit e2b3d202d1 ("powerpc: Switch 16GB and 16MB explicit
hugepages to a different page table format"), we don't need to support
is_hugepd() for 64K page size.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
pi_buff is being memset before it is sanity checked. Move the
memset after the null pi_buff sanity check to avoid an oops.
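
The corrected ordering, reduced to a hypothetical userspace helper
(toy_clear_buf() is not a name from the driver):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Check the pointer first, and only then touch the memory behind it. */
static int toy_clear_buf(char *buf, size_t len)
{
	if (!buf)
		return -1;	/* nothing was dereferenced */
	memset(buf, 0, len);
	return 0;
}
```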
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Always include a timeout when waiting for secondary cpus to enter OPAL
in the kexec path, rather than only when crashing.
Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
derive_parent() has similar semantics to what we have in the newly
introduced of_helpers module. The replacement reduces the code base and
propagates the actual error code to the caller.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If a node name contains no '/', strrchr() returns NULL, which might lead
to a crash. Replace strrchr() with kbasename() and modify the condition
to avoid such behaviour.
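
A userspace sketch of why this is safe: toy_basename() below is a
hypothetical helper modelled on kbasename(), which falls back to the
whole string when there is no '/' instead of handing back NULL.

```c
#include <assert.h>
#include <string.h>

/* Return the part after the last '/', or the whole path if there is none. */
static const char *toy_basename(const char *path)
{
	const char *tail;

	assert(path != NULL);
	tail = strrchr(path, '/');
	return tail ? tail + 1 : path;
}
```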
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The helper kstrndup() will do the same in one line.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If we have a full node name like /foo/bar and /foo is not found, then
parent_path is left unfreed. So free the memory before returning to the
caller.
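
The shape of the fix, as a userspace sketch under stated assumptions:
toy_strndup() stands in for kstrndup(), toy_lookup() is a hypothetical
lookup in which only "/foo" exists, and toy_check_parent() is a
hypothetical name for the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for kstrndup(): copy at most max bytes of s into a new string. */
static char *toy_strndup(const char *s, size_t max)
{
	size_t len = 0;
	char *copy;

	while (len < max && s[len])
		len++;
	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, s, len);
	copy[len] = '\0';
	return copy;
}

/* Hypothetical lookup: only "/foo" exists in this toy tree. */
static int toy_lookup(const char *path)
{
	return strcmp(path, "/foo") == 0 ? 0 : -1;
}

static int toy_check_parent(const char *path)
{
	const char *slash;
	char *parent_path;
	int rc;

	assert(path != NULL);
	slash = strrchr(path, '/');
	if (!slash || slash == path)
		return -1;	/* no parent component to look up */

	parent_path = toy_strndup(path, (size_t)(slash - path));
	if (!parent_path)
		return -1;

	rc = toy_lookup(parent_path);
	free(parent_path);	/* freed whether or not the lookup succeeded */
	return rc;
}
```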
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Extract a new module to share the code between other modules.
There is no functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fix the memory leak in uninorth_create_gatt_table(): we've lost a
kfree() on the exit path for the pages array allocated there.
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
'nvram_create_os_partition' should be 'nvram_create_partition'.
Use __func__ to have it right, as done elsewhere in this file.
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
If 'nvram_write_header' fails, then 'new_part' should be freed, otherwise,
there is a memory leak.
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Arch Makefiles can set KBUILD_DEFCONFIG to tell kbuild the name of the
defconfig that should be built by default.
However currently there is an assumption that KBUILD_DEFCONFIG points to
a file at arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG).
We would like to use a target, using merge_config, as our defconfig, so
adapt the logic in scripts/kconfig/Makefile to allow that.
To minimise the chance of breaking anything, we first check if
KBUILD_DEFCONFIG is a file, and if so we do the old logic. If it's not a
file, then we call the top-level Makefile with KBUILD_DEFCONFIG as the
target.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michal Marek <mmarek@suse.com>
This adds a helper, virt_to_pfn(), and removes the open-coded uses of
the same.
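
Roughly, such a helper converts a linear-map virtual address to a
physical address and shifts off the page offset bits. The userspace
sketch below is illustrative only: the TOY_* constants (4K pages, a
linear map starting at 0xc000000000000000) and the toy_* names are
assumptions, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12			/* assumption: 4K pages */
#define TOY_PAGE_OFFSET	0xc000000000000000ULL	/* assumption: linear map base */

/* Virtual to physical for an address inside the toy linear map. */
static uint64_t toy_pa(uint64_t vaddr)
{
	assert(vaddr >= TOY_PAGE_OFFSET);
	return vaddr - TOY_PAGE_OFFSET;
}

/* The helper: one call instead of an open-coded shift at every site. */
static uint64_t toy_virt_to_pfn(uint64_t vaddr)
{
	return toy_pa(vaddr) >> TOY_PAGE_SHIFT;
}
```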
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
powerpc has a link register (lr) used for calling functions. We "bl
<func>" to call a function, and "blr" to return back to the call site.
The lr is only a single register, so if we call another function from
inside this function (ie. nested calls), software must save away the
lr on the software stack before calling the new function. Before
returning (ie. before the "blr"), the lr is restored by software from
the software stack.
This makes branch prediction quite difficult for the processor as it
will only know the branch target just before the "blr".
To help with this, modern powerpc processors keep a (non-architected)
hardware stack of lr called a "link stack". When a "bl <func>" is
run, the lr is pushed onto this stack. When a "blr" is called, the
branch predictor pops the lr value from the top of the link stack, and
uses it to predict the branch target. Hence the processor pipeline
knows a lot earlier the branch target.
This works great but there are some cases where you call "bl" but
without a matching "blr". One such case is when trying to determine
the program counter (which can't be read directly). Here you "bl+4;
mflr" to get the program counter. If you do this, the link stack will
get out of sync with reality, causing the branch predictor to
mis-predict subsequent function returns.
To avoid this, modern micro-architectures have a special case of bl.
Using the form "bcl 20,31,+4", ensures the processor doesn't push to
the link stack.
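
The effect can be modelled with a toy link stack in C. This is a
deliberately simplified sketch, not a description of any real
micro-architecture: the depth, the struct and the toy_* functions are
all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LINK_STACK_DEPTH 16	/* assumption: arbitrary depth */

struct toy_link_stack {
	unsigned long entries[TOY_LINK_STACK_DEPTH];
	size_t top;
	unsigned int mispredicts;
};

/* "bl": the predictor remembers the return address. */
static void toy_bl(struct toy_link_stack *ls, unsigned long return_addr)
{
	assert(ls != NULL);
	if (ls->top < TOY_LINK_STACK_DEPTH)
		ls->entries[ls->top++] = return_addr;
}

/* "bcl 20,31,+4": lr is still set, but nothing is pushed. */
static void toy_bcl_20_31(struct toy_link_stack *ls, unsigned long return_addr)
{
	(void)ls;
	(void)return_addr;
}

/* "blr": pop the prediction and compare it with the real lr. */
static void toy_blr(struct toy_link_stack *ls, unsigned long actual_lr)
{
	unsigned long predicted;

	assert(ls != NULL);
	predicted = ls->top ? ls->entries[--ls->top] : 0;
	if (predicted != actual_lr)
		ls->mispredicts++;
}
```

Calling toy_bl() twice (once for the real call, once for the "bl+4" used
to read the program counter) and then toy_blr() with the real return
address counts a mispredict; replacing the second toy_bl() with
toy_bcl_20_31() does not.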
The 32 and 64 bit variants of __get_datapage() use a "bl; mflr" to
determine the loaded address of the VDSO. The current versions of
these attempt to use this special bl variant.
Unfortunately they use +8 rather than the required +4. As a result the
current code lets the link stack get out of sync with reality, which
degrades performance.
This patch moves it to bcl+4 by moving __kernel_datapage_offset out of
__get_datapage().
With this patch, running a gettimeofday() (which uses
__get_datapage()) microbenchmark we get a decent bump in performance
on POWER7/8.
For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
POWER8:
64bit gets ~4% improvement
32bit gets ~9% improvement
POWER7:
64bit gets ~7% improvement
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This adds a benchmark directory to the powerpc selftests and adds a
gettimeofday() benchmark to it.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch defines an enum for the three bolted SLB indexes we use, and
switches the functions that take the indexes as an argument to use the
enum.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Andy Lutomirski says:
Some dynamic loaders may be slightly faster if a GNU hash is
available.
This is unlikely to have any measurable effect on the time it takes
to resolve vdso symbols (since there are so few of them). In some
contexts, it can be a win for a different reason: if every DSO has a
GNU hash section, then libc can avoid calculating SysV hashes at
all. Both musl and glibc appear to have this optimization.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In the unlikely event that drv is null, the current code will
perform a null pointer dereference with it when printing a dev_dbg
message. Instead, the BUG_ON check on drv should be performed
before we emit the dev_dbg message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently, little endian is only supported on powernv and pseries.
However, the Kconfigs still allow us to include other platforms in an LE
kernel, which may waste space or even break the build if a BE-only
platform assumes it is always built for a BE kernel. So modify the
Kconfigs of BE-only platforms so that they are not built for an LE
kernel.
For 32bit only platforms, nothing needs to be done, because
CPU_LITTLE_ENDIAN depends on PPC64. For 64bit supported platforms, add
CPU_BIG_ENDIAN to dependencies explicitly, so that these platforms will
be disabled for LE [Suggested-by: Cédric Le Goater <clg@fr.ibm.com>].
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>