Meelis Roos reported that kernels built with gcc-4.9 do not boot, we
eventually narrowed this down to only impacting machines using
UltraSPARC-III and derivitive cpus.
The crash happens right when the first user process is spawned:
[ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 54.451346]
[ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab #96
[ 54.666431] Call Trace:
[ 54.698453] [0000000000762f8c] panic+0xb0/0x224
[ 54.759071] [000000000045cf68] do_exit+0x948/0x960
[ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100
[ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10
[ 54.978662] Press Stop-A (L1-A) to return to the boot prom
[ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
Further investigation showed that compiling only per_cpu_patch() with
an older compiler fixes the boot.
Detailed analysis showed that the function is not being miscompiled by
gcc-4.9, but it is using a different register allocation ordering.
With the gcc-4.9 compiled function, something during the code patching
causes some of the %i* input registers to get corrupted. Perhaps
we have a TLB miss path into the firmware that is deep enough to
cause a register window spill and subsequent restore when we get
back from the TLB miss trap.
Let's plug this up by doing two things:
1) Stop using the firmware stack for client interface calls into
the firmware. Just use the kernel's stack.
2) As soon as we can, call into a new function "start_early_boot()"
to put a one-register-window buffer between the firmware's
deepest stack frame and the top-most initial kernel one.
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is based upon a report by Meelis Roos showing that it's possible
that we'll try to fetch a property that is 32K in size with some
devices. With the current fixed 3K buffer we use for moving data in
and out of the firmware during PROM calls, that simply won't work.
In fact, it will scramble random kernel data during bootup.
The reasoning behind the temporary buffer is entirely historical. It
used to be the case that we had problems referencing dynamic kernel
memory (including the stack) early in the boot process before we
explicitly told the firwmare to switch us over to the kernel trap
table.
So what we did was always give the firmware buffers that were locked
into the main kernel image.
But we no longer have problems like that, so get rid of all of this
indirect bounce buffering.
Besides fixing Meelis's bug, this also makes the kernel data about 3K
smaller.
It was also discovered during these conversions that the
implementation of prom_retain() was completely wrong, so that was
fixed here as well. Currently that interface is not in use.
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
- all files with identical names copied and renamed to *_64.c
- the remaning files copied as is
- added sparc64 specific files to sparc/prom/Makefile
- teach sparc64 Makefile to look into sparc/prom/
- delete unused Makefile from sparc64/prom/
linking order was not kept for sparc64 with this change.
It was not possible to keep linking order for both sparc and sparc64
and as sparc64 see more testing than sparc it was natural to
break linking order on sparc64. Should it have any effect it
would be detected sooner this way.
printf_32.c and printf_64.c are obvious candidates to be merged
but they are not 100% equal so that was left for later
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>