Commit Graph

719 Commits

Author SHA1 Message Date
Kirill A. Shutemov e1534ae950 mm: differentiate page_mapped() from page_mapcount() for compound pages
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).

On other hand page_mapcount() return mapcount for this particular small
page.

This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.

Most users outside core-mm should use page_mapcount() instead of
page_mapped().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Linus Torvalds 0f0836b7eb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:

 - RO/NX attribute fixes for patch module relocations from Josh
   Poimboeuf.  As part of this effort, module.c has been cleaned up as
   well and livepatching is piggy-backing on this cleanup.  Rusty is OK
   with this whole lot going through livepatching tree.

 - symbol disambiguation support from Chris J Arges.  That series is
   also

        Reviewed-by: Miroslav Benes <mbenes@suse.cz>

   but this came in only after I've alredy pushed out.  Didn't want to
   rebase because of that, hence I am mentioning it here.

 - symbol lookup fix from Miroslav Benes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
  livepatch: Cleanup module page permission changes
  module: keep percpu symbols in module's symtab
  module: clean up RO/NX handling.
  module: use a structure to encapsulate layout.
  gcov: use within_module() helper.
  module: Use the same logic for setting and unsetting RO/NX
  livepatch: function,sympos scheme in livepatch sysfs directory
  livepatch: add sympos as disambiguator field to klp_reloc
  livepatch: add old_sympos as disambiguator field to klp_func
2016-01-14 16:38:02 -08:00
Vineet Gupta 6b538db7c6 ARC: dw2 unwind: Catch Dwarf SNAFUs early
Instead of seeing empty stack traces, let kernel fail early so dwarf
issues can be fixed sooner

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 14:01:49 +05:30
Vineet Gupta 6d0d506012 ARC: dw2 unwind: Don't bail for CIE.version != 1
The rudimentary CIE.version == 3 handling is already present in code
(for return address register specification)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 14:01:25 +05:30
Vineet Gupta 2d64affc92 Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing"
Blingly ignoring CIE.version != 1 was a bad idea.
It still leaves "desirability" when running perf with callgraphing where libgcc
symbols might show in hotspot.

More importantly, basic CIE.version == 3 support already exists in code:

|
|   retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
|

Next commit with simply add continue-not-bail for CIE.version != 1

This reverts commit 323f41f9e7.
2015-12-21 13:29:44 +05:30
Vineet Gupta 07fd7d4bbc ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE
At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
which are served by libgcc.

Modules historically are NOT linked with libgcc to avoid code bloat, reducing
runtime relocation fixups etc. I even once tried doing that but got lost
in makefile intricacies.

This means modules at -Os don't get the millicode thunks, causing build
failures below:

| MODPOST 5 modules
| ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
|....
|....

Workaround that by inhibiting millicode thunks for loadable modules

Fixes STAR 9000641864:
("Linux built with optimizations for size emits errors for modules")

Reported-by: Anton Kolesov <akolesov@synosys.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 13:01:19 +05:30
Alexey Brodkin 4b32e89af7 ARC: mm: fix building for MMU v2
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.

But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.

And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
  CC      arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
   aux_tag = ARC_REG_IC_PTAG;
             ^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------

The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.

We don't use it in cache_line_loop_v2() anyways so who cares.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 12:10:40 +05:30
Vineet Gupta 899cfd2bb0 ARC: mm: HIGHMEM: Fix section mismatch splat
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
| .init.text:__alloc_bootmem_low()
The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-21 12:10:40 +05:30
Vineet Gupta 575a9d4e2c ARC: smp: Rename platform hook @init_cpu_smp -> @init_per_cpu
Makes it similar to smp_ops which also has callback with same name

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:56 +05:30
Noam Camus b474a02382 ARC: rename smp operation init_irq_cpu() to init_per_cpu()
This will better reflect its description i.e. "any needed setup..."
and not just do an "IPI request".

Signed-off-by: Noam Camus <noamc@ezchip.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 12:56:43 +05:30
Vineet Gupta 323f41f9e7 ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing
ARC dwarf unwinder only supports CIE version == 1
The boot time dwarf sanitizer (part of binary lookup table constructor)
would simply bail if it saw CIE version == 3, rendering unwinder with a
NULL lookup table.

It seems libgcc linked with kernel does have such entries.

With fallback linear search removed, and a NULL binary lookup table,
unwinder fails to generate any stack trace.

So allow graceful ignoring of unsupported CIE entries.

This problem was initially seen in Alexey's setup (and not mine) as he
was using buildroot built toolchain (libgcc) which doesn't get built with
CFLAGS_FOR_TARGET="-gdwarf-2 which is my default

Fixes STAR 9000985048: "kernel unwinder broken with stock tools"

Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Reported-by Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:10:23 +05:30
Vineet Gupta bc79c9a721 ARC: dw2 unwind: Reinstante unwinding out of modules
The fix which removed linear searching of dwarf (because binary lookup
data always exists) missed out on the fact that modules don't get the
binary lookup tables info. This caused unwinding out of modules to stop
working.

So add binary lookup header setup (equivalent of eh_frame_hdr setup) to
modules as well.

While at it, confine the header setup to within unwinder code,
reducing one API exposed out of unwinder code.

Fixes: 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:10:23 +05:30
Vineet Gupta ff1c0b6a79 ARC: [plat-sim] unbork non default CONFIG_LINUX_LINK_BASE
HIGHMEM support bumped the default memory size for nsim platform to 1G.
Thus total memory ended at the very edge of start of peripherals address
space. With linux link base shifted, memory started bleeding into
peripheral space which caused early boot bad_page spew !

Fixes: 29e332261d ("ARC: mm: HIGHMEM: populate high memory from DT")
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-17 11:06:43 +05:30
Vineet Gupta c512c6ba7a ARC: intc: Document arc_request_percpu_irq() better
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-12 16:04:12 +05:30
Vineet Gupta c6317bc7c5 ARCv2: perf: Ensure perf intr gets enabled on all cores
This was the second perf intr issue

perf sampling on multicore requires intr to be enabled on all cores.
ARC perf probe code used helper arc_request_percpu_irq() which calls
 - request_percpu_irq() on core0
 - enable_percpu_irq() on all all cores (including core0)

genirq requires that request be made ahead of enable call.
However if perf probe happened on non core0 (observed on a 3.18 kernel),
enable would get called ahead of request, failing obviously and
rendering perf intr disabled on all such cores

[   11.120000] 1 ARC perf       : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
[   11.130000] 1 -----> enable_percpu_irq() IRQ 20 failed
[   11.140000] 3 -----> enable_percpu_irq() IRQ 20 failed
[   11.140000] 2 -----> enable_percpu_irq() IRQ 20 failed
[   11.140000] 0 =====> request_percpu_irq() IRQ 20
[   11.140000] 0 -----> enable_percpu_irq() IRQ 20

Fix this fragility, by calling request_percpu_irq() on whatever core
calls probe (there is no requirement on which core calls this anyways)
and then calling enable on each cores.

Interestingly this started as invesigation of STAR 9000838902:
"sporadically IRQs enabled on perf prob"

which was about occassional boot spew as request_percpu_irq got called
non-locally (from an IPI), and re-enabled interrupts in following path
proc_mkdir ->  spin_unlock_irq()

which the irq work code didn't like.

| ARC perf     : 8 counters (48 bits), 113 conditions, [overflow IRQ support]
|
| BUG: failure at ../kernel/irq_work.c:135/irq_work_run_list()!
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.10-01127-g285efb8e66d1 #2
|
| Stack Trace:
|  arc_unwind_core.constprop.1+0x94/0x104
|  dump_stack+0x62/0x98
|  irq_work_run_list+0xb0/0xb4
|  irq_work_run+0x22/0x3c
|  do_IPI+0x74/0x9c
|  handle_irq_event_percpu+0x34/0x164
|  handle_percpu_irq+0x58/0x78
|  generic_handle_irq+0x1e/0x2c
|  arch_do_IRQ+0x3c/0x60
|  ret_from_exception+0x0/0x8

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: <stable@vger.kernel.org> #4.2+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-12 16:03:59 +05:30
Vineet Gupta 5bf704c204 ARC: intc: No need to clear IRQ_NOAUTOEN
arc_request_percpu_irq() is called by all cores to request/enable percpu
irq. It has some "prep" calls needed by genirq:
 - setup percpu devid
 - disable IRQ_NOAUTOEN

However given that enable_percpu_irq() is called enayways, latter can be
avoided.

We are now left with irq_set_percpu_devid() quirk and that too for
ARCompact builds only, since previous patch updated ARCv2 intc to do this
in the "right" place, i.e. irq map function.

By next release, this will ultimately be fixed for ARCompact as well.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-12 16:03:59 +05:30
Vineet Gupta 8eb0984bf4 ARCv2: intc: Fix random perf irq disabling in SMP setup
As part of fixing another perf issue, observed that after a perf run,
the interrupt got disabled on one/more cores.

Turns out that despite requesting perf irq as percpu, the flow handler
registered was not handle_percpu_irq()

Given that on ARCv2 cores, IRQs < 24 are always private to cpu, we
register the right handler at the very onset.

Before Fix

| [ARCLinux]# cat /proc/interrupts | grep perf
|  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
|  20:    0    522      8    51916  ARCv2 core Intc  20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
|  20:    0    522      8   104368  ARCv2 core Intc  20 ARC perf counters

After Fix

| [ARCLinux]# cat /proc/interrupts | grep perf
|  20:    0      0      0       0  ARCv2 core Intc  20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
|  20:  64198  62012  62697  67803  ARCv2 core Intc  20 ARC perf counters
|
| [ARCLinux]# perf record -c 20000 /sbin/hackbench
| Running with 10*40 (== 400) tasks.
|
| [ARCLinux]# cat /proc/interrupts | grep perf
|  20: 126014 122792 123301 133654  ARCv2 core Intc  20 ARC perf counters

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org #4.2+
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-12 16:03:41 +05:30
Alexey Brodkin 6d1a2adef7 ARC: [axs10x] cap ethernet phy to 100 Mbit/sec
Current ARC SDP boards cannot reliably handle 1Gbit
Ethernet connections due to limitations in hardware.

To make sure networking is stable on the board we're
limiting phy to 100 Mbit.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-12-07 19:40:03 +05:30
Rusty Russell 7523e4dc50 module: use a structure to encapsulate layout.
Makes it easier to handle init vs core cleanly, though the change is
fairly invasive across random architectures.

It simplifies the rbtree code immediately, however, while keeping the
core data together in the same cachline (now iff the rbtree code is
enabled).

Acked-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-04 22:46:25 +01:00
Vineet Gupta 2e22502c08 ARC: dw2 unwind: Remove falllback linear search thru FDE entries
Fixes STAR 9000953410: "perf callgraph profiling causing RCU stalls"

| perf record -g -c 15000 -e cycles /sbin/hackbench
|
| INFO: rcu_preempt self-detected stall on CPU
| 1: (1 GPs behind) idle=609/140000000000002/0 softirq=2914/2915 fqs=603
| Task dump for CPU 1:

in-kernel dwarf unwinder has a fast binary lookup and a fallback linear
search (which iterates thru each of ~11K entries) thus takes 2 orders of
magnitude longer (~3 million cycles vs. 2000). Routines written in hand
assembler lack dwarf info (as we don't support assembler CFI pseudo-ops
yet) fail the unwinder binary lookup, hit linear search, failing
nevertheless in the end.

However the linear search is pointless as binary lookup tables are created
from it in first place. It is impossible to have binary lookup fail while
succeed the linear search. It is pure waste of cycles thus removed by
this patch.

This manifested as RCU stalls / NMI watchdog splat when running
hackbench under perf with callgraph profiling. The triggering condition
was perf counter overflowing in routine lacking dwarf info (like memset)
leading to patheic 3 million cycle unwinder slow path and by the time it
returned new interrupts were already pending (Timer, IPI) and taken
rightaway. The original memset didn't make forward progress, system kept
accruing more interrupts and more unwinder delayes in a vicious feedback
loop, ultimately triggering the NMI diagnostic.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-23 21:36:49 +05:30
Vineet Gupta e81b75f7b2 ARC: remove SYNC from __switch_to()
SYNC in __switch_to() is a historic relic and not needed at all.

 - In UP context it is obviously useless, why would we want to stall
   the core for all updates to stack memory of t0 to complete before
   loading kernel mode callee registers from t1 stack's memory.

 - In SMP, there could be potential race in which outgoing task could
   be concurrently picked for running on a different core, thus writes
   to stack here need to be visible before the reads from stack on
   other core. Peter confirmed that generic schedular already has needed
   barriers (by way of rq lock) so there is no need for additional arch
   barrier.

This came up when Noam was trying to replace this SYNC with EZChip
specific hardware thread scheduling instruction for their platform
support.

Link: http://lkml.kernel.org/r/20151102092654.GM17308@twins.programming.kicks-ass.net
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Cc: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-17 22:05:30 +05:30
Vineet Gupta b8628f3fe4 ARCv2: Use the default irq priority for idle sleep
Although kernel doesn't support the multiple IRQ priority levels provided
by HS38x core intc yet, ensure that the default prio value is used
anyways by relevant code.

SLEEP insn needs to be provided the IRQ priority level which can
interrupt it. This needs to be the default level which maynot
necessarily be 0 as assumed by current code.

This change allows a kernel with ARCV2_IRQ_DEF_PRIO = 1 to boot fine.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16 14:17:06 +05:30
Vineet Gupta 512b5b89b9 ARC: Abstract out ISA specific SLEEP args
No semantical changes, prepares for ARCv2 specific change in next commit

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16 14:17:02 +05:30
Vineet Gupta 61a1634818 ARC: comments update
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-16 12:00:16 +05:30
sujayraaj 17c4fbaf1a ARC: switch to arc-linux- CROSS_COMPILE prefix across all configs
When building kernel with buildroot built toolchain, CROSS_COMPILE
currently needs adjustment even if minor. This is because the defconfigs
prefer "arc-linux-uclibc-" prefix from hand built (non buildroot) toolchain
while buildroot provides "arc-buildroot-linux-uclibc-"

To avoid this use the common "arc-linux-" prefix which is provided by
buildroot and has also been in hand built tools for quite some time.

Signed-off-by: sujayraaj <sujayraaj@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: updated changelog]
2015-11-16 12:00:09 +05:30
Linus Torvalds b3a0d9a232 ARC fixes for 4.4-rc1
- A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build failure)
 - cpu_relax() now compiler barrier for UP as well
 - Handling of userspace Bus Errors for ARCompact builds
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWRucxAAoJEGnX8d3iisJehaMP/RBYry4TGSSYDC7DqkbpTDlv
 Wd/HE3HDNC0r6hqLXMO2MBLWLvK22pJPcGzXkXtw8kqwTYtKWjZ0IE3R72Lgfw4P
 xlWKdjeI4RXkzJKOZhh6/7HVTkOSto4cUAcdwFcq4ec8asJBAwl/zWoM2Rw8PdfG
 A11iAZTHUSavj/QkFpuqhtvRMRUe76cK41RvdoJfOtB9MjRF3XD0+ceXeDTYuoRb
 aETH42JS5XjRvGShcvaUOCKDZcxlsPyd9LZZAzrLLIoepb7pBluh+YVJH3wJXBcl
 ECjXSprv6GUR1C8R7G3lMtGwIt2tBINTuxH/ZVQp2pUKIUW/TL8MXEQGvWfrosXL
 SbgsIYQSTuc9aO5c5qZ7MSEWz+hLDVlrgWZzs7FsNH7wxREQgl0hjsr382X0Y4n0
 tWPVMr0Hvu7rLdH0gvxIXA3rF92Q9kfLTXj2flaiwXgCxoOoeQj/CtqhUs1xbzed
 qJXf9bwWsjhexxwvHi9CJpyYaVFDp2kkxMoJvzvsLT9cOUGC0XKHnCT4Q96VE5Ms
 9bVBngAugeZRNqB8vKi/84oU8A1Sq+KoTk6b87Z/wpzkh06tmsMQrOEbzoZQsDh9
 6yPW7hgYb794apY9oKwTshHZzXDPg8J/+SVzKht8f84YtSbzS476K70PqnFeoUkD
 2W7IeYKaUolgt3n3BoOO
 =isVd
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:
 "Found a couple of brown paper bag bugs with the prev pull request
  (including a SMP build breakage report from Guenter).  Since these are
  urgent I also decided to send over a bunch of other pending fixes
  which could have otherwise waited an rc or two.

  Summary:

   - A bunch of brown paper bag bugs (MAINTAINERS list email, SMP build
     failure)
   - cpu_relax() now compiler barrier for UP as well
   - handling of userspace Bus Errors for ARCompact builds"

* tag 'arc-4.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: Fix silly typo in MAINTAINERS file
  ARC: cpu_relax() to be compiler barrier even for UP
  ARC: use ASL assembler mnemonic
  ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
  ARC: remove extraneous header include
  ARCv2: lib: memcpy: use local symbols
2015-11-14 09:09:37 -08:00
Vineet Gupta 1cfc05cbe2 ARC: cpu_relax() to be compiler barrier even for UP
cpu_relax() on ARC has been barrier only for SMP (and no-op for UP). Per
recent discussions, it is safer to make it a compiler barrier
unconditionally.

Link: http://lkml.kernel.org/r/53A7D3AA.9020100@synopsys.com
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:30 +05:30
Vineet Gupta a6416f57ce ARC: use ASL assembler mnemonic
ARCompact and ARCv2 only have ASL, while binutils used to support LSL as
a alias mnemonic.

Newer binutils (upstream) don't want to do that so replace it.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:21 +05:30
Vineet Gupta 541366da6a ARC: [arcompact] Handle bus error from userspace as Interrupt not exception
Bus errors from userspace on ARCompact based cores are handled by core
as a high priority L2 interrupt but current code treated it as interrupt
Handling an interrupt like exception is certainly not going to go unnoticed.
(and it worked so far as we never saw a Bus error from userspace until
IPPK guys tested a DDR controller with ECC error detection etc hence
needed to explicitly trigger/handle such errors)

 - So move mem_service exception handler from common code into ARCv2 code.
 - In ARCompact code, define  mem_service as L2 interrupt handler which
   just drops down to pure kernel mode and goes of to enqueue SIGBUS

Reported-by: Nelson Pereira <npereira@synopsys.com>
Tested-by: Ana Martins <amartins@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:12:20 +05:30
Vineet Gupta 76a8c40c65 ARC: remove extraneous header include
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-14 13:11:38 +05:30
Linus Torvalds 9bbd4b9f38 DeviceTree updates for 4.4:
- DT binding doc consolidation moving similar bindings to common
   locations. The majority of these are display related which were
   scattered in video/, fb/, drm/, gpu/, and panel/ directories.
 - Add new config option, CONFIG_OF_ALL_DTBS, to enable building all dtbs
   in the tree for most arches with dts files (except powerpc for now).
 - OF_IRQ=n fixes for user enabled CONFIG_OF.
 - of_node_put ref counting fixes from Julia Lawall.
 - Common DT binding for wakeup-source and deprecation of all similar
   bindings.
 - DT binding for PXA LCD controller.
 - Allow ignoring failed PCI resource translations in order to ignore
   64-bit addresses on non-LPAE 32-bit kernels.
 - Support setting the NUMA node from DT instead of only from parent
   device.
 - Couple of earlycon DT parsing fixes for address and options.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWO+HwAAoJEPr7XbWNvGHDFOoP/RcK4z5dtVQL2XRFbJBqRBZU
 riMc4BZkwpcKGXjzZlMrLw+mg4vaoLKSIAGAsYkgdDOKbYphQQdVwDzN099Fzpzy
 EE6K7Q+AW618z34AWdDjcpJYzSFnjAjYdh6JabohmKdPlxobR1RsKT+nRpDOGfeO
 c1DmxAQ1Fiav+xAI4m0YUuyxQDUeFYC0mVcVPzRbWZj31Ia5BgIrvT+V7fM55CTb
 dOAYWDHrfOw+yYI98sqZoSAb5H+E3UuY1ymBMhiP16Ot2vCyIKRYwhDokF9VYO/0
 MpIMhCgv1jmE5NCbBeYSSJUePySvvnuyYe6HIaJQjV8KGEZ5C7c1iLMgtvov2KVI
 bcSx6nPH/u5FuWIhWdMINPc50AQBAK/akYcgoCVjioVX+4WY2pqLvsXW2arEr//Z
 XY3FUpgS6eI42HBsj4SxrGnzaRc2jPOs6yiANkywmHnpWcyvszCxUEvf0Mh0cbkm
 diu31/owdpUO5imCe+ErtGVV3vY2VUMnbcm8J61pqL52OZNPLZUUEfktv1JCGW7y
 cVdnlgeGSFuCbY4jUErvxtQGJQLFwf7Gg1U/VfFXX2iuQGAACB1KwxY5sR6P5Ghz
 W6NvOTT36kaM9WGqn/bquOd6EmKcx9aZfgRLOkgV86Q/11gHKpuRjkrsth7rIJDI
 97qkGiWl4t1Ll5T4StK8
 =tGCy
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree updates from Rob Herring:
 "A fairly large (by DT standards) pull request this time with the
  majority being some overdue moving DT binding docs around to
  consolidate similar bindings.

   - DT binding doc consolidation moving similar bindings to common
     locations.  The majority of these are display related which were
     scattered in video/, fb/, drm/, gpu/, and panel/ directories.

   - Add new config option, CONFIG_OF_ALL_DTBS, to enable building all
     dtbs in the tree for most arches with dts files (except powerpc for
     now).

   - OF_IRQ=n fixes for user enabled CONFIG_OF.

   - of_node_put ref counting fixes from Julia Lawall.

   - Common DT binding for wakeup-source and deprecation of all similar
     bindings.

   - DT binding for PXA LCD controller.

   - Allow ignoring failed PCI resource translations in order to ignore
     64-bit addresses on non-LPAE 32-bit kernels.

   - Support setting the NUMA node from DT instead of only from parent
     device.

   - Couple of earlycon DT parsing fixes for address and options"

* tag 'devicetree-for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (45 commits)
  MAINTAINERS: update DT binding doc locations
  devicetree: add Sigma Designs vendor prefix
  of: simplify arch_find_n_match_cpu_physical_id() function
  Documentation: arm: Fixed typo in socfpga fpga mgr example
  Documentation: devicetree: fix reference to legacy wakeup properties
  Documentation: devicetree: standardize/consolidate on "wakeup-source" property
  drivers: of: removing assignment of 0 to static variable
  xtensa: enable building of all dtbs
  mips: enable building of all dtbs
  metag: enable building of all dtbs
  metag: use common make variables for dtb builds
  h8300: enable building of all dtbs
  arm64: enable building of all dtbs
  arm: enable building of all dtbs
  arc: enable building of all dtbs
  arc: use common make variables for dtb builds
  of: add config option to enable building of all dtbs
  of/fdt: fix error checking for earlycon address
  of/overlay: add missing of_node_put
  of/platform: add missing of_node_put
  ...
2015-11-06 12:17:09 -08:00
Linus Torvalds d63a978865 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
     ops, synchronize atomic_{read,set}() ordering requirements between
     architectures, add atomic_long_t bitops.  (Peter Zijlstra)

   - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
     use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
     This enables weakly ordered architectures (such as arm64) to make
     use of more locking related optimizations.  (Davidlohr Bueso)

   - Implement atomic[64]_{inc,dec}_relaxed() on ARM.  (Will Deacon)

   - Futex kernel data cache footprint micro-optimization.  (Rasmus
     Villemoes)

   - pvqspinlock runtime overhead micro-optimization.  (Waiman Long)

   - misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
  locking/rwsem: Use acquire/release semantics
  locking/mcs: Use acquire/release semantics
  locking/rtmutex: Use acquire/release semantics
  locking/mutex: Use acquire/release semantics
  locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
  atomic: Implement atomic_read_ctrl()
  atomic, arch: Audit atomic_{read,set}()
  atomic: Add atomic_long_t bitops
  futex: Force hot variables into a single cache line
  locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
  locking/osq: Relax atomic semantics
  locking/qrwlock: Rename ->lock to ->wait_lock
  locking/Documentation/lockstat: Fix typo - lokcing -> locking
  locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
2015-11-03 16:10:43 -08:00
Linus Torvalds 2c2b8285dc - Support for new MM features in ARCv2 cores (THP, PAE40)
Some generic THP bits are touched - all ACKed by Kirill
 
 - Platform framework updates to prepare for EZChip arrival (still in works)
 
 - ARC Public Mailing list setup finally (linux-snps-arc@lists.infraded.org)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWOJuTAAoJEGnX8d3iisJeTs0P/jFFQLrsRHALWVEJ/i7TCOSK
 ud/uekSmPzbUUHkR4BziXsrKZS7Mp+ht2CsXStMLfdk6nJ5X1ydzaRbpXeMPckcV
 Cn8/Y0L1bbsjgJV/eOP3CsQfUrzjSBZY/Oo4VBKw5YOcSNGpGXpWLeni8Oyl3KZW
 3RO0TnNdQ1V8IJFVl8TkcruoR0KhK+UOqMyQh5Axwy6JBbPYdB319AfcJ6Pl2rmp
 JomwVf8igZHU77OJYT4AKmxXpXuZF+ZNM77q5bMoXUZg0YJKyJkKvFAwZw6Z+ypt
 inJ7oEmpZyPwvlsa4MUwSzgp/ycxQklvQbEgZBtlYBkJAs9iLxRmRvfqI1JqPF3G
 vnAhiZgr8ZRh37A8L0UladBZ8GP2ckEURb6vgJUiJwG7o2hkmEF7lIecoyKYIWpp
 +qmtre0iQLPQAVvH5apJsoMJK2Zj1dWOFrGh3tPKcL+QBIafC4GORjKg6Kd642w4
 TBC20QU2QH+kDBH4AGlcm7BWDz+bXh5S7NpilNggy2GqOet50du8LiA7GoqTA5GF
 POeGGeIKjwHgBQxONqpHj5Hdb6fRtFUmAvicdolkd/da77gbsKqIZj6TrfGnlNkt
 Fzn6a+WpeTQBzoyvKMW3KxLpq28qugYyaWfRacS+g2m5fcaRno+U7rjGOdRalINk
 ujJ2CGfAmPWCFNJBvxwb
 =H+Sl
 -----END PGP SIGNATURE-----

Merge tag 'arc-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC updates from Vineet Gupta:

 - Support for new MM features in ARCv2 cores (THP, PAE40) Some generic
   THP bits are touched - all ACKed by Kirill

 - Platform framework updates to prepare for EZChip arrival (still in works)

 - ARC Public Mailing list setup finally (linux-snps-arc@lists.infraded.org)

* tag 'arc-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (42 commits)
  ARC: mm: PAE40 support
  ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_t
  ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
  ARC: mm: HIGHMEM: populate high memory from DT
  ARC: mm: HIGHMEM: kmap API implementation
  ARC: mm: preps ahead of HIGHMEM support #2
  ARC: mm: preps ahead of HIGHMEM support
  ARC: mm: use generic macros _BITUL()/_AC()
  ARC: mm: Improve Duplicate PD Fault handler
  MAINTAINERS: Add public mailing list for ARC
  ARC: Ensure DT mem base is same as what kernel is built with
  ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP once
  ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
  ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
  ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
  ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
  ARC: smp: Introduce smp hook @init_early_smp for Master core
  ARC: remove @init_time, @init_irq platform callbacks
  ARC: smp: irqchip: handle IPI as percpu irq like timer
  ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modes
  ...
2015-11-03 13:21:09 -08:00
Vineet Gupta ac506b7f22 ARCv2: lib: memcpy: use local symbols
Otherwise perf profiles don't charge tme to memcpy

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-11-03 17:33:00 +05:30
Vineet Gupta 5a364c2a17 ARC: mm: PAE40 support
This is the first working implementation of 40-bit physical address
extension on ARCv2.

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-29 18:41:30 +05:30
Vineet Gupta 25d464183c ARC: mm: PAE40: tlbex.S: Explicitify the size of pte_t
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta 28b4af729f ARC: mm: PAE40: switch to using phys_addr_t for physical addresses
That way a single flip of phys_addr_t to 64 bit ensures all places
dealing with physical addresses get correct data

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:29 +05:30
Vineet Gupta 29e332261d ARC: mm: HIGHMEM: populate high memory from DT
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:50:26 +05:30
Vineet Gupta 45890f6d34 ARC: mm: HIGHMEM: kmap API implementation
Implement kmap* API for ARC.

This enables
 - permanent kernel maps (pkmaps): :kmap() API
 - fixmap : kmap_atomic()

We use a very simple/uniform approach for both (unlike some of the other
arches). So fixmap doesn't use the customary compile time address stuff.
The important semantic is sleep'ability (pkmap) vs. not (fixmap) which
the API guarantees.

Note that this patch only enables highmem for subsequent PAE40 support
as there is no real highmem for ARC in pure 32-bit paradigm as explained
below.

ARC has 2:2 address split of the 32-bit address space with lower half
being translated (virtual) while upper half unstranslated
(0x8000_0000 to 0xFFFF_FFFF). kernel itself is linked at base of
unstranslated space (i.e. 0x8000_0000 onwards), which is mapped to say
DDR 0x0 by external Bus Glue logic (outside the core). So kernel can
potentially access 1.75G worth of memory directly w/o need for highmem.
(the top 256M is taken by uncached peripheral space from 0xF000_0000 to
0xFFFF_FFFF)

In PAE40, hardware can address memory beyond 4G (0x1_0000_0000) while
the logical/virtual addresses remain 32-bits. Thus highmem is required
for kernel proper to be able to access these pages for it's own purposes
(user space is agnostic to this anyways).

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:04 +05:30
Vineet Gupta 6101be5ad4 ARC: mm: preps ahead of HIGHMEM support #2
Explicit'ify that all memory added so far is low memory
Nothing semantical

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:49:00 +05:30
Vineet Gupta 336e2136e1 ARC: mm: preps ahead of HIGHMEM support
Before we plug in highmem support, some of code needs to be ready for it
 - copy_user_highpage() needs to be using the kmap_atomic API
 - mk_pte() can't assume page_address()
 - do_page_fault() can't assume VMALLOC_END is end of kernel vaddr space

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Alexey Brodkin d40846457f ARC: mm: use generic macros _BITUL()/_AC()
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:05 +05:30
Vineet Gupta 8840e14cd8 ARC: mm: Improve Duplicate PD Fault handler
- Move the verbosity knob from .data to .bss by using inverted logic
 - No need to readout PD1 descriptor
 - clip the non pfn bits of PD0 to avoid clipping inside the loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 19:31:04 +05:30
Vineet Gupta f759ee57b2 ARC: Ensure DT mem base is same as what kernel is built with
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta 483bcc99c0 ARC: boot: Non Master cpus only need to call EARLY_CPU_SETUP once
With prev fixes, all cores now start via common entry point @stext which
already calls EARLY_CPU_SETUP for all cores - so no need to invoke it
again

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:42 +05:30
Vineet Gupta aa0efcde45 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_smp()
MCIP now registers it's own per cpu setup routine (for IPI IRQ request)
using smp_ops.init_irq_cpu().

So no need for platforms to do that. This now completely decouples
platforms from MCIP.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta 286130ebf1 ARC: smp: Introduce smp hook @init_irq_cpu called for all cores
Note this is not part of platform owned static machine_desc,
but more of device owned plat_smp_ops (rather misnamed) which a IPI
provider or some such typically defines.

This will help us seperate out the IPI registration from platform
specific init_cpu_smp() into device specific init_irq_cpu()

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:41 +05:30
Vineet Gupta 8721a7f5a6 ARC: smp: Rename platform hook @init_smp -> @init_cpu_smp
This conveys better that it is called for each cpu

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta 26b8f99623 ARCv2: smp: [plat-*]: No need to explicitly call mcip_init_early_smp()
MCIP now registers it's own probe callback with smp_ops.init_early_smp()
which is called by ARC common code, so no need for platforms to do that.

This decouples the platforms and MCIP and helps confine MCIP details
to it's own file.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta e55af4da02 ARC: smp: Introduce smp hook @init_early_smp for Master core
This adds a platform agnostic early SMP init hook which is called on
Master core before calling setup_processor()

  setup_arch()
     smp_init_cpus()
         smp_ops.init_early_smp()
     ...
     setup_processor()

How this helps:
 - Used for one time init of certain SMP centric IP blocks, before
   calling setup_processor() which probes various bits of core,
   possibly including this block

 - Currently platforms need to call this IP block init from their
   init routines, which doesn't make sense as this is specific to ARC
   core and not platform and otherwise requires copy/paste in all
   (and hence a possible point of failure)

e.g. MCIP init is called from 2 platforms currently (axs10x and sim)
which will go away once we have this.

This change only adds the hooks but they are empty for now. Next commit
will populate them and remove the explicit init calls from platforms.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:40 +05:30
Vineet Gupta 4c82f28617 ARC: remove @init_time, @init_irq platform callbacks
These are not in use for ARC platforms. Moreover DT mechanims exist to
probe them w/o explicit platform calls.

 - clocksource drivers can use CLOCKSOURCE_OF_DECLARE()
 - intc IRQCHIP_DECLARE() calls + cascading inside DT allows external
   intc to be probed automatically

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta e0868e6f67 ARC: smp: irqchip: handle IPI as percpu irq like timer
The reason this was not done so far was lack of genuine IPI_IRQ for
ARC700, as we don't have a SMP version of core yet (which might change
soon thx to EZChip). Nevertheles to increase the build coverage, we
need to allow CONFIG_SMP for ARC700 and still be able to run it on a
UP platform (nsim or AXS101) with a UP Device Tree (SMP-on-UP)

The build itself requires some define for IPI_IRQ and even a dummy
value is fine since that code won't run anyways.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:13:39 +05:30
Vineet Gupta 3971cdc202 ARC: boot: Support Halt-on-reset and Run-on-reset SMP booting modes
For Run-on-reset, non masters need to spin wait. For Halt-on-reset they
can jump to entry point directly.

Also while at it, made reset vector handler as "the" entry point for
kernel including host debugger based boot (which uses the ELF header
entry point)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-28 16:08:17 +05:30
Rob Herring b83abc8c2c arc: enable building of all dtbs
Enable building all dtb files when CONFIG_OF_ALL_DTBS is enabled. The dtbs
are not really dependent on a platform being enabled or any other kernel
config, so for testing coverage it is convenient to build all of the dtbs.
This builds all dts files in the tree, not just targets listed.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-27 16:12:13 -05:00
Rob Herring 10375ccc67 arc: use common make variables for dtb builds
Use dtb-y and always make variables to build dtbs instead of explicit
dtbs rule. This is in preparation to support building all dtbs.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-27 16:12:13 -05:00
Shawn Lin 005a5243aa arc: axs10x_defconfig: remove CONFIG_MMC_DW_IDMAC
DesignWare MMC Controller's transfer mode should be decided
at runtime instead of compile-time. So we remove this config
option and read dw_mmc's register to select DMA master.

Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2015-10-26 16:00:17 +01:00
Vineet Gupta f33e9c434b ARC: smp: Move default boot kick/wait code out of MCIP into common code
For non halt-on-reset case, all cores start of simultaneously in @stext.
Master core0 proceeds with kernel boot, while other spin-wait on
@wake_flag being set by master once it is ready. So NO hardware assist
is needed for master to "kick" the others.

This patch moves this soft implementation out of mcip.c (as there is no
hardware assist) into common smp.c

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:27 +05:30
Vineet Gupta d0890ea5b6 ARC: boot log: decode more mmu config items
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:26 +05:30
Vineet Gupta 964cf28f9d ARC: boot log: move helper macros to header for reuse
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta b598e17f6a ARC: mm: compute TLB size as needed from ways * sets
This frees up some bits to hold more high level info such as PAE being
present, w/o increasing the size of already bloated cpuinfo struct

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:25 +05:30
Vineet Gupta c583ee4fb0 ARC: mm: MMU v1..v3 only selectable for ARCompact ISA based cores
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:24 +05:30
Vineet Gupta 5c35ee642a ARC: make write_aux_reg safer against macro substitution
It was generating warnings when called as write_aux_reg(x, paddr >> 32)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:24 +05:30
Vineet Gupta 9fabcc636b ARC: [arcompact] entry.S: Elide extra check/branch in exception ret path
This is done by improving the laddering logic !

Before:

   if Exception
      goto excep_or_pure_k_ret

   if !Interrupt(L2)
      goto l1_chk
   else
      INTERRUPT_EPILOGUE 2

 l1_chk:
   if !Interrupt(L1)  (i.e. pure kernel mode)
      goto excep_or_pure_k_ret
   else
      INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
   EXCEPTION_EPILOGUE

Now:

   if !Interrupt(L1 or L2) (i.e. exception or pure kernel mode)
      goto excep_or_pure_k_ret

  ; guaranteed to be an interrupt
   if !Interrupt(L2)
      goto l1_ret
   else
      INTERRUPT_EPILOGUE 2

 ; by virtue of above, no need to chk for L1 active
 l1_ret:
    INTERRUPT_EPILOGUE 1

 excep_or_pure_k_ret:
    EXCEPTION_EPILOGUE

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:23 +05:30
Vineet Gupta 5f88808745 ARC: [arcompact] entry.S: Document preemption games for L2 intr
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:23 +05:30
Vineet Gupta 55a2ae775a ARC: [arcompact] entry.S: Improve early return from exception
The requirement is to
 - Reenable Exceptions (AE cleared)
 - Reenable Interrupts (E1/E2 set)

We need to do wiggle these bits into ERSTATUS and call RTIE.

Prev version used the pre-exception STATUS32 as starting point for what
goes into ERSTATUS. This required explicit fixups of U/DE/L bits.

Instead, use the current (in-exception) STATUS32 as starting point.
Being in exception handler U/DE/L can be safely assumed to be correct.
Only AE/E1/E2 need to be fixed.

So the new implementation is slightly better
 -Avoids read form memory
 -Is 4 bytes smaller for the typical 1 level of intr configuration
 -Depicts the semantics more clearly

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta 9dbd3d9bfd ARC: [arcompact] don't check for hard isr calling local_irq_enable()
Historically this was done by ARC IDE driver, which is long gone.
IRQ core is pretty robust now and already checks if IRQs are enabled
in hard ISRs. Thus no point in checking this in arch code, for every
call of irq enabled.

Further if some driver does do that - let it bring down the system so we
notice/fix this sooner than covering up for sucker

This makes local_irq_enable() - for L1 only case atleast simple enough
so we can inline it.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:22 +05:30
Vineet Gupta c7119d56d2 ARCv2: mm: THP: flush_pmd_tlb_range make SMP safe
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta 722fe8fd36 ARCv2: mm: THP: Implement flush_pmd_tlb_range() optimization
Implement the TLB flush routine to evict a sepcific Super TLB entry,
vs. moving to a new ASID on every such flush.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:21 +05:30
Vineet Gupta 6ce187985f ARCv2: mm: THP: boot validation/reporting
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:18 +05:30
Vineet Gupta fe6c1b8611 ARCv2: mm: THP support
MMUv4 in HS38x cores supports Super Pages which are basis for Linux THP
support.

Normal and Super pages can co-exist (ofcourse not overlap) in TLB with a
new bit "SZ" in TLB page desciptor to distinguish between them.
Super Page size is configurable in hardware (4K to 16M), but fixed once
RTL builds.

The exact THP size a Linx configuration will support is a function of:
 - MMU page size (typical 8K, RTL fixed)
 - software page walker address split between PGD:PTE:PFN (typical
   11:8:13, but can be changed with 1 line)

So for above default, THP size supported is 8K * 256 = 2M

Default Page Walker is 2 levels, PGD:PTE:PFN, which in THP regime
reduces to 1 level (as PTE is folded into PGD and canonically referred
to as PMD).

Thus thp PMD accessors are implemented in terms of PTE (just like sparc)

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-17 17:48:18 +05:30
Vineet Gupta 24830fc782 ARC: mm: Introduce PTE_SPECIAL
Needed for THP, but will also come in handy for fast GUP later

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:23 +05:30
Vineet Gupta 129cbed54a ARC: mm: pte flags comsetic cleanups, comments
No semantical changes

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Vineet Gupta e8a75963a4 ARC: mm: switch pgtable_to to pte_t *
ARC is the only arch with unsigned long type (vs. struct page *).
Historically this was done to avoid the page_address() calls in various
arch hooks which need to get the virtual/logical address of the table.

Some arches alternately define it as pte_t *, and is as efficient as
unsigned long (generated code doesn't change)

Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-10-09 17:04:22 +05:30
Ingo Molnar 82fc167c39 Linux 4.3-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWEUxnAAoJEHm+PkMAQRiGYCYH/3gtGkFdvSLi+E1PfI8Qk3ZA
 XuYA4Mj09JBVSmaICeueMTDVrdiq0OE0zPib26GWlF/za13kNU8KgMR3+6XCuLSX
 DiCmh6mwDItoNoSIIUERLqrFHABXz8rZ3gb3uu2+kNN74Cl0piNm1YpFclEEWjMr
 9Wk5fkq+ontnDVUQOvWUxPiUXOJTvdLXBWTRDw1yTdE3RMNwRI2d/hme6Hq++WYV
 tRalZZKQaoB33js9WRVAoLVunvtna+i+/y7VGLj8QyS0+d6ec81Hey2r1/fR/oG4
 bs4ul6vtqeb3IR/PjUqxF59pSrCLEO+qrp9KrTlJNYgr1m1QyjRxWUdy/XhyaWo=
 =gIhN
 -----END PGP SIGNATURE-----

Merge tag 'v4.3-rc4' into locking/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-06 17:10:28 +02:00
Linus Torvalds 30c44659f4 Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull strscpy string copy function implementation from Chris Metcalf.

Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.

The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.

strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result.  To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.

strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string.  Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated.  It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.

strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG.  It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.

So why did I waffle about this for so long?

Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.

And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.

So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches.  Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.

* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use global strscpy() rather than private copy
  string: provide strscpy()
  Make asm/word-at-a-time.h available on all architectures
2015-10-04 16:31:13 +01:00
Peter Zijlstra 62e8a3258b atomic, arch: Audit atomic_{read,set}()
This patch makes sure that atomic_{read,set}() are at least
{READ,WRITE}_ONCE().

We already had the 'requirement' that atomic_read() should use
ACCESS_ONCE(), and most archs had this, but a few were lacking.
All are now converted to use READ_ONCE().

And, by a symmetry and general paranoia argument, upgrade atomic_set()
to use WRITE_ONCE().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: james.hogan@imgtec.com
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Cc: will.deacon@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:54:28 +02:00
Thomas Gleixner bd0b9ac405 genirq: Remove irq argument from irq flow handlers
Most interrupt flow handlers do not use the irq argument. Those few
which use it can retrieve the irq number from the irq descriptor.

Remove the argument.

Search and replace was done with coccinelle and some extra helper
scripts around it. Thanks to Julia for her help!

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
2015-09-16 15:47:51 +02:00
Vineet Gupta 3ebb0540c2 ARCv2: [axs103_smp] Reduce clk for SMP FPGA configs
Newer bitfiles needs the reduced clk even for SMP builds

Cc: <stable@vger.kernel.org>  #4.2
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-11 19:34:01 -07:00
Linus Torvalds ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Linus Torvalds 17e6b00ac4 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "This updated pull request does not contain the last few GIC related
  patches which were reported to cause a regression.  There is a fix
  available, but I let it breed for a couple of days first.

  The irq departement provides:

   - new infrastructure to support non PCI based MSI interrupts
   - a couple of new irq chip drivers
   - the usual pile of fixlets and updates to irq chip drivers
   - preparatory changes for removal of the irq argument from interrupt
     flow handlers
   - preparatory changes to remove IRQF_VALID"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
  irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
  irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
  irqchip: Add documentation for the bcm2836 interrupt controller
  irqchip/bcm2835: Add support for being used as a second level controller
  irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
  PCI: xilinx: Fix typo in function name
  irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
  irqchip/gic: Only allow the primary GIC to set the CPU map
  PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
  unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
  tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
  m68k/irq: Prepare irq handlers for irq argument removal
  C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
  blackfin: Prepare irq handlers for irq argument removal
  arc/irq: Prepare idu_cascade_isr for irq argument removal
  sparc/irq: Use access helper irq_data_get_affinity_mask()
  sparc/irq: Use helper irq_data_get_irq_handler_data()
  parisc/irq: Use access helper irq_data_get_affinity_mask()
  mn10300/irq: Use access helper irq_data_get_affinity_mask()
  irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
  ...
2015-09-01 14:33:35 -07:00
Vineet Gupta 3d5926599a ARCv2: entry: Fix reserved handler
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 16:25:37 +05:30
Vineet Gupta 9b28829d6d ARCv2: perf: Finally introduce HS perf unit
With all features in place, the ARC HS pct block can now be effectively
allowed to be probed/used

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:59:07 +05:30
Alexey Brodkin e525c37f84 ARCv2: perf: SMP support
* split off pmu info into singleton and per-cpu bits
* setup PMU on all cores

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:58:42 +05:30
Alexey Brodkin e6b1d126bb ARCv2: perf: implement exclusion of event counting in user or kernel mode
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:58:14 +05:30
Alexey Brodkin 36481cf7fb ARCv2: perf: Support sampling events using overflow interrupts
In times of ARC 700 performance counters didn't have support of
interrupt an so for ARC we only had support of non-sampling events.

Put simply only "perf stat" was functional.

Now with ARC HS we have support of interrupts in performance counters
which this change introduces support of.

ARC performance counters act in the following way in regard of
interrupts generation.
 [1] A counter counts starting from value set in PCT_COUNT register pair
 [2] Once counter reaches value set in PCT_INT_CNT interrupt is raised

Basic setup look like this:
 [1] PCT_COUNT = 0;
 [2] PCT_INT_CNT = __limit_value__;
 [3] Enable interrupts for that counter and let it run
 [4] Let counter reach its limit
 [5] Handle interrupt when it happens

Note that PCT HW block is build in CPU core and so ints interrupt
line (which is basically OR of all counters IRQs) is wired directly to
top-level IRQC. That means do de-assert PCT interrupt it's required to
reset IRQs from all counters that have reached their limit values.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:57:43 +05:30
Alexey Brodkin 1fe8bfa5ff ARCv2: perf: implement "event_set_period"
This generalization prepares for support of overflow interrupts.

Hardware event counters on ARC work that way:
Each counter counts from programmed start value (set in
ARC_REG_PCT_COUNT) to a limit value (set in ARC_REG_PCT_INT_CNT) and
once limit value is reached this timer generates an interrupt.

Even though this hardware implementation allows for more flexibility,
in Linux kernel we decided to mimic behavior of other architectures
this way:

 [1] Set limit value as half of counter's max value (to allow counter to
     run after reaching it limit, see below for more explanation):
 ---------->8-----------
 arc_pmu->max_period = (1ULL << counter_size) / 2 - 1ULL;
 ---------->8-----------

 [2] Set start value as "arc_pmu->max_period - sample_period" and then
count up to the limit

Our event counters don't stop on reaching max value (the one we set in
ARC_REG_PCT_INT_CNT) but continue to count until kernel explicitly
stops each of them.

And setting a limit as half of counter capacity is done to allow
capturing of additional events in between moment when interrupt was
triggered until we're actually processing PMU interrupts. That way
we're trying to be more precise.

For example if we count CPU cycles we keep track of cycles while
running through generic IRQ handling code:

 [1] We set counter period as say 100_000 events of type "crun"
 [2] Counter reaches that limit and raises its interrupt
 [3] Once we get in PMU IRQ handler we read current counter value from
ARC_REG_PCT_SNAP ans see there something like 105_000.

If counters stop on reaching a limit value then we would miss
additional 5000 cycles.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:57:29 +05:30
Vineet Gupta fb7c572551 ARC: perf: cap the number of counters to hardware max of 32
The number of counters in PCT can never be more than 32 (while
countable conditions could be 100+) for both ARCompact and ARCv2

And while at it update copyright dates.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-27 14:57:03 +05:30
Vineet Gupta fd0881a24a ARC: Eliminate some ARCv2 specific code for ARCompact build
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-21 15:06:43 +05:30
Vineet Gupta 090749502f ARC: add/fix some comments in code - no functional change
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 19:05:49 +05:30
Yuriy Kolerov 6de6066c0d ARC: change some branchs to jumps to resolve linkage errors
When kernel's binary becomes large enough (32M and more) errors
may occur during the final linkage stage. It happens because
the build system uses short relocations for ARC  by default.
This problem may be easily resolved by passing -mlong-calls
option to GCC to use long absolute jumps (j) instead of short
relative branchs (b).

But there are fragments of pure assembler code exist which use
branchs in inappropriate places and cause a linkage error because
of relocations overflow.

First of these fragments is .fixup insertion in futex.h and
unaligned.c. It inserts a code in the separate section (.fixup)
with branch instruction. It leads to the linkage error when
kernel becomes large.

Second of these fragments is calling scheduler's functions
(common kernel code) from entry.S of ARC's code. When kernel's
binary becomes large it may lead to the linkage error because
scheduler may occur far enough from ARC's code in the final
binary.

Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:53:15 +05:30
Vineet Gupta eb2cd8b72b ARC: ensure futex ops are atomic in !LLSC config
W/o hardware assisted atomic r-m-w the best we can do is to disable
preemption.

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:01 +05:30
Vineet Gupta 5e0574292a ARC: Enable HAVE_FUTEX_CMPXCHG
ARC doesn't need the runtime detection of futex cmpxchg op

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:01 +05:30
Vineet Gupta 882a95ae0a ARC: make futex_atomic_cmpxchg_inatomic() return bimodal
Callers of cmpxchg_futex_value_locked() in futex code expect bimodal
return value:
  !0 (essentially -EFAULT as failure)
   0 (success)

Before this patch, the success return value was old value of futex,
which could very well be non zero, causing caller to possibly take the
failure path erroneously.

Fix that by returning 0 for success

(This fix was done back in 2011 for all upstream arches, which ARC
obviously missed)

Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:00 +05:30
Vineet Gupta ed574e2bbd ARC: futex cosmetics
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:16:00 +05:30
Vineet Gupta 31d30c8208 ARC: add barriers to futex code
The atomic ops on futex need to provide the full barrier just like
regular atomics in kernel.

Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
as core code already does that

Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:15:59 +05:30
Alexey Brodkin 1648c70d30 ARCv2: IOC: Allow boot time disable
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:15:31 +05:30
Vineet Gupta 79335a2ca0 ARCv2: SLC: Allow boot time disable
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:11:52 +05:30
Alexey Brodkin f2b0b25a37 ARCv2: Support IO Coherency and permutations involving L1 and L2 caches
In case of ARCv2 CPU there're could be following configurations
that affect cache handling for data exchanged with peripherals
via DMA:
 [1] Only L1 cache exists
 [2] Both L1 and L2 exist, but no IO coherency unit
 [3] L1, L2 caches and IO coherency unit exist

Current implementation takes care of [1] and [2].
Moreover support of [2] is implemented with run-time check
for SLC existence which is not super optimal.

This patch introduces support of [3] and rework of DMA ops
usage. Instead of doing run-time check every time a particular
DMA op is executed we'll have 3 different implementations of
DMA ops and select appropriate one during init.

As for IOC support for it we need:
 [a] Implement empty DMA ops because IOC takes care of cache
     coherency with DMAed data
 [b] Route dma_alloc_coherent() via dma_alloc_noncoherent()
     This is required to make IOC work in first place and also
     serves as optimization as LD/ST to coherent buffers can be
     srviced from caches w/o going all the way to memory

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
[vgupta:
  -Added some comments about IOC gains
  -Marked dma ops as static,
  -Massaged changelog a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-20 18:11:17 +05:30
Vineet Gupta 2a4401687c ARC: Enable optimistic spinning for LLSC config
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-11 14:51:09 +05:30
Vineet Gupta 1097163870 ARCv2: spinlock/rwlock/atomics: reduce 1 instruction in exponential backoff
The increment of delay counter was 2 instructions:
Arithmatic Shfit Left (ASL) + set to 1 on overflow

This can be done in 1 using ROtate Left (ROL)

Suggested-by: Nigel Topham <ntopham@synopsys.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-08-07 13:56:16 +05:30