Commit Graph

5 Commits

Author SHA1 Message Date
Vineet Gupta 36425cd670 ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
commit 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to setup LP_COUNT register manually and NOT
rely on gcc to do it (with the +l inline assembler contraint hint, now
being retired in the compiler)

However the fix was flawed as we didn't add LP_COUNT to asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.

This resulted in some fun - as nested ZOL loops were being generared

| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 #		<======= OUTER LOOP (gcc generated)
|   .L14:
|   ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
|   dmb 1
|   breq r2, -1, @.L21 #, w,,
|   bbit0 r2,1,@.L13 # w,,
|   ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
|   mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
|   mov lp_count, r3 #		 <====== INNER LOOP (from inline asm)
|   lp 1f
| 	 nop
|   1:
|   nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,

This caused issues with drivers relying on sane behaviour of udelay
friends.

With LP_COUNT added to clobber list, gcc doesn't generate the outer
loop in say above case.

Addresses STAR 9001146134

Reported-by: Joao Pinto <jpinto@synopsys.com>
Fixes: 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2017-01-24 10:54:24 -08:00
Vineet Gupta 3c7c7a2fc8 ARC: Don't use "+l" inline asm constraint
Apparenty this is coming in the way of gcc fix which inhibits the usage
of LP_COUNT as a gpr.

Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2016-11-28 09:17:32 -08:00
Vineet Gupta 8922bc3058 ARCv2: Adhere to Zero Delay loop restriction
Branch insn can't be scheduled as last insn of Zero Overhead loop

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2015-06-22 14:06:56 +05:30
Mischa Jonker 7efd0da2d1 ARC: Fix __udelay calculation
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.

Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.

Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-09-05 10:31:12 +05:30
Vineet Gupta d8005e6b95 ARC: Timers/counters/delay management
ARC700 includes 2 in-core 32bit timers TIMER0 and TIMER1.
Both have exactly same capabilies.

* programmable to count from TIMER<n>_CNT to TIMER<n>_LIMIT
* for count 0 and LIMIT ~1, provides a free-running counter by
    auto-wrapping when limit is reached.
* optionally interrupt when LIMIT is reached (oneshot event semantics)
* rearming the interrupt provides periodic semantics
* run at CPU clk

ARC Linux uses TIMER0 for clockevent (periodic/oneshot) and TIMER1 for
clocksource (free-running clock).

Newer cores provide RTSC insn which gives a 64bit cpu clk snapshot hence
is more apt for clocksource when available.

SMP poses a bit of challenge for global timekeeping clocksource /
sched_clock() backend:
 -TIMER1 based local clocks are out-of-sync hence can't be used
  (thus we default to jiffies based cs as well as sched_clock() one/both
  of which platform can override with it's specific hardware assist)
 -RTSC is only allowed in SMP if it's cross-core-sync (Kconfig glue
  ensures that) and thus usable for both requirements.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2013-02-11 20:00:39 +05:30