SLC maintenance ops need to be serialized by software as there is no
inherent buffering / quequing of aux commands. It can silently ignore a
new aux operation if previous one is still ongoing (SLC_CTRL_BUSY)
So gaurd the SLC op using a spin lock
The spin lock doesn't seem to be contended even in heavy workloads such
as iperf. On FPGA @ 75 MHz.
[1] Before this change:
============================================================
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.110 port 38935 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 48.4 MBytes 40.6 Mbits/sec
============================================================
[2] After this change:
============================================================
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.243 port 60248 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 47.5 MBytes 39.8 Mbits/sec
# iperf -c 10.42.0.1
------------------------------------------------------------
Client connecting to 10.42.0.1, TCP port 5001
TCP window size: 43.8 KByte (default)
------------------------------------------------------------
[ 3] local 10.42.0.243 port 60249 connected with 10.42.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 54.9 MBytes 46.0 Mbits/sec
============================================================
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: arc-linux-dev@synopsys.com
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
L2 cache on ARCHS processors is called SLC (System Level Cache)
For working DMA (in absence of hardware assisted IO Coherency) we need
to manage SLC explicitly when buffers transition between cpu and
controllers.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Caveats about cache flush on ARCv2 based cores
- dcache is PIPT so paddr is sufficient for cache maintenance ops (no
need to setup PTAG reg
- icache is still VIPT but only aliasing configs need PTAG setup
So basically this is departure from MMU-v3 which always need vaddr in
line ops registers (DC_IVDL, DC_FLDL, IC_IVIL) but paddr in DC_PTAG,
IC_PTAG respectively.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Remove the ifdef'ery and write distinct versions for each mmu ver even
if there is some code duplication
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
That is because __after_dc_op() already reads it for status check, so it
is better anyways to use that "newer" value.
Also reduces the clutter in callers for passing from/to these routines.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>