Commit Graph

6 Commits

Author SHA1 Message Date
Alex Shi e7b52ffd45 x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range
x86 has no flush_tlb_range support in instruction level. Currently the
flush_tlb_range just implemented by flushing all page table. That is not
the best solution for all scenarios. In fact, if we just use 'invlpg' to
flush few lines from TLB, we can get the performance gain from later
remain TLB lines accessing.

But the 'invlpg' instruction costs much of time. Its execution time can
compete with cr3 rewriting, and even a bit more on SNB CPU.

So, on a 512 4KB TLB entries CPU, the balance points is at:
	(512 - X) * 100ns(assumed TLB refill cost) =
		X(TLB flush entries) * 100ns(assumed invlpg cost)

Here, X is 256, that is 1/2 of 512 entries.

But with the mysterious CPU pre-fetcher and page miss handler Unit, the
assumed TLB refill cost is far lower then 100ns in sequential access. And
2 HT siblings in one core makes the memory access more faster if they are
accessing the same memory. So, in the patch, I just do the change when
the target entries is less than 1/16 of whole active tlb entries.
Actually, I have no data support for the percentage '1/16', so any
suggestions are welcomed.

As to hugetlb, guess due to smaller page table, and smaller active TLB
entries, I didn't see benefit via my benchmark, so no optimizing now.

My micro benchmark show in ideal scenarios, the performance improves 70
percent in reading. And in worst scenario, the reading/writing
performance is similar with unpatched 3.4-rc4 kernel.

Here is the reading data on my 2P * 4cores *HT NHM EP machine, with THP
'always':

multi thread testing, '-t' paramter is thread number:
	       	        with patch   unpatched 3.4-rc4
./mprotect -t 1           14ns		24ns
./mprotect -t 2           13ns		22ns
./mprotect -t 4           12ns		19ns
./mprotect -t 8           14ns		16ns
./mprotect -t 16          28ns		26ns
./mprotect -t 32          54ns		51ns
./mprotect -t 128         200ns		199ns

Single process with sequencial flushing and memory accessing:

		       	with patch   unpatched 3.4-rc4
./mprotect		    7ns			11ns
./mprotect -p 4096  -l 8 -n 10240
			    21ns		21ns

[ hpa: http://lkml.kernel.org/r/1B4B44D9196EFF41AE41FDA404FC0A100BFF94@SHSMSX101.ccr.corp.intel.com
  has additional performance numbers. ]

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-3-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:07 -07:00
Russ Anderson 78c0617646 x86: Enable NMI on all cpus on UV
Enable NMI on all cpus in UV system and add an NMI handler
to dump_stack on each cpu.

By default on x86 all the cpus except the boot cpu have NMI
masked off.  This patch enables NMI on all cpus in UV system
and adds an NMI handler to dump_stack on each cpu.  This
way if a system hangs we can NMI the machine and get a
backtrace from all the cpus.

Version 2: Use x86_platform driver mechanism for nmi init, per
           Ingo's suggestion.

Version 3: Clean up Ingo's nits.

Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20100226164912.GA24439@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-27 12:34:21 +01:00
Yinghai Lu 2b6163bf57 x86: remove update_apic from x86_quirks
Impact: cleanup

x86_quirks->update_apic() calling looks crazy. so try to remove it:

 1. every apic take wakeup_cpu member directly
 2. separate es7000_apic to es7000_apic_cluster
 3. use uv_wakeup_cpu directly

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 06:32:25 +01:00
Randy Dunlap 58105ef185 x86: UV: fix header struct usage
Impact: Fixes warning

Fix uv.h struct usage:

arch/x86/include/asm/uv/uv.h:16: warning: 'struct mm_struct' declared inside parameter list
arch/x86/include/asm/uv/uv.h:16: warning: its scope is only this definition or declaration, which is probably not what you want

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-02-11 17:17:29 -08:00
Nick Piggin 03b486322e x86: make UV support configurable
Make X86 SGI Ultraviolet support configurable. Saves about 13K of text size
on my modest config.

   text    data     bss     dec     hex filename
6770537 1158680  694356 8623573  8395d5 vmlinux
6757492 1157664  694228 8609384  835e68 vmlinux.nouv

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 13:00:42 +01:00
Tejun Heo bdbcdd4888 x86: uv cleanup
Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00