linux

Commit Graph

Author	SHA1	Message	Date
H. Peter Anvin	fe770bf031	x86: clean up the page table dumper and add 32-bit support Clean up the page table dumper (fix boundary conditions, table driven address ranges, some formatting changes since it is no longer using the kernel log but a separate virtual file), and generalize to 32 bits. [ mingo@elte.hu: x86: fix the pagetable dumper ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Arjan van de Ven	926e5392ba	x86: add code to dump the (kernel) page tables for visual inspection by kernel developers This patch adds code to the kernel to have an (optional) /proc/kernel_page_tables debug file that basically dumps the kernel pagetables; this allows us kernel developers to verify that nothing fishy is going on and that the various mappings are set up correctly. This was quite useful in finding various change_page_attr() bugs, and is very likely to be useful in the future as well. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: mingo@elte.hu Cc: tglx@tglx.de Cc: hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
H. Peter Anvin	2596e0fae0	x86: unify arch/x86/mm/Makefile Unify arch/x86/mm/Makefile between 32 and 64 bits. All configuration variables that are protected by Kconfig constraints have been put in the common part of the Makefile; however, the NUMA files are totally different between 32 and 64 bits and are handled via an ifdef. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Thomas Gleixner	ee7ae7a198	x86: add debug info to DEBUG_PAGEALLOC Add debug information for DEBUG_PAGEALLOC to get some statistics about the pool usage and split status. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-04-17 17:40:45 +02:00
Ingo Molnar	b4e0409a36	x86: check vmlinux limits, 64-bit these build-time and link-time checks would have prevented the vmlinux size regression. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-17 17:40:45 +02:00
Andrew Morton	9c312058b2	Avoid false positive warnings in kmap_atomic_prot() with DEBUG_HIGHMEM I believe http://bugzilla.kernel.org/show_bug.cgi?id=10318 is a false positive. There's no way in which networking will be using highmem pages here, so it won't be taking the KM_USER0 kmap slot, so there's no point in performing these checks. Cc: Pawel Staszewski <pstaszewski@artcom.pl> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Really sad. We lose almost all real-life coverage of the debug tests with this patch. Now it will only report problems for the cases where people actually end up using a HIGHMEM page, not when they just _might_ use one. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-28 13:08:14 -07:00
Ingo Molnar	3085354de6	x86: prefetch fix #2 Linus noticed a second bug and an uncleanliness: - we'd return on any instruction fetch fault - we'd use both the value of 16 and the PF_INSTR symbol which are the same and make no sense the cleanup nicely unifies this piece of logic. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 22:00:16 +01:00
Christoph Lameter	25e59881f1	x86: stricter check in follow_huge_addr() The first page of the compound page is determined in follow_huge_addr() but then PageCompound() only checks if the page is part of a compound page. PageHead() allows checking if this is indeed the first page of the compound. Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:45 +01:00
Ingo Molnar	bc713dcf35	x86: fix prefetch workaround some early Athlon XP's and Opterons generate bogus faults on prefetch instructions. The workaround for this regressed over .24 - reinstate it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-27 16:08:44 +01:00
Suresh Siddha	d546b67a94	x86: fix performance drop for glx fix the 3D performance drop reported at: http://bugzilla.kernel.org/show_bug.cgi?id=10328 fb drivers are using ioremap()/ioremap_nocache(), followed by mtrr_add with WC attribute. Recent changes in page attribute code made both ioremap()/ioremap_nocache() mappings as UC (instead of previous UC-). This breaks the graphics performance, as the effective memory type is UC instead of expected WC. The correct way to fix this is to add ioremap_wc() (which uses UC- in the absence of PAT kernel support and WC with PAT) and change all the fb drivers to use this new ioremap_wc() API. We can take this correct and longer route for post 2.6.25. For now, revert back to the UC- behavior for ioremap/ioremap_nocache. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Yinghai Lu	76c324182b	x86: fix trim mtrr not to setup_memory two times we could call find_max_pfn() directly instead of setup_memory() to get max_pfn needed for mtrr trimming. otherwise setup_memory() is called two times... that is duplicated... [ mingo@elte.hu: both Thomas and me simulated a double call to setup_bootmem_allocator() and can confirm that it is a real bug which can hang in certain configs. It's not been reported yet but that is probably due to the relatively scarce nature of MTRR-trimming systems. ] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-26 22:23:41 +01:00
Linus Torvalds	b9e76a0074	x86-32: Pass the full resource data to ioremap() It appears that 64-bit PCI resources cannot possibly ever have worked on x86-32 even when the RESOURCES_64BIT config option was set, because any driver that tried to [pci_]ioremap() the resource would have been unable to do so because the high 32 bits would have been silently dropped on the floor by the ioremap() routines that only used "unsigned long". Change them to use "resource_size_t" instead, which properly encodes the whole 64-bit resource data if RESOURCES_64BIT is enabled. Acked-by: H. Peter Anvin <hpa@kernel.org> Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-24 11:22:39 -07:00
Yinghai Lu	37bff62e98	x86_64: free_bootmem should take phys so use nodedata_phys directly. Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-21 17:06:15 +01:00
Thomas Gleixner	985a34bd75	x86: remove quicklists quicklists cause a serious memory leak on 32-bit x86, as documented at: http://bugzilla.kernel.org/show_bug.cgi?id=9991 the reason is that the quicklist pool is a special-purpose cache that grows out of proportion. It is not accounted for anywhere and users have no way to even realize that it's the quicklists that are causing RAM usage spikes. It was supposed to be a relatively small pool, but as demonstrated by KOSAKI Motohiro, they can grow as large as: Quicklists: 1194304 kB given how much trouble this code has caused historically, and given that Andrew objected to its introduction on x86 (years ago), the best option at this point is to remove them. [ any performance benefits of caching constructed pgds should be implemented in a more generic way (possibly within the page allocator), while still allowing constructed pages to be allocated by other workloads. ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:55 +01:00
Ingo Molnar	9a46d7e5b6	x86: ioremap, remove WARN_ON() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-11 17:11:54 +01:00
Yinghai Lu	7c9e92b6cd	x86: not set node to cpu_to_node if the node is not online resolve boot problem reported by Mel Gorman: http://lkml.org/lkml/2008/2/13/404 init_cpu_to_node will use cpu->apic (from MADT or mptable) and apic->node(from SRAT or AMD config space with k8_bus_64.c) to have cpu->node mapping, and later identify_cpu will overwrite them again...(with nearby_node...) this patch checks if the node is online, otherwise it will not update cpu_node map. so keep cpu_node map to online node before identify_cpu..., to prevent possible error. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Thomas Gleixner <tglx@linutronix.de>	2008-03-04 17:10:12 +01:00
Rafael J. Wysocki	9b5cf48b06	x86: revert "x86: CPA: avoid split of alias mappings" Revert: commit `8be8f54bae` Author: Thomas Gleixner <tglx@linutronix.de> Date: Sat Feb 23 20:43:21 2008 +0100 x86: CPA: avoid split of alias mappings because it clearly mishandles the case when __change_page_attr(), called from __change_page_attr_set_clr(), changes cpa->processed to 1 and cpa_process_alias(cpa) is executed right after that. This crashes my x86-64 test box early in the boot process (ref. http://bugzilla.kernel.org/show_bug.cgi?id=10140#c4). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-03-03 14:18:27 +01:00
Thomas Gleixner	8be8f54bae	x86: CPA: avoid split of alias mappings avoid over-eager large page splitup. When the target area needs to be split or is split already (ioremap) then the current code enforces the split of large mappings in the alias regions even if we could avoid it. Use a separate variable processed in the cpa_data structure to carry the number of pages which have been processed instead of reusing the numpages variable. This keeps numpages intact and gives the alias code a chance to keep large mappings intact. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	b16bf712f4	x86: fix leak un ioremap_page_range() failure Jan Beulich noticed it during code review that if a driver's ioremap() fails (say due to -ENOMEM) then we might leak the struct vm_area. Free it properly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-29 18:55:42 +01:00
Ingo Molnar	88f3aec7af	x86: fix spontaneous reboot with allyesconfig bzImage recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (4010241024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 `8646224` 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:56 +01:00
Yinghai Lu	3b57bc461f	x86: remove double-checking empty zero pages debug so far no one complained about that. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-26 12:55:55 +01:00
Ingo Molnar	92cb54a37a	x86: make DEBUG_PAGEALLOC and CPA more robust Use PF_MEMALLOC to prevent recursive calls in the DBEUG_PAGEALLOC case. This makes the code simpler and more robust against allocation failures. This fixes the following fallback to non-mmconfig: http://lkml.org/lkml/2008/2/20/551 http://bugzilla.kernel.org/show_bug.cgi?id=10083 Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-26 12:55:50 +01:00
Rafael J. Wysocki	8a235efad5	Hibernation: Handle DEBUG_PAGEALLOC on x86 Make hibernation work with CONFIG_DEBUG_PAGEALLOC set on x86, by checking if the pages to be copied are marked as present in the kernel mapping and temporarily marking them as present if that's not the case. No functional modifications are introduced if CONFIG_DEBUG_PAGEALLOC is unset. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2008-02-21 02:15:28 -05:00
Linus Torvalds	5d9c4a7de6	Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp: fix missing casts that produced a warning. agp: add support for 662/671 to agp driver fix historic ioremap() abuse in AGP agp/sis: Suspend support for SiS AGP agp/sis: Clear bit 2 from aperture size byte as well	2008-02-19 18:29:57 -08:00
Arjan van de Ven	156fbc3fbe	x86: fix page_is_ram() thinko page_is_ram() has a special case for the 640k-1M bios area, however due to a thinko the special case checks the e820 table entry and not the memory the user has asked for. This patch fixes the bug. [ mingo@elte.hu: this too is better solved in the e820 space, but those fixes are too intrusive for v2.6.25. ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:34 +01:00
Arjan van de Ven	d8a9e6a51e	x86: fix WARN_ON() message: teach page_is_ram() about the special 4Kb bios data page This patch teaches page_is_ram() about the fact that the first 4Kb of memory are special on x86, even though the E820 table normally doesn't exclude it. This fixes the WARN_ON() reported by Laurent Riffard who was also very helpful in diagnosing the issue. [ mingo@elte.hu: we are working on doing this properly in the e820 space, but for 2.6.25 this is the better fix. ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:34 +01:00
Sam Ravnborg	d01b9ad56e	x86: fix section mismatch in srat_64.c:reserve_hotadd reserve_hotadd() are only used by __init acpi_numa_memory_affinity_init(). Annotate reserve_hotadd() with __init is the trivial fix. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:31 +01:00
Andi Kleen	8e31c2ac11	x86: CPA: remove BUG_ON for LRU/Compound pages New implementation does not use lru for anything so there is no need to reject pages that are in the LRU. Similar for compound pages (which were checked because they also use page->lru) [ tglx@linutronix.de: removed unused variable ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-19 16:18:29 +01:00
Arjan van dev Ven	fcea424d31	fix historic ioremap() abuse in AGP Several AGP drivers right now use ioremap_nocache() on kernel ram in order to turn a page of regular memory uncached. There are two problems with this: 1) This is a total nightmare for the ioremap() implementation to keep various mappings of the same page coherent. 2) It's a total nightmare for the AGP code since it adds a ton of complexity in terms of keeping track of 2 different pointers to the same thing, in terms of error handling etc etc. This patch fixes this by making the AGP drivers use the new set_memory_XX APIs instead. Note: amd-k7-agp.c is built on Alpha too, and generic.c is built on ia64 as well, which do not yet have the set_memory_*() APIs, so for them some we have a few ugly #ifdefs - hopefully they'll be fixed soon. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-02-19 14:46:39 +10:00
Yinghai Lu	b7ad149d62	x86: reenable support for system without on node0 One system doesn't have RAM for node0 installed. SRAT: PXM 0 -> APIC 0 -> Node 0 SRAT: PXM 0 -> APIC 1 -> Node 0 SRAT: PXM 1 -> APIC 2 -> Node 1 SRAT: PXM 1 -> APIC 3 -> Node 1 SRAT: Node 1 PXM 1 0-a0000 SRAT: Node 1 PXM 1 0-dd000000 SRAT: Node 1 PXM 1 0-123000000 ACPI: SLIT: nodes = 2 10 13 13 10 mapped APIC to ffffffffff5fb000 ( fee00000) Bootmem setup node 1 0000000000000000-0000000123000000 NODE_DATA [000000000000e000 - 0000000000014fff] bootmap [0000000000015000 - 00000000000395ff] pages 25 Could not find start_pfn for node 0 Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #14 Call Trace: [<ffffffff80bab498>] free_area_init_node+0x22/0x381 [<ffffffff8045ffc5>] generic_swap+0x0/0x17 [<ffffffff80bab0cc>] find_zone_movable_pfns_for_nodes+0x54/0x271 [<ffffffff80baba5f>] free_area_init_nodes+0x239/0x287 [<ffffffff80ba6311>] paging_init+0x46/0x4c [<ffffffff80b9dda5>] setup_arch+0x3c3/0x44e [<ffffffff80b978be>] start_kernel+0x6f/0x2c7 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3 This happens because node 0 is not online, but the node state in mm/page_alloc.c has node 0 set. nodemask_t node_states[NR_NODE_STATES] __read_mostly = { [N_POSSIBLE] = NODE_MASK_ALL, [N_ONLINE] = { { [0] = 1UL } }, So we need to clear node_online_map before initializing the memory. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	f34b439f34	x86: CPA: avoid double checking of alias ranges When the CPA code is called with an virtual address in the range of the direct mapping or the high alias then we do not need to run through the alias check for this range. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	af96e4438a	x86: CPA no alias checking for _NX NX settings are not required to be consistent across alias mappings. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	31eedd823c	x86: zap invalid and unused pmds in early boot The early boot code maps KERNEL_TEXT_SIZE (currently 40MB) starting from __START_KERNEL_map. The kernel itself only needs _text to _end mapped in the high alias. On relocatible kernels the ASM setup code adjusts the compile time created high mappings to the relocation. This creates invalid pmd entries for negative offsets: 0xffffffff80000000 -> pmd entry: ffffffffff2001e3 It points outside of the physical address space and is marked present. This starts at the virtual address __START_KERNEL_map and goes up to the point where the first valid physical address (0x0) is mapped. Zap the mappings before _text and after _end right away in early boot. This removes also the invalid entries. Furthermore it simplifies the range check for high aliases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Thomas Gleixner	c31c7d4844	x86: CPA, fix alias checks c_p_a() did not discover all aliases correctly. (such as when called on vmalloc()-ed areas or ioremap()-ed areas) Push the alias checks to the lower, physical level and consistently discover all aliases that might exist: the low direct mappings and the high linear kernel-text mappings (on 64-bit). Thanks to Andi Kleen for pointing out that this was buggy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-18 20:54:14 +01:00
Ingo Molnar	f8d8406bcb	x86: cpa, fix out of date comment Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:21 +01:00
Thomas Gleixner	69b1415e93	x86: cpa: ensure page alignment the cpa API is page aligned - warn about any weird alignments. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Harvey Harrison	7bfeab9af9	x86: include proper prototypes for rodata_test extern should not appear in C files. Also, the definitions do not match the prototype currently, not sure what way you want to go with this, I've switched the prototype to return int, but I can see going to the void return as well. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:20 +01:00
Adrian Bunk	cae30f8270	x86: make dump_pagetable() static dump_pagetable() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-14 23:30:19 +01:00
Andi Kleen	5d3c8b21e2	x86: CPA: fix gbpages support in try_preserve_large_page [ mingo@elte.hu: while gbpages cannot be enabled on mainline currently, keep the code uptodate and this fix is easy enough. ] Use correct page sizes and masks for GB pages in try_preserve_large_page() This prevents a boot hang on a GB capable system with CONFIG_DIRECT_GBPAGES enabled. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-13 16:20:35 +01:00
Jeremy Fitzhardinge	37cc8d7f96	x86/early_ioremap: don't assume we're using swapper_pg_dir At the early stages of boot, before the kernel pagetable has been fully initialized, a Xen kernel will still be running off the Xen-provided pagetables rather than swapper_pg_dir[]. Therefore, readback cr3 to determine the base of the pagetable rather than assuming swapper_pg_dir[]. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Tested-by: Jody Belka <knew-linux@pimb.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-13 16:20:35 +01:00
Thomas Gleixner	81772fea41	x86: remove over noisy debug printk pageattr-test.c contains a noisy debug printk that people reported. The condition under which it prints (randomly tapping into a mem_map[] hole and not being able to c_p_a() there) is valid behavior and not interesting to report. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-11 11:24:24 -08:00
Thomas Gleixner	fac8493960	x86: cpa, strict range check in try_preserve_large_page() Right now, we check only the first 4k page for static required protections. This does not take overlapping regions into account. So we might end up setting the wrong permissions/protections for other parts of this large page. This can be optimized further, but correctness is the important part. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	eb5b5f024c	x86: cpa, use page pool Switch the split page code to use the page pool. We do this unconditionally to avoid different behaviour with and without DEBUG_PAGEALLOC enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	76ebd0548d	x86: introduce page pool in cpa DEBUG_PAGEALLOC was not possible on 64-bit due to its early-bootup hardcoded reliance on PSE pages, and the unrobustness of the runtime splitup of large pages. The splitup ended in recursive calls to alloc_pages() when a page for a pte split was requested. Avoid the recursion with a preallocated page pool, which is used to split up large mappings and gets refilled in the return path of kernel_map_pages after the split has been done. The size of the page pool is adjusted to the available memory. This part just implements the page pool and the initialization w/o using it yet. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Ian Campbell	b6fbb669c8	x86: fix early_ioremap pagetable ops Some important parts of `f6df72e71e` got dropped along the way, reintroduce them. Only affects paravirt guests. Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:09 +01:00
Ian Campbell	551889a6e2	x86: construct 32-bit boot time page tables in native format. Specifically the boot time page tables in a CONFIG_X86_PAE=y enabled kernel are in PAE format. early_ioremap is updated to use the standard page table accessors. Clear any mappings beyond max_low_pfn from the boot page tables in native_pagetable_setup_start because the initial mappings can extend beyond the range of physical memory and into the vmalloc area. Derived from patches by Eric Biederman and H. Peter Anvin. [ jeremy@goop.org: PAE swapper_pg_dir needs to be page-sized fix ] Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Mika PenttilÃÂ¤ <mika.penttila@kolumbus.fi> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:09 +01:00
Thomas Gleixner	bfc734b246	x86: avoid unused variable warning in mm/init_64.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-09 23:24:09 +01:00
Harvey Harrison	da7bfc50f5	x86: sparse warnings in pageattr.c Adjust the definition of lookup_address to take an unsigned long level argument. Adjust callers in xen/mmu.c that pass in a dummy variable. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-09 23:24:08 +01:00
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
Bernhard Walle	72a7fe3967	Introduce flags for reserve_bootmem() This patchset adds a flags variable to reserve_bootmem() and uses the BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions between crashkernel area and already used memory. This patch: Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE. If that flag is set, the function returns with -EBUSY if the memory already has been reserved in the past. This is to avoid conflicts. Because that code runs before SMP initialisation, there's no race condition inside reserve_bootmem_core(). [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix powerpc build] Signed-off-by: Bernhard Walle <bwalle@suse.de> Cc: <linux-arch@vger.kernel.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:25 -08:00
Ingo Molnar	58d5d0d8dd	x86: fix deadlock, make pgd_lock irq-safe lockdep just caught this one: ================================= [ INFO: inconsistent lock state ] 2.6.24 #38 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. swapper/1 [HC0[0]:SC0[0]:HE1:SE1] takes: (pgd_lock){-+..}, at: [<ffffffff8022a9ea>] mm_init+0x1da/0x250 {in-softirq-W} state was registered at: [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 394559 hardirqs last enabled at (394559): [<ffffffff80267f0a>] get_page_from_freelist+0x30a/0x4c0 hardirqs last disabled at (394558): [<ffffffff80267d25>] get_page_from_freelist+0x125/0x4c0 softirqs last enabled at (393952): [<ffffffff80232f8e>] __do_softirq+0xce/0xe0 softirqs last disabled at (393945): [<ffffffff8020c57c>] call_softirq+0x1c/0x30 other info that might help us debug this: no locks held by swapper/1. stack backtrace: Pid: 1, comm: swapper Not tainted 2.6.24 #38 Call Trace: [<ffffffff8024e1fb>] print_usage_bug+0x18b/0x190 [<ffffffff8024f55d>] mark_lock+0x53d/0x560 [<ffffffff8024fffa>] __lock_acquire+0x3ca/0xed0 [<ffffffff80250ba8>] lock_acquire+0xa8/0xe0 [<ffffffff8022a9ea>] ? mm_init+0x1da/0x250 [<ffffffff809bcd10>] _spin_lock+0x30/0x70 [<ffffffff8022a9ea>] mm_init+0x1da/0x250 [<ffffffff8022aa99>] mm_alloc+0x39/0x50 [<ffffffff8028b95a>] bprm_mm_init+0x2a/0x1a0 [<ffffffff8028d12b>] do_execve+0x7b/0x220 [<ffffffff80209776>] sys_execve+0x46/0x70 [<ffffffff8020c214>] kernel_execve+0x64/0xd0 [<ffffffff8020901e>] ? _stext+0x1e/0x20 [<ffffffff802090ba>] init_post+0x9a/0xf0 [<ffffffff809bc5f6>] ? trace_hardirqs_on_thunk+0x35/0x3a [<ffffffff8024f75a>] ? trace_hardirqs_on+0xba/0xd0 [<ffffffff8020c1a8>] ? child_rip+0xa/0x12 [<ffffffff8020bcbc>] ? restore_args+0x0/0x44 [<ffffffff8020c19e>] ? child_rip+0x0/0x12 turns out that pgd_lock has been used on 64-bit x86 in an irq-unsafe way for almost two years, since commit `8c914cb704`. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-06 22:39:45 +01:00
Ingo Molnar	971a52d66a	x86: delay CPA self-test and repeat it delay the CPA self-test so that any impact (corruption) of user-space pagetables can be triggered. Repeat the test every 30 seconds. this would have prevented the bug fixed by `8cb2a7c1e9`, at its source. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Arjan van de Ven	cc842b82cc	x86: remove suprious ifdefs from pageattr.c The .rodata section really should just be read only; the config option is there to make breaking up the 2Mb page an option (so people whos machines give more performance for the 2Mb case can opt to do so). But when the page gets split anyway, this is no longer an issue, so clean up the code and remove the ifdefs Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Arjan van de Ven	984bb80d94	x86: mark the .rodata section also NX The .rodata section shouldn't just be read-only, but also non-executable. This is free since we've broken up the 2MB page already anyway. also update test_nx to check for this. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:45 +01:00
Ingo Molnar	2d684cd6d9	x86: remove X2 workaround With the spurious handler fix, the X2 does not lock up anymore. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:44 +01:00
Thomas Gleixner	d8b57bb700	x86: make spurious fault handler aware of large mappings In very rare cases, on certain CPUs, we could end up in the spurious fault handler and ignore a large pud/pmd mapping. The resulting pte pointer points into the mapped physical space and dereferencing it will fault recursively. Make the code aware of large mappings and do the permission check on the pmd/pud entry, when a large pud/pmd mapping is detected. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-06 22:39:43 +01:00
Hugh Dickins	8cb2a7c1e9	stop c_p_a corrupting the pds When change_page_attr splits a large page on x86_32 (without PAE), it is currently corrupting every process's page directory: fix that by removing the thinko which passes down a physical instead of a virtual address. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 14:37:14 -08:00
Benjamin Herrenschmidt	5e5419734c	add mm argument to pte/pmd/pud/pgd_free (with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:18 -08:00
Thomas Gleixner	7b610eec7a	x86: cpa, micro-optimization Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:10 +01:00
Ingo Molnar	87f7f8fe32	x86: cpa, clean up code flow Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:10 +01:00
Ingo Molnar	beaff6333b	x86: cpa, eliminate CPA_ enum Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Ingo Molnar	9df84993cb	x86: cpa, cleanups Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	f07333fd14	x86: implement gbpages support in change_page_attr() Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	b536022227	x86: support gbpages in pagetable dump Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	c2f71ee214	x86: add gbpages support to lookup_address [ tglx@linutronix.de: fix bootup crash on sparse mappings. ] Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Andi Kleen	d4f71f7969	x86: switch direct mapping setup over to set_pte Use set_pte() for setting up the 2MB pages in the direct mapping. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:09 +01:00
Thomas Gleixner	7bfb72e847	x86: fix page-present check in cpa_flush_range pte_present() might return true for PROT_NONE mappings. Explicitely check the present bit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Ingo Molnar	6ce9fc17d9	x86: remove cpa warning this race is legit and can happen on SMP systems. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Andi Kleen	bde1965ce8	x86: remove now unused clear_kernel_mapping Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	64f351d197	x86: cpa selftest, skip non present entries pud and pmd entries in the RAM area might be marked as non present. Do not try to modify them in the selftest. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	07cf89c05f	x86: CPA fix pagetable split Move the readout of the large entry into the spinlock section to prevent an unlikely but possible race. Mark the pmd/pud entry present after the split. We preserved the non present bit in the new split mapping. Remove the stale gfp_flags double initialization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:08 +01:00
Andi Kleen	31422c51e0	x86: rename LARGE_PAGE_SIZE to PMD_PAGE_SIZE Fix up all users. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:08 +01:00
Thomas Gleixner	9a14aefc1d	x86: cpa, fix lookup_address lookup_address() returns a wrong level and a wrong pointer to a non existing pte, when pmd or pud entries are marked !present. This happens for example due to boot time mapping of GART into the low memory space. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Ingo Molnar	34508f66b6	x86: AMD Athlon X2 hard hang fix An Athlon 64 X2 test system showed hard hangs shortly after marking the kernel text read-only, if we tried to preserve largepages and changed the PSE entry from RW to RO. The pagetable code itself is correct, it's the CPU that locked up hard (and not even the NMI watchdog could punch through that hard hang). So be conservative and always do splitups - like we did in the past. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	65e074dffa	x86: cpa, preserve large pages if possible When CPA is called on a range which fits into a large page mapping, avoid to split the page when: 1) There is no change of attributes 2) The range to change is a complete large mapping Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	f4ae5da0e8	x86: cpa, check if we changed anything and tlb flushing is necessary Flush tlbs only when there was a real change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Thomas Gleixner	72e458dfa6	x86: introduce struct cpa_data The number of arguments which need to be transported is increasing and we want to add flush optimizations and large page preserving. Create struct cpa data and pass a pointer instead of increasing the number of arguments further. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:07 +01:00
Andi Kleen	6bb8383beb	x86: cpa, only flush the cache if the caching attributes have changed We only need to flush the caches in cpa() if the the caching attributes have changed. Otherwise only flush the TLBs. This checks the PAT bits too although they are currently not used by the kernel. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	331e406588	x86: CPA return early when requested feature is not available Mask out the not supported bits (e.g. NX). If the clr/set masks are empty after the mask return without changing anything. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:06 +01:00
Thomas Gleixner	f56d005d30	x86: no CPA on iounmap When an ioremap is unmapped, do not change the page attributes. There might be another mapping of the same physical address. PAT might detect a conflicting mapping attribute for no good reason. The mapping is removed anyway. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	75ab43bfce	x86: ioremap remove the range check of cpa Now that cpa works on non-direct mappings as well, we can safely remove the range check in ioremap_change_attr(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	e66aadbe6c	x86: simplify __ioremap Remove tons of castings which make the code hard to read. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Thomas Gleixner	63c1dcf4bc	x86: CPA use the existing pfn in split as well When splitting large pages, we ge the pfn from the existing entry instead of calculating it ourself. This removes the last remaining range restriction of the cpa code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	626c2c9d06	x86: use the pfn from the page when change its attributes When changing the attributes of a pte, we should use the PFN from the existing PTE rather than going through hoops calculating what we think it might have been; this is both fragile and totally unneeded. It also makes it more hairy to call any of these functions on non-direct maps for no good reason whatsover. With this change, __change_page_attr() no longer takes a pfn as argument, which simplifies all the callers. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@tglx.de>	2008-02-04 16:48:05 +01:00
Arjan van de Ven	cc0f21bbc1	x86: teach the static_protection function about high mappings Right now, enforcing that the high mapping of the kernel text doesn't get the NX bit is done deep in the guts of CPA, rather than in the static_protection() function that enforces all other per-arch sanity checks. This patch moves this sanity check into the central static_protection() function instead, and makes it apply ONLY to the kernel text, not to all other areas in the high mapping. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:05 +01:00
Jeremy Fitzhardinge	a67ad9c9f8	x86: revert "defer cr3 reload when doing pud_clear()" Revert "defer cr3 reload when doing pud_clear()" since I'm going to replace it. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:02 +01:00
Jeremy Fitzhardinge	e618c9579c	x86: unify PAE/non-PAE pgd_ctor The constructors for PAE and non-PAE pgd_ctors are more or less identical, and can be made into the same function. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:48:02 +01:00
H. Peter Anvin	f832ff18e8	x86: use _ASM_EXTABLE macro in arch/x86/mm/init_32.c Use the _ASM_EXTABLE macro from <asm/asm.h>, instead of open-coding __ex_table entires in arch/x86/mm/init_32.c. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:58 +01:00
Harvey Harrison	cf89ec924d	x86: reduce ifdef sections in fault.c Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:56 +01:00
Yinghai Lu	6118f76fb7	x86: print out node_data addr and bootmap_start addr print out node_data addr and bootmap_start addr. helpful for debugging early crashes on high-end NUMA systems. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:56 +01:00
Thomas Gleixner	b50516fc20	x86: CPA remove bogus NX clear In split_large_page we clear the NX bit for the new split ptes, but we need to preserve the original setting of it for the split ptes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-04 16:47:55 +01:00
Ingo Molnar	38cb47ba01	x86: relax RAM check in ioremap() Kevin Winchester reported the loss of direct rendering, due to: [ 0.588184] agpgart: Detected AGP bridge 0 [ 0.588184] agpgart: unable to get memory for graphics translation table. [ 0.588184] agpgart: agp_backend_initialize() failed. [ 0.588207] agpgart-amd64: probe of 0000:00:00.0 failed with error -12 and bisected it down to: commit `266b9f8727` Author: Thomas Gleixner <tglx@linutronix.de> Date: Wed Jan 30 13:34:06 2008 +0100 x86: fix ioremap RAM check this check was too strict and caused an ioremap() failure. the problem is due to the somewhat unclean way of how the GART code reserves a memory range for its aperture, and how it utilizes it later on. Allow RAM pages to be ioremap()-ed too, as long as they are reserved. Bisected-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Kevin Winchester <kjwinchester@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-02-04 16:47:54 +01:00
Rafael J. Wysocki	a6eb84bc1e	suspend: cleanup reference to swsusp_pg_dir[] swsusp_pg_dir[] is used for suspend, but not for hibernation. clean-up the ifdefs which worked by accident, while implying the opposite. Delete the __nosavedata, which also implied the opposite. Some day we may optimize CONFIG_ACPI_SLEEP to build minimal kernels for just hibernate or just suspend but not both, but today isn't that day. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2008-02-01 18:30:59 -05:00
Harvey Harrison	93809be8b1	x86: fixes for lookup_address args Signedness mismatches in level argument. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:43 +01:00
Yinghai Lu	9347e0b0ce	x86: remove unneeded round_up Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:42 +01:00
Yinghai Lu	24a5da73f4	x86_64: make bootmap_start page align v6 boot oopses when a system has 64 or 128 GB of RAM installed: Calling initcall 0xffffffff80bc33b6: sctp_init+0x0/0x711() BUG: unable to handle kernel NULL pointer dereference at 000000000000005f IP: [<ffffffff802bfe55>] proc_register+0xe7/0x10f PGD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: Pid: 1, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #6 RIP: 0010:[<ffffffff802bfe55>] [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP: 0000:ffff810824c57e60 EFLAGS: 00010246 RAX: 000000000000d7d7 RBX: ffff811024c5fa80 RCX: ffff810824c57e08 RDX: 0000000000000000 RSI: 0000000000000195 RDI: ffffffff80cc2460 RBP: ffffffffffffffff R08: 0000000000000000 R09: ffff811024c5fa80 R10: 0000000000000000 R11: 0000000000000002 R12: ffff810824c57e6c R13: 0000000000000000 R14: ffff810824c57ee0 R15: 00000006abd25bee FS: 0000000000000000(0000) GS:ffffffff80b4d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 000000000000005f CR3: 0000000000201000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 1, threadinfo ffff810824c56000, task ffff812024c52000) Stack: ffffffff80a57348 0000019500000000 ffff811024c5fa80 0000000000000000 00000000ffffff97 ffffffff802bfef0 0000000000000000 ffffffffffffffff 0000000000000000 ffffffff80bc3b4b ffff810824c57ee0 ffffffff80bc34a5 Call Trace: [<ffffffff802bfef0>] ? create_proc_entry+0x73/0x8a [<ffffffff80bc3b4b>] ? sctp_snmp_proc_init+0x1c/0x34 [<ffffffff80bc34a5>] ? sctp_init+0xef/0x711 [<ffffffff80b976e3>] ? kernel_init+0x175/0x2e1 [<ffffffff8020ccf8>] ? child_rip+0xa/0x12 [<ffffffff80b9756e>] ? kernel_init+0x0/0x2e1 [<ffffffff8020ccee>] ? child_rip+0x0/0x12 Code: 1e 48 83 7b 38 00 75 08 48 c7 43 38 f0 e8 82 80 48 83 7b 30 00 75 08 48 c7 43 30 d0 e9 82 80 48 c7 c7 60 24 cc 80 e8 bd 5a 54 00 <48> 8b 45 60 48 89 6b 58 48 89 5d 60 48 89 43 50 fe 05 f5 25 a0 RIP [<ffffffff802bfe55>] proc_register+0xe7/0x10f RSP <ffff810824c57e60> CR2: 000000000000005f ---[ end trace 02c2d78def82877a ]--- Kernel panic - not syncing: Attempted to kill init! it turns out some variables near end of bss are corrupted already. in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end and setup_node_bootmem() will use that page 0xd40000 for bootmap Bootmem setup node 0 0000000000000000-0000000828000000 NODE_DATA [000000000008a485 - 0000000000091484] bootmap [0000000000d406f4 - 0000000000e456f3] pages 105 Bootmem setup node 1 0000000828000000-0000001028000000 NODE_DATA [0000000828000000 - 0000000828006fff] bootmap [0000000828007000 - 0000000828106fff] pages 100 Bootmem setup node 2 0000001028000000-0000001828000000 NODE_DATA [0000001028000000 - 0000001028006fff] bootmap [0000001028007000 - 0000001028106fff] pages 100 Bootmem setup node 3 0000001828000000-0000002028000000 NODE_DATA [0000001828000000 - 0000001828006fff] bootmap [0000001828007000 - 0000001828106fff] pages 100 setup_node_bootmem() makes NODE_DATA cacheline aligned, and bootmap is page-aligned. the patch updates find_e820_area() to make sure we can meet the alignment constraints. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Yinghai Lu	25eff8d4cd	x86_64: add debug name for early_res helps debugging problems in this rather murky area of code. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-02-01 17:49:41 +01:00
Ingo Molnar	5aa0508508	x86: uninline __pte_free_tlb() and __pmd_free_tlb() this also removes an include file dependency. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:48 +01:00
Huang, Ying	1fd6a53ddc	x86: early_ioremap_reset fix 2 This patch fixes a bug of early_ioremap_reset(), which had been fixed before by "convert the boot time page table to the kernels native format" patch. But that patch has been reverted now. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:45 +01:00
Huang, Ying	5827040df0	x86: change_page_attr_clear fix This patch replaces __change_page_attr_set_clr() with change_page_attr_set_clr() in change_page_attr_clear() to flush the TLB/cache properly. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-31 22:05:43 +01:00
Yinghai Lu	afadcd788f	x86: fix nodemap_size according to nodeid bits memnode.map is s16 array because of nodeid is 16 bit now. so need to increase the nodemap_size according to that bits. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Yinghai Lu	9198715763	x86: fix overlap between pagetable with bss section one early crash on one 8 node 256g machine: Command line: console=uart8250,io,0x3f8,115200n8 initrd=kernel.org/mydisk11_x86_64.gz rw root=/dev/ram0 debug initcall_debug apic=debug acpi.debug_level=0x0000000f pci=routeirq ip=dhcp load_ramdisk=1 ramdisk_size=131072 BOOT_IMAGE=kernel.org/bzImage_2.6.25_k8.1 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009bc00 (usable) BIOS-e820: 000000000009bc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 00000000dffe0000 (usable) BIOS-e820: 00000000dffe0000 - 00000000dffee000 (ACPI data) BIOS-e820: 00000000dffee000 - 00000000dffff050 (ACPI NVS) BIOS-e820: 00000000dffff050 - 00000000e0000000 (reserved) BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved) BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved) BIOS-e820: 00000000ff700000 - 0000000100000000 (reserved) BIOS-e820: 0000000100000000 - 0000004020000000 (usable) Early serial console at I/O port 0x3f8 (options '115200n8') console [uart0] enabled end_pfn_map = 67239936 Kernel panic - not syncing: Duplicated early reservation d40000-e42000 Pid: 0, comm: swapper Not tainted 2.6.24-smp-g5a514e21-dirty #3 Call Trace: [<ffffffff80221545>] lapic_get_maxlvt+0x0/0x10 [<ffffffff80221657>] clear_local_APIC+0x5/0xcf [<ffffffff80221726>] disable_local_APIC+0x5/0x17 [<ffffffff8021fe16>] smp_send_stop+0x46/0x4c [<ffffffff80235293>] panic+0x94/0x13e [<ffffffff80bc3b03>] sctp_eps_proc_init+0x12/0x34 [<ffffffff80b9f1c5>] reserve_early+0x30/0x6c [<ffffffff80803925>] init_memory_mapping+0x2cd/0x2dc [<ffffffff80b9dc01>] setup_arch+0x21f/0x44e [<ffffffff80b978be>] start_kernel+0x6f/0x2c7 [<ffffffff80b971cc>] _sinittext+0x1cc/0x1d3 it turns out there is overlap between pgtable and bss... in System.map we have ffffffff80d40420 b rsi_table ffffffff80d40620 B krb5_seq_lock ffffffff80d40628 b i.20437 ffffffff80d40630 b xprt_rdma_inline_write_padding ffffffff80d40638 b sunrpc_table_header ffffffff80d40640 b zero ffffffff80d40644 b min_memreg ffffffff80d40648 b rpcrdma_tk_lock_g ffffffff80d40650 B sctp_assocs_id_lock ffffffff80d40658 B proc_net_sctp ffffffff80d40660 B sctp_assocs_id ffffffff80d40680 B sysctl_sctp_mem ffffffff80d40690 B sysctl_sctp_rmem ffffffff80d406a0 B sysctl_sctp_wmem ffffffff80d406b0 b sctp_ctl_socket ffffffff80d406b8 b sctp_pf_inet6_specific ffffffff80d406c0 b sctp_pf_inet_specific ffffffff80d406c8 b sctp_af_v4_specific ffffffff80d406d0 b sctp_af_v6_specific ffffffff80d406d8 b sctp_rand.33270 ffffffff80d406dc b sctp_memory_pressure ffffffff80d406e0 b sctp_sockets_allocated ffffffff80d406e4 b sctp_memory_allocated ffffffff80d406e8 b sctp_sysctl_header ffffffff80d406f0 b zero ffffffff80d406f4 A __bss_stop ffffffff80d406f4 A _end need to round up table_start to PAGE_SIZE. also make the panic more informative. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Joachim Deguara	bb4a1d644a	x86: add PCI IDs to k8topology_64.c This just adds the PCI IDs of AMD's family 10h and 11h CPU's northbridges to k8topology discovery. Signed-off-by: Joachim Deguara <joachim.deguara@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:12 +01:00
Jeremy Fitzhardinge	f6df72e71e	x86: fix early_ioremap pagetable ops Put appropriate pagetable update hooks in so that paravirt knows what's going on in there. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	e3ed910db2	x86: use the same pgd_list for PAE and 64-bit Use a standard list threaded through page->lru for maintaining the pgd list on PAE. This is the same as 64-bit, and seems saner than using a non-standard list via page->index. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	6194ba6ff6	x86: don't special-case pmd allocations as much In x86 PAE mode, stop treating pmds as a special case. Previously they were always allocated and freed with the pgd. The modifies the code to be the same as 64-bit mode, where they are allocated on demand. This is a step on the way to unifying 32/64-bit pagetable allocation as much as possible. There is a complicating wart, however. When you install a new reference to a pmd in the pgd, the processor isn't guaranteed to see it unless you reload cr3. Since reloading cr3 also has the side-effect of flushing the tlb, this is an expense that we want to avoid whereever possible. This patch simply avoids reloading cr3 unless the update is to the current pagetable. Later patches will optimise this further. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: William Irwin <wli@holomorphy.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	fd40d6e318	x86: shrink some ifdefs in fault.c The change from current to tsk in do_page_fault is safe as this is set at the very beginning of the function. Removes a likely() annotation from the 64-bit version, this could have instead been added to 32-bit. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Jeremy Fitzhardinge	5b727a3b01	x86: ignore spurious faults When changing a kernel page from RO->RW, it's OK to leave stale TLB entries around, since doing a global flush is expensive and they pose no security problem. They can, however, generate a spurious fault, which we should catch and simply return from (which will have the side-effect of reloading the TLB to the current PTE). This can occur when running under Xen, because it frequently changes kernel pages from RW->RO->RW to implement Xen's pagetable semantics. It could also occur when using CONFIG_DEBUG_PAGEALLOC, since it avoids doing a global TLB flush after changing page permissions. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	b406ac61e9	x86: remove nx_enabled from fault.c On !PAE 32-bit, _PAGE_NX will be 0, making is_prefetch always return early. The test is sufficient on PAE as __supported_pte_mask is updated in the same places as nx_enabled in init_32.c which also takes disable_nx into account. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	c61e211d99	x86: unify fault_32\|64.c Unify includes in moved fault.c. Modify Makefiles to pick up unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:11 +01:00
Harvey Harrison	f8c2ee224d	x86: unify fault_32\|64.c with ifdefs Elimination of these ifdefs can be done in a unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	1156e098c5	x86: unify fault_32\|64.c by ifdef'd function bodies It's about time to get on with unifying these files, elimination of the ugly ifdefs can occur in the unified file. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	d7d119d777	x86: arch/x86/mm/init_32.c printk fixes printk fixes. NOP in terms of functionality, but strings got a bit larger due to the KERN_ markers that were added. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	8550eb9982	x86: arch/x86/mm/init_32.c cleanup Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Ingo Molnar	10f22dde55	x86: arch/x86/mm/init_64.c printk fixes Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Thomas Gleixner	14a62c34b1	x86: unify ioremap Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	19f0dda91e	x86: unify page fault oops printing This changes the oops dumping format for page faults to be similar between X86_32 and 64. This is the first user of printk_address on X86_32. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	b3279c7fd7	x86: introduce show_fault_oops helper to fault_32\|64.c This will help when unifying the oops dumping code on 32/64 bit. No functional changes. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:10 +01:00
Harvey Harrison	35f3266ffb	x86: add is_errata100 helper to fault_32\|64.c Further towards unifying these files, add another helper in same spirit as is_errata93. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Harvey Harrison	29caf2f98c	x86: add is_f00f_bug helper to fault_32\|64.c Further towards unifying these files, add another helper in same spirit as is_errata93. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Thomas Gleixner	0879750f5d	x86: cpa cleanup the 64-bit alias math Cleanup the address calculations, which are necessary to identify the high/low alias mappings of the kernel on 64 bit machines. Instead of calling __pa/__va back and forth, calculate the physical address once and base the other calculations on it. Add understandable constants so we can use the already available within() helper. Also add comments, which help mere mortals to understand what this code does. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Ingo Molnar	86f03989d9	x86: cpa: fix the self-test Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	ee01f1122c	x86: init memory debugging debug incorrect/late access to init memory, by permanently unmapping the init memory ranges. Depends on CONFIG_DEBUG_PAGEALLOC=y. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Arjan van de Ven	1a4872529e	x86: move misplaced rodata check call It looks like a mismerge put the rodata self-check in the wrong spot; move it to the right place after marking the .rodata section read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:09 +01:00
Ingo Molnar	4c61afcdb2	x86: fix clflush_page_range logic only present ptes must be flushed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:09 +01:00
Thomas Gleixner	3b233e52f7	x86: optimize clflush clflush is sufficient to be issued on one CPU. The invalidation is broadcast throughout the coherence domain. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	cd8ddf1a28	x86: clflush_page_range needs mfence clflush is an unordered operation with respect to other memory traffic, including other CLFLUSH instructions. This needs proper fencing with mfence. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	af1e6844d6	x86: cpa: rename global_flush_tlb() to cpa_flush_all() The function name global_flush_tlb() suggests something different from what the function really does. Rename it to cpa_flush_all(), which is an understandable counterpart to cpa_flush_range(). no global visibility of the old API anymore. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	57a6a46aa2	x86: cpa: implement clflush optimization Use clflush on CPUs which support this. clflush is only used when the page attribute operation has been successful. On CPUs which do not support clflush and in the case of error the old fashioned global_flush_tlb() is called. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	56744546b3	x86: cpa use the new set_clr function Convert cpa_set and cpa_clear to call the new set_clr function. Seperate out the debug helpers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	ff31452b6e	x86: cpa create set_and_clr function Create a set_and_clr function to avoid the duplicate loops. Allows also to do combined operations for optimization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	72932c7ad2	x86: cpa move the flush into set and clear functions To avoid the modification of the flush code for the clflush implementation, move the flush into the set and clear functions and provide helper functions for the debugging code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:08 +01:00
Arjan van de Ven	edeed30589	x86: add testcases for RODATA and NX protections/attributes Latest update; I now have 4 NX tests, but 2 fail so they're #if 0'd. I also cleaned up the NX test code quite a bit, and got rid of the ugly exception table sorting stuff. From: Arjan van de Ven <arjan@linux.intel.com> This patch adds testcases for the CONFIG_DEBUG_RODATA configuration option as well as the NX CPU feature/mappings. Both testcases can move to tests/ once that patch gets merged into mainline. (I'm half considering moving the rodata test into mm/init.c but I'll wait with that until init.c is unified) As part of this I had to fix a not-quite-right alignment in the vmlinux.lds.h for the RODATA sections, which lead to 1 page less being marked read only. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Ingo Molnar	adafdf6a4e	x86: ioremap KERN_INFO Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Thomas Gleixner	6eade8ff46	x86: cpa: clean up change_page_attr_set/clear() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:08 +01:00
Ingo Molnar	4692a1450b	x86: cpa: fix loop Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	a72a08a4b6	x86: cpa: fix split thinko Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	3c1df68b84	x86: make sure initmem is writable When we free initmem, various rodata and CPA checks may have left memory read only.. this patch ensures that the memory is writable before we free it. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	488fd99588	x86: fix pageattr-selftest In Ingo's testing, he found a bug in the CPA selftest code. What would happen is that the test would call change_page_attr_addr on a range of memory, part of which was read only, part of which was writable. The only thing the test wanted to change was the global bit... What actually happened was that the selftest would take the permissions of the first page, and then the change_page_attr_addr call would then set the permissions of the entire range to this first page. In the rodata section case, this resulted in pages after the .rodata becoming read only... which made the kernel rather unhappy in many interesting ways. This is just another example of how dangerous the cpa API is (was); this patch changes the test to use the incremental clear/set APIs instead, and it changes the clear/set implementation to work on a 1 page at a time basis. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d7c8f21a8c	x86: cpa: move flush to cpa The set_memory_* and set_pages_* family of API's currently requires the callers to do a global tlb flush after the function call; forgetting this is a very nasty deathtrap. This patch moves the global tlb flush into each of the callers Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Arjan van de Ven	d1028a154c	x86: make various pageattr.c functions static change_page_attr_add is only used in pageattr.c now, so we can make this function static. change_page_attr() isn't used anywere at all anymore; this function is a really bad API anyway so just remove the bloat entirely. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Ingo Molnar	f62d0f008e	x86: cpa: set_memory_notpresent() Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:07 +01:00
Thomas Gleixner	d806e5ee20	x86: cpa: convert ioremap to new API Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	5f8681529c	x86: fix ioremap API Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	266b9f8727	x86: fix ioremap RAM check Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	950f9d95be	x86: fix the missing BIOS area check in page_is_ram page_is_ram has a FIXME since ages, which reminds to sanity check the BIOS area between 640k and 1M, which is sometimes falsely reported as RAM in the e820 tables. Implement the sanity check. Move the BIOS range defines from pageattr.c into e820.h to avoid duplicate defines. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Thomas Gleixner	5f5192b9fe	x86: move page_is_ram() function Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	e1271f686a	x86: deprecate change_page_attr() for drivers With the introduction of the new API, no driver or non-archcore code needs to use c-p-a anymore, so this patch also deprecates the EXPORT_SYMBOL of CPA (it's a horrible API after all). Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	6d238cc4dc	x86: convert CPA users to the new set_page_ API This patch converts various users of change_page_attr() to the new, more intent driven set_page_/set_memory_ API set. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00
Arjan van de Ven	75cbade8ea	x86: a new API for drivers/etc to control cache and other page attributes Right now, if drivers or other code want to change, say, a cache attribute of a page, the only API they have is change_page_attr(). c-p-a is a really bad API for this, because it forces the caller to know ALL the attributes he wants for the page, not just the 1 thing he wants to change. So code that wants to set a page uncachable, needs to be aware of the NX status as well etc etc etc. This patch introduces a set of new APIs for this, set_pages_<attr> and set_memory_<attr>, that offer a logical change to the user, and leave all attributes not implied by the requested logical change alone. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:34:06 +01:00

1 2 3 4 5 ...

444 Commits