linux

Commit Graph

Author	SHA1	Message	Date
Heiko Carstens	7dd8fe1f91	[S390] pfault: cleanup code Small code cleanup. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:30 +02:00
Heiko Carstens	f2db2e6cb3	[S390] pfault: cpu hotplug vs missing completion interrupts On cpu hot remove a PFAULT CANCEL command is sent to the hypervisor which in turn will cancel all outstanding pfault requests that have been issued on that cpu (the same happens with a SIGP cpu reset). The result is that we end up with uninterruptible processes where the interrupt that would wake up these processes never arrives. In order to solve this all processes which wait for a pfault completion interrupt get woken up after a cpu hot remove. The worst case that could happen is that they fault again and in turn need to wait again. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:29 +02:00
Heiko Carstens	89db4df160	[S390] extmem: get rid of compile warning Get rid of these: arch/s390/mm/extmem.c: In function 'segment_modify_shared': arch/s390/mm/extmem.c:622:3: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized] arch/s390/mm/extmem.c:627:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized] arch/s390/mm/extmem.c: In function 'segment_load': arch/s390/mm/extmem.c:481:11: warning: 'end_addr' may be used uninitialized in this function [-Wuninitialized] arch/s390/mm/extmem.c:480:18: warning: 'start_addr' may be used uninitialized in this function [-Wuninitialized] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:29 +02:00
Heiko Carstens	7712f83aa9	[S390] get rid of unused variables Remove trivially unused variables as detected with -Wunused-but-set-variable. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:28 +02:00
Martin Schwidefsky	043d07084b	[S390] Remove data execution protection The noexec support on s390 does not rely on a bit in the page table entry but utilizes the secondary space mode to distinguish between memory accesses for instructions vs. data. The noexec code relies on the assumption that the cpu will always use the secondary space page table for data accesses while it is running in the secondary space mode. Up to the z9-109 class machines this has been the case. Unfortunately this is not true anymore with z10 and later machines. The load-relative-long instructions lrl, lgrl and lgfrl access the memory operand using the same addressing-space mode that has been used to fetch the instruction. This breaks the noexec mode for all user space binaries compiled with march=z10 or later. The only option is to remove the current noexec support. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:28 +02:00
Jan Glauber	448694a1d5	module: undo module RONX protection correctly. While debugging I stumbled over two problems in the code that protects module pages. First issue is that disabling the protection before freeing init or unload of a module is not symmetric with the enablement. For instance, if pages are set to RO the page range from module_core to module_core + core_ro_size is protected. If a module is unloaded the page range from module_core to module_core + core_size is set back to RW. So pages that were not set to RO are also changed to RW. This is not critical but IMHO it should be symmetric. Second issue is that while set_memory_rw & set_memory_ro are used for RO/RW changes only set_memory_nx is involved for NX/X. One would await that the inverse function is called when the NX protection should be removed, which is not the case here, unless I'm missing something. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2011-05-19 16:55:26 +09:30
Michael Holzheu	83ace2701b	[S390] replace diag10() with diag10_range() function Currently the diag10() function can only release one page. For exploiters that have to call diag10 on a contiguous memory region this is suboptimal. This patch replaces the diag10() function with diag10_range() that is able to release multiple pages. In addition to that the new function now allows to release memory with addresses higher than 2047 MiB. This was due to a restriction of the diagnose implementation under z/VM prior to release 5.2. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-10 17:13:43 +02:00
Heiko Carstens	a985183285	[S390] irqstats: fix counting of pfault, dasd diag and virtio irqs pfault, dasd diag and virtio all use the same external interrupt number. The respective interrupt handlers decide by the subcode if they are meant to handle the interrupt. Counting is currently done before looking at the subcode which means each handler counts an interrupt even if it is not handling it. Fix this by moving the kstat code after the code which looks at the subcode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-04-29 10:42:25 +02:00
Heiko Carstens	e35c76cd47	[S390] pfault: fix token handling `f6649a7e` "[S390] cleanup lowcore access from external interrupts" changed handling of external interrupts. Instead of letting the external interrupt handlers accessing the per cpu lowcore the entry code of the kernel reads already all fields that are necessary and passes them to the handlers. The pfault interrupt handler was incorrectly converted. It tries to dereference a value which used to be a pointer to a lowcore field. After the conversion however it is not anymore the pointer to the field but its content. So instead of a dereference only a cast is needed to get the task pointer that caused the pfault. Fixes a NULL pointer dereference and a subsequent kernel crash: Unable to handle kernel pointer dereference at virtual kernel address (null) Oops: 0004 [#1] SMP Modules linked in: nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc loop qeth_l3 qeth vmur ccwgroup ext3 jbd mbcache dm_mod dasd_eckd_mod dasd_diag_mod dasd_mod CPU: 0 Not tainted 2.6.38-2-s390x #1 Process cron (pid: 1106, task: 000000001f962f78, ksp: 000000001fa0f9d0) Krnl PSW : 0404200180000000 000000000002c03e (pfault_interrupt+0xa2/0x138) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000000 0000000000000001 0000000000000000 0000000000000001 000000001f962f78 0000000000518968 0000000090000002 000000001ff03280 0000000000000000 000000000064f000 000000001f962f78 0000000000002603 0000000006002603 0000000000000000 000000001ff7fe68 000000001ff7fe48 Krnl Code: 000000000002c036: 5820d010 l %r2,16(%r13) 000000000002c03a: 1832 lr %r3,%r2 000000000002c03c: 1a31 ar %r3,%r1 >000000000002c03e: ba23d010 cs %r2,%r3,16(%r13) 000000000002c042: a744fffc brc 4,2c03a 000000000002c046: a7290002 lghi %r2,2 000000000002c04a: e320d0000024 stg %r2,0(%r13) 000000000002c050: 07f0 bcr 15,%r0 Call Trace: ([<000000001f962f78>] 0x1f962f78) [<000000000001acda>] do_extint+0xf6/0x138 [<000000000039b6ca>] ext_no_vtime+0x30/0x34 [<000000007d706e04>] 0x7d706e04 Last Breaking-Event-Address: [<0000000000000000>] 0x0 For stable maintainers: the first kernel which contains this bug is 2.6.37. Reported-by: Stephen Powell <zlinuxman@wowway.com> Cc: Jonathan Nieder <jrnieder@gmail.com> Cc: stable@kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-04-20 10:15:44 +02:00
Jan Glauber	e4c031b4f2	[S390] fix page table walk for changing page attributes The page table walk for changing page attributes used the wrong address for pgd/pud/pmd lookups if the range was bigger than a pmd entry. Fix the lookup by using the correct address. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-04-20 10:15:43 +02:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Jan Glauber	305b152325	[S390] Write protect module text and RO data Implement write protection for kernel modules text and read-only data sections by implementing set_memory_[ro\|rw] on s390. Since s390 has no execute bit in the pte's NX is not supported. set_memory_[ro\|rw] will only work on normal pages and not on large pages, so in case a large page should be modified bail out with a warning. Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-03-15 17:08:23 +01:00
Martin Schwidefsky	f1be77bb21	[S390] pgtable_list corruption After page_table_free_rcu removed a page from the pgtable_list page_table_free better not add it again. Otherwise a page_table_alloc can reuse a page table fragment that is still in the rcu process. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-31 11:30:20 +01:00
Heiko Carstens	df1ca53cba	[S390] Randomize mmap start address Randomize mmap start address with 8MB. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-12 09:55:25 +01:00
Heiko Carstens	1060f62ea4	[S390] Rearrange mmap.c Shuffle code around so it looks more like x86 and powerpc. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-12 09:55:25 +01:00
Heiko Carstens	7e0d48574e	[S390] Enable flexible mmap layout for 64 bit processes Historically 64 bit processes use the legacy address layout. However there is no reason why 64 bit processes shouldn't benefit from the flexible mmap layout advantages. Therefore just enable it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-12 09:55:24 +01:00
Heiko Carstens	9e78a13bfb	[S390] reduce miminum gap between stack and mmap_base Reduce minimum gap between stack and mmap_base to 32MB. That way there is a bit more space for heap and mmap for tight 31 bit address spaces. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-12 09:55:24 +01:00
Heiko Carstens	9046e401e7	[S390] mmap: consider stack address randomization Consider stack address randomization when calulating mmap_base for flexible mmap layout . Because of address randomization the stack address can be up to 8MB lower than STACK_TOP. When calculating mmap_base this isn't taken into account, which could lead to the case that the gap between the real stack top and mmap_base is lower than what ulimit specifies for the maximum stack size. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-12 09:55:24 +01:00
Martin Schwidefsky	5e9a26928f	[S390] ptrace cleanup Overhaul program event recording and the code dealing with the ptrace user space interface. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:31 +01:00
Heiko Carstens	fb0a9d7e86	[S390] pfault: delay register of pfault interrupt Use an early init call to initialize pfault. That way it is possible to use the register_external_interrupt() instead of the early variant. No need to enable pfault any earlier since it has only effect if user space processes are running. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:26 +01:00
Heiko Carstens	052ff461c8	[S390] irq: have detailed statistics for interrupt types Up to now /proc/interrupts only has statistics for external and i/o interrupts but doesn't split up them any further. This patch adds a line for every single interrupt source so that it is possible to easier tell what the machine is/was doing. Part of the output now looks like this; CPU0 CPU2 CPU4 EXT: 3898 4232 2305 I/O: 782 315 245 CLK: 1029 1964 727 [EXT] Clock Comparator IPI: 2868 2267 1577 [EXT] Signal Processor TMR: 0 0 0 [EXT] CPU Timer TAL: 0 0 0 [EXT] Timing Alert PFL: 0 0 0 [EXT] Pseudo Page Fault [...] NMI: 0 1 1 [NMI] Machine Checks Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:25 +01:00
Martin Schwidefsky	25591b0703	[S390] fix get_user_pages_fast The check for the _PAGE_RO bit in get_user_pages_fast for write==1 is the wrong way around. It must not be set for the fast path. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-11-10 10:05:53 +01:00
Martin Schwidefsky	14375bc4eb	[S390] cleanup facility list handling Store the facility list once at system startup with stfl/stfle and reuse the result for all facility tests. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:21 +02:00
Christian Borntraeger	e05ef9bdb8	[S390] kvm: Fix badness at include/asm/mmu_context.h:83 commit `050eef364a` [S390] fix tlb flushing vs. concurrent /proc accesses broke KVM on s390x. On every schedule a Badness at include/asm/mmu_context.h:83 appears. s390_enable_sie replaces the mm on the __running__ task, therefore, we have to increase the attach count of the new mm. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:20 +02:00
Martin Schwidefsky	f6649a7e5a	[S390] cleanup lowcore access from external interrupts Read external interrupts parameters from the lowcore in the first level interrupt handler in entry[64].S. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Martin Schwidefsky	1e54622e04	[S390] cleanup lowcore access from program checks Read all required fields for program checks from the lowcore in the first level interrupt handler in entry[64].S. If the context that caused the fault was enabled for interrupts we can now re-enable the irqs in entry[64].S. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Martin Schwidefsky	36bf96801e	[S390] fix SIGBUS handling Raise SIGBUS with a siginfo structure. Deliver BUS_ADRERR as si_code and the address of the fault in the si_addr field. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Heiko Carstens	a20852d2b7	[S390] cmm: fix crash on case conversion When the cmm module is compiled into the kernel it will crash when writing to the R/O data section. Reason is the lower to upper case conversion of the "sender" module parameter which ignored the fact that the pointer is preinitialized. Introduced with `41b42876` "cmm, smsgiucv_app: convert sender to uppercase" Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:17 +02:00
Martin Schwidefsky	92f842eac7	[S390] store indication fault optimization Use the store indication bit in the translation exception code on page faults to avoid the protection faults that immediatly follow the page fault if the access has been a write. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:15 +02:00
Martin Schwidefsky	80217147a3	[S390] lockless get_user_pages_fast() Implement get_user_pages_fast without locking in the fastpath on s390. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:15 +02:00
Martin Schwidefsky	238ec4efee	[S390] zero page cache synonyms If the zero page is mapped to virtual user space addresses that differ only in bit 2^12 or 2^13 we get L1 cache synonyms which can affect performance. Follow the mips model and use multiple zero pages to avoid the synonyms. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:14 +02:00
David Howells	df9ee29270	Fix IRQ flag handling naming Fix the IRQ flag handling naming. In linux/irqflags.h under one configuration, it maps: local_irq_enable() -> raw_local_irq_enable() local_irq_disable() -> raw_local_irq_disable() local_irq_save() -> raw_local_irq_save() ... and under the other configuration, it maps: raw_local_irq_enable() -> local_irq_enable() raw_local_irq_disable() -> local_irq_disable() raw_local_irq_save() -> local_irq_save() ... This is quite confusing. There should be one set of names expected of the arch, and this should be wrapped to give another set of names that are expected by users of this facility. Change this to have the arch provide: flags = arch_local_save_flags() flags = arch_local_irq_save() arch_local_irq_restore(flags) arch_local_irq_disable() arch_local_irq_enable() arch_irqs_disabled_flags(flags) arch_irqs_disabled() arch_safe_halt() Then linux/irqflags.h wraps these to provide: raw_local_save_flags(flags) raw_local_irq_save(flags) raw_local_irq_restore(flags) raw_local_irq_disable() raw_local_irq_enable() raw_irqs_disabled_flags(flags) raw_irqs_disabled() raw_safe_halt() with type checking on the flags 'arguments', and then wraps those to provide: local_save_flags(flags) local_irq_save(flags) local_irq_restore(flags) local_irq_disable() local_irq_enable() irqs_disabled_flags(flags) irqs_disabled() safe_halt() with tracing included if enabled. The arch functions can now all be inline functions rather than some of them having to be macros. Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300] Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile] Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze] Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM] Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR] Acked-by: Tony Luck <tony.luck@intel.com> [IA-64] Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R] Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU] Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS] Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC] Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC] Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390] Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score] Acked-by: Matt Fleming <matt@console-pimps.org> [SH] Acked-by: David S. Miller <davem@davemloft.net> [Sparc] Acked-by: Chris Zankel <chris@zankel.net> [Xtensa] Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha] Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300] Cc: starvik@axis.com [CRIS] Cc: jesper.nilsson@axis.com [CRIS] Cc: linux-cris-kernel@axis.com	2010-10-07 14:08:55 +01:00
Martin Schwidefsky	050eef364a	[S390] fix tlb flushing vs. concurrent /proc accesses The tlb flushing code uses the mm_users field of the mm_struct to decide if each page table entry needs to be flushed individually with IPTE or if a global flush for the mm_struct is sufficient after all page table updates have been done. The comment for mm_users says "How many users with user space?" but the /proc code increases mm_users after it found the process structure by pid without creating a new user process. Which makes mm_users useless for the decision between the two tlb flusing methods. The current code can be confused to not flush tlb entries by a concurrent access to /proc files if e.g. a fork is in progres. The solution for this problem is to make the tlb flushing logic independent from the mm_users field. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-08-24 09:26:34 +02:00
Linus Torvalds	0d6ffdb8f1	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] dasd: tunable missing interrupt handler [S390] dasd: allocate fallback cqr for reserve/release [S390] topology: use default MC domain initializer [S390] initrd: change default load address [S390] cmm, smsgiucv_app: convert sender to uppercase [S390] cmm: add missing __init/__exit annotations [S390] cio: use all available paths for some internal I/O [S390] ccwreq: add ability to use all paths [S390] cio: ccw_device_online_store return -EINVAL in case of missing driver [S390] cio: Log the response from the unit check handler [S390] cio: CHSC SIOSL Support	2010-08-10 14:01:26 -07:00
Heiko Carstens	a1b200e27c	mm: provide init_mm mm_context initializer Provide an INIT_MM_CONTEXT intializer macro which can be used to statically initialize mm_struct:mm_context of init_mm. This way we can get rid of code which will do the initialization at run time (on s390). In addition the current code can be found at a place where it is not expected. So let's have a common initializer which architectures can use if needed. This is based on a patch from Suzuki Poulose. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Suzuki Poulose <suzuki@in.ibm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-09 20:44:54 -07:00
Hendrik Brueckner	41b4287677	[S390] cmm, smsgiucv_app: convert sender to uppercase The sender kernel parameter contains a z/VM user ID where alphabetic characters must be specified in uppercase. Allow users to specify lowercase characters and convert the sender string to uppercase at module initialization. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-08-09 18:12:54 +02:00
Heiko Carstens	2e85ba510e	[S390] cmm: add missing __init/__exit annotations Add missing __init and __exit annoations for module init and exit functions. This will save us huge amounts of memory... sort of. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-08-09 18:12:54 +02:00
Heiko Carstens	c2f0e8c803	[S390] appldata/extmem/kvm: add missing GFP_KERNEL flag Add missing GFP flag to memory allocations. The part in cio only changes a comment. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-06-08 18:58:23 +02:00
Heiko Carstens	cf9daf4a73	[S390] cmm: get rid of CMM_PROC config option All distros have this option switched on, so lets get rid of at least one of the tons of config options that are available. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-05-26 23:27:10 +02:00
Heiko Carstens	db705e831a	[S390] cmm: remove superfluous EXPORT_SYMBOLs plus cleanups Remove superfluous EXPORT_SYMBOLS and do coding style cleanup while being at it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-05-26 23:27:10 +02:00
Heiko Carstens	1ef6acf597	[S390] cmm: fix crash on module unload There might be a scheduled cmm_timer if the cmm module gets unloaded. That timer was not deleted during module unload and thus could lead to system crash later on. Besides that reorder function calls in module init and exit code to avoid a couple of other races which could lead to accesses to uninitialized data. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-05-26 23:26:29 +02:00
Heiko Carstens	ab3c68ee5f	[S390] debug: enable exception-trace debug facility The exception-trace facility on x86 and other architectures prints traces to dmesg whenever a user space application crashes. s390 has such a feature since ages however it is called userprocess_debug and is enabled differently. This patch makes sure that whenever one of the two procfs files /proc/sys/kernel/userprocess_debug /proc/sys/debug/exception-trace is modified the contents of the second one changes as well. That way we keep backwards compatibilty but also support the same interface like other architectures do. Besides that the output of the traces is improved since it will now also contain the corresponding filename of the vma (when available) where the process caused a fault or trap. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-05-17 10:00:17 +02:00
Christian Borntraeger	6af7eea2ae	[S390] s390: disable change bit override commit `6a985c6194` ([S390] s390: use change recording override for kernel mapping) deactivated the change bit recording for the kernel mapping to improve the performance. This works most of the time, but there are cases (e.g. kernel runs in home space, futex atomic compare xcmg) where we modify user memory with the kernel mapping instead of the user mapping. Instead of fixing these cases, this patch just deactivates change bit override to avoid future problems with other kernel code that might use the kernel mapping for user memory. CC: stable@kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-04-09 13:43:02 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Michael Holzheu	92fe31329c	[S390] zcore: CPU registers are not saved under LPAR To save the registers for all CPUs a sigp "store status" is done that stores the registers to address absolute zero. To access storage at absolute zero, normally the address of the prefix register of the accessing CPU has to be used. This does not work when large pages are active (currently only under LPAR). In order to fix that problem, instead of memcpy memcpy_real is used, which switches to real mode where prefixing works. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-03-24 11:49:53 +01:00
Hendrik Brueckner	09003ed90a	[S390] smsgiucv: declare char pointers as "const" Declare the smsgiucv prefix char pointer as "const" and use use const char pointers in callback functions. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-03-08 12:26:28 +01:00
Heiko Carstens	22e0a04672	[S390] use kprobes_built_in() in mm/fault code Use kprobes_built_in() to avoid ifdefs like most other architectures do. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:32 +01:00
Heiko Carstens	cbb870c822	[S390] Cleanup struct _lowcore usage and defines. Use asm offsets to make sure the offset defines to struct _lowcore and its layout don't get out of sync. Also add a BUILD_BUG_ON() which checks that the size of the structure is sane. And while being at it change those sites which use odd casts to access the current lowcore. These should use S390_lowcore instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:31 +01:00
Heiko Carstens	d96221ab1e	[S390] free_initmem: reduce code duplication free_initmem() and free_initrd_mem() are nearly identical. So make them call a common function. Also fixes a bug: if the initrd wouldn't start on a page boundary also memory after the initrd would be initialized with the poison value. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:31 +01:00
Heiko Carstens	b8e660b83d	[S390] Replace ENOTSUPP usage with EOPNOTSUPP ENOTSUPP is not supposed to leak to userspace so lets just use EOPNOTSUPP everywhere. Doesn't fix a bug, but makes future reviews easier. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:31 +01:00
Jiri Slaby	a58c26bba9	[S390] use helpers for rlimits Make sure compiler won't do weird things with limits. E.g. fetching them twice may return 2 different values after writable limits are implemented. I.e. either use rlimit helpers added in `3e10e716ab` or ACCESS_ONCE if not applicable. Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: linux-s390@vger.kernel.org Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-01-13 20:44:45 +01:00
Linus Torvalds	67dd2f5a66	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (72 commits) [S390] 3215/3270 console: remove wrong comment [S390] dasd: remove BKL from extended error reporting code [S390] vmlogrdr: remove BKL [S390] vmur: remove BKL [S390] zcrypt: remove BKL [S390] 3270: remove BKL [S390] vmwatchdog: remove lock_kernel() from open() function [S390] monwriter: remove lock_kernel() from open() function [S390] monreader: remove lock_kernel() from open() function [S390] s390: remove unused nfsd #includes [S390] ftrace: build ftrace.o when CONFIG_FTRACE_SYSCALLS is set for s390 [S390] etr/stp: put correct per cpu variable [S390] tty3270: move keyboard compat ioctls [S390] sclp: improve servicability setting [S390] s390: use change recording override for kernel mapping [S390] MAINTAINERS: Add s390 drivers block [S390] use generic sockios.h header file [S390] use generic termbits.h header file [S390] smp: remove unused typedef and defines [S390] cmm: free pages on hibernate. ...	2009-12-09 19:01:47 -08:00
Christian Borntraeger	6a985c6194	[S390] s390: use change recording override for kernel mapping We dont need the dirty bit if a write access is done via the kernel mapping. In that case SetPageDirty and friends are used anyway, no need to do that a second time. We can use the change-recording overide function for the kernel mapping, if available. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:37 +01:00
Martin Schwidefsky	52b169c864	[S390] cmm: free pages on hibernate. The pages allocated by the cmm memory balloon should be freed before the hibernation image is created. Otherwise the memory reserved by the balloon gets written to the swap device but there is no content in these pages that need to be preserved. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:37 +01:00
Gerald Schaefer	6c1e3e7943	[S390] Use do_exception() in pagetable walk usercopy functions. The pagetable walk usercopy functions have used a modified copy of the do_exception() function for fault handling. This lead to inconsistencies with recent changes to do_exception(), e.g. performance counters. This patch changes the pagetable walk usercopy code to call do_exception() directly, eliminating the redundancy. A new parameter is added to do_exception() to specify the fault address. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:34 +01:00
Martin Schwidefsky	1ab947de29	[S390] fault handler access flags check. Simplify the check of the vma->flags in do_exception for the different fault types. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:34 +01:00
Martin Schwidefsky	50d7280d43	[S390] fault handler performance optimization. Slim down the do_exception function to handle only the fast path of a fault and move the exceptional cases into a new function. That slightly increases the performance of the fault handling. Build fix for !CONFIG_COMPAT by Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	7ecb344ae8	[S390] Improve notify_page_fault implementation. notify_page_fault does a preempt_disable/preempt_enable for each fault generated by a kernel access to user space. If kprobes is not active that is unnecessary since the interrupts are not reenabled yet. To play safe repeat the kprobe_running check after preempt_disable(). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	b11b533427	[S390] Improve address space mode selection. Introduce user_mode to replace the two variables switch_amode and s390_noexec. There are three valid combinations of the old values: 1) switch_amode == 0 && s390_noexec == 0 2) switch_amode == 1 && s390_noexec == 0 3) switch_amode == 1 && s390_noexec == 1 They get replaced by 1) user_mode == HOME_SPACE_MODE 2) user_mode == PRIMARY_SPACE_MODE 3) user_mode == SECONDARY_SPACE_MODE The new kernel parameter user_mode=[primary,secondary,home] lets you choose the address space mode the user space processes should use. In addition the CONFIG_S390_SWITCH_AMODE config option is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	61365e132e	[S390] Improve address space check. A data access in access-register mode always is a user mode access, the code to inspect the access-registers can be removed. The second change is to use a different test to check for no-execute fault. The third change is to pass the translation exception identification as parameter, in theory the trans_exc_code in the lowcore could have been overwritten by the time the call to check_space from do_no_context is done. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Eric W. Biederman	b05fd35d91	sysctl s390: Remove dead sysctl binary support Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:05:01 -08:00
Martin Schwidefsky	52a21f2cee	[S390] fix build breakage with CONFIG_AIO=n next-20090925 randconfig build breaks on s390x, with CONFIG_AIO=n. arch/s390/mm/pgtable.c: In function 's390_enable_sie': arch/s390/mm/pgtable.c:282: error: 'struct mm_struct' has no member named 'ioctx_list' arch/s390/mm/pgtable.c:298: error: 'struct mm_struct' has no member named 'ioctx_list' make[1]: *** [arch/s390/mm/pgtable.o] Error 1 Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-10-06 10:35:05 +02:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Linus Torvalds	9fd815b55f	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (22 commits) [S390] Update default configuration. [S390] hibernate: Do real CPU swap at resume time [S390] dasd: tolerate devices that have no feature codes [S390] zcrypt: Do not add/remove devices in s/r callbacks [S390] hibernate: make sure pfn_is_nosave handles lowcore pages [S390] smp: introduce LC_ORDER and simplify lowcore handling [S390] ptrace: use common code for simple peek/poke operations [S390] fix disabled_wait inline assembly clobber list [S390] Change kernel_page_present coding style. [S390] hibernation: reset system after resume [S390] hibernation: fix guest page hinting related crash [S390] Get rid of init_module/delete_module compat functions. [S390] Convert sys_execve to function with parameters. [S390] Convert sys_clone to function with parameters. [S390] qdio: change state of all primed input buffers [S390] qdio: reduce per device debug messages [S390] cio: introduce consistent subchannel scanning [S390] cio: idset use actual number of ssids [S390] cio: dont kfree vmalloced memory [S390] cio: introduce css_settle ...	2009-09-23 10:02:14 -07:00
Heiko Carstens	87458ff458	[S390] Change kernel_page_present coding style. Make the inline assembly look like all others. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-22 22:58:44 +02:00
Heiko Carstens	846955c8af	[S390] hibernation: fix guest page hinting related crash On resume the system that loads the to be resumed image might have unstable pages. When the resume image is copied back and a write access happen to an unstable page this causes an exception and the system crashes. To fix this set all free pages to stable before copying the resumed image data. Also after everything has been restored set all free pages of the resumed system to unstable again. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-22 22:58:44 +02:00
Geert Uytterhoeven	cc013a8890	arches: drop superfluous casts in nr_free_pages() callers Commit `9617729941` ("Drop free_pages()") modified nr_free_pages() to return 'unsigned long' instead of 'unsigned int'. This made the casts to 'unsigned long' in most callers superfluous, so remove them. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <zankel@tensilica.com> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:34 -07:00
Ingo Molnar	cdd6c482c9	perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N \| sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-21 14:28:04 +02:00
Heiko Carstens	bde69af2ab	[S390] Wire up page fault events for software perf counters. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:56 +02:00
Heiko Carstens	2ddddf3e0a	[S390] Enable guest page hinting by default. Get rid of the PAGE_STATES config option and enable guest page hinting by default. It can be disabled by specifying "cmma=off" at the command line. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:54 +02:00
Martin Schwidefsky	50aa98bad0	[S390] fix recursive locking on page_table_lock Suzuki Poulose reported the following recursive locking bug on s390: Here is the stack trace : (see Appendix I for more info) [<0000000000406ed6>] _spin_lock+0x52/0x94 [<0000000000103bde>] crst_table_free+0x14e/0x1a4 [<00000000001ba684>] __pmd_alloc+0x114/0x1ec [<00000000001be8d0>] handle_mm_fault+0x2cc/0xb80 [<0000000000407d62>] do_dat_exception+0x2b6/0x3a0 [<0000000000114f8c>] sysc_return+0x0/0x8 [<00000200001642b2>] 0x200001642b2 The page_table_lock is already acquired in __pmd_alloc (mm/memory.c) and it tries to populate the pud/pgd with a new pmd allocated. If another thread populates it before we get a chance, we free the pmd using pmd_free(). On s390x, pmd_free(even pud_free ) is #defined to crst_table_free(), which acquires the page_table_lock to protect the crst_table index updates. Hence this ends up in a recursive locking of the page_table_lock. The solution suggested by Dave Hansen is to use a new spin lock in the mmu context to protect the access to the crst_list and the pgtable_list. Reported-by: Suzuki Poulose <suzuki@in.ibm.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:53 +02:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
Linus Torvalds	d06063cc22	Move FAULT_FLAG_xyz into handle_mm_fault() callers This allows the callers to now pass down the full set of FAULT_FLAG_xyz flags to handle_mm_fault(). All callers have been (mechanically) converted to the new calling convention, there's almost certainly room for architectures to clean up their code and then add FAULT_FLAG_RETRY when that support is added. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-21 13:08:22 -07:00
Hans-Joachim Picht	7db11a363f	[S390] pm: add kernel_page_present Fix the following build failure caused by make allyesconfig using CONFIG_HIBERNATION and CONFIG_DEBUG_PAGEALLOC kernel/built-in.o: In function `saveable_page': kernel/power/snapshot.c:897: undefined reference to `kernel_page_present' kernel/built-in.o: In function `safe_copy_page': kernel/power/snapshot.c:948: undefined reference to `kernel_page_present' make: *** [.tmp_vmlinux1] Error 1 Signed-off-by: Hans-Joachim Picht <hans@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-16 10:31:11 +02:00
Heiko Carstens	88df125fd6	[S390] maccess: arch specific probe_kernel_write() implementation Add an s390 specific probe_kernel_write() function which allows to write to the kernel text segment even if write protection is enabled. This is implemented using the lra (load real address) and stura (store using real address) instructions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-12 10:27:37 +02:00
Heiko Carstens	239a64255f	[S390] vmalloc: add vmalloc kernel parameter support With the kernel parameter 'vmalloc=<size>' the size of the vmalloc area can be specified. This can be used to increase or decrease the size of the area. Works in the same way as on some other architectures. This can be useful for features which make excessive use of vmalloc and wouldn't work otherwise. The default sizes remain unchanged: 96MB for 31 bit kernels and 1GB for 64 bit kernels. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-12 10:27:33 +02:00
Heiko Carstens	7757591ab4	[S390] implement is_compat_task Implement is_compat_task and use it all over the place. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-12 10:27:30 +02:00
Rusty Russell	005f8eee6f	[S390] cpumask: use mm_cpumask() wrapper Makes code futureproof against the impending change to mm->cpu_vm_mask. It's also a chance to use the new cpumask_ ops which take a pointer (the older ones are deprecated, but there's no hurry for arch code). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:34 +01:00
Heiko Carstens	1485c5c884	[S390] move EXPORT_SYMBOLs to definitions Move all EXPORT_SYMBOLs to their corresponding definitions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:11 +01:00
Carsten Otte	702d9e584f	[S390] check addressing mode in s390_enable_sie The sie instruction requires address spaces to be switched to run proper. This patch verifies that this is the case in s390_enable_sie, otherwise the kernel would crash badly as soon as the process runs into sie. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:09 +01:00
Heiko Carstens	59fa4392dd	[S390] page fault: invoke oom-killer s390 arch backend for `1c0fe6e3bd` "mm: invoke oom-killer from page fault". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:03 +01:00
Martin Schwidefsky	0fb1d9bcbc	[S390] make page table upgrade work again After TASK_SIZE now gives the current size of the address space the upgrade of a 64 bit process from 3 to 4 levels of page table needs to use the arch_mmap_check hook to catch large mmap lengths. The get_unmapped_area* functions need to check for -ENOMEM from the arch_get_unmapped_area*, upgrade the page table and retry. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-18 13:28:13 +01:00
Martin Schwidefsky	f481bfafd3	[S390] make page table walking more robust Make page table walking on s390 more robust. The current code requires that the pgd/pud/pmd/pte loop is only done for address ranges that are below the end address of the last vma of the address space. But this is not always true, e.g. the generic page table walker does not guarantee this. Change TASK_SIZE/TASK_SIZE_OF to reflect the current size of the address space. This makes the generic page table walker happy but it breaks the upgrade of a 3 level page table to a 4 level page table. To make the upgrade work again another fix is required. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-18 13:28:13 +01:00
Gary Hade	c04fc586c1	mm: show node to memory section relationship with symlinks in sysfs Show node to memory section relationship with symlinks in sysfs Add /sys/devices/system/node/nodeX/memoryY symlinks for all the memory sections located on nodeX. For example: /sys/devices/system/node/node1/memory135 -> ../../memory/memory135 indicates that memory section 135 resides on node1. Also revises documentation to cover this change as well as updating Documentation/ABI/testing/sysfs-devices-memory to include descriptions of memory hotremove files 'phys_device', 'phys_index', and 'state' that were previously not described there. In addition to it always being a good policy to provide users with the maximum possible amount of physical location information for resources that can be hot-added and/or hot-removed, the following are some (but likely not all) of the user benefits provided by this change. Immediate: - Provides information needed to determine the specific node on which a defective DIMM is located. This will reduce system downtime when the node or defective DIMM is swapped out. - Prevents unintended onlining of a memory section that was previously offlined due to a defective DIMM. This could happen during node hot-add when the user or node hot-add assist script onlines _all_ offlined sections due to user or script inability to identify the specific memory sections located on the hot-added node. The consequences of reintroducing the defective memory could be ugly. - Provides information needed to vary the amount and distribution of memory on specific nodes for testing or debugging purposes. Future: - Will provide information needed to identify the memory sections that need to be offlined prior to physical removal of a specific node. Symlink creation during boot was tested on 2-node x86_64, 2-node ppc64, and 2-node ia64 systems. Symlink creation during physical memory hot-add tested on a 2-node x86_64 system. Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-06 15:59:00 -08:00
Jens Axboe	abf137dd77	aio: make the lookup_ioctx() lockless The mm->ioctx_list is currently protected by a reader-writer lock, so we always grab that lock on the read side for doing ioctx lookups. As the workload is extremely reader biased, turn this into an rcu hlist so we can make lookup_ioctx() lockless. Get rid of the rwlock and use a spinlock for providing update side exclusion. There's usually only 1 entry on this list, so it doesn't make sense to look into fancier data structures. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Hongjie Yang	93098bf015	[S390] convert dcssblk and extmem printks messages to pr_xxx macros. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-12-25 13:39:23 +01:00
Christian Borntraeger	250cf776f7	[S390] pgtables: Fix race in enable_sie vs. page table ops The current enable_sie code sets the mm->context.pgstes bit to tell dup_mm that the new mm should have extended page tables. This bit is also used by the s390 specific page table primitives to decide about the page table layout - which means context.pgstes has two meanings. This can cause any kind of bugs. For example - e.g. shrink_zone can call ptep_clear_flush_young while enable_sie is running. ptep_clear_flush_young will test for context.pgstes. Since enable_sie changed that value of the old struct mm without changing the page table layout ptep_clear_flush_young will do the wrong thing. The solution is to split pgstes into two bits - one for the allocation - one for the current state Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-10-28 11:12:03 +01:00
Badari Pulavarty	71088785c6	mm: cleanup to make remove_memory() arch-neutral There is nothing architecture specific about remove_memory(). remove_memory() function is common for all architectures which support hotplug memory remove. Instead of duplicating it in every architecture, collapse them into arch neutral function. [akpm@linux-foundation.org: fix the export] Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Gary Hade <garyhade@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 08:50:25 -07:00
Hongjie Yang	b2300b9efe	[S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support. The DCSS block device driver is modified to add >2G DCSSs support and allow a DCSS block device to map to a set of contiguous DCSSs. The extmem code is also modified to use new Diagnose x'64' subcodes for >2G DCSSs. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-10-10 21:33:57 +02:00
Gerald Schaefer	7e9238fbc1	[S390] Add support for memory hot-remove. This patch enables memory hot-remove on s390. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-08-01 16:39:33 +02:00
Johannes Weiner	c55281dee0	s390: use generic show_mem() Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:11 -07:00
Andi Kleen	ceb8687961	hugetlb: introduce pud_huge Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:18 -07:00
Andi Kleen	a551643895	hugetlb: modular state for hugetlb page size The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 10:47:17 -07:00
Heiko Carstens	421c175c4d	[S390] Add support for memory hot-add. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-07-14 10:02:16 +02:00
Linus Torvalds	a4df1ac12d	Merge branch 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: MMU: Fix is_empty_shadow_page() check KVM: MMU: Fix printk() format string KVM: IOAPIC: only set remote_irr if interrupt was injected KVM: MMU: reschedule during shadow teardown KVM: VMX: Clear CR4.VMXE in hardware_disable KVM: migrate PIT timer KVM: ppc: Report bad GFNs KVM: ppc: Use a read lock around MMU operations, and release it on error KVM: ppc: Remove unmatched kunmap() call KVM: ppc: add lwzx/stwz emulation KVM: ppc: Remove duplicate function KVM: s390: Fix race condition in kvm_s390_handle_wait KVM: s390: Send program check on access error KVM: s390: fix interrupt delivery KVM: s390: handle machine checks when guest is running KVM: s390: fix locking order problem in enable_sie KVM: s390: use yield instead of schedule to implement diag 0x44 KVM: x86 emulator: fix hypercall return value on AMD KVM: ia64: fix zero extending for mmio ld1/2/4 emulation in KVM	2008-06-11 10:35:44 -07:00
Heiko Carstens	ee0ddadd08	[S390] vmemmap: fix off-by-one bug. If a memory range is supposed to be added to the 1:1 mapping and it ends just below the maximum supported physical address it won't succeed. This is because a test doesn't consider that the end address is 1 smaller than start + size. Fix the comparison. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-06-10 10:03:27 +02:00
Christian Borntraeger	74b6b522ec	KVM: s390: fix locking order problem in enable_sie There are potential locking problem in enable_sie. We take the task_lock and the mmap_sem. As exit_mm uses the same locks vice versa, this triggers a lockdep warning. The second problem is that dup_mm and mmput might sleep, so we must not hold the task_lock at that moment. The solution is to dup the mm unconditional and use the task_lock before and afterwards to check if we can use the new mm. dup_mm and mmput are called outside the task_lock, but we run update_mm while holding the task_lock, protection us against ptrace. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-06-06 21:08:26 +03:00
Heiko Carstens	c1bb7f31ea	[S390] showmem: Only walk spanned pages. Convert show_mem() so its nearly the same as on x86/powerpc. Gives us proper locking and we get also rid of the only use of max_mapnr. Also the number of pages was contained in an int which might not be sufficient not too far in the future. Cc: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-30 10:03:34 +02:00
Heiko Carstens	67060d9c1f	[S390] Fix section mismatch warnings. This fixes the last remaining section mismatch warnings in s390 architecture code. It reveals also a real bug introduced by... me with git commit `2069e978d5` ("[S390] sparsemem vmemmap: initialize memmap.") Calling the generic vmemmap_alloc_block() function to get initialized memory is a nice idea, however that function is __meminit annotated and therefore the function might be gone if we try to call it later. This can happen if a DCSS segment gets added. So basically revert the patch and clear the memmap explicitly to fix the original bug. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-30 10:03:34 +02:00
Heiko Carstens	2069e978d5	[S390] sparsemem vmemmap: initialize memmap. Let's just use the generic vmmemmap_alloc_block() function which always returns initialized memory. Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-15 16:52:38 +02:00
Martin Schwidefsky	45e576b1c3	[S390] guest page hinting light Use the existing arch_alloc_page/arch_free_page callbacks to do the guest page state transitions between stable and unused. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-05-07 09:23:02 +02:00
Heiko Carstens	17f3458085	[S390] Convert to SPARSEMEM & SPARSEMEM_VMEMMAP Convert s390 to SPARSEMEM and SPARSEMEM_VMEMMAP. We do a select of SPARSEMEM_VMEMMAP since it is configurable. This is because SPARSEMEM without SPARSEMEM_VMEMMAP gives us a hell of broken include dependencies that I don't want to fix. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:48 +02:00
Gerald Schaefer	53492b1de4	[S390] System z large page support. This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:47 +02:00
Heiko Carstens	8fc6365868	[S390] vmemmap: use clear_table to initialise page tables. Always use clear_table to initialise page tables. The overlapping memcpy is just a leftover of a previous version that wasn't fully converted to clear_table. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:46 +02:00
Carsten Otte	402b08622d	s390: KVM preparation: provide hook to enable pgstes in user pagetable The SIE instruction on s390 uses the 2nd half of the page table page to virtualize the storage keys of a guest. This patch offers the s390_enable_sie function, which reorganizes the page tables of a single-threaded process to reserve space in the page table: s390_enable_sie makes sure that the process is single threaded and then uses dup_mm to create a new mm with reorganized page tables. The old mm is freed and the process has now a page status extended field after every page table. Code that wants to exploit pgstes should SELECT CONFIG_PGSTE. This patch has a small common code hit, namely making dup_mm non-static. Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's review feedback. Now we do have the prototype for dup_mm in include/linux/sched.h. Following Martin's suggestion, s390_enable_sie() does now call task_lock() to prevent race against ptrace modification of mm_users. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 12:00:40 +03:00
Martin Schwidefsky	ca68305bf3	[S390] Remove code duplication from monreader / dcssblk. Move the function that prints the segment warning messages found in the monreader driver and the dcssblk driver to the extmem base code. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:07 +02:00
Heiko Carstens	a806170e29	[S390] Fix a lot of sparse warnings. Most noteable part of this commit is the new local header file entry.h which contains all the function declarations of functions that get only called from asm code or are arch internal. That way we can avoid extern declarations in C files. This is more or less the same that was done for sparc64. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:06 +02:00
Johannes Weiner	1e42f32785	[S390] remove redundant display of free swap space in show_mem() Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:04 +02:00
Heiko Carstens	c2e8b8531b	[S390] exec_protect: Fix incorrect extern declarations. sys_sigreturn and sys_rt_sigreturn don't take any arguments. So luckily this resulted only in unneeded instead of incorrect code. But still this clearly shows why one should not put extern declarations in C files (will be fixed with a larger sparse patch). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:46:59 +02:00
Martin Schwidefsky	6252d702c5	[S390] dynamic page tables. Add support for different number of page table levels dependent on the highest address used for a process. This will cause a 31 bit process to use a two level page table instead of the four level page table that is the default after the pud has been introduced. Likewise a normal 64 bit process will use three levels instead of four. Only if a process runs out of the 4 tera bytes which can be addressed with a three level page table the fourth level is dynamically added. Then the process can use up to 8 peta byte. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:41 +01:00
Martin Schwidefsky	5a216a2083	[S390] Add four level page tables for CONFIG_64BIT=y. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:40 +01:00
Martin Schwidefsky	146e4b3c8b	[S390] 1K/2K page table pages. This patch implements 1K/2K page table pages for s390. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:40 +01:00
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
Heiko Carstens	0189103c69	[S390] Remove BUILD_BUG_ON() in vmem code. Remove BUILD_BUG_ON() in vmem code since it causes build failures if the size of struct page increases. Instead calculate at compile time the address of the highest physical address that can be added to the 1:1 mapping. This supposed to fix a build failure with the page owner tracking leak detector patches as reported by akpm. page-owner-tracking-leak-detector-broken-on-s390.patch can be removed from -mm again when this is merged. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:51:01 +01:00
Heiko Carstens	2bc89b5ece	[S390] Fix couple of section mismatches. Fix couple of section mismatches. And since we touch the code anyway change the IPL code to use C99 initializers. Cc: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:56 +01:00
Heiko Carstens	2485579bf5	[S390] DEBUG_PAGEALLOC support for s390. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:54 +01:00
Christian Borntraeger	a2fd64d6aa	[S390] vmemmap: allocate struct pages before 1:1 mapping We have seen an oops in an OOM situation, where show_mem tried to access the struct page of a dcss segment. The vmemmap code has already created the 1:1 mapping but failed allocating the struct pages. In the OOM case, show_mem now walks the memory. It uses pfn_valid to detect if it may access the struct page. In the case described above, the mapping was established and pfn_valid returned true. As the struct pages were not allocated, the kernel oopsed. We have to ensure that we have created the struct pages, before we add a mapping pointing to the pages. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:23 +01:00
Denis Cheng	c11ca97ee9	[S390] use LIST_HEAD instead of LIST_HEAD_INIT single list_head variable initialized with LIST_HEAD_INIT could almost always can be replaced with LIST_HEAD declaration, this shrinks the code and looks better. Signed-off-by: Denis Cheng <crquan@gmail.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:21 +01:00
Heiko Carstens	9f4b0ba81f	[S390] Get rid of HOLES_IN_ZONE requirement. Align everything to MAX_ORDER so we can get rid of the extra checks. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:13 +01:00
Christian Borntraeger	5fd9c6e214	[S390] Change vmalloc defintions Currently the vmalloc area starts at a dynamic address depending on the memory size. There was also an 8MB security hole after the physical memory to catch out-of-bounds accesses. We can simplify the code by putting the vmalloc area explicitely at the top of the kernel mapping and setting the vmalloc size to a fixed value of 128MB/128GB for 31bit/64bit systems. Part of the vmalloc area will be used for the vmem_map. This leaves an area of 96MB/1GB for normal vmalloc allocations. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:12 +01:00
Heiko Carstens	43ebbf119a	[S390] cmm: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Martin Schwidefsky	190a1d722a	[S390] 4level-fixup cleanup Get independent from asm-generic/4level-fixup.h Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Martin Schwidefsky	3610cce87a	[S390] Cleanup page table definitions. - De-confuse the defines for the address-space-control-elements and the segment/region table entries. - Create out of line functions for page table allocation / freeing. - Simplify get_shadow_xxx functions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Serge E. Hallyn	b460cbc581	pid namespaces: define is_global_init() and is_container_init() is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
David Rientjes	5a3135c2e7	oom: move prototypes to appropriate header file Move the OOM killer's extern function prototypes to include/linux/oom.h and include it where necessary. [clg@fr.ibm.com: build fix] Cc: Andrea Arcangeli <andrea@suse.de> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Will Schmidt	dcca2bde4f	During VM oom condition, kill all threads in process group We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:52 -07:00
Heiko Carstens	c41fbc6965	[S390] pfault: Fix alignment of parameter list. Make sure parameter list of the pfault token function is eight byte aligned. Otherwise we can get specification exceptions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-12 16:13:11 +02:00
Michael Holzheu	0a87c5cfc0	[S390] vmur: fix diag14 exceptions with addresses > 2GB. There are several s390 diagnose calls, which must be executed below the 2GB memory boundary. In order to enforce this, those diagnoses must be compiled into the kernel. Currently diag 14 can be called within the vmur kernel module from addresses above 2GB. This leads to specification exceptions. This patch moves diag10, diag14 and diag210 into the new diag.c file. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2007-08-22 13:51:47 +02:00
Heiko Carstens	e62133b4ea	[S390] Get rid of new section mismatch warnings. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-07-27 12:29:18 +02:00
Nick Piggin	83c54070ee	mm: fault feedback #2 This patch completes Linus's wish that the fault return codes be made into bit flags, which I agree makes everything nicer. This requires requires all handle_mm_fault callers to be modified (possibly the modifications should go further and do things like fault accounting in handle_mm_fault -- however that would be for another patch). [akpm@linux-foundation.org: fix alpha build] [akpm@linux-foundation.org: fix s390 build] [akpm@linux-foundation.org: fix sparc build] [akpm@linux-foundation.org: fix sparc64 build] [akpm@linux-foundation.org: fix ia64 build] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Chris Zankel <chris@zankel.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Still apparently needs some ARM and PPC loving - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:41 -07:00
Heiko Carstens	be2864b5ee	[S390] More verbose show_mem() like other architectures. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-05-21 11:25:29 +02:00
Heiko Carstens	490f03d659	[S390] Avoid compile warning. arch/s390/mm/fault.c: In function `signal_return': arch/s390/mm/fault.c:256: warning: unused variable `compat' Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2007-05-10 15:45:53 +02:00
Christoph Hellwig	1eeb66a1bb	move die notifier handling to common code This patch moves the die notifier handling to common code. Previous various architectures had exactly the same code for it. Note that the new code is compiled unconditionally, this should be understood as an appel to the other architecture maintainer to implement support for it aswell (aka sprinkling a notify_die or two in the proper place) arm had a notifiy_die that did something totally different, I renamed it to arm_notify_die as part of the patch and made it static to the file it's declared and used at. avr32 used to pass slightly less information through this interface and I brought it into line with the other architectures. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix vmalloc_sync_all bustage] [bryan.wu@analog.com: fix vmalloc_sync_all in nommu] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: <linux-arch@vger.kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bryan Wu <bryan.wu@analog.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-08 11:15:04 -07:00
Christoph Hellwig	33464e3b57	[S390] get rid of kprobes notifier call chain. And here's a port of the powerpc patch to get rid of the notifier chain completely to s390. It's ontop of Martins patch as that one is in mainline already. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-05-04 18:48:24 +02:00
Martin Schwidefsky	be5ec363e9	[S390] No execute support cleanup. Simplify the signal_return function that checks for the two special system calls sigreturn and rt_sigreturn. No need to do a page table walk, a call to copy_from_user while disabled page faults will work as well. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2007-04-27 16:01:43 +02:00
Martin Schwidefsky	10c1031f70	[S390] Minor fault path optimization. The minor fault path has grown a lot in terms of cycles. In particular the kprobes hook is very costly. Optimize the path to save a couple of cycles. If kprobes is enabled more than 300 cycles can be avoided if kprobes_running() is false. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2007-04-27 16:01:43 +02:00
Gerald Schaefer	482b05dd53	[S390] Fixed handling of access register mode faults. Replaced check_user_space() + __check_access_register with the new check_space(). The old functions made wrong assumptions about kernel and user space when the kernel and user address spaces are switched (kernel in home space, user in primary/secondary space). Secondly the user process can switch to the accress register mode if it is running in primary or secondary mode. In addition it can load an arbitrary value to the access registers. If any other value than 0 for primary space or 1 for secondary space is loaded and memory is accessed using the base register related to the access register, the program should be terminated with a SIGSEGV. To achieve that the DUALD pointer in the DUCT and the PSALD pointer in the PASTE need to point to an array of 8 invalid access-list entries to get a ALEN-translation exception if an invalid alet is used. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-03-05 23:35:54 +01:00
Heiko Carstens	118bcd31b3	[S390] Optional ZONE_DMA for s390. Disable ZONE_DMA on 31-bit. All memory is addressable by all devices and we do not need any special memory pool. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-21 10:55:12 +01:00
Eric W. Biederman	0b4d414714	[PATCH] sysctl: remove insert_at_head from register_sysctl The semantic effect of insert_at_head is that it would allow new registered sysctl entries to override existing sysctl entries of the same name. Which is pain for caching and the proc interface never implemented. I have done an audit and discovered that none of the current users of register_sysctl care as (excpet for directories) they do not register duplicate sysctl entries. So this patch simply removes the support for overriding existing entries in the sys_sysctl interface since no one uses it or cares and it makes future enhancments harder. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Corey Minyard <minyard@acm.org> Cc: Neil Brown <neilb@suse.de> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Jan Kara <jack@ucw.cz> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:59 -08:00
Eric W. Biederman	481f7337a1	[PATCH] sysctl: s390: remove unnecessary use of insert_at_head Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:57 -08:00
Eric W. Biederman	feceb63ec5	[PATCH] sysctl: s390: move sysctl definitions to sysctl.h We need to have the the definition of all top level sysctl directories registers in sysctl.h so we don't conflict by accident and cause abi problems. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:57 -08:00
Al Viro	23db764d3d	[PATCH] Switch s390 to NO_IOMEM Martin Schwidefsky wrote: "s390 does not even need (in\|out)b(_p\|). I wondered what else from io.h do we not need. The answer is: almost nothing. With the devres patch from Al and the dma-mapping patch from Heiko we can get rid of iomem and all associated definitions." So we'll just need to replace NO_IOPORT with NO_IOMEM in Kconfig and kill arch/s390/mm/ioremap.c. BTW, there's an annoying bit of junk in there - IO_SPACE_LIMIT. We only need it for /proc/ioports, which AFAICS shouldn't even be there on s390 (or uml). OTOH, removing that thing would mean a user-visible change - we go from "empty file in /proc" to "no such file in /proc"... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 11:18:07 -08:00
Kirill Korotaev	cefc8be824	[PATCH] Consolidate bust_spinlocks() Part of long forgotten patch http://groups.google.com/group/fa.linux.kernel/msg/e98e941ce1cf29f6?dmode=source Since then, m32r grabbed two copies. Leave s390 copy because of important absence of CONFIG_VT, but remove references to non-existent timerlist_lock. ia64 also loses timerlist_lock. Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andi Kleen <ak@muc.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-11 10:51:34 -08:00
Heiko Carstens	4d284cac76	[S390] Avoid excessive inlining. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:18:53 +01:00
Heiko Carstens	162e006ef5	[S390] Mark kernel text section read-only. Set read-only flag in the page table entries for the kernel image text section. This will catch all instruction caused corruptions withing the text section. Instruction replacement via kprobes still works, since it bypasses now dynamic address translation. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:18:41 +01:00
Gerald Schaefer	c1821c2e97	[S390] noexec protection This provides a noexec protection on s390 hardware. Our hardware does not have any bits left in the pte for a hw noexec bit, so this is a different approach using shadow page tables and a special addressing mode that allows separate address spaces for code and data. As a special feature of our "secondary-space" addressing mode, separate page tables can be specified for the translation of data addresses (storage operands) and instruction addresses. The shadow page table is used for the instruction addresses and the standard page table for the data addresses. The shadow page table is linked to the standard page table by a pointer in page->lru.next of the struct page corresponding to the page that contains the standard page table (since page->private is not really private with the pte_lock and the page table pages are not in the LRU list). Depending on the software bits of a pte, it is either inserted into both page tables or just into the standard (data) page table. Pages of a vma that does not have the VM_EXEC bit set get mapped only in the data address space. Any try to execute code on such a page will cause a page translation exception. The standard reaction to this is a SIGSEGV with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the kernel to the signal stack frame. Unfortunately, the signal return mechanism cannot be modified to use an SA_RESTORER because the exception unwinding code depends on the system call opcode stored behind the signal stack frame. This feature requires that user space is executed in secondary-space mode and the kernel in home-space mode, which means that the addressing modes need to be switched and that the noexec protection only works for user space. After switching the addressing modes, we cannot use the mvcp/mvcs instructions anymore to copy between kernel and user space. A new mvcos instruction has been added to the z9 EC/BC hardware which allows to copy between arbitrary address spaces, but on older hardware the page tables need to be walked manually. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:18:17 +01:00
Gerald Schaefer	444f0e5489	[S390] Show loaded DCSS segments under /proc/iomem. Currently loaded DCSS segments are now listed in /proc/iomem with their name followed by a trailing "(DCSS)". Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:17:11 +01:00
Heiko Carstens	60383201c2	[S390] Remove pointless/unreliable kernel messages. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:16:52 +01:00
Heiko Carstens	2b67fc4606	[S390] Get rid of a lot of sparse warnings. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-02-05 21:16:47 +01:00

1 2 3 4 5 ...

292 Commits