linux_old1

Commit Graph

Author	SHA1	Message	Date
Gleb Natapov	78646121e9	KVM: Fix interrupt unhalting a vcpu when it shouldn't kvm_vcpu_block() unhalts vcpu on an interrupt/timer without checking if interrupt window is actually opened. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:33 +03:00
Sheng Yang	e56d532f20	KVM: Device assignment framework rework After discussion with Marcelo, we decided to rework the device assignment framework together. The old problem is that the kernel logic is unnecessarily complex, so Marcelo suggested splitting it in a more elegant way: 1. Split host IRQ assign and guest IRQ assign, and let userspace determine the combination. Also discard the msi2intx parameter; userspace can specify KVM_DEV_IRQ_HOST_MSI | KVM_DEV_IRQ_GUEST_INTX in assigned_irq->flags to enable MSI to INTx conversion. 2. Split assign IRQ and deassign IRQ. Import two new ioctls: KVM_ASSIGN_DEV_IRQ and KVM_DEASSIGN_DEV_IRQ. This patch also fixes the reversed _IOR vs _IOW in the definition (by deprecating the old interface). [avi: replace homemade bitcount() by hweight_long()] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:29 +03:00
Gleb Natapov	58c2dde17d	KVM: APIC: get rid of deliver_bitmask Deliver interrupt during destination matching loop. Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-06-10 11:48:27 +03:00
Gleb Natapov	343f94fe4d	KVM: consolidate ioapic/ipi interrupt delivery logic Use kvm_apic_match_dest() in kvm_get_intr_delivery_bitmask() instead of duplicating the same code. Use kvm_get_intr_delivery_bitmask() in apic_send_ipi() to figure out ipi destination instead of reimplementing the logic. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-06-10 11:48:27 +03:00
Gleb Natapov	a53c17d21c	KVM: ioapic/msi interrupt delivery consolidation ioapic_deliver() and kvm_set_msi() have code duplication. Move the code into ioapic_deliver_entry() function and call it from both places. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2009-06-10 11:48:27 +03:00
Christian Borntraeger	b95b51d580	KVM: declare ioapic functions only on affected hardware Since "KVM: Unify the delivery of IOAPIC and MSI interrupts" I get the following warnings: CC [M] arch/s390/kvm/kvm-s390.o In file included from arch/s390/kvm/kvm-s390.c:22: include/linux/kvm_host.h:357: warning: 'struct kvm_ioapic' declared inside parameter list include/linux/kvm_host.h:357: warning: its scope is only this definition or declaration, which is probably not what you want This patch limits IOAPIC functions for architectures that have one. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:24 +03:00
Sheng Yang	d510d6cc65	KVM: Enable MSI-X for KVM assigned device This patch finally enables MSI-X. What we need for MSI-X: 1. Intercept one page in MMIO region of device. So that we can get guest desired MSI-X table and set up the real one. Now this has been done by guest, and transferred to kernel using ioctl KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY. 2. Information for incoming interrupt. Now one device can have more than one interrupt, and they are all handled by one workqueue structure. So we need to identify them. The previous patch enabling gsi_msg_pending_bitmap gets this done. 3. Mapping from host IRQ to guest gsi as well as guest gsi to real MSI/MSI-X message address/data. We used the same entry number for the host and guest here, so that it's easy to find the correlated guest gsi. What we lack for now: 1. The PCI spec says nothing can exist with the MSI-X table in the same page of MMIO region, except pending bits. The patch ignores pending bits as the first step (so they are always 0 - no pending). 2. The PCI spec allows changing the MSI-X table dynamically. That means, the OS can enable MSI-X, then mask one MSI-X entry, modify it, and unmask it. The patch doesn't support this, and Linux also doesn't work in this way. 3. The patch doesn't implement MSI-X mask all and mask single entry. I would implement the former in driver/pci/msi.c later. And for single entry, userspace should have the responsibility to handle it. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:23 +03:00
Sheng Yang	2350bd1f62	KVM: Add MSI-X interrupt injection logic We have to handle more than one interrupt with one handler for MSI-X. Avi suggested to use a flag to indicate the pending. So here is it. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:23 +03:00
Sheng Yang	c1e0151429	KVM: Ioctls for init MSI-X entry Introduce two ioctls, KVM_SET_MSIX_NR and KVM_SET_MSIX_ENTRY. These two ioctls are used by userspace to specify the guest device MSI-X entry number and correlate each MSI-X entry with a GSI during the initialization stage. MSI-X should be well initialized before enabling. Changing the MSI-X entry number is not supported for now. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:23 +03:00
Sheng Yang	116191b69b	KVM: Unify the delivery of IOAPIC and MSI interrupts Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:22 +03:00
Sheng Yang	cf9e4e15e8	KVM: Split IOAPIC structure Prepared for reuse ioapic_redir_entry for MSI. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-06-10 11:48:21 +03:00
Takashi Iwai	03cece06c4	Merge branch 'topic/lx6464es' into for-linus * topic/lx6464es: ALSA: Add missing description of lx6464es to ALSA-Configuration.txt ALSA: lx6464es - Disable lx_message_send() ALSA: lx6464es - Use snd_card_create() ALSA: lx6464es - driver for the digigram lx6464es interface	2009-06-10 07:26:34 +02:00
Takashi Iwai	e618a5609e	Merge branch 'topic/ctxfi' into for-linus * topic/ctxfi: (35 commits) ALSA: ctxfi - Clear PCM resources at hw_params and hw_free ALSA: ctxfi - Check the presence of SRC instance in PCM pointer callbacks ALSA: ctxfi - Add missing start check in atc_pcm_playback_start() ALSA: ctxfi - Add use_system_timer module option ALSA: ctxfi - Fix wrong model id for UAA ALSA: ctxfi - Clean up probe routines ALSA: ctxfi - Fix / clean up hw20k2 chip code ALSA: ctxfi - Fix possible buffer pointer overrun ALSA: ctxfi - Remove useless initializations and cast ALSA: ctxfi - Fix DMA mask for emu20k2 chip ALSA: ctxfi - Make volume controls more intuitive ALSA: ctxfi - Optimize the native timer handling using wc counter ALSA: ctxfi - Add missing inclusion of linux/math64.h ALSA: ctxfi - Set device 0 for mixer control elements ALSA: ctxfi - Clean up / optimize ALSA: ctxfi - Set periods_min to 2 ALSA: ctxfi - Use native timer interrupt on emu20k1 ALSA: ctxfi - Fix previous fix for 64bit DMA ALSA: ctxfi - Fix endian-dependent codes ALSA: ctxfi - Allow 64bit DMA ...	2009-06-10 07:26:27 +02:00
Jan Beulich	fd6c3a8dc4	initconst adjustments - add .init.rodata to INIT_DATA, and group all initconst flavors together - move strings generated from __setup_param() into .init.rodata - add .*init.rodata to modpost's sets of init sections - make modpost warn about references between meminit and cpuinit as well as memexit and cpuexit sections (as CPU and memory hotplug are independently selectable features) Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-06-09 22:37:43 +02:00
Steven Rostedt	725c624a58	tracing: add trace_seq_vprint interface The code to update the print formats for events requires a vprintf format in the trace_seq. This patch adds that interface. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-06-09 15:17:32 -04:00
Li Zefan	55782138e4	tracing/events: convert block trace points to TRACE_EVENT() TRACE_EVENT is a more generic way to define tracepoints. Doing so adds these new capabilities to this tracepoint: - zero-copy and per-cpu splice() tracing - binary tracing without printf overhead - structured logging records exposed under /debug/tracing/events - trace events embedded in function tracer output and other plugins - user-defined, per tracepoint filter expressions ... Cons: - no dev_t info for the output of plug, unplug_timer and unplug_io events. no dev_t info for getrq and sleeprq events if bio == NULL. no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL. This is mainly because we can't get the device from a request queue. But this may change in the future. - A packet command is converted to a string in TP_assign, not TP_print, while blktrace does the conversion just before output. Since pc requests should be rather rare, this is not a big issue. - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT has a unique format, which means we have some unused data in a trace entry. The overhead is minimized by using __dynamic_array() instead of __array(). I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing (columns: dd; dd + ioctl blktrace; dd + TRACE_EVENT (splice)): 1: 7.36s, 42.7 MB/s; 7.50s, 42.0 MB/s; 7.41s, 42.5 MB/s. 2: 7.43s, 42.3 MB/s; 7.48s, 42.1 MB/s; 7.43s, 42.4 MB/s. 3: 7.38s, 42.6 MB/s; 7.45s, 42.2 MB/s; 7.41s, 42.5 MB/s. So the overhead of tracing is very small, and there is no regression when using those trace events vs blktrace.
And the binary output of TRACE_EVENT is much smaller than blktrace: # ls -l -h -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0 -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1 -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out Following are some comparisons between TRACE_EVENT and blktrace: plug: kjournald-480 [000] 303.084981: block_plug: [kjournald] kjournald-480 [000] 303.084981: 8,0 P N [kjournald] unplug_io: kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1 kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1 remap: kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384 kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384 bio_backmerge: kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald] kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald] getrq: kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald] kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald] bash-2066 [001] 1072.953770: 8,0 G N [bash] bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash] rq_complete: konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0] konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0] ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0] ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0] rq_insert: kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald] kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald] Changelog from v2 -> v3: - use the newly introduced __dynamic_array(). Changelog from v1 -> v2: - use __string() instead of __array() to minimize the memory required to store hex dump of rq->cmd(). - support large pc requests. - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT. - some cleanups. 
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-06-09 12:34:23 -04:00
Yinghai Lu	0281b5dc03	cpumask: introduce zalloc_cpumask_var So can get cpumask_var with cpumask_clear Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-06-09 22:30:26 +09:30
john cooper	1d589bb16b	Add serial number support for virtio_blk, V4a This patch extracts the opaque data from pci i/o region 0 via the added VIRTIO_BLK_F_IDENTIFY field. By convention this data takes the form of that returned by an ATA IDENTIFY DEVICE command, however the driver (except for structure size) makes no interpretation of the data. The structure data is copied wholesale to userspace via a HDIO_GET_IDENTITY ioctl command (eg: hdparm -i <dev>). Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-09 14:41:40 +02:00
Tejun Heo	151060ac13	CUSE: implement CUSE - Character device in Userspace CUSE enables implementing character devices in userspace. With recent additions of ioctl and poll support, FUSE already has most of what's necessary to implement character devices. All CUSE has to do is bonding all those components - FUSE, chardev and the driver model - nicely. When client opens /dev/cuse, kernel starts conversation with CUSE_INIT. The client tells CUSE which device it wants to create. As the previous patch made fuse_file usable without associated fuse_inode, CUSE doesn't create super block or inodes. It attaches fuse_file to cdev file->private_data during open and set ff->fi to NULL. The rest of the operation is almost identical to FUSE direct IO case. Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME (which is symlink to /sys/devices/virtual/class/DEVNAME if SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort" among other things. Those two files have the same meaning as the FUSE control files. The only notable lacking feature compared to in-kernel implementation is mmap support. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>	2009-06-09 11:24:11 +02:00
Jens Axboe	9df1bb9b51	Revert "block: Fix bounce limit setting in DM" This reverts commit `a05c0205ba`. DM doesn't need to access the bounce_pfn directly. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-09 06:22:57 +02:00
James Morris	0b4ec6e4e0	Merge branch 'master' into next	2009-06-09 09:27:53 +10:00
Peter Zijlstra	1f8a6a10fb	ring-buffer: pass in lockdep class key for reader_lock On Sun, 7 Jun 2009, Ingo Molnar wrote: > Testing tracer sched_switch: <6>Starting ring buffer hammer > PASSED > Testing tracer sysprof: PASSED > Testing tracer function: PASSED > Testing tracer irqsoff: > ============================================= > PASSED > Testing tracer preemptoff: PASSED > Testing tracer preemptirqsoff: [ INFO: possible recursive locking detected ] > PASSED > Testing tracer branch: 2.6.30-rc8-tip-01972-ge5b9078-dirty #5760 > --------------------------------------------- > rb_consumer/431 is trying to acquire lock: > (&cpu_buffer->reader_lock){......}, at: [<c109eef7>] ring_buffer_reset_cpu+0x37/0x70 > > but task is already holding lock: > (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0 > > other info that might help us debug this: > 1 lock held by rb_consumer/431: > #0: (&cpu_buffer->reader_lock){......}, at: [<c10a019e>] ring_buffer_consume+0x7e/0xc0 The ring buffer is a generic structure, and can be used outside of ftrace. If ftrace traces within the use of the ring buffer, it can produce false positives with lockdep. This patch passes in a static lock key into the allocation of the ring buffer, so that different ring buffers will have their own lock class. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1244477919.13761.9042.camel@twins> [ store key in ring buffer descriptor ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-06-08 18:50:20 -04:00
Joe Eykholt	226c7ffe74	[SCSI] net, libfcoe: Add the FCoE Initialization Protocol ethertype FIP is the FCoE Initialization Protocol and this patch adds the protocol ethertype to the kernel's list of ethertypes. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-08 13:29:17 -05:00
Russell King	7698fdedcf	Merge branch 'for-rmk' of git://git.marvell.com/orion into devel	2009-06-08 19:27:13 +01:00
Linus Torvalds	6025974bab	Merge master.kernel.org:/home/rmk/linux-2.6-arm * master.kernel.org:/home/rmk/linux-2.6-arm: [ARM] 5543/1: arm: serial amba: add missing declaration in serial.h [ARM] pxa: fix pxa27x_udc default pullup GPIO [ARM] pxa/imote2: fix UCAM sensor board ADC model number mx[23]: don't put clock lookups in __initdata fix oops when using console=ttymxcN with N > 0 [ARM] ARMv7 errata: only apply fixes when running on applicable CPU [ARM] 5534/1: kmalloc must return a cache line aligned buffer	2009-06-08 08:29:31 -07:00
Alessandro Rubini	aa853f85d9	[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h This header is sometimes included in the uncompress stage to get register values, but no <linux/amba/bus.h> can be included there. So declare "struct amba_device" here before using it in a prototype. Signed-off-by: Alessandro Rubini <rubini@unipv.it> Acked-by: Andrea Gallo <andrea.gallo@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-06-07 16:19:47 +01:00
Bartlomiej Zolnierkiewicz	734affdcae	ide: add IDE_DFLAG_NIEN_QUIRK device flag Add IDE_DFLAG_NIEN_QUIRK device flag and use it instead of drive->quirk_list. There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-06-07 15:37:10 +02:00
Bartlomiej Zolnierkiewicz	8bc1e5aa06	ide: respect quirk_drives[] list on all controllers * Add ide_check_nien_quirk_list() helper to the core code and then use it in ide_port_tune_devices(). * Remove no longer needed ->quirkproc methods from hpt366.c and pdc202xx_{new,old}.c. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-06-07 15:37:09 +02:00
Bartlomiej Zolnierkiewicz	6250d3af2a	Merge branch 'for-linus' into for-next	2009-06-07 14:27:11 +02:00
Bartlomiej Zolnierkiewicz	075affcbe0	ide: preserve Host Protected Area by default (v2) From the perspective of most users of recent systems, disabling Host Protected Area (HPA) can break vendor RAID formats, GPT partitions and risks corrupting firmware or overwriting vendor system recovery tools. Unfortunately the original (kernels < 2.6.30) behavior (unconditionally disabling HPA and using full disk capacity) was introduced at the time when the main use of HPA was to make the drive look small enough for the BIOS to allow the system to boot with large capacity drives. Thus to allow the maximum compatibility with the existing setups (using HPA and partitioned with HPA disabled) we automatically disable HPA if any partitions overlapping HPA are detected. Additionally HPA can also be disabled using the "nohpa" module parameter (i.e. "ide_core.nohpa=0.0" to disable HPA on /dev/hda). v2: Fix ->resume HPA support. While at it: - remove stale "idebus=" entry from Documentation/kernel-parameters.txt Cc: Robert Hancock <hancockrwd@gmail.com> Cc: Frans Pop <elendil@planet.nl> Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> [patch description was based on input from Alan Cox and Frans Pop] Emphatically-Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-06-07 13:52:52 +02:00
Bartlomiej Zolnierkiewicz	e957b60d15	ide-gd: implement block device ->set_capacity method (v2) * Use ->probed_capacity to store native device capacity for ATA disks. * Add ->set_capacity method to struct ide_disk_ops. * Implement disk device ->set_capacity method for ATA disks. * Implement block device ->set_capacity method. v2: * Check if LBA and HPA are supported in ide_disk_set_capacity(). * According to the spec the SET MAX ADDRESS command shall be immediately preceded by a READ NATIVE MAX ADDRESS command. * Add ide_disk_hpa_{get_native,set}_capacity() helpers. Together with the previous patch adding ->set_capacity block device method this allows automatic disabling of Host Protected Area (HPA) if any partitions overlapping HPA are detected. Cc: Robert Hancock <hancockrwd@gmail.com> Cc: Frans Pop <elendil@planet.nl> Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl> Cc: Al Viro <viro@zeniv.linux.org.uk> Emphatically-Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-06-07 13:52:52 +02:00
Bartlomiej Zolnierkiewicz	db429e9ec0	partitions: add ->set_capacity block device method * Add ->set_capacity block device method and use it in rescan_partitions() to attempt enabling native capacity of the device upon detecting the partition which exceeds device capacity. * Add GENHD_FL_NATIVE_CAPACITY flag to try limit attempts of enabling native capacity during partition scan. Together with the consecutive patch implementing ->set_capacity method in ide-gd device driver this allows automatic disabling of Host Protected Area (HPA) if any partitions overlapping HPA are detected. Cc: Robert Hancock <hancockrwd@gmail.com> Cc: Frans Pop <elendil@planet.nl> Cc: "Andries E. Brouwer" <Andries.Brouwer@cwi.nl> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Emphatically-Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-06-07 13:52:52 +02:00
Ingo Molnar	56fdd18c7b	Merge branch 'linus' into core/iommu Merge reason: This branch was on an -rc5 base so pull almost-2.6.30 to resync with the latest upstream fixes and make sure the combination works fine. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-07 11:35:05 +02:00
Ingo Molnar	75b5032212	Merge branch 'linus' into perfcounters/core Merge reason: Pick up the latest fixes before the -v8 perfcounters release. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-06 20:21:28 +02:00
Ingo Molnar	8326f44da0	perf_counter: Implement generalized cache event types Extend generic event enumeration with the PERF_TYPE_HW_CACHE method. This is a 3-dimensional space: { L1-D, L1-I, L2, ITLB, DTLB, BPU } x { load, store, prefetch } x { accesses, misses } User-space passes in the 3 coordinates and the kernel provides a counter. (if the hardware supports that type and if the combination makes sense.) Combinations that make no sense produce a -EINVAL. Combinations that are not supported by the hardware produce -ENOTSUP. Extend the tools to deal with this, and rewrite the event symbol parsing code with various popular aliases for the units and access methods above. So 'l1-cache-miss' and 'l1d-read-ops' are both valid aliases. ( x86 is supported for now, with the Nehalem event table filled in, and with Core2 and Atom having placeholder tables. ) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-06 13:14:47 +02:00
Ingo Molnar	a21ca2cac5	perf_counter: Separate out attr->type from attr->config Counter type is a frequently used value and we do a lot of bit juggling by encoding and decoding it from attr->config. Clean this up by creating a separate attr->type field. Also clean up the various similarly complex user-space bits all around counter attribute management. The net improvement is significant, and it will be easier to add a new major type (which is what triggered this cleanup). (This changes the ABI, all tools are adapted.) (PowerPC build-tested.) Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-06 11:37:22 +02:00
Jack Morgenstein	2ac6bf4ddc	IB/mlx4: Add strong ordering to local inval and fast reg work requests The ConnectX Programmer's Reference Manual states that the "SO" bit must be set when posting Fast Register and Local Invalidate send work requests. When this bit is set, the work request will be executed only after all previous work requests on the send queue have been executed. (If the bit is not set, Fast Register and Local Invalidate WQEs may begin execution too early, which violates the defined semantics for these operations) This fixes the issue with NFS/RDMA reported in <http://lists.openfabrics.org/pipermail/general/2009-April/059253.html> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-06-05 10:36:24 -07:00
Peter Zijlstra	6a24ed6c60	perf_counter: Fix frequency adjustment for < HZ Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-05 18:07:48 +02:00
Peter Zijlstra	689802b2d0	perf_counter: Add PERF_SAMPLE_PERIOD In order to allow easy tracking of the period, also provide means of adding it to the sample data. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-05 18:07:47 +02:00
Peter Zijlstra	ac4bcf8894	perf_counter: Change PERF_SAMPLE_CONFIG into PERF_SAMPLE_ID The purpose of PERF_SAMPLE_CONFIG was to identify the counters, since then we've added counter ids, use those instead. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-05 18:07:47 +02:00
Bjorn Helgaas	1b8e69662e	pnp: add PNP resource range checking function Add a PNP resource range check function, indicating whether a resource has been assigned to any device. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> [apw@canonical.com: fixed up exports et al] Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-05 14:37:41 +00:00
Peter Zijlstra	089dd79db9	perf_counter: Generate mmap events for install_special_mapping() In order to track the vdso also generate mmap events for install_special_mapping(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-05 14:46:41 +02:00
Oleg Nesterov	087eb43705	ptrace: tracehook_report_clone: fix false positives The "trace || CLONE_PTRACE" check in tracehook_report_clone() is not right: - If the untraced task does clone(CLONE_PTRACE) the new child is not traced, we must not queue SIGSTOP. - If we forked the traced task, but the tracer exits and untraces both the forking task and the new child (after copy_process() drops tasklist_lock), we should not queue SIGSTOP either. Change the code to check task_ptrace() != 0 instead. This is still racy, but the race is harmless. We can race with another tracer attaching to this child, or the tracer can exit and detach in parallel. But given that we didn't do wake_up_new_task() yet, the child must have the pending SIGSTOP anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-04 18:07:40 -07:00
Tony Lindgren	c068303920	[ARM] 5536/1: Move clk_add_alias() to arch/arm/common/clkdev.c This can be used for other arm platforms too as discussed on the linux-arm-kernel list. Also check the return value with IS_ERR and return PTR_ERR as suggested by Russell King. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-06-04 17:45:43 +01:00
Alessandro Rubini	5926a295bb	[ARM] 5541/1: serial/amba-pl011.c: add support for the modified port found in Nomadik The Nomadik 8815 SoC has a slightly modified version of the PL011 block. The patch uses the different ID value as a key to select a vendor structure that is used to keep track of the differences, as suggested by Russell King. Signed-off-by: Alessandro Rubini <rubini@unipv.it> Acked-by: Andrea Gallo <andrea.gallo@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-06-04 17:45:30 +01:00
Peter Zijlstra	d99e944620	perf_counter: Remove munmap stuff In name of keeping it simple, only track mmap events. Userspace will have to remove old overlapping maps when it encounters them. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-04 17:51:38 +02:00
Peter Zijlstra	60313ebed7	perf_counter: Add fork event Create a fork event so that we can easily clone the comm and dso maps without having to generate all those events. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-04 17:51:38 +02:00
Christoph Lameter	e0a94c2a63	security: use mmap_min_addr independently of security models This patch removes the dependency of mmap_min_addr on CONFIG_SECURITY. It also sets a default mmap_min_addr of 4096. mmapping of addresses below 4096 will only be possible for processes with CAP_SYS_RAWIO. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Eric Paris <eparis@redhat.com> Looks-ok-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: James Morris <jmorris@namei.org>	2009-06-04 12:07:48 +10:00
Martin K. Petersen	a05c0205ba	block: Fix bounce limit setting in DM blk_queue_bounce_limit() is more than a wrapper about the request queue limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can be called by stacking drivers that wish to set the bounce limit explicitly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-03 09:33:18 +02:00
Peter Zijlstra	0d48696f87	perf_counter: Rename perf_counter_hw_event => perf_counter_attr The structure isn't hw only and when I read event, I think about those things that fall out the other end. Rename the thing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> Cc: Stephane Eranian <eranian@googlemail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:33 +02:00
Peter Zijlstra	08247e31ca	perf_counter: Add ioctl for changing the sample period/frequency Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:32 +02:00
Peter Zijlstra	8e3747c13c	perf_counter: Change data head from u32 to u64 Since some people worried that 4G might not be large enough as an mmap data window, extend it to 64 bits for capable platforms. Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:32 +02:00
Peter Zijlstra	8a016db386	perf_counter: Remove the last nmi/irq bits IRQ (non-NMI) sampling is not used anymore - remove the last few bits. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:31 +02:00
Peter Zijlstra	b23f3325ed	perf_counter: Rename various fields A few renames: s/irq_period/sample_period/ s/irq_freq/sample_freq/ s/PERF_RECORD_/PERF_SAMPLE_/ s/record_type/sample_type/ And change both the new sample_type and read_format to u64. Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:30 +02:00
Peter Zijlstra	8e5799b1ad	perf_counter: Add unique counter id Stephane raised the issue that we currently cannot distinguish between similar counters within a group (PERF_RECORD_GROUP uses the config value as identifier). Therefore, generate a new ID for each counter using a global u64 sequence counter. Reported-by: Stephane Eranian <eranian@googlemail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 21:45:29 +02:00
Alan Cox	05ad709d04	parport: quickfix the proc registration bug Ideally we should have a directory of drivers and a link to the 'active' driver. For now just show the first device which is effectively the existing semantics without a warning. This is an update on the original buggy patch that I then forgot to resubmit. Confusingly it was proposed by Red Hat, written by Etched Pixels fixed and submitted by Intel ... Resolves-Bug: http://bugzilla.kernel.org/show_bug.cgi?id=9749 Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-02 09:53:22 -07:00
Peter Zijlstra	709e50cf87	perf_counter: Use PID namespaces properly Stop using task_struct::pid and start using PID namespaces. PIDs will be reported in the PID namespace of the monitoring task at the moment of counter creation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 16:16:25 +02:00
Paul Mackerras	bf4e0ed3d0	perf_counter: Remove unused prev_state field This removes the prev_state field of struct perf_counter since it is now unused. It was only used by the cpu migration counter, which doesn't use it any more. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <18979.35052.915728.626374@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 13:10:55 +02:00
Paul Mackerras	3f731ca60a	perf_counter: Fix cpu migration counter This fixes the cpu migration software counter to count correctly even when contexts get swapped from one task to another. Previously the cpu migration counts reported by perf stat were bogus, ranging from negative to several thousand for a single "lat_ctx 2 8 32" run. With this patch the cpu migration count reported for "lat_ctx 2 8 32" is almost always between 35 and 44. This fixes the problem by adding a call into the perf_counter code from set_task_cpu when tasks are migrated. This enables us to use the generic swcounter code (with some modifications) for the cpu migration counter. This modifies the swcounter code to allow a NULL regs pointer to be passed in to perf_swcounter_ctx_event() etc. The cpu migration counter does this because there isn't necessarily a pt_regs struct for the task available. In this case, the counter will not have interrupt capability - but the migration counter didn't have interrupt capability before, so this is no loss. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <18979.35006.819769.416327@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-02 13:10:54 +02:00
Steven Rostedt	112f38a7e3	tracing: make trace pipe recognize latency format flag The trace_pipe did not recognize the latency format flag and would produce different output than the trace file. The problem was partly that the trace flags in the iterator were not set, and partly that trace_pipe zeros out part of the iterator (including the flags) to be able to use the same routines as the trace file. The iterator's trace_flags should not cause any problems when not zeroed out for trace_pipe. Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-06-01 23:26:02 -04:00
Ingo Molnar	3d58f48ba0	Merge branch 'linus' into irq/numa Conflicts: arch/mips/sibyte/bcm1480/irq.c arch/mips/sibyte/sb1250/irq.c Merge reason: we gathered a few conflicts plus update to latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-01 21:06:21 +02:00
Ingo Molnar	22a4f650d6	perf_counter: Tidy up style details - whitespace fixlets - make local variable definitions more consistent [ Impact: cleanup ] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-01 19:55:32 +02:00
Linus Torvalds	6e42910184	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: 3c509: Add missing EISA IDs MAINTAINERS: take maintainership of the cpmac Ethernet driver net/firmare: Ignore .cis files ath1e: add new device id for asus hardware mlx4_en: Fix a kernel panic when waking tx queue rtl8187: add USB ID for Linksys WUSB54GC-EU v2 USB wifi dongle at76c50x-usb: avoid mutex deadlock in at76_dwork_hw_scan mac8390: fix build with NET_POLL_CONTROLLER cxgb3: link fault fixes cxgb3: fix dma mapping regression netfilter: nfnetlink_log: fix wrong skbuff size calculation netfilter: xt_hashlimit does a wrong SEQ_SKIP bfin_mac: fix build error due to net_device_ops convert atlx: move modinfo data from atlx.h to atl1.c gianfar: fix babbling rx error event bug cls_cgroup: read classid atomically in classifier netfilter: nf_ct_dccp: add missing DCCP protocol changes in event cache netfilter: nf_ct_tcp: fix accepting invalid RST segments	2009-06-01 08:02:05 -07:00
Linus Torvalds	c4e51e4657	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/headers-check-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jaswinder/headers-check-2.6: headers_check fix: linux/net_dropmon.h headers_check fix: linux/auto_fs.h	2009-06-01 08:01:42 -07:00
Paul Mackerras	25346b93ca	perf_counter: Provide functions for locking and pinning the context for a task This abstracts out the code for locking the context associated with a task. Because the context might get transferred from one task to another concurrently, we have to check after locking the context that it is still the right context for the task and retry if not. This was open-coded in find_get_context() and perf_counter_init_task(). This adds a further function for pinning the context for a task, i.e. marking it so it can't be transferred to another task. This adds a 'pin_count' field to struct perf_counter_context to indicate that a context is pinned, instead of the previous method of setting the parent_gen count to all 1s. Pinning the context with a pin_count is easier to undo and doesn't require saving the parent_gen value. This also adds a perf_unpin_context() to undo the effect of perf_pin_task_context() and changes perf_counter_init_task to use it. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <18979.34748.755674.596386@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-01 10:04:05 +02:00
Ingo Molnar	23db9f430b	Merge branch 'linus' into perfcounters/core Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6 based - to pick up the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-06-01 10:01:39 +02:00
Jaswinder Singh Rajput	d280cc989a	headers_check fix: linux/net_dropmon.h fix the following 'make headers_check' warnings: usr/include/linux/net_dropmon.h:7: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>	2009-06-01 06:23:25 +00:00
Jaswinder Singh Rajput	52bb25a620	headers_check fix: linux/auto_fs.h fix the following 'make headers_check' warnings: usr/include/linux/auto_fs.h:17: include of <linux/types.h> is preferred over <asm/types.h> Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>	2009-06-01 06:21:13 +00:00
Len Brown	6afec830ac	Merge branches 'bugzilla-13121+', 'bugzilla-13233', 'redhat-bugzilla-500311', 'pci-bind-oops', 'misc-2.6.30' and 'i7300_idle' into release	2009-05-29 21:30:01 -04:00
Linus Torvalds	5f789cd8ba	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: libps2 - better handle bad scheduler decisions Input: usb1400_ts - fix access to "device data" in resume function Input: multitouch - augment event semantics documentation Input: multitouch - add tracking ID to the protocol	2009-05-29 08:48:25 -07:00
Daisuke Nishimura	e767e0561d	memcg: fix deadlock between lock_page_cgroup and mapping tree_lock mapping->tree_lock can be acquired from interrupt context. Then, the following deadlock can occur. Assume "A" as a page. CPU0: lock_page_cgroup(A) interrupted -> take mapping->tree_lock. CPU1: take mapping->tree_lock -> lock_page_cgroup(A) This patch fixes the above deadlock by moving memcg's hook out from under mapping->tree_lock. charge/uncharge of pagecache/swapcache is protected by page lock, not tree_lock. After this patch, lock_page_cgroup() is not called under mapping->tree_lock. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-29 08:40:02 -07:00
Alexey Dobriyan	b2e1feaf0a	cred: #include init.h in cred.h linux/cred.h can't be included as first header (alphabetical order) because it uses __init which is enough to break compilation on some archs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-29 08:40:01 -07:00
Peter Zijlstra	bbbee90829	perf_counter: Amend cleanup in fork() fail When fork() fails we cannot use perf_counter_exit_task() since that assumes it operates on current. Write a new helper that cleans up unused/clean contexts. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-29 16:21:52 +02:00
Len Brown	2f102607ac	i7300_idle: allow testing on i5000-series hardware w/o re-compile Testing the i7300_idle driver on i5000-series hardware required an edit to i7300_idle.h to "#define SUPPORT_I5000 1" and a re-build of both i7300_idle and ioat_dma. Replace that build-time scheme with a load-time module parameter: "i7300_idle.forceload=1" to make it easier to test the driver on hardware that, while not officially validated, works fine and is much more commonly available. By default (no modparam) the driver will continue to load only on the i7300. Note that ioat_dma runs a copy of i7300_idle's probe routine to know to reserve an IOAT channel for i7300_idle. This change makes ioat_dma do that always on the i5000, just like it does on the i7300. Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Andrew Henroid <andrew.d.henroid@intel.com>	2009-05-28 20:52:40 -04:00
Paul Mackerras	c93f766909	perf_counter: Fix race in attaching counters to tasks and exiting Commit `564c2b21` ("perf_counter: Optimize context switch between identical inherited contexts") introduced a race where it is possible that a counter being attached to a task could get attached to the wrong task, if the task is one that has inherited its context from another task via fork. This happens because the optimized context switch could switch the context to another task after find_get_context has read task->perf_counter_ctxp. In fact, it's possible that the context could then get freed, if the other task then exits. This fixes the problem by protecting both the context switch and the critical code in find_get_context with spinlocks. The context switch locks the cxt->lock of both the outgoing and incoming contexts before swapping them. That means that once code such as find_get_context has obtained the spinlock for the context associated with a task, the context can't get swapped to another task. However, the context may have been swapped in the interval between reading task->perf_counter_ctxp and getting the lock, so it is necessary to check and retry. To make sure that none of the contexts being looked at in find_get_context can get freed, this changes the context freeing code to use RCU. Thus an rcu_read_lock() is sufficient to ensure that no contexts can get freed. This part of the patch is lifted from a patch posted by Peter Zijlstra. This also adds a check to make sure that we can't add a counter to a task that is exiting. There is also a race between perf_counter_exit_task and find_get_context; this solves the race by moving the get_ctx that was in perf_counter_alloc into the locked region in find_get_context, so that once find_get_context has got the context for a task, it won't get freed even if the task calls perf_counter_exit_task. 
It doesn't matter if new top-level (non-inherited) counters get attached to the context after perf_counter_exit_task has detached the context from the task. They will just stay there and never get scheduled in until the counters' fds get closed, and then perf_release will remove them from the context and eventually free the context. With this, we are now doing the unclone in find_get_context rather than when a counter was added to or removed from a context (actually, we were missing the unclone_ctx() call when adding a counter to a context). We don't need to unclone when removing a counter from a context because we have no way to remove a counter from a cloned context. This also takes out the smp_wmb() in find_get_context, which Peter Zijlstra pointed out was unnecessary because the cmpxchg implies a full barrier anyway. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <18974.33033.667187.273886@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-28 15:03:50 +02:00
Ingo Molnar	d3e78ee3d0	perf_counter: Fix perf_counter_init_task() on !CONFIG_PERF_COUNTERS Pointed out by compiler warnings: tip/include/linux/perf_counter.h:644: warning: no return statement in function returning non-void Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-28 11:42:16 +02:00
David S. Miller	4d3383d0ad	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-05-27 15:51:25 -07:00
Eli Cohen	ab6bf42e23	mlx4_core: Add module parameter for number of MTTs per segment The current MTT allocator uses kmalloc() to allocate a buffer for its buddy allocator, and thus is limited in the amount of MTT segments that it can control. As a result, the size of memory that can be registered is limited too. This patch uses a module parameter to control the number of MTT entries that each segment represents, allowing more memory to be registered with the same number of segments. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-05-27 14:38:34 -07:00
Steven Rostedt	0f4fc29dd6	tracing: add __print_symbolic to trace events This patch adds __print_symbolic which is similar to __print_flags but works for an enumeration type instead. That is, there is only a one-to-one mapping between the values and the symbols. When a match is made, the symbol is printed; otherwise the hex value is output. [ Impact: add interface for showing symbol names in events ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-05-26 20:31:50 +02:00
Steven Rostedt	be74b73a57	tracing: add __print_flags for events Developers have been asking for the ability in the ftrace event tracer to display names of bits in a flags variable. Instead of printing out c2, it would be easier to read FOO|BAR|GOO, assuming that FOO is bit 1, BAR is bit 6 and GOO is bit 7. Some examples where this would be useful are the state flags in a context switch, kmalloc flags, and even permission flags in accessing files. [ v2 changes include: Frederic Weisbecker's idea of using a mask instead of bits, thus we can output GFP_KERNEL instead of GFP_WAIT|GFP_IO|GFP_FS. Li Zefan's idea of allowing the caller of __print_flags to add their own delimiter (or no delimiter) where we can get for file permissions rwx instead of r|w|x. ] [ v3 changes: Christoph Hellwig's idea of using an array instead of va_args. ] [ Impact: better displaying of flags in trace output ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-05-26 20:25:22 +02:00
Ingo Molnar	0127c3ea08	perf_counter: fix warning & lockup - remove bogus warning - fix wakeup from NMI path lockup - also fix up whitespace noise in perf_counter.h Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.703093461@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-25 22:02:23 +02:00
Peter Zijlstra	a78ac32587	perf_counter: Generic per counter interrupt throttle Introduce a generic per counter interrupt throttle. This uses the perf_counter_overflow() quick disable to throttle a specific counter when it's going too fast when a pmu->unthrottle() method is provided which can undo the quick disable. Power needs to implement both the quick disable and the unthrottle method. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.703093461@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-25 21:41:12 +02:00
Peter Zijlstra	48e22d56ec	perf_counter: x86: Remove interrupt throttle remove the x86 specific interrupt throttle Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.616671838@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-25 21:41:12 +02:00
Jozsef Kadlecsik	bfcaa50270	netfilter: nf_ct_tcp: fix accepting invalid RST segments Robert L Mathews discovered that some clients send evil TCP RST segments, which are accepted by netfilter conntrack but discarded by the destination. Thus the conntrack entry is destroyed but the destination retransmits data until timeout. The same technique, i.e. sending properly crafted RST segments, can easily be used to bypass connlimit/connbytes based restrictions (the sample script written by Robert can be found in the netfilter mailing list archives). The patch below adds a new flag and new field to struct ip_ct_tcp_state so that checking RST segments can be made more strict and thus TCP conntrack can catch the invalid ones: the RST segment is accepted only if its sequence number is higher than or equal to the highest ack we have seen from the other direction. (The last_ack field cannot be reused because it is used to catch resent packets.) Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-25 17:23:15 +02:00
Peter Zijlstra	6ab423e0ea	perf_counter: Propagate inheritance failures down the fork() path Fail fork() when we fail inheritance for some reason (-ENOMEM most likely). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525124600.324656474@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-25 14:55:01 +02:00
Peter Zijlstra	e527ea312f	perf_counter: Remove unused ABI bits extra_config_len isn't used for anything, remove it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525124600.116035832@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-25 14:55:00 +02:00
Peter Zijlstra	475c557973	perf_counter: Remove perf_counter_context::nr_enabled now that pctrl() no longer disables other people's counters, remove the PMU cache code that deals with that. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090523163013.032998331@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-24 08:24:30 +02:00
Peter Zijlstra	082ff5a276	perf_counter: Change pctrl() behaviour Instead of en/dis-abling all counters acting on a particular task, en/dis-able all counters we created. [ v2: fix crash on first counter enable ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090523163012.916937244@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-24 08:24:08 +02:00
Peter Zijlstra	fccc714b31	perf_counter: Sanitize counter->mutex s/counter->mutex/counter->child_mutex/ and make sure its only used to protect child_list. The usage in __perf_counter_exit_task() doesn't appear to be problematic since ctx->mutex also covers anything related to fd tear-down. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090523163012.533186528@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-23 19:37:45 +02:00
Peter Zijlstra	e220d2dcb9	perf_counter: Fix dynamic irq_period logging We call perf_adjust_freq() from perf_counter_task_tick() which is called under the rq->lock causing lock recursion. However, it's no longer required to be called under the rq->lock, so remove it from under it. Also, fix up some related comments. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090523163012.476197912@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-23 19:37:44 +02:00
Henrik Rydberg	df391e0eda	Input: multitouch - add tracking ID to the protocol There are a few multi-touch devices that support finger tracking well in hardware, Stantum being the prime example. By exposing the tracking ID in the MT protocol, evdev bandwidth and cpu usage in user space can be reduced. This patch adds the ABS_MT_TRACKING_ID to the MT protocol. Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Tested-by: Stéphane Chatty <chatty@enac.fr> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2009-05-23 09:53:18 -07:00
Martin K. Petersen	c72758f337	block: Export I/O topology for block devices and partitions To support devices with physical block sizes bigger than 512 bytes we need to ensure proper alignment. This patch adds support for exposing I/O topology characteristics as devices are stacked. logical_block_size is the smallest unit the device can address. physical_block_size indicates the smallest I/O the device can write without incurring a read-modify-write penalty. The io_min parameter is the smallest preferred I/O size reported by the device. In many cases this is the same as the physical block size. However, the io_min parameter can be scaled up when stacking (RAID5 chunk size > physical block size). The io_opt characteristic indicates the optimal I/O size reported by the device. This is usually the stripe width for arrays. The alignment_offset parameter indicates the number of bytes the start of the device/partition is offset from the device's natural alignment. Partition tools and MD/DM utilities can use this to pad their offsets so filesystems start on proper boundaries. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:55 +02:00
Martin K. Petersen	025146e13b	block: Move queue limits to an embedded struct To accommodate stacking drivers that do not have an associated request queue we're moving the limits to a separate, embedded structure. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:55 +02:00
Martin K. Petersen	ae03bf639a	block: Use accessor functions for queue limits Convert all external users of queue limits to using wrapper functions instead of poking the request queue variables directly. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
Martin K. Petersen	e1defc4ff0	block: Do away with the notion of hardsect_size Until now we have had a 1:1 mapping between storage device physical block size and the logical block size used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
Jens Axboe	9bd7de51ee	Merge branch 'master' into for-2.6.31 Conflicts: drivers/ide/ide-io.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 20:28:35 +02:00
Jens Axboe	e4b636366c	Merge branch 'master' into for-2.6.31 Conflicts: drivers/block/hd.c drivers/block/mg_disk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 20:25:34 +02:00
Linus Torvalds	5ae115af1d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: via82cxxx: Add VIA VX855 PCI Device ID ide: report timeouts in ide_busy_sleep() ide: improve failed opcode reporting ide: fix printk() levels in ide_dump_ata[pi]_error() ide: fix OOPS during ide-cd error recovery ide: fix 40-wire cable detection for TSST SH-S202* ATAPI devices (v2)	2009-05-22 08:22:39 -07:00
Bartlomiej Zolnierkiewicz	4c9773ed79	Merge branch 'for-linus' into for-next	2009-05-22 17:10:55 +02:00
Harald Welte	5993856e53	via82cxxx: Add VIA VX855 PCI Device ID This patch adds the PCI Device ID 0xc409 to the PCI ID table of via82cxxx.c, as well as the 0x8409 south bridge ID. This is required to make the IDE driver work on the VX855/VX875 integrated chipset. Signed-off-by: Harald Welte <HaraldWelte@viatech.com> Cc: Joseph Chan <JosephChan@via.com.tw> Cc: Bruce Chang <BruceChang@via.com.tw> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-22 16:23:39 +02:00
Bartlomiej Zolnierkiewicz	28ee9bc5cc	ide: report timeouts in ide_busy_sleep() * change 'hwif' argument to 'drive' * report an error on timeout Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-22 16:23:38 +02:00
Ingo Molnar	910431c7f2	perf_counter: fix !PERF_COUNTERS build failure Update the !CONFIG_PERF_COUNTERS prototype too, for perf_counter_task_sched_out(). [ Impact: build fix ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18966.10666.517218.332164@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-22 12:33:14 +02:00
Paul Mackerras	564c2b210a	perf_counter: Optimize context switch between identical inherited contexts When monitoring a process and its descendants with a set of inherited counters, we can often get the situation in a context switch where both the old (outgoing) and new (incoming) process have the same set of counters, and their values are ultimately going to be added together. In that situation it doesn't matter which set of counters are used to count the activity for the new process, so there is really no need to go through the process of reading the hardware counters and updating the old task's counters and then setting up the PMU for the new task. This optimizes the context switch in this situation. Instead of scheduling out the perf_counter_context for the old task and scheduling in the new context, we simply transfer the old context to the new task and keep using it without interruption. The new context gets transferred to the old task. This means that both tasks still have a valid perf_counter_context, so no special case is introduced when the old task gets scheduled in again, either on this CPU or another CPU. The equivalence of contexts is detected by keeping a pointer in each cloned context pointing to the context it was cloned from. To cope with the situation where a context is changed by adding or removing counters after it has been cloned, we also keep a generation number on each context which is incremented every time a context is changed. When a context is cloned we take a copy of the parent's generation number, and two cloned contexts are equivalent only if they have the same parent and the same generation number. In order that the parent context pointer remains valid (and is not reused), we increment the parent context's reference count for each context cloned from it. 
Since we don't have individual fds for the counters in a cloned context, the only thing that can make two clones of a given parent different after they have been cloned is enabling or disabling all counters with prctl. To account for this, we keep a count of the number of enabled counters in each context. Two contexts must have the same number of enabled counters to be considered equivalent. Here are some measurements of the context switch time as measured with the lat_ctx benchmark from lmbench, comparing the times obtained with and without this patch series:
                 -----Unmodified-----      With this patch series
Counters:        none    2 HW    4H+4S     none    2 HW    4H+4S
2 processes:
  Average        3.44    6.45    11.24     3.12    3.39    3.60
  St dev         0.04    0.04     0.13     0.05    0.17    0.19
8 processes:
  Average        6.45    8.79    14.00     5.57    6.23    7.57
  St dev         1.27    1.04     0.88     1.42    1.46    1.42
32 processes:
  Average        5.56    8.43    13.78     5.28    5.55    7.15
  St dev         0.41    0.47     0.53     0.54    0.57    0.81
The numbers are the mean and standard deviation of 20 runs of lat_ctx. The "none" columns are lat_ctx run directly without any counters. The "2 HW" columns are with lat_ctx run under perfstat, counting cycles and instructions. The "4H+4S" columns are lat_ctx run under perfstat with 4 hardware counters and 4 software counters (cycles, instructions, cache references, cache misses, task clock, context switch, cpu migrations, and page faults). [ Impact: performance optimization of counter context-switches ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18966.10666.517218.332164@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-22 12:18:20 +02:00
Paul Mackerras	a63eaf34ae	perf_counter: Dynamically allocate tasks' perf_counter_context struct This replaces the struct perf_counter_context in the task_struct with a pointer to a dynamically allocated perf_counter_context struct. The main reason for doing this is to allow us to transfer a perf_counter_context from one task to another when we do lazy PMU switching in a later patch. This has a few side-benefits: the task_struct becomes a little smaller, we save some memory because only tasks that have perf_counters attached get a perf_counter_context allocated for them, and we can remove the inclusion of <linux/perf_counter.h> in sched.h, meaning that we don't end up recompiling nearly everything whenever perf_counter.h changes. The perf_counter_context structures are reference-counted and freed when the last reference is dropped. A context can have references from its task and the counters on its task. Counters can outlive the task so it is possible that a context will be freed well after its task has exited. Contexts are allocated on fork if the parent had a context, or otherwise the first time that a per-task counter is created on a task. In the latter case, we set the context pointer in the task struct locklessly using an atomic compare-and-exchange operation in case we raced with some other task in creating a context for the subject task. This also removes the task pointer from the perf_counter struct. The task pointer was not used anywhere and would make it harder to move a context from one task to another. Anything that needed to know which task a counter was attached to was already using counter->ctx->task. The __perf_counter_init_context function moves up in perf_counter.c so that it can be called from find_get_context, and now initializes the refcount, but is otherwise unchanged. 
We were potentially calling list_del_counter twice: once from __perf_counter_exit_task when the task exits and once from __perf_counter_remove_from_context when the counter's fd gets closed. This adds a check in list_del_counter so it doesn't do anything if the counter has already been removed from the lists. Since perf_counter_task_sched_in doesn't do anything if the task doesn't have a context, and leaves cpuctx->task_ctx = NULL, this adds code to __perf_install_in_context to set cpuctx->task_ctx if necessary, i.e. in the case where the current task adds the first counter to itself and thus creates a context for itself. This also adds similar code to __perf_counter_enable to handle a similar situation which can arise when the counters have been disabled using prctl; that also leaves cpuctx->task_ctx = NULL. [ Impact: refactor counter context management to prepare for new feature ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18966.10075.781053.231153@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-22 12:18:19 +02:00
James Morris	2c9e703c61	Merge branch 'master' into next Conflicts: fs/exec.c Removed IMA changes (the IMA checks are now performed via may_open()). Signed-off-by: James Morris <jmorris@namei.org>	2009-05-22 18:40:59 +10:00
Paul Mundt	5f8371cec9	Merge branches 'sh/stable-updates' and 'sh/sparseirq'	2009-05-22 13:29:37 +09:00
Mimi Zohar	b9fc745db8	integrity: path_check update - Add support in ima_path_check() for integrity checking without incrementing the counts. (Required for nfsd.) - rename and export opencount_get to ima_counts_get - replace ima_shm_check calls with ima_counts_get - export ima_path_check Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-05-22 09:43:41 +10:00
Linus Torvalds	9fe02c03b4	Merge master.kernel.org:/home/rmk/linux-2.6-arm * master.kernel.org:/home/rmk/linux-2.6-arm: (25 commits) [ARM] 5519/1: amba probe: pass "struct amba_id " instead of void [ARM] 5517/1: integrator: don't put clock lookups in __initdata [ARM] 5518/1: versatile: don't put clock lookups in __initdata [ARM] mach-l7200: fix spelling of SYS_CLOCK_OFF [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 [ARM] realview: fix broadcast tick support [ARM] realview: remove useless smp_cross_call_done() [ARM] smp: fix cpumask usage in ARM SMP code [ARM] 5513/1: Eurotech VIPER SBC: fix compilation error [ARM] 5509/1: ep93xx: clkdev enable UARTS ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2 ARM: OMAP3: Fix HW SAVEANDRESTORE shift define ARM: OMAP3: Fix number of GPIO lines for 34xx [ARM] S3C: Do not set clk->owner field if unset [ARM] S3C2410: mach-bast.c registering i2c data too early [ARM] S3C24XX: Fix unused code warning in arch/arm/plat-s3c24xx/dma.c [ARM] S3C64XX: fix GPIO debug [ARM] S3C64XX: GPIO include cleanup [ARM] nwfpe: fix 'floatx80_is_nan' sparse warning [ARM] nwfpe: Add decleration for ExtendedCPDO ...	2009-05-20 16:30:36 -07:00
Alessandro Rubini	03fbdb15c1	[ARM] 5519/1: amba probe: pass "struct amba_id " instead of void The second argument of the probe method points to the amba_id structure, so it's better passed with the correct type. None of the current in-tree drivers uses the pointer, so they have only been checked for a clean compile. Change suggested by Russell King. Signed-off-by: Alessandro Rubini <rubini@unipv.it> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-05-20 23:26:51 +01:00
Stephane Chatty	89f536ccfa	HID: add new multitouch and digitizer contants Added constants to hid.h for all digitizer usages (including the new multitouch ones that are not yet in the official USB spec but are being pushed by Microsoft as described in their paper "Digitizer Drivers for Windows Touch and Pen-Based Computers"). Updated hid-debug.c to support the new MT input constants such as ABS_MT_POSITION_X. Signed-off-by: Stephane Chatty <chatty@enac.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-05-20 15:48:35 +02:00
Peter Zijlstra	26b119bc81	perf_counter: Log irq_period changes For the dynamic irq_period code, log whenever we change the period so that analyzing code can normalize the event flow. [ Impact: add new feature to allow more precise profiling ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090520102553.298769743@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-20 12:43:33 +02:00
Peter Zijlstra	d7b629a34f	perf_counter: Solve the rotate_ctx vs inherit race differently Instead of disabling RR scheduling of the counters, use a different list that does not get rotated to iterate the counters on inheritance. [ Impact: cleanup, optimization ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090520102553.237504544@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-20 12:43:32 +02:00
Thomas Gleixner	521c180874	Merge branch 'core/urgent' into core/futexes Merge reason: this branch was on a pre-rc1 base, merge it up to -rc6+ to get the latest upstream fixes. Conflicts: kernel/futex.c Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-05-20 09:02:28 +02:00
Jens Axboe	0a7ae2ff0d	block: change the tag sync vs async restriction logic Make them fully share the tag space, but disallow async requests from using any of the last two slots. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-20 08:54:31 +02:00
Ingo Molnar	c44d70a340	perf_counter: fix counter inheritance race Context rotation should not occur when we are in the middle of walking the counter list when inheriting counters ... [ Impact: fix occasionally incorrect perf stat results ] Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-20 00:22:30 +02:00
Boaz Harrosh	a411f4bbb8	block: Un-export blk_rq_append_bio OSD was the last in-tree user of blk_rq_append_bio(). Now that it is fixed blk_rq_append_bio is un-exported and is only used internally by block layer. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-19 12:14:56 +02:00
Boaz Harrosh	79eb63e9e5	block: Add blk_make_request(), takes bio, returns a request New block API: given a struct bio allocates a new request. This is the parallel of generic_make_request for BLOCK_PC commands users. The passed bio may be a chained-bio. The bio is bounced if needed inside the call to this member. This is in the effort of un-exporting blk_rq_append_bio(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> CC: Jeff Garzik <jeff@garzik.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-19 12:14:56 +02:00
Ingo Molnar	4200efd9ac	sched: properly define the sched_group::cpumask and sched_domain::span fields Properly document the variable-size structure tricks we are doing wrt. struct sched_group and sched_domain, and use the field[0] GCC extension instead of defining a vla array. Don't use unions for this, as pointed out by Linus. [ Impact: cleanup, un-confuse Sparse and LLVM ] Reported-by: Jeff Garzik <jeff@garzik.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.01.0905180850110.3301@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-19 09:22:19 +02:00
Eric Paris	75834fc3b6	SELinux: move SELINUX_MAGIC into magic.h The selinuxfs superblock magic is used inside the IMA code, but is being defined in two places and could someday get out of sync. This patch moves the declaration into magic.h so it is only done once. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-05-19 08:19:00 +10:00
Hannes Reinecke	1cde26f928	virtio_blk: SG_IO passthru support Add support for SG_IO passthru to virtio_blk. We add the scsi command block after the normal outhdr, and the scsi inhdr with full status information as well as the sense buffer before the regular inhdr. [hch: forward ported, added the VIRTIO_BLK_F_SCSI flags, some comments and tested the whole beast] [axboe: updated to use ->resid and not dual-path the byte count] Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ checkpatch.pl tweak) Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-18 14:41:30 +02:00
Mel Gorman	eb33575cf6	[ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 pfn_valid() is meant to be able to tell if a given PFN has valid memmap associated with it or not. In FLATMEM, it is expected that holes always have valid memmap as long as there are valid PFNs either side of the hole. In SPARSEMEM, it is assumed that a valid section has a memmap for the entire section. However, ARM and maybe other embedded architectures in the future free memmap backing holes to save memory on the assumption the memmap is never used. The page_zone linkages are then broken even though pfn_valid() returns true. A walker of the full memmap must then do this additional check to ensure the memmap they are looking at is sane by making sure the zone and PFN linkages are still valid. This is expensive, but walkers of the full memmap are extremely rare. This was caught before for FLATMEM and hacked around but it hits again for SPARSEMEM because the page_zone linkages can look ok where the PFN linkages are totally screwed. This looks like a hatchet job but the reality is that any clean solution would end up consuming all the memory saved by punching these unexpected holes in the memmap. For example, we tried marking the memmap within the section invalid but the section size exceeds the size of the hole in most cases so pfn_valid() starts returning false where valid memmap exists. Shrinking the size of the section would increase memory consumption offsetting the gains. This patch identifies when an architecture is punching unexpected holes in the memmap that the memory model cannot automatically detect and sets ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx which is the model sub-architecture this has been reported on but may expand later. When set, walkers of the full memmap must call memmap_valid_within() for each PFN, passing in what it expects the page and zone to be for that PFN. 
If it finds the linkages to be broken, it assumes the memmap is invalid for that PFN. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2009-05-18 11:22:24 +01:00
Ingo Molnar	1079cac0f4	Merge commit 'v2.6.30-rc6' into tracing/core Merge reason: we were on an -rc4 base, sync up to -rc6 Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-18 10:15:35 +02:00
Yinghai Lu	888a589f6b	mm, x86: remove MEMORY_HOTPLUG_RESERVE related code after: \| commit `b263295dbf` \| Author: Christoph Lameter <clameter@sgi.com> \| Date: Wed Jan 30 13:30:47 2008 +0100 \| \| x86: 64-bit, make sparsemem vmemmap the only memory model we don't have MEMORY_HOTPLUG_RESERVE anymore. Historically, x86-64 had an architecture-specific method for memory hotplug whereby it scanned the SRAT for physical memory ranges that could be potentially used for memory hot-add later. By reserving those ranges without physical memory, the memmap would be allocated and left dormant until needed. This depended on the DISCONTIG memory model which has been removed so the code implementing HOTPLUG_RESERVE is now dead. This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE. (Changelog authored by Mel.) v2: updated changelog, and remove hotadd= in doc [ Impact: remove dead code ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: Mel Gorman <mel@csn.ul.ie> Workflow-found-OK-by: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A0C4910.7090508@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-18 09:13:31 +02:00
Ingo Molnar	dc3f81b129	Merge commit 'v2.6.30-rc6' into perfcounters/core Merge reason: this branch was on an -rc4 base, merge it up to -rc6 to get the latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-18 07:37:49 +02:00
Jeff Mahoney	b83674c0da	reiserfs: fixup perms when xattrs are disabled This adds CONFIG_REISERFS_FS_XATTR protection from reiserfs_permission. This is needed to avoid warnings during file deletions and chowns with xattrs disabled. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-17 11:45:45 -07:00
Bartlomiej Zolnierkiewicz	9f36d31437	ide: remove hw_regs_t typedef Remove hw_regs_t typedef and rename struct hw_regs_s to struct ide_hw. There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-17 19:12:25 +02:00
Bartlomiej Zolnierkiewicz	dca3983059	ide: pass number of ports to ide_host_{alloc,add}() (v2) Pass number of ports to ide_host_{alloc,add}() and then update all users accordingly. v2: - drop no longer needed NULL initializers in buddha.c, cmd640.c and gayle.c (noticed by Sergei) There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-17 19:12:24 +02:00
Bartlomiej Zolnierkiewicz	29e52cf793	ide: remove chipset field from hw_regs_t * Convert host drivers that still use hw_regs_t's chipset field to use the one in struct ide_port_info instead. * Move special handling of ide_pci chipset type from ide_hw_configure() to ide_init_port(). * Remove chipset field from hw_regs_t. While at it: - remove stale comment in delkin_cb.c There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-17 19:12:22 +02:00
Bartlomiej Zolnierkiewicz	ca1b96e00a	ide: replace special_t typedef by IDE_SFLAG_* flags Replace: - special_t typedef by IDE_SFLAG_* flags - 'special_t special' ide_drive_t's field by 'u8 special_flags' one There should be no functional changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2009-05-17 19:12:21 +02:00
Martin K. Petersen	4bca328643	libata: Media rotation rate and form factor heuristics This patch provides new heuristics for parsing both the form factor and media rotation rate ATA IDENTIFY words. The reported ATA version must be 7 or greater and the device must return values defined as valid in the standard. Only then are the characteristics reported to SCSI via the VPD B1 page. This seems like a reasonable compromise to me considering that we have been shipping several kernel releases that key off the rotation rate bit without any version checking whatsoever. With no complaints so far. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-05-15 14:14:56 -04:00
Linus Torvalds	c653849981	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Revert "mm: add /proc controls for pdflush threads" viocd: needs to depend on BLOCK block: fix the bio_vec array index out-of-bounds test	2009-05-15 08:05:37 -07:00
Paul Mackerras	9d23a90a67	perf_counter: allow arch to supply event misc flags and instruction pointer At present the values we put in overflow events for the misc flags indicating processor mode and the instruction pointer are obtained using the standard user_mode() and instruction_pointer() functions. Those functions tell you where the performance monitor interrupt was taken, which might not be exactly where the counter overflow occurred, for example because interrupts were disabled at the point where the overflow occurred, or because the processor had many instructions in flight and chose to complete some more instructions beyond the one that caused the counter overflow. Some architectures (e.g. powerpc) can supply more precise information about where the counter overflow occurred and the processor mode at that point. This introduces new functions, perf_misc_flags() and perf_instruction_pointer(), which arch code can override to provide more precise information if available. They have default implementations which are identical to the existing code. This also adds a new misc flag value, PERF_EVENT_MISC_HYPERVISOR, for the case where a counter overflow occurred in the hypervisor. We encode the processor mode in the 2 bits previously used to indicate user or kernel mode; the values for user and kernel mode are unchanged and hypervisor mode is indicated by both bits being set. [ Impact: generalize perfcounter core facilities ] Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <18956.1272.818511.561835@cargo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-15 16:38:56 +02:00
Thomas Gleixner	2d02494f5a	sched, timers: cleanup avenrun users avenrun is a rough estimate so we don't have to worry about consistency of the three avenrun values. Remove the xtime lock dependency and provide a function to scale the values. Cleanup the users. [ Impact: cleanup ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org>	2009-05-15 15:32:45 +02:00
Thomas Gleixner	dce48a84ad	sched, timers: move calc_load() to scheduler Dimitri Sivanich noticed that xtime_lock is held write locked across calc_load() which iterates over all online CPUs. That can cause long latencies for xtime_lock readers on large SMP systems. The load average calculation is a rough estimate anyway so there is no real need to protect the readers vs. the update. It's not a problem when the avenrun array is updated while a reader copies the values. Instead of iterating over all online CPUs let the scheduler_tick code update the number of active tasks shortly before the avenrun update happens. The avenrun update itself is handled by the CPU which calls do_timer(). [ Impact: reduce xtime_lock write locked section ] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org>	2009-05-15 15:32:45 +02:00
Peter Zijlstra	60db5e09c1	perf_counter: frequency based adaptive irq_period Instead of specifying the irq_period for a counter, provide a target interrupt frequency and dynamically adapt the irq_period to match this frequency. [ Impact: new perf-counter attribute/feature ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090515132018.646195868@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-15 15:26:56 +02:00
Peter Zijlstra	789f90fcf6	perf_counter: per user mlock gift Instead of a per-process mlock gift for perf-counters, use a per-user gift so that there is less of a DoS potential. [ Impact: allow less worst-case unprivileged memory consumption ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <20090515132018.496182835@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-15 15:26:56 +02:00
Jens Axboe	cd17cbfda0	Revert "mm: add /proc controls for pdflush threads" This reverts commit `fafd688e4c`. Work is progressing to switch away from pdflush as the process backing for flushing out dirty data. So it seems pointless to add more knobs to control pdflush threads. The original author of the patch did not have any specific use cases for adding the knobs, so we can easily revert this before 2.6.30 to avoid having to maintain this API forever. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-15 11:32:24 +02:00
Peter Zijlstra	9e35ad388b	perf_counter: Rework the perf counter disable/enable The current disable/enable mechanism is: token = hw_perf_save_disable(); ... /* do bits */ ... hw_perf_restore(token); This works well, provided that the use nests properly. Except we don't. x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore provide a reference counter disable/enable interface, where the first disable disables the hardware, and the last enable enables the hardware again. [ Impact: refactor, simplify the PMU disable/enable logic ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-15 09:47:02 +02:00
Takashi Iwai	9fc20f030b	ALSA: ctxfi - Move PCI ID definitions to linux/pci_ids.h Signed-off-by: Takashi Iwai <tiwai@suse.de>	2009-05-14 15:14:18 +02:00
Linus Torvalds	bd99f5e17b	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: dma: fix ipu_idmac.c to not discard the last queued buffer ioatdma: fix "ioatdma frees DMA memory with wrong function" ipu_idmac: Use disable_irq_nosync() from within irq handlers. dmatest: fix max channels handling	2009-05-12 17:12:36 -07:00
Maciej Sosnowski	4f005dbe55	ioatdma: fix "ioatdma frees DMA memory with wrong function" as reported by Alexander Beregalov <a.beregalov@gmail.com> ioatdma 0000:00:08.0: DMA-API: device driver frees DMA memory with wrong function [device address=0x000000007f76f800] [size=2000 bytes] [mapped as single] [unmapped as page] The ioatdma driver was unmapping all regions (either allocated as page or single) using unmap_page. This patch lets dma driver recognize if unmap_single or unmap_page should be used. It introduces two new dma control flags: DMA_COMPL_SRC_UNMAP_SINGLE and DMA_COMPL_DEST_UNMAP_SINGLE. They should be set to indicate dma driver to do dma-unmapping as single (first one for the source, the latter for the destination). If respective flag is not set, the driver assumes dma-unmapping as page. Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Tested-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2009-05-12 14:41:47 -07:00
Arnd Bergmann	ecf4667d30	syscalls.h add the missing sys_pipe2 declaration In order to build the generic syscall table, we need a declaration for every system call. sys_pipe2 was added without a proper declaration, so add this to syscalls.h now. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-12 14:11:35 -07:00
Ingo Molnar	6cda3eb62e	Merge branch 'x86/apic' into irq/numa Merge reason: both topics modify the APIC code but were able to do it in parallel so far. An upcoming patch generates a conflict so merge them to avoid the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-12 12:17:36 +02:00
Jens Axboe	2b1ccc0ee9	splice: fix misleading comment Splice is tied to pipes by design, it'll not change. And now that the splice stuff is in splice.h (and not pipe.h), the rest of the comment is out-of-date as well. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-12 11:11:48 +02:00
Ingo Molnar	41fb454ebe	Merge commit 'v2.6.30-rc5' into core/iommu Merge reason: core/iommu was on an .30-rc1 base, update it to .30-rc5 to refresh. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-11 14:44:31 +02:00
Miklos Szeredi	6818173bd6	splice: implement default splice_read method If f_op->splice_read() is not implemented, fall back to a plain read. Use vfs_readv() to read into previously allocated pages. This will allow splice and functions using splice, such as the loop device, to work on all filesystems. This includes "direct_io" files in fuse which bypass the page cache. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 14:13:10 +02:00
Ingo Molnar	7961386fe9	Merge commit 'v2.6.30-rc5' into sched/core Merge reason: sched/core was on .30-rc1 before, update to latest fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-11 12:59:37 +02:00
FUJITA Tomonori	b1f744937f	block: move completion related functions back to blk-core.c Let's put the completion related functions back to block/blk-core.c where they have lived. We can also unexport blk_end_bidi_request() and __blk_end_bidi_request(), which nobody uses. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 11:06:48 +02:00
FUJITA Tomonori	1822952ba2	block: let blk_end_request_all handle bidi requests blk_end_request_all() and __blk_end_request_all() should finish all bytes including bidi, by definition. That's what all bidi users need, as bidi requests must be completed as a whole (partial completion is impossible). Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 11:06:47 +02:00
Tejun Heo	9934c8c045	block: implement and enforce request peek/start/fetch Till now block layer allowed two separate modes of request execution. A request is always acquired from the request queue via elv_next_request(). After that, drivers are free to either dequeue it or process it without dequeueing. Dequeue allows elv_next_request() to return the next request so that multiple requests can be in flight. Executing requests without dequeueing has its merits mostly in allowing drivers for simpler devices which can't do sg to deal with segments only without considering request boundary. However, the benefit this brings is dubious and declining while the cost of the API ambiguity is increasing. Segment based drivers are usually for very old or limited devices and as converting to dequeueing model isn't difficult, it doesn't justify the API overhead it puts on block layer and its more modern users. Previous patches converted all block low level drivers to dequeueing model. This patch completes the API transition by... * renaming elv_next_request() to blk_peek_request() * renaming blkdev_dequeue_request() to blk_start_request() * adding blk_fetch_request() which is combination of peek and start * disallowing completion of queued (not started) requests * applying new API to all LLDs Renamings are for consistency and to break out of tree code so that it's apparent that out of tree drivers need updating. [ Impact: block request issue API cleanup, no functional change ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Mike Miller <mike.miller@hp.com> Cc: unsik Kim <donari75@gmail.com> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: David S. 
Miller <davem@davemloft.net> Cc: Laurent Vivier <Laurent@lvivier.info> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Adrian McMenamin <adrian@mcmen.demon.co.uk> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: Pierre Ossman <drzeus@drzeus.cx> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Markus Lidel <Markus.Lidel@shadowconnect.com> Cc: Stefan Weinhuber <wein@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:52:18 +02:00
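The peek/start/fetch split described in this commit can be sketched with a toy queue. This is only an illustration under stated assumptions: every `toy_*` name below is hypothetical and none of it is the kernel's implementation. In particular, refusing completion by returning `false` is a simplification chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical. */
struct toy_request {
    int id;
    bool started;               /* set once the request has been dequeued */
    struct toy_request *next;
};

struct toy_queue {
    struct toy_request *head;
};

/* Look at the next request without removing it from the queue. */
static struct toy_request *toy_peek_request(struct toy_queue *q)
{
    return q->head;
}

/* Dequeue a request previously returned by toy_peek_request(). */
static void toy_start_request(struct toy_queue *q, struct toy_request *rq)
{
    assert(rq != NULL && q->head == rq && !rq->started);
    q->head = rq->next;
    rq->next = NULL;
    rq->started = true;
}

/* Peek and start in one step; returns NULL on an empty queue. */
static struct toy_request *toy_fetch_request(struct toy_queue *q)
{
    struct toy_request *rq = toy_peek_request(q);

    if (rq)
        toy_start_request(q, rq);
    return rq;
}

/* Completing a request that is still queued (not started) is refused. */
static bool toy_end_request(const struct toy_request *rq)
{
    return rq->started;
}
```

Because a started request leaves the queue, the next peek returns the following request, which is what allows several requests to be in flight at once.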
Tejun Heo	a2dec7b363	block: hide request sector and data_len Block low level drivers for some reason have been pretty good at abusing block layer API. Especially struct request's fields tend to get violated in all possible ways. Make it clear that low level drivers MUST NOT access or manipulate rq->sector and rq->data_len directly by prefixing them with double underscores. This change is also necessary to break build of out-of-tree codes which assume the previous block API where internal fields can be manipulated and rq->data_len carries residual count on completion. [ Impact: hide internal fields, block API change ] Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:55 +02:00
Tejun Heo	2e46e8b27a	block: drop request->hard_* and nr_sectors struct request has had a few different ways to represent some properties of a request. ->hard_ represent block layer's view of the request progress (completion cursor) and the ones without the prefix are supposed to represent the issue cursor and allowed to be updated as necessary by the low level drivers. The thing is that as block layer supports partial completion, the two cursors really aren't necessary and only cause confusion. In addition, manual management of request detail from low level drivers is cumbersome and error-prone at the very least. Another interesting duplicate fields are rq->[hard_]nr_sectors and rq->{hard_cur\|current}_nr_sectors against rq->data_len and rq->bio->bi_size. This is more convoluted than the hard_ case. rq->[hard_]nr_sectors are initialized for requests with bio but blk_rq_bytes() uses it only for !pc requests. rq->data_len is initialized for all request but blk_rq_bytes() uses it only for pc requests. This causes good amount of confusion throughout block layer and its drivers and determining the request length has been a bit of black magic which may or may not work depending on circumstances and what the specific LLD is actually doing. rq->{hard_cur\|current}_nr_sectors represent the number of sectors in the contiguous data area at the front. This is mainly used by drivers which transfers data by walking request segment-by-segment. This value always equals rq->bio->bi_size >> 9. However, data length for pc requests may not be multiple of 512 bytes and using this field becomes a bit confusing. In general, having multiple fields to represent the same property leads only to confusion and subtle bugs. With recent block low level driver cleanups, no driver is accessing or manipulating these duplicate fields directly. Drop all the duplicates. Now rq->sector means the current sector, rq->data_len the current total length and rq->bio->bi_size the current segment length. 
Everything else is defined in terms of these three and available only through accessors. * blk_recalc_rq_sectors() is collapsed into blk_update_request() and now handles pc and fs requests equally other than rq->sector update. This means that now pc requests can use partial completion too (no in-kernel user yet though). * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer now uses byte count as the primary data length. * blk_rq_pos() is now guaranteed to be always correct. In-block users converted. * blk_rq_bytes() is now guaranteed to be always valid as is blk_rq_sectors(). In-block users converted. * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9. More convenient one is used. * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const pointer to request. [ Impact: API cleanup, single way to represent one property of a request ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
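The "three fields plus accessors" model from this commit message can be sketched as follows. This is a minimal illustration, assuming a simplified structure: `struct toy_request`, its field names, and the `toy_rq_*` helpers are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified request: one field per property. */
struct toy_request {
    uint64_t sector;        /* current sector */
    unsigned int data_len;  /* current total length in bytes */
    unsigned int seg_len;   /* current segment length in bytes */
};

/* Accessors take a const pointer, as the message describes. */
static inline uint64_t toy_rq_pos(const struct toy_request *rq)
{
    return rq->sector;
}

static inline unsigned int toy_rq_bytes(const struct toy_request *rq)
{
    return rq->data_len;
}

static inline unsigned int toy_rq_cur_bytes(const struct toy_request *rq)
{
    return rq->seg_len;
}

/* Sector count is derived from the byte count (512-byte sectors). */
static inline unsigned int toy_rq_sectors(const struct toy_request *rq)
{
    return toy_rq_bytes(rq) >> 9;
}

/* Sanity check: the current segment cannot exceed the whole request. */
static inline void toy_rq_check(const struct toy_request *rq)
{
    assert(toy_rq_cur_bytes(rq) <= toy_rq_bytes(rq));
}
```

Deriving the sector count from the byte count means the two values cannot drift apart, which is the confusion the duplicate fields used to cause.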
Tejun Heo	5b93629b45	block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones Implement accessors - blk_rq_pos(), blk_rq_sectors() and blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors and rq->hard_cur_sectors respectively and convert direct references of the said fields to the accessors. This is in preparation of request data length handling cleanup. Geert : suggested adding const to struct request * parameter to accessors Sergei : spotted error in patch description [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:53 +02:00
Tejun Heo	c3a4d78c58	block: add rq->resid_len rq->data_len served two purposes - the length of data buffer on issue and the residual count on completion. This duality creates some headaches. First of all, block layer and low level drivers can't really determine what rq->data_len contains while a request is executing. It could be the total request length or it could be anything else one of the lower layers is using to keep track of residual count. This complicates things because blk_rq_bytes() and thus [__]blk_end_request_all() relies on rq->data_len for PC commands. Drivers which want to report residual count should first cache the total request length, update rq->data_len and then complete the request with the cached data length. Secondly, it makes requests default to reporting full residual count, ie. reporting that no data transfer occurred. The residual count is an exception not the norm; however, the driver should clear rq->data_len to zero to signify the normal cases while leaving it alone means no data transfer occurred at all. This reverse default behavior complicates code unnecessarily and renders block PC on some drivers (ide-tape/floppy) unusable. This patch adds rq->resid_len which is used only for residual count. While at it, remove now unnecessary blk_rq_bytes() caching in ide_pc_intr() as rq->data_len is not changed anymore. 
Boaz : spotted missing conversion in osd Sergei : spotted too early conversion to blk_rq_bytes() in ide-tape [ Impact: cleanup residual count handling, report 0 resid by default ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: Mike Miller <mike.miller@hp.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:53 +02:00
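The separation of total length and residual count described in this commit can be sketched like this. It is a toy model under stated assumptions: `struct resid_request` and the `resid_*` helpers are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical request with separate length and residual fields. */
struct resid_request {
    unsigned int data_len;   /* total length; stays constant while executing */
    unsigned int resid_len;  /* bytes NOT transferred; 0 is the default */
};

/* A freshly issued request reports zero residual unless told otherwise. */
static void resid_request_init(struct resid_request *rq, unsigned int len)
{
    rq->data_len = len;
    rq->resid_len = 0;
}

/* A driver that transferred only `done` bytes records the shortfall. */
static void resid_report_transfer(struct resid_request *rq, unsigned int done)
{
    assert(done <= rq->data_len);
    rq->resid_len = rq->data_len - done;
}

/* Bytes actually transferred, derived from the two fields. */
static unsigned int resid_transferred(const struct resid_request *rq)
{
    return rq->data_len - rq->resid_len;
}
```

With this layout a driver that does nothing special reports a complete transfer, and the total length never needs to be cached before completion.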
Ingo Molnar	7a309490da	Merge commit 'v2.6.30-rc5' into x86/apic Merge reason: this branch was on a .30-rc2 base - sync it up with all the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-11 09:50:02 +02:00
David Howells	5e751e992f	CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace Rename cred_exec_mutex to reflect that it's a guard against foreign intervention on a process's credential state, such as is made by ptrace(). The attachment of a debugger to a process affects execve()'s calculation of the new credential state - _and_ also setprocattr()'s calculation of that state. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-05-11 08:15:36 +10:00
Linus Torvalds	0016effb90	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Revert driver core: move platform_data into platform_device Revert driver core: fix passing platform_data Remove old PRINTK_DEBUG config item Doc/sysfs-rules: Swap the order of the words so the sentence makes more sense Driver core: platform: fix kernel-doc warnings	2009-05-10 10:49:31 -07:00
Linus Torvalds	93b49d45eb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits) Fix the race between capifs remount and node creation Fix races around the access to ->s_options switch ufs directories to ufs_sync_file() Switch open_exec() and sys_uselib() to do_open_filp() Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts Reduce path_lookup() abuses Make checkpatch.pl shut up on fs/inode.c NULL noise in fs/super.c:kill_bdev_super() romfs: cleanup romfs_fs.h ROMFS: romfs_dev_read() error ignored fs: dcache fix LRU ordering ocfs2: Use nd_set_link(). Fix deadlock in ipathfs ->get_sb() Fix a leak in failure exit in 9p ->get_sb() Convert obvious places to deactivate_locked_super() New helper: deactivate_locked_super() reiserfs: remove privroot hiding in lookup reiserfs: dont associate security.* with xattr files reiserfs: fixup xattr_root caching Always lookup priv_root on reiserfs mount and keep it ...	2009-05-10 10:49:08 -07:00
Linus Torvalds	2ad20802b7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits) bonding: fix panic if initialization fails IXP4xx: complete Ethernet netdev setup before calling register_netdev(). IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization. ipvs: Fix IPv4 FWMARK virtual services ipv4: Make INET_LRO a bool instead of tristate. net: remove stale reference to fastroute from Kconfig help text net: update skb_recycle_check() for hardware timestamping changes bnx2: Fix panic in bnx2_poll_work(). net-sched: fix bfifo default limit igb: resolve panic on shutdown when SR-IOV is enabled wimax: oops: wimax_dev_add() is the only one that can initialize the state wimax: fix oops if netlink fails to add attribute Bluetooth: Move dev_set_name() to a context that can sleep netfilter: ctnetlink: fix wrong message type in user updates netfilter: xt_cluster: fix use of cluster match with 32 nodes netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONE netfilter: add missing linux/types.h include to xt_LED.h mac80211: pid, fix memory corruption mac80211: minstrel, fix memory corruption cfg80211: fix comment on regulatory hint processing ...	2009-05-10 10:46:45 -07:00
Al Viro	2a32cebd6c	Fix races around the access to ->s_options Put generic_show_options read access to s_options under rcu_read_lock, split save_mount_options() into "we are setting it the first time" (uses in foo_fill_super()) and "we are replacing and freeing the old one", synchronize_rcu() before kfree() in the latter. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:51:34 -04:00
Al Viro	6e8341a11e	Switch open_exec() and sys_uselib() to do_open_filp() ... and make path_lookup_open() static Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:42 -04:00
Christoph Hellwig	db6c1fbb92	romfs: cleanup romfs_fs.h There's no kernel-only content in it anymore, so move it to header-y and remove the superfluous #ifdef __KERNEL__. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:41 -04:00
Al Viro	74dbbdd7fd	New helper: deactivate_locked_super() Does equivalent of up_write(&s->s_umount); deactivate_super(s); However, it does not unlock it until it's all over. As a result, it's safe to use to dispose of new superblock on ->get_sb() failure exits - nobody will see the sucker until it's all over. Equivalent using up_write/deactivate_super is safe for that purpose if superblock is either safe to use or has NULL ->s_root when we unlock. Normally filesystems take the required precautions, but a) we do have bugs in that area in some of them. b) up_write/deactivate_super sequence is extremely common, so the helper makes sense anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:39 -04:00
Jeff Mahoney	677c9b2e39	reiserfs: remove privroot hiding in lookup With Al Viro's patch to move privroot lookup to fs mount, there's no need to have special code to hide the privroot in reiserfs_lookup. I've also cleaned up the privroot hiding in reiserfs_readdir_dentry and removed the last user of reiserfs_xattrs(). Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:39 -04:00
Jeff Mahoney	ab17c4f021	reiserfs: fixup xattr_root caching The xattr_root caching was broken from my previous patch set. It wouldn't cause corruption, but could cause decreased performance due to allocating a larger chunk of the journal (~ 27 blocks) than it would actually use. This patch loads the xattr root dentry at xattr initialization and creates it on-demand. Since we're using the cached dentry, there's no point in keeping lookup_or_create_dir around, so that's removed. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:39 -04:00
Al Viro	edcc37a047	Always lookup priv_root on reiserfs mount and keep it ... even if it's a negative dentry. That way we can set ->d_op on root before anyone could race with us. Simplify d_compare(), while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:38 -04:00
Greg Kroah-Hartman	e67c85626c	Revert driver core: move platform_data into platform_device This reverts commit 006f4571a15fae3a0575f2a0f9e9b63b3d1012f8: This patch moves platform_data from struct device into struct platform_device, based on the two ideas: 1. Now all platform_driver is registered by platform_driver_register, which makes probe()/release()/... of platform_driver passed parameter of platform_device *, so platform driver can get platform_data from platform_device; 2. Other kind of devices do not need to use platform_data, we can decrease size of device if moving it to platform_device. Taking into consideration of thousands of files to be fixed and they can't be finished in one night(maybe it will take a long time), so we keep platform_data in device to allow two kind of cases coexist until all platform devices pass its platform data from platform_device->platform_data. All patches to do this kind of conversion are welcome. As we don't really want to do it, it was a bad idea. Cc: David Brownell <david-b@pacbell.net> Cc: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-05-08 19:22:21 -07:00
Steven Rostedt	4671c79408	tracing: add trace_set_clr_event to export event enabling function Other parts of the kernel may need to be able to enable or disable specific events. Especially parts that create trace events. [ Impact: allow enabling of trace events by those that create the event ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-05-08 16:30:26 -04:00
Peter Zijlstra	f370e1e2f1	perf_counter: add PERF_RECORD_CPU Allow recording the CPU number the event was generated on. RFC: this leaves a u32 as reserved, should we fill in the node_id() there, or leave this open for future extension, as userspace can already easily do the cpu->node mapping if needed. [ Impact: extend perfcounter output record format ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090508170029.008627711@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-08 20:36:59 +02:00
Peter Zijlstra	a85f61abe1	perf_counter: add PERF_RECORD_CONFIG Much like CONFIG_RECORD_GROUP records the hw_event.config to identify the values, allow to record this for all counters. [ Impact: extend perfcounter output record format ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090508170028.923228280@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-08 20:36:58 +02:00
Peter Zijlstra	3df5edad87	perf_counter: rework ioctl()s Corey noticed that ioctl()s on grouped counters didn't work on the whole group. This extends the ioctl() interface to take a second argument that is interpreted as a flags field. We then provide PERF_IOC_FLAG_GROUP to toggle the behaviour. Having this flag gives the greatest flexibility, allowing you to individually enable/disable/reset counters in a group, or all together. [ Impact: fix group counter enable/disable semantics ] Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090508170028.837558214@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-08 20:36:58 +02:00
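The flag-controlled group behaviour described in this commit can be sketched as a toy dispatcher. This is an illustration only: `TOY_IOC_FLAG_GROUP`, its value, and the `toy_counter` type are hypothetical and do not reproduce the kernel's perf_counter code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_IOC_FLAG_GROUP 0x1u  /* hypothetical flag value */

struct toy_counter {
    bool enabled;
    struct toy_counter *sibling;  /* next counter in the group, or NULL */
};

/*
 * Enable a single counter, or, when TOY_IOC_FLAG_GROUP is set,
 * the group leader together with all of its siblings.
 */
static void toy_counter_enable(struct toy_counter *leader, unsigned int flags)
{
    struct toy_counter *c;

    assert(leader != NULL);
    leader->enabled = true;
    if (!(flags & TOY_IOC_FLAG_GROUP))
        return;
    for (c = leader->sibling; c != NULL; c = c->sibling)
        c->enabled = true;
}
```

Passing the scope as a flag keeps both behaviours available through one call, which is the flexibility the message argues for.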
Magnus Damm	501b825d01	sh-sci: improve clock framework support Use enable/disable hooks for clock framework integration. Make sure we control the clock for the serial console as well. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-05-08 23:22:26 +09:00
Magnus Damm	9080b72819	sh-sci: remove early_sci_setup() Remove unused early_sci_setup() function from sh-sci. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-05-08 22:53:58 +09:00
James Morris	d254117099	Merge branch 'master' into next	2009-05-08 17:56:47 +10:00
Linus Torvalds	d7a5926978	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits) [CIFS] Fix double list addition in cifs posix open code [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp [CIFS] Fix SMB uid in NTLMSSP authenticate request [CIFS] NTLMSSP reenabled after move from connect.c to sess.c [CIFS] Remove sparse warning [CIFS] remove checkpatch warning [CIFS] Fix final user of old string conversion code [CIFS] remove cifs_strfromUCS_le [CIFS] NTLMSSP support moving into new file, old dead code removed [CIFS] Fix endian conversion of vcnum field [CIFS] Remove trailing whitespace [CIFS] Remove sparse endian warnings [CIFS] Add remaining ntlmssp flags and standardize field names [CIFS] Fix build warning cifs: fix length handling in cifs_get_name_from_search_buf [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status [CIFS] rename cifs_strndup to cifs_strndup_from_ucs Added loop check when mounting DFS tree. Enable dfs submounts to handle remote referrals. [CIFS] Remove older session setup implementation ...	2009-05-07 21:13:24 -07:00
Geert Uytterhoeven	08ce4c91e4	dlm: Make name input parameter of {,dlm_}new_lockspace() const \| fs/gfs2/lock_dlm.c:207: warning: passing argument 1 of 'dlm_new_lockspace' discards qualifiers from pointer target type Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-07 10:14:26 -05:00
Ingo Molnar	0ad5d703c6	Merge branch 'tracing/hw-branch-tracing' into tracing/core Merge reason: this topic is ready for upstream now. It passed Oleg's review and Andrew had no further mm/* objections/observations either. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-07 13:36:22 +02:00
Ingo Molnar	44347d947f	Merge branch 'linus' into tracing/core Merge reason: tracing/core was on a .30-rc1 base and was missing out on a handful of tracing fixes present in .30-rc5-almost. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-07 11:17:34 +02:00
Alan D. Brunelle	a42aaa3bbc	blktrace: correct remap names This attempts to clarify names utilized during block I/O remap operations (partition, volume manager). It correctly matches up the /from/ information for both device & sector. This takes in the concept from Kosaki Motohiro and extends it to include better naming for the "device_from" field. [ Impact: cleanup ] Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <49FF4FAE.3000301@hp.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-06 14:13:00 +02:00
Li Zefan	2df75e4157	tracing/events: fix memory leak when unloading module When unloading a module, memory allocated by init_preds() and trace_define_field() is not freed. [ Impact: fix memory leak ] Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <4A00F6E0.3040503@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-06 10:38:19 +02:00
Ingo Molnar	3611dfb8ed	Merge branch 'core/locking' into perfcounters/core Merge reason: we moved a mutex.h commit that originated from the perfcounters tree into core/locking - but now merge back that branch to solve a merge artifact and to pick up cleanups of this commit that happened in core/locking. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-06 08:47:26 +02:00
David S. Miller	356d6c2d55	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-05-05 12:00:53 -07:00
Peter Zijlstra	c5078f78b4	perf_counter: provide an mlock threshold Provide a threshold to relax the mlock accounting, increasing usability. Each counter gets perf_counter_mlock_kb for free. [ Impact: allow more mmap buffering ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> LKML-Reference: <20090505155437.112113632@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-05 20:18:32 +02:00
Peter Zijlstra	6de6a7b957	perf_counter: add ioctl(PERF_COUNTER_IOC_RESET) Provide a way to reset an existing counter - this eases PAPI libraries around perfcounters. Similar to read() it doesn't collapse pending child counters. [ Impact: new perfcounter fd ioctl method to reset counters ] Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090505155437.022272933@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-05 20:18:31 +02:00
Peter Zijlstra	c66de4a5be	perf_counter: uncouple data_head updates from wakeups Keep data_head up-to-date irrespective of notifications. This fixes the case where you disable a counter and don't get a notification for the last few pending events, and it also allows polling usage. [ Impact: increase precision of perfcounter mmap-ed fields ] Suggested-by: Corey Ashford <cjashfor@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <20090505155436.925084300@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-05 20:18:30 +02:00
Steven Rostedt	f0d2c681ac	ring-buffer: add counters for commit overrun and nmi dropped entries The WARN_ON in the ring buffer when a commit is preempted and the buffer is filled by preceding writes can happen in normal operations. The WARN_ON makes it look like a bug, not to mention, because it does not stop tracing and calls printk which can also recurse, this is prone to deadlock (the WARN_ON is not in a position to recurse). This patch removes the WARN_ON and replaces it with a counter that can be retrieved by a tracer. This counter is called commit_overrun. While at it, I added a nmi_dropped counter to count any time an NMI entry is dropped because the NMI could not take the spinlock. [ Impact: prevent deadlock by printing normal case warning ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-05-05 13:51:02 -04:00
Pablo Neira Ayuso	280f37afa2	netfilter: xt_cluster: fix use of cluster match with 32 nodes This patch fixes a problem when you use 32 nodes in the cluster match: % iptables -I PREROUTING -t mangle -i eth0 -m cluster \ --cluster-total-nodes 32 --cluster-local-node 32 \ --cluster-hash-seed 0xdeadbeef -j MARK --set-mark 0xffff iptables: Invalid argument. Run `dmesg' for more information. % dmesg \| tail -1 xt_cluster: this node mask cannot be higher than the total number of nodes The problem is related to this checking: if (info->node_mask >= (1 << info->total_nodes)) { printk(KERN_ERR "xt_cluster: this node mask cannot be " "higher than the total number of nodes\n"); return false; } (1 << 32) is 1. Thus, the checking fails. BTW, I said this before but I insist: I have only tested the cluster match with 2 nodes getting ~45% extra performance in an active-active setup. The maximum limit of 32 nodes is still completely arbitrary. I'd really appreciate if people that have more nodes in their setups let me know. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-05 17:46:07 +02:00
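The failing comparison quoted in this commit comes down to shift width: in C, shifting a 32-bit `int` by 32 bits is undefined, and the message reports that it evaluated to 1 in practice. One way to keep the bound well defined is to do the shift in 64 bits. The sketch below assumes that approach; `node_mask_valid()` is a hypothetical stand-in, and this log does not show whether the actual patch looks like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the quoted check. The bound is computed
 * in 64 bits, so total_nodes == 32 gives 2^32 rather than an
 * undefined 32-bit shift.
 */
static bool node_mask_valid(uint32_t node_mask, uint32_t total_nodes)
{
    assert(total_nodes >= 1 && total_nodes <= 32);
    return (uint64_t)node_mask < (1ULL << total_nodes);
}
```

With 32 total nodes and local node 32, the mask is `1u << 31`, which is below `1ULL << 32` and is therefore accepted.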
Linus Torvalds	80445de577	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits) e1000: fix virtualization bug bonding: fix alb mode locking regression Bluetooth: Fix issue with sysfs handling for connections usbnet: CDC EEM support (v5) tcp: Fix tcp_prequeue() to get correct rto_min value ehea: fix invalid pointer access ne2k-pci: Do not register device until initialized. Subject: [PATCH] br2684: restore net_dev initialization net: Only store high 16 bits of kernel generated filter priorities virtio_net: Fix function name typo virtio_net: Cleanup command queue scatterlist usage bonding: correct the cleanup in bond_create() virtio: add missing include to virtio_net.h smsc95xx: add support for LAN9512 and LAN9514 smsc95xx: configure LED outputs netconsole: take care of NETDEV_UNREGISTER event xt_socket: checks for the state of nf_conntrack bonding: bond_slave_info_query() fix cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’ netfilter: use likely() in xt_info_rdlock_bh() ...	2009-05-05 08:26:10 -07:00
Patrick McHardy	a7ca7fccac	netfilter: add missing linux/types.h include to xt_LED.h Pointed out by Dave Miller: CHECK include/linux/netfilter (57 files) /home/davem/src/GIT/net-2.6/usr/include/linux/netfilter/xt_LED.h:6: found __[us]{8,16,32,64} type without #include <linux/types.h> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-05 14:31:12 +02:00
Omar Laazimani	9f722c0978	usbnet: CDC EEM support (v5) This introduces a CDC Ethernet Emulation Model (EEM) host side driver to support USB EEM devices. EEM is different from the Ethernet Control Model (ECM) currently supported by the "CDC Ethernet" driver. One key difference is that it doesn't require of USB interface alternate settings to manage interface state; some maldesigned hardware can't handle that part of USB. It also avoids a separate USB interface for control and status updates. [ dbrownell@users.sourceforge.net: fix skb leaks, add rx packet checks, improve fault handling, EEM conformance updates, cleanup ] Signed-off-by: Omar Laazimani <omar.oberthur@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-04 12:01:43 -07:00
Ingo Molnar	0d905bca23	perf_counter: initialize the per-cpu context earlier percpu scheduling for perfcounters wants to take the context lock, but that lock first needs to be initialized. Currently it is an early_initcall() - but that is too late, the task tick runs much sooner than that. Call it explicitly from the scheduler init sequence instead. [ Impact: fix access-before-init crash ] LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-05-04 19:30:32 +02:00
Paul Mundt	46a12f7426	sh: Consolidate MTU2/CMT/TMU timer platform data. All of the SH timers use a roughly identical structure for platform data, which presently is broken out for each block. Consolidate all of these definitions, as there is no reason for them to be broken out in the first place. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-05-03 17:57:17 +09:00
Magnus Damm	9570ef2042	clocksource: SuperH TMU Timer driver This patch adds a TMU driver for the SuperH architecture. The TMU driver is a platform driver with early platform support to allow using a TMU channel as clockevent or clocksource during system bootup or later. Clocksource or clockevent can be selected. Both periodic and oneshot clockevents are supported. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-05-03 17:41:15 +09:00
Magnus Damm	d5ed4c2e5c	clocksource: SuperH MTU2 Timer driver This patch adds a MTU2 driver for the SuperH architecture. The MTU2 driver is a platform driver with early platform support to allow using a MTU2 channel as only clockevent during system bootup. Clocksource on sh2a is currently unsupported due to code generation issues with 64-bit math, so at this point only periodic clockevent support is in place. Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-05-03 17:36:02 +09:00
Linus Torvalds	7b39da786a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide-cd: fix REQ_QUIET tests in cdrom_decode_status Fix up trivial conflicts in include/linux/blkdev.h	2009-05-02 16:48:32 -07:00
Linus Torvalds	8c0c3f7ff0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: document the multi-touch (MT) protocol Input: add detailed multi-touch finger data report protocol Input: allow certain EV_ABS events to bypass all filtering Input: bcm5974 - add documentation for the driver Input: bcm5974 - augment debug information Input: bcm5974 - Add support for the Macbook 5 (Unibody) Input: bcm5974 - add quad-finger tapping Input: bcm5974 - prepare for a new trackpad header type Input: appletouch - fix DMA to/from stack buffer Input: wacom - fix TabletPC touch bug Input: lifebook - add DMI entry for Fujitsu B-2130 Input: ALPS - add signature for Toshiba Satellite Pro M10 Input: elantech - make sure touchpad is really in absolute mode Input: elantech - provide a workaround for jumpy cursor on firmware 2.34 Input: ucb1400 - use disable_irq_nosync() in irq handler Input: tsc2007 - use disable_irq_nosync() in irq handler Input: sa1111ps2 - use disable_irq_nosync() in irq handlers Input: omap-keypad - use disable_irq_nosync() in irq handler	2009-05-02 16:35:45 -07:00
Trond Myklebust	f75e6745aa	SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect See http://bugzilla.kernel.org/show_bug.cgi?id=13034 If the port gets into a TIME_WAIT state, then we cannot reconnect without binding to a new port. Tested-by: Petr Vandrovec <petr@vandrovec.name> Tested-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 16:35:08 -07:00
KOSAKI Motohiro	00a62ce91e	mm: fix Committed_AS underflow on large NR_CPUS environment The Committed_AS field can underflow in certain situations: > # while true; do cat /proc/meminfo \| grep _AS; sleep 1; done \| uniq -c > 1 Committed_AS: 18446744073709323392 kB > 11 Committed_AS: 18446744073709455488 kB > 6 Committed_AS: 35136 kB > 5 Committed_AS: 18446744073709454400 kB > 7 Committed_AS: 35904 kB > 3 Committed_AS: 18446744073709453248 kB > 2 Committed_AS: 34752 kB > 9 Committed_AS: 18446744073709453248 kB > 8 Committed_AS: 34752 kB > 3 Committed_AS: 18446744073709320960 kB > 7 Committed_AS: 18446744073709454080 kB > 3 Committed_AS: 18446744073709320960 kB > 5 Committed_AS: 18446744073709454080 kB > 6 Committed_AS: 18446744073709320960 kB Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does not check for underflow. But NR_CPUS proportional isn't good calculation. In general, possibility of lock contention is proportional to the number of online cpus, not theorical maximum cpus (NR_CPUS). The current kernel has generic percpu-counter stuff. using it is right way. it makes code simplify and percpu_counter_read_positive() don't make underflow issue. Reported-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [All kernel versions] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00
Grant Likely	0763ed2355	of: make of_(un)register_platform_driver common code Some drivers using of_register_platform_driver() wrapper break on sparc because the wrapper isn't in the header file. This patch moves it from Microblaze and PowerPC implementations and makes it common code. Fixes this sparc64 allmodconfig build error (at least): drivers/leds/leds-gpio.c: In function `gpio_led_init': drivers/leds/leds-gpio.c:295: error: implicit declaration of function `of_register_platform_driver' drivers/leds/leds-gpio.c: In function `gpio_led_exit': drivers/leds/leds-gpio.c:311: error: implicit declaration of function `of_unregister_platform_driver' Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: David S. Miller <davem@davemloft.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00
Ivan Kokshaysky	74641f584d	alpha: binfmt_aout fix This fixes the problem introduced by commit `3bfacef412` (get rid of special-casing the /sbin/loader on alpha): osf/1 ecoff binary segfaults when binfmt_aout built as module. That happens because aout binary handler gets on the top of the binfmt list due to late registration, and kernel attempts to execute the binary without preparatory work that must be done by binfmt_loader. Fixed by changing the registration order of the default binfmt handlers using list_add_tail() and introducing insert_binfmt() function which places new handler on the top of the binfmt list. This might be generally useful for installing arch-specific frontends for default handlers or just for overriding them. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Richard Henderson <rth@twiddle.net Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00

16204 Commits