linux_old1

Commit Graph

Author	SHA1	Message	Date
Bartlomiej Zolnierkiewicz	9a3c49be5c	ide: add ide_no_data_taskfile() helper * Add ide_no_data_taskfile() helper and convert ide_raw_taskfile() w/ NO DATA protocol users to use it instead. * Set ->data_phase explicitly in ide_no_data_taskfile() (TASKFILE_NO_DATA is defined as 0x0000). * Unexport task_no_data_intr(). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:07 +01:00
Bartlomiej Zolnierkiewicz	9e42237f26	ide: add ide_tf_load() helper Based on the earlier work by Tejun Heo. * Add 'tf_flags' field (for taskfile flags) to ide_task_t. * Add IDE_TFLAG_LBA48 taskfile flag for LBA48 taskfiles. * Add IDE_TFLAG_NO_SELECT_MASK taskfile flag for __ide_do_rw_disk() which doesn't use SELECT_MASK() (looks like a bug but it requires some more investigation). * Split off ide_tf_load() helper from do_rw_taskfile(). * Convert __ide_do_rw_disk() to use ide_tf_load(). There should be no functionality changes caused by this patch. Cc: Tejun Heo <htejun@gmail.com> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:07 +01:00
Bartlomiej Zolnierkiewicz	650d841d9e	ide: add struct ide_taskfile (take 2) * Don't set write-only ide_task_t.hobRegister[6] and ide_task_t.hobRegister[7] in idedisk_set_max_address_ext(). * Add struct ide_taskfile and use it in ide_task_t instead of tfRegister[] and hobRegister[]. * Remove no longer needed IDE_CONTROL_OFFSET_HOB define. * Add #ifndef/#endif __KERNEL__ around definitions of {task,hob}_struct_t. While at it: * Use ATA_LBA define for LBA bit (0x40) as suggested by Tejun Heo. v2: * Add missing newlines. (Noticed by Sergei) * Use ~ATA_LBA instead of 0xBF. (Noticed by Sergei) * Use unnamed unions for error/feature and status/command. (Suggested by Sergei). There should be no functionality changes caused by this patch. Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Tejun Heo <htejun@gmail.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:06 +01:00
Bartlomiej Zolnierkiewicz	cd2a2d9697	ide: remove task_ioreg_t typedef (take 2) Remove task_ioreg_t typedef from the kernel code (but leave it in <linux/hdreg.h> for #ifndef/#endif __KERNEL__ case). While at it also move sata_ioreg_t typedef under #ifndef/#endif __KERNEL__. v2: Remove name of the second parameter from ide_execute_command() declaration. (Noticed by Sergei). Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:06 +01:00
Bartlomiej Zolnierkiewicz	1c029fd658	ide: remove ->dma_master field from ide_hwif_t (take 5) * Convert cmd64x, hpt366 and pdc202xx_old host drivers to use pci_resource_start(hwif->pci_dev, 4) instead of hwif->dma_master. * Remove no longer needed ->dma_master field from ide_hwif_t. v2: * Use the more readable 'hwif->dma_base - (hwif->channel * 8)' instead of pci_resource_start(hwif->pci_dev, 4). v3: * Use hwif->extra_base in hpt366/pdc20xx_old + some cosmetic fixups over v2 (suggested by Sergei). v4: * Correct offsets in hpt3xxn_set_clock(). v5: * Use hwif->extra_base in hpt366 for _real_ this time. (Noticed by Sergei) Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2008-01-25 22:17:05 +01:00
Hans Verkuil	0b394def21	V4L/DVB (6868): i2c-id.h: add I2C_DRIVERID_CS5345 Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-01-25 19:04:07 -02:00
Hans Verkuil	05e997197e	V4L/DVB (6487): i2c-id: add M52790 driver ID Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>	2008-01-25 19:01:47 -02:00
Arjan van de Ven	6d082592b6	sched: keep total / count stats in addition to the max for Right now, the linux kernel (with scheduler statistics enabled) keeps track of the maximum time a process is waiting to be scheduled. While the maximum is a very useful metric, tracking average and total is equally useful (at least for latencytop) to figure out the accumulated effect of scheduler delays. The accumulated effect is important to judge the performance impact of scheduler tuning/behavior. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:35 +01:00
Alexey Dobriyan	286100a6cf	sched, futex: detach sched.h and futex.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Ingo Molnar	90739081ef	softlockup: fix signedness fix softlockup tunables signedness. mark tunables read-mostly. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Arjan van de Ven	9745512ce7	sched: latencytop support LatencyTOP kernel infrastructure; it measures latencies in the scheduler and tracks it system wide and per process. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Pavel Machek	e118adef23	timers: don't #error on higher HZ values For some crazy reason (trying to work around hw problem in i810) I wanted to use HZ around 4000. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:34 +01:00
Ingo Molnar	6478d8800b	sched: remove the !PREEMPT_BKL code remove the !PREEMPT_BKL code. this removes 160 lines of legacy code. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:33 +01:00
Peter Zijlstra	d3d74453c3	hrtimer: fixup the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallback Currently all highres=off timers are run from softirq context, but HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context. Fix this up by splitting it similar to the highres=on case. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Peter Zijlstra	48d5e25821	sched: rt throttling vs no_hz We need to teach no_hz about the rt throttling because its tick driven. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:31 +01:00
Peter Zijlstra	6f505b1642	sched: rt group scheduling Extend group scheduling to also cover the realtime classes. It uses the time limiting introduced by the previous patch to allow multiple realtime groups. The hard time limit is required to keep behaviour deterministic. The algorithms used make the realtime scheduler O(tg), linear scaling wrt the number of task groups. This is the worst case behaviour I can't seem to get out of, the avg. case of the algorithms can be improved, I focused on correctness and worst case. [ akpm@linux-foundation.org: move side-effects out of BUG_ON(). ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:30 +01:00
Peter Zijlstra	fa85ae2418	sched: rt time limit Very simple time limit on the realtime scheduling classes. Allow the rq's realtime class to consume sched_rt_ratio of every sched_rt_period slice. If the class exceeds this quota the fair class will preempt the realtime class. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:29 +01:00
Peter Zijlstra	8f4d37ec07	sched: high-res preemption tick Use HR-timers (when available) to deliver an accurate preemption tick. The regular scheduler tick that runs at 1/HZ can be too coarse when nice level are used. The fairness system will still keep the cpu utilisation 'fair' by then delaying the task that got an excessive amount of CPU time but try to minimize this by delivering preemption points spot-on. The average frequency of this extra interrupt is sched_latency / nr_latency. Which need not be higher than 1/HZ, its just that the distribution within the sched_latency period is important. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:29 +01:00
Herbert Xu	02b67cc3ba	sched: do not do cond_resched() when CONFIG_PREEMPT Why do we even have cond_resched when real preemption is on? It seems to be a waste of space and time. remove cond_resched with CONFIG_PREEMPT on. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:28 +01:00
Peter Zijlstra	78f2c7db60	sched: SCHED_FIFO/SCHED_RR watchdog timer Introduce a new rlimit that allows the user to set a runtime timeout on real-time tasks their slice. Once this limit is exceeded the task will receive SIGXCPU. So it measures runtime since the last sleep. Input and ideas by Thomas Gleixner and Lennart Poettering. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Lennart Poettering <mzxreary@0pointer.de> CC: Michael Kerrisk <mtk.manpages@googlemail.com> CC: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:27 +01:00
Peter Zijlstra	fa717060f1	sched: sched_rt_entity Move the task_struct members specific to rt scheduling together. A future optimization could be to put sched_entity and sched_rt_entity into a union. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:27 +01:00
Paul E. McKenney	e260be673a	Preempt-RCU: implementation This patch implements a new version of RCU which allows its read-side critical sections to be preempted. It uses a set of counter pairs to keep track of the read-side critical sections and flips them when all tasks exit read-side critical section. The details of this implementation can be found in this paper - http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf and the article- http://lwn.net/Articles/253651/ This patch was developed as a part of the -rt kernel development and meant to provide better latencies when read-side critical sections of RCU don't disable preemption. As a consequence of keeping track of RCU readers, the readers have a slight overhead (optimizations in the paper). This implementation co-exists with the "classic" RCU implementations and can be switched to at compiler. Also includes RCU tracing summarized in debugfs. [ akpm@linux-foundation.org: build fixes on non-preempt architectures ] Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:24 +01:00
Paul E. McKenney	01c1c660f4	Preempt-RCU: reorganize RCU code into rcuclassic.c and rcupdate.c This patch re-organizes the RCU code to enable multiple implementations of RCU. Users of RCU continues to include rcupdate.h and the RCU interfaces remain the same. This is in preparation for subsequently merging the preemptible RCU implementation. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:24 +01:00
Dipankar Sarma	c2d727aa2f	Preempt-RCU: Use softirq instead of tasklets for This patch makes RCU use softirq instead of tasklets. It also adds a memory barrier after raising the softirq inorder to ensure that the cpu sees the most recently updated value of rcu->cur while processing callbacks. The discussion of the related theoretical race pointed out by James Huang can be found here --> http://lkml.org/lkml/2007/11/20/603 Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:23 +01:00
Steven Rostedt	cb46984504	sched: RT-balance, add new methods to sched_class Dmitry Adamushko found that the current implementation of the RT balancing code left out changes to the sched_setscheduler and rt_mutex_setprio. This patch addresses this issue by adding methods to the schedule classes to handle being switched out of (switched_from) and being switched into (switched_to) a sched_class. Also a method for changing of priorities is also added (prio_changed). This patch also removes some duplicate logic between rt_mutex_setprio and sched_setscheduler. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:22 +01:00
Steven Rostedt	9a897c5a67	sched: RT-balance, replace hooks with pre/post schedule and wakeup methods To make the main sched.c code more agnostic to the schedule classes. Instead of having specific hooks in the schedule code for the RT class balancing. They are replaced with a pre_schedule, post_schedule and task_wake_up methods. These methods may be used by any of the classes but currently, only the sched_rt class implements them. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:22 +01:00
Ingo Molnar	32525d022a	sched: whitespace cleanups in topology.h whitespace cleanups in topology.h. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:20 +01:00
Ingo Molnar	52d853431e	sched: reactivate fork balancing reactivate fork balancing. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:20 +01:00
Gregory Haskins	57d885fea0	sched: add sched-domain roots We add the notion of a root-domain which will be used later to rescope global variables to per-domain variables. Each exclusive cpuset essentially defines an island domain by fully partitioning the member cpus from any other cpuset. However, we currently still maintain some policy/state as global variables which transcend all cpusets. Consider, for instance, rt-overload state. Whenever a new exclusive cpuset is created, we also create a new root-domain object and move each cpu member to the root-domain's span. By default the system creates a single root-domain with all cpus as members (mimicking the global state we have today). We add some plumbing for storing class specific data in our root-domain. Whenever a RQ is switching root-domains (because of repartitioning) we give each sched_class the opportunity to remove any state from its old domain and add state to the new one. This logic doesn't have any clients yet but it will later in the series. Signed-off-by: Gregory Haskins <ghaskins@novell.com> CC: Christoph Lameter <clameter@sgi.com> CC: Paul Jackson <pj@sgi.com> CC: Simon Derr <simon.derr@bull.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:18 +01:00
Gregory Haskins	e7693a362e	sched: de-SCHED_OTHER-ize the RT path The current wake-up code path tries to determine if it can optimize the wake-up to "this_cpu" by computing load calculations. The problem is that these calculations are only relevant to SCHED_OTHER tasks where load is king. For RT tasks, priority is king. So the load calculation is completely wasted bandwidth. Therefore, we create a new sched_class interface to help with pre-wakeup routing decisions and move the load calculation as a function of CFS task's class. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:09 +01:00
Gregory Haskins	73fe6aae84	sched: add RT-balance cpu-weight Some RT tasks (particularly kthreads) are bound to one specific CPU. It is fairly common for two or more bound tasks to get queued up at the same time. Consider, for instance, softirq_timer and softirq_sched. A timer goes off in an ISR which schedules softirq_thread to run at RT50. Then the timer handler determines that it's time to smp-rebalance the system so it schedules softirq_sched to run. So we are in a situation where we have two RT50 tasks queued, and the system will go into rt-overload condition to request other CPUs for help. This causes two problems in the current code: 1) If a high-priority bound task and a low-priority unbounded task queue up behind the running task, we will fail to ever relocate the unbounded task because we terminate the search on the first unmovable task. 2) We spend precious futile cycles in the fast-path trying to pull overloaded tasks over. It is therefore optimial to strive to avoid the overhead all together if we can cheaply detect the condition before overload even occurs. This patch tries to achieve this optimization by utilizing the hamming weight of the task->cpus_allowed mask. A weight of 1 indicates that the task cannot be migrated. We will then utilize this information to skip non-migratable tasks and to eliminate uncessary rebalance attempts. We introduce a per-rq variable to count the number of migratable tasks that are currently running. We only go into overload if we have more than one rt task, AND at least one of them is migratable. In addition, we introduce a per-task variable to cache the cpus_allowed weight, since the hamming calculation is probably relatively expensive. We only update the cached value when the mask is updated which should be relatively infrequent, especially compared to scheduling frequency in the fast path. Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:07 +01:00
Ingo Molnar	82a1fcb902	softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks this patch extends the soft-lockup detector to automatically detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are printed the following way: ------------------> INFO: task prctl:3042 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message prctl D fd5e3793 0 3042 2997 f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286 f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000 f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b Call Trace: [<c04883a5>] schedule_timeout+0x6d/0x8b [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17 [<c0133a76>] msleep+0x10/0x16 [<c0138974>] sys_prctl+0x30/0x1e2 [<c0104c52>] sysenter_past_esp+0x5f/0xa5 ======================= 2 locks held by prctl/3042: #0: (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a #1: (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9 <------------------ the current default timeout is 120 seconds. Such messages are printed up to 10 times per bootup. If the system has crashed already then the messages are not printed. if lockdep is enabled then all held locks are printed as well. this feature is a natural extension to the softlockup-detector (kernel locked up without scheduling) and to the NMI watchdog (kernel locked up with IRQs disabled). [ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ] [ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2008-01-25 21:08:02 +01:00
Ingo Molnar	d0d23b5432	cpu-hotplug: fix build on !CONFIG_SMP fix build on !CONFIG_SMP. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	95402b3829	cpu-hotplug: replace per-subsystem mutexes with get_online_cpus() This patch converts the known per-subsystem mutexes to get_online_cpus put_online_cpus. It also eliminates the CPU_LOCK_ACQUIRE and CPU_LOCK_RELEASE hotplug notification events. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	86ef5c9a8e	cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus() Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use get_online_cpus and put_online_cpus instead as it highlights the refcount semantics in these operations. The new API guarantees protection against the cpu-hotplug operation, but it doesn't guarantee serialized access to any of the local data structures. Hence the changes needs to be reviewed. In case of pseries_add_processor/pseries_remove_processor, use cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the cpu_present_map there. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:02 +01:00
Gautham R Shenoy	d221938c04	cpu-hotplug: refcount based cpu hotplug This patch implements a Refcount + Waitqueue based model for cpu-hotplug. Now, a thread which wants to prevent cpu-hotplug, will bump up a global refcount and the thread which wants to perform a cpu-hotplug operation will block till the global refcount goes to zero. The readers, if any, during an ongoing cpu-hotplug operation are blocked until the cpu-hotplug operation is over. Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Paul Jackson <pj@sgi.com> [For !CONFIG_HOTPLUG_CPU ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:01 +01:00
Srivatsa Vaddagiri	6b2d770026	sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups The current load balancing scheme isn't good enough for precise group fairness. For example: on a 8-cpu system, I created 3 groups as under: a = 8 tasks (cpu.shares = 1024) b = 4 tasks (cpu.shares = 1024) c = 3 tasks (cpu.shares = 1024) a, b and c are task groups that have equal weight. We would expect each of the groups to receive 33.33% of cpu bandwidth under a fair scheduler. This is what I get with the latest scheduler git tree: Signed-off-by: Ingo Molnar <mingo@elte.hu> -------------------------------------------------------------------------------- Col1 \| Col2 \| Col3 \| Col4 ------\|---------\|-------\|------------------------------------------------------- a \| 277.676 \| 57.8% \| 54.1% 54.1% 54.1% 54.2% 56.7% 62.2% 62.8% 64.5% b \| 116.108 \| 24.2% \| 47.4% 48.1% 48.7% 49.3% c \| 86.326 \| 18.0% \| 47.5% 47.9% 48.5% -------------------------------------------------------------------------------- Explanation of o/p: Col1 -> Group name Col2 -> Cumulative execution time (in seconds) received by all tasks of that group in a 60sec window across 8 cpus Col3 -> CPU bandwidth received by the group in the 60sec window, expressed in percentage. Col3 data is derived as: Col3 = 100 * Col2 / (NR_CPUS * 60) Col4 -> CPU bandwidth received by each individual task of the group. Col4 = 100 * cpu_time_recd_by_task / 60 [I can share the test case that produces a similar o/p if reqd] The deviation from desired group fairness is as below: a = +24.47% b = -9.13% c = -15.33% which is quite high. After the patch below is applied, here are the results: -------------------------------------------------------------------------------- Col1 \| Col2 \| Col3 \| Col4 ------\|---------\|-------\|------------------------------------------------------- a \| 163.112 \| 34.0% \| 33.2% 33.4% 33.5% 33.5% 33.7% 34.4% 34.8% 35.3% b \| 156.220 \| 32.5% \| 63.3% 64.5% 66.1% 66.5% c \| 160.653 \| 33.5% \| 85.8% 90.6% 91.4% -------------------------------------------------------------------------------- Deviation from desired group fairness is as below: a = +0.67% b = -0.83% c = +0.17% which is far better IMO. Most of other runs have yielded a deviation within +-2% at the most, which is good. Why do we see bad (group) fairness with current scheuler? ========================================================= Currently cpu's weight is just the summation of individual task weights. This can yield incorrect results. For ex: consider three groups as below on a 2-cpu system: CPU0 CPU1 --------------------------- A (10) B(5) C(5) --------------------------- Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all of which are on CPU1. Each task has the same weight (NICE_0_LOAD = 1024). The current scheme would yield a cpu weight of 10240 (101024) for each cpu and the load balancer will think both CPUs are perfectly balanced and won't move around any tasks. This, however, would yield this bandwidth: A = 50% B = 25% C = 25% which is not the desired result. What's changing in the patch? ============================= - How cpu weights are calculated when CONFIF_FAIR_GROUP_SCHED is defined (see below) - API Change - Two tunables introduced in sysfs (under SCHED_DEBUG) to control the frequency at which the load balance monitor thread runs. The basic change made in this patch is how cpu weight (rq->load.weight) is calculated. Its now calculated as the summation of group weights on a cpu, rather than summation of task weights. Weight exerted by a group on a cpu is dependent on the shares allocated to it and also the number of tasks the group has on that cpu compared to the total number of (runnable) tasks the group has in the system. Let, W(K,i) = Weight of group K on cpu i T(K,i) = Task load present in group K's cfs_rq on cpu i T(K) = Total task load of group K across various cpus S(K) = Shares allocated to group K NRCPUS = Number of online cpus in the scheduler domain to which group K is assigned. Then, W(K,i) = S(K) NRCPUS * T(K,i) / T(K) A load balance monitor thread is created at bootup, which periodically runs and adjusts group's weight on each cpu. To avoid its overhead, two min/max tunables are introduced (under SCHED_DEBUG) to control the rate at which it runs. Fixes from: Peter Zijlstra <a.p.zijlstra@chello.nl> - don't start the load_balance_monitor when there is only a single cpu. - rename the kthread because its currently longer than TASK_COMM_LEN Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:00 +01:00
Linus Torvalds	b47711bfbc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: make mls_compute_sid always polyinstantiate security/selinux: constify function pointer tables and fields security: add a secctx_to_secid() hook security: call security_file_permission from rw_verify_area security: remove security_sb_post_mountroot hook Security: remove security.h include from mm.h Security: remove security_file_mmap hook sparse-warnings (NULL as 0). Security: add get, set, and cloning of superblock security information security/selinux: Add missing "space"	2008-01-25 08:44:29 -08:00
Linus Torvalds	eba0e319c1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits) [CRYPTO] twofish: Merge common glue code [CRYPTO] hifn_795x: Fixup container_of() usage [CRYPTO] cast6: inline bloat-- [CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long [CRYPTO] tcrypt: Make xcbc available as a standalone test [CRYPTO] xcbc: Remove bogus hash/cipher test [CRYPTO] xcbc: Fix algorithm leak when block size check fails [CRYPTO] tcrypt: Zero axbuf in the right function [CRYPTO] padlock: Only reset the key once for each CBC and ECB operation [CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h [CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20 [CRYPTO] tcrypt: Add select of AEAD [CRYPTO] salsa20: Add x86-64 assembly version [CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version) [CRYPTO] gcm: Introduce rfc4106 [CRYPTO] api: Show async type [CRYPTO] chainiv: Avoid lock spinning where possible [CRYPTO] seqiv: Add select AEAD in Kconfig [CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy [CRYPTO] null: Allow setkey on digest_null ...	2008-01-25 08:38:25 -08:00
Greg Kroah-Hartman	79a6ee42fd	Kobject: fix coding style issues in kobject.h Finally clean up the odd spaces and other mess in kobject.h Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 21:27:06 -08:00
Greg Kroah-Hartman	d462943afe	Driver core: fix coding style issues in device.h Finally clean up the odd spaces and other mess in device.h Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 21:04:46 -08:00
Dave Young	fd04897bb2	Driver Core: add class iteration api Add the following class iteration functions for driver use: class_for_each_device class_find_device class_for_each_child class_find_child Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:44 -08:00
Stephen Rothwell	ae72cddb23	Driver Core: constify the name passed to platform_device_register_simple This name is just passed to platform_device_alloc which has its parameter declared const. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:43 -08:00
Kay Sievers	af5ca3f4ec	Driver core: change sysdev classes to use dynamic kobject names All kobjects require a dynamically allocated name now. We no longer need to keep track if the name is statically assigned, we can just unconditionally free() all kobject names on cleanup. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman	528a4bf1d5	Kobject: remove kobject_unregister() as no one uses it anymore There are no in-kernel users of kobject_unregister() so it should be removed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:40 -08:00
Kay Sievers	0f4dafc056	Kobject: auto-cleanup on final unref We save the current state in the object itself, so we can do proper cleanup when the last reference is dropped. If the initial reference is dropped, the object will be removed from sysfs if needed, if an "add" event was sent, "remove" will be send, and the allocated resources are released. This allows us to clean up some driver core usage as well as allowing us to do other such changes to the rest of the kernel. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	12e339ac6e	Kset: remove kset_add function No one is calling this anymore, so just remove it and hard-code the one internal-use of it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	6d06adfaf8	Kobject: remove kobject_register() The function is no longer used by anyone in the kernel, and it prevents the proper sending of the kobject uevent after the needed files are set up by the caller. kobject_init_and_add() can be used in its place. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman	f9cb074bff	Kobject: rename kobject_init_ng() to kobject_init() Now that the old kobject_init() function is gone, rename kobject_init_ng() to kobject_init() to clean up the namespace. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	e1543ddf73	Kobject: remove kobject_init() as no one uses it anymore The old kobject_init() function is on longer in use, so let us remove it from the public scope (kset mess in the kobject.c file still uses it, but that can be cleaned up later very simply.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	b2d6db5878	Kobject: rename kobject_add_ng() to kobject_add() Now that the old kobject_add() function is gone, rename kobject_add_ng() to kobject_add() to clean up the namespace. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman	9e7bbccd02	Kobject: remove kobject_add() as no one uses it anymore The old kobject_add() function is on longer in use, so let us remove it from the public scope (kset mess in the kobject.c file still uses it, but that can be cleaned up later very simply.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:38 -08:00
Kay Sievers	edfaa7c365	Driver core: convert block from raw kobjects to core devices This moves the block devices to /sys/class/block. It will create a flat list of all block devices, with the disks and partitions in one directory. For compatibility /sys/block is created and contains symlinks to the disks. /sys/class/block \|-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda \|-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1 \|-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10 \|-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5 \|-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6 \|-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7 \|-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8 \|-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9 `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 /sys/block/ \|-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0 Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:36 -08:00
Greg Kroah-Hartman	e5dd127846	Driver core: move the static kobject out of struct driver This patch removes the kobject, and a few other driver-core-only fields out of struct driver and into the driver core only. Now drivers can be safely create on the stack or statically (like they currently are.) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman	c63469a398	Driver core: move the driver specific module code into the driver core The module driver specific code should belong in the driver core, not in the kernel/ directory. So move this code. This is done in preparation for some struct device_driver rework that should be confined to the driver core code only. This also lets us keep from exporting these functions, as no external code should ever be calling it. Thanks to Andrew Morton for the !CONFIG_MODULES fix. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman	cbe9c595f1	Driver: add driver_add_kobj for looney iseries_veth driver The iseries driver wants to hang kobjects off of its driver, so, to preserve backwards compatibility, we need to add a call to the driver core to allow future changes to work properly. Hopefully no one uses this function in the future and the iseries_veth driver authors come to their senses so I can remove this hack... Cc: Dave Larson <larson1@us.ibm.com> Cc: Santiago Leon <santil@us.ibm.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:34 -08:00
Cornelia Huck	57c745340a	driver core: Introduce default attribute groups. This is lot like default attributes for devices (and indeed, a lot of the code is lifted from there). Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:34 -08:00
Greg Kroah-Hartman	c6f7e72a3f	driver core: remove fields from struct bus_type struct bus_type is static everywhere in the kernel. This moves the kobject in the structure out of it, and a bunch of other private only to the driver core fields are now moved to a private structure. This lets us dynamically create the backing kobject properly and gives us the chance to be able to document to users exactly how to use the struct bus_type as there are no fields they can improperly access. Thanks to Kay for the build fixes on this patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	b249072ee6	driver core: add way to get to bus device klist This allows an easier way to get to the device klist associated with a struct bus_type (you have three to choose from...) This will make it easier to move these fields to be dynamic in a future patch. The only user of this is the PCI core which horribly abuses this interface to rearrange the order of the pci devices. This should be done using the existing bus device walking functions, but that's left for future patches. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	0fed80f7a6	driver core: add way to get to bus kset This allows an easier way to get to the kset associated with a struct bus_type (you have three to choose from...) This will make it easier to move these fields to be dynamic in a future patch. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman	cc972e896b	driver core: remove owner field from struct bus_type This isn't used by anything in the driver core, and by no one in the 204 different usages of it in the kernel tree. Remove this field so no one gets any idea that it is needed to be used. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:31 -08:00
Greg Kroah-Hartman	81e7c6a636	UIO: fix kobject usage The uio kobject code is "wierd". This patch should hopefully fix it up to be sane and not leak memory anymore. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benedikt Spranger <b.spranger@linutronix.de> Signed-off-by: Hans J. Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:26 -08:00
Greg Kroah-Hartman	d76e15fb20	driver core: make /sys/power a kobject /sys/power should not be a kset, that's overkill. This patch renames it to power_kset and fixes up all usages of it in the tree. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:25 -08:00
Greg Kroah-Hartman	2fb9113b97	kobject: remove subsystem_(un)register functions These functions are no longer used and are the last remants of the old subsystem crap. So delete them for good. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	0ff21e4663	kobject: convert kernel_kset to be a kobject kernel_kset does not need to be a kset, but a much simpler kobject now that we have kobj_attributes. We also rename kernel_kset to kernel_kobj to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	5c03c7ab88	kset: remove decl_subsys macro This macro is no longer used. ksets should be created dynamically with a call to kset_create_and_add() not declared statically. Yes, there are 5 remaining static struct kset usages in the kernel tree, but they will be fixed up soon. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman	f62ed9e33b	firmware: change firmware_kset to firmware_kobj There is no firmware "subsystem" it's just a directory in /sys that other portions of the kernel want to hook into. So make it a kobject not a kset to help alivate anyone who tries to do some odd kset-like things with this. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:23 -08:00
Greg Kroah-Hartman	15f2f9b3a9	firmware: remove firmware_(un)register() These functions are no longer called or needed, so we can remove them. As I rewrote the whole firmware.c file, add my copyright. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:23 -08:00
Kay Sievers	000f2a4d8c	Driver Core: kill subsys_attribute and default sysfs ops Remove the no longer needed subsys_attributes, they are all converted to the more sensical kobj_attributes. There is no longer a magic fallback in sysfs attribute operations, all kobjects which create simple attributes need explicitely a ktype assigned, which tells the core what was intended here. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:22 -08:00
Greg Kroah-Hartman	9e5f7f9abe	firmware: export firmware_kset so that people can use that instead of the braindead firmware_register interface Needed for future firmware subsystem cleanups. In the end, the firmware_register/unregister functions will be deleted entirely, but we need this symbol so that subsystems can migrate over. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Matt Domsch <Matt_Domsch@dell.com> Cc: Matt Tolentino <matthew.e.tolentino@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:19 -08:00
Kay Sievers	eb41d9465c	fix struct user_info export's sysfs interaction Clean up the use of ksets and kobjects. Kobjects are instances of objects (like struct user_info), ksets are collections of objects of a similar type (like the uids directory containing the user_info directories). So, use kobjects for the user_info directories, and a kset for the "uids" directory. On object cleanup, the final kobject_put() was missing. Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:18 -08:00
Kay Sievers	23b5212cc7	Driver Core: add kobj_attribute handling Add kobj_sysfs_ops to replace subsys_sysfs_ops. There is no need for special kset operations, we want to be able to use simple attribute operations at any kobject, not only ksets. The whole concept of any default sysfs attribute operations will go away with the upcoming removal of subsys_sysfs_ops. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:18 -08:00
Greg Kroah-Hartman	6dcec2511f	kset: convert struct bus_device->drivers to use kset_create Dynamically create the kset instead of declaring it statically. Having 3 static kobjects in one structure is not only foolish, but ripe for nasty race conditions if handled improperly. We also rename the field to catch any potential users of it (not that there should be outside of the driver core...) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	3d8995963d	kset: convert struct bus_device->devices to use kset_create Dynamically create the kset instead of declaring it statically. Having 3 static kobjects in one structure is not only foolish, but ripe for nasty race conditions if handled improperly. We also rename the field to catch any potential users of it (not that there should be outside of the driver core...) Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	039a5dcd2f	kset: convert /sys/power to use kset_create Dynamically create the kset instead of declaring it statically. We also rename power_subsys to power_kset to catch all users of the variable and we properly export it so that people don't have to guess that it really is present in the system. The pseries code is wierd, why is it createing /sys/power if CONFIG_PM is disabled? Oh well, stupid big boxes ignoring config options... Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	7405c1e15e	kset: convert /sys/module to use kset_create Dynamically create the kset instead of declaring it statically. We also rename module_subsys to module_kset to catch all users of the variable. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman	2d72fc00a1	kobject: convert /sys/hypervisor to use kobject_create We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. We also rename hypervisor_subsys to hypervisor_kset to catch all users of the variable. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:15 -08:00
Greg Kroah-Hartman	bd35b93d80	kset: convert kernel_subsys to use kset_create Dynamically create the kset instead of declaring it statically. We also rename kernel_subsys to kernel_kset to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	e5e38a86c0	kset: remove decl_subsys_name The last user of this macro (pci hotplug core) is now switched over to using a dynamic kset, so this macro is no longer needed at all. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	81ace5cd8f	kset: convert pci hotplug to use kset_create_and_add This also renames pci_hotplug_slots_subsys to pcis_hotplug_slots_kset catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman	00d2666623	kobject: convert main fs kobject to use kobject_create This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman	43968d2f16	kobject: get rid of kobject_kset_add_dir kobject_kset_add_dir is only called in one place so remove it and use kobject_create() instead. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	4ff6abff83	kobject: get rid of kobject_add_dir kobject_create_and_add is the same as kobject_add_dir, so drop kobject_add_dir. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman	3f9e3ee9dc	kobject: add kobject_create_and_add function This lets users create dynamic kobjects much easier. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	b727c70289	kset: add kset_create_and_add function Now ksets can be dynamically created on the fly, no static definitions are required. Thanks to Miklos for hints on how to make this work better for the callers. And thanks to Kay for finding some stupid bugs in my original version and pointing out that we need to handle the fact that kobject's can have a kset as a parent and to handle that properly in kobject_add(). Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	12d03da7c1	kobject: remove kobj_set_kset_s as no one is using it anymore What a confusing name for a macro... Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	3514faca19	kobject: remove struct kobj_type from struct kset We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	c11c4154e7	kobject: add kobject_init_and_add function Also add a kobject_init_and_add function which bundles up what a lot of the current callers want to do all at once, and it properly handles the memory usages, unlike kobject_register(); Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman	244f6cee9a	kobject: add kobject_add_ng function This is what the kobject_add function is going to become. Add this to the kernel and then we can convert the tree over to use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman	e86000d042	kobject: add kobject_init_ng function This is what the kobject_init function is going to become. Add this to the kernel and then we can convert the tree over to use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman	18041f4775	kobject: make kobject_cleanup be static No one except the kobject core calls it so make the function static. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:09 -08:00
Emil Medve	7b8712e563	driver core: Make the dev_*() family of macros in device.h complete Removed duplicates defined elsewhere Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:08 -08:00
Tony Jones	7dd817d083	tifm: Convert from class_device to device for TI flash media Signed-off-by: Tony Jones <tonyj@suse.de> Cc: Alex Dubov <oakad@yahoo.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:06 -08:00
Tony Jones	6013c12be8	pktcdvd: Convert from class_device to device for block/pktcdvd struct class_device is going away, this converts the code to use struct device instead. Signed-off-by: Tony Jones <tonyj@suse.de> Cc: Peter Osterlund <petero2@telia.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:06 -08:00
Tony Jones	891f78ea83	DMA: Convert from class_device to device for DMA engine Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: Shannon Nelson <shannon.nelson@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:05 -08:00
Evgeniy Polyakov	41ca28ab2a	kref: add kref_set() This adds kref_set() to the kref api for future use by people who really know what they are doing with krefs... From: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:05 -08:00
Rafael J. Wysocki	775b64d2b6	PM: Acquire device locks on suspend This patch reorganizes the way suspend and resume notifications are sent to drivers. The major changes are that now the PM core acquires every device semaphore before calling the methods, and calls to device_add() during suspends will fail, while calls to device_del() during suspends will block. It also provides a way to safely remove a suspended device with the help of the PM core, by using the device_pm_schedule_removal() callback introduced specifically for this purpose, and updates two drivers (msr and cpuid) that need to use it. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2008-01-24 20:40:04 -08:00
Jan Engelhardt	1996a10948	security/selinux: constify function pointer tables and fields Constify function pointer tables and fields. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:54 +11:00
David Howells	63cb344923	security: add a secctx_to_secid() hook Add a secctx_to_secid() LSM hook to go along with the existing secid_to_secctx() LSM hook. This patch also includes the SELinux implementation for this hook. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:53 +11:00
H. Peter Anvin	bced95283e	security: remove security_sb_post_mountroot hook The security_sb_post_mountroot() hook is long-since obsolete, and is fundamentally broken: it is never invoked if someone uses initramfs. This is particularly damaging, because the existence of this hook has been used as motivation for not using initramfs. Stephen Smalley confirmed on 2007-07-19 that this hook was originally used by SELinux but can now be safely removed: http://marc.info/?l=linux-kernel&m=118485683612916&w=2 Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Eric Paris <eparis@parisplace.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:50 +11:00
James Morris	42d7896ebc	Security: remove security.h include from mm.h Remove security.h include from mm.h, as it is only needed for a single extern declaration, and pulls in all kinds of crud. Fine-by-me: David Chinner <dgc@sgi.com> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:49 +11:00
Eric Paris	c9180a57a9	Security: add get, set, and cloning of superblock security information Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and security_clont_sb_mnt_opts to the LSM and to SELinux. This will allow filesystems to directly own and control all of their mount options if they so choose. This interface deals only with option identifiers and strings so it should generic enough for any LSM which may come in the future. Filesystems which pass text mount data around in the kernel (almost all of them) need not currently make use of this interface when dealing with SELinux since it will still parse those strings as it always has. I assume future LSM's would do the same. NFS is the primary FS which does not use text mount data and thus must make use of this interface. An LSM would need to implement these functions only if they had mount time options, such as selinux has context= or fscontext=. If the LSM has no mount time options they could simply not implement and let the dummy ops take care of things. An LSM other than SELinux would need to define new option numbers in security.h and any FS which decides to own there own security options would need to be patched to use this new interface for every possible LSM. This is because it was stated to me very clearly that LSM's should not attempt to understand FS mount data and the burdon to understand security should be in the FS which owns the options. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>	2008-01-25 11:29:46 +11:00
Len Brown	d4b7dc499d	ACPI: make _OSI(Linux) console messages smarter If BIOS invokes _OSI(Linux), the kernel response depends on what the ACPI DMI list knows about the system, and that is reflectd in dmesg: 1) System unknown to DMI: ACPI: BIOS _OSI(Linux) query ignored ACPI: DMI System Vendor: LENOVO ACPI: DMI Product Name: 7661W1P ACPI: DMI Product Version: ThinkPad T61 ACPI: DMI Board Name: 7661W1P ACPI: DMI BIOS Vendor: LENOVO ACPI: DMI BIOS Date: 10/18/2007 ACPI: Please send DMI info above to linux-acpi@vger.kernel.org ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org 2) System known to DMI, but effect of OSI(Linux) unknown: ACPI: DMI detected: Lenovo ThinkPad T61 ... ACPI: BIOS _OSI(Linux) query ignored via DMI ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org 3) System known to DMI, which disables _OSI(Linux): ACPI: DMI detected: Lenovo ThinkPad T61 ... ACPI: BIOS _OSI(Linux) query ignored via DMI 4) System known to DMI, which enable _OSI(Linux): ACPI: DMI detected: Lenovo ThinkPad T61 ACPI: Added _OSI(Linux) ... ACPI: BIOS _OSI(Linux) query honored via DMI cmdline overrides take precidence over the built-in default and the DMI prescribed default. cmdline "acpi_osi=Linux" results in: ACPI: BIOS _OSI(Linux) query honored via cmdline Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:26:15 -05:00
Len Brown	f89e3b0620	DMI: create dmi_get_slot() This simply allows other sub-systems (such as ACPI) to access and print out slots in static dmi_ident[]. Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:23:13 -05:00
Len Brown	81b4e1f626	DMI: move dmi_available declaration to linux/dmi.h Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-23 21:22:21 -05:00
James Bottomley	d4acd722b7	[SCSI] sysfs: add filter function to groups This patch allows the various users of attribute_groups to selectively allow the appearance of group attributes. The primary consumer of this will be the transport classes in which we currently have elaborate attribute selection algorithms to do this same thing. Acked-by: Greg KH <greg@kroah.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-23 11:29:18 -06:00
James Bottomley	fd1109711d	[SCSI] attribute_container: update to use the group interface This patch is the beginning of moving the attribute_containers to use attribute groups exclusively. The attr element is now deprecated and will eventually be removed (along with all the hand rolled code for doing exactly what attribute groups do) when all the consumers are converted to attribute groups. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-23 11:29:17 -06:00
Alan Cox	5e8f757cb2	ata_generic: Cenatek support Not much to do here. It's an ata memory as disk. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:17 -05:00
Tejun Heo	4e6b79fa61	libata: factor out ata_pci_activate_sff_host() from ata_pci_one() Factor out ata_pci_activate_sff_host() from ata_pci_one(). This does about the same thing as ata_host_activate() but needs to be separate because SFF controllers use different and multiple IRQs in legacy mode. This will be used to make SFF LLD initialization more flexible. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:16 -05:00
Al Viro	4ca4e43964	libata annotations and fixes Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:15 -05:00
Jeff Garzik	442eacc362	libata: make ata_port_queue_task() an internal function ata_port_queue_task() served a single user: ata_pio_task() Rename to ata_pio_queue_task() and un-export it, as nobody outside of libata-core.c uses it. Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-01-23 05:24:15 -05:00
Tejun Heo	0bcc65ad78	libata: make qc->nbytes include extra buffers qc->nbytes didn't use to include extra buffers setup by libata core layer and my be odd. This patch makes qc->nbytes include any extra buffers setup by libata core layer and guaranteed to be aligned on 4 byte boundary. This value is to be used to program the host controller. As this represents the actual length of buffer available to the controller and the controller must be able to deal with short transfers for ATAPI commands which can transfer variable length, this shouldn't break any controllers while making problems like rounding-down and controllers choking up on odd transfer bytes much less likely. The unmodified value is stored in new field qc->raw_nbytes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:15 -05:00
Tejun Heo	ff2aeb1eb6	libata: convert to chained sg libata used private sg iterator to handle padding sg. Now that sg can be chained, padding can be handled using standard sg ops. Convert to chained sg. * s/qc->__sg/qc->sg/ * s/qc->pad_sgent/qc->extra_sg[]/. Because chaining consumes one sg entry. There need to be two extra sg entries. The renaming is also for future addition of other extra sg entries. * Padding setup is moved into ata_sg_setup_extra() which is organized in a way that future addition of other extra sg entries is easy. * qc->orig_n_elem is unused and removed. * qc->n_elem now contains the number of sg entries that LLDs should map. qc->mapped_n_elem is added to carry the original number of mapped sgs for unmapping. * The last sg of the original sg list is used to chain to extra sg list. The original last sg is pointed to by qc->last_sg and the content is stored in qc->saved_last_sg. It's restored during ata_sg_clean(). * All sg walking code has been updated. Unnecessary assertions and checks for conditions the core layer already guarantees are removed. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	001102d785	libata: kill non-sg DMA interface With atapi_request_sense() converted to use sg, there's no user of non-sg interface. Kill non-sg interface. * ATA_QCFLAG_SINGLE and ATA_QCFLAG_SG are removed. ATA_QCFLAG_DMAMAP is used instead. (this way no LLD change is necessary) * qc->buf_virt is removed. * ata_sg_init_one() and ata_sg_setup_one() are removed. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Rusty Russel <rusty@rustcorp.com.au> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	55dba3120f	libata: update ->data_xfer hook for ATAPI Depending on how many bytes are transferred as a unit, PIO data transfer may consume more bytes than requested. Knowing how much data is consumed is necessary to determine how much is left for draining. This patch update ->data_xfer such that it returns the number of consumed bytes. While at it, it also makes the following changes. * s/adev/dev/ * use READ/WRITE constants for rw indication * misc clean ups Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	ceb0c64262	libata: add ATAPI_* cmd types and implement atapi_cmd_type() Add ATAPI command types - ATAPI_READ, WRITE, RW_BUF, READ_CD and MISC, and implement atapi_cmd_type() which takes SCSI opcode and returns to which class the opcode belongs. This will be used later to improve ATAPI handling. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	0dc36888d4	libata: rename ATA_PROT_ATAPI_* to ATAPI_PROT_* ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and ATA_PROT_ATAPI_* are inconsistent causing confusion. Rename them to ATAPI_PROT_* and make them consistent with ATA counterpart. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2008-01-23 05:24:14 -05:00
Tejun Heo	537b53c169	cdrom: add more GPCMD_* constants Add GPCMD_* constants for READ_BUFFER, WRITE_12 and WRITE_BUFFER for completeness. These will be used libata. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:14 -05:00
Tejun Heo	021ee9a6da	libata: reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() Reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() and while at it relocate the function below ata_acpi_gtm_xfermask(). New ata_acpi_cbl_80wire() implementation takes @gtm, in both pata_via and pata_amd, use the initial GTM value. Both are trying to peek initial BIOS configuration, so using initial caching value makes sense. This fixes ACPI part of cable detection in pata_amd which previously always returned 0 because configuring PIO0 during reset clears DMA configuration. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	a0f79b929a	libata: implement ata_timing_cycle2mode() and use it in libata-acpi and pata_acpi libata-acpi is using separate timing tables for transfer modes although libata-core has the complete ata_timing table. Implement ata_timing_cycle2mode() to look for matching mode given transfer type and cycle duration and use it in libata-acpi and pata_acpi to replace private timing tables. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	7c77fa4d51	libata: separate out ata_acpi_gtm_xfermask() from pacpi_discover_modes() Finding out matching transfer mode from ACPI GTM values is useful for other purposes too. Separate out the function and timing tables from pata_acpi::pacpi_discover_modes(). Other than checking shared-configuration bit after doing ata_acpi_gtm() in pacpi_discover_modes() which should be safe, this patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	c88f90c377	libata: add ATA_CBL_PATA_IGN ATA_CBL_PATA_UNK indicates that the cable type can't be determined from the host side and might be either 80c or 40c. libata applies drive or other generic limit in this case. However, there are controllers where both host and drive side detections are misimplemented and the driver has to rely solely on private method - peeking BIOS or ACPI configuration or using some other private mechanism. This patch adds ATA_CBL_PATA_IGN which tells libata to ignore the cable type completely and just let the LLD determine the transfer mode via host transfer mode masks and ->mode_filter(). Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	7dc951aefd	libata: xfer_mask is unsigned long not unsigned int Jeff says xfer_mask is unsigned long not unsigned int. Convert all xfermask fields and handling functions to deal with unsigned longs. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:12 -05:00
Tejun Heo	9d3501ab96	libata: kill ata_id_to_dma_mode() ata_id_to_dma_mode() isn't quite generic. The function is basically privately implemented ata_id_xfermask() combined with hardcoded mode printing and configuration which are specific to ata_generic. Kill the function and open code it in generic_set_mode() using generic xfermode handling functions. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	70cd071e4e	libata: clean up xfermode / PATA timing related stuff * s/ATA_BITS_(PIO\|MWDMA\|UDMA)/ATA_NR_\1_MODES/g * Consistently use 0xff to indicate invalid transfer mode (0x00 is valid for PIO_SLOW). * Make ata_xfer_mode2mask() return proper mode mask instead of just the highest bit. * Sort ata_timing table in increasing xfermode order and update ata_timing_find_mode() accordingly. This patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	6357357cae	libata: export xfermode / PATA timing related functions Export the following xfermode related functions. * ata_pack_xfermask() * ata_unpack_xfermask() * ata_xfer_mask2mode() * ata_xfer_mode2mask() * ata_xfer_mode2shift() * ata_mode_string() * ata_id_xfermask() * ata_timing_find_mode() These functions will be used later by LLD updates. While at it, change unsigned short @speed to u8 @xfer_mode in ata_timing_find_mode() for consistency. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	00115e0f5b	libata: implement ATA_DFLAG_DUBIOUS_XFER ATA_DFLAG_DUBIOUS_XFER is set whenever data transfer speed or method changes and gets cleared when data transfer command succeeds in the newly configured transfer mode. This will be used to improve speed down logic. Signed-off-by: Tejun Heo <htejun@gmail.com< Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:11 -05:00
Tejun Heo	3884f7b0a8	libata: clean up EH speed down implementation Clean up EH speed down implementation. * is_io boolean variable is replaced eflags. is_io is ATA_EFLAG_IS_IO. * Error categories now have names. * Better comments. * Reorder 5min and 10min rules in ata_eh_speed_down_verdict() * Use local variable @link to cache @dev->link in ata_eh_speed_down() These changes are to improve readability and ease further changes. This patch doesn't introduce any behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Tejun Heo	405e66b387	libata: implement protocol tests Implement protocol tests - ata_is_atapi(), ata_is_nodata(), ata_is_pio(), ata_is_dma(), ata_is_ncq() and ata_is_data() and use them to replace is_atapi_taskfile() and hard coded protocol tests. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Tejun Heo	f20ded38aa	libata: rearrange ATA_DFLAG_* Area for DFLAGs which are cleared on INIT is full. Extend it by 8 bits. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-23 05:24:10 -05:00
Alan Cox	ae8d4ee7ff	libata: Disable ATA8-ACS proposed Trusted Computing features by default Historically word 48 in the identify data was used to mean 32bit I/O was supported for VLB IDE etc. ATA8 reassigns this word to the Trusted Computing Group, where it is used for TCG features. This means that an ATA8 TCG drive is going to trigger 32bit I/O on some systems which will be funny. Anyway we need to sort this out ready for ATA8 so: - Reorder the ata.h header a bit so the ata_version function occurs early in it - Make dword_io check the ATA version - Add an ATA8 version checking TCG presence test While we are at it the current drafts have a flaw where it may not be possible to disable TCG features at boot (and opt out of the trusted model) as TCG intends because it relies on presence of a different optional feature (DCS). Handle this in software by refusing the TCG commands if libata.allow_tpm is not set. (We must make it possible as some environments such as proprietary VDR devices will doubtless want to use it to lock up content) Finally as with CPRM print a warning so that the user knows they may not be able to full access and use the device. Signed-off-by: Alan Cox <alan@redhat.com>	2008-01-23 05:24:09 -05:00
Johannes Berg	eb13ba8738	lockdep: fix workqueue creation API lockdep interaction Dave Young reported warnings from lockdep that the workqueue API can sometimes try to register lockdep classes with the same key but different names. This is not permitted in lockdep. Unfortunately, I was unaware of that restriction when I wrote the code to debug workqueue problems with lockdep and used the workqueue name as the lockdep class name. This can obviously lead to the problem if the workqueue name is dynamic. This patch solves the problem by always using a constant name for the workqueue's lockdep class, namely either the constant name that was passed in or a string consisting of the variable name. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2008-01-16 09:51:58 +01:00
Alan Cox	121a09e590	libata: correct handling of TSS DVD Devices that misreport the validity bit for word 93 look like SATA. If they are on the blacklist then we must not test for SATA but assume 40 wire in the 40 wire case (The TSSCorp reports 80 wire on SATA it seems!) Signed-off-by: Alan Cox <alan@redhat.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2008-01-15 16:35:21 -05:00
Linus Torvalds	c23f72cae9	Revert "writeback: introduce writeback_control.more_io to indicate more io" This reverts commit `2e6883bdf4`, as requested by Fengguang Wu. It's not quite fully baked yet, and while there are patches around to fix the problems it caused, they should get more testing. Says Fengguang: "I'll resend them both for -mm later on, in a more complete patchset". See http://bugzilla.kernel.org/show_bug.cgi?id=9738 for some of this discussion. Requested-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-14 21:21:29 -08:00
Jean Delvare	f9dd0194ff	i2c: Driver IDs are optional Document the fact that I2C driver IDs are optional. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2008-01-14 21:53:31 +01:00
Linus Torvalds	417009f64f	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: pnpacpi: print resource shortage message only once PM: ACPI and APM must not be enabled at the same time ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9 ACPICA: fix acpi_serialize hang regression ACPI : Not register gsi for PCI IDE controller in legacy mode ACPI: Reintroduce run time configurable max_cstate for !CPU_IDLE case ACPI: Make sysfs interface in ACPI power optional. ACPI: EC: Enable boot EC before bus_scan increase PNP_MAX_PORT to 40 from 24	2008-01-13 09:58:22 -08:00
Roland McGrath	84427eaef1	remove task_ppid_nr_ns task_ppid_nr_ns is called in three places. One of these should never have called it. In the other two, using it broke the existing semantics. This was presumably accidental. If the function had not been there, it would have been much more obvious to the eye that those patches were changing the behavior. We don't need this function. In task_state, the pid of the ptracer is not the ppid of the ptracer. In do_task_stat, ppid is the tgid of the real_parent, not its pid. I also moved the call outside of lock_task_sighand, since it doesn't need it. In sys_getppid, ppid is the tgid of the real_parent, not its pid. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-13 09:56:43 -08:00
James Bottomley	11c3e689f1	[SCSI] block: Introduce new blk_queue_update_dma_alignment interface The purpose of this is to allow stacked alignment settings, with the ultimate queue alignment being set to the largest alignment requirement in the stack. The reason for this is so that the SCSI mid-layer can relax the default alignment requirements (which are basically causing a lot of superfluous copying to go on in the SG_IO interface) while allowing transports, devices or HBAs to add stricter limits if they need them. Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-01-11 18:29:20 -06:00
Len Brown	6b74c92521	Pull bugzilla-9535 into release branch	2008-01-11 12:27:50 -05:00
Len Brown	02d5bccf8e	Pull bugzilla-9194 into release branch	2008-01-11 12:27:13 -05:00
Len Brown	9f9adecd2d	PM: ACPI and APM must not be enabled at the same time ACPI and APM used "pm_active" to guarantee that they would not be simultaneously active. But pm_active was recently moved under CONFIG_PM_LEGACY, so that without CONFIG_PM_LEGACY, pm_active became a NOP -- allowing ACPI and APM to both be simultaneously enabled. This caused unpredictable results, including boot hangs. Further, the code under CONFIG_PM_LEGACY is scheduled for removal. So replace pm_active with pm_flags. pm_flags depends only on CONFIG_PM, which is present for both CONFIG_APM and CONFIG_ACPI. http://bugzilla.kernel.org/show_bug.cgi?id=9194 Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2008-01-11 12:26:47 -05:00
Rusty Russell	b801a1e7db	Don't blatt first element of prv in sg_chain() I realize that sg chaining is a ploy to make the rest of the kernel devs feel the pain of the SCSI subsystem. But this was a little unsubtle. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-01-11 10:12:55 +01:00
Zhao Yakui	d1ec7298fc	ACPI: apply quirk_ich6_lpc_acpi to more ICH8 and ICH9 It is important that these resources be reserved to avoid conflicts with well known ACPI registers. Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2008-01-11 00:24:55 -05:00
Herbert Xu	6eb7228421	[CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long Thanks to David Miller for pointing out that the SLAB (or SLOB/SLUB) cache uses the alignment of unsigned long long if the architecture kmalloc/slab alignment macros are not defined. This patch changes the CRYPTO_MINALIGN so that it uses the same default value. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:17:01 +11:00
Herbert Xu	d29ce988ae	[CRYPTO] aead: Create default givcipher instances This patch makes crypto_alloc_aead always return algorithms that is capable of generating their own IVs through givencrypt and givdecrypt. All existing AEAD algorithms already do. New ones must either supply their own or specify a generic IV generator with the geniv field. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:52 +11:00
Herbert Xu	5b6d2d7fdf	[CRYPTO] aead: Add aead_geniv_alloc/aead_geniv_free This patch creates the infrastructure to help the construction of IV generator templates that wrap around AEAD algorithms by adding an IV generator to them. This is useful for AEAD algorithms with no built-in IV generator or to replace their built-in generator. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:51 +11:00
Herbert Xu	743edf5727	[CRYPTO] aead: Add givcrypt operations This patch adds the underlying givcrypt operations for aead and associated support elements. The rationale is identical to that of the skcipher givcrypt operations, i.e., sometimes only the algorithm knows how the IV should be generated. A new request type aead_givcrypt_request is added which contains an embedded aead_request structure with two new elements to support this operation. The new elements are seq and giv. The seq field should contain a strictly increasing 64-bit integer which may be used by certain IV generators as an input value. The giv field will be used to store the generated IV. It does not need to obey the alignment requirements of the algorithm because it's not used during the operation. The existing iv field must still be available as it will be used to store intermediate IVs and the output IV if chaining is desired. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:49 +11:00
Herbert Xu	b9c55aa475	[CRYPTO] skcipher: Create default givcipher instances This patch makes crypto_alloc_ablkcipher/crypto_grab_skcipher always return algorithms that are capable of generating their own IVs through givencrypt and givdecrypt. Each algorithm may specify its default IV generator through the geniv field. For algorithms that do not set the geniv field, the blkcipher layer will pick a default. Currently it's chainiv for synchronous algorithms and eseqiv for asynchronous algorithms. Note that if these wrappers do not work on an algorithm then that algorithm must specify its own geniv or it can't be used at all. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:46 +11:00
Herbert Xu	ecfc43292f	[CRYPTO] skcipher: Add skcipher_geniv_alloc/skcipher_geniv_free This patch creates the infrastructure to help the construction of givcipher templates that wrap around existing blkcipher/ablkcipher algorithms by adding an IV generator to them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:44 +11:00
Herbert Xu	23508e11ab	[CRYPTO] skcipher: Added geniv field This patch introduces the geniv field which indicates the default IV generator for each algorithm. It should point to a string that is not freed as long as the algorithm is registered. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:43 +11:00
Herbert Xu	61da88e2b8	[CRYPTO] skcipher: Add givcrypt operations and givcipher type Different block cipher modes have different requirements for intialisation vectors. For example, CBC can use a simple randomly generated IV while modes such as CTR must use an IV generation mechanisms that give a stronger guarantee on the lack of collisions. Furthermore, disk encryption modes have their own IV generation algorithms. Up until now IV generation has been left to the users of the symmetric key cipher API. This is inconvenient as the number of block cipher modes increase because the user needs to be aware of which mode is supposed to be paired with which IV generation algorithm. Therefore it makes sense to integrate the IV generation into the crypto API. This patch takes the first step in that direction by creating two new ablkcipher operations, givencrypt and givdecrypt that generates an IV before performing the actual encryption or decryption. The operations are currently not exposed to the user. That will be done once the underlying functionality has actually been implemented. It also creates the underlying givcipher type. Algorithms that directly generate IVs would use it instead of ablkcipher. All other algorithms (including all existing ones) would generate a givcipher algorithm upon registration. This givcipher algorithm will be constructed from the geniv string that's stored in every algorithm. That string will locate a template which is instantiated by the blkcipher/ablkcipher algorithm in question to give a givcipher algorithm. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:43 +11:00
Herbert Xu	378f4f51f9	[CRYPTO] skcipher: Add crypto_grab_skcipher interface Note: From now on the collective of ablkcipher/blkcipher/givcipher will be known as skcipher, i.e., symmetric key cipher. The name blkcipher has always been much of a misnomer since it supports stream ciphers too. This patch adds the function crypto_grab_skcipher as a new way of getting an ablkcipher spawn. The problem is that previously we did this in two steps, first getting the algorithm and then calling crypto_init_spawn. This meant that each spawn user had to be aware of what type and mask to use for these two steps. This is difficult and also presents a problem when the type/mask changes as they're about to be for IV generators. The new interface does both steps together just like crypto_alloc_ablkcipher. As a side-effect this also allows us to be stronger on type enforcement for spawns. For now this is only done for ablkcipher but it's trivial to extend for other types. This patch also moves the type/mask logic for skcipher into the helpers crypto_skcipher_type and crypto_skcipher_mask. Finally this patch introduces the function crypto_require_sync to determine whether the user is specifically requesting a sync algorithm. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:42 +11:00
Herbert Xu	551a09a7a9	[CRYPTO] api: Sanitise mask when allocating ablkcipher/hash When allocating ablkcipher/hash objects, we use a mask that's wider than the usual type mask. This patch sanitises the mask supplied by the user so we don't end up using a narrower mask which may lead to unintended results. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:39 +11:00
Herbert Xu	7ba683a6de	[CRYPTO] aead: Make authsize a run-time parameter As it is authsize is an algorithm paramter which cannot be changed at run-time. This is inconvenient because hardware that implements such algorithms would have to register each authsize that they support separately. Since authsize is a property common to all AEAD algorithms, we can add a function setauthsize that sets it at run-time, just like setkey. This patch does exactly that and also changes authenc so that authsize is no longer a parameter of its template. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:29 +11:00
Patrick McHardy	984e976f53	[HWRNG]: move status polling loop to data_present callbacks Handle waiting for new random within the drivers themselves, this allows to use better suited timeouts for the individual rngs. Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:16 +11:00
Herbert Xu	332f8840f7	[CRYPTO] ablkcipher: Add distinct ABLKCIPHER type Up until now we have ablkcipher algorithms have been identified as type BLKCIPHER with the ASYNC bit set. This is suboptimal because ablkcipher refers to two things. On the one hand it refers to the top-level ablkcipher interface with requests. On the other hand it refers to and algorithm type underneath. As it is you cannot request a synchronous block cipher algorithm with the ablkcipher interface on top. This is a problem because we want to be able to eventually phase out the blkcipher top-level interface. This patch fixes this by making ABLKCIPHER its own type, just as we have distinct types for HASH and DIGEST. The type it associated with the algorithm implementation only. Which top-level interface is used for synchronous block ciphers is then determined by the mask that's used. If it's a specific mask then the old blkcipher interface is given, otherwise we go with the new ablkcipher interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-01-11 08:16:15 +11:00
David S. Miller	a0a46196cd	[NET]: Add NAPI_STATE_DISABLE. Create a bit to signal that a napi_disable() is in progress. This sets up infrastructure such that net_rx_action() can generically break out of the ->poll() loop on a NAPI context that has a pending napi_disable() yet is being bombed with packets (and thus would otherwise poll endlessly and not allow the napi_disable() to finish). Now, what napi_disable() does is first set the NAPI_STATE_DISABLE bit (to indicate that a disable is pending), then it polls for the NAPI_STATE_SCHED bit, and once the NAPI_STATE_SCHED bit is acquired the NAPI_STATE_DISABLE bit is cleared. Here, the test_and_set_bit() provides the necessary memory barrier between the various bitops. napi_schedule_prep() now tests for a pending disable as it's first action and won't try to obtain the NAPI_STATE_SCHED bit if a disable is pending. As a result, we can remove the netif_running() check in netif_rx_schedule_prep() because the NAPI disable pending state serves this purpose. And, it does so in a NAPI centric manner which is what we really want. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-08 23:30:07 -08:00
David S. Miller	bdb95b1792	[NET]: Do not grab device reference when scheduling a NAPI poll. It is pointless, because everything that can make a device go away will do a napi_disable() first. The main impetus behind this is that now we can legally do a NAPI completion in generic code like net_rx_action() which a following changeset needs to do. net_rx_action() can only perform actions in NAPI centric ways, because there may be a one to many mapping between NAPI contexts and network devices (SKY2 is one example). We also want to get rid of this because it's an extra atomic in the NAPI paths, and also because it is one of the last instances where the NAPI interfaces care about net devices. The one remaining netdev detail the NAPI stuff cares about is the netif_running() check which will be killed off in a subsequent changeset. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-08 23:30:07 -08:00
Alan Cox	bf5e5834bf	pl2303: Fix mode switching regression Cleaning out all the incorrect 'no change made' checks for termios settings showed up a problem with the PL2303. The hardware here seems to lose sync and bits if you tell it to make no changes. This shows up with a real world application. To fix this the driver check for meaningful hardware changes is restored but doing the tests correctly and as a tty layer function so it doesn't get duplicated wrongly everywhere if other drivers turn out to need it. Signed-off-by: Alan Cox <alan@redhat.com> Tested-by: Mirko Parthey <mirko.parthey@informatik.tu-chemnitz.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-08 16:16:34 -08:00
Sebastian Siewior	5b7741b332	KEYS: fix macro Commit `664cceb009` changed the parameters of the function make_key_ref(). The macros that are used in case CONFIG_KEY is not defined did not change. Cc: David Howells <dhowells@redhat.com> Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-08 16:10:35 -08:00
Ingo Molnar	a263898f62	CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU make randconfig bootup testing found that the cpufreq code crashes on bootup, if the powernow-k8 driver is enabled and if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU kernel. First lockdep found out that there's an inconsistent unlock sequence: ===================================== [ BUG: bad unlock balance detected! ] ------------------------------------- swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 but there are no more locks to release! Call Trace: [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c [<ffffffff80252f3a>] mark_held_locks+0x56/0x94 [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42 [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4 ... then shortly afterwards the cpufreq code crashed on an assert: ------------[ cut here ]------------ kernel BUG at drivers/cpufreq/cpufreq.c:1068! invalid opcode: 0000 [1] SMP [...] Call Trace: [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91 [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2 [<ffffffff80cc0596>] powernowk8_init+0x86/0x94 [...] ---[ end trace 1e9219be2b4431de ]--- the bug was caused by maxcpus=1 bootup, which brought up the secondary core as !cpu_online() but !cpu_is_offline() either, which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h): /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */ static inline int cpu_is_offline(int cpu) { return 0; } but the cpufreq code uses cpu_online() and cpu_is_offline() in a mixed way - the low-level drivers use cpu_online(), while the cpufreq core uses cpu_is_offline(). This opened up the possibility to add the non-initialized sysdev device of the secondary core: cpufreq-core: trying to register driver powernow-k8 cpufreq-core: adding CPU 0 powernow-k8: BIOS error - no PSB or ACPI _PSS objects cpufreq-core: initialization failed cpufreq-core: adding CPU 1 cpufreq-core: initialization failed which then blew up. The fix is to make cpu_is_offline() always the negation of cpu_online(). With that fix applied the kernel boots up fine without crashing: Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00) powernow-k8: BIOS error - no PSB or ACPI _PSS objects initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19. initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94() Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39() We could fix this by making CPU enumeration aware of max_cpus, but that would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu) possibility was quite confusing and a continuous source of bugs too. Most distributions have kernels with CPU hotplug enabled, so this bug remained hidden for a long time. Bug forensics: The broken cpu_is_offline() API variant was introduced via: commit a59d2e4e6977e7b94e003c96a41f07e96cddc340 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Mon Mar 8 06:06:03 2004 -0800 [PATCH] minor cleanups for hotplug CPUs ( this predates linux-2.6.git, this commit is available from Thomas's historic git tree. ) Then 1.5 years later the cpufreq code made use of it: commit `c32b6b8e52` Author: Ashok Raj <ashok.raj@intel.com> Date: Sun Oct 30 14:59:54 2005 -0800 [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers + if (cpu_is_offline(cpu)) + return 0; which is a correct use of the subtly broken new API. v2.6.15 then shipped with this bug included. then it took two more years for random-kernel qa to hit it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-06 12:39:42 -08:00
Al Viro	831830b5a2	restrict reading from /proc/<pid>/maps to those who share ->mm or can ptrace pid Contents of /proc/*/maps is sensitive and may become sensitive after open() (e.g. if target originally shares our ->mm and later does exec on suid-root binary). Check at read() (actually, ->start() of iterator) time that mm_struct we'd grabbed and locked is - still the ->mm of target - equal to reader's ->mm or the target is ptracable by reader. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-02 13:13:27 -08:00
Linus Torvalds	158a962422	Unify /proc/slabinfo configuration Both SLUB and SLAB really did almost exactly the same thing for /proc/slabinfo setup, using duplicate code and per-allocator #ifdef's. This just creates a common CONFIG_SLABINFO that is enabled by both SLUB and SLAB, and shares all the setup code. Maybe SLOB will want this some day too. Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-02 13:04:48 -08:00
Pekka J Enberg	57ed3eda97	slub: provide /proc/slabinfo This adds a read-only /proc/slabinfo file on SLUB, that makes slabtop work. [ mingo@elte.hu: build fix. ] Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Lameter <clameter@sgi.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-01 11:32:02 -08:00
Len Brown	2c83819775	increase PNP_MAX_PORT to 40 from 24 `a7839e9606` (PNP: increase the maximum number of resources) increased PNP_MAX_PORT to 24 from 8. It also added a test and a complaint when a machine exceeded the limit, causing: pnpacpi: exceeded the max number of IO resources: 24 http://bugzilla.kernel.org/show_bug.cgi?id=9535 We should have been squawking about this all along, as this is a potentially serious issue. For now, simply burn some dynamic bytes and increase the limit by another 16 to 40. There is no guarantee that this will satisfy every system on Earth. It probably will not, but it should be an improvement. In the future, PNPACPI should allocate resource structures as needed, rather than max-sized arrays. Signed-off-by: Len Brown <len.brown@intel.com>	2007-12-27 23:55:13 -05:00
Stephen Hemminger	ecef969e5b	[VETH]: move veth.h to include/linux Move veth.h from net/ to linux/ since it is a user api, and add it to user header processing Kbuild. [ Use header-y as suggested by Sam Ravnborg. -DaveM ] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-26 19:36:35 -08:00
Stephen Hemminger	75ec533ec3	[NET] tc_nat: header install iproute2 build needs tc_nat.h header from kernel make install_headers. Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-26 19:36:35 -08:00
Christoph Lameter	ed367fc3a7	quicklists: do not release off node pages early quicklists must keep even off node pages on the quicklists until the TLB flush has been completed. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-23 12:54:36 -08:00
Neil Brown	91212507f9	dm: merge max_hw_sector Make sure dm honours max_hw_sectors of underlying devices We still have no firm testing evidence in support of this patch but believe it may help to resolve some bug reports. - agk Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>	2007-12-20 17:32:12 +00:00
Linus Torvalds	3e3b3916a9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" genirq: revert lazy irq disable for simple irqs x86: also define AT_VECTOR_SIZE_ARCH x86: kprobes bugfix x86: jprobe bugfix timer: kernel/timer.c section fixes genirq: add unlocked version of set_irq_handler() clockevents: fix reprogramming decision in oneshot broadcast oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters	2007-12-18 09:42:44 -08:00
Kevin Hilman	b019e57321	genirq: add unlocked version of set_irq_handler() Add unlocked version for use by irq_chip.set_type handlers which may wish to change handler to level or edge handler when IRQ type is changed. The normal set_irq_handler() call cannot be used because it tries to take irq_desc.lock which is already held when the irq_chip.set_type hook is called. Signed-off-by: Kevin Hilman <khilman@mvista.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-12-18 18:05:58 +01:00
Linus Torvalds	3c615e19a4	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Cleanup umem driver: fix most checkpatch warnings, conform to kernel block: let elv_register() return void as-iosched: fix write batch start point as-iosched: fix incorrect comments block: use jiffies conversion functions in scsi_ioctl.c	2007-12-18 08:04:24 -08:00
Linus Torvalds	d55653377d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: mmc: remove unused 'mode' from the mmc_host structure sdhci: support JMicron JMB38x chips sdhci: use PIO when DMA can't satisfy the request sdhci: don't warn about sdhci 2.0 controllers sdhci: describe quirks	2007-12-18 08:03:01 -08:00
Adrian Bunk	2fdd82bd88	block: let elv_register() return void elv_register() always returns 0, and there isn't anything it does where it should return an error (the only error condition is so grave that it's handled with a BUG_ON). Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-12-18 08:29:28 +01:00
Linus Torvalds	ededa4d396	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: fix ATAPI draining libata: update atapi_eh_request_sense() such that lbam/lbah contains buffer size libata-acpi: implement _GTF command filtering libata-acpi: improve _GTF execution error handling and reporting libata-acpi: improve ACPI disabling libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume libata-acpi: implement and use ata_acpi_init_gtm() libata-acpi: add new hooks ata_acpi_dissociate() and ata_acpi_on_disable() libata: ata_dev_disable() should be called from EH context libata: add more opcodes to ata.h libata: update ata_*_printk() macros such that level can be a variable libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters sata_mv: improve warnings about Highpoint RocketRAID 23xx cards libata: add ST3160023AS / 3.42 to NCQ blacklist libata: clear link->eh_info.serror from ata_std_postreset() sata_sil: fix spurious IRQ handling	2007-12-17 19:29:32 -08:00
Nishanth Aravamudan	368d2c6358	Revert "hugetlb: Add hugetlb_dynamic_pool sysctl" This reverts commit `54f9f80d65` ("hugetlb: Add hugetlb_dynamic_pool sysctl") Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool sysctl is not needed, as its semantics can be expressed by 0 in the overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl (pool enabled). (Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change) Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:17 -08:00
Nishanth Aravamudan	d1c3fb1f8f	hugetlb: introduce nr_overcommit_hugepages sysctl hugetlb: introduce nr_overcommit_hugepages sysctl While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I became convinced that having a boolean sysctl was insufficient: 1) To support per-node control of hugepages, I have previously submitted patches to add a sysfs attribute related to nr_hugepages. However, with a boolean global value and per-mount quota enforcement constraining the dynamic pool, adding corresponding control of the dynamic pool on a per-node basis seems inconsistent to me. 2) Administration of the hugetlb dynamic pool with multiple hugetlbfs mount points is, arguably, more arduous than it needs to be. Each quota would need to be set separately, and the sum would need to be monitored. To ease the administration, and to help make the way for per-node control of the static & dynamic hugepage pool, I added a separate sysctl, nr_overcommit_hugepages. This value serves as a high watermark for the overall hugepage pool, while nr_hugepages serves as a low watermark. The boolean sysctl can then be removed, as the condition nr_overcommit_hugepages > 0 indicates the same administrative setting as hugetlb_dynamic_pool == 1 Quotas still serve as local enforcement of the size of the pool on a per-mount basis. A few caveats: 1) There is a race whereby the global surplus huge page counter is incremented before a hugepage has allocated. Another process could then try grow the pool, and fail to convert a surplus huge page to a normal huge page and instead allocate a fresh huge page. I believe this is benign, as no memory is leaked (the actual pages are still tracked correctly) and the counters won't go out of sync. 2) Shrinking the static pool while a surplus is in effect will allow the number of surplus huge pages to exceed the overcommit value. As long as this condition holds, however, no more surplus huge pages will be allowed on the system until one of the two sysctls are increased sufficiently, or the surplus huge pages go out of use and are freed. Successfully tested on x86_64 with the current libhugetlbfs snapshot, modified to use the new sysctl. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Acked-by: Adam Litke <agl@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:17 -08:00
Adam Jackson	8d936626dd	apm_event{,info}_t are userspace types These types define the size of data read from /dev/apm_bios. They should not be hidden behind #ifdef __KERNEL__. This is killing my xserver compile, apm_event_t is used in the xserver source. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:16 -08:00
Andrew Morton	755271358c	fix headers_install make[3]: *** No rule to make target `/usr/src/devel/include/linux/ticable.h', needed by `/usr/src/devel/usr/include/linux/ticable.h'. Stop. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-17 19:28:15 -08:00
Tejun Heo	140b5e5911	libata: fix ATAPI draining With ATAPI transfer chunk size properly programmed, libata PIO HSM should be able to handle full spurious data chunks. Also, it's a good idea to suppress trailing data warning for misc ATAPI commands as there can be many of them per command - for example, if the chunk size is 16 and the drive tries to transfer 510 bytes, there can be 31 trailing data messages. This patch makes the following updates to libata ATAPI PIO HSM implementation. * Make it drain full spurious chunks. * Suppress trailing data warning message for misc commands. * Put limit on how many bytes can be drained. * If odd, round up consumed bytes and the number of bytes to be drained. This gets the number of bytes to drain right for drivers which do 16bit PIO. This patch is partial backport of improve-ATAPI-data-xfer patchset pending for #upstream. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:43:28 -05:00
Tejun Heo	398e07826b	libata-acpi: implement dev->gtf_cache and evaluate _GTF right after _STM during resume On certain implementations, _GTF evaluation depends on preceding _STM and both can be pretty picky about the configuration. Using _GTM result cached during controller initialization satisfies the most neurotic _STM implementation. However, libata evaluates _GTF after reset during device configuration and the hardware state can be different from what _GTF expects and can cause evaluation failure. This patch adds dev->gtf_cache and updates ata_dev_get_GTF() such that it uses the cached value if available. Cache is cleared with a call to ata_acpi_clear_gtf(). Because for SATA ACPI nodes _GTF must be evaluated after _SDD which can't be done till IDENTIFY is complete, _GTF caching from ata_acpi_on_resume() is used only for IDE ACPI nodes. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:14 -05:00
Tejun Heo	c05e6ff035	libata-acpi: implement and use ata_acpi_init_gtm() _GTM fetches currently configured transfer mode while _STM configures controller according to _GTM parameter and prepares transfer mode configuration TFs for _GTF. In many cases _GTM and _STM implementations are quite brittle and can't cope with configuration changed by libata. libata does not depend on ATA ACPI to configure devices. The only reason libata performs _GTM and _STM are to make _GTF evaluation succeed and libata also doesn't care about how _GTF TFs configure transfer mode. It overrides that configuration anyway, so from libata's POV, it doesn't matter what value is feeded to _STM as long as evaluation succeeds for _STM and following _GTF. This patch adds dev->__acpi_init_gtm and store initial _GTM values on host initialization before modified by reset and mode configuration. If the field is valid, ata_acpi_init_gtm() returns pointer to the saved _GTM structure; otherwise, NULL. This saved value is used for _STM during resume and peek at BIOS/firmware programmed initial timing for later use. The accessor is there to make building w/o ACPI easy as dev->__acpi_init doesn't exist if ACPI is not enabled. On driver detach, the initial BIOS configuration is restored by executing _STM with the initial _GTM values such that the next driver can also use the initial BIOS configured values. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:14 -05:00
Tejun Heo	ce2e0abbd3	libata: add more opcodes to ata.h Add constants for DEVICE CONFIGURATION OVERLAY and SET_MAX to include/linux/ata.h. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Tejun Heo	c2e366a107	libata: update ata_*_printk() macros such that level can be a variable Make prink helpers format @lv together rather than prepending to the format string as constant. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Tejun Heo	0d02f0b22b	libata-acpi: adjust constness in ata_acpi_gtm/stm() parameters * No internal function uses const ata_port. Drop const from @ap. * Make ata_acpi_stm() copy @stm before using it and change @stm to const. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-17 20:33:12 -05:00
Linus Torvalds	87d5df6bde	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: HOWTO: update misspelling and word incorrected add stable_api_nonsense.txt in korean HOWTO: change addresses of maintainer and lxr url for Korean HOWTO Add Documentation for FAIR_USER_SCHED sysfs files HOWTO: Change man-page maintainer address for Japanese HOWTO tipar: remove obsolete module kobject: fix the documentation of how kobject_set_name works	2007-12-17 13:33:47 -08:00
Linus Torvalds	4942093e9d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: revert portions of "UNUSUAL_DEV: Sync up some reported devices from Ubuntu" usb: Remove broken optimisation in OHCI IRQ handler USB: at91_udc: correct hanging while disconnecting usb cable USB: use IRQF_DISABLED for HCD interrupt handlers USB: fix locking loop by avoiding flush_scheduled_work usb.h: fix kernel-doc warning USB: option: Bind to the correct interface of the Huawei E220 USB: cp2101: new device id usb-storage: Fix devices that cannot handle 32k transfers USB: sierra: fix product id	2007-12-17 13:33:30 -08:00
Linus Torvalds	07232b9715	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: ide: fix ->io_32bit race in set_io_32bit() ide: remove stale changelog from ide-probe.c ide: remove stale changelog from ide-disk.c ide: remove dead code from __ide_dma_test_irq() hpt366: fix HPT37x PIO mode timings (take 2) pdc202xx_new: fix Promise TX4 support ide-cd: remove dead post_transform_command() ide: DMA reporting and validity checking fixes (take 3) ide: add /sys/bus/ide/devices/*/{model,firmware,serial} sysfs entries ide: coding style fixes for drivers/ide/setup-pci.c ide: fix ide_scan_pcibus() error message ide: deprecate CONFIG_BLK_DEV_OFFBOARD ide: add missing checks for control register existence ide-scsi: add ide_scsi_hex_dump() helper	2007-12-17 13:32:49 -08:00
Randy Dunlap	f88ed90d86	usb.h: fix kernel-doc warning Fix kernel-doc warning in usb.h: Warning(linux-2.6.24-rc3-git7//include/linux/usb.h:166): No description found for parameter 'sysfs_files_created' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:47:15 -08:00
Doug Maxey	33abc04f04	usb-storage: Fix devices that cannot handle 32k transfers When a device cannot handle the smallest previously limited transfer size (64 blocks) without stalling, limit the device to the amount of packets that fit in a platform native page. The lowest possible limit is PAGE_CACHE_SIZE, so if the device is ever used on a platform that has larger than 8K pages, you lose unless you can convince the device firmware folks to fix the issue. Cc: Mathew Dharm <mdharm-scsi@one-eyed-alien.net> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Doug Maxey <dwm@austin.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:47:14 -08:00
Romain Liévin	cb8c9b6de0	tipar: remove obsolete module tipar: remove obsolete module The tipar character driver was used to implement bit-banging access to Texas Instruments parallel link cable. A user-land method now exists thru PPDEV & PARPORT. Signed-off-by: Romain Liévin <roms@lpg.ticalc.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-12-17 10:33:18 -08:00
Patrick McHardy	4a9ecd5960	[NETFILTER]: bridge: fix missing link layer headers on outgoing routed packets As reported by Damien Thebault, the double POSTROUTING hook invocation fix caused outgoing packets routed between two bridges to appear without a link-layer header. The reason for this is that we're skipping the br_nf_post_routing hook for routed packets now and don't save the original link layer header, but nevertheless tries to restore it on output, causing corruption. The root cause for this is that skb->nf_bridge has no clearly defined lifetime and is used to indicate all kind of things, but that is quite complicated to fix. For now simply don't touch these packets and handle them like packets from any other device. Tested-by: Damien Thebault <damien.thebault@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-14 13:54:39 -08:00
Bartlomiej Zolnierkiewicz	3ab7efe8e2	ide: DMA reporting and validity checking fixes (take 3) * ide_xfer_verbose() fixups: - beautify returned mode names - fix PIO5 reporting - make it return 'const char ' Change printk() level from KERN_DEBUG to KERN_INFO in ide_find_dma_mode(). * Add ide_id_dma_bug() helper based on ide_dma_verbose() to check for invalid DMA info in identify block. * Use ide_id_dma_bug() in ide_tune_dma() and ide_driveid_update(). As a result DMA won't be tuned or will be disabled after tuning if device reports inconsistent info about enabled DMA mode (ide_dma_verbose() does the same checks while the IDE device is probed by ide-{cd,disk} device driver). * Remove no longer needed ide_dma_verbose(). This patch should fix the following problem with out-of-sync IDE messages reported by Nick Warne: hdd: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache<7>hdd: skipping word 93 validity check , UDMA(66) and later debugged by Mark Lord to be caused by: ide_dma_verbose() printk( ... "2048kB Cache"); eighty_ninty_three() printk(KERN_DEBUG "%s: skipping word 93 validity check\n"); ide_dma_verbose() printk(", UDMA(66)" Please note that as a result ide-{cd,disk} device drivers won't report the DMA speed used but this is intended since now DMA mode being used is always reported by IDE core code. v2: * fixes suggested by Randy: - use KERN_CONT for printk()-s in ide-{cd,disk}.c - don't remove argument name from ide_xfer_verbose() declaration v3: * Remove incorrect check for (id->field_valid & 1) from ide_id_dma_bug() (spotted by Sergei). * "XFER SLOW" -> "PIO SLOW" in ide_xfer_verbose() (suggested by Sergei). * Fix ide_find_dma_mode() to report the correct mode ('mode' after being limited by 'req_mode'). Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Nick Warne <nick@ukfsn.org> Cc: Mark Lord <lkml@rtr.ca> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>	2007-12-12 23:31:58 +01:00
Nicolas Pitre	cc3000e4ef	mmc: remove unused 'mode' from the mmc_host structure This field and corresponding defines are simply never used anywhere in the code. But its mere presence is enough to confuse some host driver authors who attempt to rely on it. Let's eliminate the possibility for confusion and remove it entirely. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2007-12-12 20:01:01 +01:00
Pierre Ossman	84c46a53fc	sdhci: support JMicron JMB38x chips The JMicron JMB38x chip doesn't support transfers that aren't 32-bit aligned (both size and start address). It also doesn't like switching between PIO and DMA mode, so it needs to be reset after each request. Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>	2007-12-12 20:01:00 +01:00
Jay Vosburgh	6f6652be18	bonding: Add new layer2+3 hash for xor/802.3ad modes Add new hash for balance-xor and 802.3ad modes. Originally submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by Jay Vosburgh to move setting of hash policy out of line, tweak the documentation update and add version update to 3.2.2. Glenn's original comment follows: Included is a patch for a new xmit_hash_policy for the bonding driver that selects slaves based on MAC and IP information. This is a middle ground between what currently exists in the layer2 only policy and the layer3+4 policy. This policy strives to be fully 802.3ad compliant by transmitting every packet of any particular flow over the same link. As documented the layer3+4 policy is not fully compliant for extreme cases such as ip fragmentation, so this policy is a nice compromise for environments that require full compliance but desire more than the layer2 only policy. Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-07 15:00:32 -05:00
Richard Purdie	dc47206e55	leds: Fix led trigger locking bugs Convert part of the led trigger core from rw spinlocks to rw semaphores. We're calling functions which can sleep from invalid contexts otherwise. Fixes bug #9264. Signed-off-by: Richard Purdie <rpurdie@rpsys.net>	2007-12-07 09:06:53 +00:00
Linus Torvalds	7e1fb765c6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: futex: correctly return -EFAULT not -EINVAL lockdep: in_range() fix lockdep: fix debug_show_all_locks() sched: style cleanups futex: fix for futex_wait signal stack corruption	2007-12-05 09:27:46 -08:00
Linus Torvalds	ad658cec23	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: VM/Security: add security hook to do_brk Security: round mmap hint address above mmap_min_addr security: protect from stack expantion into low vm addresses Security: allow capable check to permit mmap or low vm space SELinux: detect dead booleans SELinux: do not clear f_op when removing entries	2007-12-05 09:26:52 -08:00
Linus Torvalds	2a1292b36b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [LRO]: fix lro_gen_skb() alignment [TCP]: NAGLE_PUSH seems to be a wrong way around [TCP]: Move prior_in_flight collect to more robust place [TCP] FRTO: Use of existing funcs make code more obvious & robust [IRDA]: Move ircomm_tty_line_info() under #ifdef CONFIG_PROC_FS [ROSE]: Trivial compilation CONFIG_INET=n case [IPVS]: Fix sched registration race when checking for name collision. [IPVS]: Don't leak sysctl tables if the scheduler registration fails.	2007-12-05 09:26:13 -08:00
Alexey Dobriyan	5a622f2d0f	proc: fix proc_dir_entry refcounting Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! / free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); / boom! / if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Jan Kara	d4beaf4ab5	jbd: Fix assertion failure in fs/jbd/checkpoint.c Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Steven Rostedt	ce6bd420f4	futex: fix for futex_wait signal stack corruption David Holmes found a bug in the -rt tree with respect to pthread_cond_timedwait. After trying his test program on the latest git from mainline, I found the bug was there too. The bug he was seeing that his test program showed, was that if one were to do a "Ctrl-Z" on a process that was in the pthread_cond_timedwait, and then did a "bg" on that process, it would return with a "-ETIMEDOUT" but early. That is, the timer would go off early. Looking into this, I found the source of the problem. And it is a rather nasty bug at that. Here's the relevant code from kernel/futex.c: (not in order in the file) [...] smlinkage long sys_futex(u32 __user uaddr, int op, u32 val, struct timespec __user utime, u32 __user uaddr2, u32 val3) { struct timespec ts; ktime_t t, tp = NULL; u32 val2 = 0; int cmd = op & FUTEX_CMD_MASK; if (utime && (cmd == FUTEX_WAIT \|\| cmd == FUTEX_LOCK_PI)) { if (copy_from_user(&ts, utime, sizeof(ts)) != 0) return -EFAULT; if (!timespec_valid(&ts)) return -EINVAL; t = timespec_to_ktime(ts); if (cmd == FUTEX_WAIT) t = ktime_add(ktime_get(), t); tp = &t; } [...] return do_futex(uaddr, op, val, tp, uaddr2, val2, val3); } [...] long do_futex(u32 __user uaddr, int op, u32 val, ktime_t timeout, u32 __user uaddr2, u32 val2, u32 val3) { int ret; int cmd = op & FUTEX_CMD_MASK; struct rw_semaphore fshared = NULL; if (!(op & FUTEX_PRIVATE_FLAG)) fshared = &current->mm->mmap_sem; switch (cmd) { case FUTEX_WAIT: ret = futex_wait(uaddr, fshared, val, timeout); [...] static int futex_wait(u32 __user uaddr, struct rw_semaphore fshared, u32 val, ktime_t abs_time) { [...] struct restart_block restart; restart = &current_thread_info()->restart_block; restart->fn = futex_wait_restart; restart->arg0 = (unsigned long)uaddr; restart->arg1 = (unsigned long)val; restart->arg2 = (unsigned long)abs_time; restart->arg3 = 0; if (fshared) restart->arg3 \|= ARG3_SHARED; return -ERESTART_RESTARTBLOCK; [...] static long futex_wait_restart(struct restart_block restart) { u32 __user uaddr = (u32 __user )restart->arg0; u32 val = (u32)restart->arg1; ktime_t abs_time = (ktime_t )restart->arg2; struct rw_semaphore fshared = NULL; restart->fn = do_no_restart_syscall; if (restart->arg3 & ARG3_SHARED) fshared = &current->mm->mmap_sem; return (long)futex_wait(uaddr, fshared, val, abs_time); } So when the futex_wait is interrupt by a signal we break out of the hrtimer code and set up or return from signal. This code does not return back to userspace, so we set up a RESTARTBLOCK. The bug here is that we save the "abs_time" which is a pointer to the stack variable "ktime_t t" from sys_futex. This returns and unwinds the stack before we get to call our signal. On return from the signal we go to futex_wait_restart, where we update all the parameters for futex_wait and call it. But here we have a problem where abs_time is no longer valid. I verified this with print statements, and sure enough, what abs_time was set to ends up being garbage when we get to futex_wait_restart. The solution I did to solve this (with input from Linus Torvalds) was to add unions to the restart_block to allow system calls to use the restart with specific parameters. This way the futex code now saves the time in a 64bit value in the restart block instead of storing it on the stack. Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64 in thread_info.h, when there's a #ifdef __KERNEL__ just below that. Not sure what that is there for. If this turns out to be a problem, I've tested this with using "unsigned int" for u32 and "unsigned long long" for u64 and it worked just the same. I'm using u32 and u64 just to be consistent with what the futex code uses. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 15:46:09 +01:00
Andrew Gallatin	621544eb8c	[LRO]: fix lro_gen_skb() alignment Add a field to the lro_mgr struct so that drivers can specify how much padding is required to align layer 3 headers when a packet is copied into a freshly allocated skb by inet_lro.c:lro_gen_skb(). Without padding, skbs generated by LRO will cause alignment warnings on architectures which require strict alignment (seen on sparc64). Myri10GE is updated to use this field. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-12-05 05:37:32 -08:00
Eric Paris	7cd94146cd	Security: round mmap hint address above mmap_min_addr If mmap_min_addr is set and a process attempts to mmap (not fixed) with a non-null hint address less than mmap_min_addr the mapping will fail the security checks. Since this is just a hint address this patch will round such a hint address above mmap_min_addr. gcj was found to try to be very frugal with vm usage and give hint addresses in the 8k-32k range. Without this patch all such programs failed and with the patch they happily get a higher address. This patch is wrappad in CONFIG_SECURITY since mmap_min_addr doesn't exist without it and there would be no security check possible no matter what. So we should not bother compiling in this rounding if it is just a waste of time. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2007-12-06 00:25:10 +11:00
Anton Vorontsov	6f4a7f4183	PHY: Add the phy_device_release device method. Lately I've got this nice badness on mdio bus removal: Device 'e0103120:06' does not have a release() function, it is broken and must be fixed. ------------[ cut here ]------------ Badness at drivers/base/core.c:107 NIP: c015c1a8 LR: c015c1a8 CTR: c0157488 REGS: c34bdcf0 TRAP: 0700 Not tainted (2.6.23-rc5-g9ebadfbb-dirty) MSR: 00029032 <EE,ME,IR,DR> CR: 24088422 XER: 00000000 ... [c34bdda0] [c015c1a8] device_release+0x78/0x80 (unreliable) [c34bddb0] [c01354cc] kobject_cleanup+0x80/0xbc [c34bddd0] [c01365f0] kref_put+0x54/0x6c [c34bdde0] [c013543c] kobject_put+0x24/0x34 [c34bddf0] [c015c384] put_device+0x1c/0x2c [c34bde00] [c0180e84] mdiobus_unregister+0x2c/0x58 ... Though actually there is nothing broken, it just device subsystem core expects another "pattern" of resource managment. This patch implement phy device's release function, thus we're getting rid of this badness. Also small hidden bug fixed, hope none other introduced. ;-) Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Andy Fleming <afleming@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-04 15:06:33 -05:00
Linus Torvalds	ca6435f188	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: cpu accounting controller (V2)	2007-12-03 08:21:06 -08:00
Linus Torvalds	8002cedc1a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (27 commits) [INET]: Fix inet_diag dead-lock regression [NETNS]: Fix /proc/net breakage [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure [NETFILTER]: fix forgotten module release in xt_CONNMARK and xt_CONNSECMARK [NETFILTER]: xt_TCPMSS: remove network triggerable WARN_ON [DECNET]: dn_nl_deladdr() almost always returns no error [IPV6]: Restore IPv6 when MTU is big enough [RXRPC]: Add missing select on CRYPTO mac80211: rate limit wep decrypt failed messages rfkill: fix double-mutex-locking mac80211: drop unencrypted frames if encryption is expected mac80211: Fix behavior of ieee80211_open and ieee80211_close ieee80211: fix unaligned access in ieee80211_copy_snap mac80211: free ifsta->extra_ie and clear IEEE80211_STA_PRIVACY_INVOKED SCTP: Fix build issues with SCTP AUTH. SCTP: Fix chunk acceptance when no authenticated chunks were listed. SCTP: Fix the supported extensions paramter SCTP: Fix SCTP-AUTH to correctly add HMACS paramter. SCTP: Fix the number of HB transmissions. [TCP] illinois: Incorrect beta usage ...	2007-12-03 08:15:36 -08:00
Srivatsa Vaddagiri	d842de871c	sched: cpu accounting controller (V2) Commit `cfb5285660` removed a useful feature for us, which provided a cpu accounting resource controller. This feature would be useful if someone wants to group tasks only for accounting purpose and doesnt really want to exercise any control over their cpu consumption. The patch below reintroduces the feature. It is based on Paul Menage's original patch (Commit `62d0df6406`), with these differences: - Removed load average information. I felt it needs more thought (esp to deal with SMP and virtualized platforms) and can be added for 2.6.25 after more discussions. - Convert group cpu usage to be nanosecond accurate (as rest of the cfs stats are) and invoke cpuacct_charge() from the respective scheduler classes - Make accounting scalable on SMP systems by splitting the usage counter to be per-cpu - Move the code from kernel/cpu_acct.c to kernel/sched.c (since the code is not big enough to warrant a new file and also this rightly needs to live inside the scheduler. Also things like accessing rq->lock while reading cpu usage becomes easier if the code lived in kernel/sched.c) The patch also modifies the cpu controller not to provide the same accounting information. Tested-by: Balbir Singh <balbir@linux.vnet.ibm.com> Tested the patches on top of 2.6.24-rc3. The patches work fine. Ran some simple tests like cpuspin (spin on the cpu), ran several tasks in the same group and timed them. Compared their time stamps with cpuacct.usage. Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-12-02 20:04:49 +01:00
Kim Phillips	7d400a4c58	phylib: add PHY interface modes for internal delay for tx and rx only Allow phylib specification of cases where hardware needs to configure PHYs for Internal Delay only on either RX or TX (not both). Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Tested-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: Li Yang <leoli@freescale.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-12-01 16:32:30 -05:00
Jeff Garzik	c99da91e7a	Merge branch 'master' into upstream-fixes	2007-12-01 16:18:56 -05:00
Eric W. Biederman	2b1e300a9d	[NETNS]: Fix /proc/net breakage Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-12-02 00:33:17 +11:00
Jiri Kosina	8853c202b4	RTC: convert mutex to bitfield RTC code is using mutex to assure exclusive access to /dev/rtc. This is however wrong usage, as it leaves the mutex locked when returning into userspace, which is unacceptable. Convert rtc->char_lock into bit operation. Signed-off-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	a6643094e7	fuse: pass open flags to read and write Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...) after open, but fuse currently only sends the flags to userspace in open. To make it possible to correcly handle changing flags, send the current value to userspace in each read and write. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Huang, Ying	7c83172b98	x86_64 EFI boot support: EFI frame buffer driver This patch adds Graphics Output Protocol support to the kernel. UEFI2.0 spec deprecates Universal Graphics Adapter (UGA) protocol and only Graphics Output Protocol (GOP) is produced. Therefore, the boot loader needs to query the UEFI firmware with appropriate Output Protocol and pass the video information to the kernel. As a result of GOP protocol, an EFI framebuffer driver is needed for displaying console messages. The patch adds a EFI framebuffer driver. The EFI frame buffer driver in this patch is based on the Intel Mac framebuffer driver. The ELILO bootloader takes care of passing the video information as appropriate for EFI firmware. The framebuffer driver has been tested in i386 kernel and x86_64 kernel on EFI platform. Signed-off-by: Chandramouli Narayanan <mouli@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Tobias Poschwatta	6454d1f903	fix up ext2_fs.h for userspace after reservations backport In commit a686cd898bd999fd026a51e90fb0a3410d258ddb: "Val's cross-port of the ext3 reservations code into ext2." include/linux/ext2_fs.h got a new function whose return value is only defined if __KERNEL__ is defined. Putting #ifdef __KERNEL__ around the function seems to help, patch below. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:53 -08:00
Thomas Bogendoerfer	68576cf122	IP22ZILOG: fix lockup and sysrq - fix lockup when switching from early console to real console - make sysrq reliable - fix panic, if sysrq is issued before console is opened Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:53 -08:00
David Woodhouse	79288f5e93	Fix <linux/kd.h> usage in userspace For reasons unclear to me, glibc's <sys/kd.h> deliberately defeats the attempt we make in <linux/kd.h> to include <linux/types.h> For now, change the one instance of __u32 to 'unsigned int' instead because it's breaking userspace. We should probably also remove our inclusion of <linux/types.h>, since we don't use it -- but that's not a change to make in -rc. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: David Woodhouse <dwmw2@infradead.org> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Zhao Yakui	a7839e9606	PNP: increase the maximum number of resources On some systems the number of resources(IO,MEM) returnedy by PNP device is greater than the PNP constant, for example motherboard devices. It brings that some resources can't be reserved and resource confilicts. This will cause PCI resources are assigned wrongly in some systems, and cause hang. This is a regression since we deleted ACPI motherboard driver and use PNP system driver. [akpm@linux-foundation.org: fix text and coding-style a bit] Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Linus Torvalds	bb0851ff9d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (25 commits) USB: s3c2410 gadget: ensure vbus pin in input mode during read USB: s3c2410 gadget: allow sharing of vbus irq USB: s3c2410 gadget: Header move fixups USB: usb-storage: unusual_devs entry for JetFlash TS1GJF2A USB: fix up EHCI startup synchronization USB: make the microtek driver and HAL cooperate USB: uevent environment key fix USB: keep track of whether interface sysfs files exist USB: sierra: new product id USB HCD: avoid duplicate local_irq_disable() USB: mailing lists have changed USB: remove USB HUB entry from MAINTAINERS USB: fix directory references in usb/README USB: add support for an older firmware revision for the Nikon D200 USB: FIx locks and urb->status in adutux (updated) USB: power-management documenation update USB: Fix signr comment in usbdevice_fs.h usbserial: fix inconsistent lock state USB: fix usbled disconnect read race #2 USB: free memory when writing fails in usb/serial/mos7840.c ...	2007-11-28 16:03:09 -08:00
Alan Stern	7e61559f61	USB: keep track of whether interface sysfs files exist This patch (as1009) solves the problem of multiple registrations for USB sysfs files in a more satisfying way than the existing code. It simply adds a flag to keep track of whether or not the files have been created; that way the files can be created or removed as needed. Signed-off-by: Alan Stern <stern@rowland.harvard.edu>	2007-11-28 13:58:35 -08:00
Phil Endecott	bc59462b80	USB: Fix signr comment in usbdevice_fs.h This trivial documentation patch corrects a comment in usbdevice_fs.h; it previously suggested that the signal would only be sent on error, but I am told that it is sent on both successful and unsuccessful completion, and that zero indicates that no signal should be sent. Signed-off-by: Phil Endecott <spam_from_usb_devel@chezphil.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-28 13:58:34 -08:00
Ingo Molnar	deaf2227dd	sched: clean up, move __sched_text_start/end to sched.h move __sched_text_start/end to sched.h. No code changed: text data bss dec hex filename 26582 2310 28 28920 70f8 sched.o.before 26582 2310 28 28920 70f8 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-11-28 15:52:56 +01:00
Linus Torvalds	9c8ff4f4da	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: scatterlist: add more safeguards Revert "ll_rw_blk: temporarily enable max_segments tweaking" mmc: Add missing sg_init_table() call block: Fix memory leak in alloc_disk_node() alpha: fix sg_page breakage blktrace: Make sure BLKTRACETEARDOWN does the full cleanup.	2007-11-27 14:21:19 -08:00
Linus Torvalds	febb187761	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: adds the context menu key (HUT GenDesc 0x84) Input: add definitions for frame forward and frame back keys Input: bf54x-keys - keypad does not exist on BF544 parts Input: gpio-keys - request and configure GPIOs Input: i8042 - add i8042.noloop quirk for MS Virtual Machine Sonypi: use synchronize_irq instead of sycnronize_sched sonypi: fit input devices into sysfs tree sony-laptop: fit input devices into sysfs tree	2007-11-27 14:20:35 -08:00
Tejun Heo	645a8d9462	scatterlist: add more safeguards Add more safeguards to protect against misinterpreting a chain entry as a normal scatterlist and vice-versa. * Make sure the entry isn't a chain when assigning and reading a normal sg. * Clear offset and length when chaining. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-27 09:30:39 +01:00
Aristeu Rozanski	35baef2afb	Input: adds the context menu key (HUT GenDesc 0x84) Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-11-27 00:47:04 -05:00
Aristeu Rozanski	c23f1f9c40	Input: add definitions for frame forward and frame back keys Signed-off-by: Aristeu Rozanski <aris@ruivo.org> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>	2007-11-27 00:46:57 -05:00
Linus Torvalds	8c27eba549	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (41 commits) [XFRM]: Fix leak of expired xfrm_states [ATM]: [he] initialize lock and tasklet earlier [IPV4]: Remove bogus ifdef mess in arp_process [SKBUFF]: Free old skb properly in skb_morph [IPV4]: Fix memory leak in inet_hashtables.h when NUMA is on [IPSEC]: Temporarily remove locks around copying of non-atomic fields [TCP] MTUprobe: Cleanup send queue check (no need to loop) [TCP]: MTUprobe: receiver window & data available checks fixed [MAINTAINERS]: tlan list is subscribers-only [SUNRPC]: Remove SPIN_LOCK_UNLOCKED [SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static [PFKEY]: Sending an SADB_GET responds with an SADB_GET [IRDA]: Compilation for CONFIG_INET=n case [IPVS]: Fix compiler warning about unused register_ip_vs_protocol [ARP]: Fix arp reply when sender ip 0 [IPV6] TCPMD5: Fix deleting key operation. [IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool(). [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps. [IPV4] TCPMD5: Omit redundant NULL check for kfree() argument. ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution ...	2007-11-26 20:09:07 -08:00
Linus Torvalds	423eaf8f00	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFS: Clean up new multi-segment direct I/O changes NFS: Ensure we return zero if applications attempt to write zero bytes NFS: Support multiple segment iovecs in the NFS direct I/O path NFS: Introduce iovec I/O helpers to fs/nfs/direct.c SUNRPC: Add missing "space" to net/sunrpc/auth_gss.c SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static NFS: fs/nfs/dir.c should #include "internal.h" NFS: make nfs_wb_page_priority() static NFS: mount failure causes bad page state SUNRPC: remove NFS/RDMA client's binary sysctls kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server sunrpc: rpc_pipe_poll may miss available data in some cases sunrpc: return error if unsupported enctype or cksumtype is encountered sunrpc: gss_pipe_downcall(), don't assume all errors are transient NFS: Fix the ustat() regression	2007-11-26 19:42:59 -08:00
Linus Torvalds	ff1ea52fa3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 * git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86: x86: fix APIC related bootup crash on Athlon XP CPUs time: add ADJ_OFFSET_SS_READ x86: export the symbol empty_zero_page on the 32-bit x86 architecture x86: fix kprobes_64.c inlining borkage pci: use pci=bfsort for HP DL385 G2, DL585 G2 x86: correctly set UTS_MACHINE for "make ARCH=x86" lockdep: annotate do_debug() trap handler x86: turn off iommu merge by default x86: fix ACPI compile for LOCAL_APIC=n x86: printk kernel version in WARN_ON and other dump_stack users ACPI: Set max_cstate to 1 for early Opterons. x86: fix NMI watchdog & 'stopped time' problem	2007-11-26 19:41:28 -08:00
Linus Torvalds	6d27294053	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (21 commits) libata: bump transfer chunk size if it's odd libata: Return proper ATA INT status in pata_bf54x driver pata_ali: trim trailing whitespace (fix checkpatch complaints) pata_isapnp: Polled devices pata_hpt37x: Fix cable detect bug spotted by Sergei pata_ali: Lots of problems still showing up with small ATAPI DMA pata_ali: Add Mitac 8317 and derivatives libata-core: List more documentation sources for reference ata_piix: Invalid use of writel/readl with iomap sata_sil24: fix sg table sizing pata_jmicron: fix disabled port handling in jmicron_pre_reset() pata_sil680: kill bogus reset code (take 2) ata_piix: port enable for the first SATA controller of ICH8 is 0xf not 0x3 ata_piix: only enable the first port on apple macbook pro ata_piix: reorganize controller IDs pata_sis.c: Add Packard Bell EasyNote K5305 to laptops libata-scsi: be tolerant of 12-byte ATAPI commands in 16-byte CDBs libata: use ATA_HORKAGE_STUCK_ERR for ATAPI tape drives libata: workaround DRQ=1 ERR=1 for ATAPI tape drives libata: remove unused functions ...	2007-11-26 19:18:22 -08:00
Linus Torvalds	6b41016032	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (39 commits) ACPI: EC: Workaround for optimized controllers (version 3) ACPI: EC: use printk_ratelimit(), add some DEBUG mode messages Revert "ACPI: EC: Workaround for optimized controllers" ACPI: fix two IRQ8 issues in IOAPIC mode ACPI: Add missing spaces to printk format cpuidle: fix HP nx6125 regression cpuidle: add sched_clock_idle_[sleep\|wakeup]_event() hooks cpuidle: fix C3 for no bus-master control case ACPI: thinkpad-acpi: fix oops when a module parameter has no value Revert "Fix very high interrupt rate for IRQ8 (rtc) unless pnpacpi=off" ACPI: EC: Don't init EC early if it has no _INI Revert "acpi: make ACPI_PROCFS default to y" Revert "ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS" ACPI: Split out control for /proc/acpi entries from battery, ac, and sbs. ACPI: Video: Increase buffer size for writes to brightness proc file. ACPI: EC: Workaround for optimized controllers ACPI: SBS: Fix retval warning ACPI: Enable MSR (FixedHW) support for T-States ACPI: Get throttling info from BIOS only after evaluating _PDC ACPI: Use _TSS for throttling control, when present. Add error checks. ...	2007-11-26 19:09:58 -08:00
Adrian Bunk	483066d62e	SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:50 -05:00
Adrian Bunk	5334eb13d4	NFS: make nfs_wb_page_priority() static nfs_wb_page_priority() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:48 -05:00
James Lentini	cfcb43ff7c	SUNRPC: remove NFS/RDMA client's binary sysctls Support for binary sysctls is being deprecated in 2.6.24. Since there are no applications using the NFS/RDMA client's binary sysctls, it makes sense to remove them. The patch below does this while leaving the /proc/sys interface unchanged. Please consider this for 2.6.24. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:21:19 -05:00
John Stultz	52bfb36050	time: add ADJ_OFFSET_SS_READ Michael Kerrisk reported that a long standing bug in the adjtimex() system call causes glibc's adjtime(3) function to deliver the wrong results if 'delta' is NULL. add the ADJ_OFFSET_SS_READ API detail, which will be used by glibc to fix this API compatibility bug. Also see: http://bugzilla.kernel.org/show_bug.cgi?id=6761 [ mingo@elte.hu: added patch description and made it backwards compatible ] NOTE: the new flag is defined 0xa001 so that it returns -EINVAL on older kernels - this way glibc can use it safely. Suggested by Ulrich Drepper. Acked-by: Ulrich Drepper <drepper@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2007-11-26 20:42:19 +01:00
Herbert Xu	2d4baff8da	[SKBUFF]: Free old skb properly in skb_morph The skb_morph function only freed the data part of the dst skb, but leaked the auxiliary data such as the netfilter fields. This patch fixes this by moving the relevant parts from __kfree_skb to skb_release_all and calling it in skb_morph. It also makes kfree_skbmem static since it's no longer called anywhere else and it now no longer does skb_release_data. Thanks to Yasuyuki KOZAKAI for finding this problem and posting a patch for it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-26 23:11:19 +08:00
Ayaz Abdulla	490dde8990	forcedeth: new mcp79 pci ids This patch adds new device ids and features for mcp79 devices into the forcedeth driver. Signed-off-by: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2007-11-23 20:54:01 -05:00
Adrian Bunk	5fe4a33430	[SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-22 19:38:25 +08:00
Heiko Carstens	37e3a6ac5a	[S390] appldata: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Heiko Carstens	43ebbf119a	[S390] cmm: remove unused binary sysctls. Remove binary sysctls that never worked due to missing strategy functions. Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Len Brown	95b00786f3	Pull cpuidle into release branch	2007-11-20 01:18:37 -05:00
Shaohua Li	61fd47e0c8	ACPI: fix two IRQ8 issues in IOAPIC mode Use mp_irqs[] to get PNP device's interrupt polarity and trigger. There are two reasons to do this: 1. BIOS bug for PNP interrupt 2. BIOS explictly does override mp_irqs[] should cover all the cases. http://bugzilla.kernel.org/show_bug.cgi?id=5243 http://bugzilla.kernel.org/show_bug.cgi?id=7679 http://bugzilla.kernel.org/show_bug.cgi?id=9153 [lenb: fixed !IOAPIC and 64-bit !SMP builds] Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-11-20 01:16:29 -05:00
Venkatesh Pallipadi	ddc081a195	cpuidle: fix HP nx6125 regression Fix for http://bugzilla.kernel.org/show_bug.cgi?id=9355 cpuidle always used to fallback to C2 if there is some bm activity while entering C3. But, presence of C2 is not always guaranteed. Change cpuidle algorithm to detect a safe_state to fallback in case of bm_activity and use that state instead of C2. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2007-11-19 21:43:22 -05:00
Albert Lee	2d3b8eea7f	libata: workaround DRQ=1 ERR=1 for ATAPI tape drives After an error condition, some ATAPI tape drives set DRQ=1 together with ERR=1 when asking the host to transfer the CDB of the next packet command (i.e. request sense). This patch, a revised version of Alan/Mark's previous patch, adds ATA_HORKAGE_STUCK_ERR to workaround the problem by ignoring the ERR bit and proceed sending the CDB. Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Mark Lord <liml@rtr.ca> Signed-off-by: Tejun Heo <htejun@gmail.com>	2007-11-19 12:28:11 +09:00
Adrian Bunk	21bef6dd2b	libata: remove unused functions This patch removes the following obsolete functions: - libata-core.c: __sata_phy_reset() - libata-core.c: sata_phy_reset() - libata-eh.c: ata_qc_timeout() - libata-eh.c: ata_eng_timeout() Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Tejun Heo <htejun@gmail.com>	2007-11-19 12:28:09 +09:00
Eric Paris	ec41878170	SELinux: return EOPNOTSUPP not ENOTSUPP ENOTSUPP is not a valid error code in the kernel (it is defined in some NFS internal error codes and has been improperly used other places). In the !CONFIG_SECURITY_SELINUX case though it is possible that we could return this from selinux_audit_rule_init(). This patch just returns the userspace valid EOPNOTSUPP. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2007-11-17 10:38:16 +11:00
Jean Delvare	5e31c2bd3c	i2c: Make i2c_check_addr static i2c_check_addr is only used inside i2c-core now, so we can make it static and stop exporting it. Thanks to David Brownell for noticing. Signed-off-by: Jean Delvare <khali@linux-fr.org>	2007-11-15 19:24:02 +01:00
Jan Kara	7c06a8dc64	Fix 64KB blocksize in ext3 directories With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry lenght. So we store 0xffff instead and convert value when read from / written to disk. The patch also converts some places to use ext3_next_entry() when we are changing them anyway. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:43 -08:00

... 3 4 5 6 7 ...

8802 Commits