Commit Graph

8633 Commits

Author SHA1 Message Date
Peter Zijlstra 78f2c7db60 sched: SCHED_FIFO/SCHED_RR watchdog timer
Introduce a new rlimit that allows the user to set a runtime timeout on
real-time tasks their slice. Once this limit is exceeded the task will receive
SIGXCPU.

So it measures runtime since the last sleep.

Input and ideas by Thomas Gleixner and Lennart Poettering.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Lennart Poettering <mzxreary@0pointer.de>
CC: Michael Kerrisk <mtk.manpages@googlemail.com>
CC: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:27 +01:00
Peter Zijlstra fa717060f1 sched: sched_rt_entity
Move the task_struct members specific to rt scheduling together.
A future optimization could be to put sched_entity and sched_rt_entity
into a union.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:27 +01:00
Paul E. McKenney e260be673a Preempt-RCU: implementation
This patch implements a new version of RCU which allows its read-side
critical sections to be preempted. It uses a set of counter pairs
to keep track of the read-side critical sections and flips them
when all tasks exit read-side critical section. The details
of this implementation can be found in this paper -

	http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf

and the article-

	http://lwn.net/Articles/253651/

This patch was developed as a part of the -rt kernel development and
meant to provide better latencies when read-side critical sections of
RCU don't disable preemption.  As a consequence of keeping track of RCU
readers, the readers have a slight overhead (optimizations in the paper).
This implementation co-exists with the "classic" RCU implementations
and can be switched to at compiler.

Also includes RCU tracing summarized in debugfs.

[ akpm@linux-foundation.org: build fixes on non-preempt architectures ]

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:24 +01:00
Paul E. McKenney 01c1c660f4 Preempt-RCU: reorganize RCU code into rcuclassic.c and rcupdate.c
This patch re-organizes the RCU code to enable multiple implementations
of RCU. Users of RCU continues to include rcupdate.h and the
RCU interfaces remain the same. This is in preparation for
subsequently merging the preemptible RCU implementation.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:24 +01:00
Dipankar Sarma c2d727aa2f Preempt-RCU: Use softirq instead of tasklets for
This patch makes RCU use softirq instead of tasklets.

It also adds a memory barrier after raising the softirq
inorder to ensure that the cpu sees the most recently updated
value of rcu->cur while processing callbacks.
The discussion of the related theoretical race pointed out
by James Huang can be found here --> http://lkml.org/lkml/2007/11/20/603

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
Reviewed-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:23 +01:00
Steven Rostedt cb46984504 sched: RT-balance, add new methods to sched_class
Dmitry Adamushko found that the current implementation of the RT
balancing code left out changes to the sched_setscheduler and
rt_mutex_setprio.

This patch addresses this issue by adding methods to the schedule classes
to handle being switched out of (switched_from) and being switched into
(switched_to) a sched_class. Also a method for changing of priorities
is also added (prio_changed).

This patch also removes some duplicate logic between rt_mutex_setprio and
sched_setscheduler.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:22 +01:00
Steven Rostedt 9a897c5a67 sched: RT-balance, replace hooks with pre/post schedule and wakeup methods
To make the main sched.c code more agnostic to the schedule classes.
Instead of having specific hooks in the schedule code for the RT class
balancing. They are replaced with a pre_schedule, post_schedule
and task_wake_up methods. These methods may be used by any of the classes
but currently, only the sched_rt class implements them.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:22 +01:00
Ingo Molnar 32525d022a sched: whitespace cleanups in topology.h
whitespace cleanups in topology.h.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:20 +01:00
Ingo Molnar 52d853431e sched: reactivate fork balancing
reactivate fork balancing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:20 +01:00
Gregory Haskins 57d885fea0 sched: add sched-domain roots
We add the notion of a root-domain which will be used later to rescope
global variables to per-domain variables.  Each exclusive cpuset
essentially defines an island domain by fully partitioning the member cpus
from any other cpuset.  However, we currently still maintain some
policy/state as global variables which transcend all cpusets.  Consider,
for instance, rt-overload state.

Whenever a new exclusive cpuset is created, we also create a new
root-domain object and move each cpu member to the root-domain's span.
By default the system creates a single root-domain with all cpus as
members (mimicking the global state we have today).

We add some plumbing for storing class specific data in our root-domain.
Whenever a RQ is switching root-domains (because of repartitioning) we
give each sched_class the opportunity to remove any state from its old
domain and add state to the new one.  This logic doesn't have any clients
yet but it will later in the series.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Christoph Lameter <clameter@sgi.com>
CC: Paul Jackson <pj@sgi.com>
CC: Simon Derr <simon.derr@bull.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:18 +01:00
Gregory Haskins e7693a362e sched: de-SCHED_OTHER-ize the RT path
The current wake-up code path tries to determine if it can optimize the
wake-up to "this_cpu" by computing load calculations.  The problem is that
these calculations are only relevant to SCHED_OTHER tasks where load is king.
For RT tasks, priority is king.  So the load calculation is completely wasted
bandwidth.

Therefore, we create a new sched_class interface to help with
pre-wakeup routing decisions and move the load calculation as a function
of CFS task's class.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:09 +01:00
Gregory Haskins 73fe6aae84 sched: add RT-balance cpu-weight
Some RT tasks (particularly kthreads) are bound to one specific CPU.
It is fairly common for two or more bound tasks to get queued up at the
same time.  Consider, for instance, softirq_timer and softirq_sched.  A
timer goes off in an ISR which schedules softirq_thread to run at RT50.
Then the timer handler determines that it's time to smp-rebalance the
system so it schedules softirq_sched to run.  So we are in a situation
where we have two RT50 tasks queued, and the system will go into
rt-overload condition to request other CPUs for help.

This causes two problems in the current code:

1) If a high-priority bound task and a low-priority unbounded task queue
   up behind the running task, we will fail to ever relocate the unbounded
   task because we terminate the search on the first unmovable task.

2) We spend precious futile cycles in the fast-path trying to pull
   overloaded tasks over.  It is therefore optimial to strive to avoid the
   overhead all together if we can cheaply detect the condition before
   overload even occurs.

This patch tries to achieve this optimization by utilizing the hamming
weight of the task->cpus_allowed mask.  A weight of 1 indicates that
the task cannot be migrated.  We will then utilize this information to
skip non-migratable tasks and to eliminate uncessary rebalance attempts.

We introduce a per-rq variable to count the number of migratable tasks
that are currently running.  We only go into overload if we have more
than one rt task, AND at least one of them is migratable.

In addition, we introduce a per-task variable to cache the cpus_allowed
weight, since the hamming calculation is probably relatively expensive.
We only update the cached value when the mask is updated which should be
relatively infrequent, especially compared to scheduling frequency
in the fast path.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:07 +01:00
Ingo Molnar 82a1fcb902 softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks
this patch extends the soft-lockup detector to automatically
detect hung TASK_UNINTERRUPTIBLE tasks. Such hung tasks are
printed the following way:

 ------------------>
 INFO: task prctl:3042 blocked for more than 120 seconds.
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message
 prctl         D fd5e3793     0  3042   2997
        f6050f38 00000046 00000001 fd5e3793 00000009 c06d8264 c06dae80 00000286
        f6050f40 f6050f00 f7d34d90 f7d34fc8 c1e1be80 00000001 f6050000 00000000
        f7e92d00 00000286 f6050f18 c0489d1a f6050f40 00006605 00000000 c0133a5b
 Call Trace:
  [<c04883a5>] schedule_timeout+0x6d/0x8b
  [<c04883d8>] schedule_timeout_uninterruptible+0x15/0x17
  [<c0133a76>] msleep+0x10/0x16
  [<c0138974>] sys_prctl+0x30/0x1e2
  [<c0104c52>] sysenter_past_esp+0x5f/0xa5
  =======================
 2 locks held by prctl/3042:
 #0:  (&sb->s_type->i_mutex_key#5){--..}, at: [<c0197d11>] do_fsync+0x38/0x7a
 #1:  (jbd_handle){--..}, at: [<c01ca3d2>] journal_start+0xc7/0xe9
 <------------------

the current default timeout is 120 seconds. Such messages are printed
up to 10 times per bootup. If the system has crashed already then the
messages are not printed.

if lockdep is enabled then all held locks are printed as well.

this feature is a natural extension to the softlockup-detector (kernel
locked up without scheduling) and to the NMI watchdog (kernel locked up
with IRQs disabled).

[ Gautham R Shenoy <ego@in.ibm.com>: CPU hotplug fixes. ]
[ Andrew Morton <akpm@linux-foundation.org>: build warning fix. ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-01-25 21:08:02 +01:00
Ingo Molnar d0d23b5432 cpu-hotplug: fix build on !CONFIG_SMP
fix build on !CONFIG_SMP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:02 +01:00
Gautham R Shenoy 95402b3829 cpu-hotplug: replace per-subsystem mutexes with get_online_cpus()
This patch converts the known per-subsystem mutexes to get_online_cpus
put_online_cpus. It also eliminates the CPU_LOCK_ACQUIRE and
CPU_LOCK_RELEASE hotplug notification events.

Signed-off-by: Gautham  R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:02 +01:00
Gautham R Shenoy 86ef5c9a8e cpu-hotplug: replace lock_cpu_hotplug() with get_online_cpus()
Replace all lock_cpu_hotplug/unlock_cpu_hotplug from the kernel and use
get_online_cpus and put_online_cpus instead as it highlights the
refcount semantics in these operations.

The new API guarantees protection against the cpu-hotplug operation, but
it doesn't guarantee serialized access to any of the local data
structures. Hence the changes needs to be reviewed.

In case of pseries_add_processor/pseries_remove_processor, use
cpu_maps_update_begin()/cpu_maps_update_done() as we're modifying the
cpu_present_map there.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:02 +01:00
Gautham R Shenoy d221938c04 cpu-hotplug: refcount based cpu hotplug
This patch implements a Refcount + Waitqueue based model for
cpu-hotplug.

Now, a thread which wants to prevent cpu-hotplug, will bump up a global
refcount and the thread which wants to perform a cpu-hotplug operation
will block till the global refcount goes to zero.

The readers, if any, during an ongoing cpu-hotplug operation are blocked
until the cpu-hotplug operation is over.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Paul Jackson <pj@sgi.com> [For !CONFIG_HOTPLUG_CPU ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:01 +01:00
Srivatsa Vaddagiri 6b2d770026 sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups
The current load balancing scheme isn't good enough for precise
group fairness.

For example: on a 8-cpu system, I created 3 groups as under:

	a = 8 tasks (cpu.shares = 1024)
	b = 4 tasks (cpu.shares = 1024)
	c = 3 tasks (cpu.shares = 1024)

a, b and c are task groups that have equal weight. We would expect each
of the groups to receive 33.33% of cpu bandwidth under a fair scheduler.

This is what I get with the latest scheduler git tree:

Signed-off-by: Ingo Molnar <mingo@elte.hu>
--------------------------------------------------------------------------------
Col1  | Col2    | Col3  |  Col4
------|---------|-------|-------------------------------------------------------
a     | 277.676 | 57.8% | 54.1%  54.1%  54.1%  54.2%  56.7%  62.2%  62.8% 64.5%
b     | 116.108 | 24.2% | 47.4%  48.1%  48.7%  49.3%
c     |  86.326 | 18.0% | 47.5%  47.9%  48.5%
--------------------------------------------------------------------------------

Explanation of o/p:

Col1 -> Group name
Col2 -> Cumulative execution time (in seconds) received by all tasks of that
	group in a 60sec window across 8 cpus
Col3 -> CPU bandwidth received by the group in the 60sec window, expressed in
        percentage. Col3 data is derived as:
		Col3 = 100 * Col2 / (NR_CPUS * 60)
Col4 -> CPU bandwidth received by each individual task of the group.
		Col4 = 100 * cpu_time_recd_by_task / 60

[I can share the test case that produces a similar o/p if reqd]

The deviation from desired group fairness is as below:

	a = +24.47%
	b = -9.13%
	c = -15.33%

which is quite high.

After the patch below is applied, here are the results:

--------------------------------------------------------------------------------
Col1  | Col2    | Col3  |  Col4
------|---------|-------|-------------------------------------------------------
a     | 163.112 | 34.0% | 33.2%  33.4%  33.5%  33.5%  33.7%  34.4%  34.8% 35.3%
b     | 156.220 | 32.5% | 63.3%  64.5%  66.1%  66.5%
c     | 160.653 | 33.5% | 85.8%  90.6%  91.4%
--------------------------------------------------------------------------------

Deviation from desired group fairness is as below:

	a = +0.67%
	b = -0.83%
	c = +0.17%

which is far better IMO. Most of other runs have yielded a deviation within
+-2% at the most, which is good.

Why do we see bad (group) fairness with current scheuler?
=========================================================

Currently cpu's weight is just the summation of individual task weights.
This can yield incorrect results. For ex: consider three groups as below
on a 2-cpu system:

	CPU0	CPU1
---------------------------
	A (10)  B(5)
		C(5)
---------------------------

Group A has 10 tasks, all on CPU0, Group B and C have 5 tasks each all
of which are on CPU1. Each task has the same weight (NICE_0_LOAD =
1024).

The current scheme would yield a cpu weight of 10240 (10*1024) for each cpu and
the load balancer will think both CPUs are perfectly balanced and won't
move around any tasks. This, however, would yield this bandwidth:

	A = 50%
	B = 25%
	C = 25%

which is not the desired result.

What's changing in the patch?
=============================

	- How cpu weights are calculated when CONFIF_FAIR_GROUP_SCHED is
	  defined (see below)
	- API Change
		- Two tunables introduced in sysfs (under SCHED_DEBUG) to
		  control the frequency at which the load balance monitor
		  thread runs.

The basic change made in this patch is how cpu weight (rq->load.weight) is
calculated. Its now calculated as the summation of group weights on a cpu,
rather than summation of task weights. Weight exerted by a group on a
cpu is dependent on the shares allocated to it and also the number of
tasks the group has on that cpu compared to the total number of
(runnable) tasks the group has in the system.

Let,
	W(K,i)  = Weight of group K on cpu i
	T(K,i)  = Task load present in group K's cfs_rq on cpu i
	T(K)    = Total task load of group K across various cpus
	S(K) 	= Shares allocated to group K
	NRCPUS	= Number of online cpus in the scheduler domain to
	 	  which group K is assigned.

Then,
	W(K,i) = S(K) * NRCPUS * T(K,i) / T(K)

A load balance monitor thread is created at bootup, which periodically
runs and adjusts group's weight on each cpu. To avoid its overhead, two
min/max tunables are introduced (under SCHED_DEBUG) to control the rate
at which it runs.

Fixes from: Peter Zijlstra <a.p.zijlstra@chello.nl>

- don't start the load_balance_monitor when there is only a single cpu.
- rename the kthread because its currently longer than TASK_COMM_LEN

Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-25 21:08:00 +01:00
Linus Torvalds b47711bfbc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
  selinux: make mls_compute_sid always polyinstantiate
  security/selinux: constify function pointer tables and fields
  security: add a secctx_to_secid() hook
  security: call security_file_permission from rw_verify_area
  security: remove security_sb_post_mountroot hook
  Security: remove security.h include from mm.h
  Security: remove security_file_mmap hook sparse-warnings (NULL as 0).
  Security: add get, set, and cloning of superblock security information
  security/selinux: Add missing "space"
2008-01-25 08:44:29 -08:00
Linus Torvalds eba0e319c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (125 commits)
  [CRYPTO] twofish: Merge common glue code
  [CRYPTO] hifn_795x: Fixup container_of() usage
  [CRYPTO] cast6: inline bloat--
  [CRYPTO] api: Set default CRYPTO_MINALIGN to unsigned long long
  [CRYPTO] tcrypt: Make xcbc available as a standalone test
  [CRYPTO] xcbc: Remove bogus hash/cipher test
  [CRYPTO] xcbc: Fix algorithm leak when block size check fails
  [CRYPTO] tcrypt: Zero axbuf in the right function
  [CRYPTO] padlock: Only reset the key once for each CBC and ECB operation
  [CRYPTO] api: Include sched.h for cond_resched in scatterwalk.h
  [CRYPTO] salsa20-asm: Remove unnecessary dependency on CRYPTO_SALSA20
  [CRYPTO] tcrypt: Add select of AEAD
  [CRYPTO] salsa20: Add x86-64 assembly version
  [CRYPTO] salsa20_i586: Salsa20 stream cipher algorithm (i586 version)
  [CRYPTO] gcm: Introduce rfc4106
  [CRYPTO] api: Show async type
  [CRYPTO] chainiv: Avoid lock spinning where possible
  [CRYPTO] seqiv: Add select AEAD in Kconfig
  [CRYPTO] scatterwalk: Handle zero nbytes in scatterwalk_map_and_copy
  [CRYPTO] null: Allow setkey on digest_null 
  ...
2008-01-25 08:38:25 -08:00
Greg Kroah-Hartman 79a6ee42fd Kobject: fix coding style issues in kobject.h
Finally clean up the odd spaces and other mess in kobject.h

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 21:27:06 -08:00
Greg Kroah-Hartman d462943afe Driver core: fix coding style issues in device.h
Finally clean up the odd spaces and other mess in device.h

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 21:04:46 -08:00
Dave Young fd04897bb2 Driver Core: add class iteration api
Add the following class iteration functions for driver use:
	class_for_each_device
	class_find_device
	class_for_each_child
	class_find_child

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:44 -08:00
Stephen Rothwell ae72cddb23 Driver Core: constify the name passed to platform_device_register_simple
This name is just passed to platform_device_alloc which has its parameter
declared const.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:43 -08:00
Kay Sievers af5ca3f4ec Driver core: change sysdev classes to use dynamic kobject names
All kobjects require a dynamically allocated name now. We no longer
need to keep track if the name is statically assigned, we can just
unconditionally free() all kobject names on cleanup.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Greg Kroah-Hartman 528a4bf1d5 Kobject: remove kobject_unregister() as no one uses it anymore
There are no in-kernel users of kobject_unregister() so it should be
removed.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:40 -08:00
Kay Sievers 0f4dafc056 Kobject: auto-cleanup on final unref
We save the current state in the object itself, so we can do proper
cleanup when the last reference is dropped.

If the initial reference is dropped, the object will be removed from
sysfs if needed, if an "add" event was sent, "remove" will be send, and
the allocated resources are released.

This allows us to clean up some driver core usage as well as allowing us
to do other such changes to the rest of the kernel.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman 12e339ac6e Kset: remove kset_add function
No one is calling this anymore, so just remove it and hard-code the one
internal-use of it.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman 6d06adfaf8 Kobject: remove kobject_register()
The function is no longer used by anyone in the kernel, and it prevents
the proper sending of the kobject uevent after the needed files are set
up by the caller.  kobject_init_and_add() can be used in its place.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:39 -08:00
Greg Kroah-Hartman f9cb074bff Kobject: rename kobject_init_ng() to kobject_init()
Now that the old kobject_init() function is gone, rename
kobject_init_ng() to kobject_init() to clean up the namespace.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman e1543ddf73 Kobject: remove kobject_init() as no one uses it anymore
The old kobject_init() function is on longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman b2d6db5878 Kobject: rename kobject_add_ng() to kobject_add()
Now that the old kobject_add() function is gone, rename kobject_add_ng()
to kobject_add() to clean up the namespace.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:38 -08:00
Greg Kroah-Hartman 9e7bbccd02 Kobject: remove kobject_add() as no one uses it anymore
The old kobject_add() function is on longer in use, so let us remove it
from the public scope (kset mess in the kobject.c file still uses it,
but that can be cleaned up later very simply.)

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:38 -08:00
Kay Sievers edfaa7c365 Driver core: convert block from raw kobjects to core devices
This moves the block devices to /sys/class/block. It will create a
flat list of all block devices, with the disks and partitions in one
directory. For compatibility /sys/block is created and contains symlinks
to the disks.

  /sys/class/block
  |-- sda -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  |-- sda1 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1
  |-- sda10 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda10
  |-- sda5 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda5
  |-- sda6 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda6
  |-- sda7 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda7
  |-- sda8 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda8
  |-- sda9 -> ../../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda9
  `-- sr0 -> ../../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

  /sys/block/
  |-- sda -> ../devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda
  `-- sr0 -> ../devices/pci0000:00/0000:00:1f.2/host1/target1:0:0/1:0:0:0/block/sr0

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:36 -08:00
Greg Kroah-Hartman e5dd127846 Driver core: move the static kobject out of struct driver
This patch removes the kobject, and a few other driver-core-only fields
out of struct driver and into the driver core only.  Now drivers can be
safely create on the stack or statically (like they currently are.)

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman c63469a398 Driver core: move the driver specific module code into the driver core
The module driver specific code should belong in the driver core, not in
the kernel/ directory.  So move this code.  This is done in preparation
for some struct device_driver rework that should be confined to the
driver core code only.

This also lets us keep from exporting these functions, as no external
code should ever be calling it.

Thanks to Andrew Morton for the !CONFIG_MODULES fix.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:35 -08:00
Greg Kroah-Hartman cbe9c595f1 Driver: add driver_add_kobj for looney iseries_veth driver
The iseries driver wants to hang kobjects off of its driver, so, to
preserve backwards compatibility, we need to add a call to the driver
core to allow future changes to work properly.

Hopefully no one uses this function in the future and the iseries_veth
driver authors come to their senses so I can remove this hack...

Cc: Dave Larson <larson1@us.ibm.com>
Cc: Santiago Leon <santil@us.ibm.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:34 -08:00
Cornelia Huck 57c745340a driver core: Introduce default attribute groups.
This is lot like default attributes for devices (and indeed,
a lot of the code is lifted from there).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:34 -08:00
Greg Kroah-Hartman c6f7e72a3f driver core: remove fields from struct bus_type
struct bus_type is static everywhere in the kernel.  This moves the
kobject in the structure out of it, and a bunch of other private only to
the driver core fields are now moved to a private structure.  This lets
us dynamically create the backing kobject properly and gives us the
chance to be able to document to users exactly how to use the struct
bus_type as there are no fields they can improperly access.

Thanks to Kay for the build fixes on this patch.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman b249072ee6 driver core: add way to get to bus device klist
This allows an easier way to get to the device klist associated with a
struct bus_type (you have three to choose from...)  This will make it
easier to move these fields to be dynamic in a future patch.

The only user of this is the PCI core which horribly abuses this
interface to rearrange the order of the pci devices.  This should be
done using the existing bus device walking functions, but that's left
for future patches.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman 0fed80f7a6 driver core: add way to get to bus kset
This allows an easier way to get to the kset associated with a struct
bus_type (you have three to choose from...)  This will make it easier to
move these fields to be dynamic in a future patch.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:33 -08:00
Greg Kroah-Hartman cc972e896b driver core: remove owner field from struct bus_type
This isn't used by anything in the driver core, and by no one in the 204
different usages of it in the kernel tree.  Remove this field so no one
gets any idea that it is needed to be used.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:31 -08:00
Greg Kroah-Hartman 81e7c6a636 UIO: fix kobject usage
The uio kobject code is "wierd".  This patch should hopefully fix it up
to be sane and not leak memory anymore.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:26 -08:00
Greg Kroah-Hartman d76e15fb20 driver core: make /sys/power a kobject
/sys/power should not be a kset, that's overkill.  This patch renames it
to power_kset and fixes up all usages of it in the tree.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:25 -08:00
Greg Kroah-Hartman 2fb9113b97 kobject: remove subsystem_(un)register functions
These functions are no longer used and are the last remants of the old
subsystem crap.  So delete them for good.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman 0ff21e4663 kobject: convert kernel_kset to be a kobject
kernel_kset does not need to be a kset, but a much simpler kobject now
that we have kobj_attributes.

We also rename kernel_kset to kernel_kobj to catch all users of this
symbol with a build error instead of an easy-to-ignore build warning.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman 5c03c7ab88 kset: remove decl_subsys macro
This macro is no longer used.  ksets should be created dynamically with
a call to kset_create_and_add() not declared statically.

Yes, there are 5 remaining static struct kset usages in the kernel tree,
but they will be fixed up soon.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:24 -08:00
Greg Kroah-Hartman f62ed9e33b firmware: change firmware_kset to firmware_kobj
There is no firmware "subsystem" it's just a directory in /sys that
other portions of the kernel want to hook into.  So make it a kobject
not a kset to help alivate anyone who tries to do some odd kset-like
things with this.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:23 -08:00
Greg Kroah-Hartman 15f2f9b3a9 firmware: remove firmware_(un)register()
These functions are no longer called or needed, so we can remove them.

As I rewrote the whole firmware.c file, add my copyright.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:23 -08:00
Kay Sievers 000f2a4d8c Driver Core: kill subsys_attribute and default sysfs ops
Remove the no longer needed subsys_attributes, they are all converted to
the more sensical kobj_attributes.

There is no longer a magic fallback in sysfs attribute operations, all
kobjects which create simple attributes need explicitely a ktype
assigned, which tells the core what was intended here.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:22 -08:00
Greg Kroah-Hartman 9e5f7f9abe firmware: export firmware_kset so that people can use that instead of the braindead firmware_register interface
Needed for future firmware subsystem cleanups.

In the end, the firmware_register/unregister functions will be deleted
entirely, but we need this symbol so that subsystems can migrate over.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Matt Domsch <Matt_Domsch@dell.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:19 -08:00
Kay Sievers eb41d9465c fix struct user_info export's sysfs interaction
Clean up the use of ksets and kobjects. Kobjects are instances of
objects (like struct user_info), ksets are collections of objects of a
similar type (like the uids directory containing the user_info directories).
So, use kobjects for the user_info directories, and a kset for the "uids"
directory.

On object cleanup, the final kobject_put() was missing.

Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:18 -08:00
Kay Sievers 23b5212cc7 Driver Core: add kobj_attribute handling
Add kobj_sysfs_ops to replace subsys_sysfs_ops. There is no
need for special kset operations, we want to be able to use
simple attribute operations at any kobject, not only ksets.

The whole concept of any default sysfs attribute operations
will go away with the upcoming removal of subsys_sysfs_ops.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:18 -08:00
Greg Kroah-Hartman 6dcec2511f kset: convert struct bus_device->drivers to use kset_create
Dynamically create the kset instead of declaring it statically.

Having 3 static kobjects in one structure is not only foolish, but ripe
for nasty race conditions if handled improperly.  We also rename the
field to catch any potential users of it (not that there should be
outside of the driver core...)

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman 3d8995963d kset: convert struct bus_device->devices to use kset_create
Dynamically create the kset instead of declaring it statically.

Having 3 static kobjects in one structure is not only foolish, but ripe
for nasty race conditions if handled improperly.  We also rename the
field to catch any potential users of it (not that there should be
outside of the driver core...)

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman 039a5dcd2f kset: convert /sys/power to use kset_create
Dynamically create the kset instead of declaring it statically.  We also
rename power_subsys to power_kset to catch all users of the variable and
we properly export it so that people don't have to guess that it really
is present in the system.

The pseries code is wierd, why is it createing /sys/power if CONFIG_PM
is disabled?  Oh well, stupid big boxes ignoring config options...

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman 7405c1e15e kset: convert /sys/module to use kset_create
Dynamically create the kset instead of declaring it statically.  We also
rename module_subsys to module_kset to catch all users of the variable.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:16 -08:00
Greg Kroah-Hartman 2d72fc00a1 kobject: convert /sys/hypervisor to use kobject_create
We don't need a kset here, a simple kobject will do just fine, so
dynamically create the kobject and use it.

We also rename hypervisor_subsys to hypervisor_kset to catch all users
of the variable.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:15 -08:00
Greg Kroah-Hartman bd35b93d80 kset: convert kernel_subsys to use kset_create
Dynamically create the kset instead of declaring it statically.  We also
rename kernel_subsys to kernel_kset to catch all users of this symbol
with a build error instead of an easy-to-ignore build warning.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman e5e38a86c0 kset: remove decl_subsys_name
The last user of this macro (pci hotplug core) is now switched over to
using a dynamic kset, so this macro is no longer needed at all.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman 81ace5cd8f kset: convert pci hotplug to use kset_create_and_add
This also renames pci_hotplug_slots_subsys to pcis_hotplug_slots_kset
catch all current users with a build error instead of a build warning
which can easily be missed.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:14 -08:00
Greg Kroah-Hartman 00d2666623 kobject: convert main fs kobject to use kobject_create
This also renames fs_subsys to fs_kobj to catch all current users with a
build error instead of a build warning which can easily be missed.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:13 -08:00
Greg Kroah-Hartman 43968d2f16 kobject: get rid of kobject_kset_add_dir
kobject_kset_add_dir is only called in one place so remove it and use
kobject_create() instead.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman 4ff6abff83 kobject: get rid of kobject_add_dir
kobject_create_and_add is the same as kobject_add_dir, so drop
kobject_add_dir.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:11 -08:00
Greg Kroah-Hartman 3f9e3ee9dc kobject: add kobject_create_and_add function
This lets users create dynamic kobjects much easier.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman b727c70289 kset: add kset_create_and_add function
Now ksets can be dynamically created on the fly, no static definitions
are required.  Thanks to Miklos for hints on how to make this work
better for the callers.

And thanks to Kay for finding some stupid bugs in my original version
and pointing out that we need to handle the fact that kobject's can have
a kset as a parent and to handle that properly in kobject_add().

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman 12d03da7c1 kobject: remove kobj_set_kset_s as no one is using it anymore
What a confusing name for a macro...

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman 3514faca19 kobject: remove struct kobj_type from struct kset
We don't need a "default" ktype for a kset.  We should set this
explicitly every time for each kset.  This change is needed so that we
can make ksets dynamic, and cleans up one of the odd, undocumented
assumption that the kset/kobject/ktype model has.

This patch is based on a lot of help from Kay Sievers.

Nasty bug in the block code was found by Dave Young
<hidave.darkstar@gmail.com>

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman c11c4154e7 kobject: add kobject_init_and_add function
Also add a kobject_init_and_add function which bundles up what a lot of
the current callers want to do all at once, and it properly handles the
memory usages, unlike kobject_register();

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:10 -08:00
Greg Kroah-Hartman 244f6cee9a kobject: add kobject_add_ng function
This is what the kobject_add function is going to become.

Add this to the kernel and then we can convert the tree over to use it.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman e86000d042 kobject: add kobject_init_ng function
This is what the kobject_init function is going to become.

Add this to the kernel and then we can convert the tree over to use it.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:09 -08:00
Greg Kroah-Hartman 18041f4775 kobject: make kobject_cleanup be static
No one except the kobject core calls it so make the function static.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:09 -08:00
Emil Medve 7b8712e563 driver core: Make the dev_*() family of macros in device.h complete
Removed duplicates defined elsewhere

Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:08 -08:00
Tony Jones 7dd817d083 tifm: Convert from class_device to device for TI flash media
Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:06 -08:00
Tony Jones 6013c12be8 pktcdvd: Convert from class_device to device for block/pktcdvd
struct class_device is going away, this converts the code to use struct
device instead.

Signed-off-by: Tony Jones <tonyj@suse.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:06 -08:00
Tony Jones 891f78ea83 DMA: Convert from class_device to device for DMA engine
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:05 -08:00
Evgeniy Polyakov 41ca28ab2a kref: add kref_set()
This adds kref_set() to the kref api for future use by people who really
know what they are doing with krefs...

From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:05 -08:00
Rafael J. Wysocki 775b64d2b6 PM: Acquire device locks on suspend
This patch reorganizes the way suspend and resume notifications are
sent to drivers.  The major changes are that now the PM core acquires
every device semaphore before calling the methods, and calls to
device_add() during suspends will fail, while calls to device_del()
during suspends will block.

It also provides a way to safely remove a suspended device with the
help of the PM core, by using the device_pm_schedule_removal() callback
introduced specifically for this purpose, and updates two drivers (msr
and cpuid) that need to use it.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-01-24 20:40:04 -08:00
Jan Engelhardt 1996a10948 security/selinux: constify function pointer tables and fields
Constify function pointer tables and fields.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-25 11:29:54 +11:00
David Howells 63cb344923 security: add a secctx_to_secid() hook
Add a secctx_to_secid() LSM hook to go along with the existing
secid_to_secctx() LSM hook.  This patch also includes the SELinux
implementation for this hook.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-25 11:29:53 +11:00
H. Peter Anvin bced95283e security: remove security_sb_post_mountroot hook
The security_sb_post_mountroot() hook is long-since obsolete, and is
fundamentally broken: it is never invoked if someone uses initramfs.
This is particularly damaging, because the existence of this hook has
been used as motivation for not using initramfs.

Stephen Smalley confirmed on 2007-07-19 that this hook was originally
used by SELinux but can now be safely removed:

     http://marc.info/?l=linux-kernel&m=118485683612916&w=2

Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-25 11:29:50 +11:00
James Morris 42d7896ebc Security: remove security.h include from mm.h
Remove security.h include from mm.h, as it is only needed for a single
extern declaration, and pulls in all kinds of crud.

Fine-by-me: David Chinner <dgc@sgi.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-25 11:29:49 +11:00
Eric Paris c9180a57a9 Security: add get, set, and cloning of superblock security information
Adds security_get_sb_mnt_opts, security_set_sb_mnt_opts, and
security_clont_sb_mnt_opts to the LSM and to SELinux.  This will allow
filesystems to directly own and control all of their mount options if they
so choose.  This interface deals only with option identifiers and strings so
it should generic enough for any LSM which may come in the future.

Filesystems which pass text mount data around in the kernel (almost all of
them) need not currently make use of this interface when dealing with
SELinux since it will still parse those strings as it always has.  I assume
future LSM's would do the same.  NFS is the primary FS which does not use
text mount data and thus must make use of this interface.

An LSM would need to implement these functions only if they had mount time
options, such as selinux has context= or fscontext=.  If the LSM has no
mount time options they could simply not implement and let the dummy ops
take care of things.

An LSM other than SELinux would need to define new option numbers in
security.h and any FS which decides to own there own security options would
need to be patched to use this new interface for every possible LSM.  This
is because it was stated to me very clearly that LSM's should not attempt to
understand FS mount data and the burdon to understand security should be in
the FS which owns the options.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
2008-01-25 11:29:46 +11:00
Len Brown d4b7dc499d ACPI: make _OSI(Linux) console messages smarter
If BIOS invokes _OSI(Linux), the kernel response
depends on what the ACPI DMI list knows about the system,
and that is reflectd in dmesg:

1) System unknown to DMI:

ACPI: BIOS _OSI(Linux) query ignored
ACPI: DMI System Vendor: LENOVO
ACPI: DMI Product Name: 7661W1P
ACPI: DMI Product Version: ThinkPad T61
ACPI: DMI Board Name: 7661W1P
ACPI: DMI BIOS Vendor: LENOVO
ACPI: DMI BIOS Date: 10/18/2007
ACPI: Please send DMI info above to linux-acpi@vger.kernel.org
ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org

2) System known to DMI, but effect of OSI(Linux) unknown:

ACPI: DMI detected: Lenovo ThinkPad T61
...
ACPI: BIOS _OSI(Linux) query ignored via DMI
ACPI: If "acpi_osi=Linux" works better, please notify linux-acpi@vger.kernel.org

3) System known to DMI, which disables _OSI(Linux):

ACPI: DMI detected: Lenovo ThinkPad T61
...
ACPI: BIOS _OSI(Linux) query ignored via DMI

4) System known to DMI, which enable _OSI(Linux):

ACPI: DMI detected: Lenovo ThinkPad T61
ACPI: Added _OSI(Linux)
...
ACPI: BIOS _OSI(Linux) query honored via DMI

cmdline overrides take precidence over the built-in
default and the DMI prescribed default.
cmdline "acpi_osi=Linux" results in:

ACPI: BIOS _OSI(Linux) query honored via cmdline

Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-23 21:26:15 -05:00
Len Brown f89e3b0620 DMI: create dmi_get_slot()
This simply allows other sub-systems (such as ACPI)
to access and print out slots in static dmi_ident[].

Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-23 21:23:13 -05:00
Len Brown 81b4e1f626 DMI: move dmi_available declaration to linux/dmi.h
Signed-off-by: Len Brown <len.brown@intel.com>
2008-01-23 21:22:21 -05:00
James Bottomley d4acd722b7 [SCSI] sysfs: add filter function to groups
This patch allows the various users of attribute_groups to selectively
allow the appearance of group attributes.  The primary consumer of
this will be the transport classes in which we currently have
elaborate attribute selection algorithms to do this same thing.

Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23 11:29:18 -06:00
James Bottomley fd1109711d [SCSI] attribute_container: update to use the group interface
This patch is the beginning of moving the attribute_containers to use
attribute groups exclusively.  The attr element is now deprecated and
will eventually be removed (along with all the hand rolled code for
doing exactly what attribute groups do) when all the consumers are
converted to attribute groups.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-01-23 11:29:17 -06:00
Alan Cox 5e8f757cb2 ata_generic: Cenatek support
Not much to do here. It's an ata memory as disk.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:17 -05:00
Tejun Heo 4e6b79fa61 libata: factor out ata_pci_activate_sff_host() from ata_pci_one()
Factor out ata_pci_activate_sff_host() from ata_pci_one().  This does
about the same thing as ata_host_activate() but needs to be separate
because SFF controllers use different and multiple IRQs in legacy
mode.

This will be used to make SFF LLD initialization more flexible.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:16 -05:00
Al Viro 4ca4e43964 libata annotations and fixes
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:15 -05:00
Jeff Garzik 442eacc362 libata: make ata_port_queue_task() an internal function
ata_port_queue_task() served a single user:  ata_pio_task()

Rename to ata_pio_queue_task() and un-export it, as nobody outside of
libata-core.c uses it.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-23 05:24:15 -05:00
Tejun Heo 0bcc65ad78 libata: make qc->nbytes include extra buffers
qc->nbytes didn't use to include extra buffers setup by libata core
layer and my be odd.  This patch makes qc->nbytes include any extra
buffers setup by libata core layer and guaranteed to be aligned on 4
byte boundary.

This value is to be used to program the host controller.  As this
represents the actual length of buffer available to the controller and
the controller must be able to deal with short transfers for ATAPI
commands which can transfer variable length, this shouldn't break any
controllers while making problems like rounding-down and controllers
choking up on odd transfer bytes much less likely.

The unmodified value is stored in new field qc->raw_nbytes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:15 -05:00
Tejun Heo ff2aeb1eb6 libata: convert to chained sg
libata used private sg iterator to handle padding sg.  Now that sg can
be chained, padding can be handled using standard sg ops.  Convert to
chained sg.

* s/qc->__sg/qc->sg/

* s/qc->pad_sgent/qc->extra_sg[]/.  Because chaining consumes one sg
  entry.  There need to be two extra sg entries.  The renaming is also
  for future addition of other extra sg entries.

* Padding setup is moved into ata_sg_setup_extra() which is organized
  in a way that future addition of other extra sg entries is easy.

* qc->orig_n_elem is unused and removed.

* qc->n_elem now contains the number of sg entries that LLDs should
  map.  qc->mapped_n_elem is added to carry the original number of
  mapped sgs for unmapping.

* The last sg of the original sg list is used to chain to extra sg
  list.  The original last sg is pointed to by qc->last_sg and the
  content is stored in qc->saved_last_sg.  It's restored during
  ata_sg_clean().

* All sg walking code has been updated.  Unnecessary assertions and
  checks for conditions the core layer already guarantees are removed.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:14 -05:00
Tejun Heo 001102d785 libata: kill non-sg DMA interface
With atapi_request_sense() converted to use sg, there's no user of
non-sg interface.  Kill non-sg interface.

* ATA_QCFLAG_SINGLE and ATA_QCFLAG_SG are removed.  ATA_QCFLAG_DMAMAP
  is used instead.  (this way no LLD change is necessary)

* qc->buf_virt is removed.

* ata_sg_init_one() and ata_sg_setup_one() are removed.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Rusty Russel <rusty@rustcorp.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:14 -05:00
Tejun Heo 55dba3120f libata: update ->data_xfer hook for ATAPI
Depending on how many bytes are transferred as a unit, PIO data
transfer may consume more bytes than requested.  Knowing how much
data is consumed is necessary to determine how much is left for
draining.  This patch update ->data_xfer such that it returns the
number of consumed bytes.

While at it, it also makes the following changes.

* s/adev/dev/
* use READ/WRITE constants for rw indication
* misc clean ups

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:14 -05:00
Tejun Heo ceb0c64262 libata: add ATAPI_* cmd types and implement atapi_cmd_type()
Add ATAPI command types - ATAPI_READ, WRITE, RW_BUF, READ_CD and MISC,
and implement atapi_cmd_type() which takes SCSI opcode and returns to
which class the opcode belongs.  This will be used later to improve
ATAPI handling.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:14 -05:00
Tejun Heo 0dc36888d4 libata: rename ATA_PROT_ATAPI_* to ATAPI_PROT_*
ATA_PROT_ATAPI_* are ugly and naming schemes between ATA_PROT_* and
ATA_PROT_ATAPI_* are inconsistent causing confusion.  Rename them to
ATAPI_PROT_* and make them consistent with ATA counterpart.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-23 05:24:14 -05:00
Tejun Heo 537b53c169 cdrom: add more GPCMD_* constants
Add GPCMD_* constants for READ_BUFFER, WRITE_12 and WRITE_BUFFER for
completeness.  These will be used libata.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:14 -05:00
Tejun Heo 021ee9a6da libata: reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask()
Reimplement ata_acpi_cbl_80wire() using ata_acpi_gtm_xfermask() and
while at it relocate the function below ata_acpi_gtm_xfermask().

New ata_acpi_cbl_80wire() implementation takes @gtm, in both pata_via
and pata_amd, use the initial GTM value.  Both are trying to peek
initial BIOS configuration, so using initial caching value makes
sense.  This fixes ACPI part of cable detection in pata_amd which
previously always returned 0 because configuring PIO0 during reset
clears DMA configuration.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-23 05:24:12 -05:00