mirror of https://gitee.com/openkylin/linux.git
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:

 - Add 'cross-release' support to lockdep, which allows APIs like
   completions, where it's not the 'owner' who releases the lock, to be
   tracked. It's all activated automatically under
   CONFIG_PROVE_LOCKING=y.

 - Clean up (restructure) the x86 atomics op implementation to be more
   readable, in preparation of KASAN annotations. (Dmitry Vyukov)

 - Fix static keys (Paolo Bonzini)

 - Add killable versions of down_read() et al (Kirill Tkhai)

 - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini)

 - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra)

 - Remove smp_mb__before_spinlock() and convert its usages, introduce
   smp_mb__after_spinlock() (Peter Zijlstra)

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits)
  locking/lockdep/selftests: Fix mixed read-write ABBA tests
  sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK()
  acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse
  locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures
  smp: Avoid using two cache lines for struct call_single_data
  locking/lockdep: Untangle xhlock history save/restore from task independence
  locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being
  futex: Remove duplicated code and fix undefined behaviour
  Documentation/locking/atomic: Finish the document...
  locking/lockdep: Fix workqueue crossrelease annotation
  workqueue/lockdep: 'Fix' flush_work() annotation
  locking/lockdep/selftests: Add mixed read-write ABBA tests
  mm, locking/barriers: Clarify tlb_flush_pending() barriers
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive
  locking/lockdep: Explicitly initialize wq_barrier::done::map
  locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS
  locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config
  locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING
  locking/refcounts, x86/asm: Implement fast refcount overflow protection
  locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease
  ...
This commit is contained in:
commit 5f82e71a00

@ -0,0 +1,66 @@
On atomic bitops.


While our bitmap_{}() functions are non-atomic, we have a number of operations
operating on single bits in a bitmap that are atomic.


API
---

The single bit operations are:

Non-RMW ops:

  test_bit()

RMW atomic operations without return value:

  {set,clear,change}_bit()
  clear_bit_unlock()

RMW atomic operations with return value:

  test_and_{set,clear,change}_bit()
  test_and_set_bit_lock()

Barriers:

  smp_mb__{before,after}_atomic()


All RMW atomic operations have a '__' prefixed variant which is non-atomic.


SEMANTICS
---------

Non-atomic ops:

In particular __clear_bit_unlock() suffers the same issue as atomic_set(),
which is why the generic version maps to clear_bit_unlock(), see atomic_t.txt.


RMW ops:

The test_and_{}_bit() operations return the original value of the bit.


ORDERING
--------

Like with atomic_t, the rule of thumb is:

 - non-RMW operations are unordered;

 - RMW operations that have no return value are unordered;

 - RMW operations that have a return value are fully ordered.

Except for test_and_set_bit_lock() which has ACQUIRE semantics and
clear_bit_unlock() which has RELEASE semantics.

Since a platform only has a single means of achieving atomic operations
the same barriers as for atomic_t are used, see atomic_t.txt.
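
As a quick illustration of the lock-style bitops above, here is a minimal
sketch (not taken from the kernel sources; the structure and function names
are made up) of a per-object "busy" bit used as a tiny lock:

  #include <linux/bitops.h>
  #include <linux/sched.h>

  #define MY_BUSY_BIT	0	/* hypothetical bit number */

  struct my_obj {
  	unsigned long flags;	/* bitmap the bitops operate on */
  };

  static void my_obj_lock(struct my_obj *obj)
  {
  	/* ACQUIRE semantics: later accesses stay after the lock. */
  	while (test_and_set_bit_lock(MY_BUSY_BIT, &obj->flags))
  		cpu_relax();	/* bit already set, spin */
  }

  static void my_obj_unlock(struct my_obj *obj)
  {
  	/* RELEASE semantics: earlier accesses stay before the unlock. */
  	clear_bit_unlock(MY_BUSY_BIT, &obj->flags);
  }

Because test_and_set_bit_lock() is only an ACQUIRE and clear_bit_unlock()
only a RELEASE, code that needs a full barrier around one of the unordered
bitops (set_bit(), clear_bit(), ...) still has to add
smp_mb__{before,after}_atomic() explicitly.
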
@ -0,0 +1,242 @@
On atomic types (atomic_t, atomic64_t and atomic_long_t).

The atomic type provides an interface to the architecture's means of atomic
RMW operations between CPUs (atomic operations on MMIO are not supported and
can lead to fatal traps on some platforms).

API
---

The 'full' API consists of (atomic64_ and atomic_long_ prefixes omitted for
brevity):

Non-RMW ops:

  atomic_read(), atomic_set()
  atomic_read_acquire(), atomic_set_release()


RMW atomic operations:

Arithmetic:

  atomic_{add,sub,inc,dec}()
  atomic_{add,sub,inc,dec}_return{,_relaxed,_acquire,_release}()
  atomic_fetch_{add,sub,inc,dec}{,_relaxed,_acquire,_release}()


Bitwise:

  atomic_{and,or,xor,andnot}()
  atomic_fetch_{and,or,xor,andnot}{,_relaxed,_acquire,_release}()


Swap:

  atomic_xchg{,_relaxed,_acquire,_release}()
  atomic_cmpxchg{,_relaxed,_acquire,_release}()
  atomic_try_cmpxchg{,_relaxed,_acquire,_release}()


Reference count (but please see refcount_t):

  atomic_add_unless(), atomic_inc_not_zero()
  atomic_sub_and_test(), atomic_dec_and_test()


Misc:

  atomic_inc_and_test(), atomic_add_negative()
  atomic_dec_unless_positive(), atomic_inc_unless_negative()


Barriers:

  smp_mb__{before,after}_atomic()


SEMANTICS
---------

Non-RMW ops:

The non-RMW ops are (typically) regular LOADs and STOREs and are canonically
implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and
smp_store_release() respectively.

The one detail to this is that atomic_set{}() should be observable to the RMW
ops. That is:

  C atomic-set

  {
    atomic_set(v, 1);
  }

  P1(atomic_t *v)
  {
    atomic_add_unless(v, 1, 0);
  }

  P2(atomic_t *v)
  {
    atomic_set(v, 0);
  }

  exists
  (v=2)

In this case we would expect the atomic_set() from P2 to either happen
before the atomic_add_unless(), in which case that latter one would no-op, or
_after_ in which case we'd overwrite its result. In no case is "2" a valid
outcome.

This is typically true on 'normal' platforms, where a regular competing STORE
will invalidate a LL/SC or fail a CMPXCHG.

The obvious case where this is not so is when we need to implement atomic ops
with a lock:

  CPU0						CPU1

  atomic_add_unless(v, 1, 0);
    lock();
    ret = READ_ONCE(v->counter); // == 1
						atomic_set(v, 0);
    if (ret != u)				  WRITE_ONCE(v->counter, 0);
      WRITE_ONCE(v->counter, ret + 1);
    unlock();

the typical solution is to then implement atomic_set{}() with atomic_xchg().


RMW ops:

These come in various forms:

 - plain operations without return value: atomic_{}()

 - operations which return the modified value: atomic_{}_return()

   these are limited to the arithmetic operations because those are
   reversible. Bitops are irreversible and therefore the modified value
   is of dubious utility.

 - operations which return the original value: atomic_fetch_{}()

 - swap operations: xchg(), cmpxchg() and try_cmpxchg()

 - misc; the special purpose operations that are commonly used and would,
   given the interface, normally be implemented using (try_)cmpxchg loops but
   are time critical and can, (typically) on LL/SC architectures, be more
   efficiently implemented.

All these operations are SMP atomic; that is, the operations (for a single
atomic variable) can be fully ordered and no intermediate state is lost or
visible.
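
To make the (try_)cmpxchg loop shape concrete, here is a sketch (not the
kernel's actual implementation; the function name is made up) of an "add
unless the counter equals u" operation built from atomic_try_cmpxchg():

  #include <linux/atomic.h>

  /* Add @a to @v, unless @v == @u; return true if the add happened. */
  static inline bool my_add_unless(atomic_t *v, int a, int u)
  {
  	int old = atomic_read(v);

  	do {
  		if (old == u)		/* forbidden value, bail out */
  			return false;
  		/*
  		 * atomic_try_cmpxchg() updates @old with the current value
  		 * on failure, so the loop does not re-read @v by hand.
  		 */
  	} while (!atomic_try_cmpxchg(v, &old, old + a));

  	return true;
  }

A plain atomic_cmpxchg() loop would look much the same except that the caller
compares the returned old value itself; try_cmpxchg() merely folds that
comparison and the re-read into the primitive, which some architectures can
implement a little more efficiently.
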


ORDERING  (go read memory-barriers.txt first)
--------

The rule of thumb:

 - non-RMW operations are unordered;

 - RMW operations that have no return value are unordered;

 - RMW operations that have a return value are fully ordered;

 - RMW operations that are conditional are unordered on FAILURE,
   otherwise the above rules apply.

Except of course when an operation has an explicit ordering like:

 {}_relaxed: unordered
 {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
 {}_release: the W of the RMW (or atomic_set) is a RELEASE

Where 'unordered' is against other memory locations. Address dependencies are
not defeated.

Fully ordered primitives are ordered against everything prior and everything
subsequent. Therefore a fully ordered primitive is like having an smp_mb()
before and an smp_mb() after the primitive.
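
As a small illustration of the {}_acquire/{}_release rules, here is a sketch
(not from the kernel sources; the names are made up) of one CPU publishing
data and another consuming it:

  #include <linux/atomic.h>

  static int payload;				/* ordinary data */
  static atomic_t ready = ATOMIC_INIT(0);	/* publication flag */

  /* producer */
  static void publish(int val)
  {
  	payload = val;
  	/* RELEASE: the payload store cannot pass this store. */
  	atomic_set_release(&ready, 1);
  }

  /* consumer */
  static bool consume(int *val)
  {
  	/* ACQUIRE: later loads cannot pass this load. */
  	if (!atomic_read_acquire(&ready))
  		return false;
  	*val = payload;		/* guaranteed to see the published value */
  	return true;
  }

A fully ordered RMW in either place would also be sufficient, since full
ordering is strictly stronger than both ACQUIRE and RELEASE, but it is
typically more expensive than the minimal variant.
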


The barriers:

  smp_mb__{before,after}_atomic()

only apply to the RMW ops and can be used to augment/upgrade the ordering
inherent to the used atomic op. These barriers provide a full smp_mb().

These helper barriers exist because architectures have varying implicit
ordering on their SMP atomic primitives. For example our TSO architectures
provide fully ordered atomics and these barriers are no-ops.

Thus:

  atomic_fetch_add();

is equivalent to:

  smp_mb__before_atomic();
  atomic_fetch_add_relaxed();
  smp_mb__after_atomic();

However the atomic_fetch_add() might be implemented more efficiently.

Further, while something like:

  smp_mb__before_atomic();
  atomic_dec(&X);

is a 'typical' RELEASE pattern, the barrier is strictly stronger than
a RELEASE. Similarly, something like:

  atomic_inc(&X);
  smp_mb__after_atomic();

is an ACQUIRE pattern (though very much not typical), but again the barrier is
strictly stronger than ACQUIRE. As illustrated:

  C strong-acquire

  {
  }

  P1(int *x, atomic_t *y)
  {
    r0 = READ_ONCE(*x);
    smp_rmb();
    r1 = atomic_read(y);
  }

  P2(int *x, atomic_t *y)
  {
    atomic_inc(y);
    smp_mb__after_atomic();
    WRITE_ONCE(*x, 1);
  }

  exists
  (r0=1 /\ r1=0)

This should not happen; but a hypothetical atomic_inc_acquire() --
(void)atomic_fetch_inc_acquire() for instance -- would allow the outcome,
since then:

  P1			P2

			t = LL.acq *y (0)
			t++;
			*x = 1;
  r0 = *x (1)
  RMB
  r1 = *y (0)
			SC *y, t;

is allowed.
|
@ -0,0 +1,874 @@
|
|||
Crossrelease
|
||||
============
|
||||
|
||||
Started by Byungchul Park <byungchul.park@lge.com>
|
||||
|
||||
Contents:
|
||||
|
||||
(*) Background
|
||||
|
||||
- What causes deadlock
|
||||
- How lockdep works
|
||||
|
||||
(*) Limitation
|
||||
|
||||
- Limit lockdep
|
||||
- Pros from the limitation
|
||||
- Cons from the limitation
|
||||
- Relax the limitation
|
||||
|
||||
(*) Crossrelease
|
||||
|
||||
- Introduce crossrelease
|
||||
- Introduce commit
|
||||
|
||||
(*) Implementation
|
||||
|
||||
- Data structures
|
||||
- How crossrelease works
|
||||
|
||||
(*) Optimizations
|
||||
|
||||
- Avoid duplication
|
||||
- Lockless for hot paths
|
||||
|
||||
(*) APPENDIX A: What lockdep does to work aggressively
|
||||
|
||||
(*) APPENDIX B: How to avoid adding false dependencies
|
||||
|
||||
|
||||
==========
|
||||
Background
|
||||
==========
|
||||
|
||||
What causes deadlock
|
||||
--------------------
|
||||
|
||||
A deadlock occurs when a context is waiting for an event to happen,
|
||||
which is impossible because another (or the same) context that can trigger the
event is also waiting for another (or the same) event to happen, which is
|
||||
also impossible due to the same reason.
|
||||
|
||||
For example:
|
||||
|
||||
A context going to trigger event C is waiting for event A to happen.
|
||||
A context going to trigger event A is waiting for event B to happen.
|
||||
A context going to trigger event B is waiting for event C to happen.
|
||||
|
||||
A deadlock occurs when these three wait operations run at the same time,
|
||||
because event C cannot be triggered if event A does not happen, which in
|
||||
turn cannot be triggered if event B does not happen, which in turn
|
||||
cannot be triggered if event C does not happen. After all, no event can
|
||||
be triggered, since none of them ever meets its condition to wake up.
|
||||
|
||||
A dependency might exist between two waiters and a deadlock might happen
|
||||
due to an incorrect relationship between dependencies. Thus, we must
|
||||
define what a dependency is first. A dependency exists between them if:
|
||||
|
||||
1. There are two waiters waiting for each event at a given time.
|
||||
2. The only way to wake up each waiter is to trigger its event.
|
||||
3. Whether one can be woken up depends on whether the other can.
|
||||
|
||||
Each wait in the example creates its dependency like:
|
||||
|
||||
Event C depends on event A.
|
||||
Event A depends on event B.
|
||||
Event B depends on event C.
|
||||
|
||||
NOTE: Precisely speaking, a dependency is one between whether a
|
||||
waiter for an event can be woken up and whether another waiter for
|
||||
another event can be woken up. However from now on, we will describe
|
||||
a dependency as if it's one between an event and another event for
|
||||
simplicity.
|
||||
|
||||
And they form circular dependencies like:
|
||||
|
||||
-> C -> A -> B -
|
||||
/ \
|
||||
\ /
|
||||
----------------
|
||||
|
||||
where 'A -> B' means that event A depends on event B.
|
||||
|
||||
Such circular dependencies lead to a deadlock since no waiter can meet
|
||||
its condition to wake up as described.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Circular dependencies cause a deadlock.
|
||||
|
||||
|
||||
How lockdep works
|
||||
-----------------
|
||||
|
||||
Lockdep tries to detect a deadlock by checking dependencies created by
|
||||
lock operations, acquire and release. Waiting for a lock corresponds to
|
||||
waiting for an event, and releasing a lock corresponds to triggering an
|
||||
event in the previous section.
|
||||
|
||||
In short, lockdep does:
|
||||
|
||||
1. Detect a new dependency.
|
||||
2. Add the dependency into a global graph.
|
||||
3. Check if that makes dependencies circular.
|
||||
4. Report a deadlock or its possibility if so.
|
||||
|
||||
For example, consider a graph built by lockdep that looks like:
|
||||
|
||||
A -> B -
|
||||
\
|
||||
-> E
|
||||
/
|
||||
C -> D -
|
||||
|
||||
where A, B,..., E are different lock classes.
|
||||
|
||||
Lockdep will add a dependency into the graph on detection of a new
|
||||
dependency. For example, it will add a dependency 'E -> C' when a new
|
||||
dependency between lock E and lock C is detected. Then the graph will be:
|
||||
|
||||
A -> B -
|
||||
\
|
||||
-> E -
|
||||
/ \
|
||||
-> C -> D - \
|
||||
/ /
|
||||
\ /
|
||||
------------------
|
||||
|
||||
where A, B,..., E are different lock classes.
|
||||
|
||||
This graph contains a subgraph which demonstrates circular dependencies:
|
||||
|
||||
-> E -
|
||||
/ \
|
||||
-> C -> D - \
|
||||
/ /
|
||||
\ /
|
||||
------------------
|
||||
|
||||
where C, D and E are different lock classes.
|
||||
|
||||
This is the condition under which a deadlock might occur. Lockdep
|
||||
reports it on detection after adding a new dependency. This is how
lockdep works.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Lockdep detects a deadlock or its possibility by checking if circular
|
||||
dependencies were created after adding each new dependency.
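
For a concrete feel of the kind of circular dependency described above, here
is a minimal, made-up ABBA example using ordinary mutexes. Running both paths
(even at different times) gives lockdep the dependencies 'A -> B' and
'B -> A', and it reports the possible deadlock when the second one is added:

  #include <linux/mutex.h>

  static DEFINE_MUTEX(lock_a);
  static DEFINE_MUTEX(lock_b);

  /* One path takes A then B, establishing the dependency 'A -> B'. */
  static void path_one(void)
  {
  	mutex_lock(&lock_a);
  	mutex_lock(&lock_b);
  	/* ... */
  	mutex_unlock(&lock_b);
  	mutex_unlock(&lock_a);
  }

  /* Another path takes B then A, adding 'B -> A' and closing the cycle. */
  static void path_two(void)
  {
  	mutex_lock(&lock_b);
  	mutex_lock(&lock_a);
  	/* ... */
  	mutex_unlock(&lock_a);
  	mutex_unlock(&lock_b);
  }
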
|
||||
|
||||
|
||||
==========
|
||||
Limitation
|
||||
==========
|
||||
|
||||
Limit lockdep
|
||||
-------------
|
||||
|
||||
By limiting lockdep to work on only typical locks, e.g. spin locks and
mutexes, which are released within the acquire context, the
implementation becomes simple but its capacity for detection becomes
limited. Let's check the pros and cons in the next sections.
|
||||
|
||||
|
||||
Pros from the limitation
|
||||
------------------------
|
||||
|
||||
Given the limitation, when acquiring a lock, locks in a held_locks
|
||||
cannot be released if the context cannot acquire it so has to wait to
|
||||
acquire it, which means all waiters for the locks in the held_locks are
|
||||
stuck. This is exactly the case that creates dependencies between each lock in
|
||||
the held_locks and the lock to acquire.
|
||||
|
||||
For example:
|
||||
|
||||
CONTEXT X
|
||||
---------
|
||||
acquire A
|
||||
acquire B /* Add a dependency 'A -> B' */
|
||||
release B
|
||||
release A
|
||||
|
||||
where A and B are different lock classes.
|
||||
|
||||
When acquiring lock A, the held_locks of CONTEXT X is empty thus no
|
||||
dependency is added. But when acquiring lock B, lockdep detects and adds
|
||||
a new dependency 'A -> B' between lock A in the held_locks and lock B.
|
||||
They can be simply added whenever acquiring each lock.
|
||||
|
||||
And data required by lockdep exists in a local structure, held_locks
|
||||
embedded in task_struct. By forcing the data to be accessed only within
the owning context, lockdep can avoid races without explicit locks while
handling the local data.
|
||||
|
||||
Lastly, lockdep only needs to keep locks currently being held, to build
|
||||
a dependency graph. However, relaxing the limitation, it needs to keep
|
||||
even locks already released, because a decision whether they created
|
||||
dependencies might be long-deferred.
|
||||
|
||||
To sum up, we can expect several advantages from the limitation:
|
||||
|
||||
1. Lockdep can easily identify a dependency when acquiring a lock.
|
||||
2. Races are avoidable while accessing local locks in a held_locks.
|
||||
3. Lockdep only needs to keep locks currently being held.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Given the limitation, the implementation becomes simple and efficient.
|
||||
|
||||
|
||||
Cons from the limitation
|
||||
------------------------
|
||||
|
||||
Given the limitation, lockdep is applicable only to typical locks. For
|
||||
example, page locks for page access or completions for synchronization
|
||||
cannot work with lockdep.
|
||||
|
||||
Can we detect deadlocks below, under the limitation?
|
||||
|
||||
Example 1:
|
||||
|
||||
CONTEXT X CONTEXT Y CONTEXT Z
|
||||
--------- --------- ----------
|
||||
mutex_lock A
|
||||
lock_page B
|
||||
lock_page B
|
||||
mutex_lock A /* DEADLOCK */
|
||||
unlock_page B held by X
|
||||
unlock_page B
|
||||
mutex_unlock A
|
||||
mutex_unlock A
|
||||
|
||||
where A and B are different lock classes.
|
||||
|
||||
No, we cannot.
|
||||
|
||||
Example 2:
|
||||
|
||||
CONTEXT X CONTEXT Y
|
||||
--------- ---------
|
||||
mutex_lock A
|
||||
mutex_lock A
|
||||
wait_for_complete B /* DEADLOCK */
|
||||
complete B
|
||||
mutex_unlock A
|
||||
mutex_unlock A
|
||||
|
||||
where A is a lock class and B is a completion variable.
|
||||
|
||||
No, we cannot.
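
Rewritten as a C sketch (the function names are made up; the mutex and
completion APIs are the real ones), Example 2 looks like this. Classic
lockdep cannot connect the wait_for_completion() to the complete() because
the completion is not released by its 'owner' context:

  #include <linux/mutex.h>
  #include <linux/completion.h>

  static DEFINE_MUTEX(a);
  static DECLARE_COMPLETION(b);

  /* CONTEXT Y: holds A and waits for the completion */
  static void context_y(void)
  {
  	mutex_lock(&a);
  	wait_for_completion(&b);	/* DEADLOCK */
  	mutex_unlock(&a);
  }

  /* CONTEXT X: would signal the completion, but blocks on A first */
  static void context_x(void)
  {
  	mutex_lock(&a);			/* blocks behind Y ... */
  	complete(&b);			/* ... so this never runs */
  	mutex_unlock(&a);
  }
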
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Given the limitation, lockdep cannot detect a deadlock or its
|
||||
possibility caused by page locks or completions.
|
||||
|
||||
|
||||
Relax the limitation
|
||||
--------------------
|
||||
|
||||
Under the limitation, things to create dependencies are limited to
|
||||
typical locks. However, synchronization primitives like page locks and
|
||||
completions, which are allowed to be released in any context, also
|
||||
create dependencies and can cause a deadlock. So lockdep should track
|
||||
these locks to do a better job. We have to relax the limitation for
|
||||
these locks to work with lockdep.
|
||||
|
||||
Detecting dependencies is very important for lockdep to work because
|
||||
adding a dependency means adding an opportunity to check whether it
|
||||
causes a deadlock. The more dependencies lockdep adds, the more
thoroughly it works. Thus lockdep has to do its best to detect and add
as many true dependencies into the graph as possible.
|
||||
|
||||
For example, considering only typical locks, lockdep builds a graph like:
|
||||
|
||||
A -> B -
|
||||
\
|
||||
-> E
|
||||
/
|
||||
C -> D -
|
||||
|
||||
where A, B,..., E are different lock classes.
|
||||
|
||||
On the other hand, under the relaxation, additional dependencies might
|
||||
be created and added. Assuming additional 'FX -> C' and 'E -> GX' are
|
||||
added thanks to the relaxation, the graph will be:
|
||||
|
||||
A -> B -
|
||||
\
|
||||
-> E -> GX
|
||||
/
|
||||
FX -> C -> D -
|
||||
|
||||
where A, B,..., E, FX and GX are different lock classes, and a suffix
|
||||
'X' is added on non-typical locks.
|
||||
|
||||
The latter graph gives us more chances to check circular dependencies
|
||||
than the former. However, it might suffer performance degradation, since
relaxing the limitation, which is what allows the design and
implementation of lockdep to be efficient, inevitably introduces some
inefficiency. So lockdep should provide two options: strong detection
and efficient detection.
|
||||
|
||||
Choosing efficient detection:
|
||||
|
||||
Lockdep works with only locks restricted to be released within the
|
||||
acquire context. However, lockdep works efficiently.
|
||||
|
||||
Choosing strong detection:
|
||||
|
||||
Lockdep works with all synchronization primitives. However, lockdep
|
||||
suffers performance degradation.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Relaxing the limitation, lockdep can add additional dependencies giving
|
||||
additional opportunities to check circular dependencies.
|
||||
|
||||
|
||||
============
|
||||
Crossrelease
|
||||
============
|
||||
|
||||
Introduce crossrelease
|
||||
----------------------
|
||||
|
||||
In order to allow lockdep to handle additional dependencies created by
locks that might be released in any context, namely 'crosslocks', we
have to be able to identify the dependencies created by crosslocks. The
proposed 'crossrelease' feature provides a way to do that.
|
||||
|
||||
The crossrelease feature has to do two things:
|
||||
|
||||
1. Identify dependencies created by crosslocks.
|
||||
2. Add the dependencies into a dependency graph.
|
||||
|
||||
That's all. Once a meaningful dependency is added into the graph, lockdep
works with the graph as it did before. The most important thing the
crossrelease feature has to do is to correctly identify and add true
dependencies into the global graph.
|
||||
|
||||
A dependency e.g. 'A -> B' can be identified only in the A's release
|
||||
context because a decision required to identify the dependency can be
|
||||
made only in the release context. That is to decide whether A can be
|
||||
released so that a waiter for A can be woken up. That decision cannot be
made outside of A's release context.
|
||||
|
||||
This does not matter for typical locks, because each acquire context is the same as
|
||||
its release context, thus lockdep can decide whether a lock can be
|
||||
released in the acquire context. However for crosslocks, lockdep cannot
|
||||
make the decision in the acquire context but has to wait until the
|
||||
release context is identified.
|
||||
|
||||
Therefore, deadlocks involving crosslocks cannot be detected at the
moment they happen, because the dependencies cannot be identified until
the crosslocks are released. However, deadlock possibilities can still
be detected, and that is very worthwhile. See the 'APPENDIX A' section
to see why.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Using crossrelease feature, lockdep can work with what might be released
|
||||
in any context, namely crosslock.
|
||||
|
||||
|
||||
Introduce commit
|
||||
----------------
|
||||
|
||||
Since crossrelease defers the work adding true dependencies of
|
||||
crosslocks until they are actually released, crossrelease has to queue
|
||||
all acquisitions which might create dependencies with the crosslocks.
|
||||
Then it identifies dependencies using the queued data in batches at a
|
||||
proper time. We call it 'commit'.
|
||||
|
||||
There are four types of dependencies:
|
||||
|
||||
1. TT type: 'typical lock A -> typical lock B'
|
||||
|
||||
Just when acquiring B, lockdep can see it's in the A's release
|
||||
context. So the dependency between A and B can be identified
|
||||
immediately. Commit is unnecessary.
|
||||
|
||||
2. TC type: 'typical lock A -> crosslock BX'
|
||||
|
||||
Just when acquiring BX, lockdep can see it's in the A's release
|
||||
context. So the dependency between A and BX can be identified
|
||||
immediately. Commit is unnecessary, too.
|
||||
|
||||
3. CT type: 'crosslock AX -> typical lock B'
|
||||
|
||||
When acquiring B, lockdep cannot identify the dependency because
|
||||
there's no way to know if it's in the AX's release context. It has
|
||||
to wait until the decision can be made. Commit is necessary.
|
||||
|
||||
4. CC type: 'crosslock AX -> crosslock BX'
|
||||
|
||||
When acquiring BX, lockdep cannot identify the dependency because
|
||||
there's no way to know if it's in the AX's release context. It has
|
||||
to wait until the decision can be made. Commit is necessary.
|
||||
But, handling CC type is not implemented yet. It's a future work.
|
||||
|
||||
Lockdep can work without commit for typical locks, but the commit step
is necessary once crosslocks are involved. With commit introduced,
lockdep performs three steps. What lockdep does in each step is:
|
||||
|
||||
1. Acquisition: For typical locks, lockdep does what it originally did
|
||||
and queues the lock so that CT type dependencies can be checked using
|
||||
it at the commit step. For crosslocks, it saves data which will be
|
||||
used at the commit step and increases a reference count for it.
|
||||
|
||||
2. Commit: No action is required for typical locks. For crosslocks,
|
||||
lockdep adds CT type dependencies using the data saved at the
|
||||
acquisition step.
|
||||
|
||||
3. Release: No changes are required for typical locks. When a crosslock
|
||||
is released, it decreases a reference count for it.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Crossrelease introduces commit step to handle dependencies of crosslocks
|
||||
in batches at a proper time.
|
||||
|
||||
|
||||
==============
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Data structures
|
||||
---------------
|
||||
|
||||
Crossrelease introduces two main data structures.
|
||||
|
||||
1. hist_lock
|
||||
|
||||
This is an array embedded in task_struct, for keeping lock history so
|
||||
that dependencies can be added using them at the commit step. Since
|
||||
it's local data, it can be accessed locklessly in the owner context.
|
||||
The array is filled at the acquisition step and consumed at the
|
||||
commit step. And it's managed in a circular manner (see the sketch below).
|
||||
|
||||
2. cross_lock
|
||||
|
||||
One per lockdep_map exists. This is for keeping data of crosslocks
|
||||
and used at the commit step.
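
To make the circular history management concrete, here is a purely
illustrative sketch -- NOT the kernel's actual hist_lock layout; all names
and the entry contents are made up -- of a fixed-size history ring owned by
one task, filled at acquisition time and consumed at commit time:

  #define MY_HIST_NR	64		/* hypothetical ring size */

  struct my_hist_entry {		/* hypothetical entry */
  	unsigned int	class_idx;	/* lock class that was acquired */
  	unsigned long	acquire_ip;	/* where it was acquired */
  };

  struct my_task_hist {			/* stands in for task_struct fields */
  	struct my_hist_entry	ring[MY_HIST_NR];
  	unsigned int		idx;	/* next free slot, wraps around */
  };

  static void my_hist_add(struct my_task_hist *h, unsigned int class_idx,
  			  unsigned long ip)
  {
  	struct my_hist_entry *e = &h->ring[h->idx++ % MY_HIST_NR];

  	e->class_idx = class_idx;	/* oldest entries get overwritten */
  	e->acquire_ip = ip;
  }

The real hist_lock entries carry more state than this; the sketch only
illustrates the ring idea.
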
|
||||
|
||||
|
||||
How crossrelease works
|
||||
----------------------
|
||||
|
||||
The key to how crossrelease works is to defer the necessary work to an
appropriate point in time and perform it all at once at the commit step.
|
||||
Let's take a look with examples step by step, starting from how lockdep
|
||||
works without crossrelease for typical locks.
|
||||
|
||||
acquire A /* Push A onto held_locks */
|
||||
acquire B /* Push B onto held_locks and add 'A -> B' */
|
||||
acquire C /* Push C onto held_locks and add 'B -> C' */
|
||||
release C /* Pop C from held_locks */
|
||||
release B /* Pop B from held_locks */
|
||||
release A /* Pop A from held_locks */
|
||||
|
||||
where A, B and C are different lock classes.
|
||||
|
||||
NOTE: This document assumes that readers already understand how
|
||||
lockdep works without crossrelease thus omits details. But there's
|
||||
one thing to note. Lockdep pretends to pop a lock from held_locks
|
||||
when releasing it. But it's subtly different from the original pop
|
||||
operation because lockdep allows entries other than the top to be popped.
|
||||
|
||||
In this case, lockdep adds 'the top of held_locks -> the lock to acquire'
|
||||
dependency every time acquiring a lock.
|
||||
|
||||
After adding 'A -> B', a dependency graph will be:
|
||||
|
||||
A -> B
|
||||
|
||||
where A and B are different lock classes.
|
||||
|
||||
And after adding 'B -> C', the graph will be:
|
||||
|
||||
A -> B -> C
|
||||
|
||||
where A, B and C are different lock classes.
|
||||
|
||||
Let's perform the commit step even for typical locks to add dependencies.
Of course, the commit step is not necessary for them; however, it works
well because this is a more general way.
|
||||
|
||||
acquire A
|
||||
/*
|
||||
* Queue A into hist_locks
|
||||
*
|
||||
* In hist_locks: A
|
||||
* In graph: Empty
|
||||
*/
|
||||
|
||||
acquire B
|
||||
/*
|
||||
* Queue B into hist_locks
|
||||
*
|
||||
* In hist_locks: A, B
|
||||
* In graph: Empty
|
||||
*/
|
||||
|
||||
acquire C
|
||||
/*
|
||||
* Queue C into hist_locks
|
||||
*
|
||||
* In hist_locks: A, B, C
|
||||
* In graph: Empty
|
||||
*/
|
||||
|
||||
commit C
|
||||
/*
|
||||
* Add 'C -> ?'
|
||||
* Answer the following to decide '?'
|
||||
* What has been queued since acquire C: Nothing
|
||||
*
|
||||
* In hist_locks: A, B, C
|
||||
* In graph: Empty
|
||||
*/
|
||||
|
||||
release C
|
||||
|
||||
commit B
|
||||
/*
|
||||
* Add 'B -> ?'
|
||||
* Answer the following to decide '?'
|
||||
* What has been queued since acquire B: C
|
||||
*
|
||||
* In hist_locks: A, B, C
|
||||
* In graph: 'B -> C'
|
||||
*/
|
||||
|
||||
release B
|
||||
|
||||
commit A
|
||||
/*
|
||||
* Add 'A -> ?'
|
||||
* Answer the following to decide '?'
|
||||
* What has been queued since acquire A: B, C
|
||||
*
|
||||
* In hist_locks: A, B, C
|
||||
* In graph: 'B -> C', 'A -> B', 'A -> C'
|
||||
*/
|
||||
|
||||
release A
|
||||
|
||||
where A, B and C are different lock classes.
|
||||
|
||||
In this case, dependencies are added at the commit step as described.
|
||||
|
||||
After commits for A, B and C, the graph will be:
|
||||
|
||||
A -> B -> C
|
||||
|
||||
where A, B and C are different lock classes.
|
||||
|
||||
NOTE: A dependency 'A -> C' is optimized out.
|
||||
|
||||
We can see the former graph, built without the commit step, is the same
as the latter graph, built using commit steps. Of course the former way
finishes building the graph earlier, which means we can detect a
deadlock or its possibility sooner. So the former way would be preferred
when possible. But we cannot avoid using the latter way for crosslocks.
|
||||
|
||||
Let's look at how commit steps work for crosslocks. In this case, the
|
||||
commit step is actually performed only on the crosslock BX. And it assumes
that the BX release context is different from the BX acquire context.
|
||||
|
||||
BX RELEASE CONTEXT BX ACQUIRE CONTEXT
|
||||
------------------ ------------------
|
||||
acquire A
|
||||
/*
|
||||
* Push A onto held_locks
|
||||
* Queue A into hist_locks
|
||||
*
|
||||
* In held_locks: A
|
||||
* In hist_locks: A
|
||||
* In graph: Empty
|
||||
*/
|
||||
|
||||
acquire BX
|
||||
/*
|
||||
* Add 'the top of held_locks -> BX'
|
||||
*
|
||||
* In held_locks: A
|
||||
* In hist_locks: A
|
||||
* In graph: 'A -> BX'
|
||||
*/
|
||||
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
It must be guaranteed that the following operations are seen after
|
||||
acquiring BX globally. It can be done by things like barrier.
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
acquire C
|
||||
/*
|
||||
* Push C onto held_locks
|
||||
* Queue C into hist_locks
|
||||
*
|
||||
* In held_locks: C
|
||||
* In hist_locks: C
|
||||
* In graph: 'A -> BX'
|
||||
*/
|
||||
|
||||
release C
|
||||
/*
|
||||
* Pop C from held_locks
|
||||
*
|
||||
* In held_locks: Empty
|
||||
* In hist_locks: C
|
||||
* In graph: 'A -> BX'
|
||||
*/
|
||||
acquire D
|
||||
/*
|
||||
* Push D onto held_locks
|
||||
* Queue D into hist_locks
|
||||
* Add 'the top of held_locks -> D'
|
||||
*
|
||||
* In held_locks: A, D
|
||||
* In hist_locks: A, D
|
||||
* In graph: 'A -> BX', 'A -> D'
|
||||
*/
|
||||
acquire E
|
||||
/*
|
||||
* Push E onto held_locks
|
||||
* Queue E into hist_locks
|
||||
*
|
||||
* In held_locks: E
|
||||
* In hist_locks: C, E
|
||||
* In graph: 'A -> BX', 'A -> D'
|
||||
*/
|
||||
|
||||
release E
|
||||
/*
|
||||
* Pop E from held_locks
|
||||
*
|
||||
* In held_locks: Empty
|
||||
* In hist_locks: D, E
|
||||
* In graph: 'A -> BX', 'A -> D'
|
||||
*/
|
||||
release D
|
||||
/*
|
||||
* Pop D from held_locks
|
||||
*
|
||||
* In held_locks: A
|
||||
* In hist_locks: A, D
|
||||
* In graph: 'A -> BX', 'A -> D'
|
||||
*/
|
||||
commit BX
|
||||
/*
|
||||
* Add 'BX -> ?'
|
||||
* What has been queued since acquire BX: C, E
|
||||
*
|
||||
* In held_locks: Empty
|
||||
* In hist_locks: D, E
|
||||
* In graph: 'A -> BX', 'A -> D',
|
||||
* 'BX -> C', 'BX -> E'
|
||||
*/
|
||||
|
||||
release BX
|
||||
/*
|
||||
* In held_locks: Empty
|
||||
* In hist_locks: D, E
|
||||
* In graph: 'A -> BX', 'A -> D',
|
||||
* 'BX -> C', 'BX -> E'
|
||||
*/
|
||||
release A
|
||||
/*
|
||||
* Pop A from held_locks
|
||||
*
|
||||
* In held_locks: Empty
|
||||
* In hist_locks: A, D
|
||||
* In graph: 'A -> BX', 'A -> D',
|
||||
* 'BX -> C', 'BX -> E'
|
||||
*/
|
||||
|
||||
where A, BX, C,..., E are different lock classes, and a suffix 'X' is
|
||||
added on crosslocks.
|
||||
|
||||
Crossrelease considers all acquisitions after acquiring BX to be
candidates which might create dependencies with BX. True dependencies
will be determined when identifying the release context of BX. Meanwhile,
|
||||
all typical locks are queued so that they can be used at the commit step.
|
||||
And then two dependencies 'BX -> C' and 'BX -> E' are added at the
|
||||
commit step when identifying the release context.
|
||||
|
||||
The final graph will be, with crossrelease:
|
||||
|
||||
-> C
|
||||
/
|
||||
-> BX -
|
||||
/ \
|
||||
A - -> E
|
||||
\
|
||||
-> D
|
||||
|
||||
where A, BX, C,..., E are different lock classes, and a suffix 'X' is
|
||||
added on crosslocks.
|
||||
|
||||
However, the final graph will be, without crossrelease:
|
||||
|
||||
A -> D
|
||||
|
||||
where A and D are different lock classes.
|
||||
|
||||
The former graph has three more dependencies, 'A -> BX', 'BX -> C' and
|
||||
'BX -> E' giving additional opportunities to check if they cause
|
||||
deadlocks. This way lockdep can detect a deadlock or its possibility
|
||||
caused by crosslocks.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
We checked how crossrelease works with several examples.
|
||||
|
||||
|
||||
=============
|
||||
Optimizations
|
||||
=============
|
||||
|
||||
Avoid duplication
|
||||
-----------------
|
||||
|
||||
The crossrelease feature uses a cache like the one lockdep already uses
for dependency chains, but this time for caching CT type dependencies.
Once a dependency is cached, the same one will never be added again.
|
||||
|
||||
|
||||
Lockless for hot paths
|
||||
----------------------
|
||||
|
||||
To keep all locks for later use at the commit step, crossrelease adopts
|
||||
a local array embedded in task_struct, which makes access to the data
|
||||
lockless by forcing it to happen only within the owner context. It's
like how lockdep handles held_locks. A lockless implementation is
important since typical locks are acquired and released very frequently.
|
||||
|
||||
|
||||
==================================================
APPENDIX A: What lockdep does to work aggressively
==================================================
|
||||
|
||||
A deadlock actually occurs when all wait operations creating circular
|
||||
dependencies run at the same time. Even though they don't, a potential
|
||||
deadlock exists if the problematic dependencies exist. Thus it's
|
||||
meaningful to detect not only an actual deadlock but also its potential
|
||||
possibility. The latter is rather valuable. When a deadlock actually
occurs, we can identify what happens in the system by some means or
other even without lockdep. However, there's no way to detect the
possibility without lockdep, short of reviewing the whole code by hand,
which is impractical. Lockdep does both, and crossrelease focuses only
on the latter.
|
||||
|
||||
Whether or not a deadlock actually occurs depends on several factors.
|
||||
For example, what order contexts are switched in is a factor. Assuming
|
||||
circular dependencies exist, a deadlock would occur when contexts are
|
||||
switched so that all wait operations creating the dependencies run
|
||||
simultaneously. Thus to detect a deadlock possibility even in the case
|
||||
that it has not occurred yet, lockdep should consider all possible
|
||||
combinations of dependencies, trying to:
|
||||
|
||||
1. Use a global dependency graph.
|
||||
|
||||
Lockdep combines all dependencies into one global graph and uses them,
|
||||
regardless of which context generates them or what order contexts are
|
||||
switched in. Aggregated dependencies are only considered so they are
|
||||
prone to be circular if a problem exists.
|
||||
|
||||
2. Check dependencies between classes instead of instances.
|
||||
|
||||
What actually causes a deadlock are instances of lock. However,
|
||||
lockdep checks dependencies between classes instead of instances.
|
||||
This way lockdep can detect a deadlock which has not happened yet but
might happen in the future with other instances of the same class.
|
||||
|
||||
3. Assume all acquisitions lead to waiting.
|
||||
|
||||
Although a lock might be acquired without waiting, and it is the waiting
that actually creates dependencies, lockdep assumes all acquisitions
lead to waiting, since that might become true at some time or another.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Lockdep detects not only an actual deadlock but also its possibility,
|
||||
and the latter is more valuable.
|
||||
|
||||
|
||||
==================================================
|
||||
APPENDIX B: How to avoid adding false dependencies
|
||||
==================================================
|
||||
|
||||
Recall what a dependency is. A dependency exists if:
|
||||
|
||||
1. There are two waiters waiting for each event at a given time.
|
||||
2. The only way to wake up each waiter is to trigger its event.
|
||||
3. Whether one can be woken up depends on whether the other can.
|
||||
|
||||
For example:
|
||||
|
||||
acquire A
|
||||
acquire B /* A dependency 'A -> B' exists */
|
||||
release B
|
||||
release A
|
||||
|
||||
where A and B are different lock classes.
|
||||
|
||||
A dependency 'A -> B' exists since:
|
||||
|
||||
1. A waiter for A and a waiter for B might exist when acquiring B.
|
||||
2. The only way to wake up each waiter is to release what it waits for.
|
||||
3. Whether the waiter for A can be woken up depends on whether the
|
||||
other can. IOW, TASK X cannot release A if it fails to acquire B.
|
||||
|
||||
For another example:
|
||||
|
||||
TASK X TASK Y
|
||||
------ ------
|
||||
acquire AX
|
||||
acquire B /* A dependency 'AX -> B' exists */
|
||||
release B
|
||||
release AX held by Y
|
||||
|
||||
where AX and B are different lock classes, and a suffix 'X' is added
|
||||
on crosslocks.
|
||||
|
||||
Even in this case involving crosslocks, the same rule can be applied. A
|
||||
dependency 'AX -> B' exists since:
|
||||
|
||||
1. A waiter for AX and a waiter for B might exist when acquiring B.
|
||||
2. The only way to wake up each waiter is to release what it waits for.
|
||||
3. Whether the waiter for AX can be woken up depends on whether the
|
||||
other can. IOW, TASK X cannot release AX if it fails to acquire B.
|
||||
|
||||
Let's take a look at a more complicated example:
|
||||
|
||||
TASK X TASK Y
|
||||
------ ------
|
||||
acquire B
|
||||
release B
|
||||
fork Y
|
||||
acquire AX
|
||||
acquire C /* A dependency 'AX -> C' exists */
|
||||
release C
|
||||
release AX held by Y
|
||||
|
||||
where AX, B and C are different lock classes, and a suffix 'X' is
|
||||
added on crosslocks.
|
||||
|
||||
Does a dependency 'AX -> B' exist? Nope.
|
||||
|
||||
Two waiters are essential to create a dependency. However, waiters for
|
||||
AX and B to create 'AX -> B' cannot exist at the same time in this
|
||||
example. Thus the dependency 'AX -> B' cannot be created.
|
||||
|
||||
It would be ideal if the full set of true dependencies could be
considered. But we can be sure of nothing but what actually happened.
By relying on what actually happens at runtime, we can at least add only
true dependencies, though they might be a subset of all the true ones.
It's similar to how lockdep works for typical locks: there might be more
true dependencies than what lockdep detects at runtime. Lockdep has no
choice but to rely on what actually happens, and crossrelease relies on
it as well.
|
||||
|
||||
CONCLUSION
|
||||
|
||||
Relying on what actually happens, lockdep can avoid adding false
|
||||
dependencies.
|
|
@ -498,11 +498,11 @@ And a couple of implicit varieties:
|
|||
This means that ACQUIRE acts as a minimal "acquire" operation and
|
||||
RELEASE acts as a minimal "release" operation.
|
||||
|
||||
A subset of the atomic operations described in core-api/atomic_ops.rst have
|
||||
ACQUIRE and RELEASE variants in addition to fully-ordered and relaxed (no
|
||||
barrier semantics) definitions. For compound atomics performing both a load
|
||||
and a store, ACQUIRE semantics apply only to the load and RELEASE semantics
|
||||
apply only to the store portion of the operation.
|
||||
A subset of the atomic operations described in atomic_t.txt have ACQUIRE and
|
||||
RELEASE variants in addition to fully-ordered and relaxed (no barrier
|
||||
semantics) definitions. For compound atomics performing both a load and a
|
||||
store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
|
||||
only to the store portion of the operation.
|
||||
|
||||
Memory barriers are only required where there's a possibility of interaction
|
||||
between two CPUs or between a CPU and a device. If it can be guaranteed that
|
||||
|
@ -1883,8 +1883,7 @@ There are some more advanced barrier functions:
|
|||
This makes sure that the death mark on the object is perceived to be set
|
||||
*before* the reference counter is decremented.
|
||||
|
||||
See Documentation/core-api/atomic_ops.rst for more information. See the
|
||||
"Atomic operations" subsection for information on where to use these.
|
||||
See Documentation/atomic_{t,bitops}.txt for more information.
|
||||
|
||||
|
||||
(*) lockless_dereference();
|
||||
|
@ -1989,10 +1988,7 @@ for each construct. These operations all imply certain barriers:
|
|||
ACQUIRE operation has completed.
|
||||
|
||||
Memory operations issued before the ACQUIRE may be completed after
|
||||
the ACQUIRE operation has completed. An smp_mb__before_spinlock(),
|
||||
combined with a following ACQUIRE, orders prior stores against
|
||||
subsequent loads and stores. Note that this is weaker than smp_mb()!
|
||||
The smp_mb__before_spinlock() primitive is free on many architectures.
|
||||
the ACQUIRE operation has completed.
|
||||
|
||||
(2) RELEASE operation implication:
|
||||
|
||||
|
@ -2510,88 +2506,7 @@ operations are noted specially as some of them imply full memory barriers and
|
|||
some don't, but they're very heavily relied on as a group throughout the
|
||||
kernel.
|
||||
|
||||
Any atomic operation that modifies some state in memory and returns information
|
||||
about the state (old or new) implies an SMP-conditional general memory barrier
|
||||
(smp_mb()) on each side of the actual operation (with the exception of
|
||||
explicit lock operations, described later). These include:
|
||||
|
||||
xchg();
|
||||
atomic_xchg(); atomic_long_xchg();
|
||||
atomic_inc_return(); atomic_long_inc_return();
|
||||
atomic_dec_return(); atomic_long_dec_return();
|
||||
atomic_add_return(); atomic_long_add_return();
|
||||
atomic_sub_return(); atomic_long_sub_return();
|
||||
atomic_inc_and_test(); atomic_long_inc_and_test();
|
||||
atomic_dec_and_test(); atomic_long_dec_and_test();
|
||||
atomic_sub_and_test(); atomic_long_sub_and_test();
|
||||
atomic_add_negative(); atomic_long_add_negative();
|
||||
test_and_set_bit();
|
||||
test_and_clear_bit();
|
||||
test_and_change_bit();
|
||||
|
||||
/* when succeeds */
|
||||
cmpxchg();
|
||||
atomic_cmpxchg(); atomic_long_cmpxchg();
|
||||
atomic_add_unless(); atomic_long_add_unless();
|
||||
|
||||
These are used for such things as implementing ACQUIRE-class and RELEASE-class
|
||||
operations and adjusting reference counters towards object destruction, and as
|
||||
such the implicit memory barrier effects are necessary.
|
||||
|
||||
|
||||
The following operations are potential problems as they do _not_ imply memory
|
||||
barriers, but might be used for implementing such things as RELEASE-class
|
||||
operations:
|
||||
|
||||
atomic_set();
|
||||
set_bit();
|
||||
clear_bit();
|
||||
change_bit();
|
||||
|
||||
With these the appropriate explicit memory barrier should be used if necessary
|
||||
(smp_mb__before_atomic() for instance).
|
||||
|
||||
|
||||
The following also do _not_ imply memory barriers, and so may require explicit
|
||||
memory barriers under some circumstances (smp_mb__before_atomic() for
|
||||
instance):
|
||||
|
||||
atomic_add();
|
||||
atomic_sub();
|
||||
atomic_inc();
|
||||
atomic_dec();
|
||||
|
||||
If they're used for statistics generation, then they probably don't need memory
|
||||
barriers, unless there's a coupling between statistical data.
|
||||
|
||||
If they're used for reference counting on an object to control its lifetime,
|
||||
they probably don't need memory barriers because either the reference count
|
||||
will be adjusted inside a locked section, or the caller will already hold
|
||||
sufficient references to make the lock, and thus a memory barrier unnecessary.
|
||||
|
||||
If they're used for constructing a lock of some description, then they probably
|
||||
do need memory barriers as a lock primitive generally has to do things in a
|
||||
specific order.
|
||||
|
||||
Basically, each usage case has to be carefully considered as to whether memory
|
||||
barriers are needed or not.
|
||||
|
||||
The following operations are special locking primitives:
|
||||
|
||||
test_and_set_bit_lock();
|
||||
clear_bit_unlock();
|
||||
__clear_bit_unlock();
|
||||
|
||||
These implement ACQUIRE-class and RELEASE-class operations. These should be
|
||||
used in preference to other operations when implementing locking primitives,
|
||||
because their implementations can be optimised on many architectures.
|
||||
|
||||
[!] Note that special memory barrier primitives are available for these
|
||||
situations because on some CPUs the atomic instructions used imply full memory
|
||||
barriers, and so barrier instructions are superfluous in conjunction with them,
|
||||
and in such cases the special barrier primitives will be no-ops.
|
||||
|
||||
See Documentation/core-api/atomic_ops.rst for more information.
|
||||
See Documentation/atomic_t.txt for more information.
|
||||
|
||||
|
||||
ACCESSING DEVICES
|
||||
|
|
|
@ -149,6 +149,26 @@ static_branch_inc(), will change the branch back to true. Likewise, if the
|
|||
key is initialized false, a 'static_branch_inc()', will change the branch to
|
||||
true. And then a 'static_branch_dec()', will again make the branch false.
|
||||
|
||||
The state and the reference count can be retrieved with 'static_key_enabled()'
|
||||
and 'static_key_count()'. In general, if you use these functions, they
|
||||
should be protected with the same mutex used around the enable/disable
|
||||
or increment/decrement function.
|
||||
|
||||
Note that switching branches results in some locks being taken,
|
||||
particularly the CPU hotplug lock (in order to avoid races against
CPUs being brought into the kernel whilst the kernel is getting
patched). Calling the static key API from within a hotplug notifier is
thus a sure deadlock recipe. In order to still allow use of the
functionality, the following functions are provided:
|
||||
|
||||
static_key_enable_cpuslocked()
|
||||
static_key_disable_cpuslocked()
|
||||
static_branch_enable_cpuslocked()
|
||||
static_branch_disable_cpuslocked()
|
||||
|
||||
These functions are *not* general purpose, and must only be used when
|
||||
you really know that you're in the above context, and no other.
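
As a small, self-contained illustration of the ordinary (non-hotplug) usage
pattern -- the key name and helper functions below are made up -- a feature
flag typically looks like:

  #include <linux/jump_label.h>

  static DEFINE_STATIC_KEY_FALSE(my_feature_key);

  static void do_extra_work(void)	/* hypothetical slow-path helper */
  {
  }

  /* Hot path: compiles to a patched jump/no-op, not a load-and-test. */
  static void hot_path(void)
  {
  	if (static_branch_unlikely(&my_feature_key))
  		do_extra_work();
  }

  /* Slow path: flips the branch; takes the CPU hotplug lock internally. */
  static void enable_feature(void)
  {
  	static_branch_enable(&my_feature_key);
  }

From within a hotplug notifier, where the hotplug lock is already held, the
*_cpuslocked() variants listed above would have to be used instead of
static_branch_enable()/static_branch_disable().
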
|
||||
|
||||
Where an array of keys is required, it can be defined as::
|
||||
|
||||
DEFINE_STATIC_KEY_ARRAY_TRUE(keys, count);
|
||||
|
|
|
@ -1956,10 +1956,7 @@ MMIO 쓰기 배리어
|
|||
뒤에 완료됩니다.
|
||||
|
||||
ACQUIRE 앞에서 요청된 메모리 오퍼레이션은 ACQUIRE 오퍼레이션이 완료된 후에
|
||||
완료될 수 있습니다. smp_mb__before_spinlock() 뒤에 ACQUIRE 가 실행되는
|
||||
코드 블록은 블록 앞의 스토어를 블록 뒤의 로드와 스토어에 대해 순서
|
||||
맞춥니다. 이건 smp_mb() 보다 완화된 것임을 기억하세요! 많은 아키텍쳐에서
|
||||
smp_mb__before_spinlock() 은 사실 아무일도 하지 않습니다.
|
||||
완료될 수 있습니다.
|
||||
|
||||
(2) RELEASE 오퍼레이션의 영향:
|
||||
|
||||
|
|
12
arch/Kconfig
12
arch/Kconfig
|
@ -931,6 +931,18 @@ config STRICT_MODULE_RWX
|
|||
config ARCH_WANT_RELAX_ORDER
|
||||
bool
|
||||
|
||||
config ARCH_HAS_REFCOUNT
|
||||
bool
|
||||
help
|
||||
An architecture selects this when it has implemented refcount_t
|
||||
using open coded assembly primitives that provide an optimized
|
||||
refcount_t implementation, possibly at the expense of some full
|
||||
refcount state checks of CONFIG_REFCOUNT_FULL=y.
|
||||
|
||||
The refcount overflow check behavior, however, must be retained.
|
||||
Catching overflows is the primary security concern for protecting
|
||||
against bugs in reference counts.
|
||||
|
||||
config REFCOUNT_FULL
|
||||
bool "Perform full reference count validation at the expense of speed"
|
||||
help
|
||||
|
|
|
@ -25,18 +25,10 @@
|
|||
: "r" (uaddr), "r"(oparg) \
|
||||
: "memory")
|
||||
|
||||
static inline int futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -62,17 +54,9 @@ static inline int futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -123,6 +123,8 @@ static inline void atomic_set(atomic_t *v, int i)
|
|||
atomic_ops_unlock(flags);
|
||||
}
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
#endif
|
||||
|
||||
/*
|
||||
|
|
|
@ -73,20 +73,11 @@
|
|||
|
||||
#endif
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
|
||||
return -EFAULT;
|
||||
|
||||
#ifndef CONFIG_ARC_HAS_LLSC
|
||||
preempt_disable(); /* to guarantee atomic r-m-w of futex op */
|
||||
#endif
|
||||
|
@ -118,30 +109,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
preempt_enable();
|
||||
#endif
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (oldval == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (oldval != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (oldval < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (oldval >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (oldval <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (oldval > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -128,20 +128,10 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
|
|||
#endif /* !SMP */
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret, tmp;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
#ifndef CONFIG_SMP
|
||||
preempt_disable();
|
||||
#endif
|
||||
|
@ -172,17 +162,9 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
preempt_enable();
|
||||
#endif
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -48,20 +48,10 @@ do { \
|
|||
} while (0)
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (int)(encoded_op << 8) >> 20;
|
||||
int cmparg = (int)(encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret, tmp;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1U << (oparg & 0x1f);
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
switch (op) {
|
||||
|
@ -91,17 +81,9 @@ futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@@ -310,14 +310,7 @@ static inline int arch_read_trylock(arch_rwlock_t *rw)
#define arch_read_relax(lock)	cpu_relax()
#define arch_write_relax(lock)	cpu_relax()

/*
 * Accesses appearing in program order before a spin_lock() operation
 * can be reordered with accesses inside the critical section, by virtue
 * of arch_spin_lock being constructed using acquire semantics.
 *
 * In cases where this is problematic (e.g. try_to_wake_up), an
 * smp_mb__before_spinlock() can restore the required ordering.
 */
#define smp_mb__before_spinlock()	smp_mb()
/* See include/linux/spinlock.h */
#define smp_mb__after_spinlock()	smp_mb()

#endif /* __ASM_SPINLOCK_H */
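
The hunk above (and the powerpc ones further down) replaces the barrier issued before taking the lock with one issued inside the critical section. A minimal, hypothetical wait/wake pair showing where the new macro sits (names and structure invented for illustration; the real user is the scheduler's wake-up path):

/* Hypothetical waker: the full barrier now follows the lock acquisition. */
static void example_try_to_wake(struct task_struct *p, raw_spinlock_t *lock)
{
        raw_spin_lock(lock);
        smp_mb__after_spinlock();       /* order earlier stores against the state read below */
        if (READ_ONCE(p->state) != TASK_RUNNING)
                wake_up_process(p);
        raw_spin_unlock(lock);
}
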
@ -7,7 +7,8 @@
|
|||
#include <asm/errno.h>
|
||||
#include <linux/uaccess.h>
|
||||
|
||||
extern int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr);
|
||||
extern int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr);
|
||||
|
||||
static inline int
|
||||
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
|
||||
|
|
|
@ -186,20 +186,10 @@ static inline int atomic_futex_op_xchg_xor(int oparg, u32 __user *uaddr, int *_o
|
|||
/*
|
||||
* do the futex operations
|
||||
*/
|
||||
int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
int arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
switch (op) {
|
||||
|
@ -225,18 +215,9 @@ int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS; break;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
|
||||
} /* end futex_atomic_op_inuser() */
|
||||
} /* end arch_futex_atomic_op_inuser() */
|
||||
|
|
|
@ -42,6 +42,8 @@ static inline void atomic_set(atomic_t *v, int new)
|
|||
);
|
||||
}
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
/**
|
||||
* atomic_read - reads a word, atomically
|
||||
* @v: pointer to atomic value
|
||||
|
|
|
@ -31,18 +31,9 @@
|
|||
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser(int encoded_op, int __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -72,30 +63,9 @@ futex_atomic_op_inuser(int encoded_op, int __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (oldval == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (oldval != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (oldval < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (oldval >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (oldval <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (oldval > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -45,18 +45,9 @@ do { \
|
|||
} while (0)
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (! access_ok (VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -84,17 +75,9 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -37,6 +37,8 @@ static inline int atomic_set(atomic_t *v, int i)
|
|||
return i;
|
||||
}
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
#define ATOMIC_OP(op, c_op) \
|
||||
static inline void atomic_##op(int i, atomic_t *v) \
|
||||
{ \
|
||||
|
|
|
@ -29,18 +29,9 @@
|
|||
})
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -66,30 +57,9 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (oldval == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (oldval != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (oldval < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (oldval >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (oldval <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (oldval > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -83,18 +83,9 @@
|
|||
}
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (! access_ok (VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -125,17 +116,9 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -648,12 +648,12 @@ EXPORT_SYMBOL(flush_tlb_one);
|
|||
#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
|
||||
|
||||
static DEFINE_PER_CPU(atomic_t, tick_broadcast_count);
|
||||
static DEFINE_PER_CPU(struct call_single_data, tick_broadcast_csd);
|
||||
static DEFINE_PER_CPU(call_single_data_t, tick_broadcast_csd);
|
||||
|
||||
void tick_broadcast(const struct cpumask *mask)
|
||||
{
|
||||
atomic_t *count;
|
||||
struct call_single_data *csd;
|
||||
call_single_data_t *csd;
|
||||
int cpu;
|
||||
|
||||
for_each_cpu(cpu, mask) {
|
||||
|
@ -674,7 +674,7 @@ static void tick_broadcast_callee(void *info)
|
|||
|
||||
static int __init tick_broadcast_init(void)
|
||||
{
|
||||
struct call_single_data *csd;
|
||||
call_single_data_t *csd;
|
||||
int cpu;
|
||||
|
||||
for (cpu = 0; cpu < NR_CPUS; cpu++) {
|
||||
|
|
|
@ -30,20 +30,10 @@
|
|||
})
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
switch (op) {
|
||||
|
@ -68,30 +58,9 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (oldval == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (oldval != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (oldval < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (oldval >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (oldval <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (oldval > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -65,6 +65,8 @@ static __inline__ void atomic_set(atomic_t *v, int i)
|
|||
_atomic_spin_unlock_irqrestore(v, flags);
|
||||
}
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
static __inline__ int atomic_read(const atomic_t *v)
|
||||
{
|
||||
return READ_ONCE((v)->counter);
|
||||
|
|
|
@ -32,22 +32,12 @@ _futex_spin_unlock_irqrestore(u32 __user *uaddr, unsigned long int *flags)
|
|||
}
|
||||
|
||||
static inline int
|
||||
futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
unsigned long int flags;
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval, ret;
|
||||
u32 tmp;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(*uaddr)))
|
||||
return -EFAULT;
|
||||
|
||||
_futex_spin_lock_irqsave(uaddr, &flags);
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -85,17 +75,9 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
pagefault_enable();
|
||||
_futex_spin_unlock_irqrestore(uaddr, &flags);
|
||||
|
||||
if (ret == 0) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@@ -74,13 +74,6 @@ do { \
	___p1;							\
})

/*
 * This must resolve to hwsync on SMP for the context switch path.
 * See _switch, and core scheduler context switch memory ordering
 * comments.
 */
#define smp_mb__before_spinlock()	smp_mb()

#include <asm-generic/barrier.h>

#endif /* _ASM_POWERPC_BARRIER_H */
@ -29,18 +29,10 @@
|
|||
: "b" (uaddr), "i" (-EFAULT), "r" (oparg) \
|
||||
: "cr0", "memory")
|
||||
|
||||
static inline int futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (! access_ok (VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -66,17 +58,9 @@ static inline int futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@@ -309,5 +309,8 @@ static inline void arch_write_unlock(arch_rwlock_t *rw)
#define arch_read_relax(lock)	__rw_yield(lock)
#define arch_write_relax(lock)	__rw_yield(lock)

/* See include/linux/spinlock.h */
#define smp_mb__after_spinlock()	smp_mb()

#endif /* __KERNEL__ */
#endif /* __ASM_SPINLOCK_H */
@ -21,17 +21,12 @@
|
|||
: "0" (-EFAULT), "d" (oparg), "a" (uaddr), \
|
||||
"m" (*uaddr) : "cc");
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, newval, ret;
|
||||
|
||||
load_kernel_asce();
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
pagefault_disable();
|
||||
switch (op) {
|
||||
|
@ -60,17 +55,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
}
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -27,21 +27,12 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
|
|||
return atomic_futex_op_cmpxchg_inatomic(uval, uaddr, oldval, newval);
|
||||
}
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
u32 oparg = (encoded_op << 8) >> 20;
|
||||
u32 cmparg = (encoded_op << 20) >> 20;
|
||||
u32 oldval, newval, prev;
|
||||
int ret;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
do {
|
||||
|
@ -80,17 +71,8 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = ((int)oldval < (int)cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = ((int)oldval >= (int)cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = ((int)oldval <= (int)cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = ((int)oldval > (int)cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
|
|
@ -29,6 +29,8 @@ int atomic_xchg(atomic_t *, int);
|
|||
int __atomic_add_unless(atomic_t *, int, int);
|
||||
void atomic_set(atomic_t *, int);
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
#define atomic_read(v) ACCESS_ONCE((v)->counter)
|
||||
|
||||
#define atomic_add(i, v) ((void)atomic_add_return( (int)(i), (v)))
|
||||
|
|
|
@ -29,22 +29,14 @@
|
|||
: "r" (uaddr), "r" (oparg), "i" (-EFAULT) \
|
||||
: "memory")
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret, tem;
|
||||
|
||||
if (unlikely(!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))))
|
||||
return -EFAULT;
|
||||
if (unlikely((((unsigned long) uaddr) & 0x3UL)))
|
||||
return -EINVAL;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
switch (op) {
|
||||
|
@ -69,17 +61,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -101,6 +101,8 @@ static inline void atomic_set(atomic_t *v, int n)
|
|||
_atomic_xchg(&v->counter, n);
|
||||
}
|
||||
|
||||
#define atomic_set_release(v, i) atomic_set((v), (i))
|
||||
|
||||
/* A 64bit atomic type */
|
||||
|
||||
typedef struct {
|
||||
|
|
|
@ -106,12 +106,9 @@
|
|||
lock = __atomic_hashed_lock((int __force *)uaddr)
|
||||
#endif
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int uninitialized_var(val), ret;
|
||||
|
||||
__futex_prolog();
|
||||
|
@ -119,12 +116,6 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
/* The 32-bit futex code makes this assumption, so validate it here. */
|
||||
BUILD_BUG_ON(sizeof(atomic_t) != sizeof(int));
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
switch (op) {
|
||||
case FUTEX_OP_SET:
|
||||
|
@ -148,30 +139,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
}
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (val == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (val != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (val < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (val >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (val <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (val > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = val;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@@ -55,6 +55,8 @@ config X86
	select ARCH_HAS_KCOV			if X86_64
	select ARCH_HAS_MMIO_FLUSH
	select ARCH_HAS_PMEM_API		if X86_64
	# Causing hangs/crashes, see the commit that added this change for details.
	select ARCH_HAS_REFCOUNT		if BROKEN
	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
	select ARCH_HAS_SET_MEMORY
	select ARCH_HAS_SG_CHAIN
@ -74,6 +74,9 @@
|
|||
# define _ASM_EXTABLE_EX(from, to) \
|
||||
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)
|
||||
|
||||
# define _ASM_EXTABLE_REFCOUNT(from, to) \
|
||||
_ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
|
||||
|
||||
# define _ASM_NOKPROBE(entry) \
|
||||
.pushsection "_kprobe_blacklist","aw" ; \
|
||||
_ASM_ALIGN ; \
|
||||
|
@ -123,6 +126,9 @@
|
|||
# define _ASM_EXTABLE_EX(from, to) \
|
||||
_ASM_EXTABLE_HANDLE(from, to, ex_handler_ext)
|
||||
|
||||
# define _ASM_EXTABLE_REFCOUNT(from, to) \
|
||||
_ASM_EXTABLE_HANDLE(from, to, ex_handler_refcount)
|
||||
|
||||
/* For C file, we already have NOKPROBE_SYMBOL macro */
|
||||
#endif
|
||||
|
||||
|
|
|
@ -197,35 +197,56 @@ static inline int atomic_xchg(atomic_t *v, int new)
|
|||
return xchg(&v->counter, new);
|
||||
}
|
||||
|
||||
#define ATOMIC_OP(op) \
|
||||
static inline void atomic_##op(int i, atomic_t *v) \
|
||||
{ \
|
||||
asm volatile(LOCK_PREFIX #op"l %1,%0" \
|
||||
: "+m" (v->counter) \
|
||||
: "ir" (i) \
|
||||
: "memory"); \
|
||||
static inline void atomic_and(int i, atomic_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "andl %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "ir" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
#define ATOMIC_FETCH_OP(op, c_op) \
|
||||
static inline int atomic_fetch_##op(int i, atomic_t *v) \
|
||||
{ \
|
||||
int val = atomic_read(v); \
|
||||
do { \
|
||||
} while (!atomic_try_cmpxchg(v, &val, val c_op i)); \
|
||||
return val; \
|
||||
static inline int atomic_fetch_and(int i, atomic_t *v)
|
||||
{
|
||||
int val = atomic_read(v);
|
||||
|
||||
do { } while (!atomic_try_cmpxchg(v, &val, val & i));
|
||||
|
||||
return val;
|
||||
}
|
||||
|
||||
#define ATOMIC_OPS(op, c_op) \
|
||||
ATOMIC_OP(op) \
|
||||
ATOMIC_FETCH_OP(op, c_op)
|
||||
static inline void atomic_or(int i, atomic_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "orl %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "ir" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
ATOMIC_OPS(and, &)
|
||||
ATOMIC_OPS(or , |)
|
||||
ATOMIC_OPS(xor, ^)
|
||||
static inline int atomic_fetch_or(int i, atomic_t *v)
|
||||
{
|
||||
int val = atomic_read(v);
|
||||
|
||||
#undef ATOMIC_OPS
|
||||
#undef ATOMIC_FETCH_OP
|
||||
#undef ATOMIC_OP
|
||||
do { } while (!atomic_try_cmpxchg(v, &val, val | i));
|
||||
|
||||
return val;
|
||||
}
|
||||
|
||||
static inline void atomic_xor(int i, atomic_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "xorl %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "ir" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
static inline int atomic_fetch_xor(int i, atomic_t *v)
|
||||
{
|
||||
int val = atomic_read(v);
|
||||
|
||||
do { } while (!atomic_try_cmpxchg(v, &val, val ^ i));
|
||||
|
||||
return val;
|
||||
}
|
||||
|
||||
/**
|
||||
* __atomic_add_unless - add unless the number is already a given value
|
||||
|
@ -239,10 +260,12 @@ ATOMIC_OPS(xor, ^)
|
|||
static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
|
||||
{
|
||||
int c = atomic_read(v);
|
||||
|
||||
do {
|
||||
if (unlikely(c == u))
|
||||
break;
|
||||
} while (!atomic_try_cmpxchg(v, &c, c + a));
|
||||
|
||||
return c;
|
||||
}
|
||||
|
||||
|
|
|
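
The restructured x86 fetch ops above lean on atomic_try_cmpxchg(), whose failure path writes the freshly observed value back into 'val', so the loop body can stay empty. A standalone C11 analogue of the same pattern (plain userspace code, not the kernel implementation):

#include <stdatomic.h>
#include <stdio.h>

/* Userspace analogue of atomic_fetch_and(): compare_exchange updates 'val'
 * with the current value whenever it fails, so no explicit re-read is
 * needed inside the loop. */
static int fetch_and(atomic_int *v, int i)
{
        int val = atomic_load(v);

        while (!atomic_compare_exchange_weak(v, &val, val & i))
                ;
        return val;
}

int main(void)
{
        atomic_int v = 0xff;
        int old = fetch_and(&v, 0x0f);

        printf("old=%#x new=%#x\n", old, atomic_load(&v));
        return 0;
}
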
@ -312,37 +312,70 @@ static inline long long atomic64_dec_if_positive(atomic64_t *v)
|
|||
#undef alternative_atomic64
|
||||
#undef __alternative_atomic64
|
||||
|
||||
#define ATOMIC64_OP(op, c_op) \
|
||||
static inline void atomic64_##op(long long i, atomic64_t *v) \
|
||||
{ \
|
||||
long long old, c = 0; \
|
||||
while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c) \
|
||||
c = old; \
|
||||
static inline void atomic64_and(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c & i)) != c)
|
||||
c = old;
|
||||
}
|
||||
|
||||
#define ATOMIC64_FETCH_OP(op, c_op) \
|
||||
static inline long long atomic64_fetch_##op(long long i, atomic64_t *v) \
|
||||
{ \
|
||||
long long old, c = 0; \
|
||||
while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c) \
|
||||
c = old; \
|
||||
return old; \
|
||||
static inline long long atomic64_fetch_and(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c & i)) != c)
|
||||
c = old;
|
||||
|
||||
return old;
|
||||
}
|
||||
|
||||
ATOMIC64_FETCH_OP(add, +)
|
||||
static inline void atomic64_or(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c | i)) != c)
|
||||
c = old;
|
||||
}
|
||||
|
||||
static inline long long atomic64_fetch_or(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c | i)) != c)
|
||||
c = old;
|
||||
|
||||
return old;
|
||||
}
|
||||
|
||||
static inline void atomic64_xor(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c ^ i)) != c)
|
||||
c = old;
|
||||
}
|
||||
|
||||
static inline long long atomic64_fetch_xor(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c ^ i)) != c)
|
||||
c = old;
|
||||
|
||||
return old;
|
||||
}
|
||||
|
||||
static inline long long atomic64_fetch_add(long long i, atomic64_t *v)
|
||||
{
|
||||
long long old, c = 0;
|
||||
|
||||
while ((old = atomic64_cmpxchg(v, c, c + i)) != c)
|
||||
c = old;
|
||||
|
||||
return old;
|
||||
}
|
||||
|
||||
#define atomic64_fetch_sub(i, v) atomic64_fetch_add(-(i), (v))
|
||||
|
||||
#define ATOMIC64_OPS(op, c_op) \
|
||||
ATOMIC64_OP(op, c_op) \
|
||||
ATOMIC64_FETCH_OP(op, c_op)
|
||||
|
||||
ATOMIC64_OPS(and, &)
|
||||
ATOMIC64_OPS(or, |)
|
||||
ATOMIC64_OPS(xor, ^)
|
||||
|
||||
#undef ATOMIC64_OPS
|
||||
#undef ATOMIC64_FETCH_OP
|
||||
#undef ATOMIC64_OP
|
||||
|
||||
#endif /* _ASM_X86_ATOMIC64_32_H */
|
||||
|
|
|
@ -177,7 +177,7 @@ static inline long atomic64_cmpxchg(atomic64_t *v, long old, long new)
|
|||
}
|
||||
|
||||
#define atomic64_try_cmpxchg atomic64_try_cmpxchg
|
||||
static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, long *old, long new)
|
||||
static __always_inline bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, long new)
|
||||
{
|
||||
return try_cmpxchg(&v->counter, old, new);
|
||||
}
|
||||
|
@ -198,7 +198,7 @@ static inline long atomic64_xchg(atomic64_t *v, long new)
|
|||
*/
|
||||
static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
|
||||
{
|
||||
long c = atomic64_read(v);
|
||||
s64 c = atomic64_read(v);
|
||||
do {
|
||||
if (unlikely(c == u))
|
||||
return false;
|
||||
|
@ -217,7 +217,7 @@ static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
|
|||
*/
|
||||
static inline long atomic64_dec_if_positive(atomic64_t *v)
|
||||
{
|
||||
long dec, c = atomic64_read(v);
|
||||
s64 dec, c = atomic64_read(v);
|
||||
do {
|
||||
dec = c - 1;
|
||||
if (unlikely(dec < 0))
|
||||
|
@ -226,34 +226,55 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
|
|||
return dec;
|
||||
}
|
||||
|
||||
#define ATOMIC64_OP(op) \
|
||||
static inline void atomic64_##op(long i, atomic64_t *v) \
|
||||
{ \
|
||||
asm volatile(LOCK_PREFIX #op"q %1,%0" \
|
||||
: "+m" (v->counter) \
|
||||
: "er" (i) \
|
||||
: "memory"); \
|
||||
static inline void atomic64_and(long i, atomic64_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "andq %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "er" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
#define ATOMIC64_FETCH_OP(op, c_op) \
|
||||
static inline long atomic64_fetch_##op(long i, atomic64_t *v) \
|
||||
{ \
|
||||
long val = atomic64_read(v); \
|
||||
do { \
|
||||
} while (!atomic64_try_cmpxchg(v, &val, val c_op i)); \
|
||||
return val; \
|
||||
static inline long atomic64_fetch_and(long i, atomic64_t *v)
|
||||
{
|
||||
s64 val = atomic64_read(v);
|
||||
|
||||
do {
|
||||
} while (!atomic64_try_cmpxchg(v, &val, val & i));
|
||||
return val;
|
||||
}
|
||||
|
||||
#define ATOMIC64_OPS(op, c_op) \
|
||||
ATOMIC64_OP(op) \
|
||||
ATOMIC64_FETCH_OP(op, c_op)
|
||||
static inline void atomic64_or(long i, atomic64_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "orq %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "er" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
ATOMIC64_OPS(and, &)
|
||||
ATOMIC64_OPS(or, |)
|
||||
ATOMIC64_OPS(xor, ^)
|
||||
static inline long atomic64_fetch_or(long i, atomic64_t *v)
|
||||
{
|
||||
s64 val = atomic64_read(v);
|
||||
|
||||
#undef ATOMIC64_OPS
|
||||
#undef ATOMIC64_FETCH_OP
|
||||
#undef ATOMIC64_OP
|
||||
do {
|
||||
} while (!atomic64_try_cmpxchg(v, &val, val | i));
|
||||
return val;
|
||||
}
|
||||
|
||||
static inline void atomic64_xor(long i, atomic64_t *v)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "xorq %1,%0"
|
||||
: "+m" (v->counter)
|
||||
: "er" (i)
|
||||
: "memory");
|
||||
}
|
||||
|
||||
static inline long atomic64_fetch_xor(long i, atomic64_t *v)
|
||||
{
|
||||
s64 val = atomic64_read(v);
|
||||
|
||||
do {
|
||||
} while (!atomic64_try_cmpxchg(v, &val, val ^ i));
|
||||
return val;
|
||||
}
|
||||
|
||||
#endif /* _ASM_X86_ATOMIC64_64_H */
|
||||
|
|
|
@@ -157,7 +157,7 @@ extern void __add_wrong_size(void)
#define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock)		\
({									\
	bool success;							\
	__typeof__(_ptr) _old = (_pold);				\
	__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);		\
	__typeof__(*(_ptr)) __old = *_old;				\
	__typeof__(*(_ptr)) __new = (_new);				\
	switch (size) {							\
@ -41,20 +41,11 @@
|
|||
"+m" (*uaddr), "=&r" (tem) \
|
||||
: "r" (oparg), "i" (-EFAULT), "1" (0))
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret, tem;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
switch (op) {
|
||||
|
@ -80,30 +71,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
ret = (oldval == cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
ret = (oldval != cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
ret = (oldval < cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
ret = (oldval >= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
ret = (oldval <= cmparg);
|
||||
break;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
ret = (oldval > cmparg);
|
||||
break;
|
||||
default:
|
||||
ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@ -0,0 +1,109 @@
|
|||
#ifndef __ASM_X86_REFCOUNT_H
|
||||
#define __ASM_X86_REFCOUNT_H
|
||||
/*
|
||||
* x86-specific implementation of refcount_t. Based on PAX_REFCOUNT from
|
||||
* PaX/grsecurity.
|
||||
*/
|
||||
#include <linux/refcount.h>
|
||||
|
||||
/*
|
||||
* This is the first portion of the refcount error handling, which lives in
|
||||
* .text.unlikely, and is jumped to from the CPU flag check (in the
|
||||
* following macros). This saves the refcount value location into CX for
|
||||
* the exception handler to use (in mm/extable.c), and then triggers the
|
||||
* central refcount exception. The fixup address for the exception points
|
||||
* back to the regular execution flow in .text.
|
||||
*/
|
||||
#define _REFCOUNT_EXCEPTION \
|
||||
".pushsection .text.unlikely\n" \
|
||||
"111:\tlea %[counter], %%" _ASM_CX "\n" \
|
||||
"112:\t" ASM_UD0 "\n" \
|
||||
ASM_UNREACHABLE \
|
||||
".popsection\n" \
|
||||
"113:\n" \
|
||||
_ASM_EXTABLE_REFCOUNT(112b, 113b)
|
||||
|
||||
/* Trigger refcount exception if refcount result is negative. */
|
||||
#define REFCOUNT_CHECK_LT_ZERO \
|
||||
"js 111f\n\t" \
|
||||
_REFCOUNT_EXCEPTION
|
||||
|
||||
/* Trigger refcount exception if refcount result is zero or negative. */
|
||||
#define REFCOUNT_CHECK_LE_ZERO \
|
||||
"jz 111f\n\t" \
|
||||
REFCOUNT_CHECK_LT_ZERO
|
||||
|
||||
/* Trigger refcount exception unconditionally. */
|
||||
#define REFCOUNT_ERROR \
|
||||
"jmp 111f\n\t" \
|
||||
_REFCOUNT_EXCEPTION
|
||||
|
||||
static __always_inline void refcount_add(unsigned int i, refcount_t *r)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "addl %1,%0\n\t"
|
||||
REFCOUNT_CHECK_LT_ZERO
|
||||
: [counter] "+m" (r->refs.counter)
|
||||
: "ir" (i)
|
||||
: "cc", "cx");
|
||||
}
|
||||
|
||||
static __always_inline void refcount_inc(refcount_t *r)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "incl %0\n\t"
|
||||
REFCOUNT_CHECK_LT_ZERO
|
||||
: [counter] "+m" (r->refs.counter)
|
||||
: : "cc", "cx");
|
||||
}
|
||||
|
||||
static __always_inline void refcount_dec(refcount_t *r)
|
||||
{
|
||||
asm volatile(LOCK_PREFIX "decl %0\n\t"
|
||||
REFCOUNT_CHECK_LE_ZERO
|
||||
: [counter] "+m" (r->refs.counter)
|
||||
: : "cc", "cx");
|
||||
}
|
||||
|
||||
static __always_inline __must_check
|
||||
bool refcount_sub_and_test(unsigned int i, refcount_t *r)
|
||||
{
|
||||
GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl", REFCOUNT_CHECK_LT_ZERO,
|
||||
r->refs.counter, "er", i, "%0", e);
|
||||
}
|
||||
|
||||
static __always_inline __must_check bool refcount_dec_and_test(refcount_t *r)
|
||||
{
|
||||
GEN_UNARY_SUFFIXED_RMWcc(LOCK_PREFIX "decl", REFCOUNT_CHECK_LT_ZERO,
|
||||
r->refs.counter, "%0", e);
|
||||
}
|
||||
|
||||
static __always_inline __must_check
|
||||
bool refcount_add_not_zero(unsigned int i, refcount_t *r)
|
||||
{
|
||||
int c, result;
|
||||
|
||||
c = atomic_read(&(r->refs));
|
||||
do {
|
||||
if (unlikely(c == 0))
|
||||
return false;
|
||||
|
||||
result = c + i;
|
||||
|
||||
/* Did we try to increment from/to an undesirable state? */
|
||||
if (unlikely(c < 0 || c == INT_MAX || result < c)) {
|
||||
asm volatile(REFCOUNT_ERROR
|
||||
: : [counter] "m" (r->refs.counter)
|
||||
: "cc", "cx");
|
||||
break;
|
||||
}
|
||||
|
||||
} while (!atomic_try_cmpxchg(&(r->refs), &c, result));
|
||||
|
||||
return c != 0;
|
||||
}
|
||||
|
||||
static __always_inline __must_check bool refcount_inc_not_zero(refcount_t *r)
|
||||
{
|
||||
return refcount_add_not_zero(1, r);
|
||||
}
|
||||
|
||||
#endif
|
|
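
The new header above wires refcount_t's fast paths to checked x86 instructions; a saturated counter is parked at INT_MIN/2 by the exception handler further down. A hypothetical user of the API, only to show which operations the header provides (struct and function names are invented):

struct foo {
        refcount_t refs;
        /* ... payload ... */
};

/* Take a reference only if the object is still live. */
static struct foo *foo_get(struct foo *f)
{
        return refcount_inc_not_zero(&f->refs) ? f : NULL;
}

/* Drop a reference; free on the final put. */
static void foo_put(struct foo *f)
{
        if (refcount_dec_and_test(&f->refs))
                kfree(f);
}
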
@ -36,6 +36,48 @@ bool ex_handler_fault(const struct exception_table_entry *fixup,
|
|||
}
|
||||
EXPORT_SYMBOL_GPL(ex_handler_fault);
|
||||
|
||||
/*
|
||||
* Handler for UD0 exception following a failed test against the
|
||||
* result of a refcount inc/dec/add/sub.
|
||||
*/
|
||||
bool ex_handler_refcount(const struct exception_table_entry *fixup,
|
||||
struct pt_regs *regs, int trapnr)
|
||||
{
|
||||
/* First unconditionally saturate the refcount. */
|
||||
*(int *)regs->cx = INT_MIN / 2;
|
||||
|
||||
/*
|
||||
* Strictly speaking, this reports the fixup destination, not
|
||||
* the fault location, and not the actually overflowing
|
||||
* instruction, which is the instruction before the "js", but
|
||||
* since that instruction could be a variety of lengths, just
|
||||
* report the location after the overflow, which should be close
|
||||
* enough for finding the overflow, as it's at least back in
|
||||
* the function, having returned from .text.unlikely.
|
||||
*/
|
||||
regs->ip = ex_fixup_addr(fixup);
|
||||
|
||||
/*
|
||||
* This function has been called because either a negative refcount
|
||||
* value was seen by any of the refcount functions, or a zero
|
||||
* refcount value was seen by refcount_dec().
|
||||
*
|
||||
* If we crossed from INT_MAX to INT_MIN, OF (Overflow Flag: result
|
||||
* wrapped around) will be set. Additionally, seeing the refcount
|
||||
* reach 0 will set ZF (Zero Flag: result was zero). In each of
|
||||
* these cases we want a report, since it's a boundary condition.
|
||||
*
|
||||
*/
|
||||
if (regs->flags & (X86_EFLAGS_OF | X86_EFLAGS_ZF)) {
|
||||
bool zero = regs->flags & X86_EFLAGS_ZF;
|
||||
|
||||
refcount_error_report(regs, zero ? "hit zero" : "overflow");
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(ex_handler_refcount);
|
||||
|
||||
bool ex_handler_ext(const struct exception_table_entry *fixup,
|
||||
struct pt_regs *regs, int trapnr)
|
||||
{
|
||||
|
|
|
@ -44,18 +44,10 @@
|
|||
: "r" (uaddr), "I" (-EFAULT), "r" (oparg) \
|
||||
: "memory")
|
||||
|
||||
static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
static inline int arch_futex_atomic_op_inuser(int op, int oparg, int *oval,
|
||||
u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
#if !XCHAL_HAVE_S32C1I
|
||||
return -ENOSYS;
|
||||
|
@ -89,19 +81,10 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (ret)
|
||||
return ret;
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: return (oldval == cmparg);
|
||||
case FUTEX_OP_CMP_NE: return (oldval != cmparg);
|
||||
case FUTEX_OP_CMP_LT: return (oldval < cmparg);
|
||||
case FUTEX_OP_CMP_GE: return (oldval >= cmparg);
|
||||
case FUTEX_OP_CMP_LE: return (oldval <= cmparg);
|
||||
case FUTEX_OP_CMP_GT: return (oldval > cmparg);
|
||||
}
|
||||
|
||||
return -ENOSYS;
|
||||
return ret;
|
||||
}
|
||||
|
||||
static inline int
|
||||
|
|
|
@ -60,7 +60,7 @@ static void trigger_softirq(void *data)
|
|||
static int raise_blk_irq(int cpu, struct request *rq)
|
||||
{
|
||||
if (cpu_online(cpu)) {
|
||||
struct call_single_data *data = &rq->csd;
|
||||
call_single_data_t *data = &rq->csd;
|
||||
|
||||
data->func = trigger_softirq;
|
||||
data->info = rq;
|
||||
|
|
|
@@ -2884,7 +2884,7 @@ static int acpi_nfit_flush_probe(struct nvdimm_bus_descriptor *nd_desc)
	 * need to be interruptible while waiting.
	 */
	INIT_WORK_ONSTACK(&flush.work, flush_probe);
	COMPLETION_INITIALIZER_ONSTACK(flush.cmp);
	init_completion(&flush.cmp);
	queue_work(nfit_wq, &flush.work);
	mutex_unlock(&acpi_desc->init_mutex);
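
The fix above stops using COMPLETION_INITIALIZER_ONSTACK() as a statement and calls init_completion() instead, which (per the completion.h changes later in this series) is also what sets up the crossrelease lockdep map. A hypothetical on-stack user following the same pattern (names invented):

struct flush_ctx {
        struct work_struct work;
        struct completion cmp;
};

static void example_flush_fn(struct work_struct *work)
{
        struct flush_ctx *ctx = container_of(work, struct flush_ctx, work);

        /* ... do the flush ... */
        complete(&ctx->cmp);
}

static void example_flush_sync(struct workqueue_struct *wq)
{
        struct flush_ctx ctx;

        INIT_WORK_ONSTACK(&ctx.work, example_flush_fn);
        init_completion(&ctx.cmp);      /* picks up the lockdep map when CONFIG_LOCKDEP_COMPLETIONS=y */
        queue_work(wq, &ctx.work);
        wait_for_completion(&ctx.cmp);
        destroy_work_on_stack(&ctx.work);
}
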
@ -13,7 +13,7 @@
|
|||
struct nullb_cmd {
|
||||
struct list_head list;
|
||||
struct llist_node ll_list;
|
||||
struct call_single_data csd;
|
||||
call_single_data_t csd;
|
||||
struct request *rq;
|
||||
struct bio *bio;
|
||||
unsigned int tag;
|
||||
|
|
|
@ -455,7 +455,11 @@ void arch_timer_enable_workaround(const struct arch_timer_erratum_workaround *wa
|
|||
per_cpu(timer_unstable_counter_workaround, i) = wa;
|
||||
}
|
||||
|
||||
static_branch_enable(&arch_timer_read_ool_enabled);
|
||||
/*
|
||||
* Use the locked version, as we're called from the CPU
|
||||
* hotplug framework. Otherwise, we end-up in deadlock-land.
|
||||
*/
|
||||
static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
|
||||
|
||||
/*
|
||||
* Don't use the vdso fastpath if errata require using the
|
||||
|
|
|
@ -119,13 +119,13 @@ struct cpuidle_coupled {
|
|||
|
||||
#define CPUIDLE_COUPLED_NOT_IDLE (-1)
|
||||
|
||||
static DEFINE_PER_CPU(struct call_single_data, cpuidle_coupled_poke_cb);
|
||||
static DEFINE_PER_CPU(call_single_data_t, cpuidle_coupled_poke_cb);
|
||||
|
||||
/*
|
||||
* The cpuidle_coupled_poke_pending mask is used to avoid calling
|
||||
* __smp_call_function_single with the per cpu call_single_data struct already
|
||||
* __smp_call_function_single with the per cpu call_single_data_t struct already
|
||||
* in use. This prevents a deadlock where two cpus are waiting for each others
|
||||
* call_single_data struct to be available
|
||||
* call_single_data_t struct to be available
|
||||
*/
|
||||
static cpumask_t cpuidle_coupled_poke_pending;
|
||||
|
||||
|
@ -339,7 +339,7 @@ static void cpuidle_coupled_handle_poke(void *info)
|
|||
*/
|
||||
static void cpuidle_coupled_poke(int cpu)
|
||||
{
|
||||
struct call_single_data *csd = &per_cpu(cpuidle_coupled_poke_cb, cpu);
|
||||
call_single_data_t *csd = &per_cpu(cpuidle_coupled_poke_cb, cpu);
|
||||
|
||||
if (!cpumask_test_and_set_cpu(cpu, &cpuidle_coupled_poke_pending))
|
||||
smp_call_function_single_async(cpu, csd);
|
||||
|
@ -651,7 +651,7 @@ int cpuidle_coupled_register_device(struct cpuidle_device *dev)
|
|||
{
|
||||
int cpu;
|
||||
struct cpuidle_device *other_dev;
|
||||
struct call_single_data *csd;
|
||||
call_single_data_t *csd;
|
||||
struct cpuidle_coupled *coupled;
|
||||
|
||||
if (cpumask_empty(&dev->coupled_cpus))
|
||||
|
|
|
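
Several call sites in this series switch from struct call_single_data to the call_single_data_t typedef. A hypothetical per-CPU user of the async IPI interface, mirroring the cpuidle-coupled hunks above and the liquidio one further down (names invented):

/* Poke a remote CPU without waiting for it to run the handler. */
static DEFINE_PER_CPU(call_single_data_t, example_csd);

static void example_remote_fn(void *info)
{
        /* runs on the target CPU, in IPI context */
}

static void example_poke_cpu(int cpu)
{
        call_single_data_t *csd = &per_cpu(example_csd, cpu);

        csd->func = example_remote_fn;
        csd->info = NULL;
        /* A real user must not re-post the csd while a previous call is
         * still in flight; see the poke_pending mask in the cpuidle hunk. */
        smp_call_function_single_async(cpu, csd);
}
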
@ -28,6 +28,7 @@
|
|||
|
||||
#include <linux/debugfs.h>
|
||||
#include <linux/sort.h>
|
||||
#include <linux/sched/mm.h>
|
||||
#include "intel_drv.h"
|
||||
|
||||
static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
|
||||
|
@ -4305,7 +4306,7 @@ i915_drop_caches_set(void *data, u64 val)
|
|||
mutex_unlock(&dev->struct_mutex);
|
||||
}
|
||||
|
||||
lockdep_set_current_reclaim_state(GFP_KERNEL);
|
||||
fs_reclaim_acquire(GFP_KERNEL);
|
||||
if (val & DROP_BOUND)
|
||||
i915_gem_shrink(dev_priv, LONG_MAX, I915_SHRINK_BOUND);
|
||||
|
||||
|
@ -4314,7 +4315,7 @@ i915_drop_caches_set(void *data, u64 val)
|
|||
|
||||
if (val & DROP_SHRINK_ALL)
|
||||
i915_gem_shrink_all(dev_priv);
|
||||
lockdep_clear_current_reclaim_state();
|
||||
fs_reclaim_release(GFP_KERNEL);
|
||||
|
||||
if (val & DROP_FREED) {
|
||||
synchronize_rcu();
|
||||
|
|
|
@ -2468,7 +2468,7 @@ static void liquidio_napi_drv_callback(void *arg)
|
|||
if (OCTEON_CN23XX_PF(oct) || droq->cpu_id == this_cpu) {
|
||||
napi_schedule_irqoff(&droq->napi);
|
||||
} else {
|
||||
struct call_single_data *csd = &droq->csd;
|
||||
call_single_data_t *csd = &droq->csd;
|
||||
|
||||
csd->func = napi_schedule_wrapper;
|
||||
csd->info = &droq->napi;
|
||||
|
|
|
@ -328,7 +328,7 @@ struct octeon_droq {
|
|||
|
||||
u32 cpu_id;
|
||||
|
||||
struct call_single_data csd;
|
||||
call_single_data_t csd;
|
||||
};
|
||||
|
||||
#define OCT_DROQ_SIZE (sizeof(struct octeon_droq))
|
||||
|
|
|
@@ -446,14 +446,14 @@ static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end,

		ovl_path_upper(dentry, &upperpath);
		realfile = ovl_path_open(&upperpath, O_RDONLY);
		smp_mb__before_spinlock();

		inode_lock(inode);
		if (!od->upperfile) {
			if (IS_ERR(realfile)) {
				inode_unlock(inode);
				return PTR_ERR(realfile);
			}
			od->upperfile = realfile;
			smp_store_release(&od->upperfile, realfile);
		} else {
			/* somebody has beaten us to it */
			if (!IS_ERR(realfile))
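
The overlayfs change publishes the just-initialized file with a release store instead of relying on a barrier before the lock. A minimal sketch of the publish/consume pairing this depends on (types and names are placeholders):

struct dir_state {
        struct file *upperfile;
};

/* Writer: everything done to 'realfile' before this store is visible to
 * readers that observe the non-NULL pointer. */
static void publish_upperfile(struct dir_state *od, struct file *realfile)
{
        smp_store_release(&od->upperfile, realfile);
}

/* Reader: the acquire load pairs with the release store above. */
static struct file *get_upperfile(struct dir_state *od)
{
        return smp_load_acquire(&od->upperfile);
}
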
@@ -109,27 +109,24 @@ static int userfaultfd_wake_function(wait_queue_entry_t *wq, unsigned mode,
		goto out;
	WRITE_ONCE(uwq->waken, true);
	/*
	 * The implicit smp_mb__before_spinlock in try_to_wake_up()
	 * renders uwq->waken visible to other CPUs before the task is
	 * waken.
	 * The Program-Order guarantees provided by the scheduler
	 * ensure uwq->waken is visible before the task is woken.
	 */
	ret = wake_up_state(wq->private, mode);
	if (ret)
	if (ret) {
		/*
		 * Wake only once, autoremove behavior.
		 *
		 * After the effect of list_del_init is visible to the
		 * other CPUs, the waitqueue may disappear from under
		 * us, see the !list_empty_careful() in
		 * handle_userfault(). try_to_wake_up() has an
		 * implicit smp_mb__before_spinlock, and the
		 * wq->private is read before calling the extern
		 * function "wake_up_state" (which in turns calls
		 * try_to_wake_up). While the spin_lock;spin_unlock;
		 * wouldn't be enough, the smp_mb__before_spinlock is
		 * enough to avoid an explicit smp_mb() here.
		 * After the effect of list_del_init is visible to the other
		 * CPUs, the waitqueue may disappear from under us, see the
		 * !list_empty_careful() in handle_userfault().
		 *
		 * try_to_wake_up() has an implicit smp_mb(), and the
		 * wq->private is read before calling the extern function
		 * "wake_up_state" (which in turns calls try_to_wake_up).
		 */
		list_del_init(&wq->entry);
	}
out:
	return ret;
}
@ -21,6 +21,8 @@ typedef struct {
|
|||
extern long long atomic64_read(const atomic64_t *v);
|
||||
extern void atomic64_set(atomic64_t *v, long long i);
|
||||
|
||||
#define atomic64_set_release(v, i) atomic64_set((v), (i))
|
||||
|
||||
#define ATOMIC64_OP(op) \
|
||||
extern void atomic64_##op(long long a, atomic64_t *v);
|
||||
|
||||
|
|
|
@ -13,7 +13,7 @@
|
|||
*/
|
||||
|
||||
/**
|
||||
* futex_atomic_op_inuser() - Atomic arithmetic operation with constant
|
||||
* arch_futex_atomic_op_inuser() - Atomic arithmetic operation with constant
|
||||
* argument and comparison of the previous
|
||||
* futex value with another constant.
|
||||
*
|
||||
|
@ -25,18 +25,11 @@
|
|||
* <0 - On error
|
||||
*/
|
||||
static inline int
|
||||
futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval, ret;
|
||||
u32 tmp;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
preempt_disable();
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -74,17 +67,9 @@ futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
|
|||
pagefault_enable();
|
||||
preempt_enable();
|
||||
|
||||
if (ret == 0) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (ret == 0)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
@ -126,18 +111,9 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
|
|||
|
||||
#else
|
||||
static inline int
|
||||
futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
||||
arch_futex_atomic_op_inuser(int op, u32 oparg, int *oval, u32 __user *uaddr)
|
||||
{
|
||||
int op = (encoded_op >> 28) & 7;
|
||||
int cmp = (encoded_op >> 24) & 15;
|
||||
int oparg = (encoded_op << 8) >> 20;
|
||||
int cmparg = (encoded_op << 20) >> 20;
|
||||
int oldval = 0, ret;
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
|
||||
oparg = 1 << oparg;
|
||||
|
||||
if (! access_ok (VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
pagefault_disable();
|
||||
|
||||
|
@ -153,17 +129,9 @@ futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
|
|||
|
||||
pagefault_enable();
|
||||
|
||||
if (!ret) {
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
|
||||
case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
|
||||
case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
|
||||
case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
|
||||
case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
|
||||
case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
|
||||
default: ret = -ENOSYS;
|
||||
}
|
||||
}
|
||||
if (!ret)
|
||||
*oval = oldval;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
|
|
@@ -38,6 +38,9 @@
 * Besides, if an arch has a special barrier for acquire/release, it could
 * implement its own __atomic_op_* and use the same framework for building
 * variants
 *
 * If an architecture overrides __atomic_op_acquire() it will probably want
 * to define smp_mb__after_spinlock().
 */
#ifndef __atomic_op_acquire
#define __atomic_op_acquire(op, args...)				\
@ -134,7 +134,7 @@ typedef __u32 __bitwise req_flags_t;
|
|||
struct request {
|
||||
struct list_head queuelist;
|
||||
union {
|
||||
struct call_single_data csd;
|
||||
call_single_data_t csd;
|
||||
u64 fifo_time;
|
||||
};
|
||||
|
||||
|
|
|
@ -9,6 +9,9 @@
|
|||
*/
|
||||
|
||||
#include <linux/wait.h>
|
||||
#ifdef CONFIG_LOCKDEP_COMPLETIONS
|
||||
#include <linux/lockdep.h>
|
||||
#endif
|
||||
|
||||
/*
|
||||
* struct completion - structure used to maintain state for a "completion"
|
||||
|
@ -25,13 +28,53 @@
|
|||
struct completion {
|
||||
unsigned int done;
|
||||
wait_queue_head_t wait;
|
||||
#ifdef CONFIG_LOCKDEP_COMPLETIONS
|
||||
struct lockdep_map_cross map;
|
||||
#endif
|
||||
};
|
||||
|
||||
#ifdef CONFIG_LOCKDEP_COMPLETIONS
|
||||
static inline void complete_acquire(struct completion *x)
|
||||
{
|
||||
lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
|
||||
}
|
||||
|
||||
static inline void complete_release(struct completion *x)
|
||||
{
|
||||
lock_release((struct lockdep_map *)&x->map, 0, _RET_IP_);
|
||||
}
|
||||
|
||||
static inline void complete_release_commit(struct completion *x)
|
||||
{
|
||||
lock_commit_crosslock((struct lockdep_map *)&x->map);
|
||||
}
|
||||
|
||||
#define init_completion(x) \
|
||||
do { \
|
||||
static struct lock_class_key __key; \
|
||||
lockdep_init_map_crosslock((struct lockdep_map *)&(x)->map, \
|
||||
"(complete)" #x, \
|
||||
&__key, 0); \
|
||||
__init_completion(x); \
|
||||
} while (0)
|
||||
#else
|
||||
#define init_completion(x) __init_completion(x)
|
||||
static inline void complete_acquire(struct completion *x) {}
|
||||
static inline void complete_release(struct completion *x) {}
|
||||
static inline void complete_release_commit(struct completion *x) {}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_LOCKDEP_COMPLETIONS
|
||||
#define COMPLETION_INITIALIZER(work) \
|
||||
{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
|
||||
STATIC_CROSS_LOCKDEP_MAP_INIT("(complete)" #work, &(work)) }
|
||||
#else
|
||||
#define COMPLETION_INITIALIZER(work) \
|
||||
{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
|
||||
#endif
|
||||
|
||||
#define COMPLETION_INITIALIZER_ONSTACK(work) \
|
||||
({ init_completion(&work); work; })
|
||||
(*({ init_completion(&work); &work; }))
|
||||
|
||||
/**
|
||||
* DECLARE_COMPLETION - declare and initialize a completion structure
|
||||
|
@ -70,7 +113,7 @@ struct completion {
|
|||
* This inline function will initialize a dynamically created completion
|
||||
* structure.
|
||||
*/
|
||||
static inline void init_completion(struct completion *x)
|
||||
static inline void __init_completion(struct completion *x)
|
||||
{
|
||||
x->done = 0;
|
||||
init_waitqueue_head(&x->wait);
|
||||
|
|
|
@@ -37,12 +37,6 @@ static inline bool cpusets_enabled(void)
	return static_branch_unlikely(&cpusets_enabled_key);
}

static inline int nr_cpusets(void)
{
	/* jump label reference count + the top-level cpuset */
	return static_key_count(&cpusets_enabled_key.key) + 1;
}

static inline void cpuset_inc(void)
{
	static_branch_inc(&cpusets_pre_enable_key);
@ -54,7 +54,6 @@ union futex_key {
|
|||
|
||||
#ifdef CONFIG_FUTEX
|
||||
extern void exit_robust_list(struct task_struct *curr);
|
||||
extern void exit_pi_state_list(struct task_struct *curr);
|
||||
#ifdef CONFIG_HAVE_FUTEX_CMPXCHG
|
||||
#define futex_cmpxchg_enabled 1
|
||||
#else
|
||||
|
@ -64,8 +63,14 @@ extern int futex_cmpxchg_enabled;
|
|||
static inline void exit_robust_list(struct task_struct *curr)
|
||||
{
|
||||
}
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_FUTEX_PI
|
||||
extern void exit_pi_state_list(struct task_struct *curr);
|
||||
#else
|
||||
static inline void exit_pi_state_list(struct task_struct *curr)
|
||||
{
|
||||
}
|
||||
#endif
|
||||
|
||||
#endif
|
||||
|
|
|
@ -23,10 +23,26 @@
|
|||
# define trace_softirq_context(p) ((p)->softirq_context)
|
||||
# define trace_hardirqs_enabled(p) ((p)->hardirqs_enabled)
|
||||
# define trace_softirqs_enabled(p) ((p)->softirqs_enabled)
|
||||
# define trace_hardirq_enter() do { current->hardirq_context++; } while (0)
|
||||
# define trace_hardirq_exit() do { current->hardirq_context--; } while (0)
|
||||
# define lockdep_softirq_enter() do { current->softirq_context++; } while (0)
|
||||
# define lockdep_softirq_exit() do { current->softirq_context--; } while (0)
|
||||
# define trace_hardirq_enter() \
|
||||
do { \
|
||||
current->hardirq_context++; \
|
||||
crossrelease_hist_start(XHLOCK_HARD); \
|
||||
} while (0)
|
||||
# define trace_hardirq_exit() \
|
||||
do { \
|
||||
current->hardirq_context--; \
|
||||
crossrelease_hist_end(XHLOCK_HARD); \
|
||||
} while (0)
|
||||
# define lockdep_softirq_enter() \
|
||||
do { \
|
||||
current->softirq_context++; \
|
||||
crossrelease_hist_start(XHLOCK_SOFT); \
|
||||
} while (0)
|
||||
# define lockdep_softirq_exit() \
|
||||
do { \
|
||||
current->softirq_context--; \
|
||||
crossrelease_hist_end(XHLOCK_SOFT); \
|
||||
} while (0)
|
||||
# define INIT_TRACE_IRQFLAGS .softirqs_enabled = 1,
|
||||
#else
|
||||
# define trace_hardirqs_on() do { } while (0)
|
||||
|
|
|
@ -163,6 +163,8 @@ extern void jump_label_apply_nops(struct module *mod);
|
|||
extern int static_key_count(struct static_key *key);
|
||||
extern void static_key_enable(struct static_key *key);
|
||||
extern void static_key_disable(struct static_key *key);
|
||||
extern void static_key_enable_cpuslocked(struct static_key *key);
|
||||
extern void static_key_disable_cpuslocked(struct static_key *key);
|
||||
|
||||
/*
|
||||
* We should be using ATOMIC_INIT() for initializing .enabled, but
|
||||
|
@ -234,24 +236,29 @@ static inline int jump_label_apply_nops(struct module *mod)
|
|||
|
||||
static inline void static_key_enable(struct static_key *key)
|
||||
{
|
||||
int count = static_key_count(key);
|
||||
STATIC_KEY_CHECK_USE();
|
||||
|
||||
WARN_ON_ONCE(count < 0 || count > 1);
|
||||
|
||||
if (!count)
|
||||
static_key_slow_inc(key);
|
||||
if (atomic_read(&key->enabled) != 0) {
|
||||
WARN_ON_ONCE(atomic_read(&key->enabled) != 1);
|
||||
return;
|
||||
}
|
||||
atomic_set(&key->enabled, 1);
|
||||
}
|
||||
|
||||
static inline void static_key_disable(struct static_key *key)
|
||||
{
|
||||
int count = static_key_count(key);
|
||||
STATIC_KEY_CHECK_USE();
|
||||
|
||||
WARN_ON_ONCE(count < 0 || count > 1);
|
||||
|
||||
if (count)
|
||||
static_key_slow_dec(key);
|
||||
if (atomic_read(&key->enabled) != 1) {
|
||||
WARN_ON_ONCE(atomic_read(&key->enabled) != 0);
|
||||
return;
|
||||
}
|
||||
atomic_set(&key->enabled, 0);
|
||||
}
|
||||
|
||||
#define static_key_enable_cpuslocked(k) static_key_enable((k))
|
||||
#define static_key_disable_cpuslocked(k) static_key_disable((k))
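Aside: with !HAVE_JUMP_LABEL, the fallback above reduces static_key_enable()/static_key_disable() to a guarded 0 <-> 1 flip of key->enabled that is idempotent and warns on any other count. A minimal userspace sketch of just that semantic, assuming C11 atomics as a stand-in for atomic_t (demo_key and the fprintf "warning" are illustrative, not kernel code):

#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

struct demo_key { atomic_int enabled; };

static void demo_key_enable(struct demo_key *key)
{
        int v = atomic_load(&key->enabled);

        if (v != 0) {
                if (v != 1)                     /* stands in for WARN_ON_ONCE() */
                        fprintf(stderr, "unexpected count %d\n", v);
                return;                         /* already enabled: no-op */
        }
        atomic_store(&key->enabled, 1);
}

static void demo_key_disable(struct demo_key *key)
{
        int v = atomic_load(&key->enabled);

        if (v != 1) {
                if (v != 0)
                        fprintf(stderr, "unexpected count %d\n", v);
                return;                         /* already disabled: no-op */
        }
        atomic_store(&key->enabled, 0);
}

int main(void)
{
        struct demo_key key = { 0 };

        demo_key_enable(&key);
        demo_key_enable(&key);                  /* second call is a no-op */
        assert(atomic_load(&key.enabled) == 1);
        demo_key_disable(&key);
        assert(atomic_load(&key.enabled) == 0);
        return 0;
}

The visible change in this hunk is exactly that shift: the old count-based calls to static_key_slow_inc()/static_key_slow_dec() are replaced by direct reads and writes of key->enabled.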
|
||||
|
||||
#define STATIC_KEY_INIT_TRUE { .enabled = ATOMIC_INIT(1) }
|
||||
#define STATIC_KEY_INIT_FALSE { .enabled = ATOMIC_INIT(0) }
|
||||
|
||||
|
@ -413,8 +420,10 @@ extern bool ____wrong_branch_error(void);
|
|||
* Normal usage; boolean enable/disable.
|
||||
*/
|
||||
|
||||
#define static_branch_enable(x) static_key_enable(&(x)->key)
|
||||
#define static_branch_disable(x) static_key_disable(&(x)->key)
|
||||
#define static_branch_enable(x) static_key_enable(&(x)->key)
|
||||
#define static_branch_disable(x) static_key_disable(&(x)->key)
|
||||
#define static_branch_enable_cpuslocked(x) static_key_enable_cpuslocked(&(x)->key)
|
||||
#define static_branch_disable_cpuslocked(x) static_key_disable_cpuslocked(&(x)->key)
|
||||
|
||||
#endif /* __ASSEMBLY__ */
|
||||
|
||||
|
|
|
@ -2,11 +2,13 @@
|
|||
#define _LINUX_KASAN_CHECKS_H
|
||||
|
||||
#ifdef CONFIG_KASAN
|
||||
void kasan_check_read(const void *p, unsigned int size);
|
||||
void kasan_check_write(const void *p, unsigned int size);
|
||||
void kasan_check_read(const volatile void *p, unsigned int size);
|
||||
void kasan_check_write(const volatile void *p, unsigned int size);
|
||||
#else
|
||||
static inline void kasan_check_read(const void *p, unsigned int size) { }
|
||||
static inline void kasan_check_write(const void *p, unsigned int size) { }
|
||||
static inline void kasan_check_read(const volatile void *p, unsigned int size)
|
||||
{ }
|
||||
static inline void kasan_check_write(const volatile void *p, unsigned int size)
|
||||
{ }
|
||||
#endif
|
||||
|
||||
#endif
|
||||
|
|
|
@ -277,6 +277,13 @@ extern int oops_may_print(void);
|
|||
void do_exit(long error_code) __noreturn;
|
||||
void complete_and_exit(struct completion *, long) __noreturn;
|
||||
|
||||
#ifdef CONFIG_ARCH_HAS_REFCOUNT
|
||||
void refcount_error_report(struct pt_regs *regs, const char *err);
|
||||
#else
|
||||
static inline void refcount_error_report(struct pt_regs *regs, const char *err)
|
||||
{ }
|
||||
#endif
|
||||
|
||||
/* Internal, do not use. */
|
||||
int __must_check _kstrtoul(const char *s, unsigned int base, unsigned long *res);
|
||||
int __must_check _kstrtol(const char *s, unsigned int base, long *res);
|
||||
|
|
|
@ -18,6 +18,8 @@ extern int lock_stat;
|
|||
|
||||
#define MAX_LOCKDEP_SUBCLASSES 8UL
|
||||
|
||||
#include <linux/types.h>
|
||||
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
|
||||
#include <linux/linkage.h>
|
||||
|
@ -29,7 +31,7 @@ extern int lock_stat;
|
|||
* We'd rather not expose kernel/lockdep_states.h this wide, but we do need
|
||||
* the total number of states... :-(
|
||||
*/
|
||||
#define XXX_LOCK_USAGE_STATES (1+3*4)
|
||||
#define XXX_LOCK_USAGE_STATES (1+2*4)
|
||||
|
||||
/*
|
||||
* NR_LOCKDEP_CACHING_CLASSES ... Number of classes
|
||||
|
@ -155,6 +157,12 @@ struct lockdep_map {
|
|||
int cpu;
|
||||
unsigned long ip;
|
||||
#endif
|
||||
#ifdef CONFIG_LOCKDEP_CROSSRELEASE
|
||||
/*
|
||||
* Whether it's a crosslock.
|
||||
*/
|
||||
int cross;
|
||||
#endif
|
||||
};
|
||||
|
||||
static inline void lockdep_copy_map(struct lockdep_map *to,
|
||||
|
@ -258,8 +266,95 @@ struct held_lock {
|
|||
unsigned int hardirqs_off:1;
|
||||
unsigned int references:12; /* 32 bits */
|
||||
unsigned int pin_count;
|
||||
#ifdef CONFIG_LOCKDEP_CROSSRELEASE
|
||||
/*
|
||||
* Generation id.
|
||||
*
|
||||
* A value of cross_gen_id will be stored when holding this,
|
||||
* which is globally increased whenever each crosslock is held.
|
||||
*/
|
||||
unsigned int gen_id;
|
||||
#endif
|
||||
};
|
||||
|
||||
#ifdef CONFIG_LOCKDEP_CROSSRELEASE
|
||||
#define MAX_XHLOCK_TRACE_ENTRIES 5
|
||||
|
||||
/*
|
||||
* This is for keeping locks waiting for commit so that true dependencies
|
||||
* can be added at commit step.
|
||||
*/
|
||||
struct hist_lock {
|
||||
/*
|
||||
* Id for each entry in the ring buffer. This is used to
|
||||
* decide whether the ring buffer was overwritten or not.
|
||||
*
|
||||
* For example,
|
||||
*
|
||||
* |<----------- hist_lock ring buffer size ------->|
|
||||
* pppppppppppppppppppppiiiiiiiiiiiiiiiiiiiiiiiiiiiii
|
||||
* wrapped > iiiiiiiiiiiiiiiiiiiiiiiiiii.......................
|
||||
*
|
||||
* where 'p' represents an acquisition in process
|
||||
* context, 'i' represents an acquisition in irq
|
||||
* context.
|
||||
*
|
||||
* In this example, the ring buffer was overwritten by
|
||||
* acquisitions in irq context, that should be detected on
|
||||
* rollback or commit.
|
||||
*/
|
||||
unsigned int hist_id;
|
||||
|
||||
/*
|
||||
* Separate stack_trace data. This will be used at the commit step.
|
||||
*/
|
||||
struct stack_trace trace;
|
||||
unsigned long trace_entries[MAX_XHLOCK_TRACE_ENTRIES];
|
||||
|
||||
/*
|
||||
* Separate hlock instance. This will be used at the commit step.
|
||||
*
|
||||
* TODO: Use a smaller data structure containing only necessary
|
||||
* data. However, we should make lockdep code able to handle the
|
||||
* smaller one first.
|
||||
*/
|
||||
struct held_lock hlock;
|
||||
};
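The hist_id scheme above boils down to: each entry pushed into the ring gets a monotonically increasing id, and a saved position can be trusted only while the slot at that index still carries the id that was remembered. A hypothetical userspace sketch of that check alone (ring size, names and main() are illustrative, not the lockdep implementation):

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define RING_SIZE 8                             /* MAX_XHLOCKS_NR is 64 in the patch */

struct entry { unsigned int hist_id; };

static struct entry ring[RING_SIZE];
static unsigned int idx;                        /* next slot to fill */
static unsigned int hist_id;                    /* monotonically increasing tag */

static void push(void)
{
        ring[idx % RING_SIZE].hist_id = hist_id++;
        idx++;
}

/* True if the entry remembered as (saved_idx, saved_id) has been reused. */
static bool overwritten(unsigned int saved_idx, unsigned int saved_id)
{
        return ring[saved_idx % RING_SIZE].hist_id != saved_id;
}

int main(void)
{
        unsigned int saved_idx, saved_id;

        push();
        saved_idx = idx - 1;
        saved_id  = ring[saved_idx % RING_SIZE].hist_id;

        for (int i = 0; i < RING_SIZE - 1; i++) /* fill up, no wrap yet */
                push();
        assert(!overwritten(saved_idx, saved_id));

        push();                                 /* wraps onto the saved slot */
        assert(overwritten(saved_idx, saved_id));
        puts("overwrite detected after wrap");
        return 0;
}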
|
||||
|
||||
/*
|
||||
* To initialize a lock as crosslock, lockdep_init_map_crosslock() should
|
||||
* be called instead of lockdep_init_map().
|
||||
*/
|
||||
struct cross_lock {
|
||||
/*
|
||||
* When more than one acquisition of crosslocks are overlapped,
|
||||
* we have to perform commit for them based on cross_gen_id of
|
||||
* the first acquisition, which allows us to add more true
|
||||
* dependencies.
|
||||
*
|
||||
* Moreover, when no acquisition of a crosslock is in progress,
|
||||
* we should not perform commit because the lock might not exist
|
||||
* any more, which might cause incorrect memory access. So we
|
||||
* have to track the number of acquisitions of a crosslock.
|
||||
*/
|
||||
int nr_acquire;
|
||||
|
||||
/*
|
||||
* Separate hlock instance. This will be used at the commit step.
|
||||
*
|
||||
* TODO: Use a smaller data structure containing only necessary
|
||||
* data. However, we should make lockdep code able to handle the
|
||||
* smaller one first.
|
||||
*/
|
||||
struct held_lock hlock;
|
||||
};
|
||||
|
||||
struct lockdep_map_cross {
|
||||
struct lockdep_map map;
|
||||
struct cross_lock xlock;
|
||||
};
|
||||
#endif
|
||||
|
||||
/*
|
||||
* Initialization, self-test and debugging-output methods:
|
||||
*/
|
||||
|
@ -281,13 +376,6 @@ extern void lockdep_on(void);
|
|||
extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
|
||||
struct lock_class_key *key, int subclass);
|
||||
|
||||
/*
|
||||
* To initialize a lockdep_map statically use this macro.
|
||||
* Note that _name must not be NULL.
|
||||
*/
|
||||
#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
|
||||
{ .name = (_name), .key = (void *)(_key), }
|
||||
|
||||
/*
|
||||
* Reinitialize a lock key - for cases where there is special locking or
|
||||
* special initialization of locks so that the validator gets the scope
|
||||
|
@ -363,10 +451,6 @@ static inline void lock_set_subclass(struct lockdep_map *lock,
|
|||
|
||||
extern void lock_downgrade(struct lockdep_map *lock, unsigned long ip);
|
||||
|
||||
extern void lockdep_set_current_reclaim_state(gfp_t gfp_mask);
|
||||
extern void lockdep_clear_current_reclaim_state(void);
|
||||
extern void lockdep_trace_alloc(gfp_t mask);
|
||||
|
||||
struct pin_cookie { unsigned int val; };
|
||||
|
||||
#define NIL_COOKIE (struct pin_cookie){ .val = 0U, }
|
||||
|
@ -375,7 +459,7 @@ extern struct pin_cookie lock_pin_lock(struct lockdep_map *lock);
|
|||
extern void lock_repin_lock(struct lockdep_map *lock, struct pin_cookie);
|
||||
extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
|
||||
|
||||
# define INIT_LOCKDEP .lockdep_recursion = 0, .lockdep_reclaim_gfp = 0,
|
||||
# define INIT_LOCKDEP .lockdep_recursion = 0,
|
||||
|
||||
#define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
|
||||
|
||||
|
@ -416,9 +500,6 @@ static inline void lockdep_on(void)
|
|||
# define lock_downgrade(l, i) do { } while (0)
|
||||
# define lock_set_class(l, n, k, s, i) do { } while (0)
|
||||
# define lock_set_subclass(l, s, i) do { } while (0)
|
||||
# define lockdep_set_current_reclaim_state(g) do { } while (0)
|
||||
# define lockdep_clear_current_reclaim_state() do { } while (0)
|
||||
# define lockdep_trace_alloc(g) do { } while (0)
|
||||
# define lockdep_info() do { } while (0)
|
||||
# define lockdep_init_map(lock, name, key, sub) \
|
||||
do { (void)(name); (void)(key); } while (0)
|
||||
|
@ -467,6 +548,58 @@ struct pin_cookie { };
|
|||
|
||||
#endif /* !LOCKDEP */
|
||||
|
||||
enum xhlock_context_t {
|
||||
XHLOCK_HARD,
|
||||
XHLOCK_SOFT,
|
||||
XHLOCK_CTX_NR,
|
||||
};
|
||||
|
||||
#ifdef CONFIG_LOCKDEP_CROSSRELEASE
|
||||
extern void lockdep_init_map_crosslock(struct lockdep_map *lock,
|
||||
const char *name,
|
||||
struct lock_class_key *key,
|
||||
int subclass);
|
||||
extern void lock_commit_crosslock(struct lockdep_map *lock);
|
||||
|
||||
/*
|
||||
* What we essentially have to initialize is 'nr_acquire'. Other members
|
||||
* will be initialized in add_xlock().
|
||||
*/
|
||||
#define STATIC_CROSS_LOCK_INIT() \
|
||||
{ .nr_acquire = 0,}
|
||||
|
||||
#define STATIC_CROSS_LOCKDEP_MAP_INIT(_name, _key) \
|
||||
{ .map.name = (_name), .map.key = (void *)(_key), \
|
||||
.map.cross = 1, .xlock = STATIC_CROSS_LOCK_INIT(), }
|
||||
|
||||
/*
|
||||
* To initialize a lockdep_map statically use this macro.
|
||||
* Note that _name must not be NULL.
|
||||
*/
|
||||
#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
|
||||
{ .name = (_name), .key = (void *)(_key), .cross = 0, }
|
||||
|
||||
extern void crossrelease_hist_start(enum xhlock_context_t c);
|
||||
extern void crossrelease_hist_end(enum xhlock_context_t c);
|
||||
extern void lockdep_invariant_state(bool force);
|
||||
extern void lockdep_init_task(struct task_struct *task);
|
||||
extern void lockdep_free_task(struct task_struct *task);
|
||||
#else /* !CROSSRELEASE */
|
||||
#define lockdep_init_map_crosslock(m, n, k, s) do {} while (0)
|
||||
/*
|
||||
* To initialize a lockdep_map statically use this macro.
|
||||
* Note that _name must not be NULL.
|
||||
*/
|
||||
#define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
|
||||
{ .name = (_name), .key = (void *)(_key), }
|
||||
|
||||
static inline void crossrelease_hist_start(enum xhlock_context_t c) {}
|
||||
static inline void crossrelease_hist_end(enum xhlock_context_t c) {}
|
||||
static inline void lockdep_invariant_state(bool force) {}
|
||||
static inline void lockdep_init_task(struct task_struct *task) {}
|
||||
static inline void lockdep_free_task(struct task_struct *task) {}
|
||||
#endif /* CROSSRELEASE */
|
||||
|
||||
#ifdef CONFIG_LOCK_STAT
|
||||
|
||||
extern void lock_contended(struct lockdep_map *lock, unsigned long ip);
|
||||
|
|
|
@ -526,26 +526,6 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
|
|||
extern void tlb_finish_mmu(struct mmu_gather *tlb,
|
||||
unsigned long start, unsigned long end);
|
||||
|
||||
/*
|
||||
* Memory barriers to keep this state in sync are graciously provided by
|
||||
* the page table locks, outside of which no page table modifications happen.
|
||||
* The barriers are used to ensure the order between tlb_flush_pending updates,
|
||||
* which happen while the lock is not taken, and the PTE updates, which happen
|
||||
* while the lock is taken, are serialized.
|
||||
*/
|
||||
static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
|
||||
{
|
||||
return atomic_read(&mm->tlb_flush_pending) > 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* Returns true if there are two above TLB batching threads in parallel.
|
||||
*/
|
||||
static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
|
||||
{
|
||||
return atomic_read(&mm->tlb_flush_pending) > 1;
|
||||
}
|
||||
|
||||
static inline void init_tlb_flush_pending(struct mm_struct *mm)
|
||||
{
|
||||
atomic_set(&mm->tlb_flush_pending, 0);
|
||||
|
@ -554,27 +534,82 @@ static inline void init_tlb_flush_pending(struct mm_struct *mm)
|
|||
static inline void inc_tlb_flush_pending(struct mm_struct *mm)
|
||||
{
|
||||
atomic_inc(&mm->tlb_flush_pending);
|
||||
|
||||
/*
|
||||
* Guarantee that the tlb_flush_pending increase does not leak into the
|
||||
* critical section updating the page tables
|
||||
* The only time this value is relevant is when there are indeed pages
|
||||
* to flush. And we'll only flush pages after changing them, which
|
||||
* requires the PTL.
|
||||
*
|
||||
* So the ordering here is:
|
||||
*
|
||||
* atomic_inc(&mm->tlb_flush_pending);
|
||||
* spin_lock(&ptl);
|
||||
* ...
|
||||
* set_pte_at();
|
||||
* spin_unlock(&ptl);
|
||||
*
|
||||
* spin_lock(&ptl)
|
||||
* mm_tlb_flush_pending();
|
||||
* ....
|
||||
* spin_unlock(&ptl);
|
||||
*
|
||||
* flush_tlb_range();
|
||||
* atomic_dec(&mm->tlb_flush_pending);
|
||||
*
|
||||
* Where the increment is constrained by the PTL unlock, it thus
|
||||
* ensures that the increment is visible if the PTE modification is
|
||||
* visible. After all, if there is no PTE modification, nobody cares
|
||||
* about TLB flushes either.
|
||||
*
|
||||
* This very much relies on users (mm_tlb_flush_pending() and
|
||||
* mm_tlb_flush_nested()) only caring about _specific_ PTEs (and
|
||||
* therefore specific PTLs), because with SPLIT_PTE_PTLOCKS and RCpc
|
||||
* locks (PPC) the unlock of one doesn't order against the lock of
|
||||
* another PTL.
|
||||
*
|
||||
* The decrement is ordered by the flush_tlb_range(), such that
|
||||
* mm_tlb_flush_pending() will not return false unless all flushes have
|
||||
* completed.
|
||||
*/
|
||||
smp_mb__before_spinlock();
|
||||
}
|
||||
|
||||
/* Clearing is done after a TLB flush, which also provides a barrier. */
|
||||
static inline void dec_tlb_flush_pending(struct mm_struct *mm)
|
||||
{
|
||||
/*
|
||||
* Guarantee that the tlb_flush_pending does not leak into the
|
||||
* critical section, since we must order the PTE change and changes to
|
||||
* the pending TLB flush indication. We could have relied on TLB flush
|
||||
* as a memory barrier, but this behavior is not clearly documented.
|
||||
* See inc_tlb_flush_pending().
|
||||
*
|
||||
* This cannot be smp_mb__before_atomic() because smp_mb() simply does
|
||||
* not order against TLB invalidate completion, which is what we need.
|
||||
*
|
||||
* Therefore we must rely on tlb_flush_*() to guarantee order.
|
||||
*/
|
||||
smp_mb__before_atomic();
|
||||
atomic_dec(&mm->tlb_flush_pending);
|
||||
}
|
||||
|
||||
static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
|
||||
{
|
||||
/*
|
||||
* Must be called after having acquired the PTL; orders against that
|
||||
* PTLs release and therefore ensures that if we observe the modified
|
||||
* PTE we must also observe the increment from inc_tlb_flush_pending().
|
||||
*
|
||||
* That is, it only guarantees to return true if there is a flush
|
||||
* pending for _this_ PTL.
|
||||
*/
|
||||
return atomic_read(&mm->tlb_flush_pending);
|
||||
}
|
||||
|
||||
static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
|
||||
{
|
||||
/*
|
||||
* Similar to mm_tlb_flush_pending(), we must have acquired the PTL
|
||||
* for which there is a TLB flush pending in order to guarantee
|
||||
* we've seen both that PTE modification and the increment.
|
||||
*
|
||||
* (no requirement on actually still holding the PTL, that is irrelevant)
|
||||
*/
|
||||
return atomic_read(&mm->tlb_flush_pending) > 1;
|
||||
}
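Ignoring the barrier arguments spelled out in the comments above, the counting scheme itself is small: an atomic counter is incremented and decremented around each TLB batching operation, "pending" means it is above zero and "nested" means at least two batchers raced. A userspace sketch of only that counting, assuming C11 atomics; the names mirror the kernel helpers but drop the mm argument, and none of the PTL ordering guarantees are modelled:

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int tlb_flush_pending;

static void inc_tlb_flush_pending(void) { atomic_fetch_add(&tlb_flush_pending, 1); }
static void dec_tlb_flush_pending(void) { atomic_fetch_sub(&tlb_flush_pending, 1); }
static bool mm_tlb_flush_pending(void)  { return atomic_load(&tlb_flush_pending) > 0; }
static bool mm_tlb_flush_nested(void)   { return atomic_load(&tlb_flush_pending) > 1; }

int main(void)
{
        inc_tlb_flush_pending();                /* first batching thread */
        assert(mm_tlb_flush_pending() && !mm_tlb_flush_nested());

        inc_tlb_flush_pending();                /* a second, concurrent one */
        assert(mm_tlb_flush_nested());

        dec_tlb_flush_pending();
        dec_tlb_flush_pending();
        assert(!mm_tlb_flush_pending());
        return 0;
}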
|
||||
|
||||
struct vm_fault;
|
||||
|
||||
struct vm_special_mapping {
|
||||
|
|
|
@ -2774,7 +2774,7 @@ struct softnet_data {
|
|||
unsigned int input_queue_head ____cacheline_aligned_in_smp;
|
||||
|
||||
/* Elements below can be accessed between CPUs for RPS/RFS */
|
||||
struct call_single_data csd ____cacheline_aligned_in_smp;
|
||||
call_single_data_t csd ____cacheline_aligned_in_smp;
|
||||
struct softnet_data *rps_ipi_next;
|
||||
unsigned int cpu;
|
||||
unsigned int input_queue_tail;
|
||||
|
|
|
@ -53,6 +53,9 @@ extern __must_check bool refcount_sub_and_test(unsigned int i, refcount_t *r);
|
|||
extern __must_check bool refcount_dec_and_test(refcount_t *r);
|
||||
extern void refcount_dec(refcount_t *r);
|
||||
#else
|
||||
# ifdef CONFIG_ARCH_HAS_REFCOUNT
|
||||
# include <asm/refcount.h>
|
||||
# else
|
||||
static inline __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
|
||||
{
|
||||
return atomic_add_unless(&r->refs, i, 0);
|
||||
|
@ -87,6 +90,7 @@ static inline void refcount_dec(refcount_t *r)
|
|||
{
|
||||
atomic_dec(&r->refs);
|
||||
}
|
||||
# endif /* !CONFIG_ARCH_HAS_REFCOUNT */
|
||||
#endif /* CONFIG_REFCOUNT_FULL */
|
||||
|
||||
extern __must_check bool refcount_dec_if_one(refcount_t *r);
|
||||
|
|
|
@ -32,6 +32,7 @@ struct rw_semaphore {
|
|||
#define RWSEM_UNLOCKED_VALUE 0x00000000
|
||||
|
||||
extern void __down_read(struct rw_semaphore *sem);
|
||||
extern int __must_check __down_read_killable(struct rw_semaphore *sem);
|
||||
extern int __down_read_trylock(struct rw_semaphore *sem);
|
||||
extern void __down_write(struct rw_semaphore *sem);
|
||||
extern int __must_check __down_write_killable(struct rw_semaphore *sem);
|
||||
|
|
|
@ -44,6 +44,7 @@ struct rw_semaphore {
|
|||
};
|
||||
|
||||
extern struct rw_semaphore *rwsem_down_read_failed(struct rw_semaphore *sem);
|
||||
extern struct rw_semaphore *rwsem_down_read_failed_killable(struct rw_semaphore *sem);
|
||||
extern struct rw_semaphore *rwsem_down_write_failed(struct rw_semaphore *sem);
|
||||
extern struct rw_semaphore *rwsem_down_write_failed_killable(struct rw_semaphore *sem);
|
||||
extern struct rw_semaphore *rwsem_wake(struct rw_semaphore *);
|
||||
|
|
|
@ -847,7 +847,17 @@ struct task_struct {
|
|||
int lockdep_depth;
|
||||
unsigned int lockdep_recursion;
|
||||
struct held_lock held_locks[MAX_LOCK_DEPTH];
|
||||
gfp_t lockdep_reclaim_gfp;
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_LOCKDEP_CROSSRELEASE
|
||||
#define MAX_XHLOCKS_NR 64UL
|
||||
struct hist_lock *xhlocks; /* Crossrelease history locks */
|
||||
unsigned int xhlock_idx;
|
||||
/* For restoring at history boundaries */
|
||||
unsigned int xhlock_idx_hist[XHLOCK_CTX_NR];
|
||||
unsigned int hist_id;
|
||||
/* For overwrite check at each context exit */
|
||||
unsigned int hist_id_save[XHLOCK_CTX_NR];
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_UBSAN
|
||||
|
|
|
@ -167,6 +167,14 @@ static inline gfp_t current_gfp_context(gfp_t flags)
|
|||
return flags;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_LOCKDEP
|
||||
extern void fs_reclaim_acquire(gfp_t gfp_mask);
|
||||
extern void fs_reclaim_release(gfp_t gfp_mask);
|
||||
#else
|
||||
static inline void fs_reclaim_acquire(gfp_t gfp_mask) { }
|
||||
static inline void fs_reclaim_release(gfp_t gfp_mask) { }
|
||||
#endif
|
||||
|
||||
static inline unsigned int memalloc_noio_save(void)
|
||||
{
|
||||
unsigned int flags = current->flags & PF_MEMALLOC_NOIO;
|
||||
|
|
|
@ -14,13 +14,17 @@
|
|||
#include <linux/llist.h>
|
||||
|
||||
typedef void (*smp_call_func_t)(void *info);
|
||||
struct call_single_data {
|
||||
struct __call_single_data {
|
||||
struct llist_node llist;
|
||||
smp_call_func_t func;
|
||||
void *info;
|
||||
unsigned int flags;
|
||||
};
|
||||
|
||||
/* Use __aligned() to avoid to use 2 cache lines for 1 csd */
|
||||
typedef struct __call_single_data call_single_data_t
|
||||
__aligned(sizeof(struct __call_single_data));
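The __aligned() trick above can be checked in isolation: on a 64-bit build the struct pads to 32 bytes, and forcing the typedef's alignment up to its own size guarantees an array element never straddles two 64-byte cache lines. A hypothetical userspace approximation, where fake_csd and its field layout are assumptions about a 64-bit ABI and the aligned attribute is the GCC/Clang extension the kernel's __aligned() expands to:

#include <stdint.h>
#include <stdio.h>

struct fake_csd {
        void *llist;                            /* stands in for struct llist_node */
        void (*func)(void *info);
        void *info;
        unsigned int flags;
};                                              /* 8+8+8+4 -> padded to 32 bytes */

typedef struct fake_csd fake_csd_t
        __attribute__((aligned(sizeof(struct fake_csd))));

int main(void)
{
        fake_csd_t arr[4];

        printf("sizeof=%zu alignof=%zu\n", sizeof(fake_csd_t),
               _Alignof(fake_csd_t));

        for (int i = 0; i < 4; i++)             /* offset within a 64-byte line: 0 or 32 */
                printf("arr[%d]: %zu\n", i, (size_t)((uintptr_t)&arr[i] % 64));
        return 0;
}

Because 32 divides 64, each element starts at offset 0 or 32 of a cache line and never spans two of them, which is the point of the patch.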
|
||||
|
||||
/* total number of cpus in this system (may exceed NR_CPUS) */
|
||||
extern unsigned int total_cpus;
|
||||
|
||||
|
@ -48,7 +52,7 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
|
|||
smp_call_func_t func, void *info, bool wait,
|
||||
gfp_t gfp_flags);
|
||||
|
||||
int smp_call_function_single_async(int cpu, struct call_single_data *csd);
|
||||
int smp_call_function_single_async(int cpu, call_single_data_t *csd);
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
|
||||
|
|
|
@ -118,16 +118,39 @@ do { \
|
|||
#endif
|
||||
|
||||
/*
|
||||
* Despite its name it doesn't necessarily have to be a full barrier.
|
||||
* It should only guarantee that a STORE before the critical section
|
||||
* can not be reordered with LOADs and STOREs inside this section.
|
||||
* spin_lock() is the one-way barrier, this LOAD can not escape out
|
||||
* of the region. So the default implementation simply ensures that
|
||||
* a STORE can not move into the critical section, smp_wmb() should
|
||||
* serialize it with another STORE done by spin_lock().
|
||||
* This barrier must provide two things:
|
||||
*
|
||||
* - it must guarantee a STORE before the spin_lock() is ordered against a
|
||||
* LOAD after it, see the comments at its two usage sites.
|
||||
*
|
||||
* - it must ensure the critical section is RCsc.
|
||||
*
|
||||
* The latter is important for cases where we observe values written by other
|
||||
* CPUs in spin-loops, without barriers, while being subject to scheduling.
|
||||
*
|
||||
* CPU0 CPU1 CPU2
|
||||
*
|
||||
* for (;;) {
|
||||
* if (READ_ONCE(X))
|
||||
* break;
|
||||
* }
|
||||
* X=1
|
||||
* <sched-out>
|
||||
* <sched-in>
|
||||
* r = X;
|
||||
*
|
||||
* without transitivity it could be that CPU1 observes X!=0 and breaks the loop,
|
||||
* we get migrated and CPU2 sees X==0.
|
||||
*
|
||||
* Since most load-store architectures implement ACQUIRE with an smp_mb() after
|
||||
* the LL/SC loop, they need no further barriers. Similarly all our TSO
|
||||
* architectures imply an smp_mb() for each atomic instruction and equally don't
|
||||
* need more.
|
||||
*
|
||||
* Architectures that can implement ACQUIRE better need to take care.
|
||||
*/
|
||||
#ifndef smp_mb__before_spinlock
|
||||
#define smp_mb__before_spinlock() smp_wmb()
|
||||
#ifndef smp_mb__after_spinlock
|
||||
#define smp_mb__after_spinlock() do { } while (0)
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_DEBUG_SPINLOCK
|
||||
|
|
|
@ -1275,12 +1275,17 @@ config BASE_FULL
|
|||
config FUTEX
|
||||
bool "Enable futex support" if EXPERT
|
||||
default y
|
||||
select RT_MUTEXES
|
||||
imply RT_MUTEXES
|
||||
help
|
||||
Disabling this option will cause the kernel to be built without
|
||||
support for "fast userspace mutexes". The resulting kernel may not
|
||||
run glibc-based applications correctly.
|
||||
|
||||
config FUTEX_PI
|
||||
bool
|
||||
depends on FUTEX && RT_MUTEXES
|
||||
default y
|
||||
|
||||
config HAVE_FUTEX_CMPXCHG
|
||||
bool
|
||||
depends on FUTEX
|
||||
|
|
|
@ -577,6 +577,13 @@ static void update_domain_attr_tree(struct sched_domain_attr *dattr,
|
|||
rcu_read_unlock();
|
||||
}
|
||||
|
||||
/* Must be called with cpuset_mutex held. */
|
||||
static inline int nr_cpusets(void)
|
||||
{
|
||||
/* jump label reference count + the top-level cpuset */
|
||||
return static_key_count(&cpusets_enabled_key.key) + 1;
|
||||
}
|
||||
|
||||
/*
|
||||
* generate_sched_domains()
|
||||
*
|
||||
|
|
|
@ -918,6 +918,7 @@ void __noreturn do_exit(long code)
|
|||
exit_rcu();
|
||||
exit_tasks_rcu_finish();
|
||||
|
||||
lockdep_free_task(tsk);
|
||||
do_task_dead();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(do_exit);
|
||||
|
|
|
@ -484,6 +484,8 @@ void __init fork_init(void)
|
|||
cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",
|
||||
NULL, free_vm_stack_cache);
|
||||
#endif
|
||||
|
||||
lockdep_init_task(&init_task);
|
||||
}
|
||||
|
||||
int __weak arch_dup_task_struct(struct task_struct *dst,
|
||||
|
@ -1700,6 +1702,7 @@ static __latent_entropy struct task_struct *copy_process(
|
|||
p->lockdep_depth = 0; /* no locks held yet */
|
||||
p->curr_chain_key = 0;
|
||||
p->lockdep_recursion = 0;
|
||||
lockdep_init_task(p);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_DEBUG_MUTEXES
|
||||
|
@ -1958,6 +1961,7 @@ static __latent_entropy struct task_struct *copy_process(
|
|||
bad_fork_cleanup_perf:
|
||||
perf_event_free_task(p);
|
||||
bad_fork_cleanup_policy:
|
||||
lockdep_free_task(p);
|
||||
#ifdef CONFIG_NUMA
|
||||
mpol_put(p->mempolicy);
|
||||
bad_fork_cleanup_threadgroup_lock:
|
||||
|
|
|
@ -876,6 +876,8 @@ static struct task_struct *futex_find_get_task(pid_t pid)
|
|||
return p;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_FUTEX_PI
|
||||
|
||||
/*
|
||||
* This task is holding PI mutexes at exit time => bad.
|
||||
* Kernel cleans up PI-state, but userspace is likely hosed.
|
||||
|
@ -933,6 +935,8 @@ void exit_pi_state_list(struct task_struct *curr)
|
|||
raw_spin_unlock_irq(&curr->pi_lock);
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
/*
|
||||
* We need to check the following states:
|
||||
*
|
||||
|
@ -1547,6 +1551,45 @@ futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
|
|||
return ret;
|
||||
}
|
||||
|
||||
static int futex_atomic_op_inuser(unsigned int encoded_op, u32 __user *uaddr)
|
||||
{
|
||||
unsigned int op = (encoded_op & 0x70000000) >> 28;
|
||||
unsigned int cmp = (encoded_op & 0x0f000000) >> 24;
|
||||
int oparg = sign_extend32((encoded_op & 0x00fff000) >> 12, 12);
|
||||
int cmparg = sign_extend32(encoded_op & 0x00000fff, 12);
|
||||
int oldval, ret;
|
||||
|
||||
if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) {
|
||||
if (oparg < 0 || oparg > 31)
|
||||
return -EINVAL;
|
||||
oparg = 1 << oparg;
|
||||
}
|
||||
|
||||
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
|
||||
return -EFAULT;
|
||||
|
||||
ret = arch_futex_atomic_op_inuser(op, oparg, &oldval, uaddr);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
switch (cmp) {
|
||||
case FUTEX_OP_CMP_EQ:
|
||||
return oldval == cmparg;
|
||||
case FUTEX_OP_CMP_NE:
|
||||
return oldval != cmparg;
|
||||
case FUTEX_OP_CMP_LT:
|
||||
return oldval < cmparg;
|
||||
case FUTEX_OP_CMP_GE:
|
||||
return oldval >= cmparg;
|
||||
case FUTEX_OP_CMP_LE:
|
||||
return oldval <= cmparg;
|
||||
case FUTEX_OP_CMP_GT:
|
||||
return oldval > cmparg;
|
||||
default:
|
||||
return -ENOSYS;
|
||||
}
|
||||
}
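The encoded_op word that this generic helper decodes packs five fields into 32 bits: bit 31 selects the "oparg is a shift count" form, bits 28-30 the operation, bits 24-27 the comparison, bits 12-23 a signed 12-bit oparg and bits 0-11 a signed 12-bit cmparg. A small userspace decoder of that packing, treating the 12-bit fields as two's-complement values as the uapi encoding intends (sign_extend_12 and the sample value are illustrative, not kernel code):

#include <stdint.h>
#include <stdio.h>

/* Interpret the low 12 bits of 'field' as a two's-complement value. */
static int sign_extend_12(uint32_t field)
{
        return (int)(field ^ 0x800u) - 0x800;
}

int main(void)
{
        /* Example: op=1 (ADD) with oparg=1, cmp=0 (CMP_EQ) with cmparg=0. */
        uint32_t encoded_op = 0x10001000;

        unsigned int op  = (encoded_op & 0x70000000) >> 28;
        unsigned int cmp = (encoded_op & 0x0f000000) >> 24;
        int oparg  = sign_extend_12((encoded_op & 0x00fff000) >> 12);
        int cmparg = sign_extend_12(encoded_op & 0x00000fff);

        if (encoded_op & 0x80000000u)           /* FUTEX_OP_OPARG_SHIFT: oparg is a shift */
                oparg = 1 << oparg;

        printf("op=%u cmp=%u oparg=%d cmparg=%d\n", op, cmp, oparg, cmparg);
        return 0;
}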
|
||||
|
||||
/*
|
||||
* Wake up all waiters hashed on the physical page that is mapped
|
||||
* to this virtual address:
|
||||
|
@ -1800,6 +1843,15 @@ static int futex_requeue(u32 __user *uaddr1, unsigned int flags,
|
|||
struct futex_q *this, *next;
|
||||
DEFINE_WAKE_Q(wake_q);
|
||||
|
||||
/*
|
||||
* When PI is not supported: return -ENOSYS if requeue_pi is true,
|
||||
* consequently the compiler knows requeue_pi is always false past
|
||||
* this point which will optimize away all the conditional code
|
||||
* further down.
|
||||
*/
|
||||
if (!IS_ENABLED(CONFIG_FUTEX_PI) && requeue_pi)
|
||||
return -ENOSYS;
|
||||
|
||||
if (requeue_pi) {
|
||||
/*
|
||||
* Requeue PI only works on two distinct uaddrs. This
|
||||
|
@ -2595,6 +2647,9 @@ static int futex_lock_pi(u32 __user *uaddr, unsigned int flags,
|
|||
struct futex_q q = futex_q_init;
|
||||
int res, ret;
|
||||
|
||||
if (!IS_ENABLED(CONFIG_FUTEX_PI))
|
||||
return -ENOSYS;
|
||||
|
||||
if (refill_pi_state_cache())
|
||||
return -ENOMEM;
|
||||
|
||||
|
@ -2774,6 +2829,9 @@ static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
|
|||
struct futex_q *top_waiter;
|
||||
int ret;
|
||||
|
||||
if (!IS_ENABLED(CONFIG_FUTEX_PI))
|
||||
return -ENOSYS;
|
||||
|
||||
retry:
|
||||
if (get_user(uval, uaddr))
|
||||
return -EFAULT;
|
||||
|
@ -2984,6 +3042,9 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
|
|||
struct futex_q q = futex_q_init;
|
||||
int res, ret;
|
||||
|
||||
if (!IS_ENABLED(CONFIG_FUTEX_PI))
|
||||
return -ENOSYS;
|
||||
|
||||
if (uaddr == uaddr2)
|
||||
return -EINVAL;
|
||||
|
||||
|
|
|
@ -79,29 +79,7 @@ int static_key_count(struct static_key *key)
|
|||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_count);
|
||||
|
||||
void static_key_enable(struct static_key *key)
|
||||
{
|
||||
int count = static_key_count(key);
|
||||
|
||||
WARN_ON_ONCE(count < 0 || count > 1);
|
||||
|
||||
if (!count)
|
||||
static_key_slow_inc(key);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_enable);
|
||||
|
||||
void static_key_disable(struct static_key *key)
|
||||
{
|
||||
int count = static_key_count(key);
|
||||
|
||||
WARN_ON_ONCE(count < 0 || count > 1);
|
||||
|
||||
if (count)
|
||||
static_key_slow_dec(key);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_disable);
|
||||
|
||||
void static_key_slow_inc(struct static_key *key)
|
||||
static void static_key_slow_inc_cpuslocked(struct static_key *key)
|
||||
{
|
||||
int v, v1;
|
||||
|
||||
|
@ -125,24 +103,87 @@ void static_key_slow_inc(struct static_key *key)
|
|||
return;
|
||||
}
|
||||
|
||||
cpus_read_lock();
|
||||
jump_label_lock();
|
||||
if (atomic_read(&key->enabled) == 0) {
|
||||
atomic_set(&key->enabled, -1);
|
||||
jump_label_update(key);
|
||||
atomic_set(&key->enabled, 1);
|
||||
/*
|
||||
* Ensure that if the above cmpxchg loop observes our positive
|
||||
* value, it must also observe all the text changes.
|
||||
*/
|
||||
atomic_set_release(&key->enabled, 1);
|
||||
} else {
|
||||
atomic_inc(&key->enabled);
|
||||
}
|
||||
jump_label_unlock();
|
||||
}
|
||||
|
||||
void static_key_slow_inc(struct static_key *key)
|
||||
{
|
||||
cpus_read_lock();
|
||||
static_key_slow_inc_cpuslocked(key);
|
||||
cpus_read_unlock();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_slow_inc);
|
||||
|
||||
static void __static_key_slow_dec(struct static_key *key,
|
||||
unsigned long rate_limit, struct delayed_work *work)
|
||||
void static_key_enable_cpuslocked(struct static_key *key)
|
||||
{
|
||||
STATIC_KEY_CHECK_USE();
|
||||
|
||||
if (atomic_read(&key->enabled) > 0) {
|
||||
WARN_ON_ONCE(atomic_read(&key->enabled) != 1);
|
||||
return;
|
||||
}
|
||||
|
||||
jump_label_lock();
|
||||
if (atomic_read(&key->enabled) == 0) {
|
||||
atomic_set(&key->enabled, -1);
|
||||
jump_label_update(key);
|
||||
/*
|
||||
* See static_key_slow_inc().
|
||||
*/
|
||||
atomic_set_release(&key->enabled, 1);
|
||||
}
|
||||
jump_label_unlock();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_enable_cpuslocked);
|
||||
|
||||
void static_key_enable(struct static_key *key)
|
||||
{
|
||||
cpus_read_lock();
|
||||
static_key_enable_cpuslocked(key);
|
||||
cpus_read_unlock();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_enable);
|
||||
|
||||
void static_key_disable_cpuslocked(struct static_key *key)
|
||||
{
|
||||
STATIC_KEY_CHECK_USE();
|
||||
|
||||
if (atomic_read(&key->enabled) != 1) {
|
||||
WARN_ON_ONCE(atomic_read(&key->enabled) != 0);
|
||||
return;
|
||||
}
|
||||
|
||||
jump_label_lock();
|
||||
if (atomic_cmpxchg(&key->enabled, 1, 0))
|
||||
jump_label_update(key);
|
||||
jump_label_unlock();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_disable_cpuslocked);
|
||||
|
||||
void static_key_disable(struct static_key *key)
|
||||
{
|
||||
cpus_read_lock();
|
||||
static_key_disable_cpuslocked(key);
|
||||
cpus_read_unlock();
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(static_key_disable);
|
||||
|
||||
static void static_key_slow_dec_cpuslocked(struct static_key *key,
|
||||
unsigned long rate_limit,
|
||||
struct delayed_work *work)
|
||||
{
|
||||
/*
|
||||
* The negative count check is valid even when a negative
|
||||
* key->enabled is in use by static_key_slow_inc(); a
|
||||
|
@ -153,7 +194,6 @@ static void __static_key_slow_dec(struct static_key *key,
|
|||
if (!atomic_dec_and_mutex_lock(&key->enabled, &jump_label_mutex)) {
|
||||
WARN(atomic_read(&key->enabled) < 0,
|
||||
"jump label: negative count!\n");
|
||||
cpus_read_unlock();
|
||||
return;
|
||||
}
|
||||
|
||||
|
@ -164,6 +204,14 @@ static void __static_key_slow_dec(struct static_key *key,
|
|||
jump_label_update(key);
|
||||
}
|
||||
jump_label_unlock();
|
||||
}
|
||||
|
||||
static void __static_key_slow_dec(struct static_key *key,
|
||||
unsigned long rate_limit,
|
||||
struct delayed_work *work)
|
||||
{
|
||||
cpus_read_lock();
|
||||
static_key_slow_dec_cpuslocked(key, rate_limit, work);
|
||||
cpus_read_unlock();
|
||||
}
|
||||
|
||||
|
|
(diff of one file suppressed because it is too large)
|
@ -143,6 +143,8 @@ struct lockdep_stats {
|
|||
int redundant_softirqs_on;
|
||||
int redundant_softirqs_off;
|
||||
int nr_unused_locks;
|
||||
int nr_redundant_checks;
|
||||
int nr_redundant;
|
||||
int nr_cyclic_checks;
|
||||
int nr_cyclic_check_recursions;
|
||||
int nr_find_usage_forwards_checks;
|
||||
|
|
|
@ -201,6 +201,10 @@ static void lockdep_stats_debug_show(struct seq_file *m)
|
|||
debug_atomic_read(chain_lookup_hits));
|
||||
seq_printf(m, " cyclic checks: %11llu\n",
|
||||
debug_atomic_read(nr_cyclic_checks));
|
||||
seq_printf(m, " redundant checks: %11llu\n",
|
||||
debug_atomic_read(nr_redundant_checks));
|
||||
seq_printf(m, " redundant links: %11llu\n",
|
||||
debug_atomic_read(nr_redundant));
|
||||
seq_printf(m, " find-mask forwards checks: %11llu\n",
|
||||
debug_atomic_read(nr_find_usage_forwards_checks));
|
||||
seq_printf(m, " find-mask backwards checks: %11llu\n",
|
||||
|
|
|
@ -6,4 +6,3 @@
|
|||
*/
|
||||
LOCKDEP_STATE(HARDIRQ)
|
||||
LOCKDEP_STATE(SOFTIRQ)
|
||||
LOCKDEP_STATE(RECLAIM_FS)
|
||||
|
|
|
@ -109,6 +109,19 @@ bool osq_lock(struct optimistic_spin_queue *lock)
|
|||
|
||||
prev = decode_cpu(old);
|
||||
node->prev = prev;
|
||||
|
||||
/*
|
||||
* osq_lock() unqueue
|
||||
*
|
||||
* node->prev = prev osq_wait_next()
|
||||
* WMB MB
|
||||
* prev->next = node next->prev = prev // unqueue-C
|
||||
*
|
||||
* Here 'node->prev' and 'next->prev' are the same variable and we need
|
||||
* to ensure these stores happen in-order to avoid corrupting the list.
|
||||
*/
|
||||
smp_wmb();
|
||||
|
||||
WRITE_ONCE(prev->next, node);
|
||||
|
||||
/*
|
||||
|
|
|
@ -72,7 +72,7 @@ static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock)
|
|||
struct __qspinlock *l = (void *)lock;
|
||||
|
||||
if (!(atomic_read(&lock->val) & _Q_LOCKED_PENDING_MASK) &&
|
||||
(cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
|
||||
(cmpxchg_acquire(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
|
||||
qstat_inc(qstat_pv_lock_stealing, true);
|
||||
return true;
|
||||
}
|
||||
|
@ -101,16 +101,16 @@ static __always_inline void clear_pending(struct qspinlock *lock)
|
|||
|
||||
/*
|
||||
* The pending bit check in pv_queued_spin_steal_lock() isn't a memory
|
||||
* barrier. Therefore, an atomic cmpxchg() is used to acquire the lock
|
||||
* just to be sure that it will get it.
|
||||
* barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the
|
||||
* lock just to be sure that it will get it.
|
||||
*/
|
||||
static __always_inline int trylock_clear_pending(struct qspinlock *lock)
|
||||
{
|
||||
struct __qspinlock *l = (void *)lock;
|
||||
|
||||
return !READ_ONCE(l->locked) &&
|
||||
(cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL)
|
||||
== _Q_PENDING_VAL);
|
||||
(cmpxchg_acquire(&l->locked_pending, _Q_PENDING_VAL,
|
||||
_Q_LOCKED_VAL) == _Q_PENDING_VAL);
|
||||
}
|
||||
#else /* _Q_PENDING_BITS == 8 */
|
||||
static __always_inline void set_pending(struct qspinlock *lock)
|
||||
|
@ -138,7 +138,7 @@ static __always_inline int trylock_clear_pending(struct qspinlock *lock)
|
|||
*/
|
||||
old = val;
|
||||
new = (val & ~_Q_PENDING_MASK) | _Q_LOCKED_VAL;
|
||||
val = atomic_cmpxchg(&lock->val, old, new);
|
||||
val = atomic_cmpxchg_acquire(&lock->val, old, new);
|
||||
|
||||
if (val == old)
|
||||
return 1;
|
||||
|
@ -362,8 +362,18 @@ static void pv_kick_node(struct qspinlock *lock, struct mcs_spinlock *node)
|
|||
* observe its next->locked value and advance itself.
|
||||
*
|
||||
* Matches with smp_store_mb() and cmpxchg() in pv_wait_node()
|
||||
*
|
||||
* The write to next->locked in arch_mcs_spin_unlock_contended()
|
||||
* must be ordered before the read of pn->state in the cmpxchg()
|
||||
* below for the code to work correctly. To guarantee full ordering
|
||||
* irrespective of the success or failure of the cmpxchg(),
|
||||
* a relaxed version with explicit barrier is used. The control
|
||||
* dependency will order the reading of pn->state before any
|
||||
* subsequent writes.
|
||||
*/
|
||||
if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) != vcpu_halted)
|
||||
smp_mb__before_atomic();
|
||||
if (cmpxchg_relaxed(&pn->state, vcpu_halted, vcpu_hashed)
|
||||
!= vcpu_halted)
|
||||
return;
|
||||
|
||||
/*
|
||||
|
|
|
@ -40,6 +40,9 @@ struct rt_mutex_waiter {
|
|||
/*
|
||||
* Various helpers to access the waiters-tree:
|
||||
*/
|
||||
|
||||
#ifdef CONFIG_RT_MUTEXES
|
||||
|
||||
static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
|
||||
{
|
||||
return !RB_EMPTY_ROOT(&lock->waiters);
|
||||
|
@ -69,6 +72,32 @@ task_top_pi_waiter(struct task_struct *p)
|
|||
pi_tree_entry);
|
||||
}
|
||||
|
||||
#else
|
||||
|
||||
static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline struct rt_mutex_waiter *
|
||||
rt_mutex_top_waiter(struct rt_mutex *lock)
|
||||
{
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static inline int task_has_pi_waiters(struct task_struct *p)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline struct rt_mutex_waiter *
|
||||
task_top_pi_waiter(struct task_struct *p)
|
||||
{
|
||||
return NULL;
|
||||
}
|
||||
|
||||
#endif
|
||||
|
||||
/*
|
||||
* lock->owner state tracking:
|
||||
*/
|
||||
|
|
|
@ -126,7 +126,7 @@ __rwsem_wake_one_writer(struct rw_semaphore *sem)
|
|||
/*
|
||||
* get a read lock on the semaphore
|
||||
*/
|
||||
void __sched __down_read(struct rw_semaphore *sem)
|
||||
int __sched __down_read_common(struct rw_semaphore *sem, int state)
|
||||
{
|
||||
struct rwsem_waiter waiter;
|
||||
unsigned long flags;
|
||||
|
@ -140,8 +140,6 @@ void __sched __down_read(struct rw_semaphore *sem)
|
|||
goto out;
|
||||
}
|
||||
|
||||
set_current_state(TASK_UNINTERRUPTIBLE);
|
||||
|
||||
/* set up my own style of waitqueue */
|
||||
waiter.task = current;
|
||||
waiter.type = RWSEM_WAITING_FOR_READ;
|
||||
|
@ -149,20 +147,41 @@ void __sched __down_read(struct rw_semaphore *sem)
|
|||
|
||||
list_add_tail(&waiter.list, &sem->wait_list);
|
||||
|
||||
/* we don't need to touch the semaphore struct anymore */
|
||||
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
|
||||
|
||||
/* wait to be given the lock */
|
||||
for (;;) {
|
||||
if (!waiter.task)
|
||||
break;
|
||||
if (signal_pending_state(state, current))
|
||||
goto out_nolock;
|
||||
set_current_state(state);
|
||||
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
|
||||
schedule();
|
||||
set_current_state(TASK_UNINTERRUPTIBLE);
|
||||
raw_spin_lock_irqsave(&sem->wait_lock, flags);
|
||||
}
|
||||
|
||||
__set_current_state(TASK_RUNNING);
|
||||
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
|
||||
out:
|
||||
;
|
||||
return 0;
|
||||
|
||||
out_nolock:
|
||||
/*
|
||||
* We didn't take the lock, so there is a writer, which is either
* the owner or the first waiter of the sem. If it's a waiter, it
* will be woken by the current owner. No need to wake anybody.
|
||||
*/
|
||||
list_del(&waiter.list);
|
||||
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
|
||||
return -EINTR;
|
||||
}
|
||||
|
||||
void __sched __down_read(struct rw_semaphore *sem)
|
||||
{
|
||||
__down_read_common(sem, TASK_UNINTERRUPTIBLE);
|
||||
}
|
||||
|
||||
int __sched __down_read_killable(struct rw_semaphore *sem)
|
||||
{
|
||||
return __down_read_common(sem, TASK_KILLABLE);
|
||||
}
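The killable variants follow a common refactoring shape: the sleep loop moves into one helper that takes the task state, and __down_read()/__down_read_killable() become thin wrappers around it. A rough userspace sketch of that shape only, under the assumption that an enum and a fake signal check are acceptable stand-ins; none of the rwsem logic is modelled:

#include <stdbool.h>
#include <stdio.h>

enum demo_state { DEMO_UNINTERRUPTIBLE, DEMO_KILLABLE };

static bool demo_signal_pending(enum demo_state state)
{
        return state == DEMO_KILLABLE;          /* pretend a fatal signal is pending */
}

static int demo_down_read_common(enum demo_state state)
{
        for (int granted = 0; !granted; ) {
                if (demo_signal_pending(state))
                        return -4;              /* -EINTR, as __down_read_common() returns */
                granted = 1;                    /* pretend the lock was handed to us */
        }
        return 0;
}

static int demo_down_read(void)          { return demo_down_read_common(DEMO_UNINTERRUPTIBLE); }
static int demo_down_read_killable(void) { return demo_down_read_common(DEMO_KILLABLE); }

int main(void)
{
        printf("down_read()          -> %d\n", demo_down_read());
        printf("down_read_killable() -> %d\n", demo_down_read_killable());
        return 0;
}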
|
||||
|
||||
/*
|
||||
|
|
|
@ -221,8 +221,8 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
|
|||
/*
|
||||
* Wait for the read lock to be granted
|
||||
*/
|
||||
__visible
|
||||
struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
|
||||
static inline struct rw_semaphore __sched *
|
||||
__rwsem_down_read_failed_common(struct rw_semaphore *sem, int state)
|
||||
{
|
||||
long count, adjustment = -RWSEM_ACTIVE_READ_BIAS;
|
||||
struct rwsem_waiter waiter;
|
||||
|
@ -255,17 +255,44 @@ struct rw_semaphore __sched *rwsem_down_read_failed(struct rw_semaphore *sem)
|
|||
|
||||
/* wait to be given the lock */
|
||||
while (true) {
|
||||
set_current_state(TASK_UNINTERRUPTIBLE);
|
||||
set_current_state(state);
|
||||
if (!waiter.task)
|
||||
break;
|
||||
if (signal_pending_state(state, current)) {
|
||||
raw_spin_lock_irq(&sem->wait_lock);
|
||||
if (waiter.task)
|
||||
goto out_nolock;
|
||||
raw_spin_unlock_irq(&sem->wait_lock);
|
||||
break;
|
||||
}
|
||||
schedule();
|
||||
}
|
||||
|
||||
__set_current_state(TASK_RUNNING);
|
||||
return sem;
|
||||
out_nolock:
|
||||
list_del(&waiter.list);
|
||||
if (list_empty(&sem->wait_list))
|
||||
atomic_long_add(-RWSEM_WAITING_BIAS, &sem->count);
|
||||
raw_spin_unlock_irq(&sem->wait_lock);
|
||||
__set_current_state(TASK_RUNNING);
|
||||
return ERR_PTR(-EINTR);
|
||||
}
|
||||
|
||||
__visible struct rw_semaphore * __sched
|
||||
rwsem_down_read_failed(struct rw_semaphore *sem)
|
||||
{
|
||||
return __rwsem_down_read_failed_common(sem, TASK_UNINTERRUPTIBLE);
|
||||
}
|
||||
EXPORT_SYMBOL(rwsem_down_read_failed);
|
||||
|
||||
__visible struct rw_semaphore * __sched
|
||||
rwsem_down_read_failed_killable(struct rw_semaphore *sem)
|
||||
{
|
||||
return __rwsem_down_read_failed_common(sem, TASK_KILLABLE);
|
||||
}
|
||||
EXPORT_SYMBOL(rwsem_down_read_failed_killable);
|
||||
|
||||
/*
|
||||
* This function must be called with the sem->wait_lock held to prevent
|
||||
* race conditions between checking the rwsem wait list and setting the
|
||||
|
|
|
@ -26,6 +26,7 @@
|
|||
#include <linux/nmi.h>
|
||||
#include <linux/console.h>
|
||||
#include <linux/bug.h>
|
||||
#include <linux/ratelimit.h>
|
||||
|
||||
#define PANIC_TIMER_STEP 100
|
||||
#define PANIC_BLINK_SPD 18
|
||||
|
@ -601,6 +602,17 @@ EXPORT_SYMBOL(__stack_chk_fail);
|
|||
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_ARCH_HAS_REFCOUNT
|
||||
void refcount_error_report(struct pt_regs *regs, const char *err)
|
||||
{
|
||||
WARN_RATELIMIT(1, "refcount_t %s at %pB in %s[%d], uid/euid: %u/%u\n",
|
||||
err, (void *)instruction_pointer(regs),
|
||||
current->comm, task_pid_nr(current),
|
||||
from_kuid_munged(&init_user_ns, current_uid()),
|
||||
from_kuid_munged(&init_user_ns, current_euid()));
|
||||
}
|
||||
#endif
|
||||
|
||||
core_param(panic, panic_timeout, int, 0644);
|
||||
core_param(pause_on_oops, pause_on_oops, int, 0644);
|
||||
core_param(panic_on_warn, panic_on_warn, int, 0644);
|
||||
|
|
|
@ -32,6 +32,12 @@ void complete(struct completion *x)
|
|||
unsigned long flags;
|
||||
|
||||
spin_lock_irqsave(&x->wait.lock, flags);
|
||||
|
||||
/*
|
||||
* Perform commit of crossrelease here.
|
||||
*/
|
||||
complete_release_commit(x);
|
||||
|
||||
if (x->done != UINT_MAX)
|
||||
x->done++;
|
||||
__wake_up_locked(&x->wait, TASK_NORMAL, 1);
|
||||
|
@ -99,9 +105,14 @@ __wait_for_common(struct completion *x,
|
|||
{
|
||||
might_sleep();
|
||||
|
||||
complete_acquire(x);
|
||||
|
||||
spin_lock_irq(&x->wait.lock);
|
||||
timeout = do_wait_for_common(x, action, timeout, state);
|
||||
spin_unlock_irq(&x->wait.lock);
|
||||
|
||||
complete_release(x);
|
||||
|
||||
return timeout;
|
||||
}
|
||||
|
||||
|
|
|
@ -1972,8 +1972,8 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
|
|||
* reordered with p->state check below. This pairs with mb() in
|
||||
* set_current_state() the waiting thread does.
|
||||
*/
|
||||
smp_mb__before_spinlock();
|
||||
raw_spin_lock_irqsave(&p->pi_lock, flags);
|
||||
smp_mb__after_spinlock();
|
||||
if (!(p->state & state))
|
||||
goto out;
|
||||
|
||||
|
@ -3296,8 +3296,8 @@ static void __sched notrace __schedule(bool preempt)
|
|||
* can't be reordered with __set_current_state(TASK_INTERRUPTIBLE)
|
||||
* done by the caller to avoid the race with signal_wake_up().
|
||||
*/
|
||||
smp_mb__before_spinlock();
|
||||
rq_lock(rq, &rf);
|
||||
smp_mb__after_spinlock();
|
||||
|
||||
/* Promote REQ to ACT */
|
||||
rq->clock_update_flags <<= 1;
|
||||
|
|
|
@ -769,7 +769,7 @@ struct rq {
|
|||
#ifdef CONFIG_SCHED_HRTICK
|
||||
#ifdef CONFIG_SMP
|
||||
int hrtick_csd_pending;
|
||||
struct call_single_data hrtick_csd;
|
||||
call_single_data_t hrtick_csd;
|
||||
#endif
|
||||
struct hrtimer hrtick_timer;
|
||||
#endif
|
||||
|
|
|
@ -33,9 +33,6 @@ void swake_up(struct swait_queue_head *q)
|
|||
{
|
||||
unsigned long flags;
|
||||
|
||||
if (!swait_active(q))
|
||||
return;
|
||||
|
||||
raw_spin_lock_irqsave(&q->lock, flags);
|
||||
swake_up_locked(q);
|
||||
raw_spin_unlock_irqrestore(&q->lock, flags);
|
||||
|
@ -51,9 +48,6 @@ void swake_up_all(struct swait_queue_head *q)
|
|||
struct swait_queue *curr;
|
||||
LIST_HEAD(tmp);
|
||||
|
||||
if (!swait_active(q))
|
||||
return;
|
||||
|
||||
raw_spin_lock_irq(&q->lock);
|
||||
list_splice_init(&q->task_list, &tmp);
|
||||
while (!list_empty(&tmp)) {
|
||||
|
|
kernel/smp.c: 32 changed lines
|
@ -28,7 +28,7 @@ enum {
|
|||
};
|
||||
|
||||
struct call_function_data {
|
||||
struct call_single_data __percpu *csd;
|
||||
call_single_data_t __percpu *csd;
|
||||
cpumask_var_t cpumask;
|
||||
cpumask_var_t cpumask_ipi;
|
||||
};
|
||||
|
@ -51,7 +51,7 @@ int smpcfd_prepare_cpu(unsigned int cpu)
|
|||
free_cpumask_var(cfd->cpumask);
|
||||
return -ENOMEM;
|
||||
}
|
||||
cfd->csd = alloc_percpu(struct call_single_data);
|
||||
cfd->csd = alloc_percpu(call_single_data_t);
|
||||
if (!cfd->csd) {
|
||||
free_cpumask_var(cfd->cpumask);
|
||||
free_cpumask_var(cfd->cpumask_ipi);
|
||||
|
@ -103,12 +103,12 @@ void __init call_function_init(void)
|
|||
* previous function call. For multi-cpu calls its even more interesting
|
||||
* as we'll have to ensure no other cpu is observing our csd.
|
||||
*/
|
||||
static __always_inline void csd_lock_wait(struct call_single_data *csd)
|
||||
static __always_inline void csd_lock_wait(call_single_data_t *csd)
|
||||
{
|
||||
smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
|
||||
}
|
||||
|
||||
static __always_inline void csd_lock(struct call_single_data *csd)
|
||||
static __always_inline void csd_lock(call_single_data_t *csd)
|
||||
{
|
||||
csd_lock_wait(csd);
|
||||
csd->flags |= CSD_FLAG_LOCK;
|
||||
|
@ -116,12 +116,12 @@ static __always_inline void csd_lock(struct call_single_data *csd)
|
|||
/*
|
||||
* prevent CPU from reordering the above assignment
|
||||
* to ->flags with any subsequent assignments to other
|
||||
* fields of the specified call_single_data structure:
|
||||
* fields of the specified call_single_data_t structure:
|
||||
*/
|
||||
smp_wmb();
|
||||
}
|
||||
|
||||
static __always_inline void csd_unlock(struct call_single_data *csd)
|
||||
static __always_inline void csd_unlock(call_single_data_t *csd)
|
||||
{
|
||||
WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
|
||||
|
||||
|
@ -131,14 +131,14 @@ static __always_inline void csd_unlock(struct call_single_data *csd)
|
|||
smp_store_release(&csd->flags, 0);
|
||||
}
|
||||
|
||||
static DEFINE_PER_CPU_SHARED_ALIGNED(struct call_single_data, csd_data);
|
||||
static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
|
||||
|
||||
/*
|
||||
* Insert a previously allocated call_single_data element
|
||||
* Insert a previously allocated call_single_data_t element
|
||||
* for execution on the given CPU. data must already have
|
||||
* ->func, ->info, and ->flags set.
|
||||
*/
|
||||
static int generic_exec_single(int cpu, struct call_single_data *csd,
|
||||
static int generic_exec_single(int cpu, call_single_data_t *csd,
|
||||
smp_call_func_t func, void *info)
|
||||
{
|
||||
if (cpu == smp_processor_id()) {
|
||||
|
@ -210,7 +210,7 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
|
|||
{
|
||||
struct llist_head *head;
|
||||
struct llist_node *entry;
|
||||
struct call_single_data *csd, *csd_next;
|
||||
call_single_data_t *csd, *csd_next;
|
||||
static bool warned;
|
||||
|
||||
WARN_ON(!irqs_disabled());
|
||||
|
@ -268,8 +268,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
|
|||
int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
|
||||
int wait)
|
||||
{
|
||||
struct call_single_data *csd;
|
||||
struct call_single_data csd_stack = { .flags = CSD_FLAG_LOCK | CSD_FLAG_SYNCHRONOUS };
|
||||
call_single_data_t *csd;
|
||||
call_single_data_t csd_stack = {
|
||||
.flags = CSD_FLAG_LOCK | CSD_FLAG_SYNCHRONOUS,
|
||||
};
|
||||
int this_cpu;
|
||||
int err;
|
||||
|
||||
|
@ -321,7 +323,7 @@ EXPORT_SYMBOL(smp_call_function_single);
|
|||
* NOTE: Be careful, there is unfortunately no current debugging facility to
|
||||
* validate the correctness of this serialization.
|
||||
*/
|
||||
int smp_call_function_single_async(int cpu, struct call_single_data *csd)
|
||||
int smp_call_function_single_async(int cpu, call_single_data_t *csd)
|
||||
{
|
||||
int err = 0;
|
||||
|
||||
|
@ -444,7 +446,7 @@ void smp_call_function_many(const struct cpumask *mask,
|
|||
|
||||
cpumask_clear(cfd->cpumask_ipi);
|
||||
for_each_cpu(cpu, cfd->cpumask) {
|
||||
struct call_single_data *csd = per_cpu_ptr(cfd->csd, cpu);
|
||||
call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
|
||||
|
||||
csd_lock(csd);
|
||||
if (wait)
|
||||
|
@ -460,7 +462,7 @@ void smp_call_function_many(const struct cpumask *mask,
|
|||
|
||||
if (wait) {
|
||||
for_each_cpu(cpu, cfd->cpumask) {
|
||||
struct call_single_data *csd;
|
||||
call_single_data_t *csd;
|
||||
|
||||
csd = per_cpu_ptr(cfd->csd, cpu);
|
||||
csd_lock_wait(csd);
|
||||
|
|
|
@ -23,7 +23,7 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
|
|||
}
|
||||
EXPORT_SYMBOL(smp_call_function_single);
|
||||
|
||||
int smp_call_function_single_async(int cpu, struct call_single_data *csd)
|
||||
int smp_call_function_single_async(int cpu, call_single_data_t *csd)
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
|
|
|
@ -2091,8 +2091,30 @@ __acquires(&pool->lock)
|
|||
|
||||
spin_unlock_irq(&pool->lock);
|
||||
|
||||
lock_map_acquire_read(&pwq->wq->lockdep_map);
|
||||
lock_map_acquire(&pwq->wq->lockdep_map);
|
||||
lock_map_acquire(&lockdep_map);
|
||||
/*
|
||||
* Strictly speaking we should mark the invariant state without holding
|
||||
* any locks, that is, before these two lock_map_acquire()'s.
|
||||
*
|
||||
* However, that would result in:
|
||||
*
|
||||
* A(W1)
|
||||
* WFC(C)
|
||||
* A(W1)
|
||||
* C(C)
|
||||
*
|
||||
* Which would create W1->C->W1 dependencies, even though there is no
|
||||
* actual deadlock possible. There are two solutions, using a
|
||||
* read-recursive acquire on the work(queue) 'locks', but this will then
|
||||
* hit the lockdep limitation on recursive locks, or simply discard
|
||||
* these locks.
|
||||
*
|
||||
* AFAICT there is no possible deadlock scenario between the
|
||||
* flush_work() and complete() primitives (except for single-threaded
|
||||
* workqueues), so hiding them isn't a problem.
|
||||
*/
|
||||
lockdep_invariant_state(true);
|
||||
trace_workqueue_execute_start(work);
|
||||
worker->current_func(work);
|
||||
/*
|
||||
|
@ -2474,7 +2496,16 @@ static void insert_wq_barrier(struct pool_workqueue *pwq,
|
|||
*/
|
||||
INIT_WORK_ONSTACK(&barr->work, wq_barrier_func);
|
||||
__set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(&barr->work));
|
||||
init_completion(&barr->done);
|
||||
|
||||
/*
|
||||
* Explicitly init the crosslock for wq_barrier::done, make its lock
|
||||
* key a subkey of the corresponding work. As a result we won't
|
||||
* build a dependency between wq_barrier::done and unrelated work.
|
||||
*/
|
||||
lockdep_init_map_crosslock((struct lockdep_map *)&barr->done.map,
|
||||
"(complete)wq_barr::done",
|
||||
target->lockdep_map.key, 1);
|
||||
__init_completion(&barr->done);
|
||||
barr->task = current;
|
||||
|
||||
/*
|
||||
|
@ -2815,16 +2846,18 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
|
|||
spin_unlock_irq(&pool->lock);
|
||||
|
||||
/*
|
||||
* If @max_active is 1 or rescuer is in use, flushing another work
|
||||
* item on the same workqueue may lead to deadlock. Make sure the
|
||||
* flusher is not running on the same workqueue by verifying write
|
||||
* access.
|
||||
* Force a lock recursion deadlock when using flush_work() inside a
|
||||
* single-threaded or rescuer equipped workqueue.
|
||||
*
|
||||
* For single threaded workqueues the deadlock happens when the work
|
||||
* is after the work issuing the flush_work(). For rescuer equipped
|
||||
* workqueues the deadlock happens when the rescuer stalls, blocking
|
||||
* forward progress.
|
||||
*/
|
||||
if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)
|
||||
if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
|
||||
lock_map_acquire(&pwq->wq->lockdep_map);
|
||||
else
|
||||
lock_map_acquire_read(&pwq->wq->lockdep_map);
|
||||
lock_map_release(&pwq->wq->lockdep_map);
|
||||
lock_map_release(&pwq->wq->lockdep_map);
|
||||
}
|
||||
|
||||
return true;
|
||||
already_gone:
|
||||
|
|
|
@ -1091,6 +1091,8 @@ config PROVE_LOCKING
|
|||
select DEBUG_MUTEXES
|
||||
select DEBUG_RT_MUTEXES if RT_MUTEXES
|
||||
select DEBUG_LOCK_ALLOC
|
||||
select LOCKDEP_CROSSRELEASE
|
||||
select LOCKDEP_COMPLETIONS
|
||||
select TRACE_IRQFLAGS
|
||||
default n
|
||||
help
|
||||
|
@ -1160,6 +1162,22 @@ config LOCK_STAT
|
|||
CONFIG_LOCK_STAT defines "contended" and "acquired" lock events.
|
||||
(CONFIG_LOCKDEP defines "acquire" and "release" events.)
|
||||
|
||||
config LOCKDEP_CROSSRELEASE
|
||||
bool
|
||||
help
|
||||
This makes lockdep work for crosslock which is a lock allowed to
|
||||
be released in a different context from the acquisition context.
|
||||
Normally a lock must be released in the context acquiring the lock.
|
||||
However, relaxing this constraint lets synchronization primitives
such as page locks or completions use the lock correctness
detector, lockdep.
|
||||
|
||||
config LOCKDEP_COMPLETIONS
|
||||
bool
|
||||
help
|
||||
A deadlock caused by wait_for_completion() and complete() can be
|
||||
detected by lockdep using crossrelease feature.
|
||||
|
||||
config DEBUG_LOCKDEP
|
||||
bool "Lock dependency engine debugging"
|
||||
depends on DEBUG_KERNEL && LOCKDEP
|
||||
|
|
|
@ -362,6 +362,103 @@ static void rsem_AA3(void)
|
|||
RSL(X2); // this one should fail
|
||||
}
|
||||
|
||||
/*
|
||||
* read_lock(A)
|
||||
* spin_lock(B)
|
||||
* spin_lock(B)
|
||||
* write_lock(A)
|
||||
*/
|
||||
static void rlock_ABBA1(void)
|
||||
{
|
||||
RL(X1);
|
||||
L(Y1);
|
||||
U(Y1);
|
||||
RU(X1);
|
||||
|
||||
L(Y1);
|
||||
WL(X1);
|
||||
WU(X1);
|
||||
U(Y1); // should fail
|
||||
}
|
||||
|
||||
static void rwsem_ABBA1(void)
|
||||
{
|
||||
RSL(X1);
|
||||
ML(Y1);
|
||||
MU(Y1);
|
||||
RSU(X1);
|
||||
|
||||
ML(Y1);
|
||||
WSL(X1);
|
||||
WSU(X1);
|
||||
MU(Y1); // should fail
|
||||
}
|
||||
|
||||
/*
|
||||
* read_lock(A)
|
||||
* spin_lock(B)
|
||||
* spin_lock(B)
|
||||
* read_lock(A)
|
||||
*/
|
||||
static void rlock_ABBA2(void)
|
||||
{
|
||||
RL(X1);
|
||||
L(Y1);
|
||||
U(Y1);
|
||||
RU(X1);
|
||||
|
||||
L(Y1);
|
||||
RL(X1);
|
||||
RU(X1);
|
||||
U(Y1); // should NOT fail
|
||||
}
|
||||
|
||||
static void rwsem_ABBA2(void)
|
||||
{
|
||||
RSL(X1);
|
||||
ML(Y1);
|
||||
MU(Y1);
|
||||
RSU(X1);
|
||||
|
||||
ML(Y1);
|
||||
RSL(X1);
|
||||
RSU(X1);
|
||||
MU(Y1); // should fail
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* write_lock(A)
|
||||
* spin_lock(B)
|
||||
* spin_lock(B)
|
||||
* write_lock(A)
|
||||
*/
|
||||
static void rlock_ABBA3(void)
|
||||
{
|
||||
WL(X1);
|
||||
L(Y1);
|
||||
U(Y1);
|
||||
WU(X1);
|
||||
|
||||
L(Y1);
|
||||
WL(X1);
|
||||
WU(X1);
|
||||
U(Y1); // should fail
|
||||
}
|
||||
|
||||
static void rwsem_ABBA3(void)
|
||||
{
|
||||
WSL(X1);
|
||||
ML(Y1);
|
||||
MU(Y1);
|
||||
WSU(X1);
|
||||
|
||||
ML(Y1);
|
||||
WSL(X1);
|
||||
WSU(X1);
|
||||
MU(Y1); // should fail
|
||||
}
|
||||
|
||||
/*
|
||||
* ABBA deadlock:
|
||||
*/
|
||||
|
@ -1056,8 +1153,6 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
|
|||
if (debug_locks != expected) {
|
||||
unexpected_testcase_failures++;
|
||||
pr_cont("FAILED|");
|
||||
|
||||
dump_stack();
|
||||
} else {
|
||||
testcase_successes++;
|
||||
pr_cont(" ok |");
|
||||
|
@ -1933,6 +2028,30 @@ void locking_selftest(void)
|
|||
dotest(rsem_AA3, FAILURE, LOCKTYPE_RWSEM);
|
||||
pr_cont("\n");
|
||||
|
||||
print_testname("mixed read-lock/lock-write ABBA");
|
||||
pr_cont(" |");
|
||||
dotest(rlock_ABBA1, FAILURE, LOCKTYPE_RWLOCK);
|
||||
/*
|
||||
* Lockdep does indeed fail here, but there's nothing we can do about
|
||||
* that now. Don't kill lockdep for it.
|
||||
*/
|
||||
unexpected_testcase_failures--;
|
||||
|
||||
pr_cont(" |");
|
||||
dotest(rwsem_ABBA1, FAILURE, LOCKTYPE_RWSEM);
|
||||
|
||||
print_testname("mixed read-lock/lock-read ABBA");
|
||||
pr_cont(" |");
|
||||
dotest(rlock_ABBA2, SUCCESS, LOCKTYPE_RWLOCK);
|
||||
pr_cont(" |");
|
||||
dotest(rwsem_ABBA2, FAILURE, LOCKTYPE_RWSEM);
|
||||
|
||||
print_testname("mixed write-lock/lock-write ABBA");
|
||||
pr_cont(" |");
|
||||
dotest(rlock_ABBA3, FAILURE, LOCKTYPE_RWLOCK);
|
||||
pr_cont(" |");
|
||||
dotest(rwsem_ABBA3, FAILURE, LOCKTYPE_RWSEM);
|
||||
|
||||
printk(" --------------------------------------------------------------------------\n");
|
||||
|
||||
/*
|
||||
|
|
Some files were not shown because too many files have changed in this diff.