The conntrack code can export the internal secid to userspace. These are
dynamic, can change on lsm changes, and have no meaning in userspace. We
should instead be sending lsm contexts to userspace instead. This patch sends
the secctx (rather than secid) to userspace over the netlink socket. We use a
new field CTA_SECCTX and stop using the the old CTA_SECMARK field since it did
not send particularly useful information.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
With the (long ago) interface change to have the secid_to_secctx functions
do the string allocation instead of having the caller do the allocation we
lost the ability to query the security server for the length of the
upcoming string. The SECMARK code would like to allocate a netlink skb
with enough length to hold the string but it is just too unclean to do the
string allocation twice or to do the allocation the first time and hold
onto the string and slen. This patch adds the ability to call
security_secid_to_secctx() with a NULL data pointer and it will just set
the slen pointer.
Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Right now secmark has lots of direct selinux calls. Use all LSM calls and
remove all SELinux specific knowledge. The only SELinux specific knowledge
we leave is the mode. The only point is to make sure that other LSMs at
least test this generic code before they assume it works. (They may also
have to make changes if they do not represent labels as strings)
Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
All security modules shouldn't change sched_param parameter of
security_task_setscheduler(). This is not only meaningless, but also
make a harmful result if caller pass a static variable.
This patch remove policy and sched_param parameter from
security_task_setscheduler() becuase none of security module is
using it.
Cc: James Morris <jmorris@namei.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: James Morris <jmorris@namei.org>
These facilitate preallocation of pages so that we can encode into the pagelist
in an atomic context.
Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:
- ceph_client becomes ceph_fs_client and ceph_client, where the latter
captures the mon and osd clients, and the fs_client gets the mds client
and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into
two pieces.
- The mon client gets a generic handler callback for otherwise unknown
messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
ceph_fs_client).
No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.
Signed-off-by: Sage Weil <sage@newdream.net>
Add a new readmostly percpu section and API. This can be used to
avoid dirtying data lines which are generally not written to, which is
especially important for data which may be accessed by processors
other than the one for which the percpu area belongs to.
[ hpa: moved it *after* the page-aligned section, for obvious
reasons. ]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <1287544022.4571.7.camel@sli10-conroe.sh.intel.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
There is no point using RCU for dst we allocate for a very short time
(used once).
Change dst_release() to take DST_NOCACHE into account, but also change
skb_dst_set_noref() to force a refcount increment for such dst.
This is a _huge_ gain, because we dont waste memory to store xx thousand
of dsts. Instead of queueing them to RCU, we can free them instantly.
CPU caches can stay hot, re-using same memory blocks to hold temporary
dsts.
Note : remove unneeded smp_mb__before_atomic_dec(); in dst_release(),
since atomic_dec_return() implies a full memory barrier.
Stress test, 160.000.000 udp frames sent, IP route cache disabled
(DDOS).
Before:
real 0m38.091s
user 0m13.189s
sys 7m53.018s
After:
real 0m29.946s
user 0m12.157s
sys 7m40.605s
For reference, if IP route cache was enabled :
real 0m32.030s
user 0m10.521s
sys 8m15.243s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces netif_alloc_netdev_queues which is called from
register_device instead of alloc_netdev_mq. This makes TX queue
allocation symmetric with RX allocation. Also, queue locks allocation
is done in netdev_init_one_queue. Change set_real_num_tx_queues to
fail if requested number < 1 or greater than number of allocated
queues.
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Found by running make namespacecheck on linux-next
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit a25909a4 (lockdep: Add an in_workqueue_context() lockdep-based
test function) added in_workqueue_context() but there hasn't been any
in-kernel user and the lockdep annotation in workqueue is scheduled to
change. Remove the unused function.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
IPv6 encapsulation uses a bad source address for the tunnel.
i.e. VIP will be used as local-addr and encap. dst addr.
Decapsulation will not accept this.
Example
LVS (eth1 2003::2:0:1/96, VIP 2003::2:0:100)
(eth0 2003::1:0:1/96)
RS (ethX 2003::1:0:5/96)
tcpdump
2003::2:0:100 > 2003::1:0:5: IP6 (hlim 63, next-header TCP (6) payload length: 40) 2003::3:0:10.50991 > 2003::2:0:100.http: Flags [S], cksum 0x7312 (correct), seq 3006460279, win 5760, options [mss 1440,sackOK,TS val 1904932 ecr 0,nop,wscale 3], length 0
In Linux IPv6 impl. you can't have a tunnel with an any cast address
receiving packets (I have not tried to interpret RFC 2473)
To have receive capabilities the tunnel must have:
- Local address set as multicast addr or an unicast addr
- Remote address set as an unicast addr.
- Loop back addres or Link local address are not allowed.
This causes us to setup a tunnel in the Real Server with the
LVS as the remote address, here you can't use the VIP address since it's
used inside the tunnel.
Solution
Use outgoing interface IPv6 address (match against the destination).
i.e. use ip6_route_output() to look up the route cache and
then use ipv6_dev_get_saddr(...) to set the source address of the
encapsulated packet.
Additionally, cache the results in new destination
fields: dst_cookie and dst_saddr and properly check the
returned dst from ip6_route_output. We now add xfrm_lookup
call only for the tunneling method where the source address
is a local one.
Signed-off-by:Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This patch allows to listen to events that inform about
expectations destroyed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
To help to determine if digital display port needs to enable
audio output or not. This one adds a helper to get monitor's
audio capability via EDID CEA extension block.
Tested-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
cciss: fix PCI IDs for new controllers
This patch fixes the botched up PCI IDs of new controllers. Please consider
this patch for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
/proc/diskstats would display a strange output as follows.
$ cat /proc/diskstats |grep sda
8 0 sda 90524 7579 102154 20464 0 0 0 0 0 14096 20089
8 1 sda1 19085 1352 21841 4209 0 0 0 0 4294967064 15689 4293424691
~~~~~~~~~~
8 2 sda2 71252 3624 74891 15950 0 0 0 0 232 23995 1562390
8 3 sda3 54 487 2188 92 0 0 0 0 0 88 92
8 4 sda4 4 0 8 0 0 0 0 0 0 0 0
8 5 sda5 81 2027 2130 138 0 0 0 0 0 87 137
Its reason is the wrong way of accounting hd_struct->in_flight. When a bio is
merged into a request belongs to different partition by ELEVATOR_FRONT_MERGE.
The detailed root cause is as follows.
Assuming that there are two partition, sda1 and sda2.
1. A request for sda2 is in request_queue. Hence sda1's hd_struct->in_flight
is 0 and sda2's one is 1.
| hd_struct->in_flight
---------------------------
sda1 | 0
sda2 | 1
---------------------------
2. A bio belongs to sda1 is issued and is merged into the request mentioned on
step1 by ELEVATOR_BACK_MERGE. The first sector of the request is changed
from sda2 region to sda1 region. However the two partition's
hd_struct->in_flight are not changed.
| hd_struct->in_flight
---------------------------
sda1 | 0
sda2 | 1
---------------------------
3. The request is finished and blk_account_io_done() is called. In this case,
sda2's hd_struct->in_flight, not a sda1's one, is decremented.
| hd_struct->in_flight
---------------------------
sda1 | -1
sda2 | 1
---------------------------
The patch fixes the problem by caching the partition lookup
inside the request structure, hence making sure that the increment
and decrement will always happen on the same partition struct. This
also speeds up IO with accounting enabled, since it cuts down on
the number of lookups we have to do.
When reloading partition tables, quiesce IO to ensure that no
request references to the partition struct exists. When it is safe
to free the partition table, the IO for that device is restarted
again.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The enter argument as implemented by commit 413d45d362 (drm, kdb, kms:
Add an enter argument to mode_set_base_atomic() API) should be more
descriptive as to what it does vs just passing 1 and 0 around.
There is no runtime behavior change as a result of this patch.
Reported-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
CC: David Airlie <airlied@linux.ie>
CC: dri-devel@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
We need the unlocked variant for the new codepath introduced to fix the
race condition in master recently.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reverts commit f6765502f8 and adds
the missing include file.
Signed-off-by: Peter Hsiang <Peter.Hsiang@maxim-ic.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
s390/powerpc/ia64 have support for CONFIG_VIRT_CPU_ACCOUNTING which does
the fine granularity accounting of user, system, hardirq, softirq times.
Adding that option on archs like x86 will be challenging however, given the
state of TSC reliability on various platforms and also the overhead it will
add in syscall entry exit.
Instead, add a lighter variant that only does finer accounting of
hardirq and softirq times, providing precise irq times (instead of timer tick
based samples). This accounting is added with a new config option
CONFIG_IRQ_TIME_ACCOUNTING so that there won't be any overhead for users not
interested in paying the perf penalty.
This accounting is based on sched_clock, with the code being generic.
So, other archs may find it useful as well.
This patch just adds the core logic and does not enable this logic yet.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-5-git-send-email-venki@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
To account softirq time cleanly in scheduler, we need to identify whether
softirq is invoked in ksoftirqd context or softirq at hardirq tail context.
Add PF_KSOFTIRQD for that purpose.
As all PF flag bits are currently taken, create space by moving one of the
infrequently used bits (PF_THREAD_BOUND) down in task_struct to be along
with some other state fields.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-4-git-send-email-venki@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just a minor cleanup patch that makes things easier to the following patches.
No functionality change in this patch.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-3-git-send-email-venki@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra found a bug in the way softirq time is accounted in
VIRT_CPU_ACCOUNTING on this thread:
http://lkml.indiana.edu/hypermail//linux/kernel/1009.2/01366.html
The problem is, softirq processing uses local_bh_disable internally. There
is no way, later in the flow, to differentiate between whether softirq is
being processed or is it just that bh has been disabled. So, a hardirq when bh
is disabled results in time being wrongly accounted as softirq.
Looking at the code a bit more, the problem exists in !VIRT_CPU_ACCOUNTING
as well. As account_system_time() in normal tick based accouting also uses
softirq_count, which will be set even when not in softirq with bh disabled.
Peter also suggested solution of using 2*SOFTIRQ_OFFSET as irq count
for local_bh_{disable,enable} and using just SOFTIRQ_OFFSET while softirq
processing. The patch below does that and adds API in_serving_softirq() which
returns whether we are currently processing softirq or not.
Also changes one of the usages of softirq_count in net/sched/cls_cgroup.c
to in_serving_softirq.
Looks like many usages of in_softirq really want in_serving_softirq. Those
changes can be made individually on a case by case basis.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-2-git-send-email-venki@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The use of the JUMP_LABEL() construct ends up creating endless silly
wrappers, create a higher level construct to reduce this clutter.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Trades a call + conditional + ret for an unconditional jmp.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add an interface to allow usage of jump_labels with atomic counters.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that there's still only a few users around, rename things to make
them more consistent.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.448565169@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
hw_breakpoint creation needs to account stuff per-task to ensure there
is always sufficient hardware resources to back these things due to
ptrace.
With the perf per pmu context changes the event initialization no
longer has access to the event context, for the simple reason that we
need to first find the pmu (result of initialization) before we can
find the context.
This makes hw_breakpoints unhappy, because it can no longer do per
task accounting, cure this by frobbing a task pointer in the event::hw
bits for now...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.391543667@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.
Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.
The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.
Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current lockdep_map only caches one class with subclass == 0,
and looks up hash table of classes when subclass != 0.
It seems that this has no problem because the case of
subclass != 0 is rare. But locks of struct rq are
acquired with subclass == 1 when task migration is executed.
Task migration is high frequent event, so I modified lockdep
to cache subclasses.
I measured the score of perf bench sched messaging.
This patch has slightly but certain (order of milli seconds
or 10 milli seconds) effect when lots of tasks are running.
I'll show the result in the tail of this description.
NR_LOCKDEP_CACHING_CLASSES specifies how many classes can be
cached in the instances of lockdep_map.
I discussed with Peter Zijlstra in LinuxCon Japan about
this approach and he taught me that caching every subclasses(8)
is cleary waste of memory. So number of cached classes
should be configurable.
=== Score comparison of benchmarks ===
# "min" means best score, and "max" means worst score
for i in `seq 1 10`; do ./perf bench -f simple sched messaging; done
before: min: 0.565000, max: 0.583000, avg: 0.572500
after: min: 0.559000, max: 0.568000, avg: 0.563300
# with more processes
for i in `seq 1 10`; do ./perf bench -f simple sched messaging -g 40; done
before: min: 2.274000, max: 2.298000, avg: 2.286300
after: min: 2.242000, max: 2.270000, avg: 2.259700
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286269311-28336-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
PS2Mult is a simple serial protocol used for multiplexing several PS/2
streams into one serial data stream. It's used e.g. on TQM85xx series
of boards.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The bonding driver currently modifies the netpoll structure in its xmit path
while sending frames from netpoll. This is racy, as other cpus can access the
netpoll structure in parallel. Since the bonding driver points np->dev to a
slave device, other cpus can inadvertently attempt to send data directly to
slave devices, leading to improper locking with the bonding master, lost frames,
and deadlocks. This patch fixes that up.
This patch also removes the real_dev pointer from the netpoll structure as that
data is really only used by bonding in the poll_controller, and we can emulate
its behavior by check each slave for IS_UP.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit e446630c96, i.e. v2.6.35-rc1,
the mcp251x chip model can be selected via the modalias member in the
struct spi_board_info. The driver stores the actual model in the
struct mcp251x_platform_data.
From the driver point of view the platform_data should be read only.
Since all in-tree users of the mcp251x have already been converted to
the modalias method, this patch moves the "model" member from the
struct mcp251x_platform_data to the driver's private data structure.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Christian Pellegrin <chripell@fsfe.org>
Cc: Marc Zyngier <maz@misterjones.org>
The ebt_ip6.h and ebt_nflog.h headers are not not known to Kbuild and
therefore not installed by make headers_install. Fix that up.
Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The patch below updates broken web addresses in the kernel
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
For the Blackfin port, we can use much of the asm-generic/io.h header,
but we still need to declare some of our own versions of functions.
Like the __raw_read* and in/out "string" helpers. So let people do
this easily for many of these funcs.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
CCR is defined in emu10k1, but SuperH is defined too.
If user use this driver with SuperH, it becomes a double definition.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Version 20101013.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
In a network bench, I noticed an unfortunate false sharing between
'loopback_dev' and 'count' fields in "struct net".
'count' is written each time a socket is created or destroyed, while
loopback_dev might be often read in routing code.
Move loopback_dev in a read mostly section of "struct net"
Note: struct netns_xfrm is cache line aligned on SMP.
(It contains a "struct dst_ops")
Move it at the end to avoid holes, and reduce sizeof(struct net) by 128
bytes on ia32.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we compile the ASoC code with PM disabled, we hit stuff like:
sound/soc/soc-dapm.c: In function 'snd_soc_dapm_suspend_check':
sound/soc/soc-dapm.c:440: warning: unused variable 'codec'
So tweak the stub macro to avoid these issues.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
SoCs have a standard set of tuples consisting of frequency and
voltage pairs that the device will support per voltage domain. These
are called Operating Performance Points or OPPs. The actual
definitions of OPP varies over silicon versions. For a specific domain,
we can have a set of {frequency, voltage} pairs. As the kernel boots
and more information is available, a default set of these are activated
based on the precise nature of device. Further on operation, based on
conditions prevailing in the system (such as temperature), some OPP
availability may be temporarily controlled by the SoC frameworks.
To implement an OPP, some sort of power management support is necessary
hence this library depends on CONFIG_PM.
Contributions include:
Sanjeev Premi for the initial concept:
http://patchwork.kernel.org/patch/50998/
Kevin Hilman for converting original design to device-based.
Kevin Hilman and Paul Walmsey for cleaning up many of the function
abstractions, improvements and data structure handling.
Romit Dasgupta for using enums instead of opp pointers.
Thara Gopinath, Eduardo Valentin and Vishwanath BS for fixes and
cleanups.
Linus Walleij for recommending this layer be made generic for usage
in other architectures beyond OMAP and ARM.
Mark Brown, Andrew Morton, Rafael J. Wysocki, Paul E. McKenney for
valuable improvements.
Discussions and comments from:
http://marc.info/?l=linux-omap&m=126033945313269&w=2http://marc.info/?l=linux-omap&m=125482970102327&w=2http://marc.info/?t=125809247500002&r=1&w=2http://marc.info/?l=linux-omap&m=126025973426007&w=2http://marc.info/?t=128152609200064&r=1&w=2http://marc.info/?t=128468723000002&r=1&w=2
incorporated.
v1: http://marc.info/?t=128468723000002&r=1&w=2
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If the device which fails to resume is part of a loadable kernel module
it won't be checked at startup against the magic number stored in the
RTC.
Add a read-only sysfs attribute /sys/power/pm_trace_dev_match which
contains a list of newline separated devices (usually just the one)
which currently match the last magic number. This allows the device
which is failing to resume to be found after the modules are loaded
again.
Signed-off-by: James Hogan <james@albanarts.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If there is a wakeup event during the freezing of tasks, suspend or
hibernation will fail anyway. Since try_to_freeze_tasks() can take
up to 20 seconds to complete or fail, aborting it as soon as a wakeup
event is detected improves the worst case wakeup latency.
Based on a patch from Arve Hjønnevåg.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
The patch "PM / Runtime: Implement autosuspend support" introduces
"autosuspend" facility for runtime PM, but misses helper function
of pm_request_autosuspend, so add it.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1427) implements the "autosuspend" facility for runtime
PM. A few new fields are added to the dev_pm_info structure and
several new PM helper functions are defined, for telling the PM core
whether or not a device uses autosuspend, for setting the autosuspend
delay, and for marking periods of device activity.
Drivers that do not want to use autosuspend can continue using the
same helper functions as before; their behavior will not change. In
addition, drivers supporting autosuspend can also call the old helper
functions to get the old behavior.
The details are all explained in Documentation/power/runtime_pm.txt
and Documentation/ABI/testing/sysfs-devices-power.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Some devices, such as USB interfaces, cannot be power-managed
independently of their parents, i.e., they cannot be put in low power
while the parent remains at full power. This patch (as1425) creates a
new "no_callbacks" flag, which tells the PM core not to invoke the
runtime-PM callback routines for the such devices but instead to
assume that the callbacks always succeed. In addition, the
non-debugging runtime-PM sysfs attributes for the devices are removed,
since they are pretty much meaningless.
The advantage of this scheme comes not so much from avoiding the
callbacks themselves, but rather from the fact that without the need
for a process context in which to run the callbacks, more work can be
done in interrupt context.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1424) combines the various public entry points for the
runtime PM routines into three simple functions: one for idle, one for
suspend, and one for resume. A new bitflag specifies whether or not
to increment or decrement the usage_count field.
The new entry points are named __pm_runtime_idle,
__pm_runtime_suspend, and __pm_runtime_resume, to reflect that they
are trampolines. Simultaneously, the corresponding internal routines
are renamed to rpm_idle, rpm_suspend, and rpm_resume.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The "from_wq" argument in __pm_runtime_suspend() and
__pm_runtime_resume() supposedly indicates whether or not the function
was called by the PM workqueue thread, but in fact it isn't always
used this way. It really indicates whether or not the function should
return early if the requested operation is already in progress.
Along with this badly-named boolean argument, later patches in this
series will add several other boolean arguments to these functions and
others. Therefore this patch (as1422) begins the conversion process
by replacing from_wq with a bitflag argument. The same bitflags are
also used in __pm_runtime_get() and __pm_runtime_put(), where they
indicate whether or not the operation should be asynchronous.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1420) adds sysfs_merge_group() and sysfs_unmerge_group()
functions, allowing drivers easily to add and remove sets of
attributes to a pre-existing attribute group directory.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
There is a potential issue with the asynchronous suspend code that
a device driver suspending asynchronously may not notice that it
should back off. There are two failing scenarions, (1) when the
driver is waiting for a driver suspending synchronously to complete
and that second driver returns error code, in which case async_error
won't be set and the waiting driver will continue suspending and (2)
after the driver has called device_pm_wait_for_dev() and the waited
for driver returns error code, in which case the caller of
device_pm_wait_for_dev() will not know that there was an error and
will continue suspending.
To fix this issue make __device_suspend() set async_error, so
async_suspend() doesn't need to set it any more, and make
device_pm_wait_for_dev() return async_error, so that its callers
can check whether or not they should continue suspending.
No more changes are necessary, since device_pm_wait_for_dev() is
not used by any drivers' suspend routines.
Reported-by: Colin Cross <ccross@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce struct wakeup_source for representing system wakeup sources
within the kernel and for collecting statistics related to them.
Make the recently introduced helper functions pm_wakeup_event(),
pm_stay_awake() and pm_relax() use struct wakeup_source objects
internally, so that wakeup statistics associated with wakeup devices
can be collected and reported in a consistent way (the definition of
pm_relax() is changed, which is harmless, because this function is
not called directly by anyone yet). Introduce new wakeup-related
sysfs device attributes in /sys/devices/.../power for reporting the
device wakeup statistics.
Change the global wakeup events counters event_count and
events_in_progress into atomic variables, so that it is not necessary
to acquire a global spinlock in pm_wakeup_event(), pm_stay_awake()
and pm_relax(), which should allow us to avoid lock contention in
these functions on SMP systems with many wakeup devices.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Do some cleanups of TIPC based on make namespacecheck
1. Don't export unused symbols
2. Eliminate dead code
3. Make functions and variables local
4. Rename buf_acquire to tipc_buf_acquire since it is used in several files
Compile tested only.
This make break out of tree kernel modules that depend on TIPC routines.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit b30973f877 (node-aware skb allocation) spread a wrong habit of
allocating net drivers skbs on a given memory node : The one closest to
the NIC hardware. This is wrong because as soon as we try to scale
network stack, we need to use many cpus to handle traffic and hit
slub/slab management on cross-node allocations/frees when these cpus
have to alloc/free skbs bound to a central node.
skb allocated in RX path are ephemeral, they have a very short
lifetime : Extra cost to maintain NUMA affinity is too expensive. What
appeared as a nice idea four years ago is in fact a bad one.
In 2010, NIC hardwares are multiqueue, or we use RPS to spread the load,
and two 10Gb NIC might deliver more than 28 million packets per second,
needing all the available cpus.
Cost of cross-node handling in network and vm stacks outperforms the
small benefit hardware had when doing its DMA transfert in its 'local'
memory node at RX time. Even trying to differentiate the two allocations
done for one skb (the sk_buff on local node, the data part on NIC
hardware node) is not enough to bring good performance.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently have a kernel internal type called aligned_u64 which aligns
__u64's on 8 bytes boundaries even on systems which would normally align
them on 4 byte boundaries. This patch creates a new type __aligned_u64
which does the same thing but which is exposed to userspace rather than
being kernel internal.
[akpm: merge early as both the net and audit trees want this]
[akpm@linux-foundation.org: enhance the comment describing the reasons for using aligned_u64. Via Andreas and Andi.]
Based-on-patch-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: David Miller <davem@davemloft.net>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a number of problems with acpi_pm_device_sleep_state() now.
First, if _S0W is not defined, it prevents devices from being put
into D3 by PCI runtime PM, which shouldn't happen. Second, it
shouldn't use adev->wakeup.state.enabled, because if it's set, it
only means that either the device is permanently enabled to wake up
the system, or that it has been enabled to do that through
/proc/acpi/wakeup. Finally, it should be compiled if CONFIG_PM_SLEEP
is not set, so that PCI runtime PM works correctly in that case.
Fix these problems.
Reported-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Previously we tracked whether the integrity metadata had been remapped
using a request flag. This was fine for low-level retries. However, if
an I/O was redriven by upper layers we would end up remapping again,
causing the retry to fail.
Deprecate the REQ_INTEGRITY flag and introduce BIO_MAPPED_INTEGRITY
which enables filesystems to notify lower layers that the bio in
question has already been remapped.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
This adds a necessary race breaker to these commits:
drbd: fix for possible deadlock on IO error during resync
drbd: drop wrong debug asserts, fix recently introduced race
What we do is get a refcount, check the state, then depending on the
state and the requested minimum disk state, either hold it (success),
or give it back immediately (failed "try lock").
Some code paths (flushing of drbd metadata) may still grab and hold a
refcount even if we are D_FAILED (application IO won't).
So even if we hit local_cnt == 0 once after being D_FAILED,
we still need to wait for that again after we changed to D_DISKLESS.
Once local_cnt reaches 0 while we are D_DISKLESS, we can be sure that
no one will look at the protected members anymore, so only then is it
safe to free them.
We cannot easily convert to standard locking primitives here, as we want
to be able to use it in atomic context (we always do a "try lock"),
as well as hold references for a "long time" (from IO submission to
completion callback).
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Commit e9677b3ce (oprofile, ARM: Use oprofile_arch_exit() to
cleanup on failure) caused oprofile_perf_exit to be called
in the cleanup path of oprofile_perf_init. The __exit tag
for oprofile_perf_exit should therefore be dropped.
The same has to be done for exit_driverfs as well, as this
function is called from oprofile_perf_exit. Else, we get
the following two linker errors.
LD .tmp_vmlinux1
`oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1
LD .tmp_vmlinux1
`exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This adds a new clk_rate_div_range_round() for implementing rate rounding
by divisor ranges. This can be used trivially by clocks that support
arbitrary ranged divisors without the need for rate table construction.
This should only be used by clocks that both have large divisor ranges in
addition to clocks that will never be arbitrarily scaled, as the lack of
a backing frequency table will prevent cpufreq from being able to do much
of anything with them.
Primarily intended for use as a ->recalc helper.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Presently the only assisted rate rounding is frequency table backed, but
there are cases where it's impractical to use a frequency table for
certain clocks (such as the FSIDIV case, which supports 65535 divisors),
and we wish to reuse the same rate rounding algorithm.
This breaks out the core of the rate rounding logic in to its own helper
routine and shuffles the frequency table logic around, switching to using
an iterator for the generic helper routine.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This implements support for ioremapping of register windows that
encapsulate clock control registers used by a struct clk, with
transparent sibling inheritance.
Root clocks at the top of a given topology often encapsulate the entire
register space of all of their sibling clocks, so this mapping can be
done once and handed down. A given clock enable/disable case maps out to
a single bit in a shared register, so this prevents creating multiple
overlapping mappings.
The mapping case breaks down in to a couple of different situations:
- Sibling clocks without a specific mapping.
- Root clocks without a specific mapping.
- Any of sibling/root clocks with a specific mapping.
Sibling clocks with no specified mapping will grovel up the clock chain
and install the root clock mapping unconditionally at registration time.
Root clocks without their own mappings have a dummy BSS-initialized
mapping inserted that is handed down the chain just like any other
mapping. This permits all of the sibling clock ops to read/write using
the mapping offsets without any special configuration, enabling them to
not care whether access ultimately goes through translatable or
untranslatable memory.
Any clock with its own mapping will have the window initialized at
registration time and be ready for use by its clock ops. Failure to
establish the mapping will prevent registration, so no additional sanity
checks are needed. Sibling clocks that double as parents for the moment
will not propagate their mapping down, but this is easily tunable if the
need arises.
All clock mappings are kref refcounted, with each instance of mapping
inheritance incrementing the refcount.
Tested-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Tony Luck reports that the addition of the access_ok() check in commit
0eead9ab41 ("Don't dump task struct in a.out core-dumps") broke the
ia64 compile due to missing the necessary header file includes.
Rather than add yet another include (<asm/unistd.h>) to make everything
happy, just uninline the silly core dump helper functions and move the
bodies to fs/exec.c where they make a lot more sense.
dump_seek() in particular was too big to be an inline function anyway,
and none of them are in any way performance-critical. And we really
don't need to mess up our include file headers more than they already
are.
Reported-and-tested-by: Tony Luck <tony.luck@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for packing IBoE packet headers.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
[ Clean up and fix ib_ud_header_init() a bit. - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
akiphie points out that a.out core-dumps have that odd task struct
dumping that was never used and was never really a good idea (it goes
back into the mists of history, probably the original core-dumping
code). Just remove it.
Also do the access_ok() check on dump_write(). It probably doesn't
matter (since normal filesystems all seem to do it anyway), but he
points out that it's normally done by the VFS layer, so ...
[ I suspect that we should possibly do "vfs_write()" instead of
calling ->write directly. That also does the whole fsnotify and write
statistics thing, which may or may not be a good idea. ]
And just to be anal, do this all for the x86-64 32-bit a.out emulation
code too, even though it's not enabled (and won't currently even
compile)
Reported-by: akiphie <akiphie@lavabit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Various cleanup paths have been incomplete, for the very unlikely case
that we cannot allocate enough bios from process context when submitting
on behalf of the peer or resync process.
Never observed.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Connections through a compressing proxy might have more bits
on the fly. 500MByte instead of 50MByte
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
There are three ways to get IO suspended:
* Loss of any access to data
* Fence-peer-handler running
* User requested to suspend IO
Track those in different bits, so that one condition clearing its
state bit does not interfere with the other two conditions.
Only when the user resumes IO he overrules all three bits.
The fact is hidden from the user, he sees only a single suspend
bit.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We now track the data rate of locally submitted resync related requests,
and can thus detect non-resync activity on the lower level device.
If the current sync rate is above c-min-rate, and the lower level device
appears to be busy, we throttle the resyncer.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
When no data is accessible (no connection to the peer, nor a local disk)
allow the user to select to freeze all IO operations instead of getting
IO errors.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Define dummy __stop_machine() function even when
CONFIG_STOP_MACHINE=n. This getcpu-required version of
stop_machine() will be used from poke_text_smp().
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101014031030.4100.34156.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add support for IBoE device binding and IP --> GID resolution. Path
resolving and multicast joining are implemented within cma.c by
filling in the responses and running callbacks in the CMA work queue.
IP --> GID resolution always yields IPv6 link local addresses; remote
GIDs are derived from the destination MAC address of the remote port.
Multicast GIDs are always mapped to multicast MACs as is done in IPv6.
(IPv4 multicast is enabled by translating IPv4 multicast addresses to
IPv6 multicast as described in
<http://www.mail-archive.com/ipng@sunroof.eng.sun.com/msg02134.html>.)
Some helper functions are added to ib_addr.h.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Based on suggestion by Rémi Denis-Courmont to implement 'connect'
for Pipe controller logic, this patch implements 'connect' socket
call for the Pipe controller logic.
The patch does following:-
- Removes setsockopts for PNPIPE_CREATE and PNPIPE_DESTROY
- Adds setsockopt for setting the Pipe handle value
- Implements connect socket call
- Updates the Pipe controller logic
User-space should now follow below sequence with Pipe controller:-
-socket
-bind
-setsockopt for PNPIPE_PIPE_HANDLE
-connect
-setsockopt for PNPIPE_ENCAP_IP
-setsockopt for PNPIPE_ENABLE
GPRS/3G data has been tested working fine with this.
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using the frame registration notification, we
can see when probe requests are requested and
notify the low-level driver via filtering. The
flag is also set in AP and IBSS modes.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Drivers may need to adjust their filters according
to frame registrations, so notify them about them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
wext: fix alignment problem in serializing 'struct iw_point'
This fixes a typo in the definition of the serialized length of struct iw_point:
a) wireless.h is exported to userspace, the typo causes IW_EV_POINT_PK_LEN
to be 12 on 64-bit, and 8 on 32-bit systems (causing misalignment);
b) in compat-64 mode iwe_stream_add_point() memcpys overlap (see below).
The second case in in compat-64 mode looks like (variable names are as in
include/net/iw_handler.h:iwe_stream_add_point()):
point_len = IW_EV_COMPAT_POINT_LEN = 8
lcp_len = IW_EV_COMPAT_LCP_LEN = 4
2nd memcpy: IW_EV_POINT_PK_LEN - IW_EV_LCP_PK_LEN = 12 - 4 = 8
IW_EV_LCP_PK_LEN
<--------------> *---> 'extra' data area
+-------+-------+-------+-------+---------------+------- ...-+
| len | cmd |length | flags | (empty) -> extra ... |
+-------+-------+-------+-------+---------------+------- ...-+
2 2 2 2 4
lcp_len
<--------------> <-!! OVERLAP !!>
<--1st memcpy--><------- 2nd memcpy ----------->
<---- 3rd memcpy ------- ... >
<--------- point_len ---------->
This case could cause overrun whenever iw_point.length < 4.
The other two cases are -
* 32-bit systems: IW_EV_POINT_PK_LEN - IW_EV_LCP_PK_LEN = 8 - 4 = 4,
the second memcpy copies exactly the 4 required bytes;
* 64-bit systems: IW_EV_POINT_PK_LEN - IW_EV_LCP_PK_LEN = 12 - 4 = 8,
the second memcpy copies a superfluous (but non overlapping) 4 bytes.
The patch changes IW_EV_POINT_PK_LEN to be 8, so that in all 3 cases always only
the requested iw_point.{length,flags} (both __u16) are copied, avoiding overrrun
(compat-64) and superfluous copy (64-bit). In addition, the userspace header is
sanitized (in agreement with version 30 of the wireless tools).
Many thanks to Johannes Berg for help and review with this patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Physical block size was declared unsigned int to accomodate the maximum
size reported by READ CAPACITY(16). Make sure we use the right type in
the related functions.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Many of the used macros are just there for userspace compatibility.
Substitute the in-kernel code to directly use the terminal macro
and stuff the defines into #ifndef __KERNEL__ sections.
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Some (rare) serio devices need to have multiple serio children. One of
the examples is PS/2 multiplexer present on several TQC STKxxx boards,
which connect PS/2 keyboard and mouse to single tty port.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Remove "name" and "id" from drivers/sh/ struct clk.
The struct clk members "name" and "id" are not used
now when matching is done through clkdev.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With generic AC97 ASoC glue driver (codec/ac97.c), we get following warning when
the device is registered (slightly stripped the backtrace):
kobject (c5a863e8): tried to init an initialized object, something is seriously
wrong.
[<c00254fc>] (unwind_backtrace+0x0/0xec)
[<c014fad0>] (kobject_init+0x38/0x70)
[<c0171e94>] (device_initialize+0x20/0x70)
[<c017267c>] (device_register+0xc/0x18)
[<bf20db70>] (snd_soc_instantiate_cards+0x924/0xacc [snd_soc_core])
[<bf20e0d0>] (snd_soc_register_platform+0x16c/0x198 [snd_soc_core])
[<c0175304>] (platform_drv_probe+0x18/0x1c)
[<c0174454>] (driver_probe_device+0xb0/0x16c)
[<c017456c>] (__driver_attach+0x5c/0x7c)
[<c0173cec>] (bus_for_each_dev+0x48/0x78)
[<c0173600>] (bus_add_driver+0x98/0x214)
[<c0174834>] (driver_register+0xa4/0x130)
[<c001f410>] (do_one_initcall+0xd0/0x1a4)
[<c0062ddc>] (sys_init_module+0x12b0/0x1454)
This happens because the generic AC97 glue driver creates its codec->ac97 via
calling snd_ac97_mixer(). snd_ac97_mixer() provides own version of
snd_device.register which handles the device registration when
snd_card_register() is called.
To avoid registering the AC97 device twice, we add a new flag to the
snd_soc_codec: ac97_created which tells whether the AC97 device was created by
SoC subsystem.
Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Instead of referencing NO_IRQ in platform.c, define some helper functions
in irq.c to call instead from platform.c. Keep NO_IRQ usage local to
irq.c, and define NO_IRQ if not defined in headers.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
package-to-path is a PROM function which tells us the real (full) name of the
node. This provides a hook for that in the prom ops struct, and makes use
of it in the pdt code when attempting to determine a node's name. If the
hook is available, try using it (falling back to looking at the "name"
property if it fails).
Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
For symbols still lacking namespace qualifiers, add an of_pdt_ prefix.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Rather than assuming an architecture defines prom_getchild and friends,
define an ops struct with hooks for the various prom functions that
pdt.c needs. This ops struct is filled in by the
arch-(and sometimes firmware-)specific code, and passed to
of_pdt_build_devicetree.
Update sparc code to define the ops struct as well.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
What is the dev pointer doing inside the platform data anyway.
We have another pointer to the actual device at hand, use that.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This patch adds spi->mode support for the AMBA pl022 driver and
allows spidev to correctly alter SPI modes. Unused fields used in
the pl022 header file for the pl022_config_chip have been removed.
The ab8500 client driver selects the data transfer size instead
of the platform data.
For platforms that use the amba pl022 driver, the unused fields
in the controller data structure have been removed and the .mode
field in the SPI board info structure is used instead.
Signed-off-by: Kevin Wells <wellsk40@gmail.com>
Tested-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
This extends the PL022 SSP/SPI driver with generic DMA engine
support using the PrimeCell DMA engine interface. Also fix up the
test code for the U300 platform.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
We need to round memory regions correctly -- specifically, we need to
round reserved region in the more expansive direction (lower limit
down, upper limit up) whereas usable memory regions need to be rounded
in the more restrictive direction (lower limit up, upper limit down).
This introduces two set of inlines:
memblock_region_memory_base_pfn()
memblock_region_memory_end_pfn()
memblock_region_reserved_base_pfn()
memblock_region_reserved_end_pfn()
Although they are antisymmetric (and therefore are technically
duplicates) the use of the different inlines explicitly documents the
programmer's intention.
The lack of proper rounding caused a bug on ARM, which was then found
to also affect other architectures.
Reported-by: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CB4CDFD.4020105@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We tried very hard to remove all possible dev_hold()/dev_put() pairs in
network stack, using RCU conversions.
There is still an unavoidable device refcount change for every dst we
create/destroy, and this can slow down some workloads (routers or some
app servers, mmap af_packet)
We can switch to a percpu refcount implementation, now dynamic per_cpu
infrastructure is mature. On a 64 cpus machine, this consumes 256 bytes
per device.
On x86, dev_hold(dev) code :
before
lock incl 0x280(%ebx)
after:
movl 0x260(%ebx),%eax
incl fs:(%eax)
Stress bench :
(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE)
Before:
real 1m1.662s
user 0m14.373s
sys 12m55.960s
After:
real 0m51.179s
user 0m15.329s
sys 10m15.942s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds a bt_sock_stream_recvmsg() function for use by any
Bluetooth code that uses SOCK_STREAM sockets. This code is copied
from rfcomm_sock_recvmsg() with minimal modifications to remove
RFCOMM-specific functionality and improve readability.
L2CAP (with the SOCK_STREAM socket type) and RFCOMM have common needs
when it comes to reading data. Proper stream read semantics require
that applications can read from a stream one byte at a time and not
lose any data. The RFCOMM code already operated on and pulled data
from the underlying L2CAP socket, so very few changes were required to
make the code more generic for use with non-RFCOMM data over L2CAP.
Applications that need more awareness of L2CAP frame boundaries are
still free to use SOCK_SEQPACKET sockets, and may verify that they
connection did not fall back to basic mode by calling getsockopt().
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
HCI transport drivers may not know what type of radio an AMP device has
so only say whether they're BR/EDR or AMP devices.
Signed-off-by: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The move_irq_desc() function was only used due to the problem that the
allocator did not free the old descriptors. So the descriptors had to
be moved in create_irq_nr(). That's history.
The code would have never been able to move active interrupt
descriptors on affinity settings. That can be done in a completely
different way w/o all this horror.
Remove all of it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Use the cleanup functions of the dynamic allocator. No need to have
separate implementations.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
This function should have not been there in the first place.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
irq_2_iommu is now in the x86 code where it belongs. Remove all
leftovers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
irq_2_iommu is in struct irq_cfg, so we can do the irq_remapped check
based on irq_cfg instead of going through a lookup function. That's
especially interesting in the eoi_ioapic_irq() hotpath.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
That interrupt remapping code is x86 specific and tied to the io_apic
code. No need for separate allocator functions in the interrupt
remapping code. This allows to simplify the code and irq_2_iommu is
small (13 bytes on 64bit) so it's not a real problem even if interrupt
remapping is runtime disabled. If it's compile time disabled the
impact is zero.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Handing down irq_desc to msi just so that msi can access
irq_desc.irq_data.msi_desc is a pretty stupid idea. The calling code
can hand down a pointer to msi_desc so msi code does not need to know
about the irq descriptor at all.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
sparse irq sets up NR_IRQS_LEGACY irq descriptors and archs then go
ahead and allocate more.
Use the unused return value of arch_probe_nr_irqs() to let the
architecture return the number of early allocations. Fix up all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Mark a range of interrupts as allocated. In the SPARSE_IRQ=n case we
need this to update the bitmap for the legacy irqs so the enumerator
via irq_get_next_irq() works.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The current sparse_irq allocator has several short comings due to
failures in the design or the lack of it:
- Requires iteration over the number of active irqs to find a free slot
(Some architectures have grown their own workarounds for this)
- Removal of entries is not possible
- Racy between create_irq_nr and destroy_irq (plugged by horrible
callbacks)
- Migration of active irq descriptors is not possible
- No bulk allocation of irq ranges
- Sprinkeled irq_desc references all over the place outside of kernel/irq/
(The previous chip functions series is addressing this issue)
Implement a sane allocator which fixes the above short comings (though
migration of active descriptors needs a full tree wide cleanup of the
direct and mostly unlocked access to irq_desc).
The new allocator still uses a radix_tree, but uses a bitmap for
keeping track of allocated irq numbers. That allows:
- Fast lookup of a free slot
- Allows the removal of descriptors
- Prevents the create/destroy race
- Bulk allocation of consecutive irq ranges
- Basic design is ready for migration of life descriptors after
further cleanups
The bitmap is also used in the SPARSE_IRQ=n case for lookup and
raceless (de)allocation of irq numbers. So it removes the requirement
for looping through the descriptor array to find slots.
Right now it uses sparse_irq_lock to protect the bitmap and the radix
tree, but after cleaning up all users we should be able convert that
to a mutex and to switch the radix_tree and decriptor allocations to
GFP_KERNEL.
[ Folded in a bugfix from Yinghai Lu ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Arch code sets it's own irq_desc.status flags right after boot and for
dynamically allocated interrupts. That might involve iterating over a
huge array.
Allow ARCH_IRQ_INIT_FLAGS to set separate flags aside of IRQ_DISABLED
which is the default.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
early_init_irq_lock_class() is called way before anything touches the
irq descriptors. In case of SPARSE_IRQ=y this is a NOP operation
because the radix tree is empty at this point. For the SPARSE_IRQ=n
case it's sufficient to set the lock class in early_init_irq().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Get the data structure from the core and provide inline wrappers to
access the irq_data members.
Provide accessor inlines for irq_data as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Provide a irq_desc.status modifier function to cleanup the direct
access to irq_desc in arch and driver code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
move_irq() has no users. Remove it and simplify the ifdef forrest while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
This patch disables the fanotify syscalls by just not building them and
letting the cond_syscall() statements in kernel/sys_ni.c redirect them
to sys_ni_syscall().
It was pointed out by Tvrtko Ursulin that the fanotify interface did not
include an explicit prioritization between groups. This is necessary
for fanotify to be usable for hierarchical storage management software,
as they must get first access to the file, before inotify-like notifiers
see the file.
This feature can be added in an ABI compatible way in the next release
(by using a number of bits in the flags field to carry the info) but it
was suggested by Alan that maybe we should just hold off and do it in
the next cycle, likely with an (new) explicit argument to the syscall.
I don't like this approach best as I know people are already starting to
use the current interface, but Alan is all wise and noone on list backed
me up with just using what we have. I feel this is needlessly ripping
the rug out from under people at the last minute, but if others think it
needs to be a new argument it might be the best way forward.
Three choices:
Go with what we got (and implement the new feature next cycle). Add a
new field right now (and implement the new feature next cycle). Wait
till next cycle to release the ABI (and implement the new feature next
cycle). This is number 3.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reason for merge:
Forward-port urgent change to arch/x86/mm/srat_64.c to the memblock tree.
Resolved Conflicts:
arch/x86/mm/srat_64.c
Originally-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Le mardi 12 octobre 2010 à 00:02 +0200, Eric Dumazet a écrit :
> Here is the followup patch.
>
> Thanks !
>
Oops, this was an old version, the up2date ones also took care of "used"
field.
I guess its time for a sleep, sorry again.
[PATCH net-next V2] neigh: reorder struct neighbour fields
(refcnt) and (ha_lock, ha, used, dev, output, ops, primary_key) should
be placed on a separate cache lines.
refcnt can be often written, while other fields are mostly read.
This gave me good result on stress test :
before:
real 0m45.570s
user 0m15.525s
sys 9m56.669s
After:
real 0m41.841s
user 0m15.261s
sys 8m45.949s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ST Micro derivates have several extra interesting registers
that we may soon use for something interesting so may just as
well define them in the header.
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix the limit on the size of max fast registration WRs that can be
posted to match hardware capabilities.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
struct dst_ops tracks number of allocated dst in an atomic_t field,
subject to high cache line contention in stress workload.
Switch to a percpu_counter, to reduce number of time we need to dirty a
central location. Place it on a separate cache line to avoid dirtying
read only fields.
Stress test :
(Sending 160.000.000 UDP frames,
IP route cache disabled, dual E5540 @2.53GHz,
32bit kernel, FIB_TRIE, SLUB/NUMA)
Before:
real 0m51.179s
user 0m15.329s
sys 10m15.942s
After:
real 0m45.570s
user 0m15.525s
sys 9m56.669s
With a small reordering of struct neighbour fields, subject of a
following patch, (to separate refcnt from other read mostly fields)
real 0m41.841s
user 0m15.261s
sys 8m45.949s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a seqlock in struct neighbour to protect neigh->ha[], and avoid
dirtying neighbour in stress situation (many different flows / dsts)
Dirtying takes place because of read_lock(&n->lock) and n->used writes.
Switching to a seqlock, and writing n->used only on jiffies changes
permits less dirtying.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates IEEE802.11 mesh constants to be consistent with newly
approved values. It modifies some values, as well as adds many new constants
in preparation for updating mesh code to the current 802.11s drafts. ANA
numbers were taken from:
https://mentor.ieee.org/802.11/dcn/09/11-09-0031-12-0000-ana-database-assigned-number-authority.xls
A few notes are in order:
1. This will break backwards compatibility with existing Linux kernels as
over-the-air constants have changed.
2. Some old and obsolete constants have been retained for now as the mesh code
itself hasn't been updated yet to the new 802.11s draft. This was desired to
keep the existing mesh scheme working until it can be updated. Adding the
approved values is the first step in updating the mesh code.
3. Obsolete constants have been clearly marked.
4. All ANA approved 802.11s constants have been added.
Signed-off-by: Steve deRosier <steve@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Using these, user space can calculate a relative channel utilization
with arbitrary intervals by regularly taking snapshots of the survey
results.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some stats for /proc/net/wireless (and wext in general) are not
being set. This patch addresses a few of those with values easily
obtained from mac80211 core.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When a new dst is used to send a frame, neigh_resolve_output() tries to
associate an struct hh_cache to this dst, calling neigh_hh_init() with
the neigh rwlock write locked.
Most of the time, hh_cache is already known and linked into neighbour,
so we find it and increment its refcount.
This patch changes the logic so that we call neigh_hh_init() with
neighbour lock read locked only, so that fast path can be run in
parallel by concurrent cpus.
This brings part of the speedup we got with commit c7d4426a98
(introduce DST_NOCACHE flag) for non cached dsts, even for cached ones,
removing one of the contention point that routers hit on multiqueue
enabled machines.
Further improvements would need to use a seqlock instead of an rwlock to
protect neigh->ha[], to not dirty neigh too often and remove two atomic
ops.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the perf-events backend from arch/arm/oprofile into
drivers/oprofile so that the code can be shared between architectures.
This allows each architecture to maintain only a single copy of the PMU
accessor functions instead of one for both perf and OProfile. It also
becomes possible for other architectures to delete much of their
OProfile code in favour of the common code now available in
drivers/oprofile/oprofile_perf.c.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Make op_name_from_perf_id() global so that we have a way for each
architecture to construct an oprofile name for op->cpu_type. We need to
remove the argument from the function prototype so that we can hide all
implementation details inside the function.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Introduce perf_pmu_name() helper function that returns the name of the
pmu. This gives us a generic way to get the name of a pmu regardless of
how an architecture identifies it internally.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Add WQ_MEM_RECLAIM flag which currently maps to WQ_RESCUER, mark
WQ_RESCUER as internal and replace all external WQ_RESCUER usages to
WQ_MEM_RECLAIM.
This makes the API users express the intent of the workqueue instead
of indicating the internal mechanism used to guarantee forward
progress. This is also to make it cleaner to add more semantics to
WQ_MEM_RECLAIM. For example, if deemed necessary, memory reclaim
workqueues can be made highpri.
This patch doesn't introduce any functional change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
The number of counters for the registered pmu is needed in a few places
so provide a helper function that returns this number.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Clean up pdt.c:
- make build dependent upon config OF_PROMTREE
- #ifdef out the sparc-specific stuff
- create pdt-specific header
Signed-off-by: Andres Salomon <dilinger@queued.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
If CONFIG_GENERIC_FIND_NEXT_BIT is enabled, find_next_bit() and
find_next_zero_bit() are doubly declared in asm-generic/bitops/find.h
and linux/bitops.h.
asm/bitops.h includes asm-generic/bitops/find.h if and only if the
architecture enables CONFIG_GENERIC_FIND_NEXT_BIT. And asm/bitops.h
is included by linux/bitops.h
So we can just remove the extern declarations of find_next_bit() and
find_next_zero_bit() in linux/bitops.h.
Also we can remove unneeded #ifndef CONFIG_GENERIC_FIND_NEXT_BIT in
asm-generic/bitops/find.h.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
asm-generic/bitops/find.h has the extern declarations of find_next_bit()
and find_next_zero_bit() and the macro definitions of find_first_bit()
and find_first_zero_bit(). It is only usable by the architectures which
enables CONFIG_GENERIC_FIND_NEXT_BIT and disables
CONFIG_GENERIC_FIND_FIRST_BIT.
x86 and tile enable both CONFIG_GENERIC_FIND_NEXT_BIT and
CONFIG_GENERIC_FIND_FIRST_BIT. These architectures cannot include
asm-generic/bitops/find.h in their asm/bitops.h. So ifdefed extern
declarations of find_first_bit and find_first_zero_bit() are put in
linux/bitops.h.
This makes asm-generic/bitops/find.h usable by these architectures
and use it. Also this change is needed for the forthcoming duplicated
extern declarations cleanup.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
All 'pid_t' were changed to '__kernel_pid_t' in a previous commit:
make exported headers use strict posix types
A number of standard posix types are used in exported headers,
which is not allowed if __STRICT_KERNEL_NAMES is defined. In order
to get rid of the non-__STRICT_KERNEL_NAMES part and to make sane
headers the default, we have to change them all to safe types.
but a later change introduced 'pid_t' again:
fcntl: add F_[SG]ETOWN_EX
This makes asm-generic/fcntl.h d use strict posix types again.
Signed-off-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The version of cmpxchg defined in asm-generic/system.h does not handle
correctly non-long arguments. Use the version defined in cmpxchg.h
instead.
Signed-off-by: Mathieu Lacage <mathieu.lacage@inria.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
atomic_add_unless is a macro so, bad things happen if the caller defines
a local variable named c, just like like the local variable c defined by
the macro. Thus, convert atomic_add_unless to a function. (bug triggered
by net/ipv4/netfilter/ipt_CLUSTERIP.c: clusterip_config_find_get calls
atomic_inc_not_zero)
Signed-off-by: Mathieu Lacage <mathieu.lacage@inria.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch adds a new MTIOCTOP operation MTWEOFI that writes filemarks with
immediate bit set. This means that the drive does not flush its buffer and the
next file can be started immediately. This speeds up writing in applications
that have to write multiple small files.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The current code works like this:
int garbage, status;
socklen_t len = sizeof(status);
/* enable pipe */
setsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &garbage, sizeof(garbage));
/* disable pipe */
setsockopt(fd, SOL_PNPIPE, PNPIPE_DISABLE, &garbage, sizeof(garbage));
/* get status */
getsockopt(fd, SOL_PNPIPE, PNPIPE_INQ, &status, &len);
...which does not follow the usual socket option pattern. This patch
merges all three "options" into a single gettable&settable option,
before Linux 2.6.37 gets out:
int status;
socklen_t len = sizeof(status);
/* enable pipe */
status = 1;
setsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, sizeof(status));
/* disable pipe */
status = 0;
setsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, sizeof(status));
/* get status */
getsockopt(fd, SOL_PNPIPE, PNPIPE_ENABLE, &status, &len);
This also fixes the error code from EFAULT to ENOTCONN.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Cc: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sd will get hung up issuing commands to flush write cache if a SAS
device behind the expander is unplugged without warning. Change libsas
to reject commands to domain devices that have already gone away.
[maciej.trela@intel.com: removed setting ->gone in sas_deform_port() to
permit sync cache commands at module removal]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Tested-by: Haipao Fan <haipao.fan@intel.com>
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We need to use some of these values in eDP configurations, so be sure to
fetch them and store them in the i915 private structure.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This make four macros for the PrimeCell ID register available to
drivers that use them witout using the PrimeCell/AMBA bus
abstraction and struct amba_device. It also moves the magic
PrimeCell CID "B105F00D" to the bus.h header file.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This fixes a problem introduced with the hugetlb hwpoison handling
The user space SIGBUS signalling wants to know the size of the hugepage
that caused a HWPOISON fault.
Unfortunately the architecture page fault handlers do not have easy
access to the struct page.
Pass the information out in the fault error code instead.
I added a separate VM_FAULT_HWPOISON_LARGE bit for this case and encode
the hpage index in some free upper bits of the fault code. The small
page hwpoison keeps stays with the VM_FAULT_HWPOISON name to minimize
changes.
Also add code to hugetlb.h to convert that index into a page shift.
Will be used in a further patch.
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: fengguang.wu@intel.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
migrate_huge_page_move_mapping() is declared as "extern int ..."
in include/linux/migrate.h for !CONFIG_MIGRATION,
which causes the build error like below:
mm/mprotect.o: In function `migrate_huge_page_move_mapping':
mprotect.c:(.text+0x0): multiple definition of `migrate_huge_page_move_mapping'
mm/shmem.o:shmem.c:(.text+0x0): first defined here
mm/rmap.o: In function `migrate_huge_page_move_mapping':
rmap.c:(.text+0x0): multiple definition of `migrate_huge_page_move_mapping'
mm/shmem.o:shmem.c:(.text+0x0): first defined here
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This check is necessary to avoid race between dequeue and allocation,
which can cause a free hugepage to be dequeued twice and get kernel unstable.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This patch extends page migration code to support hugepage migration.
One of the potential users of this feature is soft offlining which
is triggered by memory corrected errors (added by the next patch.)
Todo:
- there are other users of page migration such as memory policy,
memory hotplug and memocy compaction.
They are not ready for hugepage support for now.
ChangeLog since v4:
- define migrate_huge_pages()
- remove changes on isolation/putback_lru_page()
ChangeLog since v2:
- refactor isolate/putback_lru_page() to handle hugepage
- add comment about race on unmap_and_move_huge_page()
ChangeLog since v1:
- divide migration code path for hugepage
- define routine checking migration swap entry for hugetlb
- replace "goto" with "if/else" in remove_migration_pte()
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This patch modifies hugepage copy functions to have only destination
and source hugepages as arguments for later use.
The old ones are renamed from copy_{gigantic,huge}_page() to
copy_user_{gigantic,huge}_page().
This naming convention is consistent with that between copy_highpage()
and copy_user_highpage().
ChangeLog since v4:
- add blank line between local declaration and code
- remove unnecessary might_sleep()
ChangeLog since v2:
- change copy_huge_page() from macro to inline dummy function
to avoid compile warning when !CONFIG_HUGETLB_PAGE.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
We can't use existing hugepage allocation functions to allocate hugepage
for page migration, because page migration can happen asynchronously with
the running processes and page migration users should call the allocation
function with physical addresses (not virtual addresses) as arguments.
ChangeLog since v3:
- unify alloc_buddy_huge_page() and alloc_buddy_huge_page_node()
ChangeLog since v2:
- remove unnecessary get/put_mems_allowed() (thanks to David Rientjes)
ChangeLog since v1:
- add comment on top of alloc_huge_page_no_vma()
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Similar change as to signal delivery: copy out the si_addr_lsb field
to user space in signalfd
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This patch creates a new idmapper system that uses the request-key function to
place a call into userspace to map user and group ids to names. The old
idmapper was single threaded, which prevented more than one request from running
at a single time. This means that a user would have to wait for an upcall to
finish before accessing a cached result.
The upcall result is stored on a keyring of type id_resolver. See the file
Documentation/filesystems/nfs/idmapper.txt for instructions.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[Trond: fix up the return value of nfs_idmap_lookup_name and clean up code]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This adds a fc host dev loss sysfs file. Instead of
calling into the driver using the get_host_def_dev_loss_tmo
callback, we allow drivers to init the dev loss like is done
for other fc host params, and then the fc class will handle
updating the value if the user writes to the new sysfs file.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: don't drop handle reference on unload
drm/ttm: Fix two race conditions + fix busy codepaths
Rather than block the workqueue by sleeping to do the debounce use delayed
work to implement the debounce time. This should also means that we extend
the debounce time on each new bounce, potentially allowing shorter debounce
times for clean insertions.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Jarkko Nikula <jhnikula@gmail.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
There's no need for the WDS peer address
to not be const, so make it const.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix the IRQ flag handling naming. In linux/irqflags.h under one configuration,
it maps:
local_irq_enable() -> raw_local_irq_enable()
local_irq_disable() -> raw_local_irq_disable()
local_irq_save() -> raw_local_irq_save()
...
and under the other configuration, it maps:
raw_local_irq_enable() -> local_irq_enable()
raw_local_irq_disable() -> local_irq_disable()
raw_local_irq_save() -> local_irq_save()
...
This is quite confusing. There should be one set of names expected of the
arch, and this should be wrapped to give another set of names that are expected
by users of this facility.
Change this to have the arch provide:
flags = arch_local_save_flags()
flags = arch_local_irq_save()
arch_local_irq_restore(flags)
arch_local_irq_disable()
arch_local_irq_enable()
arch_irqs_disabled_flags(flags)
arch_irqs_disabled()
arch_safe_halt()
Then linux/irqflags.h wraps these to provide:
raw_local_save_flags(flags)
raw_local_irq_save(flags)
raw_local_irq_restore(flags)
raw_local_irq_disable()
raw_local_irq_enable()
raw_irqs_disabled_flags(flags)
raw_irqs_disabled()
raw_safe_halt()
with type checking on the flags 'arguments', and then wraps those to provide:
local_save_flags(flags)
local_irq_save(flags)
local_irq_restore(flags)
local_irq_disable()
local_irq_enable()
irqs_disabled_flags(flags)
irqs_disabled()
safe_halt()
with tracing included if enabled.
The arch functions can now all be inline functions rather than some of them
having to be macros.
Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
Cc: starvik@axis.com [CRIS]
Cc: jesper.nilsson@axis.com [CRIS]
Cc: linux-cris-kernel@axis.com
Drop inclusions of asm/system.h from linux/hardirq.h and linux/list.h as
they're no longer required and prevent the M68K arch's IRQ flag handling macros
from being made into inlined functions due to circular dependencies.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
2.6.36 introduces an API for drivers to switch the IO scheduler
instead of manually calling the elevator exit and init functions.
This API was added since q->elevator must be cleared in between
those two calls. And since we already have this functionality
directly from use by the sysfs interface to switch schedulers
online, it was prudent to reuse it internally too.
But this API needs the queue to be in a fully initialized state
before it is called, or it will attempt to unregister elevator
kobjects before they have been added. This results in an oops
like this:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000051
IP: [<ffffffff8116f15e>] sysfs_create_dir+0x2e/0xc0
PGD 47ddfc067 PUD 47c6a1067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:02.0/0000:04:00.1/irq
CPU 2
Modules linked in: t(+) loop hid_apple usbhid ahci ehci_hcd uhci_hcd libahci usbcore nls_base igb
Pid: 7319, comm: modprobe Not tainted 2.6.36-rc6+ #132 QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffff8116f15e>] [<ffffffff8116f15e>] sysfs_create_dir+0x2e/0xc0
RSP: 0018:ffff88027da25d08 EFLAGS: 00010246
RAX: ffff88047c68c528 RBX: 00000000fffffffe RCX: 0000000000000000
RDX: 000000000000002f RSI: 000000000000002f RDI: ffff88047e196c88
RBP: ffff88027da25d38 R08: 0000000000000000 R09: d84156c5635688c0
R10: d84156c5635688c0 R11: 0000000000000000 R12: ffff88047e196c88
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88047c68c528
FS: 00007fcb0b26f6e0(0000) GS:ffff880287400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000051 CR3: 000000047e76e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 7319, threadinfo ffff88027da24000, task ffff88027d377090)
Stack:
ffff88027da25d58 ffff88047c68c528 00000000fffffffe ffff88047e196c88
<0> ffff88047c68c528 ffff88047e05bd90 ffff88027da25d78 ffffffff8123fb77
<0> ffff88047e05bd90 0000000000000000 ffff88047e196c88 ffff88047c68c528
Call Trace:
[<ffffffff8123fb77>] kobject_add_internal+0xe7/0x1f0
[<ffffffff8123fd98>] kobject_add_varg+0x38/0x60
[<ffffffff8123feb9>] kobject_add+0x69/0x90
[<ffffffff8116efe0>] ? sysfs_remove_dir+0x20/0xa0
[<ffffffff8103d48d>] ? sub_preempt_count+0x9d/0xe0
[<ffffffff8143de20>] ? _raw_spin_unlock+0x30/0x50
[<ffffffff8116efe0>] ? sysfs_remove_dir+0x20/0xa0
[<ffffffff8116eff4>] ? sysfs_remove_dir+0x34/0xa0
[<ffffffff81224204>] elv_register_queue+0x34/0xa0
[<ffffffff81224aad>] elevator_change+0xfd/0x250
[<ffffffffa007e000>] ? t_init+0x0/0x361 [t]
[<ffffffffa007e000>] ? t_init+0x0/0x361 [t]
[<ffffffffa007e0a8>] t_init+0xa8/0x361 [t]
[<ffffffff810001de>] do_one_initcall+0x3e/0x170
[<ffffffff8108c3fd>] sys_init_module+0xbd/0x220
[<ffffffff81002f2b>] system_call_fastpath+0x16/0x1b
Code: e5 41 56 41 55 41 54 49 89 fc 53 48 83 ec 10 48 85 ff 74 52 48 8b 47 18 49 c7 c5 00 46 61 81 48 85 c0 74 04 4c 8b 68 30 45 31 f6 <41> 80 7d 51 00 74 0e 49 8b 44 24 28 4c 89 e7 ff 50 20 49 89 c6
RIP [<ffffffff8116f15e>] sysfs_create_dir+0x2e/0xc0
RSP <ffff88027da25d08>
CR2: 0000000000000051
---[ end trace a6541d3bf07945df ]---
Fix this by adding a registered bit to the elevator queue, which is
set when the sysfs kobjects have been registered.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
This is the second step for neighbour RCU conversion.
(first was commit d6bf7817 : RCU conversion of neigh hash table)
neigh_lookup() becomes lockless, but still take a reference on found
neighbour. (no more read_lock()/read_unlock() on tbl->lock)
struct neighbour gets an additional rcu_head field and is freed after an
RCU grace period.
Future work would need to eventually not take a reference on neighbour
for temporary dst (DST_NOCACHE), but this would need dst->_neighbour to
use a noref bit like we did for skb->_dst.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This information is already available in mac80211, we just need to export it
via cfg80211 and nl80211.
Signed-off-by: Bruno Randolf <br1@einfach.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This adds API to allow adding per-station GTKs,
updates mac80211 to support it, and also allows
drivers to remove a key from hwaccel again when
this may be necessary due to multiple GTKs.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently disabling CONFIG_SLUB_DEBUG also disabled SYSFS support meaning
that the slabs cannot be tuned without DEBUG.
Make SYSFS support independent of CONFIG_SLUB_DEBUG
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
USB only gives the maximum current allowed to draw.
Signed-off-by: Heikki Krogerus <ext-heikki.krogerus@nokia.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
This adds power supply types for USB chargers defined in
Battery Charging Specification 1.1.
Signed-off-by: Heikki Krogerus <ext-heikki.krogerus@nokia.com>
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
Allow sysadmins to configure the number of multicast
membership report sent on a link failure event.
Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* drm-kdb-next:
drm/nouveau/kms: Avoid a hang entering KDB with VT accel on.
radeon, kdb, kms: Save and restore the LUT on atomic KMS enter/exit
drm, kdb, kms: Add an enter argument to mode_set_base_atomic() API
drm/nouveau/kms: Implement KDB debug hooks for nouveau KMS.
drm/radeon/kms: Implement KDB debug hooks for radeon KMS.
[airlied - add fix for vmwgfx build]
* 'nouveau/for-airlied' of ../drm-nouveau-next: (93 commits)
drm/ttm: restructure to allow driver to plug in alternate memory manager
drm/ttm: introduce utility function to free an allocated memory node
drm/nouveau: fix thinkos in mem timing table recordlen check
drm/nouveau: parse voltage from perf 0x40 entires
drm/nouveau: don't use the default pll limits in table v2.1 on nv50+ cards
drm/nv50: Fix large 3D performance regression caused by the interchannel sync patches.
drm/nouveau: Synchronize buffer object moves in hardware.
drm/nouveau: Use semaphores to handle inter-channel sync in hardware.
drm/nouveau: Provide a means to have arbitrary work run on fence completion.
drm/nouveau: Minor refactoring/cleanup of the fence code.
drm/nouveau: Add a module option to force card POST.
drm/nv50: prevent (IB_PUT == IB_GET) for occurring unless idle
drm/nv0x-nv4x: Leave the 0x40 bit untouched when changing CRE_LCD.
drm/nv30-nv40: Fix postdivider mask when writing engine/memory PLLs.
drm/nouveau: Fix perf table parsing on BMP v5.25.
drm/nouveau: fix required mode bandwidth calculation for DP
drm/nouveau: fix typo in c2aa91afea5f7e7ae4530fabd37414a79c03328c
drm/nva3: split pm backend out from nv50
drm/nouveau: run perflvl and M table scripts on mem clock change
drm/nouveau: pass perflvl struct to clock_pre()
...
Some devices such as the radeon chips receive information from user
space which needs to be saved when executing an atomic mode set
operation, else the user space would have to be queried again for the
information.
This patch extends the mode_set_base_atomic() call to pass an argument
to indicate if this is an entry or an exit from an atomic kernel mode
set change. Individual drm drivers can properly save and restore
state accordingly.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: David Airlie <airlied@linux.ie>
CC: dri-devel@lists.freedesktop.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
This can be used by the X server to restrict mode resolutions and size of
root pixmap.
Bump minor to announce this availability.
Bump driver date.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is needed for the callback to identify the caller and take
appropriate locks if needed.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'intel/drm-intel-next' of ../drm-next: (266 commits)
drm/i915: Avoid circular locking from intel_fbdev_fini()
drm/i915: mark display port DPMS state as 'ON' when enabling output
drm/i915: Skip pread/pwrite if size to copy is 0.
drm/i915: avoid struct mutex output_poll mutex lock loop on unload
drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow
drm/i915: Sanity check pread/pwrite
drm/i915: Use pipe state to tell when pipe is off
drm/i915: vblank status not valid while training display port
drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code
drm/i915: Don't mask the return code whilst relocating.
drm/i915: If the GPU hangs twice within 5 seconds, declare it wedged.
drm/i915: Only print 'generating error event' if we actually are
drm/i915: Try to reset gen2 devices.
drm/i915: Clear fence registers on GPU reset
drm/i915: Force the domain to CPU on unbinding whilst wedged.
drm: Move the GTT accounting to i915
drm/i915: Fix refleak during eviction.
i915: Added function to initialize VBT settings
drm/i915: Remove redundant deletion of obj->gpu_write_list
drm/i915: Make get/put pages static
...
This fixes a race pointed out by Dave Airlie where we don't take a buffer
object about to be destroyed off the LRU lists properly. It also fixes a rare
case where a buffer object could be destroyed in the middle of an
accelerated eviction.
The patch also adds a utility function that can be used to prematurely
release GPU memory space usage of an object waiting to be destroyed.
For example during eviction or swapout.
The above mentioned commit didn't queue the buffer on the delayed destroy
list under some rare circumstances. It also didn't completely honor the
remove_all parameter.
Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=615505http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591061
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
David
This is the first step for RCU conversion of neigh code.
Next patches will convert hash_buckets[] and "struct neighbour" to RCU
protected objects.
Thanks
[PATCH net-next] net neigh: RCU conversion of neigh hash table
Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
neigh_table", a new structure is defined :
struct neigh_hash_table {
struct neighbour **hash_buckets;
unsigned int hash_mask;
__u32 hash_rnd;
struct rcu_head rcu;
};
And "struct neigh_table" has an RCU protected pointer to such a
neigh_hash_table.
This means the signature of (*hash)() function changed: We need to add a
third parameter with the actual hash_rnd value, since this is not
anymore a neigh_table field.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In various situations, a device provides a packet to our stack and we
drop it before it enters protocol stack :
- softnet backlog full (accounted in /proc/net/softnet_stat)
- bad vlan tag (not accounted)
- unknown/unregistered protocol (not accounted)
We can handle a per-device counter of such dropped frames at core level,
and automatically adds it to the device provided stats (rx_dropped), so
that standard tools can be used (ifconfig, ip link, cat /proc/net/dev)
This is a generalization of commit 8990f468a (net: rx_dropped
accounting), thus reverting it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As suggested by Linus, push the irqs_disabled() down to the
rcu_read_lock_bh_held() level so that all callers get the benefit
of the correct check.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The "flags" member of "struct wait_queue_t" is used in several places in
the kernel code without beeing initialized by init_wait(). "flags" is
used in bitwise operations.
If "flags" not initialized then unexpected behaviour may take place.
Incorrect flags might used later in code.
Added initialization of "wait_queue_t.flags" with zero value into
"init_wait".
Signed-off-by: Evgeny Kuznetsov <EXT-Eugeny.Kuznetsov@nokia.com>
[ The bit we care about does end up being initialized by both
prepare_to_wait() and add_to_wait_queue(), so this doesn't seem to
cause actual bugs, but is definitely the right thing to do -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.
However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling. That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.
Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.
So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.
Future fixups:
- move the module list handling code into kernel/module.c where it
belongs.
- get rid of 'module_bug_list' and just use the regular list of modules
(called 'modules' - imagine that) that we already create and maintain
for other reasons.
Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The documentation for NL80211_CMD_REMAIN_ON_CHANNEL
isn't accurate, an interface index is required by
the command. Update it accordingly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Each family may have some amount of boilerplate
locking code that applies to most, or even all,
commands.
This allows a family to handle such things in
a more generic way, by allowing it to
a) include private flags in each operation
b) specify a pre_doit hook that is called,
before an operation's doit() callback and
may return an error directly,
c) specify a post_doit hook that can undo
locking or similar things done by pre_doit,
and finally
d) include two private pointers in each info
struct passed between all these operations
including doit(). (It's two because I'll
need two in nl80211 -- can be extended.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some drivers cannot handle multiple retry rates specified by the rc
algorithm but instead use their own retry table (for example rt2800).
However, if such a device registers itself with a max_rates value of 1
the rc algorithm cannot make use of the extended information the device
can provide about retried rates. On the other hand, if a device
registers itself with a max_rates value > 1 the rc algorithm assumes
that the device can handle multi rate retries.
Fix this issue by introducing another hw parameter max_report_rates that
can be set to a different value then max_rates to indicate if a device
is capable of reporting more rates then specified in max_rates.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Added a nl interface to set the peer bssid of a WDS interface.
Signed-off-by: Bill Jordan <bjordan@rajant.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The net/cfg80211.h header file isn't exported to
userspace, so there's no need for any kind of
__KERNEL__ protection in it. If it was exported,
everything else in it would need protection as
well, not just the logging stuff ...
Cc:Joe Perches <joe@perches.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some user space applications only want to display survey data for
the operating channel, however there is no API to get that yet.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This splits up the sh intc core in to something more vaguely resembling
a subsystem. Most of the functionality was alread fairly well
compartmentalized, and there were only a handful of interdependencies
that needed to be resolved in the process.
This also serves as future-proofing for the genirq and sparseirq rework,
which will make some of the split out functionality wholly generic,
allowing things to be killed off in place with minimal migration pain.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
If lookups happen while the radix node still points to a subgroup
mapping, an IRQ hasn't yet been made available for the specified id, so
error out accordingly. Once the slot is replaced with an IRQ mapping and
the tag is discarded, lookup can commence as normal.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This prepares the removal of the big kernel lock from the
file locking code. We still use the BKL as long as fs/lockd
uses it and ceph might sleep, but we can flip the definition
to a private spinlock as soon as that's done.
All users outside of fs/lockd get converted to use
lock_flocks() instead of lock_kernel() where appropriate.
Based on an earlier patch to use a spinlock from Matthew
Wilcox, who has attempted this a few times before, the
earliest patch from over 10 years ago turned it into
a semaphore, which ended up being slower than the BKL
and was subsequently reverted.
Someone should do some serious performance testing when
this becomes a spinlock, since this has caused problems
before. Using a spinlock should be at least as good
as the BKL in theory, but who knows...
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@suse.cz>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Sage Weil <sage@newdream.net>
Cc: linux-kernel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
These two tracepoints allow tracking when and how a work is queued and
activated. This patch is based on Frederic's patch to add queue_work
trace point.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Define workqueue_work event class and use it for workqueue_execute_end
trace point. Also, move trace/events/workqueue.h include downwards
such that all struct definitions are visible to it. This is to
prepare for more tracepoints and doesn't cause any functional change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Another exported symbol only used in one file
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fib_rules_cleanup_ups is only defined and used in one place.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtnl_dereference() is used in contexts where RTNL is held, to fetch an
RCU protected pointer.
Updates to this pointer are prevented by RTNL, so we dont need
smp_read_barrier_depends() and the ACCESS_ONCE() provided in
rcu_dereference_check().
rtnl_dereference() is mainly a macro to document the locking invariant.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ingress being not used very much, and net_device->ingress_queue being
quite a big object (128 or 256 bytes), use a dynamic allocation if
needed (tc qdisc add dev eth0 ingress ...)
dev_ingress_queue(dev) helper should be used only with RTNL taken.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nouveau will need this on GeForce 8 and up to account for the GPU
reordering physical VRAM for some memory types.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Acked-by: Thomas Hellström <thellstrom@vmware.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Existing core code/drivers call drm_mm_put_block on ttm_mem_reg.mm_node
directly. Future patches will modify TTM behaviour in such a way that
ttm_mem_reg.mm_node doesn't necessarily belong to drm_mm.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Acked-by: Thomas Hellström <thellstrom@vmware.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Forgot to add xt_log.h in commit a8defca0 (netfilter: ipt_LOG:
add bufferisation to call printk() once)
Signed-off-by: Patrick McHardy <kaber@trash.net>
Many interrupts that share a single mask source but are on different
hardware vectors will have an associated register tied to an INTEVT that
denotes the precise cause for the interrupt exception being triggered.
This introduces the concept of IRQ subgroups in the intc core, where
a virtual IRQ map is constructed for each of the pre-defined cause bits,
and a higher level chained handler takes control of the parent INTEVT.
This enables CPUs with heavily muxed IRQ vectors (especially across
disjoint blocks) to break things out in to a series of managed chained
handlers while being able to dynamically lookup and adopt the IRQs
created for them.
This is largely an opt-in interface, requiring CPUs to manually submit
IRQs for subgroup splitting, in addition to providing identifiers in
their enum maps that can be used for lazy lookup via the radix tree.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Dozen of changes in ncpfs to provide some locking other than BKL.
In readdir cache unlock and mark complete first page as last operation,
so it can be used for synchronization, as code intended.
When updating dentry name on case insensitive filesystems do at least
some basic locking...
Hold i_mutex when updating inode fields.
Push some ncp_conn_is_valid down to ncp_request. Connection can become
invalid at any moment, and fewer error code paths to test the better.
Use i_size_{read,write} to modify file size.
Set inode's backing_dev_info as ncpfs has its own special bdi.
In ioctl unbreak ioctls invoked on filesystem mounted 'ro' - tests are
for inode writeable or owner match, but were turned to filesystem
writeable and inode writeable or owner match. Also collect all permission
checks in single place.
Add some locking, and remove comments saying that it would be cool to
add some locks to the code.
Constify some pointers.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The functions nf_nat_proto_find_get and nf_nat_proto_put are
only used internally in nf_nat_core. This might break some out
of tree NAT module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This implements a scheme roughly analogous to the PowerPC virtual to
hardware IRQ mapping, which we use for IRQ to per-controller ID mapping.
This makes it possible for drivers to use the IDs directly for lookup
instead of hardcoding the vector.
The main motivation for this work is as a building block for dynamically
allocating virtual IRQs for demuxing INTC events sharing a single INTEVT
in addition to a common masking source.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Allow the persistence engine of a virtual service to be set, edited
and unset.
This feature only works with the netlink user-space interface.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
This shouldn't break compatibility with userspace as the new data
is at the end of the line.
I have confirmed that this doesn't break ipvsadm, the main (only?)
user-space user of this data.
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: Julian Anastasov <ja@ssi.bg>
This option covers now the old chip functions and the irq_desc data
fields which are moving to struct irq_data. More stuff will follow.
Pretty handy for testing a conversion, whether something broke or not.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The low level irq chip functions want access to irq_desc->irq_data.
Provide new functions which hand down irq_data instead of the irq
number so these functions avoid to call irq_to_desc() which is a radix
tree lookup in case of sparse irq.
This provides all the old functions except one: end(). end() is a
relict of __do_IRQ() and will just go away with the __do_IRQ() code.
The replacement for set_affinity() has an extra argument "bool
force". The reason for this is to notify the low level code, that the
move has to be done right away and cannot be delayed until the next
interrupt happens. That's necessary to handle the irq fixup on cpu
unplug in the generic code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121841.742126604@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Convert all references in the core code to orq, chip, handler_data,
chip_data, msi_desc, affinity to irq_data.*
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Low level chip functions need access to irq_desc->handler_data,
irq_desc->chip_data and irq_desc->msi_desc. We hand down the irq
number to the low level functions, so they need to lookup irq_desc.
With sparse irq this means a radix tree lookup.
We could hand down irq_desc itself, but low level chip functions have
no need to fiddle with it directly and we want to restrict access to
irq_desc further.
Preparatory patch for new chip functions.
Note, that the ugly anon union/struct is there to avoid a full tree
wide clean up for now. This is not going to last 3 years like __do_IRQ()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121841.645542300@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
While doing stress tests with IP route cache disabled, and multi queue
devices, I noticed a very high contention on one rwlock used in
neighbour code.
When many cpus are trying to send frames (possibly using a high
performance multiqueue device) to the same neighbour, they fight for the
neigh->lock rwlock in order to call neigh_hh_init(), and fight on
hh->hh_refcnt (a pair of atomic_inc/atomic_dec_and_test())
But we dont need to call neigh_hh_init() for dst that are used only
once. It costs four atomic operations at least, on two contended cache
lines, plus the high contention on neigh->lock rwlock.
Introduce a new dst flag, DST_NOCACHE, that is set when dst was not
inserted in route cache.
With the stress test bench, sending 160000000 frames on one neighbour,
results are :
Before patch:
real 2m28.406s
user 0m11.781s
sys 36m17.964s
After patch:
real 1m26.532s
user 0m12.185s
sys 20m3.903s
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use RCU & RTNL protection for mfc_cache_array[]
ipmr_cache_find() is called under rcu_read_lock();
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Presently the pinmux code is a one-way thing, but there's nothing
preventing an unregistration if no one has grabbed any of the pins.
This will permit us to save a bit of memory on systems that require pin
demux for certain peripherals in the case where registration of those
peripherals fails, or they are otherwise not attached to the system.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The WM8962 features five GPIOs, add support for controlling their output
state via gpiolib.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Some controllers will need to be initialized lazily due to pinmux
constraints, while others may simply have no need to be brought online if
there are no backing devices for them attached. In this case it's still
necessary to be able to reserve their hardware vector map before dynamic
IRQs get a hold of them.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Reduce the #ifdefs and simplify bootstrap by making SMP and NUMA as much alike
as possible. This means that there will be an additional indirection to get to
the kmem_cache_node field under SMP.
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
On UP, percpu allocations were redirected to kmalloc. This has the
following problems.
* For certain amount of allocations (determined by
PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu
allocator can be used before the usual kernel memory allocator is
brought online. On SMP, this is used to initialize the kernel
memory allocator.
* percpu allocator honors alignment upto PAGE_SIZE but kmalloc()
doesn't. For example, workqueue makes use of larger alignments for
cpu_workqueues.
Currently, users of percpu allocators need to handle UP differently,
which is somewhat fragile and ugly. Other than small amount of
memory, there isn't much to lose by enabling percpu allocator on UP.
It can simply use kernel memory based chunk allocation which was added
for SMP archs w/o MMUs.
This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
makes UP build use percpu-km. As percpu addresses and kernel
addresses are always identity mapped and static percpu variables don't
need any special treatment, nothing is arch dependent and mm/percpu.c
implements generic setup_per_cpu_areas() for UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
In preparation of enabling percpu allocator for UP, reduce
PCPU_MIN_UNIT_SIZE to 32k. On UP, the first chunk doesn't have to
include static percpu variables and chunk size can be smaller which is
important as UP percpu allocator will use contiguous kernel memory to
populate chunks.
PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation
size but 32k should still be enough.
Signed-off-by: Tejun Heo <tj@kernel.org>
These functions are used only by percpu memory allocator on SMP.
Don't build them on UP.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Nick Piggin <npiggin@kernel.dk>
kmalloc caches are statically defined and may take up a lot of space just
because the sizes of the node array has to be dimensioned for the largest
node count supported.
This patch makes the size of the kmem_cache structure dynamic throughout by
creating a kmem_cache slab cache for the kmem_cache objects. The bootstrap
occurs by allocating the initial one or two kmem_cache objects from the
page allocator.
C2->C3
- Fix various issues indicated by David
- Make create kmalloc_cache return a kmem_cache * pointer.
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
NFSv4.1 needs warning when a client tcp connection goes down, if that
connection is being used as a backchannel, so that it can warn the
client that it has lost the backchannel connection.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The spec requires us in various places to keep track of the connections
associated with each session.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
AMD CPU family 0x15 still supports GART for compatibility reasons.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100930124316.GG20545@loge.amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Unfortunately, spkm3 never got very far; while interoperability with one
other implementation was demonstrated at some point, problems were found
with the spec that were deemed not worth fixing.
The kernel code is useless on its own without nfs-utils patches which
were never merged into nfs-utils, and were only ever available from
citi.umich.edu. They appear not to have been updated since 2005.
Therefore it seems safe to assume that this code has no users, and never
will.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The net is known from the xprt_create and this tagging will also
give un the context in the conntection workers where real sockets
are created.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
After this the socket creation in it knows the context.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
vmwgfx: Fix fb VRAM pinning failure due to fragmentation
vmwgfx: Remove initialisation of dev::devname
vmwgfx: Enable use of the vblank system
vmwgfx: vt-switch (master drop) fixes
drm/vmwgfx: Fix breakage introduced by commit "drm: block userspace under allocating buffer and having drivers overwrite it (v2)"
drm: Hold the mutex when dropping the last GEM reference (v2)
drm/gem: handlecount isn't really a kref so don't make it one.
drm: i810/i830: fix locked ioctl variant
drm/radeon/kms: add quirk for MSI K9A2GM motherboard
drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle
drm: Prune GEM vma entries
drm/radeon/kms: fix up encoder info messages for DFP6
drm/radeon: fix PCI ID 5657 to be an RV410
Only drm/i915 does the bookkeeping that makes the information useful,
and the information maintained is driver specific, so move it out of the
core and into its single user.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
In order to be fully threadsafe we need to check that the drm_gem_object
refcount is still 0 after acquiring the mutex in order to call the free
function. Otherwise, we may encounter scenarios like:
Thread A: Thread B:
drm_gem_close
unreference_unlocked
kref_put mutex_lock
... i915_gem_evict
... kref_get -> BUG
... i915_gem_unbind
... kref_put
... i915_gem_object_free
... mutex_unlock
mutex_lock
i915_gem_object_free -> BUG
i915_gem_object_unbind
kfree
mutex_unlock
Note that no driver is currently using the free_unlocked vfunc and it is
scheduled for removal, hasten that process.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30454
Reported-and-Tested-by: Magnus Kessler <Magnus.Kessler@gmx.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
On 64bit arches, there are two 32bit holes that we can remove.
sizeof(struct neighbour) shrinks from 0xf8 to 0xf0 bytes
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Version 20100915.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Added extern for this boolean in acpixf.h. Some hosts utilize
this value during suspend/restore operations.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Move the 64-bit overlay structures to the utmath module since
they are used nowhere else. Update module comment. ACPICA BZ 829.
http://www.acpica.org/bugzilla/show_bug.cgi?id=829
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Change definition of acpi_thread_id to always be a u64. This
simplifies the code, especially any printf output. u64 is
the only common data type for all thread_id types across all
operating systems. We now force the OSL to cast the native
thread_id type to u64 before returning the value to ACPICA
(via acpi_os_get_thread_id).
Signed-off-by: Lin Ming <ming.m.lin@intel.com
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The C inline keyword is not standardized, ACPI_INLINE allows this
to be configured on a per-compiler basis.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This function is not OS-dependent and has been replaced by
acpi_hw_derive_pci_id, which is now in the ACPICA core code. Local
implementations of acpi_os_derive_pci_id are no longer necessary and
are removed. ACPICA BZ 857.
http://www.acpica.org/bugzilla/show_bug.cgi?id=857
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Version 20100806.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Adds install/remove interfaces so that the host can dynamically
alter the global _OSI table. Also adds support for _OSI handlers.
Additional support: new debugger command (osi), and test support in
the acpiexec utility. Adds new file, utilities/utosi.c.
ACPICA bugzilla 836.
The Linux OSL _OSI code is also changed.
acpi_osi_setup can't call acpi_install/remove_interface because ACPICA
is not initialized yet at this early time.
So we just save the osi string in acpi_osi_setup and will handle it
later in a new function acpi_osi_setup_late.
http://www.acpica.org/bugzilla/show_bug.cgi?id=836
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com
Signed-off-by: Len Brown <len.brown@intel.com>
Prototype in acpiosxf.h had the output value pointer as a (u32 *).
Should be a (u64 *).
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ip_dev_find(net, addr) finds a device given an IPv4 source address and
takes a reference on it.
Introduce __ip_dev_find(), taking a third argument, to optionally take
the device reference. Callers not asking the reference to be taken
should be in an rcu_read_lock() protected section.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid TLB flush IPIs for the cores in deeper c-states by voluntary leave_mm()
before entering into that state. CPUs tend to flush TLB in those c-states
anyways.
acpi_idle does this with C3-type states, but it was not caried over
when intel_idle was introduced. intel_idle can apply it
to C-states in addition to those that ACPI might export as C3...
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
There were lots of places being inconsistent since handle count
looked like a kref but it really wasn't.
Fix this my just making handle count an atomic on the object,
and have it increase the normal object kref.
Now i915/radeon/nouveau drivers can drop the normal reference on
userspace object creation, and have the handle hold it.
This patch fixes a memory leak or corruption on unload, because
the driver had no way of knowing if a handle had been actually
added for this object, and the fbcon object needed to know this
to clean itself up properly.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add the widget for MICBIAS power control and allow configuration of the
microphone bias setup via the platform data for the WM8962. When
microphone status signals are brought out to GPIO this should be
sufficient to enable microphone detection.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
The Enhanced Retransmission Mode(ERTM) is a realiable mode of operation
of the Bluetooth L2CAP layer. Think on it like a simplified version of
TCP.
The problem we were facing here was a deadlock. ERTM uses a backlog
queue to queue incomimg packets while the user is helding the lock. At
some moment the sk_sndbuf can be exceeded and we can't alloc new skbs
then the code sleep with the lock to wait for memory, that stalls the
ERTM connection once we can't read the acknowledgements packets in the
backlog queue to free memory and make the allocation of outcoming skb
successful.
This patch actually affect all users of bt_skb_send_alloc(), i.e., all
L2CAP modes and SCO.
We are safe against socket states changes or channels deletion while the
we are sleeping wait memory. Checking for the sk->sk_err and
sk->sk_shutdown make the code safe, since any action that can leave the
socket or the channel in a not usable state set one of the struct
members at least. Then we can check both of them when getting the lock
again and return with the proper error if something unexpected happens.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Ulisses Furquim <ulisses@profusion.mobi>
Support building wl1271-equipped boards without building the
wl1271 driver itself, e.g.:
CONFIG_MACH_OMAP_ZOOM3=y
CONFIG_WL12XX is not set
Reported-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Luciano Coelho <luciano.coelho@nokia.com>
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dmaengine: fix interrupt clearing for mv_xor
missing inline keyword for static function in linux/dmaengine.h
dma/shdma: move dereference below the NULL check
There is some confusion with rx_queue name after RPS, and net drivers
private rx_queue fields.
I suggest to rename "struct net_device"->rx_queue to ingress_queue.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds driver support for OMAP2/3/4 high speed UART.
The driver is made separate from 8250 driver as we cannot
over load 8250 driver with omap platform specific configuration for
features like DMA, it makes easier to implement features like DMA and
hardware flow control and software flow control configuration with
this driver as required for the omap-platform.
This patch involves only the core driver and its dependent.
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Govindraj.R <govindraj.raja@ti.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
On Wed, 29 Sep 2010 14:02:38 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> After merging the final tree, today's linux-next build (powerpc
> ppc44x_defconfig) produced tis warning:
>
> WARNING: net/sunrpc/sunrpc.o(.init.text+0x110): Section mismatch in reference from the function init_sunrpc() to the function .exit.text:rpcauth_remove_module()
> The function __init init_sunrpc() references
> a function __exit rpcauth_remove_module().
> This is often seen when error handling in the init function
> uses functionality in the exit path.
> The fix is often to remove the __exit annotation of
> rpcauth_remove_module() so it may be used outside an exit section.
>
> Probably caused by commit 2f72c9b737
> ("sunrpc: The per-net skeleton").
This actually causes a build failure on a sparc32 defconfig build:
`rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
I applied the following patch for today:
Fixes:
`rpcauth_remove_module' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The Status (CISREG_CCSR) and ExtStatus (CISREG_ESR) registers were
only accessed to enable audio output for some drivers and IRQ for
serial_cs.c. The former also required setting config_req_t.Attributes
to CONF_ENABLE_SPKR; the latter can be simplified to setting this
field to CONF_ENABLE_ESR.
CC: netdev@vger.kernel.org
CC: linux-wireless@vger.kernel.org
CC: linux-serial@vger.kernel.org
CC: linux-scsi@vger.kernel.org
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
The "Pin" and "Copy" configuration registers (CISREG_SCR, CISREG_PPR)
do not seem to be utilized anywhere. If a device would request a
write to these registers, "0" would be written. Continue to do so, but
warn of unexpected behavior -- and remove the "Pin" and "Copy" entries
from config_req_t.
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
pcmcia_modify_configuration() was only used by two drivers to fix up
one issue each: setting the Vpp to a different value, and reducing
the IO width to 8 bit. Introduce two explicitly named functions
handling these things, and remove one further typedef.
CC: netdev@vger.kernel.org
CC: linux-mtd@lists.infradead.org
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Instead of win_req_t, drivers are now requested to fill out
struct pcmcia_device *p_dev->resource[2,3,4,5] for up to four iomem
ranges. After a call to pcmcia_request_window(), the windows found there
are reserved and may be used until pcmcia_release_window() is called.
CC: netdev@vger.kernel.org
CC: linux-wireless@vger.kernel.org
CC: linux-mtd@lists.infradead.org
CC: Jiri Kosina <jkosina@suse.cz>
CC: linux-scsi@vger.kernel.org
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Provide an initial hookup for interrupts on the WM8962. Currently we simply
report error status via log messages if an IRQ is provided for the device.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
This patch allows a host to be configured to respond to any address in
a specified range as if it were local, without actually needing to
configure the address on an interface. This is done through routing
table configuration. For instance, to configure a host to respond
to any address in 10.1/16 received on eth0 as a local address we can do:
ip rule add from all iif eth0 lookup 200
ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200
This host is now reachable by any 10.1/16 address (route lookup on
input for packets received on eth0 can find the route). On output, the
rule will not be matched so that this host can still send packets to
10.1/16 (not sent on loopback). Presumably, external routing can be
configured to make sense out of this.
To make this work, we needed to modify the logic in finding the
interface which is assigned a given source address for output
(dev_ip_find). We perform a normal fib_lookup instead of just a
lookup on the local table, and in the lookup we ignore the input
interface for matching.
This patch is useful to implement IP-anycast for subnets of virtual
addresses.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the basic infrastructure to support user-space
expectation helpers via ctnetlink and the netfilter queuing
infrastructure NFQUEUE. Basically, this patch:
* adds NF_CT_EXPECT_USERSPACE flag to identify user-space
created expectations. I have also added a sanity check in
__nf_ct_expect_check() to avoid that kernel-space helpers
may create an expectation if the master conntrack has no
helper assigned.
* adds some branches to check if the master conntrack helper
exists, otherwise we skip the code that refers to kernel-space
helper such as the local expectation list and the expectation
policy.
* allows to set the timeout for user-space expectations with
no helper assigned.
* a list of expectations created from user-space that depends
on ctnetlink (if this module is removed, they are deleted).
* includes USERSPACE in the /proc output for expectations
that have been created by a user-space helper.
This patch also modifies ctnetlink to skip including the helper
name in the Netlink messages if no kernel-space helper is set
(since no user-space expectation has not kernel-space kernel
assigned).
You can access an example user-space FTP conntrack helper at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
tcp: Fix >4GB writes on 64-bit.
net/9p: Mount only matching virtio channels
de2104x: fix ethtool
tproxy: check for transparent flag in ip_route_newports
ipv6: add IPv6 to neighbour table overflow warning
tcp: fix TSO FACK loss marking in tcp_mark_head_lost
3c59x: fix regression from patch "Add ethtool WOL support"
ipv6: add a missing unregister_pernet_subsys call
s390: use free_netdev(netdev) instead of kfree()
sgiseeq: use free_netdev(netdev) instead of kfree()
rionet: use free_netdev(netdev) instead of kfree()
ibm_newemac: use free_netdev(netdev) instead of kfree()
smsc911x: Add MODULE_ALIAS()
net: reset skb queue mapping when rx'ing over tunnel
br2684: fix scheduling while atomic
de2104x: fix TP link detection
de2104x: fix power management
de2104x: disable autonegotiation on broken hardware
net: fix a lockdep splat
e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
...
This sets the active numbers of queues on a net device to match another.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For RPS, we create a kobject for each RX queue based on the number of
queues passed to alloc_netdev_mq(). However, drivers generally do not
determine the numbers of hardware queues to use until much later, so
this usually represents the maximum number the driver may use and not
the actual number in use.
For TX queues, drivers can update the actual number using
netif_set_real_num_tx_queues(). Add a corresponding function for RX
queues, netif_set_real_num_rx_queues().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tunnels are going to use percpu for their accounting.
They are going to use a new tstats field in net_device.
skb_tunnel_rx() is changed to be a wrapper around __skb_tunnel_rx()
IPTUNNEL_XMIT() is changed to be a wrapper around __IPTUNNEL_XMIT()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phonet stack assumes the presence of Pipe Controller, either in Modem or
on Application Processing Engine user-space for the Pipe data.
Nokia Slim Modems like WG2.5 used in ST-Ericsson U8500 platform do not
implement Pipe controller in them.
This patch adds Pipe Controller implemenation to Phonet stack to support
Pipe data over Phonet stack for Nokia Slim Modems.
Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes kernel bugzilla #16603
tcp_sendmsg() truncates iov_len to an 'int' which a 4GB write to write
zero bytes, for example.
There is also the problem higher up of how verify_iovec() works. It
wants to prevent the total length from looking like an error return
value.
However it does this using 'int', but syscalls return 'long' (and
thus signed 64-bit on 64-bit machines). So it could trigger
false-positives on 64-bit as written. So fix it to use 'long'.
Reported-by: Olaf Bonorden <bono@onlinehome.de>
Reported-by: Daniel Büse <dbuese@gmx.de>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a nasty memory corruption bug when using userptr I/O.
The function videobuf_pages_to_sg() sets up the scatter-gather list for the
DMA transfer to the userspace pages. The first transfer is setup correctly
(the size is set to PAGE_SIZE - offset), but all other transfers have size
PAGE_SIZE. This is wrong for the last transfer which may be less than PAGE_SIZE.
Most, if not all, drivers will program the boards DMA engine correctly, i.e.
even though the size in the last sg element is wrong, they will do their
own size calculations and make sure the right amount is DMA-ed, and so seemingly
prevent memory corruption.
However, behind the scenes the dynamic DMA mapping support (in lib/swiotlb.c)
may create bounce buffers if the memory pages are not in DMA-able memory.
This happens for example on a 64-bit linux with a board that only supports
32-bit DMA.
These bounce buffers DO use the information in the sg list to determine the
size. So while the DMA engine transfers the correct amount of data, when the
data is 'bounced' back too much is copied, causing buffer overwrites.
The fix is simple: calculate and set the correct size for the last sg list
element.
Signed-off-by: Hans Verkuil <hans.verkuil@tandberg.com>
Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This patch allows ports to have different link layers:
IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET. This is required
for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support. For
devices that do not provide an implementation for querying the link
layer property of a port, we return a default value based on the
transport: RMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND
and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Hook the GEM vm open/close ops into the generic drm vm open/close so
that the private vma entries are created and destroy appropriately.
Fixes the leak of the drm_vma_entries during the lifetime of the filp.
Reported-by: Matt Mackall <mpm@selenic.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
commit 8c0c709eea
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Wed Nov 25 17:46:15 2009 +0100
mac80211: move cmntr flag out of rx flags
moved the CMNTR flag into the skb RX flags for
some aggregation cleanups, but this was wrong
since the optimisation this flag tried to make
requires that it is kept across the processing
of multiple interfaces -- which isn't true for
flags in the skb. The patch not only broke the
optimisation, it also introduced a bug: under
some (common!) circumstances the flag will be
set on an already freed skb!
However, investigating this in more detail, I
found that most of the flags that we set should
be per packet, _except_ for this one, due to
a-MPDU processing. Additionally, the flags used
for processing (currently just this one) need
to be reset before processing a new packet.
Since we haven't actually seen bugs reported as
a result of the wrong flags handling (which is
not too surprising -- the only real bug case I
can come up with is an a-MSDU contained in an
a-MPDU), I'll make a different fix for rc.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The old ieee80211_find_sta_by_hw method didn't properly
find VIFS when there was more than one per AP. This caused
AMPDU logic in ath9k to get the wrong VIF when trying to
account for transmitted SKBs.
This patch changes ieee80211_find_sta_by_hw to take a
localaddr argument to distinguish between VIFs with the
same AP but different local addresses. The method name
is changed to ieee80211_find_sta_by_ifaddr.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Fix rounding-bug in __unmap_single
x86/amd-iommu: Work around S3 BIOS bug
x86/amd-iommu: Set iommu configuration flags in enable-loop
x86, setup: Fix earlyprintk=serial,0x3f8,115200
x86, setup: Fix earlyprintk=serial,ttyS0,115200
The transport representation should be per-net of course.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Existing calls do the same, but for the init_net.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
There are two calls that operate on ip_map_cache and are
directly called from the nfsd code. Other places will be
handled in a different way.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This is done in order to facilitate getting the ip_map_cache from
which to put the ip_map.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up a missing exit path in the ipv6 module init routines. In
addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
for the ipv6_addr_label_ops structure. But if module loading fails, or if the
ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
which leaves a now-bogus address on the pernet_list, leading to oopses in
subsequent registrations. This patch cleans up both the failed load path and
the unload path. Tested by myself with good results.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
include/net/addrconf.h | 1 +
net/ipv6/addrconf.c | 11 ++++++++---
net/ipv6/addrlabel.c | 5 +++++
3 files changed, 14 insertions(+), 3 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
loopback driver uses dev->ml_priv to store its percpu stats pointer.
It uses ugly casts "(void __percpu __force *)" to shut up sparse
complains.
Define an union to better document we use ml_priv in loopback driver and
define a lstats field with appropriate types.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SOCK_MIN_RCVBUF current value is 256 bytes
It doesnt permit to receive the smallest possible frame, considering
socket sk_rmem_alloc/sk_rcvbuf account skb truesizes. On 64bit arches,
sizeof(struct sk_buff) is 240 bytes. Add the typical 64 bytes of
headroom, and we go over the limit.
With old kernels and 32bit arches, we were under the limit, if netdriver
was doing copybreak.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>