Commit Graph

475484 Commits

Author SHA1 Message Date
Linus Torvalds 039001972a While testing some new changes for 3.18, I kept hitting a bug every so
often in the ring buffer. At first I thought it had to do with some
 of the changes I was working on, but then testing something else I
 realized that the bug was in 3.17 itself. I ran several bisects as the
 bug was not very reproducible, and finally came up with the commit
 that I could reproduce easily within a few minutes, and without the change
 I could run the tests over an hour without issue. The change fit the
 bug and I figured out a fix. That bad commit was:
 
 Commit 651e22f270 "ring-buffer: Always reset iterator to reader page"
 
 This commit fixed a bug, but in the process created another one. It used
 the wrong value as the cached value that is used to see if things changed
 while an iterator was in use. This made it look like a change always
 happened, and could cause the iterator to go into an infinite loop.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJULvrkAAoJEKQekfcNnQGuD+sH/iHPE2qb2ojCP9+hqOpszdd1
 d8rN8BNZsDlJxfWQELw2vGVXTmeW7txW5DFWQ3I8qSSjwYa6l27M4mHsw2QLagtw
 kIrcazis3IAcYCH8OE4ruD5nAGYLFqRIt0MOa/NAJD0r00xM7nvOhII2+6uAXF+A
 1JbQDRq8eleCKMUMV0XchqWx6pYTXL8cLh1YEXZ0BTUFKIz+y22HjWnMf+odDhLB
 okQic67/+i7mJDAAW4U+pyevd0QBZdDOohjQtbj+irv2pb7WtWqylKcYhAYSpgsy
 MtPzzYyPDs/aHLNcnIJVdVtbKfNXsaHuCgEvKKgLXnKMMcS5UxSIxj+Q1IxSIOM=
 =B7HS
 -----END PGP SIGNATURE-----

Merge tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull trace ring buffer iterator fix from Steven Rostedt:
 "While testing some new changes for 3.18, I kept hitting a bug every so
  often in the ring buffer.  At first I thought it had to do with some
  of the changes I was working on, but then testing something else I
  realized that the bug was in 3.17 itself.  I ran several bisects as
  the bug was not very reproducible, and finally came up with the commit
  that I could reproduce easily within a few minutes, and without the
  change I could run the tests over an hour without issue.  The change
  fit the bug and I figured out a fix.  That bad commit was:

    Commit 651e22f270 "ring-buffer: Always reset iterator to reader page"

  This commit fixed a bug, but in the process created another one.  It
  used the wrong value as the cached value that is used to see if things
  changed while an iterator was in use.  This made it look like a change
  always happened, and could cause the iterator to go into an infinite
  loop"

* tag 'trace-fixes-v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Fix infinite spin in reading buffer
2014-10-03 13:31:57 -07:00
Linus Torvalds 7d1419f30c Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs/smb3 fixes from Steve French:
 "Fix for CIFS/SMB3 oops on reconnect during readpages (3.17 regression)
  and for incorrectly closing file handle in symlink error cases"

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix readpages retrying on reconnects
  Fix problem recognizing symlinks
2014-10-03 13:09:57 -07:00
David S. Miller fba7516303 Merge branch 'rds-net'
Herton R. Krzesinski says:

====================
Small fixes/changes for RDS

I got a report of one issue within RDS (after investigation it was a double
free), and I'm sending the fix (patch 3/3) which reporter said it works (no more
WARNING triggered on a specially instrumented kernel). The report/test was done
on a very old kernel (RHEL 5, 2.6.18 based with backports), but the problem the
patch handles still exists and should not change. Besides that, while
reviewing some of the code but being unable to reproduce with rds_tcp, I
noticed two small improvements/fixes which are in patches 1 and 2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:52:19 -07:00
Herton R. Krzesinski 593cbb3ec6 net/rds: fix possible double free on sock tear down
I got a report of a double free happening at RDS slab cache. One
suspicion was that may be somewhere we were doing a sock_hold/sock_put
on an already freed sock. Thus after providing a kernel with the
following change:

 static inline void sock_hold(struct sock *sk)
 {
-       atomic_inc(&sk->sk_refcnt);
+       if (!atomic_inc_not_zero(&sk->sk_refcnt))
+               WARN(1, "Trying to hold sock already gone: %p (family: %hd)\n",
+                       sk, sk->sk_family);
 }

The warning successfuly triggered:

Trying to hold sock already gone: ffff81f6dda61280 (family: 21)
WARNING: at include/net/sock.h:350 sock_hold()
Call Trace:
<IRQ>  [<ffffffff8adac135>] :rds:rds_send_remove_from_sock+0xf0/0x21b
[<ffffffff8adad35c>] :rds:rds_send_drop_acked+0xbf/0xcf
[<ffffffff8addf546>] :rds_rdma:rds_ib_recv_tasklet_fn+0x256/0x2dc
[<ffffffff8009899a>] tasklet_action+0x8f/0x12b
[<ffffffff800125a2>] __do_softirq+0x89/0x133
[<ffffffff8005f30c>] call_softirq+0x1c/0x28
[<ffffffff8006e644>] do_softirq+0x2c/0x7d
[<ffffffff8006e4d4>] do_IRQ+0xee/0xf7
[<ffffffff8005e625>] ret_from_intr+0x0/0xa
<EOI>

Looking at the call chain above, the only way I think this would be
possible is if somewhere we already released the same socket->sock which
is assigned to the rds_message at rds_send_remove_from_sock. Which seems
only possible to happen after the tear down done on rds_release.

rds_release properly calls rds_send_drop_to to drop the socket from any
rds_message, and some proper synchronization is in place to avoid race
with rds_send_drop_acked/rds_send_remove_from_sock. However, I still see
a very narrow window where it may be possible we touch a sock already
released: when rds_release races with rds_send_drop_acked, we check
RDS_MSG_ON_CONN to avoid cleanup on the same rds_message, but in this
specific case we don't clear rm->m_rs. In this case, it seems we could
then go on at rds_send_drop_to and after it returns, the sock is freed
by last sock_put on rds_release, with concurrently we being at
rds_send_remove_from_sock; then at some point in the loop at
rds_send_remove_from_sock we process an rds_message which didn't have
rm->m_rs unset for a freed sock, and a possible sock_hold on an sock
already gone at rds_release happens.

This hopefully address the described condition above and avoids a double
free on "second last" sock_put. In addition, I removed the comment about
socket destruction on top of rds_send_drop_acked: we call rds_send_drop_to
in rds_release and we should have things properly serialized there, thus
I can't see the comment being accurate there.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:52:00 -07:00
Herton R. Krzesinski eb74cc97b8 net/rds: do proper house keeping if connection fails in rds_tcp_conn_connect
I see two problems if we consider the sock->ops->connect attempt to fail in
rds_tcp_conn_connect. The first issue is that for example we don't remove the
previously added rds_tcp_connection item to rds_tcp_tc_list at
rds_tcp_set_callbacks, which means that on a next reconnect attempt for the
same rds_connection, when rds_tcp_conn_connect is called we can again call
rds_tcp_set_callbacks, resulting in duplicated items on rds_tcp_tc_list,
leading to list corruption: to avoid this just make sure we call
properly rds_tcp_restore_callbacks before we exit. The second issue
is that we should also release the sock properly, by setting sock = NULL
only if we are returning without error.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:51:59 -07:00
Herton R. Krzesinski 310886dd5f net/rds: call rds_conn_drop instead of open code it at rds_connect_complete
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:51:59 -07:00
David S. Miller c2bf5ec204 Merge branch 'qdisc_bulk_dequeue'
Jesper Dangaard Brouer says:

====================
qdisc: bulk dequeue support

This patchset uses DaveM's recent API changes to dev_hard_start_xmit(),
from the qdisc layer, to implement dequeue bulking.

Patch01: "qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE"
 - Implement basic qdisc dequeue bulking
 - This time, 100% relying on BQL limits, no magic safe-guard constants

Patch02: "qdisc: dequeue bulking also pickup GSO/TSO packets"
 - Extend bulking to bulk several GSO/TSO packets
 - Seperate patch, as it introduce a small regression, see test section.

We do have a patch03, which exports a userspace tunable as a BQL
tunable, that can byte-cap or disable the bulking/bursting.  But we
could not agree on it internally, thus not sending it now.  We
basically strive to avoid adding any new userspace tunable.

Testing patch01:
================
 Demonstrating the performance improvement of qdisc dequeue bulking, is
tricky because the effect only "kicks-in" once the qdisc system have a
backlog. Thus, for a backlog to form, we need either 1) to exceed wirespeed
of the link or 2) exceed the capability of the device driver.

For practical use-cases, the measureable effect of this will be a
reduction in CPU usage

01-TCP_STREAM:
--------------
Testing effect for TCP involves disabling TSO and GSO, because TCP
already benefit from bulking, via TSO and especially for GSO segmented
packets.  This patch view TSO/GSO as a seperate kind of bulking, and
avoid further bulking of these packet types.

The measured perf diff benefit (at 10Gbit/s) for a single netperf
TCP_STREAM were 9.24% less CPU used on calls to _raw_spin_lock()
(mostly from sch_direct_xmit).

If my E5-2695v2(ES) CPU is tuned according to:
 http://netoptimizer.blogspot.dk/2014/04/basic-tuning-for-network-overload.html
Then it is possible that a single netperf TCP_STREAM, with GSO and TSO
disabled, can utilize all bandwidth on a 10Gbit/s link.  This will
then cause a standing backlog queue at the qdisc layer.

Trying to pressure the system some more CPU util wise, I'm starting
24x TCP_STREAMs and monitoring the overall CPU utilization.  This
confirms bulking saves CPU cycles when it "kicks-in".

Tool mpstat, while stressing the system with netperf 24x TCP_STREAM, shows:
 * Disabled bulking: sys:2.58%  soft:8.50%  idle:88.78%
 * Enabled  bulking: sys:2.43%  soft:7.66%  idle:89.79%

02-UDP_STREAM
-------------
The measured perf diff benefit for UDP_STREAM were 6.41% less CPU used
on calls to _raw_spin_lock().  24x UDP_STREAM with packet size -m 1472 (to
avoid sending UDP/IP fragments).

03-trafgen driver test
----------------------
The performance of the 10Gbit/s ixgbe driver is limited due to
updating the HW ring-queue tail-pointer on every packet.  As
previously demonstrated with pktgen.

Using trafgen to send RAW frames from userspace (via AF_PACKET), and
forcing it through qdisc path (with option --qdisc-path and -t0),
sending with 12 CPUs.

I can demonstrate this driver layer limitation:
 * 12.8 Mpps with no qdisc bulking
 * 14.8 Mpps with qdisc bulking (full 10G-wirespeed)

Testing patch02:
================
Testing Bulking several GSO/TSO packets:

Measuring HoL (Head-of-Line) blocking for TSO and GSO, with
netperf-wrapper. Bulking several TSO show no performance regressions
(requeues were in the area 32 requeues/sec for 10G while transmitting
approx 813Kpps).

Bulking several GSOs does show small regression or very small
improvement (requeues were in the area 8000 requeues/sec, for 10G
while transmitting approx 813Kpps).

 Using ixgbe 10Gbit/s with GSO bulking, we can measure some additional
latency. Base-case, which is "normal" GSO bulking, sees varying
high-prio queue delay between 0.38ms to 0.47ms.  Bulking several GSOs
together, result in a stable high-prio queue delay of 0.50ms.

Corrosponding to:
 (10000*10^6)*((0.50-0.47)/10^3)/8 = 37500 bytes
 (10000*10^6)*((0.50-0.38)/10^3)/8 = 150000 bytes
 37500/1500  = 25 pkts
 150000/1500 = 100 pkts

 Using igb at 100Mbit/s with GSO bulking, shows an improvement.
Base-case sees varying high-prio queue delay between 2.23ms to 2.35ms
diff of 0.12ms corrosponding to 1500 bytes at 100Mbit/s. Bulking
several GSOs together, result in a stable high-prio queue delay of
2.23ms.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:37:23 -07:00
Jesper Dangaard Brouer 808e7ac0bd qdisc: dequeue bulking also pickup GSO/TSO packets
The TSO and GSO segmented packets already benefit from bulking
on their own.

The TSO packets have always taken advantage of the only updating
the tailptr once for a large packet.

The GSO segmented packets have recently taken advantage of
bulking xmit_more API, via merge commit 53fda7f7f9 ("Merge
branch 'xmit_list'"), specifically via commit 7f2e870f2a ("net:
Move main gso loop out of dev_hard_start_xmit() into helper.")
allowing qdisc requeue of remaining list.  And via commit
ce93718fb7 ("net: Don't keep around original SKB when we
software segment GSO frames.").

This patch allow further bulking of TSO/GSO packets together,
when dequeueing from the qdisc.

Testing:
 Measuring HoL (Head-of-Line) blocking for TSO and GSO, with
netperf-wrapper. Bulking several TSO show no performance regressions
(requeues were in the area 32 requeues/sec).

Bulking several GSOs does show small regression or very small
improvement (requeues were in the area 8000 requeues/sec).

 Using ixgbe 10Gbit/s with GSO bulking, we can measure some additional
latency. Base-case, which is "normal" GSO bulking, sees varying
high-prio queue delay between 0.38ms to 0.47ms.  Bulking several GSOs
together, result in a stable high-prio queue delay of 0.50ms.

 Using igb at 100Mbit/s with GSO bulking, shows an improvement.
Base-case sees varying high-prio queue delay between 2.23ms to 2.35ms

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:37:06 -07:00
Jesper Dangaard Brouer 5772e9a346 qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE
Based on DaveM's recent API work on dev_hard_start_xmit(), that allows
sending/processing an entire skb list.

This patch implements qdisc bulk dequeue, by allowing multiple packets
to be dequeued in dequeue_skb().

The optimization principle for this is two fold, (1) to amortize
locking cost and (2) avoid expensive tailptr update for notifying HW.
 (1) Several packets are dequeued while holding the qdisc root_lock,
amortizing locking cost over several packet.  The dequeued SKB list is
processed under the TXQ lock in dev_hard_start_xmit(), thus also
amortizing the cost of the TXQ lock.
 (2) Further more, dev_hard_start_xmit() will utilize the skb->xmit_more
API to delay HW tailptr update, which also reduces the cost per
packet.

One restriction of the new API is that every SKB must belong to the
same TXQ.  This patch takes the easy way out, by restricting bulk
dequeue to qdisc's with the TCQ_F_ONETXQUEUE flag, that specifies the
qdisc only have attached a single TXQ.

Some detail about the flow; dev_hard_start_xmit() will process the skb
list, and transmit packets individually towards the driver (see
xmit_one()).  In case the driver stops midway in the list, the
remaining skb list is returned by dev_hard_start_xmit().  In
sch_direct_xmit() this returned list is requeued by dev_requeue_skb().

To avoid overshooting the HW limits, which results in requeuing, the
patch limits the amount of bytes dequeued, based on the drivers BQL
limits.  In-effect bulking will only happen for BQL enabled drivers.

Small amounts for extra HoL blocking (2x MTU/0.24ms) were
measured at 100Mbit/s, with bulking 8 packets, but the
oscillating nature of the measurement indicate something, like
sched latency might be causing this effect. More comparisons
show, that this oscillation goes away occationally. Thus, we
disregard this artifact completely and remove any "magic" bulking
limit.

For now, as a conservative approach, stop bulking when seeing TSO and
segmented GSO packets.  They already benefit from bulking on their own.
A followup patch add this, to allow easier bisect-ability for finding
regressions.

Jointed work with Hannes, Daniel and Florian.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:37:06 -07:00
Mark Einon 38df6492eb et131x: Add PCIe gigabit ethernet driver et131x to drivers/net
This adds the ethernet driver for Agere et131x devices to
drivers/net/ethernet.

The driver being added has been in the staging tree for some time, and will be
removed from there in a seperate patch. This one merely disables the staging
version to prevent two instances being built.

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-03 12:22:19 -07:00
Dmitry Torokhov 447a8b858e Merge branch 'next' into for-linus
Prepare first round of input updates for 3.18.
2014-10-03 11:24:46 -07:00
Linus Torvalds ee042ec880 One fix for raid5 discard issue.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIVAwUAVC40bznsnt1WYoG5AQJs0w//T8Z3vBhXwC3SjREbH42x8pWxFXgzhUdW
 WGKwYYwHFQfL05rP3eg+X+Sh+X3s3E3/+lFzDVE/JGjBdSG/A5X+j5jRH6ENl16u
 32wQjZwKliYxSiUPWbbPzl0BDi3nyhpR1CMsgTWpU6qkMuGy2npNZFK/cQIy4/Je
 ygXLwncKPfAVszaMrx9oBeghpwTnJR3Se1RAYhxc4qQLm+jXfUTQ9Ool3J2InZo3
 st7weyuvPrtr+Em65ta0j/VIOzTP5dIallq+LSL5e0lhn5Pc99XBaM60lEzH9mwP
 t8ZDuV6HL8rXktCOwNtkmEQeW/7uOpmwKG0yt6nPFKghsclJvbyrtxOWKnOrPAd8
 0rvUAX7EXX/OQyEjguhJQ1ZK9E4HxglGay7trVyXeLdK3kOUaVjAlG/5y8FrEusU
 K/uOnHwIiCDxTZ86imTcRcXxOG5eWVscsWwVFIHf2IN4J1NKJ7rQB6LP4WEKzHGE
 y3CgSy/c7yyuksMlYZE9q1GRc5i8pI5tGg3tV2S2uvgjuNIw6dkAF1UjW5Mpgy3x
 OFue/SxsYWuBeSqMANuJQDeCOLttwSZV/Ho3Kac+nA7l0LTXkVIYFj0bSGIIUxjI
 scF+r04u35/jib0QALdeTw9FB46V5GYZ7d0bd/lEMxe24dsPreCKiI/xDvpr5NQn
 8flYcaBK+PI=
 =Qyea
 -----END PGP SIGNATURE-----

Merge tag 'md/3.17-final-fix' of git://neil.brown.name/md

Pull raid5 discard fix from Neil Brown:
 "One fix for raid5 discard issue"

* tag 'md/3.17-final-fix' of git://neil.brown.name/md:
  md/raid5: disable 'DISCARD' by default due to safety concerns.
2014-10-03 08:40:37 -07:00
Mark Brown a2285b8c75 Merge remote-tracking branch 'spi/topic/xilinx' into spi-next 2014-10-03 16:33:44 +01:00
Mark Brown bab4d751f7 Merge remote-tracking branches 'spi/topic/pl022', 'spi/topic/pxa2xx', 'spi/topic/rspi', 'spi/topic/sh-msiof' and 'spi/topic/sirf' into spi-next 2014-10-03 16:33:42 +01:00
Mark Brown 899d81b974 Merge remote-tracking branches 'spi/topic/fsl-dspi', 'spi/topic/imx', 'spi/topic/mxs', 'spi/topic/omap-100k' and 'spi/topic/orion' into spi-next 2014-10-03 16:33:41 +01:00
Mark Brown 7020d76971 Merge remote-tracking branches 'spi/topic/davinci', 'spi/topic/doc', 'spi/topic/dw' and 'spi/topic/fsl' into spi-next 2014-10-03 16:33:39 +01:00
Mark Brown 1fc8450313 Merge remote-tracking branches 'spi/topic/bcm53xx', 'spi/topic/cadence', 'spi/topic/checkpatch' and 'spi/topic/clps711x' into spi-next 2014-10-03 16:33:37 +01:00
Mark Brown 613c44798f Merge remote-tracking branch 'spi/topic/dma-dep' into spi-next 2014-10-03 16:33:37 +01:00
Mark Brown ad71f40a83 Merge remote-tracking branch 'spi/topic/core' into spi-next 2014-10-03 16:33:37 +01:00
Mark Brown 62d02e41ea Merge remote-tracking branch 'spi/fix/rockchip' into spi-linus 2014-10-03 16:33:35 +01:00
Linus Torvalds 80ad99da8b Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Nothing too major or scary.

  One i915 regression fix, nouveau has a tmds regression fix, along with
  a regression fix for the runtime pm code for optimus laptops not
  restoring the display hw correctly"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: make sure display hardware is reinitialised on runtime resume
  drm/nouveau: punt fbcon resume out to a workqueue
  drm/nouveau: fix regression on original nv50 board
  drm/nv50/disp: fix dpms regression on certain boards
  drm/i915: Flush the PTEs after updating them before suspend
2014-10-03 08:31:14 -07:00
Geoff Levand 0a6479b0ff arm64: Remove unneeded extern keyword
Function prototypes are never definitions, so remove any 'extern' keyword
from the funcion prototypes in cpu_ops.h. Fixes warnings emited by
checkpatch.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-10-03 14:51:02 +01:00
Michael Opdenacker 0415447aa3 Documentation: fix broken v4l-utils URL
This replaces http://git.linuxtv.org/v4l-utils/ (broken link)
by http://git.linuxtv.org/cgit.cgi/v4l-utils.git/

Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-10-03 15:49:57 +02:00
Uwe Kleine-König c8fdd497a4 ARM64: make of_device_ids const
of_device_ids (i.e. compatible strings and the respective data) are not
supposed to change at runtime. All functions working with of_device_ids
provided by <linux/of.h> work with const of_device_ids. So mark the
only non-const struct in arch/arm64 as const, too.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-10-03 14:49:28 +01:00
Peter Foley 7b345771ba Documentation: update include path for mpssd
sysfs.c includes mpssd.h which includes virtio_ids.h.
sysfs.c doesn't have the proper include flags set to use the latest
headers, so this causes a build error if the system headers are too old.

Signed-off-by: Peter Foley <pefoley2@pefoley.com>
Cc: rdunlap@infradead.org
Cc: linux-doc@vger.kernel.org
Cc: sudeep.dutt@intel.com
Cc: nikhil.rao@intel.com
Cc: ashutosh.dixit@intel.com
Cc: akpm@linux-foundation.org
Cc: gregkh@linuxfoundation.org
Cc: harshavardhan.r.kharche@intel.com
Cc: caz.yokoyama@intel.com
Cc: dasaratharaman.chandramouli@intel.com
Cc: jkosina@suse.cz
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-10-03 15:48:20 +02:00
Arturo Borrero 8da4cc1b10 netfilter: nft_masq: register/unregister notifiers on module init/exit
We have to register the notifiers in the masquerade expression from
the the module _init and _exit path.

This fixes crashes when removing the masquerade rule with no
ipt_MASQUERADE support in place (which was masking the problem).

Fixes: 9ba1f72 ("netfilter: nf_tables: add new nft_masq expression")
Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-03 14:24:35 +02:00
Michael Heimpold a44619c31c spi: spi-mxs: fix a tiny typo in a comment
Signed-off-by: Michael Heimpold <mhei@heimpold.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
2014-10-03 10:33:57 +01:00
Christoph Hellwig 2c2d831c81 [SCSI] uas: disable use of blk-mq I/O path
The uas driver uses the block layer tag for USB3 stream IDs.  With
blk-mq we can get larger tag numbers that the queue depth, which breaks
this assumption.  A fix is under way for 3.18, but sits on top of
large changes so can't easily be backported.   Set the disable_blk_mq
path so that a uas device can't easily crash the system when using
blk-mq for SCSI.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-10-03 05:27:58 -04:00
Geert Uytterhoeven 24cae7934c m68k: Reformat arch/m68k/mm/hwtest.c
No functional changes

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2014-10-03 10:50:56 +02:00
Geert Uytterhoeven e4dc601bf9 m68k: Disable/restore interrupts in hwreg_present()/hwreg_write()
hwreg_present() and hwreg_write() temporarily change the VBR register to
another vector table. This table contains a valid bus error handler
only, all other entries point to arbitrary addresses.

If an interrupt comes in while the temporary table is active, the
processor will start executing at such an arbitrary address, and the
kernel will crash.

While most callers run early, before interrupts are enabled, or
explicitly disable interrupts, Finn Thain pointed out that macsonic has
one callsite that doesn't, causing intermittent boot crashes.
There's another unsafe callsite in hilkbd.

Fix this for good by disabling and restoring interrupts inside
hwreg_present() and hwreg_write().

Explicitly disabling interrupts can be removed from the callsites later.

Reported-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
2014-10-03 10:50:56 +02:00
Greg Kroah-Hartman 69784fa539 Revert "serial/core: Initialize the console pm state"
This reverts commit a86713b153.

Kevin Hilman writes:

	Multiple boot failures on ARM[1] were bisected down to this
	patch.

	How was this patch tested, and on which platforms?

	Also, the changelog states that this should be done only for
	UART_CAP_SLEEP, but the patch does it for every UART.

	Greg, I suggest this patch be dropped from tty-next until it has
	been better described and tested.

	[1] http://lists.linaro.org/pipermail/kernel-build-reports/2014-October/005550.html

Reported-by: Kevin Hilman <khilman@kernel.org>
Cc: Sudhir Sreedharan <ssreedharan@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-02 21:33:50 -07:00
Linus Torvalds 5858686959 ACPI and power management fixes for final 3.17
- A recent cpufreq core fix went too far and introduced a regression
    in the system suspend code path.  Fix from Viresh Kumar.
 
  - An ACPI-related commit in the i915 driver that fixed backlight
    problems for some Thinkpads inadvertently broke a Dell machine
    (in 3.16).  Fix from Aaron Lu.
 
  - The pcc-cpufreq driver was broken during the 3.15 cycle by a
    commit that put wait_event() under a spinlock by mistake.  Fix
    that (Rafael J Wysocki).
 
  - The return value type of integrator_cpufreq_remove() is void, but
    should be int.  Fix from Arnd Bergmann.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJULgB3AAoJEILEb/54YlRxeToP/R5grVhxifKrXl7t0+QYkzWx
 Znm0ZLkKq4bjqXJiBTq5f3MehW3QHYfR/LXMdK+4YQqfZYoJfpa7Ui2XT8PMDyH6
 IXwF81ABNVlcgeiAmHMmRslWuL/F3cMfWeFh49qdozhRrSV72Awljp8esRuMCDcu
 p2OBf+gvxYWxxPzP7pgimsLBxxVfw9L7lL3cjogM1vU46bWz1SVP0dw5XEtBlWs5
 z+R9KwStmskfxX50xmU/gxfVDapllh4OvbEsTOQVfjWyB9edkkLFSkqwfs3MaVwi
 kJlYyPgEGTI7Fu6kMuVHdwCItqnKbDWqoXmIiY87708G9KQLEp4kQLpMJykCEYR1
 jzJ/UxiQOfUHLKfvM2L8bdkLp8MPwsOvesW7oU9G1TLO7+t+ZQannW0E+JmBX00t
 RtNAt5O4XiPigdd8+OUHjPeTVV+w6BTVN3Wd9BRUOzhcKfV2BlH2brmkFNmLeLrQ
 rR3Tt6VyB61v6xCx6exWeBFVi/FuRI97Pur4Oc2J3GE0zjxfSv5sPByF6qDqX1hv
 dgA/xrnIzROWXf4b9GVULcJZgHuTeA8p7PEDugYhLitgN/sk2HsxtURJ2plqnT8X
 HSh+T3PM+uD3Lm0riF1KjTknrAHPLuDafySlhSlrNblYS7R1ggGQdnbSpyVneM1g
 G449tBjoQnC2oPOTMUCR
 =jOuf
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:
 "These are three regression fixes (cpufreq core, pcc-cpufreq, i915 /
  ACPI) and one trivial fix for a callback return value mismatch in the
  cpufreq integrator driver.

  Specifics:

   - A recent cpufreq core fix went too far and introduced a regression
     in the system suspend code path.  Fix from Viresh Kumar.

   - An ACPI-related commit in the i915 driver that fixed backlight
     problems for some Thinkpads inadvertently broke a Dell machine (in
     3.16).  Fix from Aaron Lu.

   - The pcc-cpufreq driver was broken during the 3.15 cycle by a commit
     that put wait_event() under a spinlock by mistake.  Fix that
     (Rafael J Wysocki).

   - The return value type of integrator_cpufreq_remove() is void, but
     should be int.  Fix from Arnd Bergmann"

* tag 'pm+acpi-3.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: update 'cpufreq_suspended' after stopping governors
  ACPI / i915: Update the condition to ignore firmware backlight change request
  cpufreq: integrator: fix integrator_cpufreq_remove return type
  cpufreq: pcc-cpufreq: Fix wait_event() under spinlock
2014-10-02 18:47:28 -07:00
Dave Airlie eee0815dab Merge tag 'drm-intel-fixes-2014-10-02' of git://anongit.freedesktop.org/drm-intel into drm-fixes
final regression fix for 3.17.

* tag 'drm-intel-fixes-2014-10-02' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Flush the PTEs after updating them before suspend
2014-10-03 11:38:16 +10:00
Andy Gross 86b59bbfae i2c: qup: Fix order of runtime pm initialization
The runtime pm calls need to be done before populating the children via the
i2c_add_adapter call.  If this is not done, a child can run into issues trying
to do i2c read/writes due to the pm_runtime_sync failing.

Signed-off-by: Andy Gross <agross@codeaurora.org>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Acked-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2014-10-03 03:20:47 +02:00
Alexandru M Stan cf27020d2f i2c: rk3x: fix 0 length write transfers
i2cdetect -q was broken (everything was a false positive, and no transfers were
actually being sent over i2c). The way it works is by sending a 0 length write
request and checking for NACK. This patch fixes the 0 length writes and actually
sends them.

Reported-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Alexandru M Stan <amstan@chromium.org>
Tested-by: Doug Anderson <dianders@chromium.org>
Tested-by: Max Schwarz <max.schwarz@online.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2014-10-03 03:18:53 +02:00
Rafael J. Wysocki abcadddc85 Merge branches 'pm-cpufreq' and 'acpi-video'
* pm-cpufreq:
  cpufreq: update 'cpufreq_suspended' after stopping governors
  cpufreq: integrator: fix integrator_cpufreq_remove return type
  cpufreq: pcc-cpufreq: Fix wait_event() under spinlock

* acpi-video:
  ACPI / i915: Update the condition to ignore firmware backlight change request
2014-10-03 03:10:07 +02:00
Linus Torvalds f929d3995d Merge branch 'akpm' (fixes from Andrew Morton)
Merge fixes from Andrew Morton:
 "5 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: page_alloc: fix zone allocation fairness on UP
  perf: fix perf bug in fork()
  MAINTAINERS: change git URL for mpc5xxx tree
  mm: memcontrol: do not iterate uninitialized memcgs
  ocfs2/dlm: should put mle when goto kill in dlm_assert_master_handler
2014-10-02 16:29:19 -07:00
Johannes Weiner abe5f97291 mm: page_alloc: fix zone allocation fairness on UP
The zone allocation batches can easily underflow due to higher-order
allocations or spills to remote nodes.  On SMP that's fine, because
underflows are expected from concurrency and dealt with by returning 0.
But on UP, zone_page_state will just return a wrapped unsigned long,
which will get past the <= 0 check and then consider the zone eligible
until its watermarks are hit.

Commit 3a025760fc ("mm: page_alloc: spill to remote nodes before
waking kswapd") already made the counter-resetting use
atomic_long_read() to accomodate underflows from remote spills, but it
didn't go all the way with it.

Make it clear that these batches are expected to go negative regardless
of concurrency, and use atomic_long_read() everywhere.

Fixes: 81c0a2bb51 ("mm: page_alloc: fair zone allocator policy")
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Leon Romanovsky <leon@leon.nu>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
Peter Zijlstra 6c72e3501d perf: fix perf bug in fork()
Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
calling perf_event_free_task() when failing sched_fork() we will not yet
have done the memset() on ->perf_event_ctxp[] and will therefore try and
'free' the inherited contexts, which are still in use by the parent
process.  This is bad..

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
Anatolij Gustschin cba5b1c6e2 MAINTAINERS: change git URL for mpc5xxx tree
The repository for mpc5xxx has been moved, update git URL to new
location.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
Johannes Weiner 2f7dd7a410 mm: memcontrol: do not iterate uninitialized memcgs
The cgroup iterators yield css objects that have not yet gone through
css_online(), but they are not complete memcgs at this point and so the
memcg iterators should not return them.  Commit d8ad305597 ("mm/memcg:
iteration skip memcgs not yet fully initialized") set out to implement
exactly this, but it uses CSS_ONLINE, a cgroup-internal flag that does
not meet the ordering requirements for memcg, and so the iterator may
skip over initialized groups, or return partially initialized memcgs.

The cgroup core can not reasonably provide a clear answer on whether the
object around the css has been fully initialized, as that depends on
controller-specific locking and lifetime rules.  Thus, introduce a
memcg-specific flag that is set after the memcg has been initialized in
css_online(), and read before mem_cgroup_iter() callers access the memcg
members.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
alex chen 55dacd22db ocfs2/dlm: should put mle when goto kill in dlm_assert_master_handler
In dlm_assert_master_handler, the mle is get in dlm_find_mle, should be
put when goto kill, otherwise, this mle will never be released.

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-02 16:28:44 -07:00
Linus Torvalds b601ce0fe3 media fixes for v3.17-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJULb5NAAoJEAhfPr2O5OEVMUsQAJMfLQen1kX/kw2J7GRV2kQM
 3MrQoBERrEi+QOZIfpsAAlytk3ebIVi5HCIXiTa7SUc18M5NoeOz4JiVVrTo5iDi
 yJ5aLA0WiY31weRg4JMieGjjYJqyYBiadYPuft6GidX5Gqi5+2Ta/vbq6N4SE+Be
 eIUXWDOKTbeinUt3b8AqwUt9S02A8DRjqGWQyn8jVSYUl/Db0+VBhw95z7CWOfjM
 8MgDjbPYYoLBAkcMsZ4d/1Eyv7qT2miYKFQo7Udgd+Zi0jExFTkY6uFMFRSlFZ1D
 g+h3/diHdmuTPNsiYQaESxSgNxdxfJk4nLhsxnP+E/71c6qUzYwAf6peXvl3JE39
 E7sdDKlmQjoECUUuWBiJffcEg5VG/jKF2paKqCm+ti5jM/c4WQc/kQaIWk63hSYG
 on9Y6t9xyGWHoIylDtqI3m3mwJfituxCshBl3FAenXNDq45qU4BUWRzpsoJK6P6U
 znp25UXborgkvPDb1/IlDqrxLZ9dXP7OKHNUbW/tbMAqpLWH6GEDHWfkvPMI0sMm
 oaYPPp2wj/1FAEZzTIyUH4BHUnob6aKEa/WaOoO87qLshmShkWMamMrEVer59CR6
 CdhFslYNRkvhLyE+ywj9XSbG8zE7HQE6vdGFxfvcUFmFTj2jlHmZWL6cv+Zs4E72
 sCo7y0KxnQmJjc0vwxLq
 =hkpj
 -----END PGP SIGNATURE-----

Merge tag 'media/v3.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fix from Mauro Carvalho Chehab:
 "One last time regression fix at em28xx.  The removal of .reset_resume
  broke suspend/resume on this driver for some devices.

  There are more fixes to be done for em28xx suspend/resume to be better
  handled, but I'm opting to let them to stay for a while at the media
  devel tree, in order to get more tests.  So, for now, let's just
  revert this patch"

* tag 'media/v3.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  Revert "[media] media: em28xx - remove reset_resume interface"
2014-10-02 16:10:38 -07:00
Steven Rostedt (Red Hat) 24607f114f ring-buffer: Fix infinite spin in reading buffer
Commit 651e22f270 "ring-buffer: Always reset iterator to reader page"
fixed one bug but in the process caused another one. The reset is to
update the header page, but that fix also changed the way the cached
reads were updated. The cache reads are used to test if an iterator
needs to be updated or not.

A ring buffer iterator, when created, disables writes to the ring buffer
but does not stop other readers or consuming reads from happening.
Although all readers are synchronized via a lock, they are only
synchronized when in the ring buffer functions. Those functions may
be called by any number of readers. The iterator continues down when
its not interrupted by a consuming reader. If a consuming read
occurs, the iterator starts from the beginning of the buffer.

The way the iterator sees that a consuming read has happened since
its last read is by checking the reader "cache". The cache holds the
last counts of the read and the reader page itself.

Commit 651e22f270 changed what was saved by the cache_read when
the rb_iter_reset() occurred, making the iterator never match the cache.
Then if the iterator calls rb_iter_reset(), it will go into an
infinite loop by checking if the cache doesn't match, doing the reset
and retrying, just to see that the cache still doesn't match! Which
should never happen as the reset is suppose to set the cache to the
current value and there's locks that keep a consuming reader from
having access to the data.

Fixes: 651e22f270 "ring-buffer: Always reset iterator to reader page"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-10-02 16:51:18 -04:00
Ebru Akagunduz 77d966f4b0 staging: emxx_udc: Use min_t instead of min
Use min_t instead of min function in emxx_udc.c

Fix checkpatch.pl warnings:
WARNING: min() should probably be min_t(u32, iBufSize, ep->ep.maxpacket)
WARNING: min() should probably be min_t(u32, data_size, ep->ep.maxpacket)
WARNING: min() should probably be min_t(u16, udc->ctrl.wLength, sizeof(status_data))

Changes in v2:
 - Fixed min function call as min_t

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-02 13:51:03 -07:00
Ebru Akagunduz fb71d24bdc staging: emxx_udc: Fix replace printk(KERN_DEBUG ..) with dev_dbg
This patch fixes "Prefer [subsystem eg: netdev]_dbg([subsystem]dev,
... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG"
checkpatch.pl warning in emxx_udc.c

Changes in v2:
 - Fixed dev_debug function call as dev_dbg

Signed-off-by: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-02 13:51:03 -07:00
Yeliz Taneroglu 4571c4f6f6 staging: media: Fixed else after return or break warning
The following patch fixes the checkpatch.pl warning:

drivers/staging/media/omap4iss/iss_csi2.c:811 warning: else is not generally useful after a break or return

Signed-off-by: Yeliz Taneroglu <yeliztaneroglu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-02 13:51:03 -07:00
Yeliz Taneroglu ae357388c2 staging: media: omap4iss: Fixed else after return or break warning
The following patch fixes the checkpatch.pl warning:

drivers/staging/media/omap4iss/iss_ipipe.c:184 warning: else is not generally useful after a break or return

Signed-off-by: Yeliz Taneroglu <yeliztaneroglu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-10-02 13:51:03 -07:00
Russell King d5d1689224 Merge branches 'fiq' (early part), 'fixes', 'l2c' (early part) and 'misc' into for-next 2014-10-02 21:47:02 +01:00
Yalin Wang 421520ba98 ARM: 8167/1: extend the reserved memory for initrd to be page aligned
This patch extends the start and end address of initrd to be page aligned,
so that we can free all memory including the un-page aligned head or tail
page of initrd, if the start or end address of initrd are not page
aligned, the page can't be freed by free_initrd_mem() function.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-10-02 21:29:17 +01:00