Commit Graph

287399 Commits

Author SHA1 Message Date
Yasunori Goto b5740f4b2c sched: Fix ancient race in do_exit()
try_to_wake_up() has a problem which may change status from TASK_DEAD to
TASK_RUNNING in race condition with SMI or guest environment of virtual
machine. As a result, exited task is scheduled() again and panic occurs.

Here is the sequence how it occurs:

 ----------------------------------+-----------------------------
                                   |
            CPU A                  |             CPU B
 ----------------------------------+-----------------------------

TASK A calls exit()....

do_exit()

  exit_mm()
    down_read(mm->mmap_sem);

    rwsem_down_failed_common()

      set TASK_UNINTERRUPTIBLE
      set waiter.task <= task A
      list_add to sem->wait_list
           :
      raw_spin_unlock_irq()
      (I/O interruption occured)

                                      __rwsem_do_wake(mmap_sem)

                                        list_del(&waiter->list);
                                        waiter->task = NULL
                                        wake_up_process(task A)
                                          try_to_wake_up()
                                             (task is still
                                               TASK_UNINTERRUPTIBLE)
                                              p->on_rq is still 1.)

                                              ttwu_do_wakeup()
                                                 (*A)
                                                   :
     (I/O interruption handler finished)

      if (!waiter.task)
          schedule() is not called
          due to waiter.task is NULL.

      tsk->state = TASK_RUNNING

          :
                                              check_preempt_curr();
                                                  :
  task->state = TASK_DEAD
                                              (*B)
                                        <---    set TASK_RUNNING (*C)

     schedule()
     (exit task is running again)
     BUG_ON() is called!
 --------------------------------------------------------

The execution time between (*A) and (*B) is usually very short,
because the interruption is disabled, and setting TASK_RUNNING at (*C)
must be executed before setting TASK_DEAD.

HOWEVER, if SMI is interrupted between (*A) and (*B),
(*C) is able to execute AFTER setting TASK_DEAD!
Then, exited task is scheduled again, and BUG_ON() is called....

If the system works on guest system of virtual machine, the time
between (*A) and (*B) may be also long due to scheduling of hypervisor,
and same phenomenon can occur.

By this patch, do_exit() waits for releasing task->pi_lock which is used
in try_to_wake_up(). It guarantees the task becomes TASK_DEAD after
waking up.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120117174031.3118.E1E9C6FF@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-27 11:55:36 +01:00
Seth Heasley 84e83c2846 watchdog: iTCO_wdt: add Intel Lynx Point DeviceIDs
This patch adds the TCO Watchdog DeviceIDs for the Intel Lynx Point PCH.

Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 10:01:16 +01:00
Axel Lin f6dd94f819 watchdog: via_wdt: Set min_timeout and max_timeout for wdt_dev
Let the watchdog core to check the valid value range of min_timeout/max_timeout.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 10:00:53 +01:00
Masanari Iida b1785dfd4f watchdog: Fix typo "unexpectdly"
Correct typo "unexpectdly" to "unexpectedly" in pnx4008_wdt.c
and stmp3xxx_wdt.c

Signed-off-by: Masanari Iida<standby24x7@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 10:00:15 +01:00
Axel Lin 8a062ac693 watchdog: wafer5823wdt: Fix handling WDIOS_DISABLECARD/WDIOS_ENABLECARD options
While receiving WDIOS_DISABLECARD option for WDIOC_SETOPTIONS command,
call wafwdt_stop() to disable watchdog.
Call wafwdt_start() while receiving WDIOS_ENABLECARD option.

Current code has reverse behavior.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:55:13 +01:00
Axel Lin ebe06e826f watchdog: wm8350_wdt: Fix handling WDIOS_DISABLECARD/WDIOS_ENABLECARD options
While receiving WDIOS_DISABLECARD option for WDIOC_SETOPTIONS command,
call wm8350_wdt_stop() to disable watchdog.
Call wm8350_wdt_start() while receiving WDIOS_ENABLECARD option.

Current code has reverse behavior.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:55:04 +01:00
Axel Lin 2865e770c9 watchdog: Return proper error in nuc900wdt_probe if misc_register fails
Return proper error instead of 0 if misc_register fails

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:54:51 +01:00
Axel Lin e352829a67 watchdog: Staticise nuc900_wdt
It is only used in this driver, so no need to make the symbol global.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:54:41 +01:00
Axel Lin 0318e286f9 watchdog: via_wdt: Staticise wdt_pci_table
It is only used in this driver, so no need to make the symbol global.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Marc Vertes <marc.vertes@sigfox.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:54:12 +01:00
Shubhrajyoti D 12c583d8dc watchdog: omap_wdt.c: Fix the mismatch of pm_runtime enable and disable
Currently the watchdog driver calls the pm_runtime_enable and never
the disable. This may cause a warning when pm_runtime_enable
checks for the count match.

Also fixes the error

/build/watchdog # insmod omap_wdt.ko
[   44.999389] omap_wdt omap_wdt: Unbalanced pm_runtime_enable!
[   45.011047] OMAP Watchdog Timer Rev 0x00: initial timeout 60 sec
/build/watchdog #

Attempting to fix the same by calling pm_runtime_disable.

Signed-off-by: Shubhrajyoti D <shubhrajyoti@ti.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:53:53 +01:00
Julia Lawall 52ea9a7d79 watchdog: dw_wdt.c: use devm_request_and_ioremap
Reimplement a call to devm_request_mem_region followed by a call to ioremap
or ioremap_nocache by a call to devm_request_and_ioremap.

The semantic patch that makes this transformation is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@nm@
expression myname;
identifier i;
@@

struct platform_driver i = { .driver = { .name = myname } };

@@
expression dev,res,size;
expression nm.myname;
@@

-if (!devm_request_mem_region(dev, res->start, size,
-                              \(res->name\|dev_name(dev)\|myname\))) {
-   ...
-   return ...;
-}
... when != res->start
(
-devm_ioremap(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
|
-devm_ioremap_nocache(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
)
... when any
    when != res->start
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:53:28 +01:00
Julia Lawall 5d32d4868a watchdog: imx2_wdt.c: use devm_request_and_ioremap
Reimplement a call to devm_request_mem_region followed by a call to ioremap
or ioremap_nocache by a call to devm_request_and_ioremap.

The variable res_size is then no longer needed.

The semantic patch that makes this transformation is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@nm@
expression myname;
identifier i;
@@

struct platform_driver i = { .driver = { .name = myname } };

@@
expression dev,res,size;
expression nm.myname;
@@

-if (!devm_request_mem_region(dev, res->start, size,
-                              \(res->name\|dev_name(dev)\|myname\))) {
-   ...
-   return ...;
-}
... when != res->start
(
-devm_ioremap(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
|
-devm_ioremap_nocache(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
)
... when any
    when != res->start
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2012-01-27 09:53:19 +01:00
Dave Airlie 24a7eb7954 Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes
* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
  drm/exynos: fixed pm feature for fimd module.
  MAINTAINERS: added maintainer entry for Exynos DRM Driver.
  drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
  drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
  drm/exynos: use release_mem_region instead of release_resource
2012-01-27 07:16:27 +00:00
Olof Johansson 5e96386431 Merge branch 'imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into fixes
* 'imx-fixes' of git://git.pengutronix.de/git/imx/linux-2.6:
  arch/arm/mach-imx/mach-mx53_ard.c: add missing iounmap
  ARM: imx: iomux-v1.h: Fix build error due to __init annotation
2012-01-26 23:15:11 -08:00
Olof Johansson d8c4cd7463 Merge branch 'fixes-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson into fixes
* 'fixes-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
  mach-ux500: no MMC_CAP_SD_HIGHSPEED on Snowball
  mach-ux500: enable ARM errata 764369
  mach-ux500: do not override outer.inv_all
  mach-ux500: musb: now musb is always in OTG mode
2012-01-26 23:13:20 -08:00
Olof Johansson 3c8cee3b40 Merge branch 'imx6/fixes' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
* 'imx6/fixes' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: imx6: add missing twd_clk for imx6q clock
2012-01-26 23:12:17 -08:00
Olof Johansson d1a8c54b70 Merge branch 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
* 'at91-fixes' of git://github.com/at91linux/linux-at91:
  ARM: at91: Fix at91sam9g45 and at91cap9 reset
  ARM: at91: make rstc soc independent
  ARM: at91: introduce AT91_SAM9_ALT_RESET to select the at91sam9 alternative reset
  ARM: at91: merge at91cap9_ddrsdr.h in at91sam9_ddrsdr.h
  ARM: at91: fix cap9 ddrsdr register
  ARM/USB: at91/ohci-at91: rename vbus_pin_inverted to vbus_pin_active_low
  USB: at91: fix clk_get error handling
  ARM: at91: removal of CAP9 SoC family
  ARM: at91: fix at91rm9200 soc subtype handling
2012-01-26 23:06:52 -08:00
Inki Dae 373af0c0c5 drm/exynos: fixed pm feature for fimd module.
this patch separates fimd specific power on/off function from pm function
and the pm interfaces will call that function for power on or off.
and also removes unnecessary codes of resume function.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-01-27 13:03:59 +09:00
Guenter Roeck cba9384b3c MAINTAINERS: Drop maintainer for MAX1668 hwmon driver
David no longer has access to MAX1688 hardware, so drop him from the maintainers
list.

Cc: David George <dgeorgester@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: David George <dgeorgester@gmail.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
2012-01-26 18:29:57 -08:00
Mark Brown d22b086970 MAINTAINERS: Add hwmon entries for Wolfson
The actual driver code seems to have been lost in the shuffle.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
2012-01-26 18:29:56 -08:00
Inki Dae f15013033e MAINTAINERS: added maintainer entry for Exynos DRM Driver.
I'd like to add my colleagues who dedicated to developing and
improving our driver to maintainer entry.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-01-27 10:43:26 +09:00
Inki Dae a4b42dab29 drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
FB based FIMD and DRM based FIMD drivers use same hardware
so with this patch, only one of them would be selected.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-01-27 10:43:25 +09:00
Seung-Woo Kim 2363dc636d drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
DRM_EXYNOS_HDMI driver and VIDEO_SAMSUNG_S5P_TV driver should be
not enabled at once because they use same HW blocks. So dependency
for DRM_EXYNOS_HDMI is fixed to check VIDEO_SAMSUNG_S5P_TV=n.

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-01-27 10:43:24 +09:00
Seung-Woo Kim 485bc54c33 drm/exynos: use release_mem_region instead of release_resource
To make a api pair of request_mem_region and release_mem_region,
release_mem_region is used instead of release_resource.

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2012-01-27 10:43:23 +09:00
Linus Torvalds 74ea15d909 Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] cinergyT2-fe: Fix bandwdith settings
  [media] V4L: atmel-isi: add clk_prepare()/clk_unprepare() functions
  [media] cxd2820r: sleep on DVB-T/T2 delivery system switch
  [media] anysee: fix CI init
  [media] cxd2820r: remove unused parameter from cxd2820r_attach
  [media] cxd2820r: fix dvb_frontend_ops
2012-01-26 17:04:47 -08:00
Linus Torvalds c75d5c5d82 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc32: forced setting of mode of sun4m per-cpu timers
2012-01-26 17:00:38 -08:00
Tony Lindgren cd3a2ba070 Merge branch 'omap_fixes_a_3.3rc' of git://git.pwsan.com/linux-2.6 into fixes 2012-01-26 17:00:07 -08:00
Julia Lawall 14ea960164 ARM: OMAP2+: arch/arm/mach-omap2/smartreflex.c: add missing iounmap
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
    when != iounmap(e)
*if (...)
   { ... when != iounmap(e)
     return ...; }
... when any
iounmap(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 16:38:47 -08:00
Julia Lawall e0feca899c ARM: OMAP2+: arch/arm/mach-omap2/devices.c: introduce missing kfree
pdata needs to be freed before leaving the function in an error case.

A simplified version of the semantic match that finds the problem is as
follows: (http://coccinelle.lip6.fr)

// <smpl>
@r exists@
local idexpression x;
statement S;
identifier f1;
position p1,p2;
expression *ptr != NULL;
@@

x@p1 = \(kmalloc\|kzalloc\|kcalloc\)(...);
...
if (x == NULL) S
<... when != x
     when != if (...) { <+...x...+> }
x->f1
...>
(
 return \(0\|<+...x...+>\|ptr\);
|
 return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 16:38:47 -08:00
Grazvydas Ignotas d82e5190da ARM: OMAP: fix MMC2 loopback clock handling
Currently MMC2 setup code can only enable loopback clock and
relies on reset value for boards that need to have it disabled.
This causes a problem with certain bootloaders that always enable
that clock, resulting with unwanted bootloader dependencies.

Fix this by making it disable the clock if board data says so.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Acked-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 16:38:47 -08:00
Grazvydas Ignotas ffa1e4ede4 ARM: OMAP: fix erroneous mmc2 clock change on mmc3 setup
hsmmc23_before_set_reg() can set MMCSDIO2ADPCLKISEL bit, which
enables internal clock for MMC2. Currently this function is also called
by code handling MMC3, and if .internal_clock is set in platform data
(by default it currently is), it will set MMCSDIO2ADPCLKISEL for MMC2
instead of MMC3 (MMC3 doesn't have such bit so nothing actually needs to
be done). This breaks 2nd SD slot on pandora.

Fix this by changing hsmmc23_before_set_reg() to only handle MMC2.
Note that this removes .remux() call for MMC3, but no board currently
needs it and it's also not called for MMC4 and MMC5.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 15:54:12 -08:00
Yegor Yefremov 8ef5d844cc ARM: OMAP2+: GPMC: fix device size setup
following statement can only change device size from 8-bit(0) to 16-bit(1),
but not vice versa:

regval |= GPMC_CONFIG1_DEVICESIZE(wval);

so as this field has 1 reserved bit, that could be used in future,
just clear both bits and then OR with the desired value

Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 15:30:30 -08:00
Vaibhav Hiremath dbc3982ae2 ARM: OMAP2+: timer: Fix crash due to wrong arg to __omap_dm_timer_read_counter
Commit 2f0778af (ARM: 7205/2: sched_clock: allow sched_clock to be
selected at runtime) had a typo for the case when CONFIG_OMAP_32K_TIMER
is not set.

In dmtimer_read_sched_clock(), wrong argument was getting passed to
__omap_dm_timer_read_counter() function call; instead of "&clksrc",
we were passing "clksrc.io_base", which results into kernel crash.

To reproduce kernel crash, just disable the CONFIG_OMAP_32K_TIMER config
option (and DEBUG_LL) and build/boot the kernel.
This will use dmtimer as a kernel clocksource and lead to kernel
crash during boot  -

[    0.000000] OMAP clocksource: GPTIMER2 at 26000000 Hz
[    0.000000] sched_clock: 32 bits at 26MHz, resolution 38ns, wraps every
165191ms
[    0.000000] Unable to handle kernel paging request at virtual address
00030ef1
[    0.000000] pgd = c0004000
[    0.000000] [00030ef1] *pgd=00000000
[    0.000000] Internal error: Oops: 5 [#1] SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0    Not tainted  (3.3.0-rc1-11574-g0c76665-dirty #3)
[    0.000000] PC is at dmtimer_read_sched_clock+0x18/0x4c
[    0.000000] LR is at update_sched_clock+0x10/0x84
[    0.000000] pc : [<c00243b8>]    lr : [<c0018684>]    psr: 200001d3
[    0.000000] sp : c0641f38  ip : c0641e18  fp : 0000000a
[    0.000000] r10: 151c3303  r9 : 00000026  r8 : 76276259
[    0.000000] r7 : 00028547  r6 : c065ac80  r5 : 431bde82  r4 : c0655968
[    0.000000] r3 : 00030ef1  r2 : fb032000  r1 : 00000028  r0 : 00000001

Signed-off-by: Vaibhav Hiremath <hvaibhav@ti.com>
[tony@atomide.com: updated comments]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-01-26 15:27:32 -08:00
Glauber Costa 9018e93948 net: explicitly add jump_label.h header to sock.h
Commit 36a1211970 removed linux/module.h
include statement from one of the headers that end up in net/sock.h.
It was providing us with static_branch() definition implicitly, so
after its removal the build got broken.

To fix this, and avoid having this happening in the future,
let me do the right thing and include linux/jump_label.h
explicitly in sock.h.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-26 17:13:26 -05:00
Clemens Ladisch d1bb399ad0 firewire: ohci: add reset packet quirk for SB Audigy
The Audigy's SB1394 controller is actually from Texas Instruments
and has the same bus reset packet generation bug, so it needs the
same quirk entry.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Cc: 2.6.36+ <stable@vger.kernel.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2012-01-26 22:53:17 +01:00
Stefan Gula f18da14565 net: RTNETLINK adjusting values of min_ifinfo_dump_size
Setting link parameters on a netdevice changes the value
of if_nlmsg_size(), therefore it is necessary to recalculate
min_ifinfo_dump_size.

Signed-off-by: Stefan Gula <steweg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-26 16:35:57 -05:00
Willem de Bruijn f2b3ee9e42 ipv6: Fix ip_gre lockless xmits.
Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK.  Sit and
ipip set this unconditionally in ops->setup, but gre enables it
conditionally after parameter passing in ops->newlink. This is
not called during tunnel setup as below, however, so GRE tunnels are
still taking the lock.

modprobe ip_gre
ip tunnel add test0 mode gre remote 10.5.1.1 dev lo
ip link set test0 up
ip addr add 10.6.0.1 dev test0
 # cat /sys/class/net/test0/features
 # $DIR/test_tunnel_xmit 10 10.5.2.1
ip route add 10.5.2.0/24 dev test0
ip tunnel del test0

The newlink callback is only called in rtnl_netlink, and only if
the device is new, as it calls register_netdevice internally. Gre
tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL,
which calls ipgre_tunnel_locate, which calls register_netdev.
rtnl_newlink is called at 'ip link set', but skips ops->newlink
and the device is up with locking still enabled. The equivalent
ipip tunnel works fine, btw (just substitute 'method gre' for
'method ipip').

On kernels before /sys/class/net/*/features was removed [1],
the first commented out line returns 0x6000 with method gre,
which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip,
it reports 0x7000. This test cannot be used on recent kernels where
the sysfs file is removed (and ETHTOOL_GFEATURES does not currently
work for tunnel devices, because they lack dev->ethtool_ops).

The second commented out line calls a simple transmission test [2]
that sends on 24 cores at maximum rate. Results of a single run:

ipip:			19,372,306
gre before patch:	 4,839,753
gre after patch:	19,133,873

This patch replicates the condition check in ipgre_newlink to
ipgre_tunnel_locate. It works for me, both with oseq on and off.
This is the first time I looked at rtnetlink and iproute2 code,
though, so someone more knowledgeable should probably check the
patch. Thanks.

The tail of both functions is now identical, by the way. To avoid
code duplication, I'll be happy to rework this and merge the two.

[1] http://patchwork.ozlabs.org/patch/104610/
[2] http://kernel.googlecode.com/files/xmit_udp_parallel.c

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-26 16:34:08 -05:00
Russell King 9a95b9e741 Merge branch 'sa11x0-mcp-fixes' into fixes 2012-01-26 21:06:54 +00:00
Linus Torvalds 2d3c7efa50 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode_amd: Add support for CPU family specific container files
  x86/amd: Add missing feature flag for fam15h models 10h-1fh processors
  x86/boot-image: Don't leak phdrs in arch/x86/boot/compressed/misc.c::Parse_elf()
  x86/numachip: Drop unnecessary conflict with EDAC
  x86/uv: Fix uninitialized spinlocks
  x86/uv: Fix uv_gpa_to_soc_phys_ram() shift
2012-01-26 12:46:07 -08:00
Linus Torvalds b57cea5e33 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Call perf_cgroup_event_time() directly
  perf: Don't call release_callchain_buffers() if allocation fails
2012-01-26 12:45:57 -08:00
Linus Torvalds 2437dcbf55 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcu: Add missing __cpuinit annotation in rcutorture code
  sched: Add "const" to is_idle_task() parameter
  rcu: Make rcutorture bool parameters really bool (core code)
  memblock: Fix alloc failure due to dumb underflow protection in memblock_find_in_range_node()
2012-01-26 12:45:41 -08:00
Linus Torvalds 0dbfe8ddaa Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Fix assembler constraint to prevent overeager gcc optimisation
  mac_esp: rename irq
  mac_scsi: dont enable mac_scsi irq before requesting it
  macfb: fix black and white modes
  m68k/irq: Remove obsolete IRQ_FLG_* definitions

Fix up trivial conflict in arch/m68k/kernel/process_mm.c as per Geert.
2012-01-26 12:43:57 -08:00
Prarit Bhargava b0f4c4b32c bugs, x86: Fix printk levels for panic, softlockups and stack dumps
rsyslog will display KERN_EMERG messages on a connected
terminal.  However, these messages are useless/undecipherable
for a general user.

For example, after a softlockup we get:

 Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
 kernel:Stack:

 Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
 kernel:Call Trace:

 Message from syslogd@intel-s3e37-04 at Jan 25 14:18:06 ...
 kernel:Code: ff ff a8 08 75 25 31 d2 48 8d 86 38 e0 ff ff 48 89
 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e0 0f 01 c9 <e8> ea 69 dd ff 4c 29 e8 48 89 c7 e8 0f bc da ff 49 89 c4 49 89

This happens because the printk levels for these messages are
incorrect. Only an informational message should be displayed on
a terminal.

I modified the printk levels for various messages in the kernel
and tested the output by using the drivers/misc/lkdtm.c kernel
modules (ie, softlockups, panics, hard lockups, etc.) and
confirmed that the console output was still the same and that
the output to the terminals was correct.

For example, in the case of a softlockup we now see the much
more informative:

 Message from syslogd@intel-s3e37-04 at Jan 25 10:18:06 ...
 BUG: soft lockup - CPU4 stuck for 60s!

instead of the above confusing messages.

AFAICT, the messages no longer have to be KERN_EMERG.  In the
most important case of a panic we set console_verbose().  As for
the other less severe cases the correct data is output to the
console and /var/log/messages.

Successfully tested by me using the drivers/misc/lkdtm.c module.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: dzickus@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1327586134-11926-1-git-send-email-prarit@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26 21:28:45 +01:00
Jan Beulich fc395b9291 x86: Properly parenthesize cmpxchg() macro arguments
Quite oddly, all of the arguments passed through from the top
level macros to the second level which didn't need parentheses
had them, while the only expression (involving a parameter)
needing them didn't.

Very recently I got bitten by the lack thereof when using
something like "array + index" for the first operand, with
"array" being an array more narrow than int.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F2183A9020000780006F3E6@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-26 21:18:29 +01:00
Josef Bacik 9b23062840 Btrfs: advance window_start if we're using a bitmap
If we span a long area in a bitmap we could end up taking a lot of time
searching to the next free area if we're searching from the original
window_start, so advance window_start in order to make sure we don't do any
superficial searching.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:12 -05:00
David Sterba 0c4e538bcc btrfs: mask out gfp flags in releasepage
btree_releasepage is a callback and can be passed unknown gfp flags and then
they may end up in kmem_cache_alloc called from alloc_extent_state, slab
allocator will BUG_ON when there is HIGHMEM or DMA32 flag set.

This may happen when btrfs is mounted from a loop device, which masks out
__GFP_IO flag. The check in try_release_extent_state

3399                 if ((mask & GFP_NOFS) == GFP_NOFS)
3400                         mask = GFP_NOFS;

will not work and passes unfiltered flags further resulting in crash at
mm/slab.c:2963

 [<000000000024ae4c>] cache_alloc_refill+0x3b4/0x5c8
 [<000000000024c810>] kmem_cache_alloc+0x204/0x294
 [<00000000001fd3c2>] mempool_alloc+0x52/0x170
 [<000003c000ced0b0>] alloc_extent_state+0x40/0xd4 [btrfs]
 [<000003c000cee5ae>] __clear_extent_bit+0x38a/0x4cc [btrfs]
 [<000003c000cee78c>] try_release_extent_state+0x9c/0xd4 [btrfs]
 [<000003c000cc4c66>] btree_releasepage+0x7e/0xd0 [btrfs]
 [<0000000000210d84>] shrink_page_list+0x6a0/0x724
 [<0000000000211394>] shrink_inactive_list+0x230/0x578
 [<0000000000211bb8>] shrink_list+0x6c/0x120
 [<0000000000211e4e>] shrink_zone+0x1e2/0x228
 [<0000000000211f24>] shrink_zones+0x90/0x254
 [<0000000000213410>] do_try_to_free_pages+0xac/0x420
 [<0000000000213ae0>] try_to_free_pages+0x13c/0x1b0
 [<0000000000204e6c>] __alloc_pages_nodemask+0x5b4/0x9a8
 [<00000000001fb04a>] grab_cache_page_write_begin+0x7e/0xe8

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:12 -05:00
Miao Xie 9e622d6bea Btrfs: fix enospc error caused by wrong checks of the chunk
When we did sysbench test for inline files, enospc error happened easily though
there was lots of free disk space which could be allocated for new chunks.

Reproduce steps:
 # mkfs.btrfs -b $((2 * 1024 * 1024 * 1024)) <test partition>
 # mount <test partition> /mnt
 # ulimit -n 102400
 # cd /mnt
 # sysbench --num-threads=1 --test=fileio --file-num=81920 \
 > --file-total-size=80M --file-block-size=1K --file-io-mode=sync \
 > --file-test-mode=seqwr prepare
 # sysbench --num-threads=1 --test=fileio --file-num=81920 \
 > --file-total-size=80M --file-block-size=1K --file-io-mode=sync \
 > --file-test-mode=seqwr run
 <soon later, BUG_ON() was triggered by enospc error>

The reason of this bug is:
Now, we can reserve space which is larger than the free space in the chunks if
we have enough free disk space which can be used for new chunks. By this way,
the space allocator should allocate a new chunk by force if there is no free
space in the free space cache. But there are two wrong checks which break this
operation.

One is
	if (ret == -ENOSPC && num_bytes > min_alloc_size)
in btrfs_reserve_extent(), it is wrong, we should try to allocate a new chunk
even we fail to allocate free space by minimum allocable size.

The other is
	if (space_info->force_alloc)
		force = space_info->force_alloc;
in do_chunk_alloc(). It makes the allocator ignore CHUNK_ALLOC_FORCE If someone
sets ->force_alloc to CHUNK_ALLOC_LIMITED, and makes the enospc error happen.

Fix these two wrong checks. Especially the second one, we fix it by changing
the value of CHUNK_ALLOC_LIMITED and CHUNK_ALLOC_FORCE, and make
CHUNK_ALLOC_FORCE greater than CHUNK_ALLOC_LIMITED since CHUNK_ALLOC_FORCE has
higher priority. And if the value which is passed in by the caller is greater
than ->force_alloc, use the passed value.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:12 -05:00
Liu Bo 7ec31b548a Btrfs: do not defrag a file partially
xfstests 218 complains that btrfs defrags a file partially:
 After: 1
 Write backwards sync, but contiguous - should defrag to 1 extent
 Before: 10
-After: 1
+After: 2

To fix this, we need to set max_to_defrag count properly.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:12 -05:00
Stefan Behrens 0b485143d8 Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c
There have been 4 warnings on 32-bit build, they are herewith fixed.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:11 -05:00
Josef Bacik 0b4a9d248f Btrfs: use cluster->window_start when allocating from a cluster bitmap
We specifically set window_start in the cluster struct to indicate where the
cluster starts in a bitmap, but we've been using min_start to indicate where
we're searching from.  This is usually the start of the blockgroup, so
essentially means we're constantly searching from the start of any bitmap we
find, which completely negates all the trouble we go to in order to setup a
cluster.  So start using window_start to make sure we actually use the area we
found.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2012-01-26 15:01:11 -05:00