Commit Graph

2469 Commits

Author SHA1 Message Date
Tejun Heo c8615d3716 idr: deprecate idr_pre_get() and idr_get_new[_above]()
Now that all in-kernel users are converted to ues the new alloc
interface, mark the old interface deprecated.  We should be able to
remove these in a few releases.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:47 -07:00
Frederic Weisbecker f792685006 math64: New div64_u64_rem helper
Provide an extended version of div64_u64() that
also returns the remainder of the division.

We are going to need this to refine the cputime
scaling code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
2013-03-13 18:03:27 +01:00
Randy Dunlap 5857f70c8a idr: fix new kernel-doc warnings
Fix new kernel-doc warnings in idr:

  Warning(include/linux/idr.h:113): No description found for parameter 'idr'
  Warning(include/linux/idr.h:113): Excess function parameter 'idp' description in 'idr_find'
  Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'
  Warning(lib/idr.c:232): Excess function parameter 'id' description in 'sub_alloc'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 20:42:09 -07:00
Tejun Heo 2e1c9b2867 idr: remove WARN_ON_ONCE() on negative IDs
idr_find(), idr_remove() and idr_replace() used to silently ignore the
sign bit and perform lookup with the rest of the bits.  The weird behavior
has been changed such that negative IDs are treated as invalid.  As the
behavior change was subtle, WARN_ON_ONCE() was added in the hope of
determining who's calling idr functions with negative IDs so that they can
be examined for problems.

Up until now, all two reported cases are ID number coming directly from
userland and getting fed into idr_find() and the warnings seem to cause
more problems than being helpful.  Drop the WARN_ON_ONCE()s.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: <markus@trippelsdorf.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-08 15:05:34 -08:00
Linus Torvalds 8fd5e7a2d9 ImgTec Meta architecture changes for v3.9-rc1
This adds core architecture support for Imagination's Meta processor
 cores, followed by some later miscellaneous arch/metag cleanups and
 fixes which I kept separate to ease review:
 
  - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
  - A few fixes all over, particularly for symbol prefixes
  - A few privilege protection fixes
  - Several cleanups (setup.c includes, split out a lot of metag_ksyms.c)
  - Fix some missing exports
  - Convert hugetlb to use vm_unmapped_area()
  - Copy device tree to non-init memory
  - Provide dma_get_sgtable()
 
 Signed-off-by: James Hogan <james.hogan@imgtec.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMmVXAAoJEKHZs+irPybfivgP/inEXqJyfw59omQdjwvYcU/a
 /u0MJ3UKSNS3U+HknfaFCy/Nwk1dqPLjqqyVC1V6AbUPBXlaEwGcimlNRx2uRjdq
 Uh96upMLHsNuF/xiiR477g3RwY0egIJdM1R1bGi3mZ3vVrNQGF+wbni6f61xCWGz
 M/4rDglpQvE79oLhYdgj6tidZtHQT0YWtERA9W90zkQWXGYmpFPKBKbfZAi5+rKQ
 U6Gpg26orUugzXNaxltJEYKE8gjLTppEabx8DARnItZ4zCMy4dw5RBJ35RFvQw6e
 eSmfgTy9w9WqBMY2+QMSgU0KQt1IITCzX7OlOXC0jALQJXoU0WWbOELlBVQLCwF1
 T0OcR/5ZP/hIlOk5Dh+e9U3AtbASXdMtqA0ZUe78woH1CBf7Nc/0c0vRg23EdMh8
 lnHDJxT/UqskoOYLI4kgWbEdLDy4uTh19U2pVi7VCo7ksLB9Bj9Xc8VSKgscSXTl
 OwzN+c4Jgtu8FDFTp+Af4AT8pYGJ08j8L2ErsV2sOv3Q44U5WXdrMz3GSgwXj8+4
 wZk3HvdkQVkMD5sJCUZgAswaN6BnbB0pHdCz4wMQ8jR/Ogs015Ipk64Ecym9S/4n
 uES7PnDtt/4lb5EyX2ScbvdnZTAFTaaP7OOhC77BOQvbQjIW1tkAcxWJqRry86uS
 iM0BFgK6Ohx3geqa5Ft0
 =65cR
 -----END PGP SIGNATURE-----

Merge tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag

Pull new ImgTec Meta architecture from James Hogan:
 "This adds core architecture support for Imagination's Meta processor
  cores, followed by some later miscellaneous arch/metag cleanups and
  fixes which I kept separate to ease review:

   - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
   - A few fixes all over, particularly for symbol prefixes
   - A few privilege protection fixes
   - Several cleanups (setup.c includes, split out a lot of
     metag_ksyms.c)
   - Fix some missing exports
   - Convert hugetlb to use vm_unmapped_area()
   - Copy device tree to non-init memory
   - Provide dma_get_sgtable()"

* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
  metag: Provide dma_get_sgtable()
  metag: prom.h: remove declaration of metag_dt_memblock_reserve()
  metag: copy devicetree to non-init memory
  metag: cleanup metag_ksyms.c includes
  metag: move mm/init.c exports out of metag_ksyms.c
  metag: move usercopy.c exports out of metag_ksyms.c
  metag: move setup.c exports out of metag_ksyms.c
  metag: move kick.c exports out of metag_ksyms.c
  metag: move traps.c exports out of metag_ksyms.c
  metag: move irq enable out of irqflags.h on SMP
  genksyms: fix metag symbol prefix on crc symbols
  metag: hugetlb: convert to vm_unmapped_area()
  metag: export clear_page and copy_page
  metag: export metag_code_cache_flush_all
  metag: protect more non-MMU memory regions
  metag: make TXPRIVEXT bits explicit
  metag: kernel/setup.c: sort includes
  perf: Enable building perf tools for Meta
  metag: add boot time LNKGET/LNKSET check
  metag: add __init to metag_cache_probe()
  ...
2013-03-03 12:06:09 -08:00
James Hogan 79f83c0294 Kconfig.debug: add METAG to dependency lists
Add [!]METAG to a couple of Kconfig dependencies in lib/Kconfig.debug.
Don't allow stack utilization instrumentation on metag, and allow
building with frame pointers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Paul E. McKenney" <paul.mckenney@linaro.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
2013-03-02 20:09:53 +00:00
Linus Torvalds 3cfb07743a KGDB/KDB fixes and cleanups
Cleanups
    Remove kdb ssb command - there is no in kernel disassembler to support it
    Remove kdb ll command - Always caused a kernel oops and there were no
        bug reports so no one was using this command
    Use kernel ARRAY_SIZE macro instead of array computations
 
  Fixes
    Stop oops in kdb if user executes kdb_defcmd with args
    kdb help command truncated text
    ppc64 support for kgdbts
    Add missing kconfig option from original kdb port for dealing with
       catastrophic kernel crashes such that you can reboot automatically
       on continue from kdb
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMhJPAAoJEIciOldedpOj25gP+wZn/u3YxOGkhL3CIBOegEwW
 KFte7FFNeLNzw8tyEzBMVXok85PJ+/lvm+Xi+a4hhtV7/S+WvwGoWeIg0zCDtrAu
 im4m2X1u00Egcztcr4tCk8Lwc6vNm5KZsBsUsbKKi0K8twaWifkxLMEWeq+3WLzh
 +1d4TDkziwdZfcLpUHvohmCsCyg1EdjUxTYnWpSJvLrq5lhNvTQBowHWZH06GBcI
 MmhC+aHp3myk2/Vwd1AuDq+jIRbH3ORV50wfRONBR7OJ58sOoD9nV8Mr+fy8QLRN
 BPv8WxeRSm7X/Hl6nPm9PX7oNQSg+Ba1+5helIIJMdGGWhJbl9AZslFEU40Zn3yS
 V7FmS3mwRSA8NCgfZ+CO6311tirSCvtf34+5ttXEntyK1ujaplSPffBP4Ya26y6q
 TB2wLr91+3L0LhzrVRj20P13qexsWUW4t9inLzqtDApD3Zljl9Kuwd0febCiLg0r
 mgGkpZjK0VI7S7RfkxpmEyU4bFP2lCTzBOKAKOfIV6O88bfqFdv5QCTHODJArf22
 ugDGPX3tw++Lyc9otGxyPbxmc7npYGqTD1UEM8xK/7sEBiWJm71jjwHN1bwRwB3o
 EC5bcZ6TJgDJnnvYGBTQec74PnUiCu/UnQVFFabzhowQCJTIWSJkPxjnI2BKK1V5
 3exPqogclusAhhwopOtG
 =Hwbq
 -----END PGP SIGNATURE-----

Merge tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB fixes and cleanups from Jason Wessel:
 "For a change we removed more code than we added.  If people aren't
  using it we shouldn't be carrying it.  :-)

  Cleanups:
   - Remove kdb ssb command - there is no in kernel disassembler to
     support it

   - Remove kdb ll command - Always caused a kernel oops and there were
     no bug reports so no one was using this command

   - Use kernel ARRAY_SIZE macro instead of array computations

  Fixes:
   - Stop oops in kdb if user executes kdb_defcmd with args

   - kdb help command truncated text

   - ppc64 support for kgdbts

   - Add missing kconfig option from original kdb port for dealing with
     catastrophic kernel crashes such that you can reboot automatically
     on continue from kdb"

* tag 'for_linux-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  kdb: Remove unhandled ssb command
  kdb: Prevent kernel oops with kdb_defcmd
  kdb: Remove the ll command
  kdb_main: fix help print
  kdb: Fix overlap in buffers with strcpy
  Fixed dead ifdef block by adding missing Kconfig option.
  kdb: Setup basic kdb state before invoking commands via kgdb
  kdb: use ARRAY_SIZE where possible
  kgdb/kgdbts: support ppc64
  kdb: A fix for kdb command table expansion
2013-03-02 08:31:39 -08:00
Linus Torvalds e23b62256a Initial ARC Linux port with some fixes on top for 3.9-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRMZYbAAoJEGnX8d3iisJeFj8P/R1hWohDDUc8pG3+ov9Y2Brt
 g7oIVw1udlKIk3HhVwsyT14/UHunfcTCONKKKGmmbfRrLJSMMsSlXvYoAQLozokf
 TuaO3Xt5IfERROqTrCDSwdNaAmwIZGsIuI9jWKo4qsXovAL0nc3sR527qMI1o7OE
 9X5eIqOJ/rPvOjXYrPqXmvO/DZKYG8PNTH4PYqePes31CsAmDXWBIlgYmgvF3th3
 ptwKPgRU/c2wKNpDDJXVQg/bcg9NI2cCnndNrjgXZgyUQrC37ZTdt/IOF5w6FgIW
 6i6UbDKXn8MgQAhrXx0Ns/+0kSJZ7eBWmj8hLyrxUzOYlF4rCs/il6ofDRaMO6fv
 9LmbNZXYnGICzm1YAxZRK7dm13IbDnltmMc81vISBpJSMTBgqzLWobHnq5/67Wh4
 2oUkoc2Tfaw70FnRCewX0x4Qop2YXmXl1KBwdecvzdcKi6Yg+rRH08ur/0yyCyx7
 +vAQpPVIuVqCc916qwmCPFaf1UMNnmMStxNH7D1AQHvi1G372NxfXizdYyKFRY9N
 f5Q+6DTo1xh2AxuGieSZxBoeK0Rlp4DWTOBD4MMz29y7BRX7LK1U2iS+nW0g8uir
 3RdYeAqyCxlJtjJNQX9U8ZT54jUPZgvJWU0udesRN1CBdOSQMjM9OyZsLLRtLeX1
 ww2tc7zqhBUyjBej6Itg
 =NKkW
 -----END PGP SIGNATURE-----

Merge tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull new ARC architecture from Vineet Gupta:
 "Initial ARC Linux port with some fixes on top for 3.9-rc1:

  I would like to introduce the Linux port to ARC Processors (from
  Synopsys) for 3.9-rc1.  The patch-set has been discussed on the public
  lists since Nov and has received a fair bit of review, specially from
  Arnd, tglx, Al and other subsystem maintainers for DeviceTree, kgdb...

  The arch bits are in arch/arc, some asm-generic changes (acked by
  Arnd), a minor change to PARISC (acked by Helge).

  The series is a touch bigger for a new port for 2 main reasons:

   1. It enables a basic kernel in first sub-series and adds
      ptrace/kgdb/.. later

   2. Some of the fallout of review (DeviceTree support, multi-platform-
      image support) were added on top of orig series, primarily to
      record the revision history.

  This updated pull request additionally contains

   - fixes due to our GNU tools catching up with the new syscall/ptrace
     ABI

   - some (minor) cross-arch Kconfig updates."

* tag 'arc-v3.9-rc1-late' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (82 commits)
  ARC: split elf.h into uapi and export it for userspace
  ARC: Fixup the current ABI version
  ARC: gdbserver using regset interface possibly broken
  ARC: Kconfig cleanup tracking cross-arch Kconfig pruning in merge window
  ARC: make a copy of flat DT
  ARC: [plat-arcfpga] DT arc-uart bindings change: "baud" => "current-speed"
  ARC: Ensure CONFIG_VIRT_TO_BUS is not enabled
  ARC: Fix pt_orig_r8 access
  ARC: [3.9] Fallout of hlist iterator update
  ARC: 64bit RTSC timestamp hardware issue
  ARC: Don't fiddle with non-existent caches
  ARC: Add self to MAINTAINERS
  ARC: Provide a default serial.h for uart drivers needing BASE_BAUD
  ARC: [plat-arcfpga] defconfig for fully loaded ARC Linux
  ARC: [Review] Multi-platform image #8: platform registers SMP callbacks
  ARC: [Review] Multi-platform image #7: SMP common code to use callbacks
  ARC: [Review] Multi-platform image #6: cpu-to-dma-addr optional
  ARC: [Review] Multi-platform image #5: NR_IRQS defined by ARC core
  ARC: [Review] Multi-platform image #4: Isolate platform headers
  ARC: [Review] Multi-platform image #3: switch to board callback
  ...
2013-03-02 07:58:56 -08:00
Robert Obermeier 3b0eb71ec9 Fixed dead ifdef block by adding missing Kconfig option.
Added missing Kconfig option KDB_CONTINUE_CATASTROPHIC which lead to a dead
ifdef block in kernel/debug/kdb/kdb_main.c:73-75.

The code using KDB_CONTINUE_CATASTROPHIC was originally introduced in
commit '5d5314d6795f3c1c0f415348ff8c51f7de042b77' by Jason Wessel.
This patchset ("kdb: core for kgdb back end (1 of 2)")
added platform independent part of kdb to the linux kernel.

The Kernel option however, even though it had the same options and
behaviour on all supported architectures, was part of the x86 and
ia64 patchset of KDB and therefore not pulled into the mainline kernel tree.

I actually took the originally written Kconfig by
Keith Owens <kaos@sgi.com> (2003-06-20 according to KDB changelog)
and changed it to reflect the correct behaviour,
as the KDUMP patchset is not part of the kernel and the expected
functionality is missing from it.

Signed-off-by: Robert Obermeier <obbi89@googlemail.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2013-03-02 08:52:18 -06:00
Linus Torvalds b0af9cd9aa lzo-update-signature-20130226
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iD8DBQBRLQ9iTWFXqwsgQ8kRAvK/AJ9HUkhHJsukjw35XEQPkhfwNs/XPgCglXB1
 dF0wRbbL4d6pE9IloUsgYLg=
 =JSSN
 -----END PGP SIGNATURE-----

Merge tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux

Pull LZO compression update from Markus Oberhumer:
 "Summary:
  ========

  Update the Linux kernel LZO compression and decompression code to the
  current upstream version which features significant performance
  improvements on modern machines.

  Some *synthetic* benchmarks:
  ============================

    x86_64 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                     compression speed   decompression speed

    LZO-2005    :         150 MB/sec          468 MB/sec
    LZO-2012    :         434 MB/sec         1210 MB/sec

    i386 (Sandy Bridge), gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                     compression speed   decompression speed

    LZO-2005    :         143 MB/sec          409 MB/sec
    LZO-2012    :         372 MB/sec         1121 MB/sec

    armv7 (Cortex-A9), Linaro gcc-4.6 -O3, Silesia test corpus, 256 kB block-size:

                     compression speed   decompression speed

    LZO-2005    :          27 MB/sec           84 MB/sec
    LZO-2012    :          44 MB/sec          117 MB/sec
  **LZO-2013-UA :          47 MB/sec          167 MB/sec

  Legend:

    LZO-2005    : LZO version in current 3.8 kernel (which is based on
                     the LZO 2.02 release from 2005)
    LZO-2012    : updated LZO version available in linux-next
  **LZO-2013-UA : updated LZO version available in linux-next plus experimental
                     ARM Unaligned Access patch. This needs approval
                     from some ARM maintainer ist NOT YET INCLUDED."

Andrew Morton <akpm@linux-foundation.org> acks it and says:
 "There's a new LZ4 on the block which is even faster than the sped-up
  LZO, but various filesystems and things use LZO"

* tag 'lzo-update-signature-20130226' of git://github.com/markus-oberhumer/linux:
  crypto: testmgr - update LZO compression test vectors
  lib/lzo: Update LZO compression to current upstream version
  lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
2013-02-28 20:45:52 -08:00
Sasha Levin b67bfe0d42 hlist: drop the node parameter from iterators
I'm not sure why, but the hlist for each entry iterators were conceived

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small amount of places were using the 'node' parameter, this
 was modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:24 -08:00
Stefani Seibold dfe2a77fd2 kfifo: fix kfifo_alloc() and kfifo_init()
Fix kfifo_alloc() and kfifo_init() to alloc at least the requested number
of elements.  Since the kfifo operates on power of 2 the request size will
be rounded up to the next power of two.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:23 -08:00
Stefani Seibold c759b35e64 kfifo: move kfifo.c from kernel/ to lib/
Move kfifo.c from kernel/ to lib/

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:23 -08:00
Tejun Heo 7175c61cc6 idr: explain WARN_ON_ONCE() on negative IDs out-of-range ID
Until recently, when an negative ID is specified, idr functions used to
ignore the sign bit and proceeded with the operation with the rest of
bits, which is bizarre and error-prone.  The behavior recently got changed
so that negative IDs are treated as invalid but we're triggering
WARN_ON_ONCE() on negative IDs just in case somebody was depending on the
sign bit being ignored, so that those can be detected and fixed easily.

We only need this for a while.  Explain why WARN_ON_ONCE()s are there and
that they can be removed later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:21 -08:00
Tejun Heo 0ffc2a9c80 idr: implement lookup hint
While idr lookup isn't a particularly heavy operation, it still is too
substantial to use in hot paths without worrying about the performance
implications.  With recent changes, each idr_layer covers 256 slots
which should be enough to cover most use cases with single idr_layer
making lookup hint very attractive.

This patch adds idr->hint which points to the idr_layer which
allocated an ID most recently and the fast path lookup becomes

	if (look up target's prefix matches that of the hinted layer)
		return hint->ary[ID's offset in the leaf layer];

which can be inlined.

idr->hint is set to the leaf node on idr_fill_slot() and cleared from
free_layer().

[andriy.shevchenko@linux.intel.com: always do slow path when hint is uninitialized]
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:21 -08:00
Tejun Heo 54616283c2 idr: add idr_layer->prefix
Add a field which carries the prefix of ID the idr_layer covers.  This
will be used to implement lookup hint.

This patch doesn't make use of the new field and doesn't introduce any
behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
Tejun Heo 1d9b2e1e66 idr: remove length restriction from idr_layer->bitmap
Currently, idr->bitmap is declared as an unsigned long which restricts
the number of bits an idr_layer can contain.  All bitops can handle
arbitrary positive integer bit number and there's no reason for this
restriction.

Declare idr_layer->bitmap using DECLARE_BITMAP() instead of a single
unsigned long.

* idr_layer->bitmap is now an array.  '&' dropped from params to
  bitops.

* Replaced "== IDR_FULL" tests with bitmap_full() and removed
  IDR_FULL.

* Replaced find_next_bit() on ~bitmap with find_next_zero_bit().

* Replaced "bitmap = 0" with bitmap_clear().

This patch doesn't (or at least shouldn't) introduce any behavior
changes.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
Tejun Heo e8c8d1bc06 idr: remove MAX_IDR_MASK and move left MAX_IDR_* into idr.c
MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
They basically mask off the sign bit and operate on the rest, so if
the caller, by accident, passes in a negative number, the sign bit
will be masked off and the remaining part will be used as if that was
the input, which is worse than crashing.

The constant is visible in idr.h and there are several users in the
kernel.

* drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

  Basically used to test if adap->nr is a negative number which isn't
  -1 and returns -EINVAL if so.  idr_alloc() already has negative
  @start checking (w/ WARN_ON_ONCE), so this can go away.

* drivers/infiniband/core/cm.c:cm_alloc_id()
  drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

  Used to wrap cyclic @start.  Can be replaced with max(next, 0).
  Note that this type of cyclic allocation using idr is buggy.  These
  are prone to spurious -ENOSPC failure after the first wraparound.

* fs/super.c:get_anon_bdev()

  The ID allocated from ida is masked off before being tested whether
  it's inside valid range.  ida allocated ID can never be a negative
  number and the masking is unnecessary.

Update idr_*() functions to fail with -EINVAL when negative @id is
specified and update other MAX_IDR_MASK users as described above.

This leaves MAX_IDR_MASK without any user, remove it and relocate
other MAX_IDR_* constants to lib/idr.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Wolfram Sang <wolfram@the-dreams.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
Tejun Heo 326cf0f0f3 idr: fix top layer handling
Most functions in idr fail to deal with the high bits when the idr
tree grows to the maximum height.

* idr_get_empty_slot() stops growing idr tree once the depth reaches
  MAX_IDR_LEVEL - 1, which is one depth shallower than necessary to
  cover the whole range.  The function doesn't even notice that it
  didn't grow the tree enough and ends up allocating the wrong ID
  given sufficiently high @starting_id.

  For example, on 64 bit, if the starting id is 0x7fffff01,
  idr_get_empty_slot() will grow the tree 5 layer deep, which only
  covers the 30 bits and then proceed to allocate as if the bit 30
  wasn't specified.  It ends up allocating 0x3fffff01 without the bit
  30 but still returns 0x7fffff01.

* __idr_remove_all() will not remove anything if the tree is fully
  grown.

* idr_find() can't find anything if the tree is fully grown.

* idr_for_each() and idr_get_next() can't iterate anything if the tree
  is fully grown.

Fix it by introducing idr_max() which returns the maximum possible ID
given the depth of tree and replacing the id limit checks in all
affected places.

As the idr_layer pointer array pa[] needs to be 1 larger than the
maximum depth, enlarge pa[] arrays by one.

While this plugs the discovered issues, the whole code base is
horrible and in desparate need of rewrite.  It's fragile like hell,

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: <stable@vger.kernel.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:20 -08:00
Tejun Heo d5c7409f79 idr: implement idr_preload[_end]() and idr_alloc()
The current idr interface is very cumbersome.

* For all allocations, two function calls - idr_pre_get() and
  idr_get_new*() - should be made.

* idr_pre_get() doesn't guarantee that the following idr_get_new*()
  will not fail from memory shortage.  If idr_get_new*() returns
  -EAGAIN, the caller is expected to retry pre_get and allocation.

* idr_get_new*() can't enforce upper limit.  Upper limit can only be
  enforced by allocating and then freeing if above limit.

* idr_layer buffer is unnecessarily per-idr.  Each idr ends up keeping
  around MAX_IDR_FREE idr_layers.  The memory consumed per idr is
  under two pages but it makes it difficult to make idr_layer larger.

This patch implements the following new set of allocation functions.

* idr_preload[_end]() - Similar to radix preload but doesn't fail.
  The first idr_alloc() inside preload section can be treated as if it
  were called with @gfp_mask used for idr_preload().

* idr_alloc() - Allocate an ID w/ lower and upper limits.  Takes
  @gfp_flags and can be used w/o preloading.  When used inside
  preloaded section, the allocation mask of preloading can be assumed.

If idr_alloc() can be called from a context which allows sufficiently
relaxed @gfp_mask, it can be used by itself.  If, for example,
idr_alloc() is called inside spinlock protected region, preloading can
be used like the following.

	idr_preload(GFP_KERNEL);
	spin_lock(lock);

	id = idr_alloc(idr, ptr, start, end, GFP_NOWAIT);

	spin_unlock(lock);
	idr_preload_end();
	if (id < 0)
		error;

which is much simpler and less error-prone than idr_pre_get and
idr_get_new*() loop.

The new interface uses per-pcu idr_layer buffer and thus the number of
idr's in the system doesn't affect the amount of memory used for
preloading.

idr_layer_alloc() is introduced to handle idr_layer allocations for
both old and new ID allocation paths.  This is a bit hairy now but the
new interface is expected to replace the old and the internal
implementation eventually will become simpler.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
Tejun Heo 3594eb2894 idr: refactor idr_get_new_above()
Move slot filling to idr_fill_slot() from idr_get_new_above_int() and
make idr_get_new_above() directly call it.  idr_get_new_above_int() is
no longer needed and removed.

This will be used to implement a new ID allocation interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
Tejun Heo 12d1b4393e idr: remove _idr_rc_to_errno() hack
idr uses -1, IDR_NEED_TO_GROW and IDR_NOMORE_SPACE to communicate
exception conditions internally.  The return value is later translated
to errno values using _idr_rc_to_errno().

This is confusing.  Drop the custom ones and consistently use -EAGAIN
for "tree needs to grow", -ENOMEM for "need more memory" and -ENOSPC for
"ran out of ID space".

Due to the weird memory preloading mechanism, [ra]_get_new*() return
-EAGAIN on memory shortage, so we need to substitute -ENOMEM w/
-EAGAIN on those interface functions.  They'll eventually be cleaned
up and the translations will go away.

This patch doesn't introduce any functional changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
Tejun Heo 49038ef4fb idr: relocate idr_for_each_entry() and reorganize id[r|a]_get_new()
* Move idr_for_each_entry() definition next to other idr related
  definitions.

* Make id[r|a]_get_new() inline wrappers of id[r|a]_get_new_above().

This changes the implementation of idr_get_new() but the new
implementation is trivial.  This patch doesn't introduce any
functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
Tejun Heo fe6e24ec90 idr: deprecate idr_remove_all()
There was only one legitimate use of idr_remove_all() and a lot more of
incorrect uses (or lack of it).  Now that idr_destroy() implies
idr_remove_all() and all the in-kernel users updated not to use it,
there's no reason to keep it around.  Mark it deprecated so that we can
later unexport it.

idr_remove_all() is made an inline function calling __idr_remove_all()
to avoid triggering deprecated warning on EXPORT_SYMBOL().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:14 -08:00
Tejun Heo 9bb26bc1ff idr: make idr_destroy() imply idr_remove_all()
idr is silly in quite a few ways, one of which is how it's supposed to
be destroyed - idr_destroy() doesn't release IDs and doesn't even whine
if the idr isn't empty.  If the caller forgets idr_remove_all(), it
simply leaks memory.

Even ida gets this wrong and leaks memory on destruction.  There is
absoltely no reason not to call idr_remove_all() from idr_destroy().
Nobody is abusing idr_destroy() for shrinking free layer buffer and
continues to use idr after idr_destroy(), so it's safe to do remove_all
from destroy.

In the whole kernel, there is only one place where idr_remove_all() is
legitimiately used without following idr_destroy() while there are quite
a few places where the caller forgets either idr_remove_all() or
idr_destroy() leaking memory.

This patch makes idr_destroy() call idr_destroy_all() and updates the
function description accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:13 -08:00
Tejun Heo 6cdae7416a idr: fix a subtle bug in idr_get_next()
The iteration logic of idr_get_next() is borrowed mostly verbatim from
idr_for_each().  It walks down the tree looking for the slot matching
the current ID.  If the matching slot is not found, the ID is
incremented by the distance of single slot at the given level and
repeats.

The implementation assumes that during the whole iteration id is aligned
to the layer boundaries of the level closest to the leaf, which is true
for all iterations starting from zero or an existing element and thus is
fine for idr_for_each().

However, idr_get_next() may be given any point and if the starting id
hits in the middle of a non-existent layer, increment to the next layer
will end up skipping the same offset into it.  For example, an IDR with
IDs filled between [64, 127] would look like the following.

          [  0  64 ... ]
       /----/   |
       |        |
      NULL    [ 64 ... 127 ]

If idr_get_next() is called with 63 as the starting point, it will try
to follow down the pointer from 0.  As it is NULL, it will then try to
proceed to the next slot in the same level by adding the slot distance
at that level which is 64 - making the next try 127.  It goes around the
loop and finds and returns 127 skipping [64, 126].

Note that this bug also triggers in idr_for_each_entry() loop which
deletes during iteration as deletions can make layers go away leaving
the iteration with unaligned ID into missing layers.

Fix it by ensuring proceeding to the next slot doesn't carry over the
unaligned offset - ie.  use round_up(id + 1, slot_distance) instead of
id += slot_distance.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Teigland <teigland@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:12 -08:00
Imre Deak 4225fc8555 lib/scatterlist: use page iterator in the mapping iterator
For better code reuse use the newly added page iterator to iterate
through the pages.  The offset, length within the page is still
calculated by the mapping iterator as well as the actual mapping.  Idea
from Tejun Heo.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
Imre Deak a321e91b6d lib/scatterlist: add simple page iterator
Add an iterator to walk through a scatter list a page at a time starting
at a specific page offset.  As opposed to the mapping iterator this is
meant to be small, performing well even in simple loops like collecting
all pages on the scatterlist into an array or setting up an iommu table
based on the pages' DMA address.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:10 -08:00
Jingoo Han 9ed8a30f34 lib/devres.c: fix misplaced #endif
A misplaced #endif causes link errors related to pcim_*() functions.

This is because pcim_*() functions are related to CONFIG_PCI option,
however these are not related to CONFIG_HAS_IOPORT option.  Therefore,
when CONFIG_PCI is enabled and CONFIG_HAS_IOPORT is not enabled, it makes
link errors related to pcim_*() functions as below:

drivers/ata/libata-sff.c:3233: undefined reference to `pcim_iomap_regions'
drivers/ata/libata-sff.c:3238: undefined reference to `pcim_iomap_table'
drivers/built-in.o: In function `ata_pci_sff_init_host':
drivers/ata/libata-sff.c:2318: undefined reference to `pcim_iomap_regions'
drivers/ata/libata-sff.c:2329: undefined reference to `pcim_iomap_table

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Cc: Greg KH <greg@kroah.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-27 19:10:09 -08:00
Linus Torvalds 9043a2650c The sweeping change is to make add_taint() explicitly indicate whether to disable
lockdep, but it's a mechanical change.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRJAcuAAoJENkgDmzRrbjxsw0P/3eXb+LddYnx0V0uHYdKpCUf
 4vdW7X0fX3Z+aUK69IWRL/6ahoO4TpaHYGHBDjEoivyQ0GDq14X7JNWsYYt3LdMf
 3wmDgRc2cn/mZOJbFeVpNV8ox5l/xc0CUvV+iQ8tMjfQItXMXgWUFZKMECsXKSO6
 eex3lrw9M2jAX2uL8LQPp9W8xtKu24nSZRC6tH5riE/8fCzi1cZPPAqfxP5c8Lee
 ZXtbCRSyAFENZLpKyMe1PC7HvtJyi5NDn9xwOQiXULZV/VOlvP94DGBLIKCM/6dn
 4QvZxpG0P0uOlpCgRAVLyh/z7g4XY4VF/fHopLCmEcqLsvgD+V2LQpQ9zWUalLPC
 Z+pUpz2vu0gIddPU1nR8R6oGpEdJ8O12aJle62p/RSXWZGx12qUQ+Tamu0tgKcv1
 AsiJfbUGNDYfxgU6sHsoQjl2f68LTVckCU1C1LqEbW/S104EIORtGx30CHM4LRiO
 32kDC5TtgYDBKQAIqJ4bL48ZMh+9W3uX40p7xzOI5khHQjvswUKa3jcxupU0C1uv
 lx8KXo7pn8WT33QGysWC782wJCgJuzSc2vRn+KQoqoynuHGM6agaEtR59gil3QWO
 rQEcxH63BBRDgHlg4FM9IkJwwsnC3PWKL8gbX0uAWXAPMbgapJkuuGZAwt0WDGVK
 +GszxsFkCjlW0mK0egTb
 =tiSY
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "The sweeping change is to make add_taint() explicitly indicate whether
  to disable lockdep, but it's a mechanical change."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Add option to not sign modules during modules_install
  MODSIGN: Add -s <signature> option to sign-file
  MODSIGN: Specify the hash algorithm on sign-file command line
  MODSIGN: Simplify Makefile with a Kconfig helper
  module: clean up load_module a little more.
  modpost: Ignore ARC specific non-alloc sections
  module: constify within_module_*
  taint: add explicit flag to show whether lock dep is still OK.
  module: printk message when module signature fail taints kernel.
2013-02-25 15:41:43 -08:00
Linus Torvalds 3b5d8510b9 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
 "The biggest change is the rwsem lock-steal improvements, both to the
  assembly optimized and the spinlock based variants.

  The other notable change is the clean up of the seqlock implementation
  to be based on the seqcount infrastructure.

  The rest is assorted smaller debuggability, cleanup and continued -rt
  locking changes."

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rwsem-spinlock: Implement writer lock-stealing for better scalability
  futex: Revert "futex: Mark get_robust_list as deprecated"
  generic: Use raw local irq variant for generic cmpxchg
  lockdep: Selftest: convert spinlock to raw spinlock
  seqlock: Use seqcount infrastructure
  seqlock: Remove unused functions
  ntp: Make ntp_lock raw
  intel_idle: Convert i7300_idle_lock to raw_spinlock
  locking: Various static lock initializer fixes
  lockdep: Print more info when MAX_LOCK_DEPTH is exceeded
  rwsem: Implement writer lock-stealing for better scalability
  lockdep: Silence warning if CONFIG_LOCKDEP isn't set
  watchdog: Use local_clock for get_timestamp()
  lockdep: Rename print_unlock_inbalance_bug() to print_unlock_imbalance_bug()
  locking/stat: Fix a typo
2013-02-22 19:25:09 -08:00
Linus Torvalds 2ef14f465b Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm changes from Peter Anvin:
 "This is a huge set of several partly interrelated (and concurrently
  developed) changes, which is why the branch history is messier than
  one would like.

  The *really* big items are two humonguous patchsets mostly developed
  by Yinghai Lu at my request, which completely revamps the way we
  create initial page tables.  In particular, rather than estimating how
  much memory we will need for page tables and then build them into that
  memory -- a calculation that has shown to be incredibly fragile -- we
  now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
  a #PF handler which creates temporary page tables on demand.

  This has several advantages:

  1. It makes it much easier to support things that need access to data
     very early (a followon patchset uses this to load microcode way
     early in the kernel startup).

  2. It allows the kernel and all the kernel data objects to be invoked
     from above the 4 GB limit.  This allows kdump to work on very large
     systems.

  3. It greatly reduces the difference between Xen and native (Xen's
     equivalent of the #PF handler are the temporary page tables created
     by the domain builder), eliminating a bunch of fragile hooks.

  The patch series also gets us a bit closer to W^X.

  Additional work in this pull is the 64-bit get_user() work which you
  were also involved with, and a bunch of cleanups/speedups to
  __phys_addr()/__pa()."

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (105 commits)
  x86, mm: Move reserving low memory later in initialization
  x86, doc: Clarify the use of asm("%edx") in uaccess.h
  x86, mm: Redesign get_user with a __builtin_choose_expr hack
  x86: Be consistent with data size in getuser.S
  x86, mm: Use a bitfield to mask nuisance get_user() warnings
  x86/kvm: Fix compile warning in kvm_register_steal_time()
  x86-32: Add support for 64bit get_user()
  x86-32, mm: Remove reference to alloc_remap()
  x86-32, mm: Remove reference to resume_map_numa_kva()
  x86-32, mm: Rip out x86_32 NUMA remapping code
  x86/numa: Use __pa_nodebug() instead
  x86: Don't panic if can not alloc buffer for swiotlb
  mm: Add alloc_bootmem_low_pages_nopanic()
  x86, 64bit, mm: hibernate use generic mapping_init
  x86, 64bit, mm: Mark data/bss/brk to nx
  x86: Merge early kernel reserve for 32bit and 64bit
  x86: Add Crash kernel low reservation
  x86, kdump: Remove crashkernel range find limit for 64bit
  memblock: Add memblock_mem_size()
  x86, boot: Not need to check setup_header version for setup_data
  ...
2013-02-21 18:06:55 -08:00
Linus Torvalds 7c2db36e73 Merge branch 'akpm' (incoming from Andrew)
Merge misc patches from Andrew Morton:

 - Florian has vanished so I appear to have become fbdev maintainer
   again :(

 - Joel and Mark are distracted to welcome to the new OCFS2 maintainer

 - The backlight queue

 - Small core kernel changes

 - lib/ updates

 - The rtc queue

 - Various random bits

* akpm: (164 commits)
  rtc: rtc-davinci: use devm_*() functions
  rtc: rtc-max8997: use devm_request_threaded_irq()
  rtc: rtc-max8907: use devm_request_threaded_irq()
  rtc: rtc-da9052: use devm_request_threaded_irq()
  rtc: rtc-wm831x: use devm_request_threaded_irq()
  rtc: rtc-tps80031: use devm_request_threaded_irq()
  rtc: rtc-lp8788: use devm_request_threaded_irq()
  rtc: rtc-coh901331: use devm_clk_get()
  rtc: rtc-vt8500: use devm_*() functions
  rtc: rtc-tps6586x: use devm_request_threaded_irq()
  rtc: rtc-imxdi: use devm_clk_get()
  rtc: rtc-cmos: use dev_warn()/dev_dbg() instead of printk()/pr_debug()
  rtc: rtc-pcf8583: use dev_warn() instead of printk()
  rtc: rtc-sun4v: use pr_warn() instead of printk()
  rtc: rtc-vr41xx: use dev_info() instead of printk()
  rtc: rtc-rs5c313: use pr_err() instead of printk()
  rtc: rtc-at91rm9200: use dev_dbg()/dev_err() instead of printk()/pr_debug()
  rtc: rtc-rs5c372: use dev_dbg()/dev_warn() instead of printk()/pr_debug()
  rtc: rtc-ds2404: use dev_err() instead of printk()
  rtc: rtc-efi: use dev_err()/dev_warn()/pr_err() instead of printk()
  ...
2013-02-21 17:38:49 -08:00
Florian Fainelli 5dc49c75a2 decompressors: make the default XZ_DEC_* config match the selected architecture
Change the defautl XZ_DEC_* config symbol to match the configured
architecture.  It is perfectly legitimate to support multiple XZ BCJ
filters for different architectures (e.g.: to mount foreign squashfs/xz
compressed filesystems), it is however more natural not to select them all
by default, but only the one matching the configured architecture.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:26 -08:00
Florian Fainelli 64dbfb444c decompressors: drop dependency on CONFIG_EXPERT
Remove the XZ_DEC_* depedencey on CONFIG_EXPERT as recommended by Lasse
Colin.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:26 -08:00
Florian Fainelli 9d74962965 decompressors: group XZ_DEC_* symbols under an if XZ_BCJ / endif
Group all architecture-specific BCJ filter configuration symbols under an
if XZ_BCJ / endif statement.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Acked-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:26 -08:00
Namjae Jeon 53769627b9 lib/parser.c: fix up comments for valid return values from match_number
match_number() has return values of -ENOMEM, -EINVAL and -ERANGE.  So, for
all the functions calling match_number, the return value should include
these values.  Fix up the comments to reflect the correct values.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:25 -08:00
Stepan Moskovchenko 7d7992108d lib/vsprintf.c: add %pa format specifier for phys_addr_t types
Add the %pa format specifier for printing a phys_addr_t type and its
derivative types (such as resource_size_t), since the physical address
size on some platforms can vary based on build options, regardless of
the native integer type.

Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Cc: Rob Landley <rob@landley.net>
Cc: George Spelvin <linux@horizon.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:20 -08:00
Kyle McMartin 76e8402619 lib/Kconfig.debug: unhide CONFIG_PANIC_ON_OOPS
CONFIG_EXPERT doesn't really make sense, and hides it unintentionally.
Remove superfluous "default n" pointed out by Ingo as well.

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-21 17:22:20 -08:00
Linus Torvalds 21eaab6d19 tty/serial patches for 3.9-rc1
Here's the big tty/serial driver patches for 3.9-rc1.
 
 More tty port rework and fixes from Jiri here, as well as lots of
 individual serial driver updates and fixes.
 
 All of these have been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmZYQACgkQMUfUDdst+ylJDgCg0B0nMevUUdM4hLvxunbbiyXM
 HUEAoIOedqriNNPvX4Bwy0hjeOEaWx0g
 =vi6x
 -----END PGP SIGNATURE-----

Merge tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial patches from Greg Kroah-Hartman:
 "Here's the big tty/serial driver patches for 3.9-rc1.

  More tty port rework and fixes from Jiri here, as well as lots of
  individual serial driver updates and fixes.

  All of these have been in the linux-next tree for a while."

* tag 'tty-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (140 commits)
  tty: mxser: improve error handling in mxser_probe() and mxser_module_init()
  serial: imx: fix uninitialized variable warning
  serial: tegra: assume CONFIG_OF
  TTY: do not update atime/mtime on read/write
  lguest: select CONFIG_TTY to build properly.
  ARM defconfigs: add missing inclusions of linux/platform_device.h
  fb/exynos: include platform_device.h
  ARM: sa1100/assabet: include platform_device.h directly
  serial: imx: Fix recursive locking bug
  pps: Fix build breakage from decoupling pps from tty
  tty: Remove ancient hardpps()
  pps: Additional cleanups in uart_handle_dcd_change
  pps: Move timestamp read into PPS code proper
  pps: Don't crash the machine when exiting will do
  pps: Fix a use-after free bug when unregistering a source.
  pps: Use pps_lookup_dev to reduce ldisc coupling
  pps: Add pps_lookup_dev() function
  tty: serial: uartlite: Support uartlite on big and little endian systems
  tty: serial: uartlite: Fix sparse and checkpatch warnings
  serial/arc-uart: Miscll DT related updates (Grant's review comments)
  ...

Fix up trivial conflicts, mostly just due to the TTY config option
clashing with the EXPERIMENTAL removal.
2013-02-21 13:41:04 -08:00
Linus Torvalds 06991c28f3 Driver core patches for 3.9-rc1
Here is the big driver core merge for 3.9-rc1
 
 There are two major series here, both of which touch lots of drivers all
 over the kernel, and will cause you some merge conflicts:
   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.
   - remove CONFIG_EXPERIMENTAL
 
 If you need me to provide a merged tree to handle these resolutions,
 please let me know.
 
 Other than those patches, there's not much here, some minor fixes and
 updates.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlEmV0cACgkQMUfUDdst+yncCQCfbmnQZju7kzWXk6PjdFuKspT9
 weAAoMCzcAtEzzc4LXuUxxG/sXBVBCjW
 =yWAQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg Kroah-Hartman:
 "Here is the big driver core merge for 3.9-rc1

  There are two major series here, both of which touch lots of drivers
  all over the kernel, and will cause you some merge conflicts:

   - add a new function called devm_ioremap_resource() to properly be
     able to check return values.

   - remove CONFIG_EXPERIMENTAL

  Other than those patches, there's not much here, some minor fixes and
  updates"

Fix up trivial conflicts

* tag 'driver-core-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (221 commits)
  base: memory: fix soft/hard_offline_page permissions
  drivercore: Fix ordering between deferred_probe and exiting initcalls
  backlight: fix class_find_device() arguments
  TTY: mark tty_get_device call with the proper const values
  driver-core: constify data for class_find_device()
  firmware: Ignore abort check when no user-helper is used
  firmware: Reduce ifdef CONFIG_FW_LOADER_USER_HELPER
  firmware: Make user-mode helper optional
  firmware: Refactoring for splitting user-mode helper code
  Driver core: treat unregistered bus_types as having no devices
  watchdog: Convert to devm_ioremap_resource()
  thermal: Convert to devm_ioremap_resource()
  spi: Convert to devm_ioremap_resource()
  power: Convert to devm_ioremap_resource()
  mtd: Convert to devm_ioremap_resource()
  mmc: Convert to devm_ioremap_resource()
  mfd: Convert to devm_ioremap_resource()
  media: Convert to devm_ioremap_resource()
  iommu: Convert to devm_ioremap_resource()
  drm: Convert to devm_ioremap_resource()
  ...
2013-02-21 12:05:51 -08:00
Linus Torvalds 33673dcb37 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "This is basically a maintenance update for the TPM driver and EVM/IMA"

Fix up conflicts in lib/digsig.c and security/integrity/ima/ima_main.c

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (45 commits)
  tpm/ibmvtpm: build only when IBM pseries is configured
  ima: digital signature verification using asymmetric keys
  ima: rename hash calculation functions
  ima: use new crypto_shash API instead of old crypto_hash
  ima: add policy support for file system uuid
  evm: add file system uuid to EVM hmac
  tpm_tis: check pnp_acpi_device return code
  char/tpm/tpm_i2c_stm_st33: drop temporary variable for return value
  char/tpm/tpm_i2c_stm_st33: remove dead assignment in tpm_st33_i2c_probe
  char/tpm/tpm_i2c_stm_st33: Remove __devexit attribute
  char/tpm/tpm_i2c_stm_st33: Don't use memcpy for one byte assignment
  tpm_i2c_stm_st33: removed unused variables/code
  TPM: Wait for TPM_ACCESS tpmRegValidSts to go high at startup
  tpm: Fix cancellation of TPM commands (interrupt mode)
  tpm: Fix cancellation of TPM commands (polling mode)
  tpm: Store TPM vendor ID
  TPM: Work around buggy TPMs that block during continue self test
  tpm_i2c_stm_st33: fix oops when i2c client is unavailable
  char/tpm: Use struct dev_pm_ops for power management
  TPM: STMicroelectronics ST33 I2C BUILD STUFF
  ...
2013-02-21 08:18:12 -08:00
Markus F.X.J. Oberhumer 8b975bd3f9 lib/lzo: Update LZO compression to current upstream version
This commit updates the kernel LZO code to the current upsteam version
which features a significant speed improvement - benchmarking the Calgary
and Silesia test corpora typically shows a doubled performance in
both compression and decompression on modern i386/x86_64/powerpc machines.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
2013-02-20 19:36:01 +01:00
Markus F.X.J. Oberhumer b6bec26cea lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c
Rename the source file to match the function name and thereby
also make room for a possible future even slightly faster
"non-safe" decompressor version.

Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
2013-02-20 19:36:00 +01:00
Linus Torvalds e84cf5d0fd Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:
 "SRCU changes:

   - These include debugging aids, updates that move towards the goal of
     permitting srcu_read_lock() and srcu_read_unlock() to be used from
     idle and offline CPUs, and a few small fixes.

  Changes to rcutorture and to RCU documentation:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/188

  Enhancements to uniprocessor handling in tiny RCU:

   - Posted to LKML at https://lkml.org/lkml/2013/1/27/2

  Tag RCU callbacks with grace-period number to simplify callback
  advancement:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/203

  Miscellaneous fixes:

   - Posted to LKML at https://lkml.org/lkml/2013/1/26/204"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
  srcu: Update synchronize_srcu_expedited()'s comments
  srcu: Update synchronize_srcu()'s comments
  srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
  srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
  srcu: Simple cleanup for cleanup_srcu_struct()
  srcu: Add might_sleep() annotation to synchronize_srcu()
  srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
  rcu: Allow rcutorture to be built at low optimization levels
  rcu: Make rcutorture's shuffler task shuffle recently added tasks
  rcu: Allow TREE_PREEMPT_RCU on UP systems
  rcu: Provide RCU CPU stall warnings for tiny RCU
  context_tracking: Add comments on interface and internals
  rcu: Remove obsolete Kconfig option from comment
  rcu: Remove unused code originally used for context tracking
  rcu: Consolidate debugging Kconfig options
  rcu: Correct 'optimized' to 'optimize' in header comment
  rcu: Trace callback acceleration
  rcu: Tag callback lists with corresponding grace-period number
  rcutorture: Don't compare ptr with 0
  ...
2013-02-19 17:45:20 -08:00
Yuanhan Liu 41ef8f8266 rwsem-spinlock: Implement writer lock-stealing for better scalability
We (Linux Kernel Performance project) found a regression
introduced by commit:

  5a505085f0 mm/rmap: Convert the struct anon_vma::mutex to an rwsem

which converted all anon_vma::mutex locks rwsem write locks.

The semantics are the same, but the behavioral difference is
quite huge in some cases. After investigating it we found the
root cause: mutexes support lock stealing while rwsems don't.

Here is the link for the detailed regression report:

  https://lkml.org/lkml/2013/1/29/84

Ingo suggested adding write lock stealing to rwsems:

    "I think we should allow lock-steal between rwsem writers - that
     will not hurt fairness as most rwsem fairness concerns relate to
     reader vs. writer fairness"

And here is the rwsem-spinlock version.

With this patch, we got a double performance increase in one
test box with following aim7 workfile:

    FILESIZE: 1M
    POOLSIZE: 10M
    10 fork_test

 /usr/bin/time output w/o patch                       /usr/bin/time_output with patch
 -- Percent of CPU this job got: 369%                 Percent of CPU this job got: 537%
 Voluntary context switches: 640595016                Voluntary context switches: 157915561

We got a 45% increase in CPU usage and saved about 3/4 voluntary context switches.

Reported-by: LKP project <lkp@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: Alex Shi <alex.shi@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/1359716356-23865-1-git-send-email-yuanhan.liu@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:43:39 +01:00
Yong Zhang 9fb1b90ce0 lockdep: Selftest: convert spinlock to raw spinlock
To make the lockdep selftest working on RT we need to convert the
spinlock tests to a raw spinlock. Otherwise we cannot run the irq
context checks. For mainline this is just annotational as spinlocks
are mapped to raw_spinlocks anyway.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Link: http://lkml.kernel.org/r/1334559716-18447-2-git-send-email-yong.zhang0@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-02-19 08:43:35 +01:00
Alex Shi ce6711f3d1 rwsem: Implement writer lock-stealing for better scalability
Commit 5a505085f0 ("mm/rmap: Convert the struct anon_vma::mutex
to an rwsem") changed struct anon_vma::mutex to an rwsem, which
caused aim7 fork_test performance to drop by 50%.

Yuanhan Liu did the following excellent analysis:

    https://lkml.org/lkml/2013/1/29/84

and found that the regression is caused by strict, serialized,
FIFO sequential write-ownership of rwsems. Ingo suggested
implementing opportunistic lock-stealing for the front writer
task in the waitqueue.

Yuanhan Liu implemented lock-stealing for spinlock-rwsems,
which indeed recovered much of the regression - confirming
the analysis that the main factor in the regression was the
FIFO writer-fairness of rwsems.

In this patch we allow lock-stealing to happen when the first
waiter is also writer. With that change in place the
aim7 fork_test performance is fully recovered on my
Intel NHM EP, NHM EX, SNB EP 2S and 4S test-machines.

Reported-by: lkp@linux.intel.com
Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Anton Blanchard <anton@samba.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: paul.gortmaker@windriver.com
Link: https://lkml.org/lkml/2013/1/29/84
Link: http://lkml.kernel.org/r/1360069915-31619-1-git-send-email-alex.shi@intel.com
[ Small stylistic fixes, updated changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-19 08:42:43 +01:00
Vineet Gupta 64e69073c3 asm-generic headers: Allow yet more arch overrides in checksum.h
arches can have more efficient implementation of these routines

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2013-02-11 20:00:33 +05:30
Ingo Molnar 9228b5f243 Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:

1.	Changes to rcutorture and to RCU documentation. Posted to LKML at
        https://lkml.org/lkml/2013/1/26/188.

2.	Enhancements to uniprocessor handling in tiny RCU. Posted to LKML
        at https://lkml.org/lkml/2013/1/27/2.

3.	Tag RCU callbacks with grace-period number to simplify callback
        advancement. Posted to LKML at https://lkml.org/lkml/2013/1/26/203.

4.	Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/1/26/204.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-02-04 19:06:34 +01:00
Andy Shevchenko 0d2a1b2d03 mpilib: use DIV_ROUND_UP and remove unused macros
Remove MIN, MAX and ABS macros that are duplicates kernel's native
implementation.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-01 16:28:32 +11:00
Dmitry Kasatkin 26d438457e digsig: remove unnecessary memory allocation and copying
In existing use case, copying of the decoded data is unnecessary in
pkcs_1_v1_5_decode_emsa. It is just enough to get pointer to the message.
Removing copying and extra buffer allocation.

Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-01 16:28:24 +11:00
YOSHIFUJI Hideaki 7810cc1e77 digsig: Fix memory leakage in digsig_verify_rsa()
digsig_verify_rsa() does not free kmalloc'ed buffer returned by
mpi_get_buffer().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2013-02-01 15:59:33 +11:00
Yinghai Lu ac2cbab21f x86: Don't panic if can not alloc buffer for swiotlb
Normal boot path on system with iommu support:
swiotlb buffer will be allocated early at first and then try to initialize
iommu, if iommu for intel or AMD could setup properly, swiotlb buffer
will be freed.

The early allocating is with bootmem, and could panic when we try to use
kdump with buffer above 4G only, or with memmap to limit mem under 4G.
for example: memmap=4095M$1M to remove memory under 4G.

According to Eric, add _nopanic version and no_iotlb_memory to fail
map single later if swiotlb is still needed.

-v2: don't pass nopanic, and use -ENOMEM return value according to Eric.
     panic early instead of using swiotlb_full to panic...according to Eric/Konrad.
-v3: make swiotlb_init to be notpanic, but will affect:
     arm64, ia64, powerpc, tile, unicore32, x86.
-v4: cleanup swiotlb_init by removing swiotlb_init_with_default_size.

Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-36-git-send-email-yinghai@kernel.org
Reviewed-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Cc: linux-mips@linux-mips.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-01-29 19:36:53 -08:00
Paul E. McKenney 40393f525f Merge branches 'doctorture.2013.01.29a', 'fixes.2013.01.26a', 'tagcb.2013.01.24a' and 'tiny.2013.01.29b' into HEAD
doctorture.2013.01.11a: Changes to rcutorture and to RCU documentation.

fixes.2013.01.26a: Miscellaneous fixes.

tagcb.2013.01.24a: Tag RCU callbacks with grace-period number to
	simplify callback advancement.

tiny.2013.01.29b: Enhancements to uniprocessor handling in tiny RCU.
2013-01-28 22:25:21 -08:00
Paul E. McKenney 6bfc09e232 rcu: Provide RCU CPU stall warnings for tiny RCU
Tiny RCU has historically omitted RCU CPU stall warnings in order to
reduce memory requirements, however, lack of these warnings caused
Thomas Gleixner some debugging pain recently.  Therefore, this commit
adds RCU CPU stall warnings to tiny RCU if RCU_TRACE=y.  This keeps
the memory footprint small, while still enabling CPU stall warnings
in kernels built to enable them.

Updated to include Josh Triplett's suggested use of RCU_STALL_COMMON
config variable to simplify #if expressions.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-01-28 22:06:21 -08:00
Dave Hansen 2f03e3ca74 rcu: Consolidate debugging Kconfig options
The RCU-related debugging Kconfig options are in two different places,
and consume too much screen real estate.  This commit therefore
consolidates them into their own menu.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-01-26 16:34:48 -08:00
Greg Kroah-Hartman 422d26b6ec Merge 3.8-rc5 into driver-core-next
This resolves a gpio driver merge issue pointed out in linux-next.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-25 21:06:30 -08:00
Greg Kroah-Hartman 9f9cba810f Merge 3.8-rc5 into tty-next
This resolves a number of tty driver merge issues found in linux-next

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-25 13:27:36 -08:00
Thierry Reding f4a18312f4 lib: devres: Fix build breakage
The ERR_PTR() and IS_ERR() macros used by the devm_ioremap_resource()
function are defined in the linux/err.h header. On ARM this seems to be
pulled in by one of the other headers but the build fails at least on
OpenRISC.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-22 13:31:18 -08:00
Thierry Reding 75096579c3 lib: devres: Introduce devm_ioremap_resource()
The devm_request_and_ioremap() function is very useful and helps avoid a
whole lot of boilerplate. However, one issue that keeps popping up is
its lack of a specific error code to determine which of the steps that
it performs failed. Furthermore, while the function gives an example and
suggests what error code to return on failure, a wide variety of error
codes are used throughout the tree.

In an attempt to fix these problems, this patch adds a new function that
drivers can transition to. The devm_ioremap_resource() returns a pointer
to the remapped I/O memory on success or an ERR_PTR() encoded error code
on failure. Callers can check for failure using IS_ERR() and determine
its cause by extracting the error code using PTR_ERR().

devm_request_and_ioremap() is implemented as a wrapper around the new
API and return NULL on failure as before. This ensures that backwards
compatibility is maintained until all users have been converted to the
new API, at which point the old devm_request_and_ioremap() function
should be removed.

A semantic patch is included which can be used to convert from the old
devm_request_and_ioremap() API to the new devm_ioremap_resource() API.
Some non-trivial cases may require manual intervention, though.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-22 09:41:43 -08:00
Rusty Russell 373d4d0997 taint: add explicit flag to show whether lock dep is still OK.
Fix up all callers as they were before, with make one change: an
unsigned module taints the kernel, but doesn't turn off lockdep.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-21 17:17:57 +10:30
Linus Torvalds 226364766f Various minor fixes, but a slightly more complex one to fix the per-cpu overload
problem introduced recently by kvm id changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ/IaJAAoJENkgDmzRrbjxOjAQAIrI9+Jo3Lsxk1v9gXeo9xn2
 ST4LNv7/oW2+3NFBOkKsGVpcXe1JtGySIXyx9k+dELPa5xe4Rs4HE3pHQj/VoEx8
 FKz3oUXSHkuh+paKuFXvZ2u/z0/FI99GmqHPObvGQ4iS3hTXAibzO83yYYPxwApq
 Zq4kof/dAcVVPLm8fGVAMPA2Rbh/WmjDfrIv8gv71QkDjtRLzcr40VIgky5cvu7V
 FWcBl4/DVoKkGnDPsLDhLK9QGqgBGhFIlNIcVX4Jv50DiCibOyzdjeUXYxMftoGr
 Rw56hHwGpPdqbRIjBkR071vIl/mlXTmxIv+d77vZNBin2MIBwAzCQXo8I1/HojCK
 /wKhI+RFj0J5DaDo/BTB80cmI3X2oah5sRUebW6vd9HjunhFFndg4mVeDNPa0E0+
 F72xWlj79BjdIOuD06TLg6Tg2klL49nC8bUc0wrsh6onEjhd9v7Cp/X/rxi5cKYW
 eEv3oLkKwUHoheF9gBlpnT0Yyl/HpFe+nemblzj/ybRKnk4A5vtJqV9eZnqoOS16
 lgIkKOpgXT9dzSom2EL/f4sMCeLLYC44DQwOvxNKt/BdMY0r5y8OLaJORXQGfEDF
 Ztvu2G8PmELxV0B3JZcGR/zOcKxpOBsrGoVn0/EQIul3A/0C0ID7i5zwJAyX6LP7
 V+6vyF2eHMf10tB0rbfB
 =SpOo
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module fixes and a virtio block fix from Rusty Russell:
 "Various minor fixes, but a slightly more complex one to fix the
  per-cpu overload problem introduced recently by kvm id changes."

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  module: put modules in list much earlier.
  module: add new state MODULE_STATE_UNFORMED.
  module: prevent warning when finit_module a 0 sized file
  virtio-blk: Don't free ida when disk is in use
2013-01-20 16:44:28 -08:00
Joe Millenbach 4f73bc4dd3 tty: Added a CONFIG_TTY option to allow removal of TTY
The option allows you to remove TTY and compile without errors. This
saves space on systems that won't support TTY interfaces anyway.
bloat-o-meter output is below.

The bulk of this patch consists of Kconfig changes adding "depends on
TTY" to various serial devices and similar drivers that require the TTY
layer.  Ideally, these dependencies would occur on a common intermediate
symbol such as SERIO, but most drivers "select SERIO" rather than
"depends on SERIO", and "select" does not respect dependencies.

bloat-o-meter output comparing our previous minimal to new minimal by
removing TTY.  The list is filtered to not show removed entries with awk
'$3 != "-"' as the list was very long.

add/remove: 0/226 grow/shrink: 2/14 up/down: 6/-35356 (-35350)
function                                     old     new   delta
chr_dev_init                                 166     170      +4
allow_signal                                  80      82      +2
static.__warned                              143     142      -1
disallow_signal                               63      62      -1
__set_special_pids                            95      94      -1
unregister_console                           126     121      -5
start_kernel                                 546     541      -5
register_console                             593     588      -5
copy_from_user                                45      40      -5
sys_setsid                                   128     120      -8
sys_vhangup                                   32      19     -13
do_exit                                     1543    1526     -17
bitmap_zero                                   60      40     -20
arch_local_irq_save                          137     117     -20
release_task                                 674     652     -22
static.spin_unlock_irqrestore                308     260     -48

Signed-off-by: Joe Millenbach <jmillenbach@gmail.com>
Reviewed-by: Jamey Sharp <jamey@minilop.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-18 16:15:27 -08:00
Greg Kroah-Hartman ed408f7c0f Merge 3.9-rc4 into driver-core-next
This is to fix up a build problem with a wireless driver due to the
dynamic-debug patches in this branch.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 19:48:18 -08:00
Jim Cromie 18c216c53b dynamic_debug: add pr_errs before -EINVALs
Ma noted that dynamic-debug is silent about many query errors, so add
pr_err()s to explain those errors, and tweak a few others.  Also parse
flags 1st, so that match-spec errs are slightly clearer.

CC: Jianpeng Ma <majianpeng@gmail.com>
CC: Joe Perches <joe@perches.com>
Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 12:19:09 -08:00
Vladimir Kondratiev 7a555613eb dynamic_debug: dynamic hex dump
Introduce print_hex_dump_debug() that can be dynamically controlled, similar to
pr_debug.

Also, make print_hex_dump_bytes() dynamically controlled

Implement only 'p' flag (_DPRINTK_FLAGS_PRINT) to keep it simple since hex dump prints
multiple lines and long prefix would impact readability.
To provide line/file etc. information, use pr_debug or similar
before/after print_hex_dump_debug()

Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 12:19:09 -08:00
Joe Perches f657fd21e1 dynamic_debug: Fix vpr_<foo> logging styles
vpr_info_dq should be a function and vpr_info should have
a do {} while (0)

Add missing newlines to pr_<level>s.

Miscellaneous neatening too.
braces, coalescing formats, alignments, etc...

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 12:18:07 -08:00
Kees Cook 525c1f9204 lib: remove depends on CONFIG_EXPERIMENTAL
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.

CC: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
CC: James Morris <james.l.morris@oracle.com>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Akinobu Mita <akinobu.mita@gmail.com>
CC: Ingo Molnar <mingo@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-17 12:11:27 -08:00
Rusty Russell 1fb9341ac3 module: put modules in list much earlier.
Prarit's excellent bug report:
> In recent Fedora releases (F17 & F18) some users have reported seeing
> messages similar to
>
> [   15.478160] kvm: Could not allocate 304 bytes percpu data
> [   15.478174] PERCPU: allocation failed, size=304 align=32, alloc from
> reserved chunk failed
>
> during system boot.  In some cases, users have also reported seeing this
> message along with a failed load of other modules.
>
> What is happening is systemd is loading an instance of the kvm module for
> each cpu found (see commit e9bda3b).  When the module load occurs the kernel
> currently allocates the modules percpu data area prior to checking to see
> if the module is already loaded or is in the process of being loaded.  If
> the module is already loaded, or finishes load, the module loading code
> releases the current instance's module's percpu data.

Now we have a new state MODULE_STATE_UNFORMED, we can insert the
module into the list (and thus guarantee its uniqueness) before we
allocate the per-cpu region.

Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Prarit Bhargava <prarit@redhat.com>
2013-01-12 13:27:46 +10:30
Michel Lespinasse 3cb7a56344 lib/rbtree.c: avoid the use of non-static __always_inline
lib/rbtree.c declared __rb_erase_color() as __always_inline void, and
then exported it with EXPORT_SYMBOL.

This was because __rb_erase_color() must be exported for augmented
rbtree users, but it must also be inlined into rb_erase() so that the
dummy callback can get optimized out of that call site.

(Actually with a modern compiler, none of the dummy callback functions
should even be generated as separate text functions).

The above usage is legal C, but it was unusual enough for some compilers
to warn about it.  This change makes things more explicit, with a static
__always_inline ____rb_erase_color function for use in rb_erase(), and a
separate non-inline __rb_erase_color function for use in
rb_erase_augmented call sites.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:56 -08:00
David Decotigny 896f97ea95 lib: cpu_rmap: avoid flushing all workqueues
In some cases, free_irq_cpu_rmap() is called while holding a lock (eg
rtnl).  This can lead to deadlocks, because it invokes
flush_scheduled_work() which ends up waiting for whole system workqueue
to flush, but some pending works might try to acquire the lock we are
already holding.

This commit uses reference-counting to replace
irq_run_affinity_notifiers().  It also removes
irq_run_affinity_notifiers() altogether.

[akpm@linux-foundation.org: eliminate free_cpu_rmap, rename cpu_rmap_reclaim() to cpu_rmap_release(), propagate kref_put() retval from cpu_rmap_put()]
Signed-off-by: David Decotigny <decot@googlers.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-01-11 14:54:54 -08:00
Paul E. McKenney 5249453510 rcu: Reduce rcutorture tracing
Currently, rcutorture traces every read-side access.  This can be
problematic because even a two-minute rcutorture run on a two-CPU system
can generate 28,853,363 reads.  Normally, only a failing read is of
interest, so this commit traces adjusts rcutorture's tracing to only
trace failing reads.  The resulting event tracing records the time
and the ->completed value captured at the beginning of the RCU read-side
critical section, allowing correlation with other event-tracing messages.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Add fix to build problem located by Randy Dunlap based on
  diagnosis by Steven Rostedt. ]
2013-01-08 14:14:55 -08:00
Greg Kroah-Hartman 6ae141718e misc: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the last of the __dev* markings from the kernel from
a variety of different, tiny, places.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:16 -08:00
Stephen Boyd fcc16882ac lib: atomic64: Initialize locks statically to fix early users
The atomic64 library uses a handful of static spin locks to implement
atomic 64-bit operations on architectures without support for atomic
64-bit instructions.

Unfortunately, the spinlocks are initialized in a pure initcall and that
is too late for the vfs namespace code which wants to use atomic64
operations before the initcall is run.

This became a problem as of commit 8823c079ba71: "vfs: Add setns support
for the mount namespace".

This leads to BUG messages such as:

  BUG: spinlock bad magic on CPU#0, swapper/0/0
   lock: atomic64_lock+0x240/0x400, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
    do_raw_spin_lock+0x158/0x198
    _raw_spin_lock_irqsave+0x4c/0x58
    atomic64_add_return+0x30/0x5c
    alloc_mnt_ns.clone.14+0x44/0xac
    create_mnt_ns+0xc/0x54
    mnt_init+0x120/0x1d4
    vfs_caches_init+0xe0/0x10c
    start_kernel+0x29c/0x300

coming out early on during boot when spinlock debugging is enabled.

Fix this by initializing the spinlocks statically at compile time.

Reported-and-tested-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-20 13:50:16 -08:00
Linus Torvalds 787314c35f IOMMU Updates for Linux v3.8
A few new features this merge-window. The most important one is
 probably, that dma-debug now warns if a dma-handle is not checked with
 dma_mapping_error by the device driver. This requires minor changes to
 some architectures which make use of dma-debug. Most of these changes
 have the respective Acks by the Arch-Maintainers.
 Besides that there are updates to the AMD IOMMU driver for refactor the
 IOMMU-Groups support and to make sure it does not trigger a hardware
 erratum.
 The OMAP changes (for which I pulled in a branch from Tony Lindgren's
 tree) have a conflict in linux-next with the arm-soc tree. The conflict
 is in the file arch/arm/mach-omap2/clock44xx_data.c which is deleted in
 the arm-soc tree. It is safe to delete the file too so solve the
 conflict. Similar changes are done in the arm-soc tree in the common
 clock framework migration. A missing hunk from the patch in the IOMMU
 tree will be submitted as a seperate patch when the merge-window is
 closed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQzbQQAAoJECvwRC2XARrjXCIP/2RxBzbVOiaPOorl+ZWbsZ41
 lzWiXsCHJkh4BK4/qGsVeKhiNd9LcbQUlhywnBbhWxym3spzmjGtvU2Hcg8QiO/M
 R83r9S4e8Z6DnF9Gcats1Ns9BufgpyhLXg3XoXPxtyHOgRS59fvYi6xXOxyX30Dy
 uhbj+WL6UD0zvOMNztEnM1p6UhX+XlpvzKDTR5+G5xKdVPkcgeiaKSwqz739caTn
 QE2NpqIh+8Mwuu1nIapk8h07xhUYU5eGMXa38u1LvDwSHsrsCMLC+lXIjtInn7Gw
 Bv+XcCHgtOaoPQwwk/xd2HVwJQxO9HNb5YX51EIjwP0C5S/3yW9Ji1RgqFb6Ewqq
 jIkF6ckwUheLWsBGkw5UknI/f7RX3MDiTWkziYLIniYKKewm+ymGfgIqPt2TzLIO
 tMZZiIssKvy7wOXQ5JjpYJg5Xmrau6opNwdEguC8pWkJT7qsn+3SeLjMt0Lh9IoY
 +37DOgOLb3O3/vnZJ3i0KMRZBfVeaRj5HaGmlxFCYUZCNQymIPTih9Jtqm+WuVcu
 YaGQCTtynsQ0JVh8YEekLzSfgd3OODP68fyCg1CQNixEgvUi2hd/toX2/Z1wkkSA
 JC9bZarcoPkSWqaTAA2HvmaaxvRR+0UbhFPopFTQarVV0MVLZWBxoyuKy/nMrmMd
 UgTzrDYy74UKdrSTwIXg
 =pPHZ
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "A few new features this merge-window.  The most important one is
  probably, that dma-debug now warns if a dma-handle is not checked with
  dma_mapping_error by the device driver.  This requires minor changes
  to some architectures which make use of dma-debug.  Most of these
  changes have the respective Acks by the Arch-Maintainers.

  Besides that there are updates to the AMD IOMMU driver for refactor
  the IOMMU-Groups support and to make sure it does not trigger a
  hardware erratum.

  The OMAP changes (for which I pulled in a branch from Tony Lindgren's
  tree) have a conflict in linux-next with the arm-soc tree.  The
  conflict is in the file arch/arm/mach-omap2/clock44xx_data.c which is
  deleted in the arm-soc tree.  It is safe to delete the file too so
  solve the conflict.  Similar changes are done in the arm-soc tree in
  the common clock framework migration.  A missing hunk from the patch
  in the IOMMU tree will be submitted as a seperate patch when the
  merge-window is closed."

* tag 'iommu-updates-v3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (29 commits)
  ARM: dma-mapping: support debug_dma_mapping_error
  ARM: OMAP4: hwmod data: ipu and dsp to use parent clocks instead of leaf clocks
  iommu/omap: Adapt to runtime pm
  iommu/omap: Migrate to hwmod framework
  iommu/omap: Keep mmu enabled when requested
  iommu/omap: Remove redundant clock handling on ISR
  iommu/amd: Remove obsolete comment
  iommu/amd: Don't use 512GB pages
  iommu/tegra: smmu: Move bus_set_iommu after probe for multi arch
  iommu/tegra: gart: Move bus_set_iommu after probe for multi arch
  iommu/tegra: smmu: Remove unnecessary PTC/TLB flush all
  tile: dma_debug: add debug_dma_mapping_error support
  sh: dma_debug: add debug_dma_mapping_error support
  powerpc: dma_debug: add debug_dma_mapping_error support
  mips: dma_debug: add debug_dma_mapping_error support
  microblaze: dma-mapping: support debug_dma_mapping_error
  ia64: dma_debug: add debug_dma_mapping_error support
  c6x: dma_debug: add debug_dma_mapping_error support
  ARM64: dma_debug: add debug_dma_mapping_error support
  intel-iommu: Prevent devices with RMRRs from being placed into SI Domain
  ...
2012-12-20 10:07:25 -08:00
Linus Torvalds 7a684c452e Nothing all that exciting; a new module-from-fd syscall for those who want
to verify the source of the module (ChromeOS) and/or use standard IMA on it
 or other security hooks.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQ0VKlAAoJENkgDmzRrbjxjuEQALVHpD1cSmryOzVwkNn7rVGP
 PV3KVbUs+qzUCm2c3AafIIlSBm2LOUl+cR3uNC7di8aHarRF3VHkK2OQ4Fx97ECd
 KKBqAyY3R0q1mAKujb/MWwiK0YgosEDIOzGGn2yQhNFsxKqnMB02P4j82IO7+g+w
 Cc3XuDyWHoH2I+ySgz0Q8NHAqufD/DMZUKud7jw2Lsv6PuICJ1Oqgl/Gd/muxort
 4a5tV3tjhRGywHS/8b2fbDUXkybC5NKK0FN+gyoaROmJ/THeHEQDGXZT9bc2vmVx
 HvRy/5k8dzQ6LAJ2mLnPvy0pmv0u7NYMvjxTxxUlUkFMkYuVticikQfwSYDbDPt4
 mbsLxchpgi8z4x8HltEERffCX5tldo/5hz1uemqhqIsMRIrRFnlHkSIgkGjVHf2u
 LXQBLT8uTm6C0VyNQPrI/hUZzIax7WtKbPSoK9lmExNbKqloEFh/mVXvfQxei2kp
 wnUZcnmPIqSvw7b4CWu7HibMYu2VvGBgm3YIfJRi4AQme1mzFYLpZoxF5Pj+Ykbt
 T//Hb1EsNQTTFCg7MZhnJSAw/EVUvNDUoullORClyqw6+xxjVKqWpPJgYDRfWOlJ
 Xa+s7DNrL+Oo1WWR8l5ruoQszbR8szIyeyPKKxRUcQj2zsqghoWuzKAx2saSEw3W
 pNkoJU+dGC7kG/yVAS8N
 =uoJj
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "Nothing all that exciting; a new module-from-fd syscall for those who
  want to verify the source of the module (ChromeOS) and/or use standard
  IMA on it or other security hooks."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Fix kbuild output when using default extra_certificates
  MODSIGN: Avoid using .incbin in C source
  modules: don't hand 0 to vmalloc.
  module: Remove a extra null character at the top of module->strtab.
  ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
  ASN.1: Define indefinite length marker constant
  moduleparam: use __UNIQUE_ID()
  __UNIQUE_ID()
  MODSIGN: Add modules_sign make target
  powerpc: add finit_module syscall.
  ima: support new kernel module syscall
  add finit_module syscall to asm-generic
  ARM: add finit_module syscall to ARM
  security: introduce kernel_module_from_file hook
  module: add flags arg to sys_finit_module()
  module: add syscall to load module from fd
2012-12-19 07:55:08 -08:00
Linus Torvalds 16e024f30c Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc update from Benjamin Herrenschmidt:
 "The main highlight is probably some base POWER8 support.  There's more
  to come such as transactional memory support but that will wait for
  the next one.

  Overall it's pretty quiet, or rather I've been pretty poor at picking
  things up from patchwork and reviewing them this time around and Kumar
  no better on the FSL side it seems..."

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits)
  powerpc+of: Rename and fix OF reconfig notifier error inject module
  powerpc: mpc5200: Add a3m071 board support
  powerpc/512x: don't compile any platform DIU code if the DIU is not enabled
  powerpc/mpc52xx: use module_platform_driver macro
  powerpc+of: Export of_reconfig_notifier_[register,unregister]
  powerpc/dma/raidengine: add raidengine device
  powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct
  powerpc/mpc85xx: Change spin table to cached memory
  powerpc/fsl-pci: Add PCI controller ATMU PM support
  powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI
  drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS]
  powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers
  powerpc: Disable relocation on exceptions when kexecing
  powerpc: Enable relocation on during exceptions at boot
  powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function
  powerpc: Add wrappers to enable/disable relocation on exceptions
  powerpc: Add set_mode hcall
  powerpc: Setup relocation on exceptions for bare metal systems
  powerpc: Move initial mfspr LPCR out of __init_LPCR
  powerpc: Add relocation on exception vector handlers
  ...
2012-12-18 09:58:09 -08:00
Linus Torvalds ea88eeac0c md update for 3.8
Mostly just little fixes.  Probably biggest part is
 AVX accelerated RAID6 calculations.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIVAwUAUM/w2Dnsnt1WYoG5AQKXlg/9F5juv4CjRkRRFLqZgOPBLmn/s/2Vspgh
 2Kv8Jcyixd8jUQNbobZv0ahlJH/iSU61kpOE8QjLbKi5Y42vAbM0ZU2aHJ6nqGZy
 HiTI8K+7kTvCK3ZXLcUQ+4oPPBNTcoTZbLWaEOmIqB1ruLddoIR7M9fG3PspVeG0
 jijnXR8IfL6mr4YDXnJkEhFrneTysVik05RkKYZKyM/9r3stAoMJ9o0/EFy3OFxb
 lO6mLEtvjVArXcnuf1RMCw2YKgki9Y4r73HCplgQsVFvcxcpsya4gFF+lRR5j7cO
 /eMYbSQ89iWEYKh1dJ9u1nofc8fX5ia71QQyO1fkO4GXRHXPVIyBgKSbe7SaL6iG
 JUMm7idUV2rZGeq3ln3k8Yor4QqHvN1n7pRKKUF+ZdsPoQ1B/TABu+qpsAdo5ZhP
 fxDsULsHrzEaxgetd4V8F2Uptca9ni43sMI8mwsvVlA0p6SOzMIyoJLC9xAZpx11
 b3H3+7Oje/fasmszBoq5B9uAlSt9XXVN4DDn2q6cX+S96JSX6jcsN1c6cJBO+ZxB
 OU6a6P5mnU6HuxU02rspe7G8BeU+ybaonErOW+GdyC4r7M/cImC0dSp0NGHK2211
 oqu0xBx/Q/ddTFwKQqa4HzR2ws09+LhKbjdqYIhCEKttIbLIAjf73ARZ19XPSRRX
 pDR/ey2CB6E=
 =uK52
 -----END PGP SIGNATURE-----

Merge tag 'md-3.8' of git://neil.brown.name/md

Pull md update from Neil Brown:
 "Mostly just little fixes.  Probably biggest part is AVX accelerated
  RAID6 calculations."

* tag 'md-3.8' of git://neil.brown.name/md:
  md/raid5: add blktrace calls
  md/raid5: use async_tx_quiesce() instead of open-coding it.
  md: Use ->curr_resync as last completed request when cleanly aborting resync.
  lib/raid6: build proper files on corresponding arch
  lib/raid6: Add AVX2 optimized gen_syndrome functions
  lib/raid6: Add AVX2 optimized recovery functions
  md: Update checkpoint of resync/recovery based on time.
  md:Add place to update ->recovery_cp.
  md.c: re-indent various 'switch' statements.
  md: close race between removing and adding a device.
  md: removed unused variable in calc_sb_1_csm.
2012-12-18 09:32:44 -08:00
Linus Torvalds 848b81415c Merge branch 'akpm' (Andrew's patch-bomb)
Merge misc patches from Andrew Morton:
 "Incoming:

   - lots of misc stuff

   - backlight tree updates

   - lib/ updates

   - Oleg's percpu-rwsem changes

   - checkpatch

   - rtc

   - aoe

   - more checkpoint/restart support

  I still have a pile of MM stuff pending - Pekka should be merging
  later today after which that is good to go.  A number of other things
  are twiddling thumbs awaiting maintainer merges."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (180 commits)
  scatterlist: don't BUG when we can trivially return a proper error.
  docs: update documentation about /proc/<pid>/fdinfo/<fd> fanotify output
  fs, fanotify: add @mflags field to fanotify output
  docs: add documentation about /proc/<pid>/fdinfo/<fd> output
  fs, notify: add procfs fdinfo helper
  fs, exportfs: add exportfs_encode_inode_fh() helper
  fs, exportfs: escape nil dereference if no s_export_op present
  fs, epoll: add procfs fdinfo helper
  fs, eventfd: add procfs fdinfo helper
  procfs: add ability to plug in auxiliary fdinfo providers
  tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
  breakpoint selftests: print failure status instead of cause make error
  kcmp selftests: print fail status instead of cause make error
  kcmp selftests: make run_tests fix
  mem-hotplug selftests: print failure status instead of cause make error
  cpu-hotplug selftests: print failure status instead of cause make error
  mqueue selftests: print failure status instead of cause make error
  vm selftests: print failure status instead of cause make error
  ubifs: use prandom_bytes
  mtd: nandsim: use prandom_bytes
  ...
2012-12-17 20:58:12 -08:00
Nick Bowler 6fd59a83b9 scatterlist: don't BUG when we can trivially return a proper error.
There is absolutely no reason to crash the kernel when we have a
perfectly good return value already available to use for conveying
failure status.

Let's return an error code instead of crashing the kernel: that sounds
like a much better plan.

[akpm@linux-foundation.org: s/E2BIG/EINVAL/]
Signed-off-by: Nick Bowler <nbowler@elliptictech.com>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:28 -08:00
Akinobu Mita 6582c665d6 prandom: introduce prandom_bytes() and prandom_bytes_state()
Add functions to get the requested number of pseudo-random bytes.

The difference from get_random_bytes() is that it generates pseudo-random
numbers by prandom_u32().  It doesn't consume the entropy pool, and the
sequence is reproducible if the same rnd_state is used.  So it is suitable
for generating random bytes for testing.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: David Laight <david.laight@aculab.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:26 -08:00
Akinobu Mita 496f2f93b1 random32: rename random32 to prandom
This renames all random32 functions to have 'prandom_' prefix as follows:

  void prandom_seed(u32 seed);	/* rename from srandom32() */
  u32 prandom_u32(void);		/* rename from random32() */
  void prandom_seed_state(struct rnd_state *state, u64 seed);
  				/* rename from prandom32_seed() */
  u32 prandom_u32_state(struct rnd_state *state);
  				/* rename from prandom32() */

The purpose of this renaming is to prevent some kernel developers from
assuming that prandom32() and random32() might imply that only
prandom32() was the one using a pseudo-random number generator by
prandom32's "p", and the result may be a very embarassing security
exposure.  This concern was expressed by Theodore Ts'o.

And furthermore, I'm going to introduce new functions for getting the
requested number of pseudo-random bytes.  If I continue to use both
prandom32 and random32 prefixes for these functions, the confusion
is getting worse.

As a result of this renaming, "prandom_" is the common prefix for
pseudo-random number library.

Currently, srandom32() and random32() are preserved because it is
difficult to rename too many users at once.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Robert Love <robert.w.love@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: David Laight <david.laight@aculab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:26 -08:00
Eldad Zack 462e471107 simple_strto*: annotate function as obsolete
Update the documentation for simple_strto* to reflect that it has been
obsoleted and advise the usage of kstrto*.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:22 -08:00
Eldad Zack 4c925d6031 kstrto*: add documentation
As Bruce Fields pointed out, kstrto* is currently lacking kerneldoc
comments.  This patch adds kerneldoc comments to common variants of
kstrto*: kstrto(u)l, kstrto(u)ll and kstrto(u)int.

Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Joe Perches <joe@perches.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Rob Landley <rob@landley.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:22 -08:00
Cong Ding 918854a65e lib/rbtree_test.c: fix uninitialized variable warning
Fix this warning:

  lib/rbtree_test.c: In function `check':
  lib/rbtree_test.c:121: warning: `blacks' may be used uninitialized in this function

Signed-off-by: Cong Ding <dinggnu@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Oleg Nesterov 22b361d1df percpu_rw_semaphore: introduce CONFIG_PERCPU_RWSEM
Currently only block_dev and uprobes use percpu_rw_semaphore,
add the config option selected by BLOCK || UPROBES.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Oleg Nesterov 8ebe34731a percpu_rw_semaphore: add lockdep annotations
Add lockdep annotations.  Not only this can help to find the potential
problems, we do not want the false warnings if, say, the task takes two
different percpu_rw_semaphore's for reading.  IOW, at least ->rw_sem
should not use a single class.

This patch exposes this internal lock to lockdep so that it represents the
whole percpu_rw_semaphore.  This way we do not need to add another "fake"
->lockdep_map and lock_class_key.  More importantly, this also makes the
output from lockdep much more understandable if it finds the problem.

In short, with this patch from lockdep pov percpu_down_read() and
percpu_up_read() acquire/release ->rw_sem for reading, this matches the
actual semantics.  This abuses __up_read() but I hope this is fine and in
fact I'd like to have down_read_no_lockdep() as well,
percpu_down_read_recursive_readers() will need it.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Oleg Nesterov 9390ef0c85 percpu_rw_semaphore: kill ->writer_mutex, add ->write_ctr
percpu_rw_semaphore->writer_mutex was only added to simplify the initial
rewrite, the only thing it protects is clear_fast_ctr() which otherwise
could be called by multiple writers.  ->rw_sem is enough to serialize the
writers.

Kill this mutex and add "atomic_t write_ctr" instead.  The writers
increment/decrement this counter, the readers check it is zero instead of
mutex_is_locked().

Move atomic_add(clear_fast_ctr(), slow_read_ctr) under down_write() to
avoid the race with other writers.  This is a bit sub-optimal, only the
first writer needs this and we do not need to exclude the readers at this
stage.  But this is simple, we do not want another internal lock until we
add more features.

And this speeds up the write-contended case.  Before this patch the racing
writers sleep in synchronize_sched_expedited() sequentially, with this
patch multiple synchronize_sched_expedited's can "overlap" with each
other.  Note: we can do more optimizations, this is only the first step.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Oleg Nesterov a1fd3e24d8 percpu_rw_semaphore: reimplement to not block the readers unnecessarily
Currently the writer does msleep() plus synchronize_sched() 3 times to
acquire/release the semaphore, and during this time the readers are
blocked completely.  Even if the "write" section was not actually started
or if it was already finished.

With this patch down_write/up_write does synchronize_sched() twice and
down_read/up_read are still possible during this time, just they use the
slow path.

percpu_down_write() first forces the readers to use rw_semaphore and
increment the "slow" counter to take the lock for reading, then it
takes that rw_semaphore for writing and blocks the readers.

Also.  With this patch the code relies on the documented behaviour of
synchronize_sched(), it doesn't try to pair synchronize_sched() with
barrier.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anton Arapov <anton@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Jan Beulich 53809751ac sscanf: don't ignore field widths for numeric conversions
This is another step towards better standard conformance.  Rather than
adding a local buffer to store the specified portion of the string (with
the need to enforce an arbitrary maximum supported width to limit the
buffer size), do a maximum width conversion and then drop as much of it as
is necessary to meet the caller's request.

Also fail on negative field widths.

Uses the deprecated simple_strto*() functions because kstrtoXX() fail on
non-zero terminated strings.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:18 -08:00
Andy Shevchenko 35367ab28d lib: dynamic_debug: use kbasename()
Remove the custom implementation of the functionality similar to kbasename().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:17 -08:00
Jason Gunthorpe ef12496022 lib/vsprintf.c: fix handling of %zd when using ssize_t
Documentation/printk-formats.txt says to use %zd for a ssize_t argument
and some drivers do.  Unfortunately this prints a positive number for
negative values eg:

  tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error 4294967234

Add a case to va_args a ssize_t type if the interpretation should be
signed.

Tested on PPC32.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-17 17:15:13 -08:00
Benjamin Herrenschmidt 376bddd344 Merge remote-tracking branch 'agust/next' into next
Brings some 52xx updates. Also manually merged tools/perf/perf.h.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-12-18 10:22:27 +11:00
Linus Torvalds 9228ff9038 Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-block
Pull block driver update from Jens Axboe:
 "Now that the core bits are in, here are the driver bits for 3.8.  The
  branch contains:

   - A huge pile of drbd bits that were dumped from the 3.7 merge
     window.  Following that, it was both made perfectly clear that
     there is going to be no more over-the-wall pulls and how the
     situation on individual pulls can be improved.

   - A few cleanups from Akinobu Mita for drbd and cciss.

   - Queue improvement for loop from Lukas.  This grew into adding a
     generic interface for waiting/checking an even with a specific
     lock, allowing this to be pulled out of md and now loop and drbd is
     also using it.

   - A few fixes for xen back/front block driver from Roger Pau Monne.

   - Partition improvements from Stephen Warren, allowing partiion UUID
     to be used as an identifier."

* 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits)
  drbd: update Kconfig to match current dependencies
  drbd: Fix drbdsetup wait-connect, wait-sync etc... commands
  drbd: close race between drbd_set_role and drbd_connect
  drbd: respect no-md-barriers setting also when changed online via disk-options
  drbd: Remove obsolete check
  drbd: fixup after wait_even_lock_irq() addition to generic code
  loop: Limit the number of requests in the bio list
  wait: add wait_event_lock_irq() interface
  xen-blkfront: free allocated page
  xen-blkback: move free persistent grants code
  block: partition: msdos: provide UUIDs for partitions
  init: reduce PARTUUID min length to 1 from 36
  block: store partition_meta_info.uuid as a string
  cciss: use check_signature()
  cciss: cleanup bitops usage
  drbd: use copy_highpage
  drbd: if the replication link breaks during handshake, keep retrying
  drbd: check return of kmalloc in receive_uuids
  drbd: Broadcast sync progress no more often than once per second
  drbd: don't try to clear bits once the disk has failed
  ...
2012-12-17 13:39:11 -08:00
Linus Torvalds 9b690c3d56 Feature:
- Use dma addresses instead of the virt_to_phys and vice versa functions.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQy9RFAAoJEFjIrFwIi8fJaYwH/0aC5xbJYSNubBI5opZe7APC
 b1LzHpV1yXyIZc2wOqdhzFmmrzRLVHULnfG/oOF9/VaAjrxWeZVDs54DF33Xao9M
 pOFLrY9kwK+AB5DPm0aBzF7JOS8Fd+alNM7Dork3mtsxbqqX3fx0OHvNTZJVdrj8
 ToYVoTiz7dpXC1IIgBoJf6XKrwqvcQrZc7yXgz0A8ZZhCUZSBMHV1Lifx/8xzpQ+
 bgfTSsH6CwejlpSXWQ6jfsF2C/Ku/nAWOKrpVZleb08svgPR/UA8Nhkjlaoj2xbp
 zq/aWnpaF4fKiYmY6qrRGo+ABcUHua7Iaco8ZjNCyrIPfInSqhxua/Ajzk1xtNU=
 =QiGx
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb

Pull swiotlb update from Konrad Rzeszutek Wilk:
 "Feature:
   - Use dma addresses instead of the virt_to_phys and vice versa
     functions.

  Remove the multitude of phys_to_virt/virt_to_phys calls and instead
  operate on the physical addresses instead of virtual in many of the
  internal functions.  This does provide a speed up in interrupt
  handlers that do DMA operations and use SWIOTLB."

* tag 'stable/for-linus-3.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Do not export swiotlb_bounce since there are no external consumers
  swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single
  swiotlb: Use physical addresses for swiotlb_tbl_unmap_single
  swiotlb: Return physical addresses when calling swiotlb_tbl_map_single
  swiotlb: Make io_tlb_overflow_buffer a physical address
  swiotlb: Make io_tlb_start a physical address instead of a virtual one
  swiotlb: Make io_tlb_end a physical address instead of a virtual one
2012-12-16 17:39:14 -08:00
Joerg Roedel 9c6ecf6a3a Merge branches 'iommu/fixes', 'dma-debug', 'x86/amd', 'x86/vt-d', 'arm/tegra' and 'arm/omap' into next 2012-12-16 12:24:09 +01:00
Linus Torvalds 18dd0bf22b Merge branch 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 ACPI update from Peter Anvin:
 "This is a patchset which didn't make the last merge window.  It adds a
  debugging capability to feed ACPI tables via the initramfs.

  On a grander scope, it formalizes using the initramfs protocol for
  feeding arbitrary blobs which need to be accessed early to the kernel:
  they are fed first in the initramfs blob (lots of bootloaders can
  concatenate this at boot time, others can use a single file) in an
  uncompressed cpio archive using filenames starting with "kernel/".

  The ACPI maintainers requested that this patchset be fed via the x86
  tree rather than the ACPI tree as the footprint in the general x86
  code is much bigger than in the ACPI code proper."

* 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  X86 ACPI: Use #ifdef not #if for CONFIG_X86 check
  ACPI: Fix build when disabled
  ACPI: Document ACPI table overriding via initrd
  ACPI: Create acpi_table_taint() function to avoid code duplication
  ACPI: Implement physical address table override
  ACPI: Store valid ACPI tables passed via early initrd in reserved memblock areas
  x86, acpi: Introduce x86 arch specific arch_reserve_mem_area() for e820 handling
  lib: Add early cpio decoder
2012-12-14 10:03:23 -08:00
David Howells 99cca91e37 ASN.1: Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants
Use the ASN1_LONG_TAG and ASN1_INDEFINITE_LENGTH constants in the ASN.1
general decoder instead of the equivalent numbers.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-14 13:06:42 +10:30
Benjamin Herrenschmidt d526e85f60 powerpc+of: Rename and fix OF reconfig notifier error inject module
This module used to inject errors in the pSeries specific dynamic
reconfiguration notifiers. Those are gone however, replaced by
generic notifiers for changes to the device-tree. So let's update
the module to deal with these instead and rename it along the way.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
2012-12-14 10:32:52 +11:00
Linus Torvalds a2013a13e6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial branch from Jiri Kosina:
 "Usual stuff -- comment/printk typo fixes, documentation updates, dead
  code elimination."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  HOWTO: fix double words typo
  x86 mtrr: fix comment typo in mtrr_bp_init
  propagate name change to comments in kernel source
  doc: Update the name of profiling based on sysfs
  treewide: Fix typos in various drivers
  treewide: Fix typos in various Kconfig
  wireless: mwifiex: Fix typo in wireless/mwifiex driver
  messages: i2o: Fix typo in messages/i2o
  scripts/kernel-doc: check that non-void fcts describe their return value
  Kernel-doc: Convention: Use a "Return" section to describe return values
  radeon: Fix typo and copy/paste error in comments
  doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
  various: Fix spelling of "asynchronous" in comments.
  Fix misspellings of "whether" in comments.
  eisa: Fix spelling of "asynchronous".
  various: Fix spelling of "registered" in comments.
  doc: fix quite a few typos within Documentation
  target: iscsi: fix comment typos in target/iscsi drivers
  treewide: fix typo of "suport" in various comments and Kconfig
  treewide: fix typo of "suppport" in various comments
  ...
2012-12-13 12:00:02 -08:00
Yuanhan Liu 4f8c55c5ad lib/raid6: build proper files on corresponding arch
sse and avx2 stuff only exist on x86 arch, and we don't need to build
altivec on x86. And we can do that at lib/raid6/Makefile.

Proposed-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-13 19:51:04 +11:00
Yuanhan Liu 2c935842bd lib/raid6: Add AVX2 optimized gen_syndrome functions
Add AVX2 optimized gen_syndrom functions, which is simply based on
sse2.c written by hpa.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-13 19:51:03 +11:00
Jim Kukunas 7056741fd9 lib/raid6: Add AVX2 optimized recovery functions
Optimize RAID6 recovery functions to take advantage of
the 256-bit YMM integer instructions introduced in AVX2.

The patch was tested and benchmarked before submission.
However hardware is not yet released so benchmark numbers
cannot be reported.

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Jim Kukunas <james.t.kukunas@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2012-12-13 16:42:01 +11:00
Linus Torvalds 37ea95a959 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU update from Ingo Molnar:
 "The major features of this tree are:

     1. A first version of no-callbacks CPUs.  This version prohibits
        offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
        Relaxing this constraint is in progress, but not yet ready
        for prime time.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/724.

     2. Changes to SRCU that allows statically initialized srcu_struct
        structures.  These commits were posted to LKML at
        https://lkml.org/lkml/2012/10/30/296.

     3. Restructuring of RCU's debugfs output.  These commits were posted
        to LKML at https://lkml.org/lkml/2012/10/30/341.

     4. Additional CPU-hotplug/RCU improvements, posted to LKML at
        https://lkml.org/lkml/2012/10/30/327.
        Note that the commit eliminating __stop_machine() was judged to
        be too-high of risk, so is deferred to 3.9.

     5. Changes to RCU's idle interface, most notably a new module
        parameter that redirects normal grace-period operations to
        their expedited equivalents.  These were posted to LKML at
        https://lkml.org/lkml/2012/10/30/739.

     6. Additional diagnostics for RCU's CPU stall warning facility,
        posted to LKML at https://lkml.org/lkml/2012/10/30/315.
        The most notable change reduces the
        default RCU CPU stall-warning time from 60 seconds to 21 seconds,
        so that it once again happens sooner than the softlockup timeout.

     7. Documentation updates, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/280.
        A couple of late-breaking changes were posted at
        https://lkml.org/lkml/2012/11/16/634 and
        https://lkml.org/lkml/2012/11/16/547.

     8. Miscellaneous fixes, which were posted to LKML at
        https://lkml.org/lkml/2012/10/30/309.

     9. Finally, a fix for an lockdep-RCU splat was posted to LKML
        at https://lkml.org/lkml/2012/11/7/486."

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  context_tracking: New context tracking susbsystem
  sched: Mark RCU reader in sched_show_task()
  rcu: Separate accounting of callbacks from callback-free CPUs
  rcu: Add callback-free CPUs
  rcu: Add documentation for the new rcuexp debugfs trace file
  rcu: Update documentation for TREE_RCU debugfs tracing
  rcu: Reduce default RCU CPU stall warning timeout
  rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
  rcu: Clarify memory-ordering properties of grace-period primitives
  rcu: Add new rcutorture module parameters to start/end test messages
  rcu: Remove list_for_each_continue_rcu()
  rcu: Fix batch-limit size problem
  rcu: Add tracing for synchronize_sched_expedited()
  rcu: Remove old debugfs interfaces and also RCU flavor name
  rcu: split 'rcuhier' to each flavor
  rcu: split 'rcugp' to each flavor
  rcu: split 'rcuboost' to each flavor
  rcu: split 'rcubarrier' to each flavor
  rcu: Fix tracing formatting
  rcu: Remove the interface "rcudata.csv"
  ...
2012-12-11 18:10:49 -08:00
Linus Torvalds 608ff1a210 Merge branch 'akpm' (Andrew's patchbomb)
Merge misc updates from Andrew Morton:
 "About half of most of MM.  Going very early this time due to
  uncertainty over the coreautounifiednumasched things.  I'll send the
  other half of most of MM tomorrow.  The rest of MM awaits a slab merge
  from Pekka."

* emailed patches from Andrew Morton: (71 commits)
  memory_hotplug: ensure every online node has NORMAL memory
  memory_hotplug: handle empty zone when online_movable/online_kernel
  mm, memory-hotplug: dynamic configure movable memory and portion memory
  drivers/base/node.c: cleanup node_state_attr[]
  bootmem: fix wrong call parameter for free_bootmem()
  avr32, kconfig: remove HAVE_ARCH_BOOTMEM
  mm: cma: remove watermark hacks
  mm: cma: skip watermarks check for already isolated blocks in split_free_page()
  mm, oom: fix race when specifying a thread as the oom origin
  mm, oom: change type of oom_score_adj to short
  mm: cleanup register_node()
  mm, mempolicy: remove duplicate code
  mm/vmscan.c: try_to_freeze() returns boolean
  mm: introduce putback_movable_pages()
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce compaction and migration for ballooned pages
  mm: introduce a common interface for balloon pages mobility
  mm: redefine address_space.assoc_mapping
  mm: adjust address_space_operations.migratepage() return code
  arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
  ...
2012-12-11 18:05:37 -08:00
Joonsoo Kim 81df9bff26 bootmem: fix wrong call parameter for free_bootmem()
It is strange that alloc_bootmem() returns a virtual address and
free_bootmem() requires a physical address.  Anyway, free_bootmem()'s
first parameter should be physical address.

There are some call sites for free_bootmem() with virtual address.  So fix
them.

[akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_pate() documentation]
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:28 -08:00
Linus Torvalds cff2f741b8 Driver core updates for 3.8-rc1
Here's the large driver core updates for 3.8-rc1.
 
 The biggest thing here is the various __dev* marking removals.  This is
 going to be a pain for the merge with different subsystem trees, I know,
 but all of the patches included here have been ACKed by their various
 subsystem maintainers, as they wanted them to go through here.
 
 If this is too much of a pain, I can pull all of them out of this tree
 and just send you one with the other fixes/updates and then, after
 3.8-rc1 is out, do the rest of the removals to ensure we catch them all,
 it's up to you.  The merges should all be trivial, and Stephen has been
 doing them all in linux-next for a few weeks now quite easily.
 
 Other than the __dev* marking removals, there's nothing major here, some
 firmware loading updates and other minor things in the driver core.
 
 All of these have (much to Stephen's annoyance), been in linux-next for
 a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlDHkPkACgkQMUfUDdst+ykaWgCfW7AM30cv0nzoVO08ax6KjlG1
 KVYAn3z/KYazvp4B6LMvrW9y0G34Wmad
 =yvVr
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg Kroah-Hartman:
 "Here's the large driver core updates for 3.8-rc1.

  The biggest thing here is the various __dev* marking removals.  This
  is going to be a pain for the merge with different subsystem trees, I
  know, but all of the patches included here have been ACKed by their
  various subsystem maintainers, as they wanted them to go through here.

  If this is too much of a pain, I can pull all of them out of this tree
  and just send you one with the other fixes/updates and then, after
  3.8-rc1 is out, do the rest of the removals to ensure we catch them
  all, it's up to you.  The merges should all be trivial, and Stephen
  has been doing them all in linux-next for a few weeks now quite
  easily.

  Other than the __dev* marking removals, there's nothing major here,
  some firmware loading updates and other minor things in the driver
  core.

  All of these have (much to Stephen's annoyance), been in linux-next
  for a while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fixed up trivial conflicts in drivers/gpio/gpio-{em,stmpe}.c due to gpio
update.

* tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (93 commits)
  modpost.c: Stop checking __dev* section mismatches
  init.h: Remove __dev* sections from the kernel
  acpi: remove use of __devinit
  PCI: Remove __dev* markings
  PCI: Always build setup-bus when PCI is enabled
  PCI: Move pci_uevent into pci-driver.c
  PCI: Remove CONFIG_HOTPLUG ifdefs
  unicore32/PCI: Remove CONFIG_HOTPLUG ifdefs
  sh/PCI: Remove CONFIG_HOTPLUG ifdefs
  powerpc/PCI: Remove CONFIG_HOTPLUG ifdefs
  mips/PCI: Remove CONFIG_HOTPLUG ifdefs
  microblaze/PCI: Remove CONFIG_HOTPLUG ifdefs
  dma: remove use of __devinit
  dma: remove use of __devexit_p
  firewire: remove use of __devinitdata
  firewire: remove use of __devinit
  leds: remove use of __devexit
  leds: remove use of __devinit
  leds: remove use of __devexit_p
  mmc: remove use of __devexit
  ...
2012-12-11 13:13:55 -08:00
Nadia Yvette Chambers 6d49e352ae propagate name change to comments in kernel source
I've legally changed my name with New York State, the US Social Security
Administration, et al. This patch propagates the name change and change
in initials and login to comments in the kernel source as well.

Signed-off-by: Nadia Yvette Chambers <nyc@holomorphy.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-06 10:39:54 +01:00
Tim Gardner 527897ccd9 lib/Makefile: Fix oid_registry build dependency
It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
The object file cannot be built until $(obj)/oid_registry_data.c has been
generated.

A periodic and hard to reproduce parallel build failure is due to
this incorrect lib/Makefile dependency. The compile error is completely
disingenuous.

  GEN     lib/oid_registry_data.c
Compiling 49 OIDs
  CC      lib/oid_registry.o
gcc: error: lib/oid_registry.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
make[3]: *** [lib/oid_registry.o] Error 4

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-06 17:25:01 +10:30
David Howells f3537f91f9 ASN.1: Fix an indefinite length skip error
Fix an error in asn1_find_indefinite_length() whereby small definite length
elements of size 0x7f are incorrecly classified as non-small.  Without this
fix, an error will be given as the length of the length will be perceived as
being very much greater than the maximum supported size.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-12-05 11:27:39 +10:30
Masanari Iida e41e85cc17 treewide: Fix typos in various Kconfig
Correct spelling typo within various Kconfig.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-12-03 11:03:56 +01:00
Ingo Molnar 630e1e0bcd Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Conflicts:
	arch/x86/kernel/ptrace.c

Pull the latest RCU tree from Paul E. McKenney:

"       The major features of this series are:

  1.	A first version of no-callbacks CPUs.  This version prohibits
  	offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
  	Relaxing this constraint is in progress, but not yet ready
  	for prime time.  These commits were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/724, and are at branch rcu/nocb.

  2.	Changes to SRCU that allows statically initialized srcu_struct
  	structures.  These commits were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/296, and are at branch rcu/srcu.

  3.	Restructuring of RCU's debugfs output.  These commits were posted
  	to LKML at https://lkml.org/lkml/2012/10/30/341, and are at
  	branch rcu/tracing.

  4.	Additional CPU-hotplug/RCU improvements, posted to LKML at
  	https://lkml.org/lkml/2012/10/30/327, and are at branch rcu/hotplug.
  	Note that the commit eliminating __stop_machine() was judged to
  	be too-high of risk, so is deferred to 3.9.

  5.	Changes to RCU's idle interface, most notably a new module
  	parameter that redirects normal grace-period operations to
  	their expedited equivalents.  These were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/739, and are at branch rcu/idle.

  6.	Additional diagnostics for RCU's CPU stall warning facility,
  	posted to LKML at https://lkml.org/lkml/2012/10/30/315, and
  	are at branch rcu/stall.  The most notable change reduces the
  	default RCU CPU stall-warning time from 60 seconds to 21 seconds,
  	so that it once again happens sooner than the softlockup timeout.

  7.	Documentation updates, which were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/280, and are at branch rcu/doc.
  	A couple of late-breaking changes were posted at
  	https://lkml.org/lkml/2012/11/16/634 and
  	https://lkml.org/lkml/2012/11/16/547.

  8.	Miscellaneous fixes, which were posted to LKML at
  	https://lkml.org/lkml/2012/10/30/309, along with a late-breaking
  	change posted at Fri, 16 Nov 2012 11:26:25 -0800 with message-ID
  	<20121116192625.GA447@linux.vnet.ibm.com>, but which lkml.org
  	seems to have missed.  These are at branch rcu/fixes.

  9.	Finally, a fix for an lockdep-RCU splat was posted to LKML
  	at https://lkml.org/lkml/2012/11/7/486.  This is at rcu/next. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-12-03 06:27:05 +01:00
Bill Pemberton 610141ee65 lib: kobject_uevent is no longer dependant on CONFIG_HOTPLUG
CONFIG_HOTPLUG is being removed so kobject_uevent needs to always be
part of the library.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-28 10:52:58 -08:00
Manuel Lauss a3cea98941 MPI: Fix compilation on MIPS with GCC 4.4 and newer
Since 4.4 GCC on MIPS no longer recognizes the "h" constraint,
leading to this build failure:

  CC      lib/mpi/generic_mpih-mul1.o
lib/mpi/generic_mpih-mul1.c: In function 'mpihelp_mul_1':
lib/mpi/generic_mpih-mul1.c:50:3: error: impossible constraint in 'asm'

This patch updates MPI with the latest umul_ppm implementations for MIPS.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Cc: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: James Morris <jmorris@namei.org>
Patchwork: https://patchwork.linux-mips.org/patch/4612/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2012-11-23 18:57:17 +01:00
Shuah Khan bfe0fb0f1a dma-debug: fix to not have dependency on get_dma_ops() interface
dma-debug depends on get_dma_ops() interface. Several architectures
do not define dma_ops and get_dma_ops(). When dma debug interfaces are
used on an architecture (e.g: c6x) that doesn't define get_dmap_ops(),
compilation fails. Changing dma-debug to call dma_mapping_error() instead
of defining its own that calls get_dma_ops(), such that the internal use of
dma_mapping_error() doesn't interfere with the debug_dma_mapping_error()
interface's mapping error checks. Moving dma_mapping_error() checks in
check_unmap() under the dma debug entry not found is sufficient to fix the
problem.

Reference: https://lkml.org/lkml/2012/10/26/367

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Reported-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
2012-11-17 13:06:41 +01:00
Paul E. McKenney c896054f75 rcu: Reduce default RCU CPU stall warning timeout
The RCU CPU stall warning timeout has defaulted to 60 seconds for
some years, with almost no false positives.  This commit therefore
reduces the default to 21 seconds, slightly shorter than the new
soft-lockup timeout.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-11-13 14:09:56 -08:00
Philipp Reisner 986836503e Merge branch 'drbd-8.4_ed6' into for-3.8-drivers-drbd-8.4_ed6 2012-11-09 14:20:23 +01:00
Alexander Duyck af51a9f184 swiotlb: Do not export swiotlb_bounce since there are no external consumers
Currently swiotlb is the only consumer for swiotlb_bounce.  Since that is the
case it doesn't make much sense to be exporting it so make it a static
function only.

In addition we can save a few more lines of code by making it so that it
accepts the DMA address as a physical address instead of a virtual one.  This
is the last piece in essentially pushing all of the DMA address values to use
physical addresses in swiotlb.

In order to clarify things since we now have 2 physical addresses in use
inside of swiotlb_bounce I am renaming phys to orig_addr, and dma_addr to
tlb_addr.  This way is should be clear that orig_addr is contained within
io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:07 -04:00
Alexander Duyck fbfda893eb swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single
This change makes it so that the sync functionality also uses physical
addresses.  This helps to further reduce the use of virt_to_phys and
phys_to_virt functions.

In order to clarify things since we now have 2 physical addresses in use
inside of swiotlb_tbl_sync_single I am renaming phys to orig_addr, and
dma_addr to tlb_addr.  This way is should be clear that orig_addr is
contained within io_orig_addr and tlb_addr is an address within the
io_tlb_addr buffer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:07 -04:00
Alexander Duyck 61ca08c322 swiotlb: Use physical addresses for swiotlb_tbl_unmap_single
This change makes it so that the unmap functionality also uses physical
addresses.  This helps to further reduce the use of virt_to_phys and
phys_to_virt functions.

In order to clarify things since we now have 2 physical addresses in use
inside of swiotlb_tbl_unmap_single I am renaming phys to orig_addr, and
dma_addr to tlb_addr.  This way is should be clear that orig_addr is
contained within io_orig_addr and tlb_addr is an address within the
io_tlb_addr buffer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:07 -04:00
Alexander Duyck e05ed4d1fa swiotlb: Return physical addresses when calling swiotlb_tbl_map_single
This change makes it so that swiotlb_tbl_map_single will return a physical
address instead of a virtual address when called.  The advantage to this once
again is that we are avoiding a number of virt_to_phys and phys_to_virt
translations by working with everything as a physical address.

One change I had to make in order to support using physical addresses is that
I could no longer trust 0 to be a invalid physical address on all platforms.
So instead I made it so that ~0 is returned on error.  This should never be a
valid return value as it implies that only one byte would be available for
use.

In order to clarify things since we now have 2 physical addresses in use
inside of swiotlb_tbl_map_single I am renaming phys to orig_addr, and
dma_addr to tlb_addr.  This way is should be clear that orig_addr is
contained within io_orig_addr and tlb_addr is an address within the
io_tlb_addr buffer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:06 -04:00
Alexander Duyck ee3f6ba896 swiotlb: Make io_tlb_overflow_buffer a physical address
This change makes it so that we can avoid virt_to_phys overhead when using the
io_tlb_overflow_buffer.  My original plan was to completely remove the value
and replace it with a constant but I had seen that there were recent patches
that stated this couldn't be done until all device drivers that depended on
that functionality be updated.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:06 -04:00
Alexander Duyck ff7204a749 swiotlb: Make io_tlb_start a physical address instead of a virtual one
This change replaces all references to the virtual address for io_tlb_start
with references to the physical address io_tlb_end.  The main advantage of
replacing the virtual address with a physical address is that we can avoid
having to do multiple translations from the virtual address to the physical
one needed for testing an existing DMA address.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:06 -04:00
Alexander Duyck c40dba06e9 swiotlb: Make io_tlb_end a physical address instead of a virtual one
This change replaces all references to the virtual address for io_tlb_end
with references to the physical address io_tlb_end.  The main advantage of
replacing the virtual address with a physical address is that we can avoid
having to do multiple translations from the virtual address to the physical
one needed for testing an existing DMA address.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-30 09:32:05 -04:00
Thadeu Lima de Souza Cascardo eedce141cd genalloc: stop crashing the system when destroying a pool
The genalloc code uses the bitmap API from include/linux/bitmap.h and
lib/bitmap.c, which is based on long values.  Both bitmap_set from
lib/bitmap.c and bitmap_set_ll, which is the lockless version from
genalloc.c, use BITMAP_LAST_WORD_MASK to set the first bits in a long in
the bitmap.

That one uses (1 << bits) - 1, 0b111, if you are setting the first three
bits.  This means that the API counts from the least significant bits
(LSB from now on) to the MSB.  The LSB in the first long is bit 0, then.
The same works for the lookup functions.

The genalloc code uses longs for the bitmap, as it should.  In
include/linux/genalloc.h, struct gen_pool_chunk has unsigned long
bits[0] as its last member.  When allocating the struct, genalloc should
reserve enough space for the bitmap.  This should be a proper number of
longs that can fit the amount of bits in the bitmap.

However, genalloc allocates an integer number of bytes that fit the
amount of bits, but may not be an integer amount of longs.  9 bytes, for
example, could be allocated for 70 bits.

This is a problem in itself if the Least Significat Bit in a long is in
the byte with the largest address, which happens in Big Endian machines.
This means genalloc is not allocating the byte in which it will try to
set or check for a bit.

This may end up in memory corruption, where genalloc will try to set the
bits it has not allocated.  In fact, genalloc may not set these bits
because it may find them already set, because they were not zeroed since
they were not allocated.  And that's what causes a BUG when
gen_pool_destroy is called and check for any set bits.

What really happens is that genalloc uses kmalloc_node with __GFP_ZERO
on gen_pool_add_virt.  With SLAB and SLUB, this means the whole slab
will be cleared, not only the requested bytes.  Since struct
gen_pool_chunk has a size that is a multiple of 8, and slab sizes are
multiples of 8, we get lucky and allocate and clear the right amount of
bytes.

Hower, this is not the case with SLOB or with older code that did memset
after allocating instead of using __GFP_ZERO.

So, a simple module as this (running 3.6.0), will cause a crash when
rmmod'ed.

  [root@phantom-lp2 foo]# cat foo.c
  #include <linux/kernel.h>
  #include <linux/module.h>
  #include <linux/init.h>
  #include <linux/genalloc.h>

  MODULE_LICENSE("GPL");
  MODULE_VERSION("0.1");

  static struct gen_pool *foo_pool;

  static __init int foo_init(void)
  {
          int ret;
          foo_pool = gen_pool_create(10, -1);
          if (!foo_pool)
                  return -ENOMEM;
          ret = gen_pool_add(foo_pool, 0xa0000000, 32 << 10, -1);
          if (ret) {
                  gen_pool_destroy(foo_pool);
                  return ret;
          }
          return 0;
  }

  static __exit void foo_exit(void)
  {
          gen_pool_destroy(foo_pool);
  }

  module_init(foo_init);
  module_exit(foo_exit);
  [root@phantom-lp2 foo]# zcat /proc/config.gz | grep SLOB
  CONFIG_SLOB=y
  [root@phantom-lp2 foo]# insmod ./foo.ko
  [root@phantom-lp2 foo]# rmmod foo
  ------------[ cut here ]------------
  kernel BUG at lib/genalloc.c:243!
  cpu 0x4: Vector: 700 (Program Check) at [c0000000bb0e7960]
      pc: c0000000003cb50c: .gen_pool_destroy+0xac/0x110
      lr: c0000000003cb4fc: .gen_pool_destroy+0x9c/0x110
      sp: c0000000bb0e7be0
     msr: 8000000000029032
    current = 0xc0000000bb0e0000
    paca    = 0xc000000006d30e00   softe: 0        irq_happened: 0x01
      pid   = 13044, comm = rmmod
  kernel BUG at lib/genalloc.c:243!
  [c0000000bb0e7ca0] d000000004b00020 .foo_exit+0x20/0x38 [foo]
  [c0000000bb0e7d20] c0000000000dff98 .SyS_delete_module+0x1a8/0x290
  [c0000000bb0e7e30] c0000000000097d4 syscall_exit+0x0/0x94
  --- Exception: c00 (System Call) at 000000800753d1a0
  SP (fffd0b0e640) is in userspace

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Benjamin Gaignard <benjamin.gaignard@stericsson.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
Shuah Khan 6c9c6d6301 dma-debug: New interfaces to debug dma mapping errors
Add dma-debug interface debug_dma_mapping_error() to debug
drivers that fail to check dma mapping errors on addresses
returned by dma_map_single() and dma_map_page() interfaces.
This interface clears a flag set by debug_dma_map_page() to
indicate that dma_mapping_error() has been called by the
driver. When driver does unmap, debug_dma_unmap() checks the
flag and if this flag is still set, prints warning message
that includes call trace that leads up to the unmap. This
interface can be called from dma_mapping_error() routines to
enable dma mapping error check debugging.

Tested: Intel iommu and swiotlb (iommu=soft) on x86-64 with
        CONFIG_DMA_API_DEBUG enabled and disabled.

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-24 17:06:43 +02:00
Ming Lei fe73fbe1c5 lib/dma-debug.c: fix __hash_bucket_find()
If there is only one match, the unique matched entry should be returned.

Without the fix, the upcoming dma debug interfaces ("dma-debug: new
interfaces to debug dma mapping errors") can't work reliably because
only device and dma_addr are passed to dma_mapping_error().

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Shuah Khan <shuah.khan@hp.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-19 14:07:48 -07:00
Linus Torvalds d25282d1c9 Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
 "module signing is the highlight, but it's an all-over David Howells frenzy..."

Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
  X.509: Fix indefinite length element skip error handling
  X.509: Convert some printk calls to pr_devel
  asymmetric keys: fix printk format warning
  MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
  MODSIGN: Make mrproper should remove generated files.
  MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
  MODSIGN: Use the same digest for the autogen key sig as for the module sig
  MODSIGN: Sign modules during the build process
  MODSIGN: Provide a script for generating a key ID from an X.509 cert
  MODSIGN: Implement module signature checking
  MODSIGN: Provide module signing public keys to the kernel
  MODSIGN: Automatically generate module signing keys if missing
  MODSIGN: Provide Kconfig options
  MODSIGN: Provide gitignore and make clean rules for extra files
  MODSIGN: Add FIPS policy
  module: signature checking hook
  X.509: Add a crypto key parser for binary (DER) X.509 certificates
  MPILIB: Provide a function to read raw data into an MPI
  X.509: Add an ASN.1 decoder
  X.509: Add simple ASN.1 grammar compiler
  ...
2012-10-14 13:39:34 -07:00
Linus Torvalds 14ffe009ca Merge branch 'akpm' (Fixups from Andrew)
Merge misc fixes from Andrew Morton:
 "Followups, fixes and some random stuff I found on the internet."

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (11 patches)
  perf: fix duplicate header inclusion
  memcg, kmem: fix build error when CONFIG_INET is disabled
  rtc: kconfig: fix RTC_INTF defaults connected to RTC_CLASS
  rapidio: fix comment
  lib/kasprintf.c: use kmalloc_track_caller() to get accurate traces for kvasprintf
  rapidio: update for destination ID allocation
  rapidio: update asynchronous discovery initialization
  rapidio: use msleep in discovery wait
  mm: compaction: fix bit ranges in {get,clear,set}_pageblock_skip()
  arch/powerpc/platforms/pseries/hotplug-memory.c: section removal cleanups
  arch/powerpc/platforms/pseries/hotplug-memory.c: fix section handling code
2012-10-11 10:14:16 +09:00
Linus Torvalds ce40be7a82 Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block
Pull block IO update from Jens Axboe:
 "Core block IO bits for 3.7.  Not a huge round this time, it contains:

   - First series from Kent cleaning up and generalizing bio allocation
     and freeing.

   - WRITE_SAME support from Martin.

   - Mikulas patches to prevent O_DIRECT crashes when someone changes
     the block size of a device.

   - Make bio_split() work on data-less bio's (like trim/discards).

   - A few other minor fixups."

Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew
Morton.  It is due to the VM no longer using a prio-tree (see commit
6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree").

So make set_blocksize() use mapping_mapped() instead of open-coding the
internal VM knowledge that has changed.

* 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits)
  block: makes bio_split support bio without data
  scatterlist: refactor the sg_nents
  scatterlist: add sg_nents
  fs: fix include/percpu-rwsem.h export error
  percpu-rw-semaphore: fix documentation typos
  fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared
  blockdev: turn a rw semaphore into a percpu rw semaphore
  Fix a crash when block device is read and block size is changed at the same time
  block: fix request_queue->flags initialization
  block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
  block: ioctl to zero block ranges
  block: Make blkdev_issue_zeroout use WRITE SAME
  block: Implement support for WRITE SAME
  block: Consolidate command flag and queue limit checks for merges
  block: Clean up special command handling logic
  block/blk-tag.c: Remove useless kfree
  block: remove the duplicated setting for congestion_threshold
  block: reject invalid queue attribute values
  block: Add bio_clone_bioset(), bio_clone_kmalloc()
  block: Consolidate bio_alloc_bioset(), bio_kmalloc()
  ...
2012-10-11 09:04:23 +09:00
Ezequiel Garcia 3e1aa66bd4 lib/kasprintf.c: use kmalloc_track_caller() to get accurate traces for kvasprintf
Previously kvasprintf() allocation was being done through kmalloc(),
thus producing an inaccurate trace report.

This is a common problem: in order to get accurate callsite tracing, a
lib/utils function shouldn't allocate kmalloc but instead use
kmalloc_track_caller.

Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-11 08:50:15 +09:00
David Howells dbadc17683 X.509: Fix indefinite length element skip error handling
asn1_find_indefinite_length() returns an error indicator of -1, which the
caller asn1_ber_decoder() places in a size_t (which is usually unsigned) and
then checks to see whether it is less than 0 (which it can't be).  This can
lead to the following warning:

	lib/asn1_decoder.c:320 asn1_ber_decoder()
		warn: unsigned 'len' is never less than zero.

Instead, asn1_find_indefinite_length() update the caller's idea of the data
cursor and length separately from returning the error code.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-10 20:06:39 +10:30
Michel Lespinasse ed8ea81501 mm: add CONFIG_DEBUG_VM_RB build option
Add a CONFIG_DEBUG_VM_RB build option for the previously existing
DEBUG_MM_RB code.  Now that Andi Kleen modified it to avoid using
recursive algorithms, we can expose it a bit more.

Also extend this code to validate_mm() after stack expansion, and to check
that the vma's start and last pgoffs have not changed since the nodes were
inserted on the anon vma interval tree (as it is important that the nodes
be reindexed after each such update).

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:42 +09:00
Michel Lespinasse 9826a516ff mm: interval tree updates
Update the generic interval tree code that was introduced in "mm: replace
vma prio_tree with an interval tree".

Changes:

- fixed 'endpoing' typo noticed by Andrew Morton

- replaced include/linux/interval_tree_tmpl.h, which was used as a
  template (including it automatically defined the interval tree
  functions) with include/linux/interval_tree_generic.h, which only
  defines a preprocessor macro INTERVAL_TREE_DEFINE(), which itself
  defines the interval tree functions when invoked. Now that is a very
  long macro which is unfortunate, but it does make the usage sites
  (lib/interval_tree.c and mm/interval_tree.c) a bit nicer than previously.

- make use of RB_DECLARE_CALLBACKS() in the INTERVAL_TREE_DEFINE() macro,
  instead of duplicating that code in the interval tree template.

- replaced vma_interval_tree_add(), which was actually handling the
  nonlinear and interval tree cases, with vma_interval_tree_insert_after()
  which handles only the interval tree case and has an API that is more
  consistent with the other interval tree handling functions.
  The nonlinear case is now handled explicitly in kernel/fork.c dup_mmap().

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:40 +09:00
Michel Lespinasse 9c079add0d rbtree: move augmented rbtree functionality to rbtree_augmented.h
Provide rb_insert_augmented() and rb_erase_augmented() through a new
rbtree_augmented.h include file.  rb_erase_augmented() is defined there as
an __always_inline function, in order to allow inlining of augmented
rbtree callbacks into it.  Since this generates a relatively large
function, each augmented rbtree user should make sure to have a single
call site.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:40 +09:00
Michel Lespinasse 147e615f83 prio_tree: remove
After both prio_tree users have been converted to use red-black trees,
there is no need to keep around the prio tree library anymore.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:40 +09:00
Michel Lespinasse 6b2dbba8b6 mm: replace vma prio_tree with an interval tree
Implement an interval tree as a replacement for the VMA prio_tree.  The
algorithms are similar to lib/interval_tree.c; however that code can't be
directly reused as the interval endpoints are not explicitly stored in the
VMA.  So instead, the common algorithm is moved into a template and the
details (node type, how to get interval endpoints from the node, etc) are
filled in using the C preprocessor.

Once the interval tree functions are available, using them as a
replacement to the VMA prio tree is a relatively simple, mechanical job.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:39 +09:00
Michel Lespinasse fff3fd8a12 rbtree: add prio tree and interval tree tests
Patch 1 implements support for interval trees, on top of the augmented
rbtree API. It also adds synthetic tests to compare the performance of
interval trees vs prio trees. Short answers is that interval trees are
slightly faster (~25%) on insert/erase, and much faster (~2.4 - 3x)
on search. It is debatable how realistic the synthetic test is, and I have
not made such measurements yet, but my impression is that interval trees
would still come out faster.

Patch 2 uses a preprocessor template to make the interval tree generic,
and uses it as a replacement for the vma prio_tree.

Patch 3 takes the other prio_tree user, kmemleak, and converts it to use
a basic rbtree. We don't actually need the augmented rbtree support here
because the intervals are always non-overlapping.

Patch 4 removes the now-unused prio tree library.

Patch 5 proposes an additional optimization to rb_erase_augmented, now
providing it as an inline function so that the augmented callbacks can be
inlined in. This provides an additional 5-10% performance improvement
for the interval tree insert/erase benchmark. There is a maintainance cost
as it exposes augmented rbtree users to some of the rbtree library internals;
however I think this cost shouldn't be too high as I expect the augmented
rbtree will always have much less users than the base rbtree.

I should probably add a quick summary of why I think it makes sense to
replace prio trees with augmented rbtree based interval trees now.  One of
the drivers is that we need augmented rbtrees for Rik's vma gap finding
code, and once you have them, it just makes sense to use them for interval
trees as well, as this is the simpler and more well known algorithm.  prio
trees, in comparison, seem *too* clever: they impose an additional 'heap'
constraint on the tree, which they use to guarantee a faster worst-case
complexity of O(k+log N) for stabbing queries in a well-balanced prio
tree, vs O(k*log N) for interval trees (where k=number of matches,
N=number of intervals).  Now this sounds great, but in practice prio trees
don't realize this theorical benefit.  First, the additional constraint
makes them harder to update, so that the kernel implementation has to
simplify things by balancing them like a radix tree, which is not always
ideal.  Second, the fact that there are both index and heap properties
makes both tree manipulation and search more complex, which results in a
higher multiplicative time constant.  As it turns out, the simple interval
tree algorithm ends up running faster than the more clever prio tree.

This patch:

Add two test modules:

- prio_tree_test measures the performance of lib/prio_tree.c, both for
  insertion/removal and for stabbing searches

- interval_tree_test measures the performance of a library of equivalent
  functionality, built using the augmented rbtree support.

In order to support the second test module, lib/interval_tree.c is
introduced. It is kept separate from the interval_tree_test main file
for two reasons: first we don't want to provide an unfair advantage
over prio_tree_test by having everything in a single compilation unit,
and second there is the possibility that the interval tree functionality
could get some non-test users in kernel over time.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:39 +09:00
Michel Lespinasse 3908836aa7 rbtree: add RB_DECLARE_CALLBACKS() macro
As proposed by Peter Zijlstra, this makes it easier to define the augmented
rbtree callbacks.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:38 +09:00
Michel Lespinasse 9d9e6f9703 rbtree: remove prior augmented rbtree implementation
convert arch/x86/mm/pat_rbtree.c to the proposed augmented rbtree api
and remove the old augmented rbtree implementation.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:38 +09:00
Michel Lespinasse 14b94af0b2 rbtree: faster augmented rbtree manipulation
Introduce new augmented rbtree APIs that allow minimal recalculation of
augmented node information.

A new callback is added to the rbtree insertion and erase rebalancing
functions, to be called on each tree rotations. Such rotations preserve
the subtree's root augmented value, but require recalculation of the one
child that was previously located at the subtree root.

In the insertion case, the handcoded search phase must be updated to
maintain the augmented information on insertion, and then the rbtree
coloring/rebalancing algorithms keep it up to date.

In the erase case, things are more complicated since it is library
code that manipulates the rbtree in order to remove internal nodes.
This requires a couple additional callbacks to copy a subtree's
augmented value when a new root is stitched in, and to recompute
augmented values down the ancestry path when a node is removed from
the tree.

In order to preserve maximum speed for the non-augmented case,
we provide two versions of each tree manipulation function.
rb_insert_augmented() is the augmented equivalent of rb_insert_color(),
and rb_erase_augmented() is the augmented equivalent of rb_erase().

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:37 +09:00
Michel Lespinasse dadf93534f rbtree: augmented rbtree test
Small test to measure the performance of augmented rbtrees.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:37 +09:00
Michel Lespinasse 4f035ad67f rbtree: low level optimizations in rb_erase()
Various minor optimizations in rb_erase():
- Avoid multiple loading of node->__rb_parent_color when computing parent
  and color information (possibly not in close sequence, as there might
  be further branches in the algorithm)
- In the 1-child subcase of case 1, copy the __rb_parent_color field from
  the erased node to the child instead of recomputing it from the desired
  parent and color
- When searching for the erased node's successor, differentiate between
  cases 2 and 3 based on whether any left links were followed. This avoids
  a condition later down.
- In case 3, keep a pointer to the erased node's right child so we don't
  have to refetch it later to adjust its parent.
- In the no-childs subcase of cases 2 and 3, place the rebalance assigment
  last so that the compiler can remove the following if(rebalance) test.

Also, added some comments to illustrate cases 2 and 3.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:37 +09:00
Michel Lespinasse 46b6135a74 rbtree: handle 1-child recoloring in rb_erase() instead of rb_erase_color()
An interesting observation for rb_erase() is that when a node has
exactly one child, the node must be black and the child must be red.
An interesting consequence is that removing such a node can be done by
simply replacing it with its child and making the child black,
which we can do efficiently in rb_erase(). __rb_erase_color() then
only needs to handle the no-childs case and can be modified accordingly.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:37 +09:00
Michel Lespinasse 60670b8034 rbtree: place easiest case first in rb_erase()
In rb_erase, move the easy case (node to erase has no more than
1 child) first. I feel the code reads easier that way.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:36 +09:00
Michel Lespinasse 7abc704ae3 rbtree: add __rb_change_child() helper function
Add __rb_change_child() as an inline helper function to replace code that
would otherwise be duplicated 4 times in the source.

No changes to binary size or speed.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:36 +09:00
Michel Lespinasse 28d7530928 rbtree test: fix sparse warning about 64-bit constant
Just a small fix to make sparse happy.

Signed-off-by: Michel Lespinasse <walken@google.com>
Reported-by: Fengguang Wu <wfg@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:36 +09:00
Michel Lespinasse 59633abf34 rbtree: optimize fetching of sibling node
When looking to fetch a node's sibling, we went through a sequence of:
- check if node is the parent's left child
- if it is, then fetch the parent's right child

This can be replaced with:
- fetch the parent's right child as an assumed sibling
- check that node is NOT the fetched child

This avoids fetching the parent's left child when node is actually
that child. Saves a bit on code size, though it doesn't seem to make
a large difference in speed.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:35 +09:00
Michel Lespinasse 7ce6ff9e5d rbtree: coding style adjustments
Set comment and indentation style to be consistent with linux coding style
and the rest of the file, as suggested by Peter Zijlstra

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:35 +09:00
Michel Lespinasse 6280d2356f rbtree: low level optimizations in __rb_erase_color()
In __rb_erase_color(), we often already have pointers to the nodes being
rotated and/or know what their colors must be, so we can generate more
efficient code than the generic __rb_rotate_left() and __rb_rotate_right()
functions.

Also when the current node is red or when flipping the sibling's color,
the parent is already known so we can use the more efficient
rb_set_parent_color() function to set the desired color.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:35 +09:00
Michel Lespinasse e125d1471a rbtree: optimize case selection logic in __rb_erase_color()
In __rb_erase_color(), we have to select one of 3 cases depending on the
color on the 'other' node children.  If both children are black, we flip a
few node colors and iterate.  Otherwise, we do either one or two tree
rotations, depending on the color of the 'other' child opposite to 'node',
and then we are done.

The corresponding logic had duplicate checks for the color of the 'other'
child opposite to 'node'.  It was checking it first to determine if both
children are black, and then to determine how many tree rotations are
required.  Rearrange the logic to avoid that extra check.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:34 +09:00
Michel Lespinasse d6ff127392 rbtree: adjust node color in __rb_erase_color() only when necessary
In __rb_erase_color(), we were always setting a node to black after
exiting the main loop.  And in one case, after fixing up the tree to
satisfy all rbtree invariants, we were setting the current node to root
just to guarantee a loop exit, at which point the root would be set to
black.  However this is not necessary, as the root of an rbtree is already
known to be black.  The only case where the color flip is required is when
we exit the loop due to the current node being red, and it's easiest to
just do the flip at that point instead of doing it after the loop.

[adrian.hunter@intel.com: perf tools: fix build for another rbtree.c change]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:34 +09:00
Michel Lespinasse 5bc9188aa2 rbtree: low level optimizations in rb_insert_color()
- Use the newly introduced rb_set_parent_color() function to flip the color
  of nodes whose parent is already known.
- Optimize rb_parent() when the node is known to be red - there is no need
  to mask out the color in that case.
- Flipping gparent's color to red requires us to fetch its rb_parent_color
  field, so we can reuse it as the parent value for the next loop iteration.
- Do not use __rb_rotate_left() and __rb_rotate_right() to handle tree
  rotations: we already have pointers to all relevant nodes, and know their
  colors (either because we want to adjust it, or because we've tested it,
  or we can deduce it as black due to the node proximity to a known red node).
  So we can generate more efficient code by making use of the node pointers
  we already have, and setting both the parent and color attributes for
  nodes all at once. Also in Case 2, some node attributes don't have to
  be set because we know another tree rotation (Case 3) will always follow
  and override them.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:34 +09:00
Michel Lespinasse 6d58452dc0 rbtree: adjust root color in rb_insert_color() only when necessary
The root node of an rbtree must always be black.  However,
rb_insert_color() only needs to maintain this invariant when it has been
broken - that is, when it exits the loop due to the current (red) node
being the root.  In all other cases (exiting after tree rotations, or
exiting due to an existing black parent) the invariant is already
satisfied, so there is no need to adjust the root node color.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:33 +09:00
Michel Lespinasse 1f0528653e rbtree: break out of rb_insert_color loop after tree rotation
It is a well known property of rbtrees that insertion never requires more
than two tree rotations.  In our implementation, after one loop iteration
identified one or two necessary tree rotations, we would iterate and look
for more.  However at that point the node's parent would always be black,
which would cause us to exit the loop.

We can make the code flow more obvious by just adding a break statement
after the tree rotations, where we know we are done.  Additionally, in the
cases where two tree rotations are necessary, we don't have to update the
'node' pointer as it wouldn't be used until the next loop iteration, which
we now avoid due to this break statement.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:33 +09:00
Michel Lespinasse 910a742d4b rbtree: performance and correctness test
This small module helps measure the performance of rbtree insert and
erase.

Additionally, we run a few correctness tests to check that the rbtrees
have all desired properties:

- contains the right number of nodes in the order desired,
- never two consecutive red nodes on any path,
- all paths to leaf nodes have the same number of black nodes,
- root node is black

[akpm@linux-foundation.org: fix printk warning: sparc64 cycles_t is unsigned long]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:33 +09:00
Michel Lespinasse bf7ad8eeab rbtree: move some implementation details from rbtree.h to rbtree.c
rbtree users must use the documented APIs to manipulate the tree
structure.  Low-level helpers to manipulate node colors and parenthood are
not part of that API, so move them to lib/rbtree.c

[dwmw2@infradead.org: fix jffs2 build issue due to renamed __rb_parent_color field]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:32 +09:00
Michel Lespinasse 4c199a93a2 rbtree: empty nodes have no color
Empty nodes have no color.  We can make use of this property to simplify
the code emitted by the RB_EMPTY_NODE and RB_CLEAR_NODE macros.  Also,
we can get rid of the rb_init_node function which had been introduced by
commit 88d19cf379 ("timers: Add rb_init_node() to allow for stack
allocated rb nodes") to avoid some issue with the empty node's color not
being initialized.

I'm not sure what the RB_EMPTY_NODE checks in rb_prev() / rb_next() are
doing there, though.  axboe introduced them in commit 10fd48f237
("rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev").  The way I
see it, the 'empty node' abstraction is only used by rbtree users to
flag nodes that they haven't inserted in any rbtree, so asking the
predecessor or successor of such nodes doesn't make any sense.

One final rb_init_node() caller was recently added in sysctl code to
implement faster sysctl name lookups.  This code doesn't make use of
RB_EMPTY_NODE at all, and from what I could see it only called
rb_init_node() under the mistaken assumption that such initialization was
required before node insertion.

[sfr@canb.auug.org.au: fix net/ceph/osd_client.c build]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:32 +09:00
Catalin Marinas 9b2a60c484 Kconfig: clean up the long arch list for the DEBUG_BUGVERBOSE config option
Introduce HAVE_DEBUG_BUGVERBOSE config option and select it in
corresponding architecture Kconfig files.  Architectures that already
select GENERIC_BUG don't need to select HAVE_DEBUG_BUGVERBOSE.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:14 +09:00
Catalin Marinas b69ec42b1b Kconfig: clean up the long arch list for the DEBUG_KMEMLEAK config option
Introduce HAVE_DEBUG_KMEMLEAK config option and select it in corresponding
architecture Kconfig files.  DEBUG_KMEMLEAK now only depends on
HAVE_DEBUG_KMEMLEAK.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:14 +09:00
David Howells e104599294 MPILIB: Provide a function to read raw data into an MPI
Provide a function to read raw data of a predetermined size into an MPI rather
than expecting the size to be encoded within the data.  The data is assumed to
represent an unsigned integer, and the resulting MPI will be positive.

The function looks like this:

	MPI mpi_read_raw_data(const void *, size_t);

This is useful for reading ASN.1 integer primitives where the length is encoded
in the ASN.1 metadata.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:21 +10:30
David Howells 42d5ec27f8 X.509: Add an ASN.1 decoder
Add an ASN.1 BER/DER/CER decoder.  This uses the bytecode from the ASN.1
compiler in the previous patch to inform it as to what to expect to find in the
encoded byte stream.  The output from the compiler also tells it what functions
to call on what tags, thus allowing the caller to retrieve information.

The decoder is called as follows:

	int asn1_decoder(const struct asn1_decoder *decoder,
			 void *context,
			 const unsigned char *data,
			 size_t datalen);

The decoder argument points to the bytecode from the ASN.1 compiler.  context
is the caller's context and is passed to the action functions.  data and
datalen define the byte stream to be decoded.

Note that the decoder is currently limited to datalen being less than 64K.
This reduces the amount of stack space used by the decoder because ASN.1 is a
nested construct.  Similarly, the decoder is limited to a maximum of 10 levels
of constructed data outside of a leaf node also in an effort to keep stack
usage down.

These restrictions can be raised if necessary.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:20 +10:30
David Howells 4f73175d03 X.509: Add utility functions to render OIDs as strings
Add a pair of utility functions to render OIDs as strings.  The first takes an
encoded OID and turns it into a "a.b.c.d" form string:

	int sprint_oid(const void *data, size_t datasize,
		       char *buffer, size_t bufsize);

The second takes an OID enum index and calls the first on the data held
therein:

	int sprint_OID(enum OID oid, char *buffer, size_t bufsize);

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:18 +10:30
David Howells a77ad6ea0b X.509: Implement simple static OID registry
Implement a simple static OID registry that allows the mapping of an encoded
OID to an enum value for ease of use.

The OID registry index enum appears in the:

	linux/oid_registry.h

header file.  A script generates the registry from lines in the header file
that look like:

	<sp*>OID_foo,<sp*>/*<sp*>1.2.3.4<sp*>*/

The actual OID is taken to be represented by the numbers with interpolated
dots in the comment.

All other lines in the header are ignored.

The registry is queries by calling:

	OID look_up_oid(const void *data, size_t datasize);

This returns a number from the registry enum representing the OID if found or
OID__NR if not.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:18 +10:30
David Howells 12f008b6dc MPILIB: Reinstate mpi_cmp[_ui]() and export for RSA signature verification
Reinstate and export mpi_cmp() and mpi_cmp_ui() from the MPI library for use by
RSA signature verification as per RFC3447 section 5.2.2 step 1.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:15 +10:30
David Howells aacf29bf1b MPILIB: Provide count_leading/trailing_zeros() based on arch functions
Provide count_leading/trailing_zeros() macros based on extant arch bit scanning
functions rather than reimplementing from scratch in MPILIB.

Whilst we're at it, turn count_foo_zeros(n, x) into n = count_foo_zeros(x).

Also move the definition to asm-generic as other people may be interested in
using it.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
Cc: Arnd Bergmann <arnd@arndb.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-10-08 13:50:11 +10:30
Linus Torvalds c0703c12ef IOMMU Updates for Linux v3.7-rc1
This time the IOMMU updates contain a bunch of fixes and cleanups to
 various IOMMU drivers and the DMA debug code. New features are the
 code for IRQ remapping support with the AMD IOMMU (preperation for that
 was already merged in the last release) and a debugfs interface to
 export some statistics in the NVidia Tegra IOMMU driver.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIbBAABAgAGBQJQbvi2AAoJECvwRC2XARrjIOcP92esm3auz49N2+rq5TAto6hX
 4MbpUFgkpKQRdFUSgYOsuzjh+I46ZpWEVajCxM+9oJz4I3PalDH0hvN7VkcZTjvR
 bENRaLOZNR39rtIPrNVlne2RX7DXjKxJF5+K7jD6uMGJdItPfXk1Vo0VrLw30+XD
 E3pWNFCS+qJkWl7d0UGUJ6CzS1BZuHZdAx+XdHTa9DIV88jANiUefIV35cAVuqrL
 xVDDDMs7ITIYn6vTpEj7/zXCHNcFS7Ki4FQxAQ4DFvpPdrvLtPMeXnVSBMg0LPpe
 Gb2r8tjywo83yjZMoDxca6w20SJPkQHoWwlansGlg5j0KwqPhvD1M/OXwzbq6avd
 fqL69kZTXttLY+qtnlM5YfAqySXevGmGusgmqEXT7GvbAH6dSqMDdPU2W6lt4zBd
 80LyAdcwP5X5hT91KiOfGGpFY4np97dUC7QFbwAdgDO7t6R+c2OfSte6azzWkYm4
 2eYCsIKy+5U6XLSNabp7QNAxnic3mt6/AVMydZsVLyjeQZ8ybCwslQjPxSNwQrxc
 bx8z8KFY37UuFUs77dYFIChM8BJP6erKL0W1BOt0Ve2PWgeOaEl3y5n3mwBe+GXa
 JU9SPqy0juNZIYJWU1ZE/90ULo6ZpB3wM7QpMgX1pJ7KTd1gYDjSXhAEJi5Eje8r
 GNHlYwjxsQZQcvR19g8=
 =Q5kj
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "This time the IOMMU updates contain a bunch of fixes and cleanups to
  various IOMMU drivers and the DMA debug code.  New features are the
  code for IRQ remapping support with the AMD IOMMU (preperation for
  that was already merged in the last release) and a debugfs interface
  to export some statistics in the NVidia Tegra IOMMU driver."

* tag 'iommu-updates-v3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (27 commits)
  iommu/amd: Remove obsolete comment line
  dma-debug: Remove local BUS_NOTIFY_UNBOUND_DRIVER define
  iommu/amd: Fix possible use after free in get_irq_table()
  iommu/amd: Report irq remapping through IOMMU-API
  iommu/amd: Print message to system log when irq remapping is enabled
  iommu/irq: Use amd_iommu_irq_ops if supported
  iommu/amd: Make sure irq remapping still works on dma init failure
  iommu/amd: Add initialization routines for AMD interrupt remapping
  iommu/amd: Add call-back routine for HPET MSI
  iommu/amd: Implement MSI routines for interrupt remapping
  iommu/amd: Add IOAPIC remapping routines
  iommu/amd: Add routines to manage irq remapping tables
  iommu/amd: Add IRTE invalidation routine
  iommu/amd: Make sure IOMMU is not considered to translate itself
  iommu/amd: Split device table initialization into irq and dma part
  iommu/amd: Check if IOAPIC information is correct
  iommu/amd: Allocate data structures to keep track of irq remapping tables
  iommu/amd: Add slab-cache for irq remapping tables
  iommu/amd: Keep track of HPET and IOAPIC device ids
  iommu/amd: Fix features reporting
  ...
2012-10-08 06:33:44 +09:00
Hein Tibosch 33e2a4227d lib/decompress.c add __init to decompress_method and data
Fix the warning:

  WARNING: vmlinux.o(.text+0x14cfd8): Section mismatch in reference from the variable compressed_formats to the function .init.text:gunzip()
  The function compressed_formats() references
  the function __init gunzip().
  etc..

Within decompress.c, compressed_formats[] needs 'a __initdata annotation',
because some of it's data members refer to functions which will be
unloaded after init.

Consequently, its user decompress_method() will get the __init prefix.

Signed-off-by: Hein Tibosch <hein_tibosch@yahoo.es>
Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:05:32 +09:00
Tejun Heo 8290e2d2dc scatterlist: atomic sg_mapping_iter() no longer needs disabled IRQs
SG mapping iterator w/ SG_MITER_ATOMIC set required IRQ disabled because
it originally used KM_BIO_SRC_IRQ to allow use from IRQ handlers.
kmap_atomic() has long been updated to handle stacking atomic mapping
requests on per-cpu basis and only requires not sleeping while mapped.

Update sg_mapping_iter such that atomic iterators only require disabling
preemption instead of disabling IRQ.

While at it, convert wte weird @ARG@ notations to @ARG in the comment of
sg_miter_start().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:58 +09:00
Borislav Petkov 17d7aac9a5 lib/plist.c: make plist test announcements KERN_DEBUG
They show up in dmesg

[    4.041094] start plist test
[    4.045804] end plist test

without a lot of meaning so hide them behind debug loglevel.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:58 +09:00
Jan Beulich da99075c1d lib/vsprintf.c: improve standard conformance of sscanf()
Xen's pciback points out a couple of deficiencies with vsscanf()'s
standard conformance:

- Trailing character matching cannot be checked by the caller: With a
  format string of "(%x:%x.%x) %n" absence of the closing parenthesis
  cannot be checked, as input of "(00:00.0)" doesn't cause the %n to be
  evaluated (because of the code not skipping white space before the
  trailing %n).

- The parameter corresponding to a trailing %n could get filled even if
  there was a matching error: With a format string of "(%x:%x.%x)%n",
  input of "(00:00.0]" would still fill the respective variable pointed to
  (and hence again make the mismatch non-detectable by the caller).

This patch aims at fixing those, but leaves other non-conforming aspects
of it untouched, among them these possibly relevant ones:

- improper handling of the assignment suppression character '*' (blindly
  discarding all succeeding non-white space from the format and input
  strings),

- not honoring conversion specifiers for %n, - not recognizing the C99
  conversion specifier 't' (recognized by vsprintf()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:58 +09:00
Vikram Mulukutla 214f766ea0 lib/spinlock_debug: avoid livelock in do_raw_spin_lock()
The logic in do_raw_spin_lock() attempts to acquire a spinlock by invoking
arch_spin_trylock() in a loop with a delay between each attempt.  Now
consider the following situation in a 2 CPU system:

1. CPU-0 continually acquires and releases a spinlock in a
   tight loop; it stays in this loop until some condition X
   is satisfied. X can only be satisfied by another CPU.

2. CPU-1 tries to acquire the same spinlock, in an attempt
   to satisfy the aforementioned condition X. However, it
   never sees the unlocked value of the lock because the
   debug spinlock code uses trylock instead of just lock;
   it checks at all the wrong moments - whenever CPU-0 has
   locked the lock.

Now in the absence of debug spinlocks, the architecture specific spinlock
code can correctly allow CPU-1 to wait in a "queue" (e.g., ticket
spinlocks), ensuring that it acquires the lock at some point.  However,
with the debug spinlock code, livelock can easily occur due to the use of
try_lock, which obviously cannot put the CPU in that "queue".  This
queueing mechanism is implemented in both x86 and ARM spinlock code.

Note that the situation mentioned above is not hypothetical.  A real
problem was encountered where CPU-0 was running hrtimer_cancel with
interrupts disabled, and CPU-1 was attempting to run the hrtimer that
CPU-0 was trying to cancel.

Address this by actually attempting arch_spin_lock once it is suspected
that there is a spinlock lockup.  If we're in a situation that is
described above, the arch_spin_lock should succeed; otherwise other
timeout mechanisms (e.g., watchdog) should alert the system of a lockup.
Therefore, if there is a genuine system problem and the spinlock can't be
acquired, the end result (irrespective of this change being present) is
the same.  If there is a livelock caused by the debug code, this change
will allow the lock to be acquired, depending on the implementation of the
lower level arch specific spinlock code.

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:57 +09:00
Benjamin Gaignard ca279cf106 genalloc: make it possible to use a custom allocation algorithm
Premit use of another algorithm than the default first-fit one.  For
example a custom algorithm could be used to manage alignment requirements.

As I can't predict all the possible requirements/needs for all allocation
uses cases, I add a "free" field 'void *data' to pass any needed
information to the allocation function.  For example 'data' could be used
to handle a structure where you store the alignment, the expected memory
bank, the requester device, or any information that could influence the
allocation algorithm.

An usage example may look like this:
struct my_pool_constraints {
	int align;
	int bank;
	...
};

unsigned long my_custom_algo(unsigned long *map, unsigned long size,
		unsigned long start, unsigned int nr, void *data)
{
	struct my_pool_constraints *constraints = data;
	...
	deal with allocation contraints
	...
	return the index in bitmap where perform the allocation
}

void create_my_pool()
{
	struct my_pool_constraints c;
	struct gen_pool *pool = gen_pool_create(...);
	gen_pool_add(pool, ...);
	gen_pool_set_algo(pool, my_custom_algo, &c);
}

Add of best-fit algorithm function:
most of the time best-fit is slower then first-fit but memory fragmentation
is lower. The random buffer allocation/free tests don't show any arithmetic
relation between the allocation time and fragmentation but the
best-fit algorithm
is sometime able to perform the allocation when the first-fit can't.

This new algorithm help to remove static allocations on ESRAM, a small but
fast on-chip RAM of few KB, used for high-performance uses cases like DMA
linked lists, graphic accelerators, encoders/decoders. On the Ux500
(in the ARM tree) we have define 5 ESRAM banks of 128 KB each and use of
static allocations becomes unmaintainable:
cd arch/arm/mach-ux500 && grep -r ESRAM .
./include/mach/db8500-regs.h:/* Base address and bank offsets for ESRAM */
./include/mach/db8500-regs.h:#define U8500_ESRAM_BASE   0x40000000
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK_SIZE      0x00020000
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK0  U8500_ESRAM_BASE
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK1       (U8500_ESRAM_BASE + U8500_ESRAM_BANK_SIZE)
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK2       (U8500_ESRAM_BANK1 + U8500_ESRAM_BANK_SIZE)
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK3       (U8500_ESRAM_BANK2 + U8500_ESRAM_BANK_SIZE)
./include/mach/db8500-regs.h:#define U8500_ESRAM_BANK4       (U8500_ESRAM_BANK3 + U8500_ESRAM_BANK_SIZE)
./include/mach/db8500-regs.h:#define U8500_ESRAM_DMA_LCPA_OFFSET     0x10000
./include/mach/db8500-regs.h:#define U8500_DMA_LCPA_BASE
(U8500_ESRAM_BANK0 + U8500_ESRAM_DMA_LCPA_OFFSET)
./include/mach/db8500-regs.h:#define U8500_DMA_LCLA_BASE U8500_ESRAM_BANK4

I want to use genalloc to do dynamic allocations but I need to be able to
fine tune the allocation algorithm. I my case best-fit algorithm give
better results than first-fit, but it will not be true for every use case.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@stericsson.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:57 +09:00
Davidlohr Bueso e96875677f lib/gcd.c: prevent possible div by 0
Account for all properties when a and/or b are 0:
gcd(0, 0) = 0
gcd(a, 0) = a
gcd(0, b) = b

Fixes no known problems in current kernels.

Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:57 +09:00
Jan Beulich 8f1f66ed7e lib/Kconfig.debug: adjust hard-lockup related Kconfig options
The main option should not appear in the resulting .config when the
dependencies aren't met (i.e.  use "depends on" rather than directly
setting the default from the combined dependency values).

The sub-options should depend on the main option rather than a more
generic higher level one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:57 +09:00
Alex Elder 77dd3b0bd1 lib/parser.c: avoid overflow in match_number()
The result of converting an integer value to another signed integer type
that's unable to represent the original value is implementation defined.
(See notes in section 6.3.1.3 of the C standard.)

In match_number(), the result of simple_strtol() (which returns type long)
is assigned to a value of type int.

Instead, handle the result of simple_strtol() in a well-defined way, and
return -ERANGE if the result won't fit in the int variable used to hold
the parsed result.

No current callers pay attention to the particular error value returned,
so this additional return code shouldn't do any harm.

[akpm@linux-foundation.org: coding-style tweaks]
Signed-off-by: Alex Elder <elder@inktank.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:56 +09:00
Fengguang Wu 125c4c706b idr: rename MAX_LEVEL to MAX_IDR_LEVEL
To avoid name conflicts:

  drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined

While at it, also make the other names more consistent and add
parentheses.

[akpm@linux-foundation.org: repair fallout]
[sfr@canb.auug.org.au: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change]
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: walter harms <wharms@bfs.de>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:56 +09:00
Andy Shevchenko 7c59154e75 lib/vsprintf: update documentation to cover all of %p[Mm][FR]
Acked-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:50 +09:00
George Spelvin f40005165f lib: vsprintf: fix broken comments
Numbering the 8 potential digits 2 though 9 never did make a lot of sense.

Signed-off-by: George Spelvin <linux@horizon.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:49 +09:00
George Spelvin cb239d0a97 lib: vsprintf: optimize put_dec_trunc8()
If you're going to have a conditional branch after each 32x32->64-bit
multiply, might as well shrink the code and make it a loop.

This also avoids using the long multiply for small integers.

(This leaves the comments in a confusing state, but that's a separate
patch to make review easier.)

Signed-off-by: George Spelvin <linux@horizon.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Rabin Vincent <rabin@rab.in>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:49 +09:00
George Spelvin 2359172a75 lib: vsprintf: optimize division by 10000
The same multiply-by-inverse technique can be used to convert division by
10000 to a 32x32->64-bit multiply.

Signed-off-by: George Spelvin <linux@horizon.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:48 +09:00
George Spelvin e49317d415 lib: vsprintf: optimize division by 10 for small integers
Shrink the reciprocal approximations used in put_dec_full4() based on the
comments in put_dec_full9().

Signed-off-by: George Spelvin <linux@horizon.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:48 +09:00
Joe Mario 8f243af42a sections: fix const sections for crc32 table
Fix the const sections for the code generated by crc32 table.  There's
no ro version of the cacheline aligned section, so we cannot put in
const data without a conflict Just don't make the crc tables const for
now.

[ak@linux.intel.com: some fixes and new description]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:46 +09:00
Linus Torvalds 56d92aa5cf Features:
* When hotplugging PCI devices in a PV guest we can allocate Xen-SWIOTLB later.
  * Cleanup Xen SWIOTLB.
  * Support pages out grants from HVM domains in the backends.
  * Support wild cards in xen-pciback.hide=(BDF) arguments.
  * Update grant status updates with upstream hypervisor.
  * Boot PV guests with more than 128GB.
  * Cleanup Xen MMU code/add comments.
  * Obtain XENVERS using a preferred method.
  * Lay out generic changes to support Xen ARM.
  * Allow privcmd ioctl for HVM (used to do only PV).
  * Do v2 of mmap_batch for privcmd ioctls.
  * If hypervisor saves the LED keyboard light - we will now instruct the kernel
    about its state.
 Fixes:
  * More fixes to Xen PCI backend for various calls/FLR/etc.
  * With more than 4GB in a 64-bit PV guest disable native SWIOTLB.
  * Fix up smatch warnings.
  * Fix up various return values in privmcmd and mm.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQaY8qAAoJEFjIrFwIi8fJwPMH+gKngf4vSqrHjw+V2nsmeYaw
 zrhRQrm3xV4BNR7yQHs+InDst/AJRAr0GjuReDK4BqDEzUfcFKvzalspdMGGqf+W
 MUp+pMdN2S6649r/KMFfPCYcQvmIkFu8l76aClAqfA77SZRv1VL2Gn9eBxd82jS0
 sWAUu5ichDSdfm/vAKXhdvhlKsK0hmihEbCM3+wRBoXEJX0kKbhEGn82smaLqkEt
 uxWDJBT4nyYqbm6KVXQJ/WYCaWEmEImGSDb9J1WeqftGEn1Q55mpknvElkpNPE1b
 Ifayqk50Kt43qnLk/AUrm8KFFlNKb73wTyAb0hVw7SQDcw1AcLa8ZdohLIZOl/4=
 =prMY
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.7-x86-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen update from Konrad Rzeszutek Wilk:
 "Features:
   - When hotplugging PCI devices in a PV guest we can allocate
     Xen-SWIOTLB later.
   - Cleanup Xen SWIOTLB.
   - Support pages out grants from HVM domains in the backends.
   - Support wild cards in xen-pciback.hide=(BDF) arguments.
   - Update grant status updates with upstream hypervisor.
   - Boot PV guests with more than 128GB.
   - Cleanup Xen MMU code/add comments.
   - Obtain XENVERS using a preferred method.
   - Lay out generic changes to support Xen ARM.
   - Allow privcmd ioctl for HVM (used to do only PV).
   - Do v2 of mmap_batch for privcmd ioctls.
   - If hypervisor saves the LED keyboard light - we will now instruct
     the kernel about its state.
  Fixes:
   - More fixes to Xen PCI backend for various calls/FLR/etc.
   - With more than 4GB in a 64-bit PV guest disable native SWIOTLB.
   - Fix up smatch warnings.
   - Fix up various return values in privmcmd and mm."

* tag 'stable/for-linus-3.7-x86-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (48 commits)
  xen/pciback: Restore the PCI config space after an FLR.
  xen-pciback: properly clean up after calling pcistub_device_find()
  xen/vga: add the xen EFI video mode support
  xen/x86: retrieve keyboard shift status flags from hypervisor.
  xen/gndev: Xen backend support for paged out grant targets V4.
  xen-pciback: support wild cards in slot specifications
  xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer.
  xen/swiotlb: Remove functions not needed anymore.
  xen/pcifront: Use Xen-SWIOTLB when initting if required.
  xen/swiotlb: For early initialization, return zero on success.
  xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used.
  xen/swiotlb: Move the error strings to its own function.
  xen/swiotlb: Move the nr_tbl determination in its own function.
  xen/arm: compile and run xenbus
  xen: resynchronise grant table status codes with upstream
  xen/privcmd: return -EFAULT on error
  xen/privcmd: Fix mmap batch ioctl error status copy back.
  xen/privcmd: add PRIVCMD_MMAPBATCH_V2 ioctl
  xen/mm: return more precise error from xen_remap_domain_range()
  xen/mmu: If the revector fails, don't attempt to revector anything else.
  ...
2012-10-02 22:09:10 -07:00
Linus Torvalds aecdc33e11 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking changes from David Miller:

 1) GRE now works over ipv6, from Dmitry Kozlov.

 2) Make SCTP more network namespace aware, from Eric Biederman.

 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko.

 4) Make openvswitch network namespace aware, from Pravin B Shelar.

 5) IPV6 NAT implementation, from Patrick McHardy.

 6) Server side support for TCP Fast Open, from Jerry Chu and others.

 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel
    Borkmann.

 8) Increate the loopback default MTU to 64K, from Eric Dumazet.

 9) Use a per-task rather than per-socket page fragment allocator for
    outgoing networking traffic.  This benefits processes that have very
    many mostly idle sockets, which is quite common.

    From Eric Dumazet.

10) Use up to 32K for page fragment allocations, with fallbacks to
    smaller sizes when higher order page allocations fail.  Benefits are
    a) less segments for driver to process b) less calls to page
    allocator c) less waste of space.

    From Eric Dumazet.

11) Allow GRO to be used on GRE tunnels, from Eric Dumazet.

12) VXLAN device driver, one way to handle VLAN issues such as the
    limitation of 4096 VLAN IDs yet still have some level of isolation.
    From Stephen Hemminger.

13) As usual there is a large boatload of driver changes, with the scale
    perhaps tilted towards the wireless side this time around.

Fix up various fairly trivial conflicts, mostly caused by the user
namespace changes.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits)
  hyperv: Add buffer for extended info after the RNDIS response message.
  hyperv: Report actual status in receive completion packet
  hyperv: Remove extra allocated space for recv_pkt_list elements
  hyperv: Fix page buffer handling in rndis_filter_send_request()
  hyperv: Fix the missing return value in rndis_filter_set_packet_filter()
  hyperv: Fix the max_xfer_size in RNDIS initialization
  vxlan: put UDP socket in correct namespace
  vxlan: Depend on CONFIG_INET
  sfc: Fix the reported priorities of different filter types
  sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP
  sfc: Fix loopback self-test with separate_tx_channels=1
  sfc: Fix MCDI structure field lookup
  sfc: Add parentheses around use of bitfield macro arguments
  sfc: Fix null function pointer in efx_sriov_channel_type
  vxlan: virtual extensible lan
  igmp: export symbol ip_mc_leave_group
  netlink: add attributes to fdb interface
  tg3: unconditionally select HWMON support when tg3 is enabled.
  Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT"
  gre: fix sparse warning
  ...
2012-10-02 13:38:27 -07:00
Shuah Khan 759643ce39 dma-debug: Remove local BUS_NOTIFY_UNBOUND_DRIVER define
Remove local BUS_NOTIFY_UNBOUND_DRIVER define. This is not used since
BUS_NOTIFY_UNBOUND_DRIVER is defined in include/linux/device.h

Signed-off-by: Shuah Khan <shuah.khan@hp.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-10-02 12:10:27 +02:00
Linus Torvalds d9a807461f USB merge for 3.7-rc1
Here is the big USB pull request for 3.7-rc1
 
 There are lots of gadget driver changes (including copying a bunch of
 files into the drivers/staging/ccg/ directory so that the other gadget
 drivers can be fixed up properly without breaking that driver), and we
 remove the old obsolete ub.c driver from the tree.  There are also the
 usual XHCI set of updates, and other various driver changes and updates.
 We also are trying hard to remove the old dbg() macro, but the final
 bits of that removal will be coming in through the networking tree
 before we can delete it for good.
 
 All of these patches have been in the linux-next tree.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlBp3+AACgkQMUfUDdst+ym5vwCfe93FyJyXn/RDkGz7iBemvWFd
 vrwAoIxjaOa4/yWZWcgrWc5bP4aO3ssc
 =jYDr
 -----END PGP SIGNATURE-----

Merge tag 'usb-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB changes from Greg Kroah-Hartman:
 "Here is the big USB pull request for 3.7-rc1

  There are lots of gadget driver changes (including copying a bunch of
  files into the drivers/staging/ccg/ directory so that the other gadget
  drivers can be fixed up properly without breaking that driver), and we
  remove the old obsolete ub.c driver from the tree.

  There are also the usual XHCI set of updates, and other various driver
  changes and updates.  We also are trying hard to remove the old dbg()
  macro, but the final bits of that removal will be coming in through
  the networking tree before we can delete it for good.

  All of these patches have been in the linux-next tree.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

Fix up several annoying - but fairly mindless - conflicts due to the
termios structure having moved into the tty device, and often clashing
with dbg -> dev_dbg conversion.

* tag 'usb-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (339 commits)
  USB: ezusb: move ezusb.c from drivers/usb/serial to drivers/usb/misc
  USB: uas: fix gcc warning
  USB: uas: fix locking
  USB: Fix race condition when removing host controllers
  USB: uas: add locking
  USB: uas: fix abort
  USB: uas: remove aborted field, replace with status bit.
  USB: uas: fix task management
  USB: uas: keep track of command urbs
  xhci: Intel Panther Point BEI quirk.
  powerpc/usb: remove checking PHY_CLK_VALID for UTMI PHY
  USB: ftdi_sio: add TIAO USB Multi-Protocol Adapter (TUMPA) support
  Revert "usb : Add sysfs files to control port power."
  USB: serial: remove vizzini driver
  usb: host: xhci: Fix Null pointer dereferencing with 71c731a for non-x86 systems
  Increase XHCI suspend timeout to 16ms
  USB: ohci-at91: fix null pointer in ohci_hcd_at91_overcurrent_irq
  USB: sierra_ms: don't keep unused variable
  fsl/usb: Add support for USB controller version 2.4
  USB: qcaux: add Pantech vendor class match
  ...
2012-10-01 13:23:01 -07:00
Linus Torvalds 06d2fe153b Driver core merge for 3.7-rc1
Here is the big driver core update for 3.7-rc1.
 
 A number of firmware_class.c updates (as you saw a month or so ago), and
 some hyper-v updates and some printk fixes as well.  All patches that
 are outside of the drivers/base area have been acked by the respective
 maintainers, and have all been in the linux-next tree for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlBp3vkACgkQMUfUDdst+ylQoACgldktGFgkCLzH+rGYthrXOC5P
 9hUAnjmOhdoHlMTL81vWTlH+BrGernym
 =khrr
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core merge from Greg Kroah-Hartman:
 "Here is the big driver core update for 3.7-rc1.

  A number of firmware_class.c updates (as you saw a month or so ago),
  and some hyper-v updates and some printk fixes as well.  All patches
  that are outside of the drivers/base area have been acked by the
  respective maintainers, and have all been in the linux-next tree for a
  while.

  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits)
  memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl()
  device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit
  memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init|exit]
  Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt
  Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting
  Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt
  device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit
  dev: Add dev_vprintk_emit and dev_printk_emit
  netdev_printk/netif_printk: Remove a superfluous logging colon
  netdev_printk/dynamic_netdev_dbg: Directly call printk_emit
  dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack
  driver-core: Shut up dev_dbg_reatelimited() without DEBUG
  tools/hv: Parse /etc/os-release
  tools/hv: Check for read/write errors
  tools/hv: Fix exit() error code
  tools/hv: Fix file handle leak
  Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO
  Tools: hv: Rename the function kvp_get_ip_address()
  Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO
  Tools: hv: Add an example script to configure an interface
  ...
2012-10-01 12:10:44 -07:00
Linus Torvalds 81f56e5375 Linux support for the 64-bit ARM architecture (AArch64)
Features currently supported:
 - 39-bit address space for user and kernel (each)
 - 4KB and 64KB page configurations
 - Compat (32-bit) user applications (ARMv7, EABI only)
 - Flattened Device Tree (mandated for all AArch64 platforms)
 - ARM generic timers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (GNU/Linux)
 
 iQIcBAABAgAGBQJQabRiAAoJEGvWsS0AyF7xXgcQAK+FTXt0ikdQYMkV5AIZXb9i
 xHRhuiZWx2vKyk0mCqpyGLY58GSmSb6uTBg/2P2Ej7vXdH/RB2goPzjlspfjkDL4
 o8RJp7eQ07Uz3KRDYEJgMP8xKZid6KFG93RJ6TjjpKZLuDBdwiG1GP1vb0jVcWfo
 ttZrj/aI8lMcqrh3Vq5qefP7GWP1OVATqeaGTiT7oo38pXwF3t237xfBr2iDGFBp
 ZgIRddrxpa7JYUesfJDDDdGHvLq7Vh2jJV+io9qasBZDrtppGJIhZ0vUni2DgIi7
 r4i1LcynDN4JaG0maZ4U/YQm74TCD4BqxV8GJ7zwLPTWeN+of+skjhPSLOkA+0fp
 I+sWjXlv200gDfJZ9qnUld2kFpoDfJi2b7fNDouSDd2OhmVOVWG3jnVP4Z7meVSb
 O8BYzWDdsAiabuwciUY3OsmW6424lT93b2v86Vncs4unKMvEjOPxYZbUxhqX8f2j
 gsmWwwD/yS4THx2B6OyW9VT3I5J6miqs2Glt/GG6vPWT5AKQJn9jCxKaBGhPMPIs
 xe5/GycBYjdk/Y8qRjegxFbEqzQuiRzmkeFn5jwjmBLqpGNbZDpvMaL6adhAKM5/
 v6UIKa91ra4fC9N0h6G61pOc9N9DbT8wPbCbdYY0RMTMRuLDZDgAM3Bvz0r2APdD
 96leNy6vx684hbkCSLJs
 =buJB
 -----END PGP SIGNATURE-----

Merge tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64

Pull arm64 support from Catalin Marinas:
 "Linux support for the 64-bit ARM architecture (AArch64)

  Features currently supported:
   - 39-bit address space for user and kernel (each)
   - 4KB and 64KB page configurations
   - Compat (32-bit) user applications (ARMv7, EABI only)
   - Flattened Device Tree (mandated for all AArch64 platforms)
   - ARM generic timers"

* tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits)
  arm64: ptrace: remove obsolete ptrace request numbers from user headers
  arm64: Do not set the SMP/nAMP processor bit
  arm64: MAINTAINERS update
  arm64: Build infrastructure
  arm64: Miscellaneous header files
  arm64: Generic timers support
  arm64: Loadable modules
  arm64: Miscellaneous library functions
  arm64: Performance counters support
  arm64: Add support for /proc/sys/debug/exception-trace
  arm64: Debugging support
  arm64: Floating point and SIMD
  arm64: 32-bit (compat) applications support
  arm64: User access library functions
  arm64: Signal handling support
  arm64: VDSO support
  arm64: System calls handling
  arm64: ELF definitions
  arm64: SMP support
  arm64: DMA mapping API
  ...
2012-10-01 11:51:57 -07:00
Linus Torvalds 620e77533f Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:

 0. 'idle RCU':

     Adds RCU APIs that allow non-idle tasks to enter RCU idle mode and
     provides x86 code to make use of them, allowing RCU to treat
     user-mode execution as an extended quiescent state when the new
     RCU_USER_QS kernel configuration parameter is specified.  (Work is
     in progress to port this to a few other architectures, but is not
     part of this series.)

 1.  A fix for a latent bug that has been in RCU ever since the addition
     of CPU stall warnings.  This bug results in false-positive stall
     warnings, but thus far only on embedded systems with severely
     cut-down userspace configurations.

 2.  Further reductions in latency spikes for huge systems, along with
     additional boot-time adaptation to the actual hardware.

     This is a large change, as it moves RCU grace-period initialization
     and cleanup, along with quiescent-state forcing, from softirq to a
     kthread.  However, it appears to be in quite good shape (famous
     last words).

 3.  Updates to documentation and rcutorture, the latter category
     including keeping statistics on CPU-hotplug latencies and fixing
     some initialization-time races.

 4.  CPU-hotplug fixes and improvements.

 5.  Idle-loop fixes that were omitted on an earlier submission.

 6.  Miscellaneous fixes and improvements

In certain RCU configurations new kernel threads will show up (rcu_bh,
rcu_sched), showing RCU processing overhead.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits)
  rcu: Apply micro-optimization and int/bool fixes to RCU's idle handling
  rcu: Userspace RCU extended QS selftest
  x86: Exit RCU extended QS on notify resume
  x86: Use the new schedule_user API on userspace preemption
  rcu: Exit RCU extended QS on user preemption
  rcu: Exit RCU extended QS on kernel preemption after irq/exception
  x86: Exception hooks for userspace RCU extended QS
  x86: Unspaghettize do_general_protection()
  x86: Syscall hooks for userspace RCU extended QS
  rcu: Switch task's syscall hooks on context switch
  rcu: Ignore userspace extended quiescent state by default
  rcu: Allow rcu_user_enter()/exit() to nest
  rcu: Settle config for userspace extended quiescent state
  rcu: Make RCU_FAST_NO_HZ handle adaptive ticks
  rcu: New rcu_user_enter_after_irq() and rcu_user_exit_after_irq() APIs
  rcu: New rcu_user_enter() and rcu_user_exit() APIs
  ia64: Add missing RCU idle APIs on idle loop
  xtensa: Add missing RCU idle APIs on idle loop
  score: Add missing RCU idle APIs on idle loop
  parisc: Add missing RCU idle APIs on idle loop
  ...
2012-10-01 10:16:42 -07:00
H. Peter Anvin e6459606b0 lib: Add early cpio decoder
Add a simple cpio decoder without library dependencies for the purpose
of extracting components from the initramfs blob for early kernel
uses.  Intended consumers so far are microcode and ACPI override.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1349043837-22659-2-git-send-email-trenn@suse.de
Cc: Len Brown <lenb@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-30 18:02:20 -07:00
David S. Miller 6a06e5e1bb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/team/team.c
	drivers/net/usb/qmi_wwan.c
	net/batman-adv/bat_iv_ogm.c
	net/ipv4/fib_frontend.c
	net/ipv4/route.c
	net/l2tp/l2tp_netlink.c

The team, fib_frontend, route, and l2tp_netlink conflicts were simply
overlapping changes.

qmi_wwan and bat_iv_ogm were of the "use HEAD" variety.

With help from Antonio Quartulli.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-09-28 14:40:49 -04:00
Maxim Levitsky 232f1b5106 scatterlist: refactor the sg_nents
Replace 'while' with 'for' as suggested by Tejun Heo

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-09-28 10:38:15 +02:00
Maxim Levitsky 2e48461029 scatterlist: add sg_nents
Useful helper to know the number of entries in scatterlist.

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Alex Dubov <oakad@yahoo.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-09-27 12:45:28 +02:00
Paul E. McKenney 593d1006cd Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b
Resolved conflict in kernel/sched/core.c using Peter Zijlstra's
approach from https://lkml.org/lkml/2012/9/5/585.
2012-09-25 10:03:56 -07:00
Jan Kara b5bd6a0e5f lib/flex_proportions.c: fix corruption of denominator in flexible proportions
When racing with CPU hotplug, percpu_counter_sum() can return negative
values for the number of observed events.

This confuses fprop_new_period(), which uses unsigned type and as a
result number of events is set to big *positive* number.  From that
moment on, things go pear shaped and can result e.g.  in division by
zero as denominator is later truncated to 32-bits.

This bug causes a divide-by-zero oops in bdi_dirty_limit() in Borislav's
3.6.0-rc6 based kernel.

Fix the issue by using a signed type in fprop_new_period().  That makes
us bail out from the function without doing anything (mistakenly)
thinking there are no events to age.  That makes aging somewhat
inaccurate but getting accurate data would be rather hard.

Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Borislav Petkov <bp@amd64.org>
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-25 08:59:21 -07:00
Paul E. McKenney e3ebfb96f3 rcu: Add PROVE_RCU_DELAY to provoke difficult races
There have been some recent bugs that were triggered only when
preemptible RCU's __rcu_read_unlock() was preempted just after setting
->rcu_read_lock_nesting to INT_MIN, which is a low-probability event.
Therefore, reproducing those bugs (to say nothing of gaining confidence
in alleged fixes) was quite difficult.  This commit therefore creates
a new debug-only RCU kernel config option that forces a short delay
in __rcu_read_unlock() to increase the probability of those sorts of
bugs occurring.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-09-23 07:42:49 -07:00
Konrad Rzeszutek Wilk a5f9515570 Merge branch 'stable/late-swiotlb.v3.3' into stable/for-linus-3.7
* stable/late-swiotlb.v3.3:
  xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer.
  xen/swiotlb: Remove functions not needed anymore.
  xen/pcifront: Use Xen-SWIOTLB when initting if required.
  xen/swiotlb: For early initialization, return zero on success.
  xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used.
  xen/swiotlb: Move the error strings to its own function.
  xen/swiotlb: Move the nr_tbl determination in its own function.
  swiotlb: add the late swiotlb initialization function with iotlb memory
  xen/swiotlb: With more than 4GB on 64-bit, disable the native SWIOTLB.
  xen/swiotlb: Simplify the logic.

Conflicts:
	arch/x86/xen/pci-swiotlb-xen.c

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-22 20:01:24 -04:00
Joe Perches 666f355f38 device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit
Convert direct calls of vprintk_emit and printk_emit to the
dev_ equivalents.

Make create_syslog_header static.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Jim Cromie <jim.cromie@gmail.com>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-17 06:10:05 -07:00